ADSP-21160 SHARC® DSP Hardware Reference

Revision 4.1, April 2013
Part Number 82-001966-01

Analog Devices, Inc.
One Technology Way
Norwood, Mass. 02062-9106

Copyright Information

© 2013 Analog Devices, Inc., ALL RIGHTS RESERVED. This document may not be reproduced in any form without prior, express written consent from Analog Devices, Inc.

Printed in the USA.

Disclaimer

Analog Devices, Inc. reserves the right to change this product without prior notice. Information furnished by Analog Devices is believed to be accurate and reliable. However, no responsibility is assumed by Analog Devices for its use; nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under the patent rights of Analog Devices, Inc.

Trademark and Service Mark Notice

The Analog Devices logo, Blackfin, SHARC, TigerSHARC, CrossCore, VisualDSP++, and EZ-KIT Lite are registered trademarks of Analog Devices, Inc. All other brand and product names are trademarks or service marks of their respective owners.

CONTENTS

PREFACE
Purpose of This Manual ........ xxvii
Intended Audience ........ xxvii
Manual Contents ........ xxviii
What’s New in This Manual ........ xxx
Technical Support ........ xxx
Supported Processors ........ xxxi
Product Information ........ xxxi
Analog Devices Web Site ........ xxxii
EngineerZone ........ xxxii
Notation Conventions ........ xxxiii
Register Diagram Conventions ........ xxxiv

INTRODUCTION
Overview – Why Floating-Point DSP? ........ 1-1
ADSP-21160 DSP Design Advantages ........ 1-2
ADSP-21160 DSP Architecture Overview ........ 1-5
Processor Core ........ 1-6
Processing Elements ........ 1-6
Program Sequence Control ........ 1-8
Processor Internal Buses ........ 1-10
Processor Peripherals ........ 1-11
Dual-Ported Internal Memory (SRAM) ........ 1-11
External Port ........ 1-12
I/O Processor ........ 1-13
JTAG Port ........ 1-15
Development Tools ........ 1-15
Differences From Previous SHARC DSPs ........ 1-16
Processor Core Enhancements ........ 1-16
Processor Internal Bus Enhancements ........ 1-17
Memory Organization Enhancements ........ 1-18
External Port Enhancements ........ 1-18
Host Interface Enhancements ........ 1-18
Multiprocessor Interface Enhancements ........ 1-19
IO Architecture Enhancements ........ 1-19
DMA Controller Enhancements ........ 1-19
Link Port Enhancements ........ 1-19
Instruction Set Enhancements ........ 1-20

PROCESSING ELEMENTS
Overview ........ 2-1
Setting Computational Modes ........ 2-2
32-bit (Normal Word) Floating-Point Format ........ 2-3
40-bit Floating-Point Format ........ 2-4
16-bit (Short Word) Floating-Point Format ........ 2-5
32-Bit Fixed-Point Format ........ 2-5
Rounding Mode ........ 2-6
Using Computational Status ........ 2-7
Arithmetic Logic Unit (ALU) ........ 2-7
ALU Operation ........ 2-8
ALU Saturation ........ 2-9
ALU Status Flags ........ 2-9
ALU Instruction Summary ........ 2-10
Multiply-Accumulator (Multiplier) ........ 2-13
Multiplier Operation ........ 2-14
Multiplier (Fixed-Point) Result Register ........ 2-15
Multiplier Status Flags ........ 2-17
Multiplier Instruction Summary ........ 2-18
Barrel-Shifter (Shifter) ........ 2-21
Shifter Operation ........ 2-21
Shifter Status Flags ........ 2-25
Shifter Instruction Summary ........ 2-26
Data Register File ........ 2-28
Alternate (Secondary) Data Registers ........ 2-30
Multifunction Computations ........ 2-31
Secondary Processing Element (PEy) ........ 2-35
Dual Compute Units Sets ........ 2-37
Dual Register Files ........ 2-38
Dual Alternate Registers ........ 2-39
SIMD (Computational) Operations ........ 2-39
SIMD and Status Flags ........ 2-42

PROGRAM SEQUENCER
Overview ........ 3-1
Instruction Pipeline ........ 3-8
Instruction Cache ........ 3-9
Using the Cache ........ 3-12
Optimizing Cache Usage ........ 3-12
Branches and Sequencing ........ 3-14
Conditional Branches ........ 3-16
Delayed Branches ........ 3-16
Loops and Sequencing ........ 3-20
Restrictions On Ending Loops ........ 3-22
Restrictions On Short Loops ........ 3-23
Loop Address Stack ........ 3-27
Loop Counter Stack ........ 3-28
Interrupts and Sequencing ........ 3-32
Sensing Interrupts ........ 3-38
Masking Interrupts ........ 3-39
Latching Interrupts ........ 3-40
Stacking Status During Interrupts ........ 3-42
Nesting Interrupts ........ 3-43
Reusing Interrupts ........ 3-45
Interrupting IDLE ........ 3-46
Multiprocessing Interrupts ........ 3-47
Timer and Sequencing ........ 3-48
Stacks and Sequencing ........ 3-50
Conditional Sequencing ........ 3-52
SIMD Mode and Sequencing ........ 3-55
Conditional Compute Operations ........ 3-56
Conditional Branches and Loops ........ 3-57
Conditional Data Moves ........ 3-57
Case 1: Complementary Register Pair Data Move ........ 3-57
Case 2: Uncomplemented to Complementary Register Move ........ 3-61
Case 3: Complementary Register => Uncomplemented Register ........ 3-62
Case 4: Data Move Involves External Memory or IOP Memory Space ........ 3-63
Conditional DAG Operations ........ 3-64

DATA ADDRESS GENERATORS
Overview ........ 4-1
Setting DAG Modes ........ 4-3
Circular Buffering Mode ........ 4-4
Broadcast Loading Mode ........ 4-5
Alternate (Secondary) DAG Registers ........ 4-6
Bit-Reverse Addressing Mode ........ 4-8
Using DAG Status ........ 4-9
DAG Operations ........ 4-9
Addressing With DAGs ........ 4-10
Addressing Circular Buffers ........ 4-12
Modifying DAG Registers ........ 4-17
Addressing in SISD and SIMD Modes ........ 4-18
DAGs, Registers, and Memory ........ 4-18
DAG Register-to-Bus Alignment ........ 4-19
DAG Register Transfer Restrictions ........ 4-21
DAG Instruction Summary ........ 4-22

MEMORY
Overview ........ 5-1
Internal Address and Data Buses ........ 5-6
Internal Data Bus Exchange ........ 5-7
ADSP-21160 DSP Memory Map ........ 5-12
Internal Memory ........ 5-14
Multiprocessor Memory ........ 5-17
External Memory ........ 5-20
Shadow Write FIFO ........ 5-21
Memory Organization and Word Size ........ 5-22
Placing 32-Bit Words and 48-Bit Words ........ 5-23
Mixing 32-Bit and 48-Bit Words ........ 5-23
Restrictions on Mixing 32-Bit and 48-Bit Words ........ 5-26
Setting Data Access Modes ........ 5-28
Using Boot Memory ........ 5-29
Reading from Boot Memory ........ 5-30
Writing to Boot Memory ........ 5-31
Internal Interrupt Vector Table ........ 5-31
Internal Memory Block Data Width ........ 5-32
Memory Bank Size ........ 5-33
External Bus Priority ........ 5-33
Secondary Processor Element (PEy) ........ 5-34
Broadcast Register Loads ........ 5-34
Illegal I/O Processor Register Access ........ 5-35
Unaligned 64-bit Memory Access ........ 5-36
External Bank X Access Mode ........ 5-36
External Bank X Waitstates ........ 5-37
External (Bank 0) DRAM Page Size ........ 5-38
Using Memory Access Status ........ 5-38
Accessing Memory ........ 5-39
Access Word Size ........ 5-40
Long Word (64-Bit) Accesses ........ 5-41
Instruction Word (48-Bit) and Extended Precision Normal Word (40-Bit) Accesses ........ 5-43
Normal Word (32-Bit) Accesses ........ 5-43
Short Word (16-Bit) Accesses ........ 5-44
SISD, SIMD, and Broadcast Load Modes ........ 5-44
Single- and Dual-Data Accesses ........ 5-45
Data Access Options ........ 5-45
Short Word Addressing of Single Data in SISD Mode ........ 5-47
Short Word Addressing of Single Data in SIMD Mode ........ 5-48
Short Word Addressing of Dual-Data in SISD Mode ........ 5-51
Short Word Addressing of Dual-Data in SIMD Mode ........ 5-53
32-Bit Normal Word Addressing of Single Data in SISD Mode ........ 5-55
32-Bit Normal Word Addressing of Single Data in SIMD Mode ........ 5-57
32-Bit Normal Word Addressing of Dual Data in SISD Mode ........ 5-59
32-Bit Normal Word Addressing of Dual Data in SIMD Mode ........ 5-59
Extended Precision Normal Word Addressing of Single Data ........ 5-62
Extended Precision Normal Word Addressing of Dual Data in SISD Mode ........ 5-64
Extended Precision Normal Word Addressing of Dual Data in SIMD Mode ........ 5-64
Long Word Addressing of Single Data ........ 5-67
Long Word Addressing of Dual Data in SISD Mode ........ 5-69
Long Word Addressing of Dual Data in SIMD Mode ........ 5-69
Mixed Word Width Addressing of Dual Data in SISD Mode ........ 5-72
Mixed Word Width Addressing of Dual Data in SIMD Mode ........ 5-74
Broadcast Load Access ........ 5-74
Arranging Data in Memory ........ 5-84

I/O PROCESSOR
Overview ........ 6-1
Setting I/O Processor—EPort Modes ........ 6-14
Boot Memory DMA Mode ........ 6-17
External Port Buffer Modes ........ 6-17
External Port Channel Priority Modes ........ 6-19
External Port Channel Transfer Modes ........ 6-21
External Port Channel Handshake Modes ........ 6-22
Master Mode ........ 6-25
Paced Master Mode ........ 6-30
Slave Mode ........ 6-31
Handshake Mode ........ 6-34
External-Handshake Mode ........ 6-40
Setting I/O Processor—LPort Modes ........ 6-43
Link Port Buffer Modes ........ 6-45
Link Port Channel Priority Modes ........ 6-45
Link Port Channel Transfer Modes ........ 6-48
Setting I/O Processor—SPort Modes ........ 6-49
Serial Port Buffer Modes ........ 6-51
Serial Port Channel Priority Modes ........ 6-52
Serial Port Channel Transfer Modes ........ 6-52
Using I/O Processor Status ........ 6-53
External Port Status ........ 6-57
Link Port Status ........ 6-60
Serial Port Status ........ 6-63
DMA Controller Operation ........ 6-65
Managing DMA Channel Priority ........ 6-67
Chaining DMA Processes ........ 6-69
Transfer Control Block (TCB) Chain Loading ........ 6-71
Setting Up and Starting The Chain ........ 6-72
Inserting a TCB in an Active Chain ........ 6-73
External Port DMA ........ 6-74
Setting up External Port DMA ........ 6-74
Bootloading Through The External Port ........ 6-76
Link Port DMA ........ 6-80
Setting up Link Port DMA ........ 6-81
Using Two-Dimensional Link Port DMA ........ 6-83
Bootloading Through The Link Port ........ 6-87
Serial Port DMA ........ 6-89
Setting up Serial Port DMA ........ 6-90
Using Two-Dimensional Serial Port DMA ........ 6-91
Optimizing DMA Throughput ........ 6-92
Internal Memory DMA ........ 6-92
External Memory DMA ........ 6-93
System-Level Considerations ........ 6-96

EXTERNAL PORT
Overview ........ 7-1
Setting External Port Modes ........ 7-2
External Memory Interface ........ 7-3
Banked External Memory ........ 7-10
Unbanked External Memory ........ 7-10
Boot Memory ........ 7-11
Idle Cycle ........ 7-11
Data Hold Cycle ........ 7-13
Multiprocessor Memory Space Waitstates and Acknowledge ........ 7-14
DRAM Page Boundary Detection ........ 7-15
Timing External Memory Accesses ........ 7-18
Asynchronous Mode Interface Timing ........ 7-19
Asynchronous Mode Read – Bus Master ........ 7-21
Asynchronous Mode Write – Bus Master ........ 7-22
Synchronous Mode Interface Timing ........ 7-23
Synchronous Mode Read – Bus Master ........ 7-25
Synchronous Write, Zero-Waitstate Mode ........ 7-27
Synchronous Write, One Waitstate Mode ........ 7-31
Synchronous Burst Mode Interface Timing ........ 7-33
Burst Length Determination ........ 7-34
Burst Stall Criteria ........ 7-35
Synchronous Burst Reads ........ 7-37
Synchronous Burst Writes ........ 7-39
Using External SBSRAM ........ 7-43
Executing Instructions From External Memory ........ 7-48
Host Processor Interface ........ 7-49
Acquiring the Bus ........ 7-53
Asynchronous Transfers ........ 7-57
Asynchronous Transfer Timing ........ 7-59
Synchronous Transfers ........ 7-62
Synchronous Broadcast Writes ........ 7-63
Synchronous Burst Read Transfers ........ 7-65
Slave Direct Reads and Writes ........ 7-66
IOP Shadow Registers ........ 7-66
Instruction Transfers ........ 7-67
Host Direct Writes and Reads ........ 7-68
Direct Writes ........ 7-68
Direct Write Latency ........ 7-69
Direct Reads ........ 7-70
Broadcast Writes ........ 7-71
Shadow Write FIFO ........ 7-71
Data Transfers Through the EPBx Buffers ........ 7-71
DMA Transfers ........ 7-72
Host Data Packing ........ 7-73
32-bit Data Packing ........ 7-73
48-Bit Instruction Packing ........ 7-76
Host Interface Status ........ 7-77
Interprocessor Messages and Vector Interrupts ........ 7-77
Message Passing (MSGRx) ........ 7-78
Host Vector Interrupts (VIRPT) ........ 7-79
System Bus Interfacing ........ 7-80
Access to the DSP Bus—Slave DSP ........ 7-80
Access to the System Bus—Master DSP ........ 7-82
Processor Core Access To System Bus ........ 7-84
Deadlock Resolution ........ 7-84
DSP DMA Access To System Bus ........ 7-88
Multiprocessing with Local Memory ........ 7-89
DSP To Microprocessor Interface ........ 7-91
Multiprocessor (DSPs) Interface ........ 7-91
Multiprocessing System Architectures ........ 7-94
Data Flow Multiprocessing ........ 7-94
Cluster Multiprocessing ........ 7-95
Multiprocessor Bus Arbitration ........ 7-98
Bus Arbitration Protocol ........ 7-100
Bus Arbitration Priority (RPBA) ........ 7-104
Mastership Timeout Bus ........ 7-106
Priority Access ........ 7-107
Bus Synchronization After Reset ........ 7-110
Booting Another DSP ........ 7-113
Multiprocessor Direct Writes and Reads ........ 7-113
IOP Shadow Registers ........ 7-114
Instruction Transfers ........ 7-114
Direct Writes ........ 7-114
Direct Write Latency ........ 7-115
Direct Reads ........ 7-115
Broadcast Writes ........ 7-115
Shadow Write FIFO ........ 7-117
Data Transfers Through the EPBx Buffers ........ 7-118
Bus Lock and Semaphores ........ 7-118
Interprocessor Messages and Vector Interrupts ........ 7-120
Message Passing (MSGRx) ........ 7-121
Vector Interrupts (VIRPT) ........ 7-121
Multiprocessor Interface Status ........ 7-122

LINK PORTS
Overview ........ 8-1
Link Port To Link Buffer Assignment ........ 8-3
Link Port DMA Channels ........ 8-5
Link Port Booting ........ 8-5
Setting Link Port Modes ........ 8-5
Link Data Path (and Compatibility) Modes ........ 8-7
Using Link Port Handshake Signals ........ 8-7
Using Link Buffers ........ 8-10
Core Processor Access To Link Buffers ........ 8-11
Host Processor Access To Link Buffers ........ 8-11
Using Link Port DMA ........ 8-12
Using Link Port Interrupts ........ 8-12
Link Port Interrupts With DMA Enabled ........ 8-13
Link Port Interrupts With DMA Disabled ........ 8-13
Link Port Service Request Interrupts (LSRQ) ........ 8-14
Detecting Errors On Link Transmissions ........ 8-16
Using Token Passing With Link Ports ........ 8-17
Designing Link Port Systems ........ 8-21
Terminations For Link Transmission Lines ........ 8-21
Peripheral I/O Using Link Ports ........ 8-22
Data Flow Multiprocessing With Link Ports ........ 8-22

SERIAL PORTS
Overview ........ 9-1
SPORT Interrupts ........ 9-5
SPORT Reset ........ 9-6
Setting Serial Port Modes ........ 9-7
Transmit and Receive Control Registers (STCTL, SRCTL) ........ 9-9
Register Writes and Effect Latency ........ 9-12
Transmit and Receive Data Buffers (TX, RX) ........ 9-12
Clock and Frame Sync Frequencies (TDIV, RDIV) ........ 9-14
Data Word Formats ........ 9-17
Word Length ........ 9-17
Endian Format ........ 9-17
Data Packing and Unpacking ........ 9-18
Data Type ........ 9-19
Companding ........ 9-20
Clock Signal Options ........ 9-21
Frame Sync Options ........ 9-22
Framed Versus Unframed ........ 9-22
Internal Versus External Frame Syncs ........ 9-24
Active Low Versus Active High Frame Syncs ........ 9-24
Sampling Edge For Data and Frame Syncs ........ 9-25
Early Versus Late Frame Syncs ........ 9-25
Data-Independent Transmit Frame Sync ........ 9-27
SPORT Loopback ........ 9-27
Multichannel Operation ........ 9-28
Frame Syncs in Multichannel Mode ........ 9-30
Multichannel Control Bits in STCTL, SRCTL ........ 9-30
Channel Selection Registers ........ 9-32
SPORT Receive Comparison Registers ........ 9-33
Moving Data Between SPORTS and Memory ........ 9-36
DMA Block Transfers ........ 9-36
Single-Word Transfers ........ 9-37
SPORT Pin/Line Terminations ........ 9-38

JTAG TEST EMULATION PORT
JTAG Test Access Port ........ 10-3
Instruction Register ........ 10-4
EMUPMD Shift Register ........ 10-5
EMUPX Shift Register ........ 10-7
EMU64PX Shift Register ........ 10-7
EMUPC Shift Register ........ 10-8
EMUCTL Shift Register ........ 10-8
EMUSTAT Shift Register ........ 10-12
BRKSTAT Shift Register ........ 10-12
MEMTST Shift Register ........ 10-13
PSx, DMx, IOx, and EPx (Breakpoint) Registers ........ 10-14
EMUN Register ........ 10-16
EMUCLK and EMUCLK2 Registers ........ 10-17
EMUIDLE Instruction ........ 10-17
In-Circuit Signal Analyzer (ICSA) Function ........ 10-17
Boundary Register ........ 10-17
Device Identification Register ........ 10-55
Built-in Self-test Operation (BIST) ........ 10-55
Private Instructions ........ 10-55
References ........ 10-55

SYSTEM DESIGN
DSP Pin Descriptions ........ 11-2
Pin States At Reset ........ 11-12
Clock Derivation ........ 11-15
RESET and CLKIN ........ 11-16
Input Synchronization Delay ........ 11-17
Interrupt and Timer Pins ........ 11-18
Flag Pins ........ 11-18
Flag Inputs ........ 11-19
Flag Outputs ........ 11-19
JTAG Interface Pins ........ 11-20
Dual-Voltage Power-up Sequencing ........ 11-21
Designing for JTAG Emulation ........ 11-24
Target Board Connector ........ 11-25
Layout Requirements ........ 11-30
Power Sequence for Emulation ........ 11-30
Additional JTAG Emulator References ........ 11-31
Pod Specifications ........ 11-31
DSP JTAG Pod Connector ........ 11-31
DSP 3.3V Pod Logic ........ 11-32
DSP 2.5V Pod Logic ........ 11-33
Conditioning Input Signals ........ 11-34
Link Port Input Filter Circuits ........ 11-35
RESET Input Hysteresis ........ 11-36
Designing For High Frequency Operation ........ 11-36
Clock Specifications and Jitter ........ 11-37
Clock Distribution ........ 11-38
Point-To-Point Connections ........ 11-40
Signal Integrity ........ 11-41
Other Recommendations and Suggestions ........ 11-44
Decoupling Capacitors and Ground Planes ........ 11-45
Oscilloscope Probes ........ 11-47
Recommended Reading ........ 11-47
Booting Single and Multiple Processors ........ 11-48
Multiprocessor Host Booting ........ 11-48
Multiprocessor EPROM Booting ........ 11-49
All DSPs Boot in Turn from a Single EPROM ........ 11-49
One DSP is Booted, which then Boots the Others ........ 11-50
Multiprocessor Link Port Booting ........ 11-51
Multiprocessor Booting From External Memory ........ 11-52

REGISTERS
Control and Status System Registers ........ A-2
Mode Control 1 Register (MODE1) ........ A-3
Mode Mask Register (MMASK) ........ A-5
Mode Control 2 Register (MODE2) ........ A-6
Arithmetic Status Registers (ASTATx and ASTATy) ........ A-8
Sticky Status Registers (STKYx and STKYy) ........ A-12
User-Defined Status Registers (USTATx) ........ A-15
Processing Element Registers ........ A-15
Data File Data Registers (Rx, Fx, Sx) ........ A-16
Multiplier Results Registers (MRxF, MRxB) ........
A-16 Program Memory Bus Exchange Register (PX) ....................... A-17 Program Sequencer Registers ....................................................... A-17 Interrupt Latch Register (IRPTL) .......................................... A-18 Interrupt Mask Register (IMASK) ......................................... A-23 Interrupt Mask Pointer Register (IMASKP) ........................... A-23 Link Port Interrupt Register (LIRPTL) .................................. A-24 Flag Value Register (FLAGS) ................................................. A-27 Program Counter Register (PC) ............................................. A-28 Program Counter Stack Register (PCSTK) ............................. A-29 Program Counter Stack Pointer Register (PCSTKP) ............... A-29 Fetch Address Register (FADDR) .......................................... A-30 Decode Address Register (DADDR) ...................................... A-30 Loop Address Stack Register (LADDR) .................................. A-30 Current Loop Counter Register (CURLCNTR) ..................... A-31 Loop Counter Register (LCNTR) .......................................... A-31 Timer Period Register (TPERIOD) ....................................... A-31 Timer Count Register (TCOUNT) ....................................... A-31 xxii ADSP-21160 SHARC DSP Hardware Reference Data Address Generator Registers ............................................... A-32 Index Registers (Ix) ............................................................... A-32 Modify Registers (Mx) .......................................................... A-32 Length and Base Register (Lx, Bx) ......................................... A-33 I/O Processor Registers ............................................................... A-33 System Configuration Register (SYSCON) ............................ A-45 Vector Interrupt Address Register (VIRPT) ........................... 
A-48 External Memory Waitstate and Access Mode Register (WAIT) A-48 System Status Register (SYSTAT) .......................................... A-51 External Port DMA Buffer Registers (EPBx) .......................... A-52 Message Registers (MSGRx) ................................................. A-53 PC Shadow Register (PC_SHDW) ........................................ A-53 MODE2 Shadow Register (MODE2_SHDW) ...................... A-53 Bus Timeout Maximum Register (BMAX) ............................. A-53 Bus (Timeout) Counter Register (BCNT) ............................. A-54 Address of Last DRAM Page Register (ELAST) ..................... A-54 External Port DMA Control Registers (DMACx) .................. A-54 Internal Memory DMA Index Registers (IIx) ......................... A-58 Internal Memory DMA Modifier Registers (IMx) .................. A-59 Internal Memory DMA Count Registers (Cx) ....................... A-59 Chain Pointer For Next DMA TCB Registers (CPx) .............. A-59 General Purpose DMA Registers (GPx, DBx, DAx) ............... A-60 DMA Channel Status Register (DMASTAT) ......................... A-60 External Memory DMA Index Registers (EIx) ....................... A-61 ADSP-21160 SHARC DSP Hardware Reference xxiii External Memory DMA Modifier Registers (EMx) ................. A-61 External Memory DMA Count Registers (ECx) ..................... A-62 Link Port Buffer Registers (LBUFx) ....................................... A-62 Link Port Buffer Control Registers (LCTLx) .......................... A-62 Link Port Common Control Register (LCOM) ...................... A-65 Link Port Assignment Register (LAR) .................................... A-67 Link Port Service Request and Mask Register (LSRQ) ............ A-68 Link Port Path Registers (LPATHx) ....................................... A-70 Link Port Path Counter Register (LPCNT) ............................ A-70 Link Port Constant Registers (CNSTx) .................................. 
A-71 SPORT Serial Transmit Control Registers (STCTLx) ............. A-71 SPORT Serial Receive Control Registers (SRCTLx) ............... A-73 SPORT Transmit Buffer Registers (TXx) ............................... A-73 SPORT Receive Buffer Registers (RXx) .................................. A-76 SPORT Transmit Divisor Registers (TDIVx) ......................... A-76 SPORT Transmit Count Registers (TCNTx) ......................... A-77 SPORT Receive Divisor Registers (RDIVx) ........................... A-77 SPORT Receive Count Registers (RCNTx) ............................ A-78 SPORT Transmit Select Registers (MTCSx) ........................... A-78 SPORT Receive Select Registers (MRCSx) ............................. A-78 SPORT Transmit Compand Registers (MTCCSx) .................. A-79 SPORT Receive Compand Register (MRCCSx) ..................... A-79 SPORT Receive Comparison and Mask Registers (KEYWDx and KEYMASKx) ..................................................................... A-80 xxiv ADSP-21160 SHARC DSP Hardware Reference SPORT Serial Path Length Registers (SPATHx) .................... A-80 SPORT Serial Path Counter Registers (SPCNTx) .................. A-80 Register and Bit #Defines File (def21160.h) ................................ A-81 INTERRUPT VECTOR ADDRESSES NUMERIC FORMATS IEEE Single-Precision Floating-Point Data Format ........................ C-1 Extended-Precision Floating-Point Format .................................... C-3 Short Word Floating-Point Format ............................................... C-4 Packing for Floating-Point Data ................................................... C-4 Fixed-Point Formats ..................................................................... C-6 GLOSSARY INDEX ADSP-21160 SHARC DSP Hardware Reference xxv xxvi ADSP-21160 SHARC DSP Hardware Reference Preface PREFACE Thank you for purchasing and developing systems using SHARC® processors from Analog Devices. 

Purpose of This Manual

ADSP-21160 SHARC DSP Hardware Reference provides architectural information on the ADSP-21160 Super Harvard Architecture (SHARC) Digital Signal Processor (DSP). The architectural descriptions cover functional blocks, buses, and ports, including all features and processes they support. For programming information, see ADSP-21160 SHARC DSP Instruction Set Reference.

Intended Audience

The primary audience for this manual is a programmer who is familiar with Analog Devices processors. The manual assumes the audience has a working knowledge of the appropriate processor architecture and instruction set. Programmers who are unfamiliar with Analog Devices processors can use this manual, but should supplement it with other texts, such as hardware and programming reference manuals that describe their target architecture.

Manual Contents

This manual provides detailed information about the ADSP-21160 processor in the following chapters:

• Chapter 1, “Introduction”
  Provides an architectural overview of the SHARC processors.
• Chapter 2, “Processing Elements”
  Describes the arithmetic/logic units (ALUs), multiplier/accumulator units, and shifter. The chapter also discusses data formats, data types, and register files.
• Chapter 3, “Program Sequencer”
  Describes the operation of the program sequencer, which controls program flow by providing the address of the next instruction to be executed. The chapter also discusses loops, subroutines, jumps, interrupts, exceptions, and the IDLE instruction.
• Chapter 4, “Data Address Generators”
  Describes the Data Address Generators (DAGs), addressing modes, how to modify DAG and pointer registers, memory address alignment, and DAG instructions.
• Chapter 5, “Memory”
  Describes all aspects of processor memory including internal memory, address and data bus structure, and memory accesses.
• Chapter 6, “I/O Processor”
  Describes input/output processor architecture.
• Chapter 7, “External Port”
  Describes how the processor connects to external memories.
• Chapter 8, “Link Ports”
  Describes the six bidirectional 8-bit wide link ports, which can connect to other processor or peripheral link ports.
• Chapter 9, “Serial Ports”
  Describes the data line serial ports. Each SPORT contains a clock, a frame sync, and two data lines that can be configured as either a receiver or transmitter pair.
• Chapter 10, “JTAG Test Emulation Port”
  Discusses the JTAG standard and how to use the ADSP-21160 in a test environment. Includes boundary-scan architecture, instruction, and breakpoint registers.
• Chapter 11, “System Design”
  Describes system features of the ADSP-21160 processor. These include power, reset, clock, JTAG, and booting, as well as pin descriptions and other system level information.
• Appendix A, “Registers”
  Provides a graphical presentation of all registers and describes the bit usage in each register.
• Appendix B, “Interrupt Vector Addresses”
  Provides descriptions of all ADSP-21160 DSP interrupts.
• Appendix C, “Numeric Formats”
  Provides descriptions of the supported data formats.

Note: This hardware reference is a companion document to SHARC Processor Programming Reference.

What’s New in This Manual

This manual is Revision 4.1 of ADSP-21160 SHARC DSP Hardware Reference. This revision corrects minor typographical errors and the following issues:

• Globally replaced ADSP-21535 with ADSP-21160.
• Active low signals represented correctly in equations for ALU conditions in Chapter 3, “Program Sequencer”.
• Bit 0 descriptions for the STKYx and STKYy registers and link port interrupt bits of the LIRPTL register in Appendix A, “Registers”.

Technical Support

You can reach Analog Devices processors and DSP technical support in the following ways:

• Post your questions in the processors and DSP support community at EngineerZone®: http://ez.analog.com/community/dsp
• Submit your questions to technical support directly at: http://www.analog.com/support
• E-mail your questions about processors, DSPs, and tools development software from CrossCore® Embedded Studio or VisualDSP++®: Choose Help > Email Support. This creates an e-mail to [email protected] and automatically attaches your CrossCore Embedded Studio or VisualDSP++ version information and license.dat file.
• E-mail your questions about processors and processor applications to: [email protected] or [email protected] (Greater China support)
• In the USA only, call 1-800-ANALOGD (1-800-262-5643)
• Contact your Analog Devices sales office or authorized distributor. Locate one at: www.analog.com/adi-sales
• Send questions by mail to:
  Processors and DSP Technical Support
  Analog Devices, Inc.
  Three Technology Way
  P.O. Box 9106
  Norwood, MA 02062-9106
  USA

Supported Processors

The name “SHARC” refers to a family of high-performance, floating-point embedded processors. Refer to the CCES or VisualDSP++ online help for a complete list of supported processors.

Product Information

Product information can be obtained from the Analog Devices Web site and the CCES or VisualDSP++ online help.

Analog Devices Web Site

The Analog Devices Web site, www.analog.com, provides information about a broad range of products: analog integrated circuits, amplifiers, converters, and digital signal processors.

To access a complete technical library for each processor family, go to http://www.analog.com/processors/technical_library. The manuals selection opens a list of current manuals related to the product as well as a link to the previous revisions of the manuals.
When locating your manual title, note a possible errata check mark next to the title that leads to the current correction report against the manual.

Also note, myAnalog is a free feature of the Analog Devices Web site that allows customization of a Web page to display only the latest information about products you are interested in. You can choose to receive weekly e-mail notifications containing updates to the Web pages that meet your interests, including documentation errata against all manuals. myAnalog provides access to books, application notes, data sheets, code examples, and more.

Visit myAnalog to sign up. If you are a registered user, just log on. Your user name is your e-mail address.

EngineerZone

EngineerZone is a technical support forum from Analog Devices, Inc. It allows you direct access to ADI technical support engineers. You can search FAQs and technical information to get quick answers to your embedded processing and DSP design questions.

Use EngineerZone to connect with other DSP developers who face similar design challenges. You can also use this open forum to share knowledge and collaborate with the ADI support team and your peers. Visit http://ez.analog.com to sign up.

Notation Conventions

Text conventions in this manual are identified and described as follows.

Example: File > Close
Description: Titles in reference sections indicate the location of an item within the IDE environment’s menu system (for example, the Close command appears on the File menu).

Example: {this | that}
Description: Alternative required items in syntax descriptions appear within curly brackets and separated by vertical bars; read the example as this or that. One or the other is required.

Example: [this | that]
Description: Optional items in syntax descriptions appear within brackets and separated by vertical bars; read the example as an optional this or that.

Example: [this,…]
Description: Optional item lists in syntax descriptions appear within brackets delimited by commas and terminated with an ellipsis; read the example as an optional comma-separated list of this.

Example: .SECTION
Description: Commands, directives, keywords, and feature names are in text with letter gothic font.

Example: filename
Description: Non-keyword placeholders appear in text with italic style format.

Example: Note: For correct operation, ...
Description: A Note provides supplementary information on a related topic. In the online version of this book, the word Note appears instead of this symbol.

Example: Caution: Incorrect device operation may result if ... / Caution: Device damage may result if ...
Description: A Caution identifies conditions or inappropriate usage of the product that could lead to undesirable results or product damage. In the online version of this book, the word Caution appears instead of this symbol.

Example: Warning: Injury to device users may result if ...
Description: A Warning identifies conditions or inappropriate usage of the product that could lead to conditions that are potentially hazardous for device users. In the online version of this book, the word Warning appears instead of this symbol.

Register Diagram Conventions

Register diagrams use the following conventions:

• The descriptive name of the register appears at the top, followed by the short form of the name in parentheses.
• If the register is read-only (RO), write-1-to-set (W1S), or write-1-to-clear (W1C), this information appears under the name. Read/write is the default and is not noted. Additional descriptive text may follow.
• If any bits in the register do not follow the overall read/write convention, this is noted in the bit description after the bit name.
• If a bit has a short name, the short name appears first in the bit description, followed by the long name in parentheses.
• The reset value appears in binary in the individual bits and in hexadecimal to the right of the register.
• Bits marked x have an unknown reset value. Consequently, the reset value of registers that contain such bits is undefined or dependent on pin values at reset.
• Shaded bits are reserved.

Note: To ensure upward compatibility with future implementations, write back the value that is read for reserved bits in a register, unless otherwise specified.

The following figure shows an example of these conventions.

Figure 1. Register Diagram Example

Timer Configuration Registers (TIMERx_CONFIG), bits 15 through 0, Reset = 0x0000

ERR_TYP[1:0] (Error Type) - RO
  00 - No error.
  01 - Counter overflow error.
  10 - Period register programming error.
  11 - Pulse width register programming error.
EMU_RUN (Emulation Behavior Select)
  0 - Timer counter stops during emulation.
  1 - Timer counter runs during emulation.
TOGGLE_HI (PWM_OUT PULSE_HI Toggle Mode)
  0 - The effective state of PULSE_HI is the programmed state.
  1 - The effective state of PULSE_HI alternates each period.
CLK_SEL (Timer Clock Select)
  This bit must be set to 1 when operating the PPI in GP Output modes.
  0 - Use system clock SCLK for counter.
  1 - Use PWM_CLK to clock counter.
OUT_DIS (Output Pad Disable)
  0 - Enable pad in PWM_OUT mode.
  1 - Disable pad in PWM_OUT mode.
TMODE[1:0] (Timer Mode)
  00 - Reset state - unused.
  01 - PWM_OUT mode.
  10 - WDTH_CAP mode.
  11 - EXT_CLK mode.
PULSE_HI
  0 - Negative action pulse.
  1 - Positive action pulse.
PERIOD_CNT (Period Count)
  0 - Count to end of width.
  1 - Count to end of period.
IRQ_ENA (Interrupt Request Enable)
  0 - Interrupt request disable.
  1 - Interrupt request enable.
TIN_SEL (Timer Input Select)
  0 - Sample TMRx pin or PF1 pin.
  1 - Sample UART RX pin or PPI_CLK pin.

1 INTRODUCTION

Thank you for purchasing the Analog Devices SHARC® digital signal processor (DSP).

Overview – Why Floating-Point DSP?

A digital signal processor’s data format determines its ability to handle signals of differing precision, dynamic range, and signal-to-noise ratios. Because floating-point DSP math reduces the need for scaling and probability of overflow, using a floating-point DSP can ease algorithm and software development. The extent to which this is true depends on the floating-point processor’s architecture. Consistency with IEEE workstation simulations and the elimination of scaling are two clear ease-of-use advantages. High-level language programmability, large address spaces, and wide dynamic range allow system development time to be spent on algorithms and signal processing concerns, rather than assembly language coding, code paging, and error handling. The ADSP-21160 is a highly integrated, 32-bit floating-point DSP that provides many of these design advantages.

ADSP-21160 DSP Design Advantages

The ADSP-21160 processor is a high-performance 32-bit DSP for medical imaging, communications, military, audio, test equipment, 3D graphics, speech recognition, motor control, imaging, and other applications. This DSP builds on the ADSP-21000 DSP core to form a complete system-on-a-chip, adding a dual-ported on-chip SRAM, integrated I/O peripherals, and an additional processing element for Single-Instruction-Multiple-Data (SIMD) support.

SHARC is an acronym for Super Harvard Architecture. This DSP architecture balances a high performance processor core with high performance buses (PM, DM, IO). In the core, every instruction can execute in a single cycle.
The buses and instruction cache provide rapid, unimpeded data flow to the core to maintain the execution rate.

Figure 1-1 shows a detailed block diagram of the processor, illustrating the following architectural features:

• Two Processing Elements (PEx and PEy), each containing 32-bit IEEE floating-point computation units (multiplier, ALU, and shifter) and a data register file
• Program sequencer with related instruction cache, interval timer, and Data Address Generators (DAG1 and DAG2)
• Dual-ported SRAM
• External port for interfacing to off-chip memory, peripherals, hosts, and multiprocessor systems
• Input/Output (IO) processor with integrated DMA controller, serial ports, and link ports for point-to-point multiprocessor communications
• JTAG Test Access Port for emulation

Figure 1-1 also shows the three on-chip buses of the ADSP-21160: the Program Memory (PM) bus, Data Memory (DM) bus, and Input/Output (IO) bus. The PM bus provides access to either instructions or data. During a single cycle, these buses let the processor access two data operands (one from PM and one from DM), access an instruction (from the cache), and perform a DMA transfer.

The buses connect to the ADSP-21160 DSP’s external port, which provides the processor’s interface to external memory, memory-mapped I/O, a host processor, and additional multiprocessing ADSP-21160 DSPs. The external port performs bus arbitration and supplies control signals to shared, global memory and I/O devices.
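
The value of fetching two data operands per cycle can be seen in a filter kernel. The following is a minimal sketch in portable C, not code from this manual: the function name fir_sample and its parameters are hypothetical, and plain C does not express which memory block holds each array. On the DSP, keeping one array in PM and the other in DM is what would let both operands of each multiply arrive in the same cycle.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute one FIR output sample.
 * Every loop iteration consumes two data operands, one coefficient
 * and one sample, which is the access pattern the PM and DM buses
 * are described as serving in a single cycle.
 */
float fir_sample(const float *coeffs, const float *samples, size_t taps)
{
    assert(coeffs != NULL && samples != NULL);

    float acc = 0.0f;
    for (size_t k = 0; k < taps; ++k) {
        /* operand 1: coeffs[k], operand 2: samples[k] */
        acc += coeffs[k] * samples[k];
    }
    return acc;
}
```

Calling fir_sample with three coefficients and three samples returns the sum of the three products; how a compiler maps the arrays to memory blocks is tool-specific and outside the scope of this sketch.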

Figure 1-1. ADSP-21160 SHARC DSP Block Diagram
[Block diagram. Core processor: instruction cache (32 x 48-bit), DAG1 and DAG2 (8 x 4 x 32 each), program sequencer, timer, bus connect (PX), and two processing elements (PEx and PEy), each with a multiplier, ALU, barrel shifter, and 16 x 40-bit data register file. Dual-ported SRAM: two independent dual-ported blocks (Block 0 and Block 1), each with a processor port and an I/O port. Buses: PM address bus (32), DM address bus (32), PM data bus (48/64), DM data bus (32/40/64), IOA (32), IOD (64). External port: address bus mux (32), data bus mux (64), multiprocessor interface, host port. I/O processor: memory-mapped IOP registers (control, status, and data buffers), DMA controller, serial ports (2), link ports (6). JTAG test and emulation.]

Figure 1-2 illustrates a typical single-processor system. The ADSP-21160 DSP includes extensive support for multiprocessor systems as well. For more information, see “Multiprocessor (DSPs) Interface” on page 7-91.

Figure 1-2. ADSP-21160 Processor System
[System diagram. An ADSP-2116x with clock input (CLKIN, CLK_CFG3-0), flags (FLAG3-0), TIMEXP, interrupts (IRQ2-0), and JTAG, connected to optional link devices (6 max), optional serial devices on both serial ports, an optional boot EPROM, optional memory and peripherals on the address, data, and control buses, an optional DMA device (DMAR1-2, DMAG1-2), and an optional host processor interface (HBR, HBG, REDY).]

Further, the ADSP-21160 DSP addresses the five central requirements for DSPs:

• Fast, flexible arithmetic computation units
• Unconstrained data flow to and from the computation units
• Extended precision and dynamic range in the computation units
• Dual address generators with circular buffering support
• Efficient program sequencing

Fast, Flexible Arithmetic. The ADSP-21000 Family processors execute all instructions in a single cycle. They provide both fast cycle times and a complete set of arithmetic operations. The DSP is IEEE floating-point compatible and allows either interrupt on arithmetic exception or latched status exception handling.

Unconstrained Data Flow. The ADSP-21160 DSP has a Super Harvard Architecture combined with a 10-port data register file. In every cycle, the DSP can write or read two operands to or from the register file, supply two operands to the ALU, supply two operands to the multiplier, and receive three results from the ALU and multiplier. The processor’s 48-bit orthogonal instruction word supports parallel data transfers and arithmetic operations in the same instruction.

40-Bit Extended Precision. The DSP handles 32-bit IEEE floating-point format, 32-bit integer and fractional formats (twos-complement and unsigned), and extended-precision 40-bit floating-point format. The processors carry extended precision throughout their computation units, limiting intermediate data truncation errors.

Dual Address Generators. The DSP has two data address generators (DAGs) that provide immediate or indirect (pre- and post-modify) addressing. Modulus, bit-reverse, and broadcast operations are supported with no constraints on data buffer placement.

Efficient Program Sequencing. In addition to zero-overhead loops, the DSP supports single-cycle setup and exit for loops. Loops are both nestable (six levels in hardware) and interruptable.
The processors support both delayed and non-delayed branches.

ADSP-21160 DSP Architecture Overview

The ADSP-21160 DSP forms a complete system-on-a-chip, integrating a large, high-speed SRAM and I/O peripherals supported by a dedicated I/O bus. The following sections summarize the features of each functional block in the ADSP-21160 SHARC architecture, which appears in Figure 1-1 on page 1-3. With each summary, a cross reference points to the sections where the features are described in greater detail.

Processor Core

The processor core of the ADSP-21160 DSP consists of two processing elements (each with three computation units and a data register file), a program sequencer, two data address generators, a timer, and an instruction cache. All digital signal processing occurs in the processor core.

Processing Elements

The processor core contains two Processing Elements (PEx and PEy). Each element contains a data register file and three independent computation units: an ALU, a multiplier with a fixed-point accumulator, and a shifter. To meet a wide variety of processing needs, the computation units process data in three formats: 32-bit fixed-point, 32-bit floating-point, and 40-bit floating-point. The floating-point operations are single-precision IEEE-compatible. The 32-bit floating-point format is the standard IEEE format, whereas the 40-bit extended-precision format has eight additional Least Significant Bits (LSBs) of mantissa for greater accuracy.

The ALU performs a set of arithmetic and logic operations on both fixed-point and floating-point formats. The multiplier performs floating-point or fixed-point multiplication and fixed-point multiply/add or multiply/subtract operations. The shifter performs logical and arithmetic shifts, bit manipulation, field deposit and extraction, and exponent derivation operations on 32-bit operands.
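
The relationship between the 32-bit and 40-bit floating-point formats described above can be sketched in C. This is an illustration under stated assumptions, not code from this manual: the names widen_to_40 and narrow_to_32 are hypothetical, the host float type is assumed to be IEEE single precision, the 40-bit pattern is held in the low 40 bits of a 64-bit integer, and narrowing is modeled as plain truncation (the rounding behavior of the hardware is not modeled).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Widen a 32-bit IEEE single to a 40-bit pattern by appending eight
 * zero LSBs to the mantissa. The result occupies bits 39..0.
 */
uint64_t widen_to_40(float value)
{
    uint32_t bits32;

    assert(sizeof(float) == sizeof(uint32_t)); /* assumption: 32-bit float */
    memcpy(&bits32, &value, sizeof bits32);
    return (uint64_t)bits32 << 8;
}

/*
 * Narrow a 40-bit pattern back to a 32-bit single by dropping the
 * eight extra mantissa LSBs (truncation, an illustrative assumption).
 */
float narrow_to_32(uint64_t bits40)
{
    uint32_t bits32;
    float value;

    assert((bits40 >> 40) == 0); /* only bits 39..0 may be set */
    bits32 = (uint32_t)(bits40 >> 8);
    memcpy(&value, &bits32, sizeof value);
    return value;
}
```

A value that round-trips through widen_to_40 and narrow_to_32 is unchanged, while any information held in the eight extra LSBs is lost on narrowing. Carrying the 40-bit form through intermediate steps is what limits that loss.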

These computation units perform single-cycle operations; there is no computation pipeline. All units are connected in parallel, rather than serially. The output of any unit may serve as the input of any unit on the next cycle. In a multifunction computation, the ALU and multiplier perform independent, simultaneous operations.

Each processing element has a general-purpose data register file that transfers data between the computation units and the data buses and stores intermediate results. A register file has two sets (primary and alternate) of sixteen registers each, for fast context switching. All of the registers are 40 bits wide. The register file, combined with the core processor’s Harvard architecture, allows unconstrained data flow between computation units and internal memory.

Primary Processing Element (PEx). PEx processes all computational instructions whether the DSP is in Single-Instruction, Single-Data (SISD) or Single-Instruction, Multiple-Data (SIMD) mode. This element corresponds to the computational units and register file in previous ADSP-21000 DSPs.

Secondary Processing Element (PEy). PEy processes each computational instruction in lock-step with PEx, but only processes these instructions when the DSP is in SIMD mode. Because many operations are influenced by this mode, more information on SIMD is available in multiple locations:

• For information on PEy operations, see “Processing Elements”
• For information on data addressing in SIMD mode, see “Addressing in SISD and SIMD Modes” on page 4-18
• For information on data accesses in SIMD mode, see “SISD, SIMD, and Broadcast Load Modes” on page 5-44
• For information on multiprocessing in SIMD mode, see “Multiprocessor (DSPs) Interface” on page 7-91.
• For information on SIMD programming, see ADSP-21160 SHARC DSP Instruction Set Reference.
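
The lock-step behavior of PEx and PEy described above can be modeled in a few lines of C. This is a conceptual sketch only: the RegFile type and the add_regs function are hypothetical names, registers are modeled as float rather than 40-bit values, and mode selection is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 16 /* sixteen data registers per active register set */

/* Simplified model of one processing element's data register file. */
typedef struct {
    float r[NUM_REGS];
} RegFile;

static bool valid_reg(int n)
{
    return n >= 0 && n < NUM_REGS;
}

/*
 * Model of one add instruction, r[rn] = r[rx] + r[ry].
 * PEx always executes it. PEy executes the same operation on its own
 * registers, but only when SIMD mode is active.
 */
void add_regs(RegFile *pex, RegFile *pey, bool simd, int rn, int rx, int ry)
{
    assert(pex && pey);
    assert(valid_reg(rn) && valid_reg(rx) && valid_reg(ry));

    pex->r[rn] = pex->r[rx] + pex->r[ry];
    if (simd) {
        pey->r[rn] = pey->r[rx] + pey->r[ry];
    }
}
```

With simd set to false only the PEx register file changes; with simd set to true the same instruction produces a second, independent result in PEy from PEy's own operands.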
Program Sequence Control

Internal controls for the ADSP-21160 DSP’s program execution come from four functional blocks: program sequencer, data address generators, timer, and instruction cache. Two dedicated address generators and a program sequencer supply addresses for memory accesses. Together the sequencer and data address generators allow computational operations to execute with maximum efficiency since the computation units can be devoted exclusively to processing data. With its instruction cache, the ADSP-21160 DSP can simultaneously fetch an instruction from the cache and access two data operands from memory. The data address generators implement circular data buffers in hardware.

Program Sequencer. The program sequencer supplies instruction addresses to program memory. It controls loop iterations and evaluates conditional instructions. With an internal loop counter and loop stack, the ADSP-21160 DSP executes looped code with zero overhead. No explicit jump instructions are required to loop or decrement and test the counter.

The ADSP-21160 DSP achieves its fast execution rate by means of pipelined fetch, decode and execute cycles. If external memories are used, they are allowed more time to complete an access than if there were no decode cycle.

Data Address Generators. The data address generators (DAGs) provide memory addresses when data is transferred between memory and registers. Dual data address generators enable the processor to output simultaneous addresses for two operand reads or writes. DAG1 supplies 32-bit addresses to data memory. DAG2 supplies 32-bit addresses to program memory for program memory data accesses.

Each DAG keeps track of up to eight address pointers, eight modifiers and eight length values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access.
A length value may be associated with each pointer to perform automatic modulo addressing for circular data buffers; the circular buffers can be located at arbitrary boundaries in memory. Each DAG register has an alternate register that can be activated for fast context switching.

Circular buffers allow efficient implementation of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation.

Interrupts. The ADSP-21160 DSP has four external hardware interrupts: three general-purpose interrupts, IRQ2-0, and a special interrupt for reset. The processor also has internally generated interrupts for the timer, DMA controller operations, circular buffer overflow, stack overflows, arithmetic exceptions, multiprocessor vector interrupts, and user-defined software interrupts. For the general-purpose external interrupts and the internal timer interrupt, the ADSP-21160 DSP automatically stacks the arithmetic status and mode (MODE1) registers in parallel with the interrupt servicing, allowing fifteen nesting levels of very fast service for these interrupts.

Context Switch. Many of the processor’s registers have alternate registers that can be activated during interrupt servicing for a fast context switch. The data registers in the register file, the DAG registers, and the multiplier result register all have alternates. The Primary Registers are active at reset, while the Alternate (or Secondary) Registers are activated by control bits in a mode control register.

Timer. The programmable interval timer provides periodic interrupt generation. When enabled, the timer decrements a 32-bit count register every cycle.
When this count register reaches zero, the ADSP-21160 DSP generates an interrupt and asserts its timer expired output. The count register is automatically reloaded from a 32-bit period register and the count resumes immediately.

Instruction Cache. The program sequencer includes a 32-word instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective: only instructions whose fetches conflict with program memory data accesses are cached. This caching allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.

Processor Internal Buses

The processor core has six buses: PM address, PM data, DM address, DM data, IO address, and IO data. Because of the processor’s Harvard architecture, data memory stores data operands, while program memory can store both instructions and data. This architecture allows dual data fetches when the instruction is supplied by the cache.

Bus Capacities. The PM address bus and DM address bus transfer the addresses for instructions and data. The PM data bus and DM data bus transfer the data or instructions from each type of memory. The PM address bus is 32 bits wide, allowing access of up to 4 Gwords of mixed instructions and data. The PM data bus is 64 bits wide to accommodate the 48-bit instructions and 64-bit data.

The DM address bus is 32 bits wide, allowing direct access of up to 4 Gwords of data. The DM data bus is 64 bits wide. The DM data bus provides a path for the contents of any register in the processor to be transferred to any other register or to any data memory location in a single cycle. The data memory address comes from one of two sources: an absolute value specified in the instruction code (direct addressing) or the output of a data address generator (indirect addressing).
The IO address and IO data buses let the IO processor access internal memory for DMA without delaying the processor core. The IO address bus is 32 bits wide, and the IO data bus is 64 bits wide.

Data Transfers. Nearly every register in the processor core is classified as a Universal Register (UREG). Instructions allow transferring data between any two universal registers or between a universal register and memory. This support includes transfers between control registers, status registers, and data registers in the register file. The PM bus connect (PX) registers permit data to be passed between the 64-bit PM data bus and the 64-bit DM data bus or between the 40-bit register file and the PM data bus. These registers contain hardware to handle the data width difference. For more information, see “Processing Element Registers” on page A-15.

Processor Peripherals

The term processor peripherals refers to everything outside the processor core. The ADSP-21160 DSP’s peripherals include internal memory, external port, I/O processor, JTAG port, and any external devices that connect to the DSP.

Dual-Ported Internal Memory (SRAM)

The ADSP-21160 DSP contains 4 megabits of on-chip SRAM, organized as two blocks of 2 Mbits each, which can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle.

All of the memory can be accessed as 16-, 32-, 48-, or 64-bit words. On the ADSP-21160 DSP, the memory can be configured as a maximum of 128K words of 32-bit data, 256K words of 16-bit data, 80K words of 48-bit instructions (and 40-bit data), or combinations of different word sizes up to 4 megabits.
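The maximum word counts quoted above can be checked against the 4-megabit total with a few lines of arithmetic. The Python sketch below assumes that K means 1024 words and that a megabit means 1024 × 1024 bits; the names KWORD, MBIT, and bits_used are hypothetical.

```python
# Arithmetic check of the quoted memory configurations.
# Assumptions: 1K = 1024 words, 1 Mbit = 1024 * 1024 bits.
KWORD = 1024
MBIT = 1024 * 1024
TOTAL_BITS = 4 * MBIT


def bits_used(words: int, word_bits: int) -> int:
    """Return the number of bits occupied by `words` words of `word_bits` bits."""
    return words * word_bits


configs = {
    "32-bit data": (128 * KWORD, 32),
    "16-bit data": (256 * KWORD, 16),
    "48-bit instructions": (80 * KWORD, 48),
}

for name, (words, width) in configs.items():
    used = bits_used(words, width)
    print(f"{name}: {used / MBIT:.2f} Mbit of {TOTAL_BITS / MBIT:.0f} Mbit")
```

Under these assumptions the 32-bit and 16-bit configurations account for the full 4 megabits, while 80K words of 48 bits occupy 3.75 megabits. The text does not explain the difference here; it presumably follows from how 48-bit words map onto the physical memory organization, which the memory chapter describes.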
The DSP supports a 16-bit floating-point storage format, which effectively doubles the amount of data that may be stored on chip. Conversion between the 32-bit floating-point and 16-bit floating-point formats completes in a single instruction.

While each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one dedicated to each memory block, assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. The DSP also maintains single-cycle execution when one of the data operands is transferred to or from off-chip, using the DSP’s external port.

External Port

The ADSP-21160 DSP’s external port provides the processor’s interface to off-chip memory and peripherals. The 4-gigaword off-chip address space is included in the ADSP-21160 DSP’s unified address space. The separate on-chip buses (PM address, PM data, DM address, DM data, IO address, and IO data) multiplex at the external port to create an external system bus with a single 32-bit address bus and a single 64-bit data bus. External SRAM can be 16, 32, 48, or 64 bits wide; the DSP’s on-chip DMA controller automatically packs external data into the appropriate word width during transfers.

On-chip decoding of high-order address lines generates memory bank select signals for addressing external memory devices. Separate control lines support simplified addressing of page-mode DRAM. The ADSP-21160 DSP provides programmable memory waitstates and external memory acknowledge controls for interfacing to DRAM and peripherals with variable access, hold, and disable time requirements.

Host Processor Interface.
The ADSP-21160 DSP’s host interface allows easy connection to standard microprocessor buses, both 16-bit and 32-bit, with little additional hardware required. The interface supports asynchronous and synchronous transfers at speeds up to half the internal clock rate of the ADSP-21160 DSP. The host interface operates through the DSP’s external port and maps into the unified address space. Four channels of DMA are available for the host interface; code and data transfers occur with low software overhead. The host can directly read and write the internal memory of the ADSP-21160 DSP and can access the DMA channel setup and mailbox registers. Vector interrupt support provides for efficient execution of host commands.

Multiprocessor System Interface. The ADSP-21160 DSP offers powerful features tailored to multiprocessing DSP systems. The unified address space allows direct interprocessor accesses of each ADSP-21160 DSP’s internal memory. Distributed bus arbitration logic on the DSP allows simple, glueless connection of systems containing up to six ADSP-21160 DSPs and a host processor. Master processor changeover incurs only one cycle of overhead. Bus arbitration handles either fixed or rotating priority. Processor bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt capability is provided for interprocessor commands. Broadcast writes allow simultaneous transmission of data to all ADSP-21160 DSPs and can be used to implement reflective semaphores.

I/O Processor

The ADSP-21160 DSP’s Input/Output Processor (IOP) includes two serial ports, six link ports, and a DMA controller. One of the I/O processes that the IO processor automates is booting. The DSP can boot from the external port (with data from an 8-bit EPROM or a host processor) or a link port. Alternatively, a no-boot mode lets the DSP start by executing instructions from external memory without booting.

Serial Ports.
The ADSP-21160 DSP features two synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at up to half the processor core clock rate. Independent transmit and receive functions provide greater flexibility for serial communications. Serial port data can automatically transfer to and from on-chip memory using DMA. Each of the serial ports offers a TDM multichannel mode and supports μ-law or A-law companding.

The serial ports can operate with little-endian or big-endian transmission formats, with word lengths selectable from 3 to 32 bits. They offer selectable synchronization and transmit modes. Serial port clocks and frame syncs can be internally or externally generated.

Link Ports. The ADSP-21160 DSP features six 8-bit link ports that provide additional I/O capabilities. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously. The data packs into 32-bit or 48-bit words, which the processor core can directly read or the IO processor can DMA-transfer to on-chip memory. Clock/acknowledge handshaking controls link port transfers. Transfers are programmable as either transmit or receive.

DMA Controller. The ADSP-21160 DSP’s on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently and invisibly to the processor core to enable DMA operations to occur while the core is simultaneously executing its program. Both code and data can be downloaded to the ADSP-21160 DSP using DMA transfers.

DMA transfers can occur between the ADSP-21160 DSP’s internal memory and external memory, external peripherals, or a host processor.
DMA transfers can also occur between the ADSP-21160 DSP’s internal memory and its serial ports or link ports. DMA transfers between external memory and external peripheral devices are another option. External bus packing to 16-, 32-, 48-, or 64-bit words is automatically performed during DMA transfers.

Fourteen channels of DMA are available on the ADSP-21160 DSP: six over the link ports, four over the serial ports, and four over the processor’s external port. The external port DMA channels serve for host processor, other ADSP-21160 DSPs, memory, or I/O transfers.

JTAG Port

The JTAG port on the ADSP-21160 DSP supports the IEEE standard 1149.1 Joint Test Action Group (JTAG) standard for system test. This standard defines a method for serially scanning the I/O status of each component in a system. Emulators use the JTAG port to monitor and control the DSP during emulation. Emulators using this port provide full-speed emulation with access to inspect and modify memory, registers, and processor stacks. JTAG-based emulation is non-intrusive and does not affect target system loading or timing.

Development Tools

The processor is supported by a complete set of software and hardware development tools, including Analog Devices’ emulators and the CrossCore Embedded Studio or VisualDSP++ development environment. (The emulator hardware that supports other Analog Devices processors also emulates the processor.)

The development environments support advanced application code development and debug with features such as:

• Create, compile, assemble, and link application programs written in C++, C, and assembly
• Load, run, step, halt, and set breakpoints in application programs
• Read and write data and program memory
• Read and write core and peripheral registers
• Plot memory

Analog Devices DSP emulators use the IEEE 1149.1 JTAG test access port to monitor and control the target board processor during emulation.
The emulator provides full-speed emulation, allowing inspection and modification of memory, registers, and processor stacks. Nonintrusive in-circuit emulation is assured by the use of the processor JTAG interface; the emulator does not affect target system loading or timing.

Software tools also include Board Support Packages (BSPs). Hardware tools also include standalone evaluation systems (boards and extenders). In addition to the software and hardware development tools available from Analog Devices, third parties provide a wide range of tools supporting the SHARC processors. Third party software tools include DSP libraries, real-time operating systems, and block diagram design tools.

Differences From Previous SHARC DSPs

This section identifies differences between the ADSP-21160 DSP and previous SHARC DSPs: ADSP-21060, ADSP-21061, and ADSP-21062 processors. The ADSP-21160 DSP preserves much of the ADSP-2106x architecture, while extending performance and functionality. For background information on SHARC and the ADSP-2106x DSPs, see ADSP-2106x SHARC User’s Manual.

Processor Core Enhancements

Computational bandwidth on the ADSP-21160 DSP is significantly greater than on the ADSP-2106x DSPs. The increase comes from raising the operational frequency and adding another processing element: ALU, Shifter, Multiplier, and register file. The new processing element lets the DSP process multiple data streams in parallel (SIMD mode).

The program sequencer on the ADSP-21160 DSP differs from the ADSP-2106x DSP family, having several enhancements: new interrupt vector table definitions, SIMD mode stack and conditional execution model, and instruction decodes associated with new instructions. Changes to interrupts include new interrupt vectors for detecting illegal memory accesses and supporting new unshared DMA channels.
Link port interrupt control has moved to a new register to support the additional DMA channels. Also, new mode stack and mode mask support has been added to improve context switch time.

Data address generators on the ADSP-21160 DSP differ from the ADSP-2106x DSPs in that DAG2 (for the PM bus) has the same addressing capability as DAG1 (for the DM bus). The DAG registers are read/writable in pairs, moving 64 bits per cycle. Additionally, the DAGs support the new memory map and Long Word transfer capability. Circular buffering on the ADSP-21160 DSP can be quickly disabled on interrupts and restored on the return. Data “broadcast”, from one memory location to both data register files, is determined by appropriate index register usage.

Unlike previous SHARCs, the ADSP-21160 DSP has a global circular buffering enable (CBUFEN) bit. Because at reset this bit defaults to disabled, programs that use circular buffering and are being ported from previous SHARCs need to add a line of code to enable circular buffering. For more information, see “Addressing Circular Buffers” on page 4-12.

Processor Internal Bus Enhancements

The PM, DM, and IO data buses on the ADSP-21160 DSP are much wider than on the ADSP-2106x DSPs, increasing to 64 bits. Additional multiplexing and control logic on the ADSP-21160 DSP enables 16-, 32-, or 64-bit wide moves between both register files and memory. The ADSP-21160 DSP also has the capability of broadcasting a single memory location to each of the register files in parallel. Also, the ADSP-21160 DSP permits register contents to be exchanged between the two processing elements’ register files in a single cycle.

Memory Organization Enhancements

The ADSP-21160 memory map differs from the ADSP-2106x DSPs.
The system memory map on the ADSP-21160 DSP supports double-word transfers each cycle, reflects extended internal memory capacity for derivative designs, and works with updated control registers for SIMD support.

External Port Enhancements

The ADSP-21160 DSP’s external port differs from the ADSP-2106x DSPs, greatly extending the external interface. The data bus on the ADSP-21160 DSP is 64 bits wide. The ADSP-21160 DSP has a new synchronous interface that improves local bus switching frequency. Also, burst support on the ADSP-21160 DSP improves bus usage.

Unlike previous SHARC DSPs, the ADSP-21160 DSP sets the buffer hang disable (BHD) bit at reset. Because this bit prevents the processor core from detecting a buffer-related stall condition, programs that use external port, link port, or serial port I/O and are being ported from previous SHARC DSPs need to add a line of code to disable BHD. For more information, see the BHD discussion on page 6-14.

Host Interface Enhancements

The ADSP-21160’s host interface differs from the ADSP-2106x DSPs in that this interface can take advantage of the 64-bit data bus width. Though the ADSP-21160 DSP supports the ADSP-2106x’s asynchronous host interface protocols, the ADSP-21160 DSP also provides new synchronous interface protocols for maximum throughput.

The host/local bus deadlock resolution function on the ADSP-21160 DSP is extended to the DMA controller. The function allows the host (or bridge) logic to force the local bus to back off and allow the host to complete its operation first.

Multiprocessor Interface Enhancements

The ADSP-21160’s multiprocessor system interface supports greater throughput than the ADSP-2106x DSPs.
The throughput between ADSP-21160 DSPs in a multiprocessing application increases because of the wider 64-bit shared data bus, new shared bus transfer protocols, shared bus cycle time improvements from the synchronous interface, and improvements in Link Port throughput.

The external port supports glueless multiprocessing, with distributed arbitration for up to six ADSP-21160 DSPs.

IO Architecture Enhancements

The IO processor on the ADSP-21160 DSP provides much greater throughput than the ADSP-2106x DSPs. The Link Ports and DMA controller differ on the ADSP-21160 DSP.

DMA Controller Enhancements

The ADSP-21160’s DMA controller supports 14 channels (versus 10 on the ADSP-2106x DSPs), with no channel sharing. New packing modes support the new 64-bit external/internal busing. To resolve potential deadlock scenarios, the ADSP-21160’s DMA controller relinquishes the local bus in a similar fashion to the processor core when host logic asserts both HBR and SBTS.

Link Port Enhancements

The ADSP-21160’s Link ports provide greater throughput than the ADSP-2106x DSPs. The link port data bus on the ADSP-21160 DSP is 8 bits wide (versus 4 bits on the ADSP-2106x DSPs). Link port clock control on the ADSP-21160 supports a wider frequency range.

Instruction Set Enhancements

The ADSP-21160 DSP provides source code compatibility with the previous SHARC family members, to the application assembly source code level. All instructions, control registers, and system resources available in the ADSP-2106x core programming model are available in the ADSP-21160 DSP.
New instructions, control registers, or other facilities required to support the new feature set of the ADSP-21160 processor core are:

• Supersets of the ADSP-2106x programming model
• Reserved facilities in the ADSP-2106x programming model
• Symbol name changes from the ADSP-2106x programming model

These name changes can be managed through re-assembly using the ADSP-21160 DSP’s development tools to apply the ADSP-21160 symbol definitions header file and linker description file. While these changes have no direct impact on existing core applications, system and I/O processor initialization code and control code do require modifications.

This approach simplifies porting of source code written for the ADSP-2106x DSPs to the ADSP-21160 DSP. Code changes will be required to take full advantage of the new ADSP-21160 DSP features. For more information, see ADSP-21160 SHARC DSP Instruction Set Reference.

2 PROCESSING ELEMENTS

The DSP’s Processing Elements (PEx and PEy) perform numeric processing for DSP algorithms. Each processing element contains a data register file and three computation units: an arithmetic/logic unit (ALU), a multiplier, and a shifter. Computational instructions for these elements include both fixed-point and floating-point operations, and each computational instruction can execute in a single cycle.

Overview

The computational units in a processing element handle different types of operations. The ALU performs arithmetic and logic operations on fixed-point and floating-point data. The multiplier does floating-point and fixed-point multiplication and executes fixed-point multiply/add and multiply/subtract operations. The shifter completes logical shifts, arithmetic shifts, bit manipulation, field deposit, and field extraction operations on 32-bit operands. Also, the shifter can derive exponents.

Data flow paths through the computational units are arranged in parallel, as shown in Figure 2-1.
The output of any computation unit may serve as the input of any computation unit on the next instruction cycle. Data moving in and out of the computational units goes through a 10-port register file, consisting of sixteen primary registers and sixteen alternate registers. Two ports on the register file connect to the PM and DM data buses, allowing data transfer between the computational units and memory (and anything else) connected to these buses.

The DSP’s assembly language provides access to the data register files in both processing elements. The syntax lets programs move data to and from these registers and specify a computation’s data format at the same time with naming conventions for the registers. For information on the data register names, see “Data Register File” on page 2-28.

Figure 2-1 provides a graphical guide to the other topics in this chapter. First, a description of the MODE1 register shows how to set rounding, data format, and other modes for the processing elements. Next, an examination of each computational unit provides details on operation and a summary of computational instructions. Outside the computational units, details on register files and data buses identify how to flow data for computations. Finally, details on the DSP’s advanced parallelism reveal how to take advantage of multifunction instructions and SIMD mode.

Setting Computational Modes

The MODE1 register controls the operating mode of the processing elements. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control computational modes:

• Floating-point data format. Bit 16 (RND32) directs the computational units to round floating-point data to 32 bits (if 1) or round to 40 bits (if 0)
• Rounding mode. Bit 15 (TRUNC) directs the computational units to round results with round-to-zero (if 1) or round-to-nearest (if 0)
• ALU saturation.
Bit 13 (ALUSAT) directs the computational units to saturate results on positive or negative fixed-point overflows (if 1) or return unsaturated results (if 0)
• Short word sign extension. Bit 14 (SSE) directs the computational units to sign extend short-word, 16-bit data (if 1) or zero-fill the upper 16 bits (if 0)
• Secondary processor element (PEy). Bit 21 (PEYEN) enables computations in PEy, SIMD mode (if 1), or disables PEy, SISD mode (if 0)

[Figure 2-1. Computation Units. Block diagram showing the PM data bus, DM data bus, and MODE1 register; the register file (16 x 40-bit, R0 through R15); the multiplier, shifter, and ALU with their X, Y, and Z inputs; the multiplier result registers MR2F, MR1F, and MR0F; and the ASTATx and STKYx status outputs to the program sequencer.]

32-bit (Normal Word) Floating-Point Format

In the default mode of the DSP (RND32 bit=1), the multiplier and ALU support a single-precision floating-point format, which is specified in the IEEE 754/854 standard. For more information on this standard, see “Numeric Formats”. This format is IEEE 754/854 compatible for single-precision floating-point operations in all respects except that:

• The DSP does not provide inexact flags.
• NAN (“Not-A-Number”) inputs generate an invalid exception and return a quiet NAN (all 1s).
• Denormal operands flush to zero when input to a computation unit and do not generate an underflow exception. Any denormal or underflow result from an arithmetic operation flushes to zero and generates an underflow exception.
• The DSP supports round to nearest and round toward zero modes, but does not support round to +Infinity and round to -Infinity.

IEEE single-precision floating-point data uses a 23-bit mantissa with an 8-bit exponent plus sign bit. In this case, the computation unit sets the eight LSBs of floating-point inputs to zeros before performing the operation.
The mantissa of a result rounds to 23 bits (not including the hidden bit), and the 8 LSBs of the 40-bit result clear to zeros to form a 32-bit number, which is equivalent to the IEEE standard result.

In fixed-point to floating-point conversion, the rounding boundary is always 40 bits even if the RND32 bit is set.

40-bit Floating-Point Format

When in extended precision mode (RND32 bit=0), the DSP supports a 40-bit extended precision floating-point mode, which has eight additional LSBs of the mantissa and is compliant with the IEEE 754/854 standards; however, results in this format are more precise than the IEEE single-precision standard specifies. Extended-precision floating-point data uses a 31-bit mantissa with an 8-bit exponent plus sign bit.

16-bit (Short Word) Floating-Point Format

The DSP supports a 16-bit floating-point storage format and provides instructions that convert the data for 40-bit computations. The 16-bit floating-point format uses an 11-bit mantissa with a 4-bit exponent plus sign bit. The 16-bit data goes into bits 23 through 8 of a data register. Two shifter instructions, Fpack and Funpack, perform the packing and unpacking conversions between 32-bit floating-point words and 16-bit floating-point words. The Fpack instruction converts a 32-bit IEEE floating-point number in a data register into a 16-bit floating-point number. Funpack converts a 16-bit floating-point number in a data register into a 32-bit IEEE floating-point number. Each instruction executes in a single cycle.

When 16-bit data is written to bits 23 through 8 of a data register, the DSP automatically extends the data into a 32-bit integer (bits 39 through 8). If the SSE bit in MODE1 is set (1), the DSP sign extends the upper 16 bits. If the SSE bit is cleared (0), the DSP zeros the upper 16 bits.

The 16-bit floating-point format supports gradual underflow. This method sacrifices precision for dynamic range.
When packing a number that would have underflowed, the exponent clears to zero and the mantissa (including “hidden” 1) right-shifts the appropriate amount. The packed result is a denormal, which can be unpacked into a normal IEEE floating-point number.

32-Bit Fixed-Point Format

The DSP always represents fixed-point numbers in 32 bits, occupying the 32 MSBs in 40-bit data registers. Fixed-point data may be fractional or integer numbers and unsigned or twos-complement. Each computational unit has its own limitations on how these formats may be mixed for a given operation. All computational units read the upper 32 bits of data (inputs, operands) from the 40-bit registers (ignoring the 8 LSBs) and write results to the upper 32 bits (zeroing the 8 LSBs).

Rounding Mode

The TRUNC bit in the MODE1 register determines the rounding mode for all ALU operations, all floating-point multiplies, and fixed-point multiplies of fractional data. The DSP supports two modes of rounding: round-toward-zero and round-toward-nearest. The rounding modes comply with the IEEE 754 standard and have the following definitions:

• Round-Toward-Zero (TRUNC bit=1). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to zero. This is equivalent to truncation.
• Round-Toward-Nearest (TRUNC bit=0). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to the result before rounding. If the result before rounding is exactly halfway between two numbers in the destination format (differing by an LSB), the rounded result is the number that has an LSB equal to zero. Statistically, rounding up occurs as often as rounding down, so there is no large sample bias.
Because the maximum floating-point value is one LSB less than the value that represents Infinity, a result that is halfway between the maximum floating-point value and Infinity rounds to Infinity in this mode.

Though these rounding modes comply with standards set for floating-point data, they also apply for fixed-point multiplier operations on fractional data. The same two rounding modes are supported, but only the round-to-nearest operation is actually performed by the multiplier. Using its local result register for fixed-point operations, the multiplier rounds to zero by reading only the upper bits of the result and discarding the lower bits.

Using Computational Status

The multiplier and ALU each provide exception information when executing floating-point operations. Each unit updates overflow, underflow, and invalid operation flags in the processing element’s arithmetic status (ASTATx and ASTATy) register and sticky status (STKYx and STKYy) register. An underflow, overflow, or invalid operation from any unit also generates a maskable interrupt. There are three ways to use floating-point exceptions from computations in program sequencing:

• Interrupts. Enable interrupts and use an interrupt service routine to handle the exception condition immediately. This method is appropriate if it is important to correct all exceptions as they occur.
• ASTATx and ASTATy registers. Use conditional instructions to test the exception flags in the ASTATx or ASTATy register after the instruction executes. This method permits monitoring each instruction’s outcome.
• STKYx and STKYy registers. Use the Bit Tst instruction to examine exception flags in the STKY register after a series of operations. If any flags are set, some of the results are incorrect. This method is useful when exception handling is not critical.

More information on ASTAT and STKY status appears in the sections that describe the computational units.
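As an illustration only, the following Python sketch models the difference between an arithmetic status flag, which follows the most recent operation, and its sticky counterpart, which accumulates until a program clears it. This is a host-side model and not DSP code: the class and function names are hypothetical, and only the ALU overflow pair is modeled (AV, bit 1 of ASTATx,y, and AVS, bit 1 of STKYx,y, as given in the ALU status flag lists in this chapter).

```python
AV = 1 << 1    # ASTAT bit 1: ALU overflow, reflects the most recent operation
AVS = 1 << 1   # STKY bit 1: ALU floating-point overflow, sticky


class StatusModel:
    """Toy model of one processing element's ASTAT and STKY registers."""

    def __init__(self):
        self.astat = 0
        self.stky = 0

    def record_float_alu_op(self, overflowed):
        """Update the flags the way a floating-point ALU operation would."""
        if overflowed:
            self.astat |= AV     # arithmetic flag follows this operation
            self.stky |= AVS     # sticky flag is set and stays set
        else:
            self.astat &= ~AV    # arithmetic flag is cleared again
            # the sticky flag is deliberately left untouched

    def clear_sticky(self):
        """Model an explicit program write that clears the sticky flags."""
        self.stky = 0


def run_sequence(outcomes):
    """Apply a list of overflow outcomes; return (AV set?, AVS set?)."""
    model = StatusModel()
    for overflowed in outcomes:
        model.record_float_alu_op(overflowed)
    return bool(model.astat & AV), bool(model.stky & AVS)
```

In this model, a sequence in which only the middle operation overflows ends with AV clear and AVS set, which is the situation the STKY method is meant to detect after a series of operations.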
For summaries relating instructions and status bits, see Table 2-1 on page 2-11, Table 2-2 on page 2-12, Table 2-4 on page 2-19, Table 2-6 on page 2-20, and Table 2-7 on page 2-26.

Arithmetic Logic Unit (ALU)

The ALU performs arithmetic operations on fixed-point or floating-point data and logical operations on fixed-point data. ALU fixed-point instructions operate on 32-bit fixed-point operands and output 32-bit fixed-point results. ALU floating-point instructions operate on 32-bit or 40-bit floating-point operands and output 32-bit or 40-bit floating-point results. ALU instructions include:

• Floating-point addition, subtraction, add/subtract, average
• Fixed-point addition, subtraction, add/subtract, average
• Floating-point manipulation: binary log, scale, mantissa
• Fixed-point add with carry, subtract with borrow, increment, decrement
• Logical And, Or, Xor, Not
• Functions: Abs, pass, min, max, clip, compare
• Format conversion
• Reciprocal and reciprocal square root primitives

ALU Operation

ALU instructions take one or two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. Most ALU operations return one result; in add/subtract operations, the ALU operation returns two results, and in compare operations, the ALU operation returns no result (only flags are updated). ALU results can be returned to any location in the register file.

The DSP transfers input operands from the register file during the first half of the cycle and transfers results to the register file during the second half of the cycle. With this arrangement, the ALU can read and write the same register file location in a single cycle. If the ALU operation is fixed-point, the inputs are treated as 32-bit fixed-point operands. The ALU transfers the upper 32 bits from the source location in the register file.
For fixed-point operations, the result(s) are always 32-bit fixed-point values. Some floating-point operations (Logb, Mant and Fix) can also yield fixed-point results. The DSP transfers fixed-point results to the upper 32 bits of the data register and clears the lower eight bits of the register.

The format of fixed-point operands and results depends on the operation. In most arithmetic operations, there is no need to distinguish between integer and fractional formats. Fixed-point inputs to operations such as scaling a floating-point value are treated as integers. For purposes of determining status such as overflow, fixed-point arithmetic operands and results are treated as twos-complement numbers.

ALU Saturation

When the ALUSAT bit is set (1) in the MODE1 register, the ALU is in saturation mode. In this mode, all positive fixed-point overflows return the maximum positive fixed-point number (0x7FFF FFFF), and all negative overflows return the maximum negative number (0x8000 0000). When the ALUSAT bit is cleared (0) in the MODE1 register, fixed-point results that overflow are not saturated; the upper 32 bits of the result are returned unaltered. The ALU overflow flag reflects the ALU result before saturation.

ALU Status Flags

ALU operations update seven status flags in the processing element’s Arithmetic Status (ASTATx and ASTATy) register. Table A-4 on page A-9 lists all the bits in these registers. The following bits in ASTATx or ASTATy flag ALU status (a 1 indicates the condition) for the most recent ALU operation:

• ALU result zero or floating-point underflow. Bit 0 (AZ)
• ALU overflow. Bit 1 (AV)
• ALU result negative. Bit 2 (AN)
• ALU fixed-point carry. Bit 3 (AC)
• ALU X input sign for Abs, Mant operations. Bit 4 (AS)
• ALU floating-point invalid operation. Bit 5 (AI)
• Last ALU operation was a floating-point operation. Bit 10 (AF)
• Compare Accumulation register results of last 8 compare operations. Bits 31-24 (CACC)

ALU operations also update four “sticky” status flags in the processing element’s Sticky status (STKYx and STKYy) register. Table A-5 on page A-13 lists all the bits in these registers. The following bits in STKYx or STKYy flag ALU status (a 1 indicates the condition). Once set, a sticky flag remains high until explicitly cleared:

• ALU floating-point underflow. Bit 0 (AUS)
• ALU floating-point overflow. Bit 1 (AVS)
• ALU fixed-point overflow. Bit 2 (AOS)
• ALU floating-point invalid operation. Bit 5 (AIS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky status register explicitly in the same cycle that the ALU is performing an operation, the explicit write to the status register supersedes any flag update from the ALU operation.

ALU Instruction Summary

Table 2-1 and Table 2-2 list the ALU instructions and how they relate to ASTATx,y and STKYx,y flags. For more information on assembly language syntax, see ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols:

• Rn, Rx, Ry indicate a register file location; treated as fixed-point
• Fn, Fx, Fy indicate a register file location; treated as floating-point
• * indicates that the flag may be set or cleared, depending on results of instruction
• ** indicates that the flag may be set (but not cleared), depending on results of instruction
• – indicates no effect

Table 2-1. Fixed-Point ALU Instruction Summary

Instruction             ASTATx,y Status Flags                   STKYx,y Status Flags
Fixed-point:            AZ   AV   AN   AC   AS   AI   AF   CACC AUS  AVS  AOS  AIS
Rn = Rx + Ry            *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx - Ry            *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx + Ry + CI       *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx - Ry + CI - 1   *    *    *    *    0    0    0    –    –    –    **   –
Rn = (Rx + Ry)/2        *    0    *    *    0    0    0    –    –    –    –    –
COMP(Rx, Ry)            *    0    *    0    0    0    0    *    –    –    –    –
COMPU(Rx, Ry)           *    0    *    0    0    0    0    *    –    –    –    –
Rn = Rx + CI            *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx + CI - 1        *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx + 1             *    *    *    *    0    0    0    –    –    –    **   –
Rn = Rx - 1             *    *    *    *    0    0    0    –    –    –    **   –
Rn = -Rx                *    *    *    *    0    0    0    –    –    –    **   –
Rn = ABS Rx             *    *    0    0    *    0    0    –    –    –    **   –
Rn = PASS Rx            *    0    *    0    0    0    0    –    –    –    –    –
Rn = Rx AND Ry          *    0    *    0    0    0    0    –    –    –    –    –
Rn = Rx OR Ry           *    0    *    0    0    0    0    –    –    –    –    –
Rn = Rx XOR Ry          *    0    *    0    0    0    0    –    –    –    –    –
Rn = NOT Rx             *    0    *    0    0    0    0    –    –    –    –    –
Rn = MIN(Rx, Ry)        *    0    *    0    0    0    0    –    –    –    –    –
Rn = MAX(Rx, Ry)        *    0    *    0    0    0    0    –    –    –    –    –
Rn = CLIP Rx BY Ry      *    0    *    0    0    0    0    –    –    –    –    –

Table 2-2. Floating-Point ALU Instruction Summary

Instruction             ASTATx,y Status Flags                   STKYx,y Status Flags
Floating-point:         AZ   AV   AN   AC   AS   AI   AF   CACC AUS  AVS  AOS  AIS
Fn = Fx + Fy            *    *    *    0    0    *    1    –    **   **   –    **
Fn = Fx - Fy            *    *    *    0    0    *    1    –    **   **   –    **
Fn = ABS (Fx + Fy)      *    *    0    0    0    *    1    –    **   **   –    **
Fn = ABS (Fx - Fy)      *    *    0    0    0    *    1    –    **   **   –    **
Fn = (Fx + Fy)/2        *    0    *    0    0    *    1    –    **   –    –    **
COMP(Fx, Fy)            *    0    *    0    0    *    1    *    –    –    –    **
Fn = -Fx                *    *    *    0    0    *    1    –    –    **   –    **
Fn = ABS Fx             *    *    0    0    *    *    1    –    –    **   –    **
Fn = PASS Fx            *    0    *    0    0    *    1    –    –    –    –    **
Fn = RND Fx             *    *    *    0    0    *    1    –    –    **   –    **
Fn = SCALB Fx BY Ry     *    *    *    0    0    *    1    –    **   **   –    **
Rn = MANT Fx            *    *    0    0    *    *    1    –    –    **   –    **
Rn = LOGB Fx            *    *    *    0    0    *    1    –    –    **   –    **
Rn = FIX Fx BY Ry       *    *    *    0    0    *    1    –    **   **   –    **
Rn = FIX Fx             *    *    *    0    0    *    1    –    **   **   –    **
Fn = FLOAT Rx BY Ry     *    *    *    0    0    0    1    –    **   **   –    –
Fn = FLOAT Rx           *    0    *    0    0    0    1    –    –    –    –    –
Fn = RECIPS Fx          *    *    *    0    0    *    1    –    **   **   –    **
Fn = RSQRTS Fx          *    *    *    0    0    *    1    –    –    **   –    **
Fn = Fx COPYSIGN Fy     *    0    *    0    0    *    1    –    –    –    –    **
Fn = MIN(Fx, Fy)        *    0    *    0    0    *    1    –    –    –    –    **
Fn = MAX(Fx, Fy)        *    0    *    0    0    *    1    –    –    –    –    **
Fn = CLIP Fx BY Fy      *    0    *    0    0    *    1    –    –    –    –    **

Multiply-Accumulator (Multiplier)

The multiplier performs fixed-point or floating-point multiplication and fixed-point multiply/accumulate operations. Fixed-point multiply/accumulates are available with either cumulative addition or cumulative subtraction. Multiplier floating-point instructions operate on 32-bit or 40-bit floating-point operands and output 32-bit or 40-bit floating-point results. Multiplier fixed-point instructions operate on 32-bit fixed-point data and produce 80-bit results. Inputs are treated as fractional or integer, unsigned or twos-complement. Multiplier instructions include:

• Floating-point multiplication
• Fixed-point multiplication
• Fixed-point multiply/accumulate with addition, rounding optional
• Fixed-point multiply/accumulate with subtraction, rounding optional
• Rounding result register
• Saturating result register
• Clearing result register

Multiplier Operation

The multiplier takes two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. The multiplier can accumulate fixed-point results in the local Multiplier Result (MRF) registers or write results back to the register file. The results in MRF can also be rounded or saturated in separate operations.
Floating-point multiplies yield floating-point results, which the multiplier always writes directly to the register file.

The multiplier transfers input operands during the first half of the cycle and transfers results during the second half of the cycle. With this arrangement, the multiplier can read and write the same register file location in a single cycle.

For fixed-point multiplies, the multiplier reads the inputs from the upper 32 bits of the data registers. Fixed-point operands may be either both in integer format or both in fractional format. The format of the result matches the format of the inputs. Each fixed-point operand may be either an unsigned or a twos-complement number. If both inputs are fractional and signed, the multiplier automatically shifts the result left one bit to remove the redundant sign bit. The register name(s) within the multiplier instruction specify input data type(s): Fx for floating-point and Rx for fixed-point.

Multiplier (Fixed-Point) Result Register

Fixed-point operations place 80-bit results in the multiplier’s foreground MRF register or background MRB register, depending on which is active. For more information on selecting the result register, see “Alternate (Secondary) Data Registers” on page 2-30.

The location of a result in the MRF register’s 80-bit field depends on whether the result is in fractional or integer format, as shown in Figure 2-2. If the result is sent directly to a data register, the 32-bit result with the same format as the input data is transferred, using bits 63-32 for a fractional result or bits 31-0 for an integer result. The eight LSBs of the 40-bit register file location are zero-filled.

Fractional results can be rounded-to-nearest before being sent to the register file. If rounding is not specified, discarding bits 31-0 effectively truncates a fractional result (rounds to zero).
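The result placement described above can be sketched with a small host-side model. The Python below is not DSP code and makes several simplifying assumptions: both inputs are signed (twos-complement), the 80-bit MRF contents are held in a plain integer, and the function names are hypothetical. It shows the one-bit left shift applied to signed fractional products and the selection of bits 63-32 (fractional) or bits 31-0 (integer) for transfer to a data register.

```python
MASK32 = 0xFFFFFFFF
MASK80 = (1 << 80) - 1


def to_signed32(value):
    """Interpret a 32-bit pattern as a twos-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mrf_product(rx, ry, fractional):
    """Return the 80-bit MRF image of a signed 32-bit by 32-bit multiply."""
    product = to_signed32(rx) * to_signed32(ry)
    if fractional:
        product <<= 1           # remove the redundant sign bit
    return product & MASK80     # keep 80 bits in twos-complement form


def register_result(mrf, fractional):
    """Return the 32 bits sent to a data register: bits 63-32 or 31-0."""
    shift = 32 if fractional else 0
    return (mrf >> shift) & MASK32
```

For example, with fractional inputs 0x4000 0000 (0.5) the model returns 0x2000 0000 (0.25), and with integer inputs 3 and 5 it returns 15. Unsigned and mixed-sign inputs are not covered by this sketch.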
For more information on rounding, see “Rounding Mode” on page 2-6.

Bits:                79-64       63-32               31-0
Register:            MR2F        MR1F                MR0F
Fractional result:   Overflow    Fractional result   Underflow
Integer result:      Overflow    Overflow            Integer result

Figure 2-2. Multiplier Fixed-Point Result Placement

The MRF register is divided into MR2F, MR1F, and MR0F registers, which can be individually read from or written to the register file. Each of these registers has the same format. When data is read from MR2F, it is sign-extended to 32 bits as shown in Figure 2-3. The DSP zero fills the eight LSBs of the 40-bit register file location when data is read from MR2F, MR1F, or MR0F to the register file. When the DSP writes data into MR2F, MR1F, or MR0F from the 32 MSBs of a register file location, the eight LSBs are ignored. Data written to MR1F is sign-extended to MR2F, repeating the MSB of MR1F in the 16 bits of MR2F. Data written to MR0F is not sign-extended.

MR2F:   16 bits sign extend | 16 bits MR2F | 8 bits zeros
MR1F:   32 bits MR1F | 8 bits zeros
MR0F:   32 bits MR0F | 8 bits zeros

Figure 2-3. MR Transfer Formats

In addition to multiplication, fixed-point operations include accumulation, rounding and saturation of fixed-point data. There are three MRF register operations: Clear, Round, and Saturate.

The clear operation (MRF=0) resets the specified MRF register to zero. Often, it is best to perform this operation at the start of a multiply/accumulate operation to remove results left over from the previous operation.

The rounding operation (MRF=Rnd MRF) applies only to fractional results, so integer results are not affected. This operation rounds the 80-bit MRF value to nearest at bit 32; that is, at the MR1F-MR0F boundary. Rounding of a fixed-point result occurs either as part of a multiply or multiply/accumulate operation or as an explicit operation on the MRF register. The rounded result in MR1F can be sent either to the register file or back to the same MRF register.
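A minimal host-side sketch of rounding at the MR1F-MR0F boundary follows. It is written in Python as a model, not as DSP code. It assumes that a result exactly halfway between two values rounds to the value with an LSB of zero, as the Rounding Mode section defines round-toward-nearest; whether the multiplier hardware treats ties exactly this way should be confirmed against the Instruction Set Reference. The function names are hypothetical, and both functions return only the upper 48 bits (MR2F and MR1F).

```python
MASK48 = (1 << 48) - 1
HALF = 0x80000000  # half of one LSB at the MR1F-MR0F boundary


def round_to_nearest_at_bit32(mrf):
    """Round an 80-bit MRF image at bit 32; return the MR2F:MR1F bits."""
    low = mrf & 0xFFFFFFFF   # MR0F: the bits below the rounding point
    upper = mrf >> 32        # MR2F:MR1F
    if low > HALF or (low == HALF and (upper & 1)):
        upper += 1           # round up; a tie goes to the even value
    return upper & MASK48


def truncate_at_bit32(mrf):
    """Discard MR0F without rounding; return the MR2F:MR1F bits."""
    return (mrf >> 32) & MASK48
```

In this model an MRF image with 1 in MR1F and 0x8000 0000 in MR0F rounds to 2, while an image with 2 in MR1F and the same MR0F value stays at 2, so ties do not drift in one direction.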
To round a fractional result to zero (truncation) instead of to nearest, a program would transfer the unrounded result from MR1F, discarding the lower 32 bits in MR0F.

The saturate operation (MRF=Sat MRF) sets MRF to a maximum value if the MRF value has overflowed. Overflow occurs when the MRF value is greater than the maximum value for the data format (unsigned or twos-complement, and integer or fractional) as specified in the saturate instruction. The six possible maximum values appear in Table 2-3. The result from MRF saturation can be sent either to the register file or back to the same MRF register.

Table 2-3. Fixed-Point Format Maximum Values (for Saturation)

Maximum Number (Hexadecimal)            MR2F   MR1F        MR0F
2's complement fractional (positive)    0000   7FFF FFFF   FFFF FFFF
2's complement fractional (negative)    FFFF   8000 0000   0000 0000
2's complement integer (positive)       0000   0000 0000   7FFF FFFF
2's complement integer (negative)       FFFF   FFFF FFFF   8000 0000
Unsigned fractional number              0000   FFFF FFFF   FFFF FFFF
Unsigned integer number                 0000   0000 0000   FFFF FFFF

Multiplier Status Flags

Multiplier operations update four status flags in the processing element’s arithmetic status register (ASTATx and ASTATy). Table A-4 on page A-9 lists all the bits in these registers. The following bits in ASTATx or ASTATy flag multiplier status (a 1 indicates the condition) for the most recent multiplier operation:

• Multiplier result negative. Bit 6 (MN)
• Multiplier overflow. Bit 7 (MV)
• Multiplier underflow. Bit 8 (MU)
• Multiplier floating-point invalid operation. Bit 9 (MI)

Multiplier operations also update four “sticky” status flags in the processing element’s Sticky status (STKYx and STKYy) register. Table A-5 on page A-13 lists all the bits in these registers. The following bits in STKYx or STKYy flag multiplier status (a 1 indicates the condition). Once set, a sticky flag remains high until explicitly cleared:

• Multiplier fixed-point overflow. Bit 6 (MOS)
• Multiplier floating-point overflow. Bit 7 (MVS)
• Multiplier underflow. Bit 8 (MUS)
• Multiplier floating-point invalid operation. Bit 9 (MIS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky register explicitly in the same cycle that the multiplier is performing an operation, the explicit write to ASTAT or STKY supersedes any flag update from the multiplier operation.

Multiplier Instruction Summary

Table 2-4 on page 2-19 and Table 2-6 on page 2-20 list the Multiplier instructions and how they relate to ASTATx,y and STKYx,y flags. For more information on assembly language syntax, see ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols:

• Rn, Rx, Ry indicate any register file location; treated as fixed-point
• Fn, Fx, Fy indicate any register file location; treated as floating-point
• * indicates the flag may be set or cleared, depending on results of instruction
• ** indicates the flag may be set (but not cleared), depending on results of instruction
• – indicates no effect
• The Input Mods column indicates the types of optional modifiers that you can apply to the instruction’s inputs. For a list of modifiers, see Table 2-5.

Table 2-4. Fixed-Point Multiplier Instruction Summary

Instruction           Input ASTATx,y Flags      STKYx,y Flags
Fixed-point:          Mods  MU   MN   MV   MI   MUS  MOS  MVS  MIS
Rn = Rx * Ry          1     *    *    *    0    –    **   –    –
MRF = Rx * Ry         1     *    *    *    0    –    **   –    –
MRB = Rx * Ry         1     *    *    *    0    –    **   –    –
Rn = MRF + Rx * Ry    1     *    *    *    0    –    **   –    –
Rn = MRB + Rx * Ry    1     *    *    *    0    –    **   –    –
MRF = MRF + Rx * Ry   1     *    *    *    0    –    **   –    –
MRB = MRB + Rx * Ry   1     *    *    *    0    –    **   –    –
Rn = MRF - Rx * Ry    1     *    *    *    0    –    **   –    –
Rn = MRB - Rx * Ry    1     *    *    *    0    –    **   –    –
MRF = MRF - Rx * Ry   1     *    *    *    0    –    **   –    –
MRB = MRB - Rx * Ry   1     *    *    *    0    –    **   –    –
Rn = SAT MRF          2     *    *    *    0    –    **   –    –
Rn = SAT MRB          2     *    *    *    0    –    **   –    –
MRF = SAT MRF         2     *    *    *    0    –    **   –    –
MRB = SAT MRB         2     *    *    *    0    –    **   –    –
Rn = RND MRF          3     *    *    *    0    –    **   –    –
Rn = RND MRB          3     *    *    *    0    –    **   –    –
MRF = RND MRF         3     *    *    *    0    –    **   –    –
MRB = RND MRB         3     *    *    *    0    –    **   –    –
MRF = 0               –     –    –    –    –    –    –    –    –
MRB = 0               –     –    –    –    –    –    –    –    –
MRxF = Rn             –     –    –    –    –    –    –    –    –
MRxB = Rn             –     –    –    –    –    –    –    –    –
Rn = MRxF             –     –    –    –    –    –    –    –    –
Rn = MRxB             –     –    –    –    –    –    –    –    –

For Input Mods, see Table 2-5.

Table 2-5. Input Modifiers for Fixed-Point Multiplier Instruction

Input Mods
(from Table 2-4)   Options for fixed-point multiplier instructions
1                  (SSF), (SSI), (SSFR), (SUF), (SUI), (SUFR), (USF), (USI), (USFR), (UUF), (UUI), or (UUFR)
2                  (SF), (SI), (UF), or (UI)
3                  (SF) or (UF)

Note the meaning of the following symbols in this table: S = Signed input; U = Unsigned input; I = Integer input(s); F = Fractional input(s); FR = Fractional inputs, Rounded output. Note that (SF) is the default format for 1-input operations, and (SSF) is the default format for 2-input operations.

Table 2-6. Floating-Point Multiplier Instruction Summary

Instruction           ASTATx,y Flags      STKYx,y Flags
Floating-point:       MU   MN   MV   MI   MUS  MOS  MVS  MIS
Fn = Fx * Fy          *    *    *    *    **   –    **   **

Barrel-Shifter (Shifter)

The shifter performs bit-wise operations on 32-bit fixed-point operands. Shifter operations include:

• Shifts and rotates from off-scale left to off-scale right
• Bit manipulation operations, including bit set, clear, toggle, and test
• Bit field manipulation operations, including extract and deposit
• Fixed-point/floating-point conversion operations, including exponent extract, number of leading 1s or 0s

Shifter Operation

The shifter takes from one to three inputs: X-input, Y-input, and Z-input. The inputs (also known as operands) can be any register in the register file. Within a shifter instruction, the inputs serve as follows:

• The X-input provides data that is operated on
• The Y-input specifies shift magnitudes, bit field lengths or bit positions
• The Z-input provides data that is operated on and updated

In the following example, Rx is the X-input, Ry is the Y-input, and Rn is the Z-input. The shifter returns one output (Rn) to the register file.

    Rn = Rn OR LSHIFT Rx BY Ry;

As shown in Figure 2-4, the shifter fetches input operands from the upper 32 bits of a register file location (bits 39-8) or from an immediate value in the instruction. The shifter transfers operands during the first half of the cycle and transfers the result to the upper 32 bits of a register (with the eight LSBs zero-filled) during the second half of the cycle. With this arrangement, the shifter can read and write the same register file location in a single cycle.

The X-input and Z-input are always 32-bit fixed-point values. The Y-input is a 32-bit fixed-point value or an 8-bit field (shf8), positioned in the register file. These inputs appear in Figure 2-4.
Some shifter operations produce 8-bit or 6-bit results. As shown in Figure 2-5, the shifter places these results in either the shf8 field or the bit6 field and sign-extends the results to 32 bits. The shifter always returns a 32-bit result.

32-bit Y-input or result:   bits 39-8 of the 40-bit register (bits 7-0 unused)
8-bit Y-input or result:    shf8 field, bits 15-8 of the 40-bit register

Figure 2-4. Register File Fields for Shifter Instructions

The shifter supports bit field deposit and bit field extract instructions for manipulating groups of bits within an input. The Y-input for bit field instructions specifies two 6-bit values: bit6 and len6, which are positioned in the Ry register as shown in Figure 2-5. The shifter interprets bit6 and len6 as positive integers. Bit6 is the starting bit position for the deposit or extract, and len6 is the bit field length, which specifies how many bits are deposited or extracted.

12-bit Y-input:   len6 field, bits 19-14 of the 40-bit register; bit6 field, bits 13-8

Figure 2-5. Register File Fields for FDEP and FEXT Instructions

Field deposit (Fdep) instructions take a group of bits from the input register (starting at the LSB of the 32-bit integer field) and deposit the bits as directed anywhere within the result register. The bit6 value specifies the starting bit position for the deposit. Figure 2-6 shows bit placement for the following field deposit instruction:

    R0 = FDEP R1 BY R2;

R2 = 0x0000 0210 00   (len6 = 8, bit6 = 16)
R1 = 0x0000 00FF 00   (the eight bits starting at the LSB of the 32-bit field are taken)
R0 = 0x00FF 0000 00   (the field is deposited starting at bit 16, counted from the LSB of the 32-bit field)

Figure 2-6. Bit Field Deposit Example

Figure 2-8 shows how the inputs, bit6 and len6, work in a field deposit instruction (Rn=Fdep Rx By Ry).
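The bit6 and len6 fields can be checked against the figures with a short host-side model. The Python sketch below is not DSP code; it works on the 32-bit fixed-point field only (the eight LSBs of the 40-bit register are left out), models only the plain forms without the (SE) modifier, and does not model the status flags. The function names are hypothetical. It also includes the field extract operation, which the next paragraph describes.

```python
MASK32 = 0xFFFFFFFF


def split_ry(ry):
    """Return (bit6, len6) from the 32-bit Y-input field."""
    bit6 = ry & 0x3F           # bits 5-0 of the 32-bit field
    len6 = (ry >> 6) & 0x3F    # bits 11-6 of the 32-bit field
    return bit6, len6


def fdep(rx, ry):
    """Model Rn = FDEP Rx BY Ry: take len6 LSBs of Rx, place them at bit6."""
    bit6, len6 = split_ry(ry)
    field = rx & ((1 << len6) - 1)
    return (field << bit6) & MASK32


def fext(rx, ry):
    """Model Rn = FEXT Rx BY Ry: take len6 bits of Rx starting at bit6."""
    bit6, len6 = split_ry(ry)
    return (rx >> bit6) & ((1 << len6) - 1) & MASK32
```

With the values from the deposit example, fdep(0x000000FF, 0x0210) gives 0x00FF0000; with the values from the extract example, fext(0x87880000, 0x0217) gives 0x0000000F.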
Field extract (Fext) instructions extract a group of bits as directed from anywhere within the input register and place them in the result register (aligned with the LSB of the 32-bit integer field). The bit6 value specifies the starting bit position for the extract. Figure 2-7 shows bit placement for the following field extract instruction:

    R3 = FEXT R4 BY R5;

R5 = 0x0000 0217 00   (len6 = 8, bit6 = 23)
R4 = 0x8788 0000 00   (the eight bits starting at bit 23, counted from the LSB of the 32-bit field, are taken)
R3 = 0x0000 000F 00   (the field is placed at the LSB of the 32-bit field)

Figure 2-7. Bit Field Extract Example

Ry:   len6 field in bits 19-14 and bit6 field in bits 13-8. Ry determines the length of the bit field to take from Rx and the starting position for the deposit in Rn.
Rx:   len6 = number of bits to take from Rx, starting from the LSB of the 32-bit field.
Rn:   bit6 = starting bit position for the deposit, referenced from the LSB of the 32-bit field.

Figure 2-8. Bit Field Deposit Instruction

Shifter Status Flags

Shifter operations update three status flags in the processing element’s arithmetic status register (ASTATx and ASTATy). Table A-4 on page A-9 lists all the bits in these registers. The following bits in ASTATx or ASTATy flag shifter status (a 1 indicates the condition) for the most recent shifter operation:

• Shifter overflow of bits to left of MSB. Bit 11 (SV)
• Shifter result zero. Bit 12 (SZ)
• Shifter input sign for exponent extract only. Bit 13 (SS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle.
If a program writes the arithmetic status register explicitly in the same cycle that the shifter is performing an operation, the explicit write to ASTAT supersedes any flag update caused by the shift operation.

Shifter Instruction Summary

Table 2-7 on page 2-26 lists the Shifter instructions and how they relate to ASTATx,y flags. For more information on assembly language syntax, see ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols:

• Rn, Rx, Ry indicate any register file location; bit fields used depend on instruction
• Fn, Fx indicate any register file location; floating-point word
• * indicates the flag may be set or cleared, depending on data

Table 2-7. Shifter Instruction Summary

Instruction                                 ASTATx,y Flags
                                            SZ  SV  SS
Rn = LSHIFT Rx BY Ry                        *   *   0
Rn = LSHIFT Rx BY <data8>                   *   *   0
Rn = Rn OR LSHIFT Rx BY Ry                  *   *   0
Rn = Rn OR LSHIFT Rx BY <data8>             *   *   0
Rn = ASHIFT Rx BY Ry                        *   *   0
Rn = ASHIFT Rx BY <data8>                   *   *   0
Rn = Rn OR ASHIFT Rx BY Ry                  *   *   0
Rn = Rn OR ASHIFT Rx BY <data8>             *   *   0
Rn = ROT Rx BY Ry                           *   0   0
Rn = ROT Rx BY <data8>                      *   0   0
Rn = BCLR Rx BY Ry                          *   *   0
Rn = BCLR Rx BY <data8>                     *   *   0
Rn = BSET Rx BY Ry                          *   *   0
Rn = BSET Rx BY <data8>                     *   *   0
Rn = BTGL Rx BY Ry                          *   *   0
Rn = BTGL Rx BY <data8>                     *   *   0
BTST Rx BY Ry                               *   *   0
BTST Rx BY <data8>                          *   *   0
Rn = FDEP Rx BY Ry                          *   *   0
Rn = FDEP Rx BY <bit6>:<len6>               *   *   0
Rn = Rn OR FDEP Rx BY Ry                    *   *   0
Rn = Rn OR FDEP Rx BY <bit6>:<len6>         *   *   0
Rn = FDEP Rx BY Ry (SE)                     *   *   0
Rn = FDEP Rx BY <bit6>:<len6> (SE)          *   *   0
Rn = Rn OR FDEP Rx BY Ry (SE)               *   *   0
Rn = Rn OR FDEP Rx BY <bit6>:<len6> (SE)    *   *   0
Rn = FEXT Rx BY Ry                          *   *   0
Rn = FEXT Rx BY <bit6>:<len6>               *   *   0
Rn = FEXT Rx BY Ry (SE)                     *   *   0
Rn = FEXT Rx BY <bit6>:<len6> (SE)          *   *   0
Rn = EXP Rx (EX)                            *   0   *
Rn = EXP Rx                                 *   0   *
Rn = LEFTZ Rx                               *   *   0
Rn = LEFTO Rx                               *   *   0
Rn = FPACK Fx                               0   *   0
Fn = FUNPACK Rx                             0   0   0

Data Register File

Each of the DSP’s processing elements has a data register file: a set of data registers that transfer data between the data buses and the computation units. These registers also provide local storage for operands and results.

The two register files each consist of 16 primary registers and 16 alternate (secondary) registers. All of the data registers are 40 bits wide. Within these registers, 32-bit data is always left-justified. If an operation specifies a 32-bit data transfer to these 40-bit registers, the eight LSBs are ignored on register reads, and the eight LSBs are cleared to zeros on writes.

Program memory data accesses and data memory accesses to/from the register file(s) occur on the PM data bus and DM data bus, respectively. One PM data bus access for each processing element and/or one DM data bus access for each processing element can occur in one cycle. Transfers between the register files and the DM or PM data buses can move up to 64 bits of valid data on each bus.

If an operation specifies the same register file location as both an input and output, the read occurs in the first half of the cycle and the write in the second half.
With this arrangement, the DSP uses the old data as the operand, before updating the location with the new result data. If writes to the same location take place in the same cycle, only the write with higher precedence actually occurs. The DSP determines precedence for the write operation from the source of the data; from highest to lowest, the precedence is:

1. Data memory or universal register
2. Program memory
3. PEx ALU
4. PEy ALU
5. PEx Multiplier
6. PEy Multiplier
7. PEx Shifter
8. PEy Shifter

The data register file in Figure 2-1 on page 2-3 lists register names of R0 through R15 within PEx’s register file. When a program refers to these registers as R0 through R15, the computational units treat the registers’ contents as fixed-point data. To perform floating-point computations, refer to these registers as F0 through F15. For example, the following instructions refer to the same registers, but direct the computational units to perform different operations:

    F0=F1 * F2;    /* floating-point multiply */
    R0=R1 * R2;    /* fixed-point multiply */

The F and R prefixes on register names do not affect the 32-bit or 40-bit data transfer; the naming convention only determines how the ALU, multiplier, and shifter treat the data.

To maintain compatibility with code written for previous SHARC DSPs, the assembly syntax accommodates references to PEx data registers and PEy data registers. Code may only refer to the PEy data registers (S0 through S15) for data move instructions.
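The effect of the R and F prefixes can be pictured with a host-side model in which one bit pattern is read three ways. The Python sketch below is an illustration, not DSP code: it uses the standard struct module to reinterpret a 32-bit pattern as an IEEE single-precision value, and the function names are hypothetical. The signed fractional reading assumes the usual 1.31 format (one sign bit, 31 fraction bits).

```python
import struct

MASK32 = 0xFFFFFFFF


def upper32(reg40):
    """Return the 32 MSBs (bits 39-8) of a 40-bit register value."""
    return (reg40 >> 8) & MASK32


def as_float(bits32):
    """Read a 32-bit pattern as an IEEE single-precision value (F prefix)."""
    return struct.unpack(">f", struct.pack(">I", bits32 & MASK32))[0]


def as_signed_integer(bits32):
    """Read a 32-bit pattern as a twos-complement integer (R prefix)."""
    bits32 &= MASK32
    return bits32 - (1 << 32) if bits32 & 0x80000000 else bits32


def as_signed_fraction(bits32):
    """Read a 32-bit pattern as a signed fraction in 1.31 format (R prefix)."""
    return as_signed_integer(bits32) / 2 ** 31
```

For the pattern 0x4000 0000, the model reads 2.0 as floating-point data, 1073741824 as a signed integer, and 0.5 as a signed fraction; the stored bits are the same in all three cases.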
The rules for using register names are as follows:

• R0 through R15 and F0 through F15 always refer to PEx registers for data move and computational instructions, whether the DSP is in SISD or SIMD mode
• R0 through R15 and F0 through F15 refer to both PEx and PEy registers for computational instructions in SIMD mode
• S0 through S15 always refer to PEy registers for data move instructions, whether the DSP is in SISD or SIMD mode

For more information on SISD and SIMD computational operations, see “Secondary Processing Element (PEy)” on page 2-35. For more information on ADSP-21160 assembly language, see ADSP-21160 SHARC DSP Instruction Set Reference.

Alternate (Secondary) Data Registers

Each register file has an alternate register set. To facilitate fast context switching, the DSP includes alternate register sets for data, results, and data address generator registers. Bits in the MODE1 register control when alternate registers become accessible. While inaccessible, the contents of alternate registers are not affected by DSP operations. Note that there is a one cycle latency between writing to MODE1 and being able to access an alternate register set. The alternate register sets for data and results are described in this section. For more information on alternate data address generator registers, see “Alternate (Secondary) Data Registers” on page 2-30.

Bits in the MODE1 register can activate independent alternate data register sets: the lower half (R0-R7 and S0-S7) and the upper half (R8-R15 and S8-S15). To share data between contexts, a program places the data to be shared in one half of either the current processing element’s register file or the opposite processing element’s register file and activates the alternate register set of the other half. For information on how to activate alternate data registers, see the description on page 2-30.
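The data-sharing technique described above can be sketched with a small host-side model of one register file. The Python below is an illustration and not DSP code: the class and function names are hypothetical, the one-cycle latency after a MODE1 write is not modeled, and the bit positions (SRRFH at bit 7 and SRRFL at bit 10) are taken from the MODE1 bit list in this section.

```python
SRRFH = 1 << 7     # MODE1 bit 7: alternate set for the upper half (R8-R15)
SRRFL = 1 << 10    # MODE1 bit 10: alternate set for the lower half (R0-R7)


class RegisterFileModel:
    """Toy model of 16 primary and 16 alternate data registers."""

    def __init__(self):
        self.mode1 = 0
        # banks[0] holds the primary set, banks[1] the alternate set
        self.banks = [[0] * 16, [0] * 16]

    def _bank(self, index):
        select = SRRFL if index < 8 else SRRFH
        return self.banks[1 if self.mode1 & select else 0]

    def write(self, index, value):
        self._bank(index)[index] = value

    def read(self, index):
        return self._bank(index)[index]


def share_through_upper_half():
    """Swap only the lower half so that R8 stays visible to both contexts."""
    rf = RegisterFileModel()
    rf.write(8, 1234)         # context A puts shared data in R8
    rf.write(0, 1)            # context A private data in primary R0
    rf.mode1 |= SRRFL         # switch context: alternate lower half only
    rf.write(0, 2)            # context B private data in alternate R0
    seen_by_b = rf.read(8)    # R8 is still the same physical register
    rf.mode1 &= ~SRRFL        # switch back to context A
    return seen_by_b, rf.read(0)
```

In this model, context B reads the value 1234 that context A left in R8, and context A finds its own R0 value unchanged after the switch back.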
Each multiplier has a primary or foreground (MRF) results register and an alternate or background (MRB) results register. A bit in the MODE1 register selects which result register receives the result from the multiplier operation, swapping which register is the current MRF or MRB. This swapping facilitates context switching. Unlike other registers that have alternates, both MRF and MRB are accessible at the same time. All fixed-point multiplies can accumulate results in either MRF or MRB, without regard to the state of the MODE1 register. With this arrangement, code can use the result registers as primary and alternate accumulators, or code can use these registers as two parallel accumulators. This feature facilitates complex math.

The MODE1 register controls the access to alternate registers. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control alternate registers (a 1 enables the alternate set):

• Secondary registers for computation unit results. Bit 2 (SRCU)
• Secondary registers for hi register file, R8-R15 and S8-S15. Bit 7 (SRRFH)
• Secondary registers for lo register file, R0-R7 and S0-S7. Bit 10 (SRRFL)

The following example demonstrates how code should handle the one cycle of latency from the instruction setting the bit in MODE1 to when the alternate registers may be accessed.

BIT SET MODE1 SRRFL;    /* activate alternate reg. file */
NOP;                    /* wait for access to alternates */
R0=7;

Multifunction Computations

Using the many parallel data paths within its computational units, the DSP supports multiple-parallel (multifunction) computations. These instructions complete in a single cycle, and they combine parallel operation of the multiplier and the ALU or dual ALU functions. The multiple operations perform the same as if they were in corresponding single-function computations.
Multifunction computations also handle flags in the same way as the single-function computations, except that in the dual add/subtract computation the ALU flags from the two operations are ORed together.

To work with the available data paths, the computation units constrain which data registers may hold the four input operands for multifunction computations. These constraints limit which registers may hold the X-input and Y-input for the ALU and multiplier. Figure 2-9 shows a computational unit and indicates which registers may serve as X-inputs and Y-inputs for the ALU and multiplier.

[Figure 2-9: block diagram of one computational unit, showing the PM and DM data buses, MODE1, the register file (16 x 40-bit) with register groups R0-R3, R4-R7, R8-R11, and R12-R15 feeding the X and Y inputs of the multiplier and ALU, the multiplier result registers MR2F, MR1F, and MR0F, and ASTATx and STKYx outputs to the program sequencer. The shifter is faded, indicating that it is not available for multifunction instructions.]

Figure 2-9. Input Registers for Multifunction Computations (ALU and Multiplier)

For example, the X-input to the ALU can only be R8, R9, R10, or R11. Note that the shifter is gray in Figure 2-9 to indicate that there are no shifter multifunction operations.

Table 2-8, Table 2-9, Table 2-10, and Table 2-11 list the multifunction computations. For more information on assembly language syntax, see ADSP-21160 SHARC DSP Instruction Set Reference.
In these tables, note the meaning of the following symbols:

• Rm, Ra, Rs, Rx, Ry indicate any register file location; fixed-point
• Fm, Fa, Fs, Fx, Fy indicate any register file location; floating-point
• R3-0 indicates data file registers R3, R2, R1, or R0, and F3-0 indicates data file registers F3, F2, F1, or F0
• R7-4 indicates data file registers R7, R6, R5, or R4, and F7-4 indicates data file registers F7, F6, F5, or F4
• R11-8 indicates data file registers R11, R10, R9, or R8, and F11-8 indicates data file registers F11, F10, F9, or F8
• R15-12 indicates data file registers R15, R14, R13, or R12, and F15-12 indicates data file registers F15, F14, F13, or F12
• SSFR indicates the X-input is signed, Y-input is signed, use Fractional inputs, and Rounded-to-nearest output
• SSF indicates the X-input is signed, Y-input is signed, use Fractional input

Table 2-8. Dual Add And Subtract

Ra = Rx + Ry, Rs = Rx - Ry
Fa = Fx + Fy, Fs = Fx - Fy

Table 2-9. Fixed-Point Multiply and Add, Subtract, or Average
(Any combination of left and right column)

Left column (multiplier)               Right column (ALU)
Rm=R3-0 * R7-4 (SSFR),                 Ra=R11-8 + R15-12
MRF=MRF + R3-0 * R7-4 (SSF),           Ra=R11-8 - R15-12
Rm=MRF + R3-0 * R7-4 (SSFR),           Ra=(R11-8 + R15-12)/2
MRF=MRF - R3-0 * R7-4 (SSF),
Rm=MRF - R3-0 * R7-4 (SSFR),

Table 2-10. Floating-Point Multiply And ALU Operation

Fm=F3-0 * F7-4, Fa=F11-8 + F15-12
Fm=F3-0 * F7-4, Fa=F11-8 - F15-12
Fm=F3-0 * F7-4, Fa=FLOAT R11-8 by R15-12
Fm=F3-0 * F7-4, Ra=FIX F11-8 by R15-12
Fm=F3-0 * F7-4, Fa=(F11-8 + F15-12)/2
Fm=F3-0 * F7-4, Fa=ABS F11-8
Fm=F3-0 * F7-4, Fa=MAX (F11-8, F15-12)
Fm=F3-0 * F7-4, Fa=MIN (F11-8, F15-12)

Table 2-11.
Multiply With Dual Add and Subtract

Rm = R3-0 * R7-4 (SSFR), Ra = R11-8 + R15-12, Rs = R11-8 - R15-12
Fm = F3-0 * F7-4, Fa = F11-8 + F15-12, Fs = F11-8 - F15-12

Another type of multifunction operation is also available on the DSP, combining transfers between the results and data registers and transfers between memory and data registers. Like other multifunction instructions, these parallel operations complete in a single cycle. For example, the DSP can perform the following multiply and parallel read of data memory:

MRF=MRF-R5*R0, R6=DM(I1,M2);

Or, the DSP can perform the following result register transfer and parallel read:

R5=MR1F, R6=DM(I1,M2);

Secondary Processing Element (PEy)

The ADSP-21160 DSP contains two sets of computation units and associated register files. As shown in Figure 2-10 on page 2-34, these two Processing Elements (PEx and PEy) support Single Instruction, Multiple Data (SIMD) operation.

[Figure 2-10: block diagram. The program sequencer sends the same instruction to both processing elements. PEx and PEy each contain a 16 x 40-bit data register file, a multiplier, an ALU, and a barrel shifter. Different data goes to each element over the PM data bus and DM data bus (16/32/40/64 bits), which are joined by the bus connect (PX).]

Figure 2-10. Block Diagram Showing Secondary Execution

The MODE1 register controls the operating mode of the processing elements. Table A-2 on page A-3 lists all the bits in MODE1. The PEYEN bit (bit 21) in the MODE1 register enables or disables the PEy processing element. When PEYEN is cleared (0), the ADSP-21160 DSP operates in Single-Instruction-Single-Data (SISD) mode, using only PEx; this is the mode in which ADSP-2106x DSPs operate. When the PEYEN bit is set (1), the ADSP-21160 DSP operates in SIMD mode, using the PEx and PEy processing elements.
There is a one-cycle delay after PEYEN is set or cleared, before the change to or from SIMD mode takes effect.

To support SIMD, the DSP performs the following parallel operations:

• Dispatches a single instruction to both processing elements’ computation units
• Loads two sets of data from memory, one for each processing element
• Executes the same instruction simultaneously in both processing elements
• Stores data results from the dual executions to memory

Using the information here and in the ADSP-21160 SHARC DSP Instruction Set Reference, it is possible through SIMD mode’s parallelism to double performance over similar algorithms running in SISD (ADSP-2106x DSP compatible) mode.

The two processing elements are symmetrical, each containing the following functional blocks:

• ALU
• Multiplier primary and alternate result registers
• Shifter
• Data register file and alternate register file

Dual Compute Units Sets

The computation units (ALU, Multiplier, and Shifter) in PEx and PEy are identical. The data bus connections for the dual computation units permit asymmetric data moves to, from, and between the two processing elements. Identical instructions execute on the PEx and PEy computational units; the difference is the data. The data registers for PEy operations are identified (implicitly) from the PEx registers in the instruction. This implicit relation between PEx and PEy data registers corresponds to complementary register pairs in Table 2-12. Any universal registers that do not appear in Table 2-12 have the same identities in both PEx and PEy. When a computation in SIMD mode refers to a register in the PEx column, the corresponding computation in PEy refers to the complementary register in the PEy column.

Table 2-12.
SIMD Mode Complementary Register Pairs

PEx      PEy      PEx      PEy
R0       S0       R11      S11
R1       S1       R12      S12
R2       S2       R13      S13
R3       S3       R14      S14
R4       S4       R15      S15
R5       S5       USTAT1   USTAT2
R6       S6       USTAT3   USTAT4
R7       S7       ASTATx   ASTATy
R8       S8       STKYx    STKYy
R9       S9       PX1      PX2
R10      S10

Dual Register Files

The two 16-entry data register files (one in each PE) and their operand and result busing and porting are identical. The same is true for each 16-entry alternate register file. The transfer direction, source and destination registers, and data bus usage depend on the following conditions:

• Computational mode:
  • Is PEy enabled (PEYEN bit=1 in MODE1 register)?
  • Is the data register file in PEx (R0-R15, F0-F15) or PEy (S0-S15)?
  • Is the instruction a data register swap between processing elements?
• Data addressing mode:
  • What is the state of the Internal Memory Data Width (IMDW) bits in the System Configuration (SYSCON) register?
  • Is Broadcast write enabled (BDCST1,9 bits in MODE1 register)?
  • What is the type of address (long, normal, or short word)?
  • Is long-word override (LW) specified in the instruction?
  • What are the states of instruction fields for DAG1 or DAG2?
• Program sequencing (conditional logic):
  • What is the outcome of the instruction’s condition comparison on each processing element?

For information on SIMD issues that relate to computational modes, see “SIMD (Computational) Operations” on page 2-39. For information on SIMD issues relating to data addressing, see “Addressing in SISD and SIMD Modes” on page 4-18. For information on SIMD issues relating to program sequencing, see “SIMD Mode and Sequencing” on page 3-55.

Dual Alternate Registers

Both register files consist of a primary set of 16 by 40-bit registers and an alternate set of 16 by 40-bit registers.
Context switching between the two sets of registers occurs in parallel between the two processing elements. For more information, see “Alternate (Secondary) Data Registers” on page 2-30.

SIMD (Computational) Operations

In SIMD mode, the dual processing elements execute the same instruction, but operate on different data. To support SIMD operation, the elements support a variety of dual data move features.

The DSP supports unidirectional and bidirectional register-to-register transfers with the conditional compute and move instruction. All four combinations of inter-register file and intra-register file transfers (PEx <-> PEx, PEx <-> PEy, PEy <-> PEx, and PEy <-> PEy) are possible in both SISD (unidirectional) and SIMD (bidirectional) modes.

In SISD mode (PEYEN bit=0), the register-to-register transfers are unidirectional, meaning that an operation performed on one processing element is not duplicated on the other processing element. The SISD transfer uses a source register and a destination register, and either register can be in either element’s data register file. For a summary of unidirectional transfers, see the upper half of Table 2-13 on page 2-41. Note that in SISD mode a condition for an instruction only tests in the PEx element and applies to the entire instruction.

In SIMD mode (PEYEN bit=1), the register-to-register transfers are bidirectional, meaning that an operation performed on one element is duplicated in parallel on the other element. The instruction uses two source registers (one from each element’s register file) and two destination registers (one from each element’s register file). For a summary of bidirectional transfers, see the lower half of Table 2-13 on page 2-41. Note that in SIMD mode a condition for an instruction tests in both the PEx and PEy elements, dividing control of the explicit and implicit transfers as detailed in Table 2-13 on page 2-41.
Bidirectional register-to-register transfers in SIMD mode are allowed between a data register and DAG, control, or status registers. When the DAG, control, or status register is a source of the transfer, the destination can be a data register. This SIMD transfer duplicates the contents of the source register in a data register in both processing elements.

Careful programming is required when a DAG, control, or status register is a destination of a transfer from a data register. If the destination register has a complement (for example ASTATx and ASTATy), the SIMD transfer moves the contents of the explicit data register into the explicit destination and moves the contents of the implicit data register into the implicit destination (the complement). If the destination register has no complement (for example, I0), only the explicit transfer occurs. Even if the code uses a conditional operation to select whether the transfer occurs, only the explicit transfer can take place if the destination register has no complement.

In the case where a DAG, control, or status register is both source and destination, the data move operation executes the same as if SIMD mode were disabled.

In both SISD and SIMD modes, the DSP supports bidirectional register-to-register swaps. The swap always occurs between one register in each processing element’s data register file. Register swaps use the special swap operator, <->. A register-to-register swap occurs when registers in different processing elements exchange values; for example R0 <-> S1. Only single, 40-bit register-to-register swaps are supported; double register operations are not.

When they are unconditional, register-to-register swaps operate the same in SISD mode and SIMD mode. If a condition is added to the instruction in SISD mode, the condition tests only in the PEx element and controls the entire operation.
If a condition is added in SIMD mode, the condition tests in both the PEx and PEy elements and controls the halves of the operation as detailed in Table 2-13.

Table 2-13. Register-to-Register Move Summary (SISD versus SIMD)

Mode    Instruction                          Explicit Transfer      Implicit Transfer
SISD¹   IF condition compute, Rx = Ry;       Rx loaded from Ry      None
        IF condition compute, Rx = Sy;       Rx loaded from Sy      None
        IF condition compute, Sx = Ry;       Sx loaded from Ry      None
        IF condition compute, Sx = Sy;       Sx loaded from Sy      None
        IF condition compute, Rx <-> Sy;     Rx swaps to Sy,        None
                                             Sy swaps to Rx
SIMD²   IF condition compute, Rx = Ry;       Rx loaded from Ry      Sx loaded from Sy
        IF condition compute, Rx = Sy;       Rx loaded from Sy      Sx loaded from Ry
        IF condition compute, Sx = Ry;       Sx loaded from Ry      Rx loaded from Sy
        IF condition compute, Sx = Sy;       Sx loaded from Sy      Rx loaded from Ry
        IF condition compute, Rx <-> Sy;³    Sy swaps to Rx         Rx swaps to Sy

¹ In SISD mode, the condition applies to the entire operation and is tested only against PEx’s flags. When the condition tests true, the entire operation occurs.
² In SIMD mode, the condition applies separately to the explicit and implicit transfers. Where the condition tests true (PEx for the explicit and PEy for the implicit), the operation occurs in that processing element.
³ Register-to-register transfers (R0=S0) and register swaps (R0<->S0) do not cause a PMD bus conflict. These operations use only the DMD bus and a hidden 16-bit bus to do the two register moves.

SIMD conditional instructions with the same destination registers do not produce predictable transfers. For example, the instruction IF EQ R4 = R14 - R15, S4 = R6; may not work as expected. This kind of usage is prohibited, because the compute result and the move would target the same registers.
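The move rows of Table 2-13 can be summarized by a small host-side C model. This is a sketch under stated assumptions rather than a description of the hardware implementation: the type and function names (DataRegs, RegFile, reg_move) are hypothetical, registers are held in 64-bit integers without enforcing the 40-bit width, swaps are not modeled, and the two condition outcomes are passed in as plain booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two data register files. */
typedef enum { PEX_FILE, PEY_FILE } RegFile;

typedef struct {
    uint64_t r[16];  /* PEx data registers R0-R15 */
    uint64_t s[16];  /* PEy data registers S0-S15 */
} DataRegs;

static uint64_t *reg_ptr(DataRegs *d, RegFile file, int index)
{
    return (file == PEX_FILE) ? &d->r[index] : &d->s[index];
}

static RegFile complement(RegFile file)
{
    return (file == PEX_FILE) ? PEY_FILE : PEX_FILE;
}

/* Conditional register-to-register move, dst = src.
   SISD: only the explicit transfer exists, gated by the PEx condition.
   SIMD: the implicit transfer uses the complementary registers and is
   gated separately by the PEy condition. */
void reg_move(DataRegs *d, bool simd, bool cond_pex, bool cond_pey,
              RegFile dst_file, int dst, RegFile src_file, int src)
{
    /* Read both sources before writing, since the transfers are parallel. */
    uint64_t explicit_value = *reg_ptr(d, src_file, src);
    uint64_t implicit_value = *reg_ptr(d, complement(src_file), src);

    if (cond_pex)
        *reg_ptr(d, dst_file, dst) = explicit_value;
    if (simd && cond_pey)
        *reg_ptr(d, complement(dst_file), dst) = implicit_value;
}
```

For example, modeling R0 = S1 in SIMD mode with both conditions true loads R0 from S1 and S0 from R1; with the PEy condition false, only R0 changes.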
SIMD and Status Flags

When the DSP is in SIMD mode (PEYEN bit=1), computations on both processing elements generate status flags, producing a logical OR’ing of the exception status test on each processing element. If one of the four fixed-point or floating-point exceptions is enabled, an exception condition on either or both processing elements generates an exception interrupt. Interrupt service routines must determine which of the processing elements encountered the exception. Note that returning from a floating-point interrupt does not automatically clear the STKY state. Code must clear the STKY bits in both processing elements’ sticky status (STKYx and STKYy) registers as part of the exception service routine. For more information, see “Interrupts and Sequencing” on page 3-31.

3 PROGRAM SEQUENCER

The DSP’s program sequencer implements program flow, constantly providing the address of the next instruction to be executed by other parts of the DSP.

Overview

Program flow in the DSP is mostly linear with the processor executing program instructions sequentially. This linear flow varies occasionally when the program uses non-sequential program structures, such as those illustrated in Figure 3-1. Non-sequential structures direct the DSP to execute an instruction that is not at the next sequential address, following the current instruction. These structures include:

• Loops. One sequence of instructions executes several times with zero overhead.
• Subroutines. The processor temporarily interrupts sequential flow to execute instructions from another part of program memory.
• Jumps. Program flow transfers permanently to another part of program memory.
• Interrupts. Subroutines in which a runtime event (not an instruction) triggers the execution of the routine.
• Idle.
An instruction that causes the processor to cease operations, holding its current state until an interrupt occurs. Then, the processor services the interrupt and continues normal execution.

[Figure 3-1: six panels of instruction sequences illustrating linear flow (addresses N through N+5), loop (DO UNTIL, executed N times), jump, subroutine (CALL ... RTS), interrupt (IRQ, vector ... RTI), and idle (IDLE, waiting for IRQ).]

Figure 3-1. Program Flow Variations

The sequencer manages execution of these program structures by selecting the address of the next instruction to execute. As part of its process, the sequencer handles the following tasks.

• Increments the fetch address
• Maintains stacks
• Evaluates conditions
• Decrements the loop counter
• Calculates new addresses
• Maintains an instruction cache
• Handles interrupts

To accomplish these tasks, the sequencer uses the blocks shown in Figure 3-2. The sequencer’s address multiplexer selects the value of the next fetch address from several possible sources. The fetched address enters the instruction pipeline, made up of the fetch address register, decode address register, and program counter (PC). These contain the 24-bit addresses of the instructions currently being fetched, decoded, and executed. The PC couples with the PC stack, which stores return addresses and top-of-loop addresses. All addresses generated by the sequencer are 24-bit program memory instruction addresses.
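The relationship among the three pipeline address registers described above can be sketched as a small C model. It is illustrative only and makes several assumptions: the struct and function names are hypothetical, only linear flow is modeled, and the empty decode and execute stages at startup are simply represented by whatever values the caller initializes.

```c
#include <stdint.h>

#define ADDR_MASK 0xFFFFFFu  /* sequencer addresses are 24 bits wide */

/* Hypothetical model of the instruction pipeline address registers. */
typedef struct {
    uint32_t faddr;  /* address of the instruction being fetched  */
    uint32_t daddr;  /* address of the instruction being decoded  */
    uint32_t pc;     /* address of the instruction being executed */
} Pipeline;

/* Next fetch address for linear flow: the fetch address plus one. */
uint32_t next_linear(const Pipeline *p)
{
    return (p->faddr + 1u) & ADDR_MASK;
}

/* Advance one cycle: each address moves one stage down the pipeline and
   the address chosen by the next-address multiplexer enters the fetch stage. */
void pipeline_step(Pipeline *p, uint32_t next_fetch)
{
    p->pc = p->daddr;
    p->daddr = p->faddr;
    p->faddr = next_fetch & ADDR_MASK;
}
```

Starting with a fetch at 0x08 and stepping twice with next_linear leaves 0x0A in the fetch stage, 0x09 in decode, and 0x08 in execute. A branch would be modeled by passing a different next_fetch value.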
To manage events, the sequencer’s interrupt controller handles interrupt processing, determines whether an interrupt is masked, and generates the appropriate interrupt vector address.

With selective caching, the instruction cache lets the DSP access data in program memory and fetch an instruction (from the cache) in the same cycle. The DAG2 data address generator outputs program memory data addresses.

The sequencer evaluates conditional instructions and loop termination conditions using information from the status registers. The loop address stack and loop counter stack support nested loops. The status stack stores status registers for implementing nested interrupt routines.

[Figure 3-2: block diagram showing the system registers (MODE1, MODE2, ASTATX, ASTATY, STKYX, STKYY, USTAT1 through USTAT4), the instruction cache and instruction latch, condition logic with input flags, the loop address stack (LADDR), the loop count stack (CURLCNTR, LCNTR) with loop control, branch control, the timer (TPERIOD, TCOUNT, TIMEXP), the interrupt controller (IRPTL, IMASK, IMASKP), the program counter stack (PCSTK, PCSTKP), the instruction pipeline (FADDR, DADDR, PC), and the next address multiplexer. The multiplexer sources are: next address (+1, linear flow), repeated address (idle), PC-relative address, direct branch, indirect branch (address from DAG2), return address or top of loop, and interrupt vector. Bus connections are the DM data bus (64 bits), PM address bus (32 bits), and PM data bus (64 bits).]

Figure 3-2. Program Sequencer Block Diagram

Table 3-1 and Table 3-2 list the registers within and related to the program sequencer. All registers in the program sequencer are universal registers, so they are accessible to other universal registers and to data memory.
All the sequencer’s registers and the tops of stacks are readable, and all these registers are writable, except for the fetch address, decode address, and PC. Pushing or popping the PC stack is done with a write to the PC stack pointer, which is readable and writable. Pushing or popping the loop address stack requires explicit instructions.

A set of system control registers configures or provides input to the sequencer. These registers appear across the top and within the interrupt controller of Figure 3-2. A bit manipulation instruction permits setting, clearing, toggling, or testing specific bits in the system registers. For information on this instruction (Bit), see ADSP-21160 SHARC DSP Instruction Set Reference. Writes to some of these registers do not take effect on the next cycle. For example, after a write to the MODE1 register to enable ALU saturation mode, the change does not take effect until two cycles after the write. Also, some of these registers do not update on the cycle immediately following a write. It takes an extra cycle before a read of the register returns the new value.

With the lists of sequencer and system registers, Table 3-1 and Table 3-2 summarize the number of extra cycles (latency) for a write to take effect (effect latency) and for a new value to appear in the register (read latency). A “0” indicates that the write takes effect or appears in the register on the next cycle after the write instruction is executed, and a “1” indicates one extra cycle.

Table 3-1. Program Sequencer Registers Read and Effect Latencies

Register   Contents                                       Bits   Read Latency   Effect Latency
FADDR      fetch address                                  24     -              -
DADDR      decode address                                 24     -              -
PC         execute address                                24     -              -
PCSTK      top of PC stack                                24     0              0
PCSTKP     PC stack pointer                               5      1              1
LADDR      top of loop address stack                      32     0              0
CURLCNTR   top of loop count stack (current loop count)   32     0              0
LCNTR      loop count for next DO UNTIL loop              32     0              0

Table 3-2.
System Registers Read and Effect Latencies¹

Register   Contents                              Bits   Read Latency   Maximum Effect Latency
MODE1      mode control bits                     32     0              1
MODE2      mode control bits                     32     0              1
IRPTL      interrupt latch                       32     0              1
IMASK      interrupt mask                        32     0              1
IMASKP     interrupt mask pointer (for nesting)  32     1              1
MMASK      mode mask                             32     0              1
FLAGS      flag inputs                           32     0              1
LIRPTL     link port interrupt latch/mask        32     0              1
ASTATX     arithmetic status flags               32     0              1
ASTATY     arithmetic status flags               32     0              1
STKYX      sticky status flags                   32     0              1
STKYY      sticky status flags                   32     0              1
USTAT1     user-defined status flags             32     0              0
USTAT2     user-defined status                   32     0              0
USTAT3     user-defined status                   32     0              0
USTAT4     user-defined status                   32     0              0

¹ The effect latency listed for each register (for example, MODE1) is a maximum value. Different bits in these registers have different effect latencies ranging from 0 to the maximum value listed. Hence users can write code
– That does not have any dependency on the above effect latencies
– Such that there are delays up to the number of cycles specified in the above Maximum Effect Latency column

The following sections in this chapter explain how to use each of the functional blocks in Figure 3-2 on page 3-4.

• “Instruction Pipeline” on page 3-8
• “Instruction Cache” on page 3-9
• “Branches and Sequencing” on page 3-14
• “Loops and Sequencing” on page 3-20
• “Interrupts and Sequencing” on page 3-31
• “Timer and Sequencing” on page 3-48
• “Stacks and Sequencing” on page 3-50
• “Conditional Sequencing” on page 3-52
• “SIMD Mode and Sequencing” on page 3-55

Instruction Pipeline

The program sequencer determines the next instruction address by examining both the current instruction being executed and the current state of the processor.
If no conditions require otherwise, the DSP executes instructions from program memory in sequential order by incrementing the fetch address. Using its instruction pipeline, the DSP processes instructions in three clock cycles:

• Fetch cycle. The DSP reads the instruction from either the on-chip instruction cache or from program memory.
• Decode cycle. The DSP decodes the instruction, generating conditions that control instruction execution.
• Execute cycle. The DSP executes the instruction; the operations specified by the instruction complete in a single cycle.

These cycles overlap in the pipeline, as shown in Table 3-3. In sequential program flow, when one instruction is being fetched, the instruction fetched in the previous cycle is being decoded, and the instruction fetched two cycles before is being executed. Sequential program flow always has a throughput of one instruction per cycle.

Table 3-3. Pipelined Execution Cycles

Cycles   Fetch   Decode   Execute
1        0x08
2        0x09    0x08
3        0x0A    0x09     0x08
4        0x0B    0x0A     0x09
5        0x0C    0x0B     0x0A

Any non-sequential program flow can potentially decrease the DSP’s instruction throughput. Non-sequential program operations include:

• Program memory data accesses that conflict with instruction fetches
• Jumps
• Subroutine calls and returns
• Interrupts and returns
• Loops

Instruction Cache

Usually, the sequencer fetches an instruction from memory on each cycle. Occasionally, bus constraints prevent some of the data and instructions from being fetched in a single cycle. To alleviate these data flow constraints, the DSP has an instruction cache, which appears in Figure 3-2 on page 3-4. When the DSP executes an instruction that requires data access over the PM data bus, there is a bus conflict because the sequencer uses the PM data bus for fetching instructions. To avoid these conflicts, the DSP caches these instructions, reducing delays.
Except for enabling or disabling the cache, its operation requires no user intervention. For more information, see “Using the Cache” on page 3-12.

The first time the DSP encounters a fetch conflict, the DSP must wait to fetch the instruction on the following cycle, causing a delay. The DSP automatically writes the fetched instruction to the cache to prevent the same delay from happening again. The sequencer checks the instruction cache on every program memory data access. If the instruction needed is in the cache, the instruction fetch from the cache happens in parallel with the program memory data access, without incurring a delay.

Because of the three-stage instruction pipeline, as the DSP executes an instruction (at address n) that requires a program memory data access, this execution creates a conflict with the instruction fetch (at address n+2), assuming sequential execution. The cache stores the fetched instruction (n+2), not the instruction requiring the program memory data access.

If the instruction needed to avoid a conflict is in the cache, the cache provides the instruction while the program memory data access is performed. If the needed instruction is not in the cache, the instruction fetch from memory takes place in the cycle following the program memory data access, incurring one cycle of overhead. The fetched instruction is loaded into the cache, if the cache is enabled and not frozen, so that it is available the next time the same conflict occurs.

Figure 3-3 shows a block diagram of the instruction cache. The cache holds 32 instruction-address pairs. These pairs (or cache entries) are arranged into 16 (15-0) cache sets according to their addresses’ 4 least significant bits (3-0). The two entries in each set (entry 0 and entry 1) each have a valid bit, indicating whether the entry contains a valid instruction.
The least recently used (LRU) bit for each set indicates which entry was not used last (0=entry 0 and 1=entry 1).

[Figure 3-3: diagram of the cache organization. Sets 0 through 15 each hold entry 0 and entry 1. Each entry stores an instruction and address bits 23-4 and has a valid bit; each set has one LRU bit; address bits 3-0 (0000 through 1111) select the set.]

Figure 3-3. Instruction Cache Architecture

The cache places instructions in entries according to the 4 LSBs of the instruction’s address. When the sequencer checks for an instruction to fetch from the cache, it uses the 4 address LSBs as an index to a cache set. Within that set, the sequencer checks the addresses of the two entries, looking for the needed instruction. If the cache contains the instruction, the sequencer uses the entry and updates the LRU bit (if necessary) to indicate the other entry, the one that did not contain the needed instruction.

When the cache does not contain a needed instruction, the cache loads a new instruction and its address, placing these in the least recently used entry of the appropriate cache set and toggling the LRU bit (if necessary).

Using the Cache

After a DSP reset, the cache starts cleared (containing no instructions), unfrozen, and enabled. From then on, the MODE2 register controls the operating mode of the instruction cache. Table A-3 on page A-7 lists all the bits in MODE2. The following bits in MODE2 control cache modes:

• Cache Disable. Bit 4 (CADIS) directs the sequencer to disable the cache (if 1) or enable the cache (if 0). Disabling the cache does not mark the current content of the cache as invalid. If the cache is enabled again, the existing content is used again. To clear the cache, use the FLUSH CACHE instruction.
• Cache Freeze.
Bit 19 (CAFRZ) directs the sequencer to freeze the contents of the cache (if 1) or let new entries displace the entries in the cache (if 0).

If self-modifying code (e.g. software loader kernel) or software overlays are used, execute a FLUSH CACHE instruction followed by a NOP before executing the new code. Otherwise old content from the cache could still be used, although the code has changed.

When changing the cache’s mode, note that an instruction containing a program memory data access must not be placed directly after a cache enable or cache disable instruction, because the DSP must wait at least one cycle before executing the PM data access. A program should have a NOP inserted after the cache enable instruction if necessary.

Optimizing Cache Usage

Usually, cache operation is efficient and requires no intervention, but certain orderings of instructions can work against the cache’s architecture and can degrade cache efficiency. When the order of PM data accesses and instruction fetches continuously displaces cache entries and loads new entries, the cache is not being efficient. Rearranging the order of these instructions remedies this inefficiency.

An example of code that works against cache efficiency appears in Table 3-4.

Table 3-4. Cache-Inefficient Code

Address   Instruction
0x0100    lcntr=1024, do Outer until lce;
0x0101    r0=dm(i0,m0), pm(i8,m8)=f3;
0x0102    r1=r0-r15;
0x0103    if eq call (Inner);
0x0104    f2=float r1;
0x0105    f3=f2*f2;
0x0106    Outer: f3=f3+f4;
0x0107    pm(i8,m8)=f3;
...
0x0200    Inner: r1=R13;
0x0201    r14=pm(i9,m9);
...
0x0211    pm(i9,m9)=r12;
...
0x021F    rts;

The program memory data access at address 0x101 in the loop, Outer, causes the cache to load the instruction at 0x103 (into set 3). Each time the program calls the subroutine, Inner, the program memory data accesses at 0x201 and 0x211 displace the instruction at 0x103 by loading the instructions at 0x203 and 0x213 (also into set 3).
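The displacement described above can be reproduced with a small host-side C model of a 16-set, two-entry-per-set cache with one LRU indicator per set. This is a sketch, not the hardware implementation: the names (ICache, cache_set_index, conflicting_fetch, cache_access) are hypothetical, the full address is stored instead of only bits 23-4, and the cache enable and freeze controls are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS 16
#define WAYS 2

typedef struct {
    bool valid[WAYS];     /* valid bit per entry                      */
    uint32_t addr[WAYS];  /* address of the cached instruction        */
    int lru;              /* index of the least recently used entry   */
} CacheSet;

typedef struct {
    CacheSet set[NUM_SETS];
} ICache;

/* The set is selected by the 4 least significant address bits. */
unsigned cache_set_index(uint32_t addr)
{
    return addr & 0xFu;
}

/* A PM data access by the instruction at n conflicts with the fetch at n+2. */
uint32_t conflicting_fetch(uint32_t n)
{
    return n + 2u;
}

/* Look up addr. On a miss, load it into the least recently used entry.
   Returns true on a hit and false on a miss. */
bool cache_access(ICache *c, uint32_t addr)
{
    CacheSet *s = &c->set[cache_set_index(addr)];
    for (int w = 0; w < WAYS; w++) {
        if (s->valid[w] && s->addr[w] == addr) {
            s->lru = 1 - w;  /* the other entry is now least recently used */
            return true;
        }
    }
    int victim = s->lru;
    s->valid[victim] = true;
    s->addr[victim] = addr;
    s->lru = 1 - victim;
    return false;
}
```

With the addresses from Table 3-4, the fetches at 0x103, 0x203, and 0x213 all map to set 3. After 0x103 is cached, accessing 0x203 and then 0x213 evicts it, so the next access to 0x103 misses again.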
If the program only calls the Inner subroutine rarely during the Outer loop execution, the repeated cache loads do not greatly influence performance. If the program frequently calls the subroutine while in the loop, the cache inefficiency has a noticeable effect on performance.

To improve cache efficiency on this code (for instance, if the execution time of the Outer loop is critical), you should rearrange the order of some instructions. Moving the subroutine up one location (so that it starts at 0x0201) would work here, because the PM data accesses then fall at 0x0202 and 0x0212, and the two cached instructions (at 0x0204 and 0x0214) end up in cache set 4 instead of set 3.

Branches and Sequencing

One of the types of non-sequential program flow that the sequencer supports is branching. A branch occurs when a Jump or Call/return instruction begins execution at a new location, other than the next sequential address. For descriptions of how to use the Jump and Call/return instructions, see ADSP-21160 SHARC DSP Instruction Set Reference. Briefly, these instructions operate as follows.

• A Jump or a Call instruction transfers program flow to another memory location. The difference between a Jump and a Call is that a Call automatically pushes the return address (the next sequential address after the Call instruction) onto the PC stack. This push makes the address available for the Call instruction’s matching return instruction to allow easy return from the subroutine.

• A return instruction causes the sequencer to fetch the instruction at the return address, which is stored at the top of the PC stack. The two types of return instructions are return from subroutine (RTS) and return from interrupt (RTI). While the return from subroutine (RTS) only pops the return address off the PC stack, the return from interrupt (RTI) pops the return address and:

  • Pops the status stack if the ASTATx,y and MODE1 status registers have been pushed for any of the following interrupts: IRQ2-0, timer, or VIRPT.

  • Clears the interrupt’s bit in the interrupt latch register (IRPTL) and the interrupt mask pointer (IMASKP).

You can specify a number of parameters for branches:

• Jump and Call/return instructions can be conditional. The program sequencer can evaluate status conditions to decide whether to execute a branch. If no condition is specified, the branch is always taken. For more information on these conditions, see “Conditional Sequencing” on page 3-52.

• Jump and Call/return instructions can be immediate or delayed. Because of the instruction pipeline, an immediate branch incurs two lost (overhead) cycles. A delayed branch has no overhead. For more information, see “Delayed Branches” on page 3-16.

• Jump instructions that appear within a loop or within an interrupt service routine have additional options. For information on the loop abort (LA) option, see “Loops and Sequencing” on page 3-20. For information on the loop re-entry (LR) option, see “Restrictions On Ending Loops” on page 3-22. For information on the clear interrupt (CI) option, see “Interrupts and Sequencing” on page 3-31.

The sequencer block diagram in Figure 3-2 on page 3-4 shows that branches can be direct or indirect. The difference is that the sequencer generates the address for a direct branch, and the PM data address generator (DAG2) produces the address for an indirect branch.

Direct branches are Jump or Call/return instructions that use an absolute address, one that does not change at runtime (such as a program label), or use a PC-relative address.
Some instruction examples that cause a direct branch are:

    jump fft1024;     {where fft1024 is an address label}
    call (pc,10);     {where (pc,10) is a PC-relative address}

Indirect branches are Jump or Call/return instructions that use a dynamic address (one that changes at runtime) that comes from the PM data address generator. For more information on the data address generator, see “Data Address Generators”. Some instruction examples that cause an indirect branch are:

    jump (m8,i12);    {where (m8,i12) are DAG2 registers}
    call (m9,i13);    {where (m9,i13) are DAG2 registers}

Conditional Branches

The sequencer supports conditional branches. These are Jump or Call/return instructions whose execution is based on testing an If condition. For more information on condition types in If condition instructions, see “Conditional Sequencing” on page 3-52.

Note that the DSP’s Single-Instruction, Multiple-Data mode influences the execution of conditional branches. For more information, see “SIMD Mode and Sequencing” on page 3-55.

Delayed Branches

The instruction pipeline influences how the sequencer handles branches. For immediate branches (Jump and Call/return instructions not specified as delayed branches (DB)), two instruction cycles are lost (NOPs) as the pipeline empties and refills with instructions from the new branch.

As shown in Table 3-5 and Table 3-6, the DSP does not execute the two instructions after the branch, which are in the fetch and decode stages. For a Call, the decode address (the address of the instruction after the Call) is the return address. During the two lost (no-operation) cycles, the pipeline fetches and decodes the first instruction at the branch address.

Table 3-5. Pipelined Execution Cycles For Immediate Branch (Jump/Call)

Cycles   Fetch    Decode            Execute
1        n+2      n+1 -> nop [1]    n
2        j [2]    n+2 -> nop [3]    NOP
3        j+1      j                 NOP
4        j+2      j+1               j

Note that n is the branching instruction, and j is the instruction at the branch address.
[1] n+1 suppressed
[2] For call, n+1 pushed on PC stack
[3] n+2 suppressed

Table 3-6. Pipelined Execution Cycles For Immediate Branch (Return)

Cycles   Fetch    Decode            Execute
1        n+2      n+1 -> nop [1]    n [2]
2        r        n+2 -> nop [3]    NOP
3        r+1      r                 NOP
4        r+2      r+1               r

Note that n is the branching instruction, and r is the instruction at the branch address.
[1] n+1 suppressed
[2] r (n+1 in Table 3-5) popped from PC stack
[3] n+2 suppressed

For delayed branches (Jump and Call/return instructions with the delayed branch (DB) modifier), no instruction cycles are lost in the pipeline, because the DSP executes the two instructions after the branch while the pipeline fills with instructions from the new branch.

As shown in Table 3-7 and Table 3-8, the DSP executes the two instructions after the branch, while the instruction at the branch address is fetched and decoded. In the case of a Call, the return address is the third address after the branch instruction. While delayed branches use the instruction pipeline more efficiently than immediate branches, it is important to note that delayed branch code can be harder to understand because of the instructions between the branch instruction and the actual branch.

Table 3-7. Pipelined Execution Cycles For Delayed Branch (Jump or Call)

Cycles   Fetch    Decode   Execute
1        n+2      n+1      n
2        j [1]    n+2      n+1
3        j+1      j        n+2
4        j+2      j+1      j

Note that n is the branching instruction, and j is the instruction at the branch address.
[1] For call, n+3 pushed on PC stack

Table 3-8. Pipelined Execution Cycles For Delayed Branch (Return)

Cycles   Fetch    Decode   Execute
1        n+2      n+1      n [1]
2        r        n+2      n+1
3        r+1      r        n+2
4        r+2      r+1      r

Note that n is the branching instruction, and r is the instruction at the branch address.
[1] r (n+3 in Table 3-7) popped from PC stack

Besides being somewhat more challenging to code, delayed branches have some limitations that stem from the instruction pipeline architecture. Because the delayed branch instruction and the two instructions that follow it must execute sequentially, the instructions in the two locations that follow a delayed branch instruction may not be any of the following:

• Other branches (no Jump, Call, or return instructions)

• Any manipulations of the PC stack, status stack or loop stacks

• Any writes to the PC stack pointer

• Any loops or other breaks in sequential operation (no Do/Until or Idle instructions)

Development software for the DSP should always flag these types of instructions in the two locations after a delayed branch instruction as code errors.

It is possible to follow a delayed branch instruction with a Jump, Call, or return instruction in one special case. If the sequential branch instructions use mutually exclusive conditions, one branch may follow another. The following example is valid.

    if gt jump (PC, 7) (db);     // if greater than...
    if le jump (PC, 11) (db);    // if less than or equal...

Interrupt processing is also influenced by delayed branches and the instruction pipeline. Because the delayed branch instruction and the two instructions that follow it must execute sequentially, the DSP does not immediately process an interrupt that occurs in between a delayed branch instruction and either of the two instructions that follow. Any interrupt that occurs during these instructions is latched, but not processed until the branch is complete.

During a delayed branch, a program can read the PC stack or PC stack pointer immediately after a delayed call or return, but this read shows that the return address on the PC stack has already been pushed or popped, even though the branch has not occurred yet.
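The difference between immediate and delayed branches in Table 3-5 and Table 3-7 can be summarized with a small Python sketch of the execute stage. This is an illustrative model, not DSP code; the function names are hypothetical, and the model covers only what the execute stage holds on each cycle for a branch at n to address j.

```python
# Illustrative model of the execute stage for a branch at address n to j.
# Function names are hypothetical; the behavior follows Tables 3-5 and 3-7.

def execute_stage_trace(delayed: bool, cycles: int = 4) -> list:
    """Return what the execute stage holds on each cycle, starting at n."""
    # A delayed branch executes n+1 and n+2; an immediate branch turns
    # those two pipeline slots into NOPs.
    slots = ["n+1", "n+2"] if delayed else ["NOP", "NOP"]
    trace = ["n"] + slots
    offset = 0
    while len(trace) < cycles:
        trace.append("j" if offset == 0 else f"j+{offset}")
        offset += 1
    return trace[:cycles]


def call_return_address(n: int, delayed: bool) -> int:
    """Return address pushed by a Call at n: n+1 (immediate) or n+3 (delayed)."""
    return n + 3 if delayed else n + 1


print("immediate:", execute_stage_trace(delayed=False))
print("delayed:  ", execute_stage_trace(delayed=True))
print("delayed call at 0x0100 returns to", hex(call_return_address(0x0100, True)))
```

Both traces reach j on the fourth cycle; the delayed form simply does useful work in the two intervening cycles.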
Loops and Sequencing

Another type of non-sequential program flow that the sequencer supports is looping. A loop occurs when a Do/Until instruction causes the DSP to repeat a sequence of instructions until a condition tests true.

A special condition for terminating a loop is Loop Counter Expired (LCE). This condition tests whether the loop has completed the number of iterations in the LCNTR register. Loops that terminate with conditions other than LCE have some additional restrictions. For more information, see “Restrictions On Ending Loops” on page 3-22 and “Restrictions On Short Loops” on page 3-23. For more information on condition types in Do/Until instructions, see “Conditional Sequencing” on page 3-52.

The DSP’s Single-Instruction, Multiple-Data mode influences the execution of loops. For more information, see “SIMD Mode and Sequencing” on page 3-55.

The Do/Until instruction uses the sequencer’s loop and condition features, which appear in Figure 3-2 on page 3-4. These features provide efficient software loops, without the overhead of additional instructions to branch, test a condition, or decrement a counter. The following code example shows a Do/Until loop that contains three instructions and iterates 30 times.

    LCNTR=30, DO the_end UNTIL LCE;    {loop iterates 30 times}
    R0=DM(I0,M0), F2=PM(I8,M8);
    R1=R0-R15;
    the_end: F4=F2+F3;                 {last instruction in loop}

When executing a Do/Until instruction, the program sequencer pushes the address of the loop’s last instruction and the loop’s termination condition onto the loop address stack. The sequencer also pushes the top-of-loop address (the address of the instruction following the Do/Until instruction) onto the PC stack.

The sequencer’s instruction pipeline architecture influences loop termination.
Because instructions are pipelined, the sequencer must test the termination condition (and, if the loop is counter-based, decrement the counter) before the end of the loop. Based on the test’s outcome, the next fetch either exits the loop or returns to the top-of-loop.

The condition test occurs when the DSP is executing the instruction two locations before the last instruction in the loop (at location e-2, where e is the end-of-loop address). If the condition tests false, the sequencer repeats the loop, fetching the instruction from the top-of-loop address, which is stored on the top of the PC stack. If the condition tests true, the sequencer terminates the loop, fetching the next instruction after the end of the loop and popping the loop and PC stacks.

A special case of loop termination is the loop abort instruction, Jump (LA). This instruction causes an automatic loop abort when it occurs inside a loop. When the loop aborts, the sequencer pops the PC and loop address stacks once. If the aborted loop was nested, the single pop of the stacks leaves the correct values in place for the outer loop.

Table 3-9 and Table 3-10 show the pipeline states for loop iteration and termination.

Table 3-9. Pipelined Execution Cycles For Loop Back (Iteration)

Cycles   Fetch    Decode   Execute
1        e        e-1      e-2 [1]
2        b [2]    e        e-1
3        b+1      b        e
4        b+2      b+1      b

Note that e is the loop end instruction, and b is the loop start instruction.
[1] Termination condition tests false
[2] Loop start address is top of PC stack

Table 3-10. Pipelined Execution Cycles For Loop Termination

Cycles   Fetch      Decode   Execute
1        e          e-1      e-2 [1]
2        e+1 [2]    e        e-1
3        e+2        e+1      e
4        e+3        e+2      e+1

Note that e is the loop end instruction.
[1] Termination condition tests true
[2] Loop aborts and loop stacks pop

Restrictions On Ending Loops

The sequencer’s loop features (which optimize performance in many ways) limit the types of instructions that may appear at or near the end of the loop.

• Nested loops may not use the same end-of-loop instruction address.

• Nested loops with a non-counter-based loop as the outer loop must place the end address of the outer loop at least two addresses after the end address of the inner loop.

• Nested loops with a non-counter-based loop as the outer loop that use the loop abort instruction, Jump (LA), to abort the inner loop may not Jump (LA) to the last instruction of the outer loop.

• An instruction that writes to the loop counter from memory may not be used as the third-to-last instruction of a counter-based loop (at e-2, where e is the end-of-loop address).

• An If Not LCE instruction may not be used as the instruction that follows a write to CURLCNTR from memory.

• Branch (Jump or Call/return) instructions may not be used as any of the last three instructions of a loop. This no end-of-loop branches rule also applies to single-instruction and two-instruction loops with only one iteration.

There is one exception to the no end-of-loop branches rule. The last three instructions of a loop may contain an immediate Call (a Call without a DB modifier) that is paired with a loop re-entry return (a return (RTS) with the loop re-entry modifier (LR)). The immediate Call may be one of the last three instructions of a loop, but not in a one-instruction loop or a two-instruction, single-iteration loop.

Restrictions On Short Loops

The sequencer’s pipeline features (which optimize performance in many ways) restrict how short loops iterate and terminate. Short loops (1- or 2-instruction loops) terminate in a special way because they are shorter than the instruction pipeline.
Counter-based loops (Do/Until LCE) of one or two instructions are not long enough for the sequencer to check the termination condition two instructions from the end of the loop. In these short loops, the sequencer has already looped back when the termination condition is tested. The sequencer provides special handling to avoid overhead (NOP) cycles if the loop is iterated a minimum number of times. Table 3-11 and Table 3-12 show the pipeline execution for counter-based single-instruction loops.

Table 3-11. Pipelined Execution Cycles for Single Instruction Counter-Based Loop With Three Iterations

Cycles   Fetch      Decode   Execute
1        n+2        n+1      n [1]
2        n+1 [2]    n+1      n+1 (pass 1)
3        n+2 [3]    n+1      n+1 (pass 2)
4        n+3        n+2      n+1 (pass 3)
5        n+4        n+3      n+2

Note: n is the loop start instruction, and n+2 is the instruction after the loop.
[1] Loop count (LCNTR) equals 3
[2] No opcode latch or fetch address update; count expired tests true
[3] Loop iteration aborts; PC and loop stacks pop

Table 3-12. Pipelined Execution Cycles for Single Instruction Counter-Based Loop With Two Iterations (Two Overhead Cycles)

Cycles   Fetch      Decode            Execute
1        n+2        n+1               n [1]
2        n+1 [2]    n+1               n+1 (pass 1)
3        n+1 [3]    n+1 -> nop [4]    n+1 (pass 2)
4        n+2        n+1 -> nop [5]    NOP
5        n+3        n+2               NOP
6        n+4        n+3               n+2

Note: n is the loop start instruction, and n+2 is the instruction after the loop.
[1] Loop count (LCNTR) equals 2
[2] No opcode latch or fetch address update
[3] Count expired tests true
[4] Loop iteration aborts; PC and loop stacks pop; n+1 suppressed
[5] n+1 suppressed

Table 3-13 and Table 3-14 show the pipeline execution for counter-based two-instruction loops. For no overhead, a loop of length one must be executed at least three times and a loop of length two must be executed at least twice.
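The minimum-iteration rule for short counter-based loops can be expressed as a small Python helper. This is an illustrative sketch with hypothetical function names; the two-cycle penalty is taken from the "Two Overhead Cycles" cases in the tables of this section, and the model deliberately ignores interrupts and memory stalls.

```python
# Illustrative model of overhead in short counter-based (Do/Until LCE) loops.
# Function names are hypothetical; only 1- and 2-instruction loops are covered.

def short_loop_overhead(length: int, iterations: int) -> int:
    """Return the number of overhead (NOP) cycles for a short loop."""
    if length not in (1, 2):
        raise ValueError("model covers only 1- and 2-instruction loops")
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    # A one-instruction loop needs at least three iterations and a
    # two-instruction loop at least two iterations to avoid overhead.
    minimum_iterations = 3 if length == 1 else 2
    return 0 if iterations >= minimum_iterations else 2


def short_loop_cycles(length: int, iterations: int) -> int:
    """Execute-stage cycles spent in the loop body plus any overhead NOPs."""
    return length * iterations + short_loop_overhead(length, iterations)


for length, iterations in [(1, 3), (1, 2), (2, 2), (2, 1)]:
    print(f"{length}-instruction loop, {iterations} iteration(s): "
          f"{short_loop_cycles(length, iterations)} cycles, "
          f"{short_loop_overhead(length, iterations)} overhead")
```

The four cases printed correspond to the four short-loop pipeline tables in this section.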
Loops of length one that iterate only once or twice and loops of length two that iterate only once incur two cycles of overhead, because there are two aborted instructions after the last iteration to clear the instruction pipeline.

Table 3-13. Pipelined Execution Cycles For Two Instruction Counter-Based Loop With Two Iterations

Cycles   Fetch      Decode   Execute
1        n+2        n+1      n [1]
2        n+1 [2]    n+2      n+1 (pass 1)
3        n+2 [3]    n+1      n+2 (pass 1)
4        n+3 [4]    n+2      n+1 (pass 2)
5        n+4        n+3      n+2 (pass 2)
6        n+5        n+4      n+3

Note: n is the loop start instruction, and n+3 is the instruction after the loop.
[1] Loop count (LCNTR) equals 2
[2] PC stack supplies loop start address
[3] Count expired tests true
[4] Loop iteration aborts; PC and loop stacks pop

Table 3-14. Pipelined Execution Cycles For Two Instruction Counter-Based Loop With One Iteration (Two Overhead Cycles)

Cycles   Fetch      Decode            Execute
1        n+2        n+1               n [1]
2        n+1 [2]    n+2               n+1 (pass 1)
3        n+2 [3]    n+1 -> nop [4]    n+2 (pass 1)
4        n+3        n+2 -> nop [5]    NOP
5        n+4        n+3               NOP
6        n+5        n+4               n+3

Note: n is the loop start instruction, and n+3 is the instruction after the loop.
[1] Loop count (LCNTR) equals 1
[2] PC stack supplies loop start address
[3] Count expired tests true
[4] Loop iteration aborts; PC and loop stacks pop; n+1 suppressed
[5] n+2 suppressed

Processing of an interrupt that occurs during the last iteration of a one-instruction loop that executes once or twice, a two-instruction loop that executes once, or the cycle following one of these loops (which is a NOP) is delayed by one cycle. Similarly, in a one-instruction loop that iterates at least three times, processing is delayed by one cycle if the interrupt occurs during the third-to-last iteration. For more information on pipeline execution during interrupts, see “Interrupts and Sequencing” on page 3-31.

Short non-counter-based loops terminate differently from short counter-based loops.
These differences stem from the architecture of the pipeline and conditional logic.

• In a three-instruction non-counter-based loop, the sequencer tests the termination condition when the DSP executes the top of loop instruction. When the condition tests true, the sequencer completes the iteration of the loop and terminates.

• In a two-instruction non-counter-based loop, the sequencer tests the termination condition when the DSP executes the last (second) instruction. If the condition becomes true when the first instruction is executed, the condition tests true during the second instruction, and the sequencer completes one more iteration of the loop before exiting. If the condition becomes true during the second instruction, the sequencer completes two more iterations of the loop before exiting.

• In a one-instruction non-counter-based loop, the sequencer tests the termination condition every cycle. After the cycle when the condition becomes true, the sequencer completes three more iterations of the loop before exiting.

Loop Address Stack

The sequencer’s loop support, which appears in Figure 3-2 on page 3-4, includes a loop address stack. The loop address stack is six levels deep by 32 bits wide.

The LADDR register contains the top entry on the loop address stack. This register is readable and writable over the DM Data bus. Reading and writing LADDR does not move the loop address stack pointer; only a stack push or pop, performed with explicit instructions, moves the stack pointer. LADDR contains the value 0xFFFF FFFF when the loop address stack is empty. Table A-13 on page A-30 lists all the bits in LADDR.

The sequencer pushes an entry onto the loop address stack when executing a Do/Until or Push Loop instruction. The stack entry pops off the stack two instructions before the end of its loop’s last iteration or on a Pop Loop instruction.
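The loop address stack behavior described here can be sketched as a small Python class. This is an illustrative model with hypothetical names, not a register-accurate simulation: each entry is stored as a single 32-bit value, and dropping an entry pushed onto a full stack is an assumption made for the model, since the text does not say what the hardware does with that data.

```python
# Illustrative model of the six-deep loop address stack and the LADDR view
# of its top entry. Class and attribute names are hypothetical.

class LoopAddressStack:
    DEPTH = 6
    EMPTY_VALUE = 0xFFFFFFFF  # LADDR reads this when the stack is empty

    def __init__(self):
        self._entries = []
        self.overflowed = False  # models a sticky overflow indication

    def push(self, entry: int) -> None:
        """Model a Do/Until or Push Loop instruction."""
        if len(self._entries) == self.DEPTH:
            # Assumption: the extra entry is dropped; only the flag is set.
            self.overflowed = True
            return
        self._entries.append(entry & 0xFFFFFFFF)

    def pop(self) -> None:
        """Model loop termination or a Pop Loop instruction."""
        if self._entries:
            self._entries.pop()

    @property
    def laddr(self) -> int:
        """Top entry of the stack; reading it does not move the pointer."""
        return self._entries[-1] if self._entries else self.EMPTY_VALUE

    @laddr.setter
    def laddr(self, value: int) -> None:
        """Writing LADDR replaces the top entry without pushing."""
        if self._entries:
            self._entries[-1] = value & 0xFFFFFFFF


stack = LoopAddressStack()
print(hex(stack.laddr))      # empty stack
stack.push(0x0106)           # hypothetical end-of-loop entry
print(hex(stack.laddr))
```

Keeping LADDR as a property rather than a push/pop operation mirrors the statement that reading and writing the register does not move the stack pointer.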
A stack overflow occurs if a seventh entry (one more than full) is pushed onto the loop stack. The stack is empty when no entries are occupied.

The loop stacks’ overflow or empty status is available. Because the sequencer keeps the loop stack and loop counter stack synchronized, the same overflow and empty flags apply to both stacks. These flags are in the sticky status register (STKYx). For more information on STKYx, see Table A-5 on page A-13. For more information on how these flags work with the loop stacks, see “Loop Counter Stack”. Note that a loop stack overflow causes a maskable interrupt.

Because the sequencer tests the termination condition two instructions before the end of the loop, the loop stack pops before the end of the loop’s final iteration. If a program reads LADDR at either of these instructions, the value is already the termination address for the next loop stack entry.

Loop Counter Stack

The sequencer’s loop support, which appears in Figure 3-2 on page 3-4, includes a loop counter stack. The sequencer keeps the loop counter stack synchronized with the loop address stack, with both stacks always having the same number of locations occupied. Because these stacks are synchronized, the same empty and overflow status flags from the STKYx register apply to both stacks.

The loop counter stack is six locations deep. The stack is full when all entries are occupied, is empty when no entries are occupied, and is overflowed if a push occurs when the stack is already full. Bits in the STKYx register indicate the loop counter stack overflow and empty states. Table A-5 on page A-13 lists the bits in the STKYx register. The STKYx bits that indicate loop counter stack status are:

• Loop stacks overflowed. Bit 25 (LSOV) indicates that the loop counter stack and loop stack are overflowed (if 1) or not overflowed (if 0). This is a sticky bit.

• Loop stacks empty. Bit 26 (LSEM) indicates that the loop counter stack and loop stack are empty (if 1) or not empty (if 0). This bit is not sticky and is cleared by a push.

Within the sequencer, the current loop counter (CURLCNTR) and loop counter (LCNTR) registers allow access to the loop counter stack. CURLCNTR tracks iterations for a loop being executed, and LCNTR holds the count value before the loop is executed. The two counters let the DSP maintain the count for an outer loop, while a program is setting up the count for an inner loop.

The top entry in the loop counter stack (CURLCNTR) always contains the current loop count. This register is readable and writable over the DM Data bus. Reading CURLCNTR when the loop counter stack is empty returns the value 0xFFFF FFFF.

The sequencer decrements the value of CURLCNTR for each loop iteration. Because the sequencer tests the termination condition two instruction cycles before the end of the loop, the loop counter also decrements before the end of the loop. If a program reads CURLCNTR at either of the last two loop instructions, the value is already the count for the next iteration.

The loop counter stack pops two instructions before the end of the last loop iteration. When the loop counter stack pops, the new top entry of the stack becomes the CURLCNTR value, which is the count in effect for the executing loop. If there is no executing loop, the value of CURLCNTR is 0xFFFF FFFF after the pop.

Writing CURLCNTR does not cause a stack push. If a program writes a new value to CURLCNTR, it changes the count value of the loop currently executing. When no Do/Until LCE loop is executing, writing to CURLCNTR has no effect.

Because the processor must use CURLCNTR to perform counter-based loops, there are some restrictions on how a program can write CURLCNTR. For more information, see “Restrictions On Ending Loops” on page 3-22.
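The relationship between LCNTR and CURLCNTR can be sketched in Python as follows. This is an illustrative model with hypothetical class and method names; it models only the push of LCNTR by a Do/Until LCE instruction, the pop at loop termination, and the read/write rules for CURLCNTR, and it ignores the six-entry depth limit and pipeline timing.

```python
# Illustrative model of the loop counter stack as seen through LCNTR and
# CURLCNTR. Names are hypothetical; depth limits and timing are not modeled.

EMPTY_VALUE = 0xFFFFFFFF  # CURLCNTR reads this when no loop is executing


class LoopCounterStack:
    def __init__(self):
        self.lcntr = 0     # count staged for the next loop
        self._stack = []   # counts of the loops currently executing

    def do_until_lce(self) -> None:
        """Model Do/Until LCE: the staged LCNTR value becomes CURLCNTR."""
        self._stack.append(self.lcntr)

    def loop_terminates(self) -> None:
        """Model the pop at loop termination."""
        if self._stack:
            self._stack.pop()

    @property
    def curlcntr(self) -> int:
        return self._stack[-1] if self._stack else EMPTY_VALUE

    @curlcntr.setter
    def curlcntr(self, value: int) -> None:
        # Writing CURLCNTR never pushes; with no loop executing it is ignored.
        if self._stack:
            self._stack[-1] = value


counters = LoopCounterStack()
counters.lcntr = 10          # outer loop count
counters.do_until_lce()
counters.lcntr = 4           # staging the inner count leaves CURLCNTR alone
print(counters.curlcntr)     # prints 10
counters.do_until_lce()
print(counters.curlcntr)     # prints 4
counters.loop_terminates()
print(counters.curlcntr)     # prints 10
```

The usage lines show why two registers are useful: the outer loop's count stays in effect while the inner loop's count is being set up.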
The next-to-top entry in the loop counter stack (LCNTR) is the location on the stack that takes effect on the next loop stack push. To set up a count value for a nested loop without changing the count for the currently executing loop, a program writes the count value to LCNTR. A value of zero in LCNTR causes a loop to execute 2^32 times.

A Do/Until LCE instruction pushes the value of LCNTR onto the loop counter stack, making that value the new CURLCNTR value. Figure 3-4 demonstrates this process for a set of nested loops. The previous CURLCNTR value is preserved one location down in the stack.

If a program reads LCNTR when the loop counter stack is full, the stack returns invalid data. When the loop counter stack is full, the stack discards any data written to LCNTR.

If a program reads LCNTR during the last two instructions of a terminating loop, the value of LCNTR is the last CURLCNTR value for the loop.

[Figure 3-4. Pushing the Loop Counter Stack for Nested Loops. The figure shows seven snapshots of the loop counter stack. In snapshot 1 the stack is empty (CURLCNTR = 0xFFFF FFFF, LCNTR = 0xAAAA AAAA). In each following snapshot one more loop has been pushed: the previous LCNTR value becomes CURLCNTR and a new value is written to LCNTR, until snapshot 7 holds 0xAAAA AAAA through 0xEEEE EEEE below CURLCNTR = 0xFFFF FFFF.]

Interrupts and Sequencing

Another type of non-sequential program flow that the sequencer supports is interrupt processing. Interrupts may stem from a variety of conditions, both internal and external to the processor. In response to an interrupt, the sequencer processes a subroutine call to a predefined address, the interrupt vector. The DSP assigns a unique vector to each type of interrupt.
The DSP supports three prioritized, individually-maskable external interrupts, each of which can be either level- or edge-sensitive. External interrupts occur when another device asserts one of the DSP’s interrupt inputs (IRQ2-0). The DSP also supports internal interrupts. An internal interrupt can stem from arithmetic exceptions, stack overflows, or circular data buffer overflows.

Several factors control the DSP’s response to an interrupt. The DSP responds to an interrupt request if:

• The DSP is executing instructions or is in an Idle state

• The interrupt is not masked

• Interrupts are globally enabled

• A higher priority request is not pending

When the DSP responds to an interrupt, the sequencer branches program execution with a Call to the corresponding interrupt vector address. Within the DSP’s program memory, the interrupt vectors are grouped in an area called the interrupt vector table. The interrupt vectors in this table are spaced at 4-instruction intervals. For a list of interrupt vector addresses and their associated latch and mask bits, see Table B-1 on page B-2. Each interrupt vector has associated latch and mask bits. Table A-9 on page A-18 lists the latch and mask bits.

To process an interrupt, the DSP’s program sequencer does the following.

1. Outputs the appropriate interrupt vector address

2. Pushes the current PC value (the return address) onto the PC stack

3. Pushes the current value of the ASTATx,y and MODE1 registers onto the status stack (if the interrupt is IRQ2-0, timer, or VIRPT)

4. Sets the appropriate bit in the interrupt latch register (IRPTL)

5. Alters the interrupt mask pointer (IMASKP) to reflect the current interrupt nesting state, depending on the nesting mode

At the end of the interrupt service routine, the sequencer processes the return from interrupt (RTI) instruction and does the following.

1. Returns to the address stored at the top of the PC stack

2. Pops this value off of the PC stack

3. Pops the status stack (if the ASTATx,y and MODE1 status registers were pushed for the IRQ2-0, timer, or VIRPT interrupt)

4. Clears the appropriate bit in the interrupt latch register (IRPTL) and interrupt mask pointer (IMASKP)

Except for reset, all interrupt service routines should end with a return-from-interrupt (RTI) instruction. After reset, the PC stack is empty, so there is no return address. The last instruction of the reset service routine should be a jump to the start of your program.

If software writes to a bit in IRPTL forcing an interrupt, the processor recognizes the interrupt in the following cycle, and two cycles of branching to the interrupt vector follow the recognition cycle.

The DSP responds to interrupts in three stages: synchronization and latching (1 cycle), recognition (1 cycle), and branching to the interrupt vector (2 cycles). Table 3-15, Table 3-16, and Table 3-17 show the pipelined execution cycles for interrupt processing.

Table 3-15. Pipelined Execution Cycles For Interrupt During Single-Cycle Instruction

Cycles   Fetch      Decode            Execute
1        n+1        n                 n-1 [1]
2        n+2 [2]    n+1 -> nop [3]    n
3        v [4]      n+2 -> nop [5]    NOP
4        v+1        v                 NOP
5        v+2        v+1               v

Note that n is the single-cycle instruction, and v is the interrupt vector instruction.
[1] Interrupt occurs
[2] Interrupt recognized
[3] n+1 pushed on PC stack; n+1 suppressed
[4] Interrupt vector output
[5] n+2 suppressed

Table 3-16. Pipelined Execution Cycles For Interrupt During Instruction With Conflicting PM Data Access (Instruction Not Cached)

Cycles   Fetch      Decode            Execute
1        n+1        n                 n-1 [1]
2        - [2]      n+1 -> nop [3]    n
3        n+2 [4]    n+1 -> nop [5]    NOP
4        v [6]      n+2 -> nop [7]    NOP
5        v+1        v                 NOP
6        v+2        v+1               v

Note that n is the conflicting instruction, and v is the interrupt vector instruction.
[1] Interrupt occurs
[2] Interrupt recognized, but not processed; PM data access
[3] n+1 suppressed
[4] Interrupt processed
[5] n+1 suppressed
[6] Interrupt vector output
[7] n+1 pushed on PC stack; n+2 suppressed

Table 3-17. Pipelined Execution Cycles for Interrupt During Delayed Branch Instruction

Cycles   Fetch      Decode            Execute
1        n+1        n                 n-1 [1]
2        n+2 [2]    n+1               n
3        j          n+2               n+1
4        j+1 [3]    j -> nop [4]      n+2
5        v [5]      j+1 -> nop [6]    NOP
6        v+1        v                 NOP
7        v+2        v+1               v

Note that n is the delayed branch instruction, j is the instruction at the branch address, and v is the interrupt vector instruction.
[1] Interrupt occurs
[2] Interrupt recognized, but not processed
[3] Interrupt processed
[4] For a Call, n+3 (return address) is pushed onto the PC stack; j suppressed
[5] Interrupt vector output
[6] j pushed on PC stack; j+1 suppressed

For most interrupts, internal and external, only one instruction is executed after the interrupt occurs (and before the two instructions that are aborted) while the processor fetches and decodes the first instruction of the service routine. Because of the one-cycle delay between an arithmetic exception and the STKYx,y register update, there are two cycles after an arithmetic exception occurs before interrupt processing starts. Table 3-18 lists the latency associated with the IRQ2-0 interrupts and the multiprocessor vector interrupt.

Table 3-18. Minimum Latency of the IRQ2-0 and VIRPT Interrupts

Interrupt   Minimum Latency
IRQ2-0      3 cycles
VIRPT       6 cycles

If nesting is enabled and a higher priority interrupt occurs immediately after a lower priority interrupt, the service routine of the higher priority interrupt is delayed by one additional cycle. This delay allows the first instruction of the lower priority interrupt routine to be executed before it is interrupted. For more information, see “Nesting Interrupts” on page 3-43.

Certain DSP operations that span more than one cycle hold off interrupt processing.
If an interrupt occurs during one of these operations, the DSP latches the interrupt, but delays its processing. The operations that have delayed interrupt processing are as follows.

• A branch (Jump or Call/return) instruction and the following cycle, whether it is an instruction (in a delayed branch) or a NOP (in a non-delayed branch)
• The first of the two cycles used to perform a program memory data access and an instruction fetch when the instruction is not cached
• The third-to-last iteration of a one-instruction loop
• The last iteration of a one-instruction loop executed once or twice or of a two-instruction loop executed once, and the following cycle (which is a NOP)
• The first of the two cycles used to fetch and decode the first instruction of an interrupt service routine
• Any waitstates for external memory accesses
• Any external memory access that is required when the DSP does not have control of the external bus, during a host bus grant or when the DSP is a bus slave in a multiprocessing system

Sensing Interrupts

The DSP supports two types of interrupt sensitivity (the signal shape that triggers the interrupt). On interrupt pins (IRQ2-0), either the input signal's edge or level can trigger an external interrupt.

The DSP detects a level-sensitive interrupt if the signal input is low (active) when sampled on the rising edge of CLKIN. A level-sensitive interrupt must go high (inactive) before the processor returns from the interrupt service routine. If a level-sensitive interrupt is still active when the DSP samples it after returning from its service routine, the DSP treats the signal as a new request, repeating the same interrupt routine without returning to the main program, assuming no higher priority interrupts are active.
The DSP detects an edge-sensitive interrupt if the input signal is high (inactive) on one cycle and low (active) on the next cycle when sampled on the rising edge of CLKIN. An edge-sensitive interrupt signal can stay active indefinitely without triggering additional interrupts. To request another interrupt, the signal must go high, then low again.

Edge-sensitive interrupts require less external hardware than level-sensitive requests, because there is never a need to negate the request. An advantage of level-sensitive interrupts is that multiple interrupting devices may share a single level-sensitive request line on a wired-OR basis, allowing for easy system expansion.

The MODE2 register controls external interrupt sensitivity. Table A-3 on page A-7 lists all bits in the MODE2 register. The following bits in MODE2 control interrupt sensitivity.

• Interrupt 0 Sensitivity. Bit 0 (IRQ0E) directs the DSP to detect IRQ0 as edge-sensitive (if 1) or level-sensitive (if 0).
• Interrupt 1 Sensitivity. Bit 1 (IRQ1E) directs the DSP to detect IRQ1 as edge-sensitive (if 1) or level-sensitive (if 0).
• Interrupt 2 Sensitivity. Bit 2 (IRQ2E) directs the DSP to detect IRQ2 as edge-sensitive (if 1) or level-sensitive (if 0).

The DSP accepts external interrupts that are asynchronous to the DSP's clock (CLKIN), allowing external interrupt signals to change at any time. An external interrupt must be held low at least one CLKIN cycle to guarantee that the DSP samples the signal.

External interrupts must meet the setup and hold time requirements relative to the rising edge of CLKIN. For information on interrupt signal timing requirements, see the DSP's data sheet.

Masking Interrupts

The sequencer supports interrupt masking: latching an interrupt, but not responding to it. Except for the RESET and EMU interrupts, all interrupts are maskable.
If a masked interrupt is latched, the DSP responds to the latched interrupt if it is later unmasked. Interrupts can be masked globally or selectively. Bits in the MODE1, IMASK, and LIRPTL registers control interrupt masking. Table A-2 on page A-3 lists the bits in MODE1, Table A-9 on page A-18 lists the bits in IMASK, and Table A-10 on page A-24 lists the bits in LIRPTL. These bits control interrupt masking as follows.

• Global interrupt enable. MODE1, Bit 12 (IRPTEN), directs the DSP to enable (if 1) or disable (if 0) all interrupts.
• Selective interrupt enable. IMASK, Bits 31-0, direct the DSP to enable (if 1) or disable/mask (if 0) the corresponding interrupt.
• Selective link port interrupt enable. LIRPTL, Bits 21-16 (LPxMSK), direct the DSP to enable (if 1) or disable/mask (if 0) the corresponding link port interrupt.

Except for the non-maskable interrupts and boot interrupts, all interrupts are masked at reset. For booting, the DSP automatically unmasks and uses either the external port (EP0I) or link port (LP4I) interrupt after reset, depending on whether the ADSP-21160 DSP is booting from EPROM, host, or link ports.

Latching Interrupts

When the DSP recognizes an interrupt, the DSP's interrupt latch (IRPTL and LIRPTL) registers latch the interrupt, setting a bit to record that the interrupt occurred. The bits in these registers indicate all interrupts that are currently being serviced or are pending. Because these registers are readable and writable, any interrupt (except reset) can be set or cleared in software. Note that writing to the reset bit (bit 1) in IRPTL puts the processor into an illegal state.

When an interrupt occurs, the sequencer sets the corresponding bit in IRPTL or LIRPTL.
During execution of the interrupt's service routine, the DSP keeps this bit cleared, clearing the bit during every cycle to prevent the same interrupt from being latched while its service routine is executing. After the return from interrupt (RTI), the sequencer stops clearing the latch bit.

If necessary, it is possible to re-use an interrupt while it is being serviced. For more information, see "Reusing Interrupts" on page 3-45.

The interrupt latch bits in IRPTL correspond to interrupt mask bits in the IMASK register. In both registers, the interrupt bits are arranged in order of priority. The interrupt priority is from 0 (highest) to 31 (lowest). Interrupt priority determines which interrupt is serviced first when more than one occurs in the same cycle. Priority also determines which interrupts are nested when the DSP has interrupt nesting enabled. For more information, see "Nesting Interrupts" on page 3-43.

While IRPTL latches interrupts for a variety of events, the LIRPTL register contains latch and mask bits only for Link Port DMA interrupts. A logical Or'ing of link port interrupts (masked-latch state) appears in the LPSUM bit in the IRPTL register. Because the LPSUM bit has a corresponding mask bit in the IMASK register, programs can use LPSUM for a second level of link port interrupt masking.

Multiple events can cause arithmetic interrupts: fixed-point overflow (FIXI) and floating-point overflow (FLTOI), underflow (FLTUI), and invalid operation (FLTII). To determine which event caused the interrupt, a program can read the arithmetic status flags in the STKYx or STKYy status registers. Table A-5 on page A-13 lists the bits in these registers. Service routines for arithmetic interrupts must clear the appropriate STKYx or STKYy bits to clear the interrupt. If the bits are not cleared, the interrupt is still active after the return from interrupt (RTI).

Status bits in STKYy only apply in SIMD mode. For more information, see "Secondary Processing Element (PEy)" on page 2-35.

One event can cause multiple interrupts. The timer decrementing to zero causes two timer expired interrupts, TMZHI (high priority) and TMZLI (low priority). This feature allows selection of the priority for the timer interrupt. Programs should unmask the timer interrupt with the desired priority and leave the other one masked. If both interrupts are unmasked, IRPTL latches both interrupts when the timer reaches zero, and the DSP services the higher priority interrupt first, then the lower priority interrupt.

The IRPTL also supports software interrupts. When a program sets the latch bit for one of these interrupts (SFT0I, SFT1I, SFT2I, or SFT3I), the sequencer services the interrupt, and the DSP branches to the corresponding interrupt routine. Software interrupts have the same behavior as all other maskable interrupts.

Stacking Status During Interrupts

To run in an interrupt driven system, programs depend on the DSP being restored to its pre-interrupt state after an interrupt is serviced. The sequencer's status stack eases the return from interrupt process by eliminating some interrupt service overhead (register saves and restores).

The status stack is fifteen locations deep. The stack is full when all entries are occupied, is empty when no entries are occupied, and is overflowed if a push occurs when the stack is already full. Bits in the STKYx register indicate the status stack full and empty states. Table A-5 on page A-13 lists the bits in the STKYx register. The STKYx bits that indicate status stack status are:

• Status stack overflow. Bit 23 (SSOV) indicates that the status stack is overflowed (if 1) or not overflowed (if 0); this is a sticky bit.
• Status stack empty. Bit 24 (SSEM) indicates that the status stack is empty (if 1) or not empty (if 0); this bit is not sticky and is cleared by a Push.
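The flag behavior in the list above can be illustrated with a small software model. This is a minimal Python sketch, not DSP code: the class and method names are hypothetical, and the handling of a push onto a full stack (the new entry is dropped) and of a pop from an empty stack (an error is raised) are assumptions of the model, since the text does not specify them.

```python
STATUS_STACK_DEPTH = 15   # "fifteen locations deep"
SSOV = 1 << 23            # STKYx bit 23: overflow, sticky
SSEM = 1 << 24            # STKYx bit 24: empty, not sticky


class StatusStackModel:
    """Hypothetical model of the status stack flags; not hardware-accurate."""

    def __init__(self):
        self.entries = []
        self.overflowed = False

    def push(self, astatx, astaty, mode1):
        if len(self.entries) == STATUS_STACK_DEPTH:
            # Assumption: a push onto a full stack drops the new entry.
            self.overflowed = True   # sticky: nothing in this model clears it
            return
        self.entries.append((astatx, astaty, mode1))

    def pop(self):
        # Assumption: popping an empty stack raises IndexError in this model.
        return self.entries.pop()

    def stky_flags(self):
        """Return the SSOV/SSEM bits as they would appear in STKYx."""
        flags = SSOV if self.overflowed else 0
        if not self.entries:
            flags |= SSEM
        return flags


stack = StatusStackModel()
print(hex(stack.stky_flags()))   # empty stack: only SSEM is set
stack.push(0x0, 0x0, 0x1000)
print(hex(stack.stky_flags()))   # one entry: the push cleared SSEM
```

The model keeps SSOV set after later pops, which is what "sticky" means here; SSEM simply tracks whether any entry is present.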
For some interrupts (IRQ2-0, timer expired, and VIRPT), the sequencer automatically pushes the ASTATx, ASTATy, and MODE1 registers onto the status stack. When the sequencer pushes an entry onto the status stack, the DSP uses the MMASK register to set up the MODE1 register.

The sequencer automatically pops the ASTATx, ASTATy, and MODE1 registers from the status stack during the return from interrupt instruction (RTI). In one other case, Jump (CI), the sequencer pops the stack. For more information, see "Reusing Interrupts" on page 3-45.

Only the IRQ2-0, timer expired, and VIRPT interrupts cause the sequencer to push an entry onto the status stack. All other interrupts require explicit saves and restores of affected registers or require an explicit push or pop of the stack (Push/Pop Sts).

Pushing ASTATx, ASTATy, and MODE1 preserves the status and control bit settings, allowing a service routine to alter these bits with the knowledge that the original settings are automatically restored upon the return from the interrupt.

The top of the status stack contains the current values of ASTATx, ASTATy, and MODE1. Reading and writing these registers does not move the stack pointer. Explicit Push or Pop instructions do move the status stack pointer.

Nesting Interrupts

The sequencer supports interrupt nesting: responding to another interrupt while a previous interrupt is being serviced. Bits in the MODE1, IMASKP, and LIRPTL registers control interrupt nesting. Table A-2 on page A-3 lists the bits in MODE1, Table A-9 on page A-18 lists the bits in IMASKP, and Table A-10 on page A-24 lists the bits in LIRPTL. These bits control interrupt nesting as follows.

• Interrupt nesting enable. MODE1, Bit 11 (NESTM), directs the DSP to enable (if 1) or disable (if 0) interrupt nesting.
• Interrupt Mask Pointer. IMASKP, 32 Bits, lists the interrupts in priority order and provides a temporary interrupt mask for each nesting level.
• Link Port DMA Interrupt Mask Pointer. LIRPTL, Bits 21-16 (LPxMSK), lists link port DMA interrupts in priority order and provides a temporary interrupt mask for each nesting level.

When interrupt nesting is disabled, a higher priority interrupt cannot interrupt a lower priority interrupt's service routine. Other interrupts are latched as they occur, but the DSP processes them after the active routine finishes.

When interrupt nesting is enabled, a higher priority interrupt can interrupt a lower priority interrupt's service routine. Lower priority interrupts are latched as they occur, but the DSP processes them after the nested routines finish.

Programs should only change the interrupt nesting enable (NESTM) bit while outside of an interrupt service routine or during the reset service routine.

If nesting is enabled and a higher priority interrupt occurs immediately after a lower priority interrupt, the service routine of the higher priority interrupt is delayed by one cycle. This delay allows the first instruction of the lower priority interrupt routine to be executed before it is interrupted.

When servicing nested interrupts, the DSP uses the interrupt mask pointer (IMASKP) to create a temporary interrupt mask for each level of interrupt nesting; the IMASK value is not affected. The DSP changes IMASKP each time a higher priority interrupt interrupts a lower priority service routine.

The bits in IMASKP correspond to the interrupts in order of priority. When an interrupt occurs, the DSP sets its bit in IMASKP. If nesting is enabled, the DSP uses IMASKP to generate a new temporary interrupt mask, masking all interrupts of equal or lower priority to the highest priority bit set in IMASKP and keeping higher priority interrupts the same as in IMASK.
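The temporary mask rule just described can be written out as bit arithmetic. The following Python sketch is a behavioral illustration only, assuming that bit 0 is the highest priority and bit 31 the lowest (as stated under "Latching Interrupts"). The function names and the bit positions in the example are hypothetical and do not correspond to actual interrupt assignments.

```python
def highest_priority_bit(value):
    """Index of the lowest-numbered set bit (bit 0 = highest priority), or None."""
    if value == 0:
        return None
    return (value & -value).bit_length() - 1


def temporary_mask(imask, imaskp):
    """Temporary interrupt mask derived from IMASK and IMASKP.

    Interrupts of equal or lower priority than the highest-priority bit set
    in IMASKP are masked; higher-priority interrupts keep their IMASK setting.
    """
    level = highest_priority_bit(imaskp)
    if level is None:
        return imask & 0xFFFFFFFF             # nothing in service: IMASK applies as is
    higher_priority_only = (1 << level) - 1   # bits 0 .. level-1
    return imask & higher_priority_only


# Hypothetical example: interrupts at bit positions 3, 8, and 12 are unmasked.
imask = (1 << 3) | (1 << 8) | (1 << 12)

# While the bit-8 interrupt is serviced, only bit 3 can still interrupt it.
during_level_8 = temporary_mask(imask, imaskp=1 << 8)

# If bit 3 then nests, none of the three can interrupt its service routine.
during_level_3 = temporary_mask(imask, imaskp=(1 << 8) | (1 << 3))
```

Because IMASK itself is never modified, clearing a bit in IMASKP and recomputing the mask is enough to restore the previous nesting level.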
When a return from an interrupt service routine (RTI) is executed, the DSP clears the highest priority bit set in IMASKP and generates a new temporary interrupt mask, masking all interrupts of equal or lower priority to the highest priority bit set in IMASKP. The bit set in IMASKP that has the highest priority always corresponds to the priority of the interrupt being serviced.

If an interrupt re-occurs while its service routine is running and nesting is enabled, the DSP updates IRPTL, but does not service the interrupt. The DSP waits until the return from interrupt (RTI) completes before vectoring to the service routine again.

If nesting is not enabled, the DSP masks out all interrupts and IMASKP is not used, but the DSP still updates IMASKP to create a temporary interrupt mask.

The interrupt controller uses the IMASKP register and the LPxMSKP bits of the LIRPTL register. To ensure proper functioning of the interrupt controller, programs should not modify these bits.

Reusing Interrupts

Unless interrupt nesting is enabled, the DSP ignores and does not latch an interrupt that re-occurs while its service routine is executing. When the interrupt initially occurs, the sequencer sets the corresponding bit in IRPTL. During execution of the service routine, the sequencer keeps this bit cleared; the DSP clears the bit during every cycle, preventing the same interrupt from being latched while its service routine is already executing.

If necessary, it is possible to re-use an interrupt while it is being serviced. Using a Jump clear interrupt instruction, Jump (CI), in the interrupt service routine clears the interrupt, allowing its re-use while the service routine is executing.

The Jump (CI) instruction reduces an interrupt service routine to a normal subroutine, clearing the appropriate bit in the interrupt latch and interrupt mask pointer and popping the status stack.
After the Jump (CI) instruction, the DSP stops automatically clearing the interrupt's latch bit, allowing the interrupt to latch again.

When returning from a subroutine entered with a Jump (CI) instruction, a program must use a return loop re-entry instruction, RTS (LR). For more information, see "Restrictions On Ending Loops" on page 3-22. The following example shows an interrupt service routine that is reduced to a subroutine with the (CI) modifier:

    instr1;                {interrupt entry from main program}
    JUMP(PC,3) (DB,CI);    {clear interrupt status}
    instr3;
    instr4;
    instr5;
    RTS (LR);              {use LR modifier with return from subroutine}

The Jump(PC,3)(DB,CI) instruction actually only continues linear execution flow by jumping to the location PC + 3 (instr5), with the two intervening instructions (instr3, instr4) being executed because of the delayed branch (DB). This Jump instruction is only an example; a Jump (CI) can be to any location.

Interrupting IDLE

The sequencer supports placing the DSP in Idle, a special instruction that halts the processor core in a low-power state until an external interrupt (IRQ2-0), timer interrupt, DMA interrupt, or VIRPT vector interrupt occurs. When executing an Idle instruction, the sequencer fetches one more instruction at the current fetch address and then suspends operation. The DSP's I/O processor is not affected by the Idle instruction; DMA transfers to or from internal memory continue uninterrupted.

The processor's internal clock and timer (if it is enabled) continue to run during Idle. When an external interrupt (IRQ2-0), timer interrupt, DMA interrupt, or VIRPT vector interrupt occurs, the processor responds normally. After two cycles used to fetch and decode the first instruction of the interrupt service routine, the processor continues executing instructions normally.

Multiprocessing Interrupts

The sequencer supports a multiprocessor vector interrupt.
The vector interrupt (VIRPT) permits passing interprocessor commands in multiple-processor systems. This interrupt occurs when an external processor (a host or another DSP) writes an address to the VIRPT register, inserting a new vector address for VIRPT. There is room in the VIRPT register for the vector address and data for the service routine. Table A-18 on page A-48 lists the bits in the VIRPT registers.

When servicing a VIRPT interrupt, the DSP automatically pushes the status stack and executes the service routine located at the address specified in VIRPT. During the return from interrupt (RTI), the DSP automatically pops the status stack.

To flag that a VIRPT interrupt is pending, the DSP sets the VIPD bit in the SYSTAT register when the external processor writes to the VIRPT register. Programs passing interprocessor commands must monitor VIPD to check if the DSP can receive a new VIRPT address because:

• If an external processor writes VIRPT while a previous vector is pending, the new VIRPT address replaces the previous pending one.
• If an external processor writes VIRPT while a previous vector is executing, the new VIRPT address does not execute (no new interrupt is triggered).

When returning from a VIRPT interrupt, the DSP clears the VIPD bit. Note that if a DSP writes to its own VIRPT register, the write is ignored.

Timer and Sequencing

The sequencer includes a programmable interval timer, which appears in Figure 3-2 on page 3-4. Bits in the MODE2, TCOUNT, and TPERIOD registers control timer operations. Table A-3 on page A-7 lists the bits in MODE2. The bits that control the timer are:

• Timer enable. MODE2, Bit 5 (TIMEN), directs the DSP to enable (if 1) or disable (if 0) the timer.
• Timer count. (TCOUNT) This register contains the decrementing timer count value, counting down the cycles between timer interrupts.
• Timer period. (TPERIOD) This register contains the timer period, indicating the number of cycles between timer interrupts.

The TCOUNT register contains the timer counter. The timer decrements the TCOUNT register each clock cycle. When the TCOUNT value reaches zero, the timer generates an interrupt and asserts the TIMEXP output high for four core cycles (when the timer is enabled) as shown in Figure 3-4 on page 3-31. On the clock cycle after TCOUNT reaches zero, the timer automatically reloads TCOUNT from the TPERIOD register.

The TPERIOD value specifies the frequency of timer interrupts. The number of cycles between interrupts is TPERIOD + 1. The maximum value of TPERIOD is $2^{32} - 1$.

To start and stop the timer, programs use the MODE2 register's TIMEN bit. With the timer disabled (TIMEN=0), the program loads TCOUNT with an initial count value and loads TPERIOD with the number of cycles for the desired interval. Then, the program enables the timer (TIMEN=1) to begin the count.

When a program enables the timer, the timer starts decrementing the TCOUNT register at the end of the next clock cycle. If the timer is subsequently disabled, the timer stops decrementing TCOUNT after the next clock cycle as shown in Figure 3-5.

The timer expired event (TCOUNT decrements to zero) generates two interrupts, TMZHI and TMZLI. For information on latching and masking these interrupts to select timer expired priority, see "Latching Interrupts" on page 3-40.

Figure 3-5. Timer Enable and Disable (timing diagram against CLKIN: after TIMEN is set in MODE2, TCOUNT holds at N for one more cycle, then decrements to N-1; after TIMEN is cleared in MODE2, TCOUNT decrements once more from M-1 to M-2, then holds at M-2)

As with other interrupts, the sequencer needs two cycles to fetch and decode the first instruction of the timer expired service routine, before executing the routine.
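The relationship between TPERIOD and the interrupt interval (TPERIOD + 1 cycles, with a 32-bit maximum) is easy to get off by one. The Python sketch below only does this arithmetic; the function names are hypothetical, and the 80 MHz clock in the example is an assumed figure for illustration, with the timer assumed to count once per cycle of that clock.

```python
MAX_TPERIOD = 2**32 - 1   # largest value the 32-bit TPERIOD register can hold


def tperiod_for_interval(cycles_between_interrupts):
    """TPERIOD value that yields the requested number of cycles between interrupts."""
    if not 1 <= cycles_between_interrupts <= MAX_TPERIOD + 1:
        raise ValueError("interval does not fit a 32-bit TPERIOD")
    return cycles_between_interrupts - 1   # interval = TPERIOD + 1


def interrupt_rate_hz(tperiod, clock_hz):
    """Timer interrupt rate for a given TPERIOD and timer clock frequency."""
    return clock_hz / (tperiod + 1)


# Hypothetical example: one interrupt every 80,000 cycles of an assumed 80 MHz clock.
tperiod = tperiod_for_interval(80_000)
rate = interrupt_rate_hz(tperiod, clock_hz=80_000_000)
```

A program would typically load the same value into TCOUNT before setting TIMEN so that the first interval matches the later ones; that choice belongs to the program and is not enforced by the timer.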
The pipeline execution for the timer interrupt appears in Table 3-15 on page 3-34.

Programs can read and write the TPERIOD and TCOUNT registers, using universal register transfers. Reading the registers does not affect the timer. Note that an explicit write to TCOUNT takes priority over the sequencer's loading TCOUNT from TPERIOD and the timer's decrementing of TCOUNT. Also note that TCOUNT and TPERIOD are not initialized at reset; programs should initialize these registers before enabling the timer.

Stacks and Sequencing

The sequencer includes a Program Counter (PC) stack, which appears in Table 3-2 on page 3-6. At the start of a subroutine or loop, the sequencer pushes return addresses for subroutines (Call/return instructions) and top-of-loop addresses for loops (Do/Until instructions) onto the PC stack. The sequencer pops the PC stack during a return from interrupt (RTI), return from subroutine (RTS), and loop termination.

The PC stack is 30 locations deep. The stack is full when all entries are occupied, is empty when no entries are occupied, and is overflowed if a push occurs when the stack is already full. Bits in the STKYx register indicate the PC stack full and empty states. Table A-5 on page A-13 lists the bits in the STKYx register. The STKYx bits that indicate PC stack status are:

• PC stack full. Bit 21 (PCFL) indicates that the PC stack is full (if 1) or not full (if 0); this bit is not sticky and is cleared by a Pop.
• PC stack empty. Bit 22 (PCEM) indicates that the PC stack is empty (if 1) or not empty (if 0); this bit is not sticky and is cleared by a Push.

The PC stack full condition causes a maskable interrupt (SOVFI). This interrupt occurs when the PC stack has 29 locations filled (the almost full state). The PC stack full interrupt occurs when one location is left, because the PC stack full service routine needs that last location for its return address.
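The reason the full interrupt fires at 29 entries rather than 30 can be seen in a small software model. This Python sketch is illustrative only: the class, attribute, and method names are hypothetical, and treating a push onto a completely full stack as a Python exception is an assumption of the model, not a description of the hardware.

```python
PC_STACK_DEPTH = 30
ALMOST_FULL = PC_STACK_DEPTH - 1   # SOVFI is raised when 29 locations are filled


class PCStackModel:
    """Hypothetical model of PC stack depth tracking; not hardware-accurate."""

    def __init__(self):
        self.addresses = []
        self.sovfi_latched = False

    def push(self, address):
        if len(self.addresses) == PC_STACK_DEPTH:
            # Assumption: overflow is modeled as an error.
            raise OverflowError("push onto a full PC stack")
        self.addresses.append(address)
        if len(self.addresses) == ALMOST_FULL:
            # One location remains for the service routine's return address.
            self.sovfi_latched = True

    def pop(self):
        return self.addresses.pop()

    @property
    def empty(self):
        """Corresponds to the PCEM flag."""
        return not self.addresses


pc_stack = PCStackModel()
for depth in range(ALMOST_FULL):
    pc_stack.push(0x1000 + depth)    # 29 nested return addresses
# SOVFI is now latched; the service routine's return address still fits.
pc_stack.push(0x2000)
```

In the model, the thirtieth push succeeds precisely because the interrupt was raised one entry early.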
The address of the top of the PC stack is available in the PC stack pointer (PCSTKP) register. The value of PCSTKP is zero when the PC stack is empty, is 1...30 when the stack contains data, and is 31 when the stack overflows. This register is a readable and writable register. A write to PCSTKP takes effect after a one-cycle delay. If the PC stack is overflowed, a write to PCSTKP has no effect.

The overflow and full flags provide diagnostic aid only. Programs should not use these flags for runtime recovery from overflow. Note that the status stack, loop stack overflow, and PC stack full conditions trigger a maskable interrupt.

The empty flags can ease stack saves to memory. Programs can monitor the empty flag when saving a stack to memory to determine when the DSP has transferred all values.

Conditional Sequencing

The sequencer supports conditional execution with conditional logic that appears in Figure 3-2 on page 3-4. This logic evaluates conditions for conditional (If) instructions and loop (Do/Until) terminations. The conditions are based on information from the arithmetic status registers (ASTATx and ASTATy), the mode control 1 register (MODE1), the flag inputs, and the loop counter. For more information on arithmetic status, see "Using Computational Status" on page 2-7.

When in SIMD mode, conditional execution is affected by the arithmetic status of both processing elements. For information on conditional sequencing in SIMD mode, see "SIMD Mode and Sequencing" on page 3-55.

Each condition that the DSP evaluates has an assembler mnemonic. The condition mnemonics for conditional instructions appear in Table 3-19 on page 3-53. For most conditions, the sequencer can test both true and false states. For example, the sequencer can evaluate ALU equal-to-zero (EQ) and ALU not-equal-to-zero (NE).
To test conditions that do not appear in Table 3-19, a program can use the Test Flag (TF) condition that is generated from a Bit Test Flag (BTF) instruction. The TF flag is set or cleared as a result of a Bit Test or Bit XOR instruction, which can test the contents of any of the DSP's system registers, including STKYx and STKYy.

Table 3-19. IF Condition and Do/Until Termination Mnemonics

Condition From | Description | True if… | Mnemonic
ALU | ALU = 0 | AZ = 1 | EQ
ALU | ALU ≠ 0 | AZ = 0 | NE
ALU | ALU > 0 | footnote (1) | GT
ALU | ALU < 0 | footnote (2) | LT
ALU | ALU ≥ 0 | footnote (3) | GE
ALU | ALU ≤ 0 | footnote (4) | LE
ALU | ALU carry | AC = 1 | AC
ALU | ALU not carry | AC = 0 | NOT AC
ALU | ALU overflow | AV = 1 | AV
ALU | ALU not overflow | AV = 0 | NOT AV
Multiplier | Multiplier overflow | MV = 1 | MV
Multiplier | Multiplier not overflow | MV = 0 | NOT MV
Multiplier | Multiplier sign | MN = 1 | MS
Multiplier | Multiplier not sign | MN = 0 | NOT MS
Shifter | Shifter overflow | SV = 1 | SV
Shifter | Shifter not overflow | SV = 0 | NOT SV
Shifter | Shifter zero | SZ = 1 | SZ
Shifter | Shifter not zero | SZ = 0 | NOT SZ
Bit Test | Bit test flag true | BTF = 1 | TF
Bit Test | Bit test flag false | BTF = 0 | NOT TF
Flag Input | Flag0 asserted | FI0 = 1 | FLAG0_IN
Flag Input | Flag0 not asserted | FI0 = 0 | NOT FLAG0_IN
Flag Input | Flag1 asserted | FI1 = 1 | FLAG1_IN
Flag Input | Flag1 not asserted | FI1 = 0 | NOT FLAG1_IN
Flag Input | Flag2 asserted | FI2 = 1 | FLAG2_IN
Flag Input | Flag2 not asserted | FI2 = 0 | NOT FLAG2_IN
Flag Input | Flag3 asserted | FI3 = 1 | FLAG3_IN
Flag Input | Flag3 not asserted | FI3 = 0 | NOT FLAG3_IN
Mode | Bus master true | | BM
Mode | Bus master false | | NOT BM
Sequencer | Loop counter expired (Do) | CURLCNTR = 1 | LCE
Sequencer | Loop counter not expired (If) | CURLCNTR ≠ 1 | NOT LCE
Sequencer | Always false (Do) | Always | FOREVER
Sequencer | Always true (If) | Always | TRUE

(1) ALU greater than (GT) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN)] or AZ = 0
(2) ALU less than (LT) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN and AZ)] = 1
(3) ALU greater than or equal (GE) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN and AZ)] = 0
(4) ALU less than or equal (LE) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN)] or AZ = 1

The two conditions that do not have complements are LCE/Not LCE (loop counter expired/not expired) and True/Forever. The context of these condition codes determines their interpretation. Programs should use True and Not LCE in conditional (IF) instructions. Programs should use Forever and LCE to specify loop (Do/Until) termination. A Do Forever instruction executes a loop indefinitely, until an interrupt or reset intervenes.

There are some restrictions on how programs may use conditions in Do/Until loops. For more information, see "Restrictions On Ending Loops" on page 3-22 and "Restrictions On Short Loops" on page 3-23.

The bus master condition (BM) indicates whether the DSP is the current bus master in a multiprocessor system. To enable testing this condition, a program must clear the MODE1 register's Condition Code Select (CSEL) bits. Otherwise, the bus master condition is always false.
SIMD Mode and Sequencing

The DSP supports a Single-Instruction, Multiple-Data (SIMD) mode. In this mode, both of the DSP's processing elements (PEx and PEy) execute instructions and generate status conditions. For more information on SIMD computations, see "Secondary Processing Element (PEy)" on page 2-35.

Because the two processing elements can generate different outcomes, the sequencer must evaluate conditions from both elements (in SIMD mode) for conditional (If) instructions and loop (Do/Until) terminations. The DSP records status for the PEx element in the ASTATx and STKYx registers. The DSP records status for the PEy element in the ASTATy and STKYy registers. Table A-4 on page A-9 lists the bits in ASTATx and ASTATy, and Table A-5 on page A-13 lists the bits in STKYx and STKYy.

Even though the DSP has dual processing elements, the sequencer does not have dual sets of stacks. There is one PC stack, one loop address stack, and one loop counter stack. The status bits for stacks are in STKYx and are not duplicated in STKYy. In SIMD mode, the status stack stores both ASTATx and ASTATy. A status stack Push or Pop instruction in SIMD mode affects both registers in parallel.

While in SIMD mode, the sequencer evaluates conditions from both PEs for conditional (If) and loop (Do/Until) instructions. Table 3-20 summarizes how the sequencer resolves each conditional test when SIMD mode is enabled.

Table 3-20. Conditional Execution Summary

Conditional Operation | Conditional Outcome Depends On …
Compute Operations | Executes in each PE independently depending on condition test in each PE
Branches and Loops | Executes in sequencer depending on And'ing condition test on both PEs
Data Moves (from complementary pair (1) to complementary pair, including X<->Y swap) | Executes move in each PE (and/or memory) independently depending on condition test in each PE
Data Moves (from uncomplemented universal register (2) to complementary pair (1)) | Executes move in each PE (and/or memory) independently depending on condition test in each PE; the same uncomplemented universal register is the source for each move
Data Moves (from complementary pair (1) to uncomplemented register (2)) | Executes explicit move to uncomplemented universal register depending on condition test in PEx only; no implicit move occurs
DAG Operations | Executes modify (3) in DAG depending on Or'ing condition test on both PEs

(1) Complementary pairs are registers with SIMD complements, including PEx/y data registers and the USTAT1/2, USTAT3/4, ASTATx/y, STKYx/y, and PX1/2 universal registers. This also includes internal memory for conditional execution.
(2) Uncomplemented registers are universal registers that do not have SIMD complements.
(3) Post-modify operations follow this rule, but pre-modify operations always occur regardless of the outcome.

Conditional Compute Operations

While in SIMD mode, a conditional compute operation may execute on both PEs, either one PE, or neither PE, depending on the outcome of the status flag test. Flag testing is independently performed on each PE.

Conditional Branches and Loops

The DSP executes a conditional branch (Jump or Call/return) or loop (Do/Until) based on the result of And'ing the condition tests on both PEs. A conditional branch or loop in SIMD mode occurs only when the condition is true in PEx and PEy.

Using complementary conditions (for example EQ and NE), programs can produce an Or'ing of the condition tests for branches and loops in SIMD mode. A conditional branch or loop that uses this technique should consist of a series of conditional compute operations.
These conditional computes generate NOPs on the PE where a branch or loop does not execute. For more information on programming in SIMD mode, see ADSP-21160 SHARC DSP Instruction Set Reference.

Conditional Data Moves

The execution of a conditional (If) data move (register-to-register and register-to/from-memory) instruction depends on three factors:

• The explicit data move depends on the evaluation of the conditional test in the PEx processing element.
• The implicit data move depends on the evaluation of the conditional test in the PEy processing element.
• Both moves depend on the types of registers used in the move.

There are four cases for SIMD conditional data moves.

Case 1: Complementary Register Pair Data Move

In this case, data moves from a complementary register pair to a complementary register pair. The DSP executes the explicit move depending on the evaluation of the conditional test in the PEx processing element and the implicit move depending on the evaluation of the conditional test in the PEy processing element.

Example: Register to Memory Move – PEx Explicit Register

    IF EQ DM(I0,M0) = R2;

For this instruction, the DSP is operating in SIMD mode, a register in the PEx data register file is the explicit register, and I0 points to an even address in internal memory. This example uses indirect addressing; the same results occur with direct addressing. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-21.

Table 3-21. Register to Memory Moves – Complementary Pairs

Condition in PEx (AZx) and condition in PEy (AZy), with the explicit and implicit results:

AZx  AZy  Explicit                                     Implicit
0    0    No data move occurs                          No data move occurs
0    1    No data move occurs from r2 to location I0   s2 transfers to location (I0+1)
1    0    r2 transfers to location I0                  No data move occurs from s2 to location (I0+1)
1    1    r2 transfers to location I0                  s2 transfers to location (I0+1)

Example: Register to Memory Move – PEy Explicit Register

    IF EQ DM(I0,M0) = S2;

For this instruction, the DSP is operating in SIMD mode, a register in the PEy data register file is the explicit register, and I0 points to an even address in internal memory. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-22.

Table 3-22. Register to Memory Moves – Complementary Pairs

AZx  AZy  Explicit                                     Implicit
0    0    No data move occurs                          No data move occurs
0    1    No data move occurs from s2 to location I0   r2 transfers to location (I0+1)
1    0    s2 transfers to location I0                  No data move occurs from r2 to location (I0+1)
1    1    s2 transfers to location I0                  r2 transfers to location (I0+1)

Examples: Register to Register Move Instructions

    IF EQ R9 = R2;
    IF EQ PX1 = R2;
    IF EQ USTAT1 = R2;

For these instructions, the DSP is operating in SIMD mode, and registers in the PEx data register file are used as the explicit registers. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-23.

Table 3-23. Register to Register Moves – Complementary Pairs

AZx  AZy  Explicit                                              Implicit
0    0    No data move occurs                                   No data move occurs
0    1    No data move to registers r9, px1, or ustat1 occurs   s2 transfers to registers s9, px2, and ustat2
1    0    r2 transfers to registers r9, px1, and ustat1         No data move to registers s9, px2, or ustat2 occurs
1    1    r2 transfers to registers r9, px1, and ustat1         s2 transfers to registers s9, px2, and ustat2

Examples: Register to Register Move Instructions

    IF EQ R9 = S2;
    IF EQ PX1 = S2;
    IF EQ USTAT1 = S2;

For these instructions, the DSP is operating in SIMD mode, and registers in the PEy data register file are used as the explicit registers. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-24.

Table 3-24. Register to Register Moves – Complementary Register Pairs

AZx  AZy  Explicit                                              Implicit
0    0    No data move occurs                                   No data move occurs
0    1    No data move to registers r9, px1, or ustat1 occurs   r2 transfers to registers s9, px2, and ustat2
1    0    s2 transfers to registers r9, px1, and ustat1         No data move to registers s9, px2, or ustat2 occurs
1    1    s2 transfers to registers r9, px1, and ustat1         r2 transfers to registers s9, px2, and ustat2

Case 2: Uncomplemented to Complementary Register Move

In this case, data moves from an uncomplemented register (a Ureg without a SIMD complement) to a complementary register pair. The DSP executes the explicit move depending on the evaluation of the conditional test in the PEx processing element. The DSP executes the implicit move depending on the evaluation of the conditional test in the PEy processing element. In each processing element where the move occurs, the content of the source register is duplicated in the destination.
Example: Register to Register Move

    IF EQ R1 = PX;

While PX1 and PX2 are complementary registers, the combined PX register has no complementary register. For more information, see “Internal Data Bus Exchange” on page 5-7.

For this instruction, the DSP is operating in SIMD mode. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-25.

Table 3-25. Uncomplemented to Complementary Register Move

AZx  AZy  Explicit                Implicit
0    0    r1 remains unchanged    s1 remains unchanged
0    1    r1 remains unchanged    s1 gets px value
1    0    r1 gets px value        s1 remains unchanged
1    1    r1 gets px value        s1 gets px value

Case 3: Complementary to Uncomplemented Register Move

In this case, data moves from a complementary register pair to an uncomplemented register. The DSP executes the explicit move to the uncomplemented universal register, depending on the condition test in the PEx processing element only. The DSP does not perform an implicit move.

Example: Register to Register Move

    IF EQ PX = R1;

For this instruction, the DSP is operating in SIMD mode. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-26.

Table 3-26. Complementary to Uncomplemented Move

AZx  AZy  Explicit                                Implicit
0    0    No explicit move; r1 remains unchanged  No implicit move; s1 remains unchanged
0    1    No explicit move; r1 remains unchanged  No implicit move; s1 remains unchanged
1    0    r1 moves to px (40-bit explicit move)   No implicit move; s1 remains unchanged
1    1    r1 moves to px (40-bit explicit move)   No implicit move; s1 remains unchanged

For more details on PX register transfers, refer to “Internal Data Bus Exchange” on page 5-7.
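The first three cases reduce to a small decision rule, which the following Python sketch models. It is an illustration only, not Analog Devices code: the function name `simd_conditional_move` and its boolean parameters are hypothetical, and the model covers only whether the explicit and implicit moves occur, as stated in Cases 1 through 3. Moves that involve external or IOP memory space are outside the model.

```python
def simd_conditional_move(cond_pex, cond_pey, src_is_pair, dst_is_pair):
    """Return (explicit_move_occurs, implicit_move_occurs) in SIMD mode.

    cond_pex / cond_pey: result of the condition test in PEx / PEy.
    src_is_pair / dst_is_pair: True if the register (or internal memory
    location) has a SIMD complement. All names here are hypothetical.
    """
    if dst_is_pair:
        # Cases 1 and 2: each PE's own condition gates its own move.
        return cond_pex, cond_pey
    if src_is_pair:
        # Case 3: uncomplemented destination; PEx decides, no implicit move.
        return cond_pex, False
    raise ValueError("combination not covered by Cases 1 through 3")


# Reproduce the explicit/implicit pattern of Table 3-21 (Case 1).
for azx in (False, True):
    for azy in (False, True):
        print(int(azx), int(azy), simd_conditional_move(azx, azy, True, True))
```

Calling the function with `dst_is_pair=False` gives the Case 3 pattern, in which the second element of the result is always `False`.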
Case 4: Data Move Involves External Memory or IOP Memory Space

Conditional data moves from a complementary register pair to an uncomplemented register with an access to external memory space or IOP memory space result in unexpected behavior and should not be used.

Example: Register to Memory Move

    IF EQ DM(I0,M0) = R2;
    IF EQ DM(I0,M0) = S2;

For these instructions, the DSP is operating in SIMD mode, and the explicit register is either a PEx register or a PEy register. I0 points to either external memory space or IOP memory space. This example uses indirect addressing; the same results occur with direct addressing.

Conditional DAG Operations

Conditional post-modify DAG operations update the DAG register based on ORing the condition tests on both processing elements. Actual data movement involved in a conditional DAG operation is based on independent evaluation of the condition tests in PEx and PEy. Only the post-modify update is based on the ORing of these conditional tests.

Conditional pre-modify DAG operations behave differently. The DAGs always pre-modify an index, independent of the outcome of the condition tests on each processing element.

4 DATA ADDRESS GENERATORS

This chapter describes the Data Address Generators (DAGs).

Overview

The DSP’s Data Address Generators (DAGs) generate addresses for data moves to and from Data Memory (DM) and Program Memory (PM). By generating addresses, the DAGs let programs refer to addresses indirectly, using a DAG register instead of an absolute address. The DAG architecture, which appears in Figure 4-1, supports several functions that minimize overhead in data access routines. These functions include:

• Supply address and post-modify: provides an address during a data move and auto-increments the stored address for the next move.
• Supply pre-modified address: provides a modified address during a data move without incrementing the stored address.
• Modify address: increments the stored address without performing a data move.
• Bit-reverse address: provides a bit-reversed address during a data move without reversing the stored address.
• Broadcast data moves: performs dual data moves to complementary registers in each processing element to support SIMD mode.

[Figure 4-1. Data Address Generator (DAG) Block Diagram: I, M, L, and B register files (8 x 32 each), immediate value from the instruction, modulus logic, pre-modify and post-modify addressing paths, optional bit-reverse for I0 (DAG1) and I8 (DAG2), optional broadcast for I1 (DAG1) and I9 (DAG2), address adjustment per word size (short, normal, or long), MODE1, MODE2, and STKYx connections, and outputs to the DM address bus (DAG1: I, M, L, B 0-7) and the PM address bus (DAG2: I, M, L, B 8-15).]

As shown in Figure 4-1, each DAG has four types of registers. These registers hold the values that the DAG uses for generating addresses. The four types of registers are:

• Index registers (I0-I7 for DAG1 and I8-I15 for DAG2). An index register holds an address and acts as a pointer to memory. For example, the DAG interprets dm(I0,0) and pm(I8,0) syntax in an instruction as addresses.
• Modify registers (M0-M7 for DAG1 and M8-M15 for DAG2). A modify register provides the increment or step size by which an index register is pre- or post-modified during a register move. For example, the dm(I0,M1) instruction directs the DAG to output the address in register I0 and then modify the contents of I0 using the M1 register.
• Length and Base registers (L0-L7 and B0-B7 for DAG1 and L8-L15 and B8-B15 for DAG2). Length and base registers set up the range of addresses and the starting address for a circular buffer. For more information on circular buffers, see “Addressing Circular Buffers” on page 4-12.

Setting DAG Modes

The MODE1 register controls the operating mode of the DAGs. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control Data Address Generator modes:

• Circular buffering enable. Bit 24 (CBUFEN) enables circular buffering (if 1) or disables circular buffering (if 0).
• Broadcast register loading enable, DAG1-I1. Bit 23 (BDCST1) enables register broadcast loads to complementary registers from I1 indexed moves (if 1) or disables broadcast loads (if 0).
• Broadcast register loading enable, DAG2-I9. Bit 22 (BDCST9) enables register broadcast loads to complementary registers from I9 indexed moves (if 1) or disables broadcast loads (if 0).
• SIMD mode enable. Bit 21 (PEYEN) enables computations in PEy, which is SIMD mode (if 1), or disables PEy, which is SISD mode (if 0). For more information on SIMD mode, see “Secondary Processing Element (PEy)” on page 2-35.
• Secondary register set selects. These bits select the corresponding secondary register set (if 1) or select the corresponding primary register set, the set that is available at reset (if 0):
    Secondary registers for DAG2 lo, I,M,L,B8-11. Bit 6 (SRD2L)
    Secondary registers for DAG2 hi, I,M,L,B12-15. Bit 5 (SRD2H)
    Secondary registers for DAG1 lo, I,M,L,B0-3. Bit 4 (SRD1L)
    Secondary registers for DAG1 hi, I,M,L,B4-7. Bit 3 (SRD1H)
• Bit-reverse addressing enable, DAG1-I0. Bit 1 (BR0) enables bit-reversed addressing on I0 indexed moves (if 1) or disables bit-reversed addressing (if 0).
• Bit-reverse addressing enable, DAG2-I8. Bit 0 (BR8) enables bit-reversed addressing on I8 indexed moves (if 1) or disables bit-reversed addressing (if 0).
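As a cross-check of the bit positions listed above, the following Python sketch builds and decodes MODE1 values for the DAG-related bits. It is a minimal illustration under stated assumptions: the dictionary `MODE1_DAG_BITS` and the helpers `mode1_mask` and `decode_mode1` are hypothetical names, only the bits named in this section are modeled, and the values are plain integers, not a real register access.

```python
# Bit positions taken from the MODE1 list in this section.
MODE1_DAG_BITS = {
    "CBUFEN": 24,  # circular buffering enable
    "BDCST1": 23,  # broadcast loads from I1 indexed moves
    "BDCST9": 22,  # broadcast loads from I9 indexed moves
    "PEYEN": 21,   # SIMD mode enable
    "SRD2L": 6,    # secondary DAG2 lo registers (I,M,L,B 8-11)
    "SRD2H": 5,    # secondary DAG2 hi registers (I,M,L,B 12-15)
    "SRD1L": 4,    # secondary DAG1 lo registers (I,M,L,B 0-3)
    "SRD1H": 3,    # secondary DAG1 hi registers (I,M,L,B 4-7)
    "BR0": 1,      # bit-reverse addressing for I0
    "BR8": 0,      # bit-reverse addressing for I8
}


def mode1_mask(*names):
    """Return an integer with the named MODE1 bits set (hypothetical helper)."""
    mask = 0
    for name in names:
        mask |= 1 << MODE1_DAG_BITS[name]
    return mask


def decode_mode1(value):
    """Return a dict mapping each modeled bit name to True or False."""
    return {name: bool((value >> bit) & 1) for name, bit in MODE1_DAG_BITS.items()}


print(hex(mode1_mask("CBUFEN", "BR0")))
print(decode_mode1(mode1_mask("PEYEN", "SRD1L")))
```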
Circular Buffering Mode

The CBUFEN bit in the MODE1 register enables circular buffering, a mode in which the DAG supplies addresses ranging within a constrained buffer length (set with an L register), starting at a base address (set with a B register), and incrementing the addresses on each access by a modify value (set with an M register).

On previous SHARC DSPs (ADSP-2106x DSPs), circular buffering is always enabled. For code compatibility, programs ported to the ADSP-21160 DSP should include the instruction:

    Bit Set Mode1 CBUFEN;

This instruction enables circular buffering.

For more information on setting up and using circular buffers, see “Addressing Circular Buffers” on page 4-12. When using circular buffers, the DAGs can generate an interrupt on buffer overflow (wrap around). For more information, see “Using DAG Status” on page 4-9.

Broadcast Loading Mode

The BDCST1 and BDCST9 bits in the MODE1 register enable broadcast loading mode, in which a single load command loads multiple registers. When the BDCST1 bit is set (1), the DAG performs a dual data register load on instructions that use the I1 register for the address. The DAG loads both the named register (explicit register) in one processing element and that register’s complementary register (implicit register) in the other processing element. The BDCST9 bit in the MODE1 register enables this feature for the I9 register.

Enabling either DAG1 or DAG2 register load broadcasting has no effect on register stores or on loads to universal registers other than the register file data registers. Table 4-1 on page 4-6 demonstrates the effects of a register load operation on both processing elements with register load broadcasting enabled. In Table 4-1, note that Rx and Sx are complementary data registers.

Table 4-1. Dual Processing Element Register Load Broadcasts

Instruction syntax(1)
    Rx = DM(I1,Ma);                   {Syntax #1}
    Rx = PM(I9,Mb);                   {Syntax #2}
    Rx = DM(I1,Ma), Rx = PM(I9,Mb);   {Syntax #3}

PEx explicit operations
    Rx = DM(I1,Ma);                   {Explicit #1}
    Rx = PM(I9,Mb);                   {Explicit #2}
    Rx = DM(I1,Ma), Rx = PM(I9,Mb);   {Explicit #3}

PEy implicit operations
    Sx = DM(I1,Ma);                   {Implicit #1}
    Sx = PM(I9,Mb);                   {Implicit #2}
    Sx = DM(I1,Ma), Sx = PM(I9,Mb);   {Implicit #3}

(1) The letters “a” and “b” (as in Ma or Mb) indicate numbers for modify registers in DAG1 and DAG2. The letter “a” indicates a DAG1 register and can be replaced with 0 through 7. The letter “b” indicates a DAG2 register and can be replaced with 8 through 15.

The PEYEN bit (SISD/SIMD mode select) does not influence broadcast operations. Broadcast loading is particularly useful in SIMD applications where the algorithm needs identical data loaded into each processing element. For more information on SIMD mode (in particular, a list of complementary data registers), see “Secondary Processing Element (PEy)” on page 2-35.

Alternate (Secondary) DAG Registers

Each DAG has an alternate register set. To facilitate fast context switching, the DSP includes alternate register sets for data, results, and data address generator registers. Bits in the MODE1 register control when alternate registers become accessible. While inaccessible, the contents of alternate registers are not affected by DSP operations. Note that there is a one-cycle latency between writing to MODE1 and being able to access an alternate register set. The alternate register sets for the DAGs are described in this section. For more information on alternate data and results registers, see “Alternate (Secondary) Data Registers” on page 2-30.
Bits in the MODE1 register can activate alternate register sets within the DAGs: the lower half of DAG1 (I,M,L,B0-3), the upper half of DAG1 (I,M,L,B4-7), the lower half of DAG2 (I,M,L,B8-11), and the upper half of DAG2 (I,M,L,B12-15). Figure 4-2 shows the DAG’s primary and alternate register sets.

[Figure 4-2. DAG Primary and Alternate Registers: DAG1 registers (data memory) I0-I7, M0-M7, L0-L7, and B0-B7, selected in halves by the MODE1 select bits SRD1L and SRD1H; DAG2 registers (program memory) I8-I15, M8-M15, L8-L15, and B8-B15, selected in halves by the MODE1 select bits SRD2L and SRD2H.]

To share data between contexts, a program places the data to be shared in one half of either the current DAG’s registers or the other DAG’s registers and activates the alternate register set of the other half. The following example demonstrates how code should handle the one cycle of latency from the instruction setting the bit in MODE1 to when the alternate registers may be accessed:

    BIT SET MODE1 SRD1L;   /* activate alternate dag1 lo regs */
    NOP;                   /* wait for access to alternates */
    R0=dm(i0,m1);

Bit-Reverse Addressing Mode

The BR0 and BR8 bits in the MODE1 register enable bit-reverse addressing mode, which outputs addresses in reverse bit order. When BR0 is set (1), DAG1 bit-reverses 32-bit addresses output from I0. When BR8 is set (1), DAG2 bit-reverses 32-bit addresses output from I8. The DAGs only bit-reverse the address output from I0 or I8; the contents of these registers are not reversed. Bit-reverse addressing mode affects both pre-modify and post-modify operations. The following example demonstrates how bit-reverse mode affects address output:

    Bit Set Mode1 BR0;   /* enables bit-rev. addressing for DAG1 */
    I0=0x8a000;          /* loads I0 with the bit reverse of the */
                         /* buffer’s base address, DM(0x51000) */
    M0=0x4000000;        /* loads M0 with value for post-modify */
    R1=DM(I0,M0);        /* loads r1 with contents of DM address */
                         /* DM(0x51000), which is the bit-reverse of */
                         /* 0x8a000, then post modifies I0 for the next */
                         /* access with (0x8a000 + 0x4000000)=0x408a000, */
                         /* which is the bit-reverse of DM(0x51020) */

In addition to bit-reverse addressing mode, the DSP supports a bit-reverse instruction (Bitrev). This instruction bit-reverses the contents of the selected register. For more information on the Bitrev instruction, see “Modifying DAG Registers” on page 4-17 or ADSP-21160 SHARC DSP Instruction Set Reference.

Using DAG Status

As described in “Addressing Circular Buffers” on page 4-12, the DAGs can provide addressing for a constrained range of addresses, repeatedly cycling through this data (or buffer). A buffer overflow (or wrap around) occurs each time the DAG circles past the buffer’s base address.

The DAGs can provide buffer overflow information when executing circular buffer addressing for I7 or I15. When a buffer overflow occurs (a circular buffering operation increments the I register past the end of the buffer), the appropriate DAG updates a buffer overflow flag in a sticky status (STKYx) register. A buffer overflow can also generate a maskable interrupt. Two ways to use buffer overflows from circular buffering are:

• Interrupts. Enable interrupts and use an interrupt service routine to handle the overflow condition immediately. This method is appropriate if it is important to handle all overflows as they occur; for example, in a “ping-pong” or swap I/O buffer pointers routine.
• STKYx registers. Use the Bit Tst instruction to examine overflow flags in the STKY register after a series of operations.
If an overflow flag is set, the buffer has overflowed (wrapped around) at least once. This method is useful when overflow handling is not critical.

DAG Operations

The DSP’s DAGs perform several types of operations to generate data addresses. As shown in Figure 4-1 on page 4-2, the DAG registers and the MODE1, MODE2, and STKYx registers all contribute to DAG operations. The following sections provide details on DAG operations:

• “Addressing With DAGs” on page 4-10
• “Addressing Circular Buffers” on page 4-12
• “Modifying DAG Registers” on page 4-17

An important item to note from Figure 4-1 on page 4-2 is that the DAG automatically adjusts the output address per the word size of the address location (short word, normal word, or long word). This address adjustment lets internal memory use the address directly. For details on these address adjustments, see “Access Word Size” on page 5-40.

SISD/SIMD mode, access word size, and data location (internal/external) all influence data access operations. For more information, see “Data Access Options” on page 5-45.

Addressing With DAGs

The DAGs support two types of modified addressing, that is, generating an address that is incremented by a value or a register. In pre-modify addressing, the DAG adds an offset (modifier), either an M register or an immediate value, to an I register and outputs the resulting address. Pre-modify addressing does not change (or update) the I register. The other type of modified addressing is post-modify addressing. In post-modify addressing, the DAG outputs the I register value unchanged and then adds an M register or immediate value, updating the I register value. Figure 4-3 compares pre- and post-modify addressing.

[Figure 4-3. Pre-Modify and Post-Modify Addressing. Pre-modify, no I register update: syntax PM(Mx,Ix) and DM(Mx,Ix); the DAG outputs I+M. Post-modify, I register update: syntax PM(Ix,Mx) and DM(Ix,Mx); the DAG (1) outputs I, then (2) updates I with I+M.]

The difference between pre-modify and post-modify instructions in the DSP’s assembly syntax is the position of the index and modifier in the instruction. If the I register comes before the modifier, the instruction is a post-modify operation. If the modifier comes before the I register, the instruction is a pre-modify without update operation. The following instruction accesses the program memory location indicated by the value in I15 and writes the value I15 + M12 to the I15 register:

    R6 = PM(I15,M12);   /* Post-modify addressing with update */

By comparison, the following instruction accesses the program memory location indicated by the value I15 + M12 and does not change the value in I15:

    R6 = PM(M12,I15);   /* Pre-modify addressing without update */

Modify (M) registers can work with any index (I) register in the same DAG (DAG1 or DAG2). For a list of I and M registers and their DAGs, see Figure 4-2 on page 4-7.

Instructions can use a number (immediate value), instead of an M register, as the modifier. The size of an immediate value that can modify an I register depends on the instruction type. For all single data access operations, modify immediate values can be up to 32 bits wide. Instructions that combine DAG addressing with computations limit the size of the modify immediate value. In these instructions (multifunction computations), the modify immediate values can be up to 6 bits wide. The following example instruction accepts up to 32-bit modifiers:

    R1=DM(0x40000000,I1);   /* DM address = I1+0x4000 0000 */

The following example instruction accepts up to 6-bit modifiers:

    F6=F1+F2,PM(I8,0x0B)=ASTAT;   /* PM address = I8, I8=I8+0x0B */

Note that pre-modify addressing operations must not change the memory space of the address. For example, pre-modifying an address in the DSP’s internal memory space should not generate an address in external memory space.
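The two addressing forms described in this section differ only in which value goes on the address bus and whether the index register is updated. The following Python sketch models that difference. It is an illustration, not a description of the hardware implementation: the function names are hypothetical, and the 32-bit masking of the sum is an assumption made here because the I registers are 32 bits wide.

```python
MASK32 = 0xFFFFFFFF  # assumption: results are kept to the 32-bit register width


def post_modify(index, modifier):
    """Model of DM(Ix,Mx) / PM(Ix,Mx): output I, then update I with I+M.

    Returns (address_output, new_index).
    """
    return index, (index + modifier) & MASK32


def pre_modify(index, modifier):
    """Model of DM(Mx,Ix) / PM(Mx,Ix): output I+M, leave I unchanged.

    Returns (address_output, new_index).
    """
    return (index + modifier) & MASK32, index


# Hypothetical register contents, chosen only for illustration.
i15, m12 = 0x50000, 4
print([hex(v) for v in post_modify(i15, m12)])
print([hex(v) for v in pre_modify(i15, m12)])
```

In this model, only `post_modify` returns a changed index, which mirrors the "with update" and "without update" comments in the instruction examples above.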
For more information, see “Access Word Size” on page 5-40.

Addressing Circular Buffers

The DAGs support addressing circular buffers. A circular buffer is a range of addresses containing data that the DAG steps through repeatedly, “wrapping around” to repeat stepping through the range of addresses in a circular pattern. To address a circular buffer, the DAG steps the index pointer (I register) through the buffer, post-modifying and updating the index on each access with a positive or negative modify value (M register or immediate value). If the index pointer falls outside the buffer, the DAG subtracts the length of the buffer from the value or adds it to the value, wrapping the index pointer back into the buffer. The DAG’s support for circular buffer addressing appears in Figure 4-1 on page 4-2, and an example of circular buffer addressing appears in Figure 4-4.

The starting address that the DAG wraps around is called the buffer’s base address (B register). There are no restrictions on the value of the base address for a circular buffer.

Circular buffering may only use post-modify addressing. The DAG’s architecture, as shown in Figure 4-1 on page 4-2, cannot support pre-modify addressing for circular buffering, because circular buffering requires that the index be updated on each access.

It is important to note that the DAGs do not detect memory map overflow or underflow. If the address post-modify produces I+M > 0xffffffff or I–M < 0, circular buffering may not function correctly. Also, the length of a circular buffer should not let the buffer straddle the top of the memory map. For more information on the DSP’s memory map, see “ADSP-21160 DSP Memory Map” on page 5-12.
As shown in Figure 4-4 on page 4-13, programs use the following steps to set up a circular buffer:

1. Enable circular buffering (Bit Set Mode1 CBUFEN;). This operation is only needed once in a program.
2. Load the buffer’s base address into the B register. This operation automatically loads the corresponding I register.
3. Load the buffer’s length into the corresponding L register. For example, L0 corresponds to B0.
4. Load the modify value (step size) into an M register in the corresponding DAG. For example, M0 through M7 correspond to B0. Alternatively, the program can use an immediate value for the modifier.

The following syntax sets up and accesses a circular buffer with length = 11, base address = 0x55000, and modifier = 4:

    BIT SET MODE1 CBUFEN;   /* enables circular buffer addressing; just once in program */
    B0 = 0x55000;           /* loads B0 and I0 registers with base address */
    L0 = 0xB;               /* loads L0 register with length of buffer */
    M1 = 0x4;               /* loads M1 with modifier or step size */
    LCNTR = 11, DO MY_CIR_BUFFER UNTIL LCE;   /* sets up a loop containing buffer accesses */
    R0 = DM(I0,M1);         /* an access within the buffer uses post-modify addressing */
    ...                     /* other instructions in the MY_CIR_BUFFER loop */
    MY_CIR_BUFFER: NOP;     /* end of MY_CIR_BUFFER loop */

[Figure 4-4. Circular Buffer Setup and Access Example: the figure shows buffer locations 0 through 10 in columns, marking the order in which the locations are accessed in one pass. Location “0” is address DM(0x55000). The sequence repeats on subsequent passes.]

After this setup, the DAGs use the modulus logic in Figure 4-1 on page 4-2 to process circular buffer addressing.
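The access order in the Figure 4-4 example can be traced with a few lines of arithmetic. The following Python sketch steps an index through a buffer with the figure's values (base 0x55000, length 11, modifier 4) and records each accessed location as an offset from the base. It is a simplified model for a positive modifier only; the function name `circular_offsets` is hypothetical.

```python
def circular_offsets(base, length, modifier, accesses):
    """Trace post-modify accesses through a circular buffer.

    Returns the accessed locations as offsets from the base address.
    Handles a positive modifier smaller than the buffer length only.
    """
    index = base
    offsets = []
    for _ in range(accesses):
        offsets.append(index - base)   # address output before the update
        index += modifier              # post-modify
        if index >= base + length:     # past the end of the buffer: wrap
            index -= length
    return offsets


# Values from the Figure 4-4 example; one pass is 11 accesses.
print(circular_offsets(0x55000, 11, 4, 11))
# prints [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7]
```

After the eleventh access the index returns to the base address, so a twelfth access starts the same sequence again.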
On the ADSP-21160 DSP, programs enable circular buffering by setting the CBUFEN bit in the MODE1 register. This bit has a corresponding mask bit in the MMASK register. Setting the corresponding MMASK bit causes the CBUFEN bit to be cleared following a push status instruction (Push Sts) or the execution of an external interrupt, timer interrupt, or vectored interrupt. This feature lets programs disable circular buffering while in an interrupt service routine that does not use circular buffering. By disabling circular buffering, the routine does not need to save and restore the DAG’s B and L registers.

Clearing the CBUFEN bit disables circular buffering for all data load and store operations. The DAGs perform normal post-modify load and store accesses instead, ignoring the B and L register values. Note that a write to a B register modifies the corresponding I register, independent of the state of the CBUFEN bit. The Modify instruction also executes independent of the state of the CBUFEN bit: it always performs a circular buffer modify of the index registers if the corresponding B and L registers are set up.

On previous SHARC DSPs (ADSP-2106x family), circular buffering is always enabled. For code compatibility, programs ported to the ADSP-21160 DSP should enable circular buffering (CBUFEN=1).

On the first post-modify access to the buffer, the DAG outputs the I register value on the address bus and then modifies the address by adding the modify value. If the updated index value is within the buffer length, the DAG writes the value to the I register. If the updated value is outside the buffer length, the DAG subtracts the L register value (for a positive modifier) or adds it (for a negative modifier) before writing the updated index value to the I register.
In equation form, these post-modify and wrap around operations work as follows:

• If M is positive:
    Inew = Iold + M        if Iold + M < Buffer base + length (end of buffer)
    Inew = Iold + M – L    if Iold + M ≥ Buffer base + length (end of buffer)
• If M is negative:
    Inew = Iold + M        if Iold + M ≥ Buffer base (start of buffer)
    Inew = Iold + M + L    if Iold + M < Buffer base (start of buffer)

The DAGs use all four types of DAG registers for addressing circular buffers. These registers operate as follows for circular buffering:

• The index (I) register contains the value that the DAG outputs on the address bus.
• The modify (M) register contains the post-modify amount (positive or negative) that the DAG adds to the I register at the end of each memory access. The M register can be any M register in the same DAG as the I register and does not have to have the same number. The modify value can also be an immediate value instead of an M register. The magnitude of the modify value, whether from an M register or an immediate, must be less than the length (L register) of the circular buffer.
• The length (L) register sets the size of the circular buffer and the address range that the DAG circulates the I register through. L must be positive and cannot have a value greater than 2^31 – 1. If an L register’s value is zero, its circular buffer operation is disabled.
• The base (B) register, or the B register plus the L register, is the value that the DAG compares the modified I value with after each access. When the B register is loaded, the corresponding I register is simultaneously loaded with the same value. When I is loaded, B is not changed. Programs can read the B and I registers independently.

There is one set of registers (I7 and I15) in each DAG that can generate an interrupt on circular buffer overflow (address wraparound). For more information, see “Using DAG Status” on page 4-9.
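The wrap-around equations and the register constraints above can be combined into a single update step. The following Python sketch is one way to express them. It is a model under stated assumptions, not the DAG's actual modulus logic: the function name and the choice to raise `ValueError` for out-of-range inputs are hypothetical, and a length of zero is treated as a plain post-modify because the text states that a zero L value disables circular buffer operation.

```python
MAX_LENGTH = 2**31 - 1  # upper limit on L stated in the register description


def circular_post_modify(i_old, m, base, length):
    """One post-modify access with wrap around.

    Returns (address_output, i_new), following the equations for
    positive and negative M. Raises ValueError for inputs outside the
    constraints described in the text (an assumption of this model).
    """
    if length < 0 or length > MAX_LENGTH:
        raise ValueError("L must be between 0 and 2**31 - 1")
    if length == 0:
        return i_old, i_old + m        # circular operation disabled
    if abs(m) >= length:
        raise ValueError("magnitude of M must be less than L")
    i_new = i_old + m
    if m >= 0 and i_new >= base + length:
        i_new -= length                # wrapped past the end of the buffer
    elif m < 0 and i_new < base:
        i_new += length                # wrapped past the start of the buffer
    return i_old, i_new


# Positive and negative steps in an 11-word buffer at 0x55000.
print([hex(v) for v in circular_post_modify(0x55008, 4, 0x55000, 11)])
print([hex(v) for v in circular_post_modify(0x55001, -4, 0x55000, 11)])
```

The first call wraps forward from offset 8 to offset 1, and the second wraps backward from offset 1 to offset 8.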
When a program needs to use I7 or I15 without circular buffering and the DSP has the circular buffer overflow interrupts unmasked, the program should disable the generation of these interrupts by setting the B7/B15 and L7/L15 registers to values that prevent the interrupts from occurring. For example, if I7 were accessing the address range 0x1000–0x2000, the program could set B7=0x0000 and L7=0xFFFF. Because the DSP generates the circular buffer interrupt based on the wrap around equations on page 4-15, setting the L register to zero does not necessarily achieve the desired results. If the program is using either of the circular buffer overflow interrupts, it should avoid using the corresponding I register (I7 or I15) where interrupt branching is not needed.

In the case of circular buffer overflow interrupts, if CBUFEN = 1 and register L7 = 0 (or L15 = 0), the CB7I (or CB15I) interrupt occurs at every change of I7 (or I15) after the index register (I7 or I15) crosses the base register (B7 or B15) value. This wrap-around behavior is independent of the context of the DAG registers, both primary and alternate.

When a Long word access, SIMD access, or Normal word access (with LW option) crosses the end of the circular buffer, the DSP completes the access before responding to the end of buffer condition.

Modifying DAG Registers

The DAGs support two operations that modify an address value in an index register without outputting an address. These two operations, address bit-reversal and address modify, are useful for bit-reverse addressing and maintaining pointers.

The Modify instruction modifies addresses in any DAG index register (I0-I15) without accessing memory. If the I register’s corresponding B and L registers are set up for circular buffering, a Modify instruction performs the specified buffer wrap around (if needed). The syntax for Modify is similar to post-modify addressing (index, then modifier).
Modify accepts either a 32-bit immediate value or an M register as the modifier. The following example adds 4 to I1 and updates I1 with the new value:

    Modify(I1,4);

The Bitrev instruction modifies and bit-reverses addresses in any DAG index register (I0-I15) without accessing memory. This instruction is independent of the bit-reverse mode. The Bitrev instruction adds a 32-bit immediate value to a DAG index register, bit-reverses the result, and writes the result back to the same index register. The following example adds 4 to I1, bit-reverses the result, and updates I1 with the new value:

    Bitrev(I1,4);

Addressing in SISD and SIMD Modes

Single-Instruction, Multiple-Data (SIMD) mode (PEYEN bit=1) does not change the addressing operations in the DAGs, but it does change the amount of data that moves during each access. The DAGs put the same addresses on the address buses in SIMD and SISD modes. In SIMD mode, the DSP’s memory and processing elements get data from the locations named (explicit) in the instruction syntax and complementary (implicit) locations. For more information on data moves between registers, see “Secondary Processing Element (PEy)” on page 2-35. For more information on data accesses and memory, see “Data Access Options” on page 5-45.

DAGs, Registers, and Memory

DAG registers are part of the DSP’s universal register set. Programs may load the DAG registers from memory, from another universal register, or with an immediate value. Programs may store DAG registers’ contents to memory or to another universal register.

The DAG’s registers support the bidirectional register-to-register transfers that are described in “SIMD (Computational) Operations” on page 2-39. When the DAG register is a source of the transfer, the destination can be a register file data register.
This transfer results in the contents of the single source register being duplicated in complementary data registers in each processing element.

Programs should use care in the case where the DAG register is a destination of a transfer from a register file data register source. Programs should use a conditional operation to select either one processing element or neither as the source. Having both processing elements contribute a source value results in the PEx element’s write having precedence over the PEy element’s write.

In the case where a DAG register is both source and destination, the data move operation executes the same as it would if SIMD mode were disabled (PEYEN cleared).

DAG Register-to-Bus Alignment

There are three word alignment cases for DAG registers and PM or DM data buses: Normal word, Extended-precision Normal word, and Long word.

The DAGs align normal word (32-bit) addressed transfers to the low order bits of the buses. These transfers between memory and 32-bit DAG1 or DAG2 registers use the 64-bit DM and PM data buses. Figure 4-5 illustrates these transfers.

(Diagram: DM or PM data bus, bits 63-0, shown with an explicit (named) DAG1 or DAG2 register and an implicit (named + or - 1) DAG1 or DAG2 register, each bits 31-0.)
Figure 4-5. Normal Word (32-bit) DAG Register Memory Transfers

The DAGs align extended-precision normal word (40-bit) addressed transfers or register-to-register transfers to bits 39-8 of the buses. These transfers between a 40-bit data register and 32-bit DAG1 or DAG2 registers use the 64-bit DM and PM data buses. Figure 4-6 illustrates these transfers.

(Diagram: the 32-bit DAG1 or DAG2 register occupies bits 39-8 of the 64-bit DM or PM data bus; the figure labels the remaining bus fields 0x0000 00 and 0x00.)
Figure 4-6.
DAG Register to Data Register Transfers

Long word (64-bit) addressed transfers between memory and 32-bit DAG1 or DAG2 registers target double DAG registers and use the 64-bit DM and PM data buses. Figure 4-7 illustrates how the bus works in these transfers.

(Diagram: DM or PM data bus, bits 63-0, shown with an explicit (named) DAG1 or DAG2 register and an implicit (named + or - 1) DAG1 or DAG2 register, each bits 31-0.)
Figure 4-7. Long Word DAG Register to Data Register Transfers

If the Long word transfer specifies an even-numbered DAG register (e.g., I0 or I2), then the even-numbered register value transfers on the lower half of the 64-bit bus, and the even-numbered register + 1 value transfers on the upper half (bits 63-32) of the bus.

If the Long word transfer specifies an odd-numbered DAG register (e.g., I1 or B3), the odd-numbered register value transfers on the lower half of the 64-bit bus, and the odd-numbered register - 1 value (I0 or B2 in this example) transfers on the upper half (bits 63-32) of the bus.

In both the even- and odd-numbered cases, the explicitly specified DAG register sources or sinks bits 31-0 of the Long word addressed memory.

DAG Register Transfer Restrictions

The two types of transfer restrictions are hold-off conditions and illegal conditions that the DSP does not detect.

For certain instruction sequences involving transfers to and from DAG registers, an extra (Nop) cycle is automatically inserted by the processor. When an instruction that loads a DAG register is followed by an instruction that uses the same DAG register for data addressing, modify instructions, or indirect jumps, the DSP inserts an extra (Nop) cycle between the two instructions. This hold-off happens because the same bus is needed by both operations in the same cycle. So, the second operation must be delayed.
The following case causes a delay because it exhibits a write/read dependency in which I0 is written in one cycle. The results of that register write are not available to a register read for one cycle. Note that if either instruction had specified I1, the stall would still occur, because the DSP’s DAG register transfers can occur in pairs. The DAG detects write/read dependencies with a register pair granularity:

    I0=8;
    DM(I0,M1)=R1;

Certain other sequences of instructions cause incorrect results on the DSP and are flagged as errors by DSP assembler software. These types of instructions can execute on the processor, but cause incorrect results:

• An instruction that stores a DAG register in memory using indirect addressing from the same DAG, with or without update of the index register. The instruction writes the wrong data to memory or updates the wrong index register.
• Do not try these: DM(M2,I1)=I0; or DM(I1,M2)=I0; These example instructions do not work because I0 and I1 are both DAG1 registers.
• An instruction that loads a DAG register from memory using indirect addressing from the same DAG, with update of the index register. The instruction will either load the DAG register or update the index register, but not both.
• Do not try this: L2=DM(I1,M0); This example instruction does not work because both L2 and I1 are DAG1 registers.

DAG Instruction Summary

Table 4-2, Table 4-3, Table 4-4, Table 4-5, Table 4-6, Table 4-7, Table 4-8, and Table 4-9 list the DAG instructions. For more information on assembly language syntax, see ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols:

• I15-8 indicates a DAG2 index register: I15, I14, I13, I12, I11, I10, I9, or I8, and I7-0 indicates a DAG1 index register: I7, I6, I5, I4, I3, I2, I1, or I0.
• M15-8 indicates a DAG2 modify register: M15, M14, M13, M12, M11, M10, M9, or M8, and M7-0 indicates a DAG1 modify register: M7, M6, M5, M4, M3, M2, M1, or M0.
• Ureg indicates any universal register. For a list of the DSP’s universal registers, see Table A-1 on page A-2.
• Dreg indicates any data register. For a list of the DSP’s data registers, see the Data Register File registers that are listed in Table A-1 on page A-2.
• Data32 indicates any 32-bit value, and Data6 indicates any 6-bit value.

Table 4-2. Post-Modify Addressing, Modified by M Register and Updating I Register
    DM(I7-0,M7-0)=Ureg (LW);      {DAG1}
    PM(I15-8,M15-8)=Ureg (LW);    {DAG2}
    Ureg=DM(I7-0,M7-0) (LW);      {DAG1}
    Ureg=PM(I15-8,M15-8) (LW);    {DAG2}
    DM(I7-0,M7-0)=Data32;         {DAG1}
    PM(I15-8,M15-8)=Data32;       {DAG2}

Table 4-3. Post-Modify Addressing, Modified by 6-Bit Data and Updating I Register
    DM(I7-0,Data6)=Dreg;          {DAG1}
    PM(I15-8,Data6)=Dreg;         {DAG2}
    Dreg=DM(I7-0,Data6);          {DAG1}
    Dreg=PM(I15-8,Data6);         {DAG2}

Table 4-4. Pre-Modify Addressing, Modified by M Register (No I Register Update)
    DM(M7-0,I7-0)=Ureg (LW);      {DAG1}
    PM(M15-8,I15-8)=Ureg (LW);    {DAG2}
    Ureg=DM(M7-0,I7-0) (LW);      {DAG1}
    Ureg=PM(M15-8,I15-8) (LW);    {DAG2}

Table 4-5. Pre-Modify Addressing, Modified by 6-Bit Data (No I Register Update)
    DM(Data6,I7-0)=Dreg;          {DAG1}
    PM(Data6,I15-8)=Dreg;         {DAG2}
    Dreg=DM(Data6,I7-0);          {DAG1}
    Dreg=PM(Data6,I15-8);         {DAG2}

Table 4-6. Pre-Modify Addressing, Modified by 32-Bit Data (No I Register Update)
    Ureg=DM(Data32,I7-0) (LW);    {DAG1}
    Ureg=PM(Data32,I15-8) (LW);   {DAG2}
    DM(Data32,I7-0)=Ureg (LW);    {DAG1}
    PM(Data32,I15-8)=Ureg (LW);   {DAG2}

Table 4-7. Update (Modify) I Register, Modified by M Register
    Modify(I7-0,M7-0);            {DAG1}
    Modify(I15-8,M15-8);          {DAG2}

Table 4-8. Update (Modify) I Register, Modified by 32-Bit Data
    Modify(I7-0,Data32);          {DAG1}
    Modify(I15-8,Data32);         {DAG2}

Table 4-9.
Bit-Reverse and Update I Register, Modified by 32-Bit Data
    Bitrev(I7-0,Data32);          {DAG1}
    Bitrev(I15-8,Data32);         {DAG2}

5 MEMORY

The DSP contains a large, dual-ported internal memory and provides access to external memory through the DSP’s external port. This chapter describes the DSP’s memory and how to use it. For information on connecting and timing accesses to external memory, see “External Memory Interface” on page 7-3.

Overview

There are 8 Mbits of internal memory space on the DSP. Within this space, the ADSP-21160 DSP has 4 Mbits of memory, which is divided into two 2 Mbit blocks: Block 0 and Block 1. The remaining, unpopulated 4 Mbits of the memory space are reserved on the ADSP-21160 DSP. Table 5-1 shows the maximum number of data or instruction words that can fit in a 2 Mbit internal memory block.

Table 5-1. Words Per 2 Mbit Internal Memory Block

    Word Type                              Bits Per Word   Maximum Number of Words Per 2 Mbit Block
    Instruction                            48 bits         42.67K words
    Long Word Data                         64 bits         32K words
    Extended Precision Normal Word Data    40 bits         42.67K words
    Normal Word Data                       32 bits         64K words
    Short Word Data                        16 bits         128K words

There are 4 Gwords of external memory space that the DSP can address. External memory connects to the DSP’s external port, which extends the DSP’s 32-bit address and 64-bit data buses off the DSP. The DSP can make 64-bit or 32-bit accesses to external memory for instructions or data. Table 5-2 shows the access types and words for DSP external memory accesses. The DSP’s DMA controller automatically packs external data into the appropriate word width during data transfer.

Table 5-2.
Internal-to-External Memory Word Transfers (see note 1)

    Word Type                              Transfer Type
    Instruction                            48-bit word MSB justified within 64-bit transfer
    Long Word Data                         64-bit word in 64-bit transfer
    Extended Precision Normal Word Data    40-bit word MSB justified within 64-bit transfer
    Normal Word Data                       32-bit word in 32-bit transfer
    Short Word Data                        Not supported

    Note 1: For external port word alignment, see Figure 7-1 on page 7-2.

Most microprocessors use a single address and data bus for memory access. This type of memory architecture is called Von Neumann architecture. But DSPs require greater data throughput than Von Neumann architecture provides, so many DSPs use memory architectures that have separate buses for program and data storage. The two buses let the DSP get a data word and an instruction simultaneously. This type of memory architecture is called Harvard architecture.

SHARC DSPs go a step further by using a Super Harvard architecture. This architecture has program and data buses, but provides a single, unified address space for program and data storage. While the Data Memory (DM) bus only carries data, the Program Memory (PM) bus handles instructions or data, allowing dual-data accesses.

DSP core and I/O processor accesses to internal memory are completely independent and transparent to one another. Each block of memory can be accessed by the DSP core and I/O processor in every cycle; no extra cycles are incurred if the DSP core and the I/O processor access the same block.

A memory access conflict can occur when the processor core attempts two accesses to the same internal memory block in the same cycle. When this conflict happens, an extra cycle is incurred. The DM bus access completes first and the PM bus access completes in the following (extra) cycle.

During a single-cycle, dual-data access, the processor core uses the independent PM and DM buses to simultaneously access data from both memory blocks.
Though dual-data accesses provide greater data throughput, it is important to note some limitations on how programs may use them. The limitations on single-cycle, dual-data accesses are:

• The two pieces of data must come from different memory blocks. If the core tries to access two words from the same memory block (over the same bus) for a single instruction, an extra cycle is needed. For more information on how the buses access these blocks, see “Internal Memory” on page 5-14.
• The data access execution may not conflict with an instruction fetch operation. If the cache contains the conflicting instruction, the data access completes in a single cycle and the sequencer uses the cached instruction. If the conflicting instruction is not in the cache, an extra cycle is needed to complete the data access and cache the conflicting instruction. For more information, see “Instruction Cache” on page 3-9.

Efficient memory usage relies on how the program and data are arranged in memory and on how the program accesses the data. For more information, see “Arranging Data in Memory” on page 5-84.

As shown in Figure 5-1, the DSP has three internal buses connected to its dual-ported memory: the Program Memory (PM) bus, Data Memory (DM) bus, and I/O Processor (IO) bus. The PM bus and DM bus share one memory port and the IO bus connects to the other port. Memory accesses from the DSP’s core (computational units, data address generators, or program sequencer) use the PM or DM buses, while the I/O processor uses the IO bus for memory accesses. Using the IO bus, the I/O processor can provide data transfers between internal memory and the DSP’s communication ports (link ports, serial ports, and external port) without hindering the DSP core’s access to memory.
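The two limitations on single-cycle, dual-data accesses listed in this overview can be summarized as a small cycle-count model. The Python sketch below is an illustration only. The names `block_of` and `core_access_cycles` are hypothetical, the block ranges are the Normal word ranges labeled in Figure 5-1, and treating the two penalties as simply additive is an assumption that the text does not confirm.

```python
# Normal word address ranges of the two internal memory blocks (Figure 5-1).
BLOCKS = {
    0: range(0x40000, 0x50000),
    1: range(0x50000, 0x60000),
}


def block_of(address):
    """Return the internal memory block (0 or 1) holding a Normal word address."""
    for number, addresses in BLOCKS.items():
        if address in addresses:
            return number
    raise ValueError(f"{address:#x} is not in Block 0 or Block 1")


def core_access_cycles(dm_address, pm_address=None, instruction_cached=True):
    """Estimate cycles for one instruction's data accesses (hypothetical model).

    dm_address: address used on the DM bus.
    pm_address: address used on the PM bus for a dual-data access, or None.
    instruction_cached: whether the conflicting instruction is in the cache.
    """
    cycles = 1
    if pm_address is not None:
        if block_of(dm_address) == block_of(pm_address):
            cycles += 1  # both accesses hit the same block
        if not instruction_cached:
            cycles += 1  # PM data access conflicts with an uncached fetch
        # Assumption: the two penalties add; the manual does not say so.
    return cycles


print(core_access_cycles(0x40000, 0x50000))  # data in different blocks
print(core_access_cycles(0x40000, 0x40010))  # data in the same block
```

Placing the two operands of a dual-data access in different blocks is what keeps the first call at a single cycle in this model.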
While the DSP’s internal memory is divided into blocks that can be accessed by DAG1 and DAG2, the DSP’s external memory space is divided into banks, which may be addressed by either data address generator. External memory banks also may be configured for size and access waitstates. For more information, see “External Memory” on page 5-20.

The DSP core’s PM bus and DM bus and I/O processor’s External Port (EP) bus can try to access multiprocessor memory space or external memory space in the same cycle. The DSP has a two-level arbitration system to handle this conflicting access. Arbitration stems from a priority convention and the state of the SYSCON register’s EBPRx bits. When arbitrating between the processor core buses, the DM bus always has priority over the PM bus. Arbitration between the winning core bus and I/O processor EP bus depends on the priority set with the EBPRx bits. For more information on setting this priority, see “External Bus Priority” on page 5-33.

(Diagram: internal (DSP) memory Block 0, Normal word 0x40000 - 0x4FFFF, and Block 1, Normal word 0x50000 - 0x5FFFF; the PM, DM, and IO address and data buses; the PX bus exchange register; the I/O processor; and the external port, whose addresses and data follow parallel paths to external (system) memory Bank 0, starting at Normal word 0x800000.)
Figure 5-1. ADSP-21160 Memory and Internal Buses Block Diagram

Internal Address and Data Buses

Figure 5-1 on page 5-5 also shows that the PM buses, DM buses, and I/O processor have access to the external bus (pins DATA63-0, ADDR31-0) through the DSP’s external port. The external port provides access to system (off-DSP) memory and peripherals.
This port also lets the DSP access the internal memory of other DSPs if connected in a multiprocessing system.

Almost without exception, the DSP’s three buses can access all memory spaces, supporting all data sizes. There are three restrictions on the access of buses to memory. The limitations on the PM, DM, and IO buses are as follows:

• The PM, DM, and IO buses may only make Normal word addressing accesses to multiprocessor or external memory. For more information, see “Multiprocessor Memory” on page 5-17.
• The IO bus may not access the I/O processor’s memory mapped registers. For more information, see “I/O Processor”.
• The IO bus may not use Short word addressing for DMA operation.

Addresses for the PM and DM buses come from the DSP’s program sequencer and Data Address Generators (DAGs). The program sequencer and DAGs supply 32-bit addresses for locations throughout the DSP’s memory spaces. The DAGs supply addresses for data reads and writes on both the PM and DM address buses, while the program sequencer uses only the PM address bus for sequencing execution.

Each DAG is associated with a particular data bus. DAG1 supplies addresses over the DM bus and DAG2 supplies addresses over the PM bus. For more information on address generation, see “Program Sequencer” or “Data Address Generators”.

Because the DSP’s internal memory is arranged in four 16-bit wide by 32K high columns, memory is addressable in widths that are multiples of columns up to 64 bits: 1 column = 16-bit words, 2 columns = 32-bit words, 3 columns = 48- or 40-bit words, and 4 columns = 64-bit words. For more information on how the DSP works with memory words, see “Memory Organization and Word Size” on page 5-22.

The PM and DM data buses are 64 bits wide.
Both data buses can handle Long word (64-bit), Normal word (32-bit), Extended-precision Normal word (40-bit), and Short word (16-bit) data, but only the PM data bus carries Instruction words (48-bit).

At the processor’s external port, the DSP multiplexes the three memory buses (PM, DM, and I/O) to create a single off-chip data bus (DATA 63-0) and address bus (ADDR 31-0).

Internal Data Bus Exchange

The data buses let programs transfer the contents of any register in the DSP to any other register or to any internal memory location in a single cycle. As shown in Figure 5-1, the PM Bus Exchange (PX) register permits data to flow between the PM and DM data buses. The PX register can work as one 64-bit register or as two 32-bit registers (PX1 and PX2). The alignment of PX1 and PX2 within PX appears in Figure 5-2.

(Diagram: the combined 64-bit PX register, with PX2 in bits 63-32 and PX1 in bits 31-0.)
Figure 5-2. PM Bus Exchange (PX, PX1, and PX2) Registers

PX1, PX2, and the combined PX register are Universal registers (UREG) and are accessible for register-to-register or memory-to-register transfers.

PX register-to-register transfers with data registers are either 40-bit transfers for the combined PX or 32-bit transfers for PX1 or PX2. Figure 5-3 shows the bit alignment for these types of transfers.

(Diagram: a 40-bit register file transfer uses the upper 40 bits of the combined PX, with 0x0 in the lower 24 bits; a 32-bit PX1 or PX2 transfer uses bits 39-8 of the register file register, with 0x0 in bits 7-0.)
Figure 5-3. PX, PX1, and PX2 Register-to-Register Transfers

Figure 5-3 shows that:

• During a transfer between PX1 or PX2 and a data register file register (DREG), the bus transfers the upper 32 bits of the register file and zero fills the eight LSBs.
• During a transfer between the combined PX register and a register file register, the bus transfers the upper 40 bits of PX and zero fills the lower 24 bits.
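The two alignment rules above amount to shifts and masks. The following Python sketch models them on plain integers as an aid to reading the rules; it is not ADSP-21160 code. All four function names are hypothetical, and a data register is modeled as a 40-bit unsigned value.

```python
MASK32 = (1 << 32) - 1
MASK40 = (1 << 40) - 1


def dreg_to_px1(dreg):
    """Data register to PX1 or PX2: keep the upper 32 of the 40 bits."""
    return (dreg >> 8) & MASK32


def px1_to_dreg(px1):
    """PX1 or PX2 to data register: fill bits 39-8, zero the eight LSBs."""
    return (px1 & MASK32) << 8


def dreg_to_px(dreg):
    """Data register to combined PX: fill the upper 40 bits, zero the lower 24."""
    return (dreg & MASK40) << 24


def px_to_dreg(px):
    """Combined PX to data register: take the upper 40 of the 64 bits."""
    return (px >> 24) & MASK40


value = 0x123456789A  # an arbitrary 40-bit data register value
print(hex(dreg_to_px1(value)))
print(hex(dreg_to_px(value)))
```

In this model, a round trip through the combined PX preserves all 40 bits of a data register, while a round trip through PX1 or PX2 loses the eight LSBs.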
PX register-to-memory transfers over the DM data bus are either 64-bit transfers for the combined PX or 32-bit transfers (on bits 31-0 of the bus) for PX1 or PX2. Figure 5-4 shows these transfers.

(Diagram: the combined PX register transfers as 64 bits on bits 63-0 of the DM (LW) or PM (LW) data bus; PX1 or PX2 transfers as 32 bits on bits 31-0 of the bus, with 0x0 on the remaining bits.)
Figure 5-4. PX, PX1, PX2 Register-to-Memory Transfers on DM (LW) or PM (LW) Data Bus

The LW notation in Figure 5-4 draws attention to an important feature of PX register-to-memory transfers over the PM or DM data bus for the combined PX register. PX transfers to memory are 48-bit (3-column) transfers on bits 63-16 of the PM or DM data bus, unless forced to be 64-bit (4-column) transfers with the LW (Long Word) mnemonic.

The status of the memory block’s Internal Memory Data Width (IMDWx) setting does not affect this default transfer size for PX to internal memory.

Figure 5-5 shows the default transfer size between PX and internal memory over the PM or DM data bus.

(Diagram: 48-bit transfers between the combined PX register and the DM or PM data bus when LW is not specified, with 0x0 fill on the unused bits.)
Figure 5-5. PX Register-to-Memory Transfers on PM Data Bus (Without LW)

This default 3-column memory access for the PX register over the PM or DM data bus has a particularly useful application in boot loading. If a program was loading a series of instructions from external memory locations into internal memory, the program could use the following code:

    i8 = start;          {sets up dag2 to auto-increment on PM access}
    m8 = 1;
    i0 = source;         {sets up dag1 to auto-increment on DM access}
    m0 = 1;
    lcntr = 1000, do external_load until lce;    {sets up load loop}
    px = dm(i0,m0);
    external_load: pm(i8,m8) = px;
    /* the loop moves from a range of external addresses */
    /* (starting at source) to a range of internal addresses */
    /* (starting at start) using the PX register. The external */
    /* move is a 64-bit (long word, 4-column) access, and the */
    /* internal move is a 48-bit (instruction word, 3-column) */
    /* access. */
    jump start;          {after load is complete, starts program}

Using the PX register for 48-bit moves over the PM data bus can be useful for programs that handle non-standard loading operations. For information on more typical booting techniques, see “I/O Processor”. For more information on using DAGs, see “DAG Operations” on page 4-9. For more information on sequencing (Jump and Do/Until), see “Program Sequencer”. For more information on how the DSP works with memory words, see “Memory Organization and Word Size” on page 5-22.

For the previous example, a 64-bit transfer to the Port1 memory location can be accomplished by adding the LW bit extension to the data move instruction:

    PM(Port1)=PX (LW);   {move all 64 bits of PX to Port1}

The LW bit extension does not alter the data alignment for the PX transfer. Specifying the LW bit extension on a PX write to internal memory generates a write to all four 16-bit columns of the memory destination, rather than three 16-bit columns.

Some transfers over the DM and PM buses have fixed sizes (64- or 48-bit) regardless of whether the instruction includes the LW (Long Word) mnemonic. PM and DM data bus transfers that have fixed sizes include:

• All transfers between the PX register and external memory are 64-bit transfers.
• All transfers between the PX register and the I/O processor EPBx registers are 64-bit transfers.
• All transfers between the PX register and the I/O processor LBUFx registers are 48-bit transfers (most significant 48 bits of PX).
• All transfers between the PX register (or any other internal register/memory) and any I/O processor register (other than the EPBx or LBUFx) are 32-bit transfers (least significant 32 bits of PX).
• All transfers between the PX register and data registers (R0-R15 or S0-S15) are 40-bit transfers.

There is no implicit move when the combined PX register is used in SIMD mode. For example in SIMD mode, the following moves could occur:

    PX1 = R0;   {R0 32-bit explicit move to PX1, and R1 32-bit implicit move to PX2}
    PX = R0;    {R0 40-bit explicit move to PX, but no implicit move for R1}

ADSP-21160 DSP Memory Map

The ADSP-21160 DSP’s memory map appears in Figure 5-6 and has three memory spaces: internal memory space, multiprocessor memory space, and external memory space.

(Diagram: internal memory space, containing the reserved (I/O), Block 0, Block 1, and reserved (aliased) areas, from 0x0000 0000 through 0x0007 FFFF; multiprocessor memory space from 0x0010 0000 through 0x007F FFFF; external memory space from 0x0080 0000 through 0xFFFF FFFF, containing Banks 0 through 3, selected by MS0 through MS3, and an unbanked region.)
Figure 5-6. ADSP-21160 Memory Map

These spaces have the following definitions:

• Internal memory space. This space ranges from address 0x0000 0000 through 0x0007 FFFF (Normal word). Internal memory space refers to the DSP’s on-chip SRAM and memory mapped registers.
• Multiprocessor memory space. This space ranges from address 0x0010 0000 through 0x007F FFFF (Normal word). Multiprocessor memory space refers to the internal memory space of a group of DSPs that are connected in a multiprocessor system.
• External memory space. This space ranges from address 0x0080 0000 through 0xFFFF FFFF (Normal word). External memory space refers to the off-chip memory or memory mapped peripherals that are attached to the DSP’s external address (ADDR31-0) and data (DATA63-0) buses.

The address ranges of the three memory spaces correspond to fields within the 32-bit address on the PM, DM, or external address buses. For definitions of these bit fields, see “Program Counter Register (PC)” on page A-28.
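The three address ranges above lend themselves to a simple classifier. The Python sketch below shows one way to sort a 32-bit address into its memory space. The function name `memory_space` is hypothetical. Treating addresses 0x0008 0000 through 0x000F FFFF as internal is an assumption, based on the Short word addresses shown for internal memory in Figure 5-7.

```python
def memory_space(address):
    """Classify a 32-bit address by memory space (hypothetical helper)."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if address >= 0x00800000:
        return "external"
    if address >= 0x00100000:
        return "multiprocessor"
    # Normal word internal addresses end at 0x0007 FFFF. Assumption: the
    # range up to 0x000F FFFF is internal Short word addressing (Figure 5-7).
    return "internal"


for sample in (0x00040000, 0x00240000, 0x00800000):
    print(f"{sample:#010x}: {memory_space(sample)}")
```

The comparisons are ordered from the highest range down, so each address falls into the first space whose starting address it reaches.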
Internal Memory

The ADSP-21160 DSP’s internal memory space appears in Figure 5-7. This memory space has the following address regions:

• I/O processor memory mapped registers. This region ranges from address 0x0000 0000 through 0x0000 00FF (Long word).
• Reserved (I/O) memory. This region ranges from address 0x0000 0100 through 0x0001 FFFF (Long word). These addresses are not accessible.
• Block 0 memory. This region ranges from address 0x0002 0000 through 0x0002 7FFF (Long word).
• Block 1 memory. This region ranges from address 0x0002 8000 through 0x0002 FFFF (Long word).
• Reserved (aliased) memory. This region consists of a Block 0 aliased region from 0x0003 0000 through 0x0003 7FFF and a Block 1 aliased region from 0x0003 8000 through 0x0003 FFFF (Long word). Accesses to these two regions result in accesses to the corresponding addresses in Block 0 and Block 1.

(Diagram: the different types of addressing address the same physical memory, but use different levels of granularity (word size). The figure shows I/O processor registers starting at 0x0000 0000, reserved (I/O) memory starting at 0x0000 0100, and the following ranges.)
    Long word (64-bit): Block 0, 0x0002 0000–0x0002 7FFF; Block 1, 0x0002 8000–0x0002 FFFF; reserved (aliased), 0x0003 0000–0x0003 FFFF
    Normal word (32-bit): Block 0, 0x0004 0000–0x0004 FFFF; Block 1, 0x0005 0000–0x0005 FFFF; reserved (aliased), 0x0006 0000–0x0007 FFFF
    Extended-precision Normal word (40-bit) or Instruction word (48-bit): Block 0, 0x0004 0000–0x0004 AAAA; Block 1, 0x0005 0000–0x0005 AAAA; reserved (aliased), 0x0006 0000–0x0006 AAAA and 0x0007 0000–0x0007 AAAA; the addresses above each of these ranges are marked as missing
    Short word (16-bit): Block 0, 0x0008 0000–0x0009 FFFF; Block 1, 0x000A 0000–0x000B FFFF; reserved (aliased), 0x000C 0000–0x000F FFFF
Figure 5-7. ADSP-21160 Internal Memory Space

The I/O processor’s memory-mapped registers control the system configuration of the DSP and I/O operations. For more information, see “I/O Processor”. These registers occupy consecutive 32-bit locations in this region.
If a program uses Long word addressing (forced with the LW mnemonic) to access this region, the access is only to the addressed 32-bit register, rather than accessing two adjacent I/O processor registers. The register contents are transferred on bits 31-0 of the data bus.

There are a couple of exceptions to this one-at-a-time I/O processor register access rule:

• Long word accesses to the external port data buffer locations (EPBx) in SIMD mode access two adjacent 32-bit I/O registers.
• Long word accesses to external port buffer (EPBx) or link port buffer (LBUFx) locations using the PX register access two adjacent 32-bit I/O registers.

As shown in Figure 5-7, the DSP can address memory in the Block 0, Block 1, or the Block 0 and 1 aliased regions using Long word, Normal word, or Short word addressing. The DSP interprets the addressing mode from the address range for the access.

Though there are multiple addressing modes for each memory region, these different modes are addressing the same physical memory. For example, the Long word address 0x0002 0000 corresponds to the same locations as Normal word addresses 0x0004 0000 and 0x0004 0001 and corresponds to the same locations as Short word addresses 0x0008 0000, 0x0008 0001, 0x0008 0002, and 0x0008 0003.

Figure 5-7 also shows that there are gaps in the DSP’s memory map when using Normal word addressing for 48-bit (instruction word) or 40-bit (extended precision Normal word) accesses. These gaps of missing addresses stem from the arrangement of this 3-column data in memory. For more information, see “Memory Organization and Word Size” on page 5-22.

Multiprocessor Memory

The ADSP-21160’s multiprocessor memory space appears in Figure 5-8. This memory space has seven address regions that correspond to the internal memory of the DSPs in a multiprocessing system. Each of the processors in such a system has a processor ID, which is set with the DSP’s ID2-0 pins.
The address regions by processor ID are:

• Internal memory of DSP with ID=001. This region ranges from address 0x0010 0000 through 0x0017 FFFF.
• Internal memory of DSP with ID=010. This region ranges from address 0x0020 0000 through 0x0027 FFFF.
• Internal memory of DSP with ID=011. This region ranges from address 0x0030 0000 through 0x0037 FFFF.
• Internal memory of DSP with ID=100. This region ranges from address 0x0040 0000 through 0x0047 FFFF.
• Internal memory of DSP with ID=101. This region ranges from address 0x0050 0000 through 0x0057 FFFF.
• Internal memory of DSP with ID=110. This region ranges from address 0x0060 0000 through 0x0067 FFFF.
• Broadcast write to internal memory of all DSPs (ID=111). This region ranges from address 0x0070 0000 through 0x0077 FFFF.

(Diagram: one region per processor ID from 001 through 110, each showing that DSP’s internal memory with its reserved (I/O), Block 0, Block 1, and reserved (aliased) areas, plus the broadcast write region for all DSPs (ID = 111); multiprocessor memory space is always addressed as Normal word.)
Figure 5-8. ADSP-21160 Multiprocessor Memory Space

It is important to note that programs may only use Normal word addressing in multiprocessor memory space. Long or Short word writes may corrupt valid data, and Long or Short word reads return invalid data.

The address range of the access determines which DSP’s internal memory is the multiprocessor memory access source or destination. Broadcast writes (writes in the range 0x0070 0000 through 0x007F FFFF) access the memory of all DSPs in the multiprocessing system.
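In the region list above, each processor's range starts at its ID multiplied by 0x0010 0000, so the processor ID can be read from bits 22-20 of the address. The Python sketch below illustrates that pattern. The names `mms_address` and `decode_mms` are hypothetical, and mapping an internal Normal word address to the same offset within a processor's region is an assumption inferred from the matching range sizes.

```python
REGION_SHIFT = 20       # each processor region starts at ID * 0x0010 0000
OFFSET_MASK = 0xFFFFF   # low 20 bits of the address
BROADCAST_ID = 0b111


def mms_address(processor_id, internal_address):
    """Build the multiprocessor memory space address for another DSP.

    Assumption: internal Normal word address N of the DSP with a given ID
    appears at (ID << 20) + N.
    """
    if not 1 <= processor_id <= 7:
        raise ValueError("processor ID must be 001 through 111")
    if not 0 <= internal_address <= 0x7FFFF:
        raise ValueError("not a Normal word internal memory address")
    return (processor_id << REGION_SHIFT) | internal_address


def decode_mms(address):
    """Return (processor_id, offset, is_broadcast) for an address."""
    if not 0x00100000 <= address <= 0x007FFFFF:
        raise ValueError("address is outside multiprocessor memory space")
    processor_id = (address >> REGION_SHIFT) & 0x7
    return processor_id, address & OFFSET_MASK, processor_id == BROADCAST_ID


print(hex(mms_address(0b010, 0x40000)))
print(decode_mms(0x00700000))
```

The sketch does not decide how offsets above 0x7 FFFF within a region behave, because the region list does not describe them.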
Instead of using its own internal memory address range, a DSP can access its memory through the DSP’s corresponding address range in multiprocessor memory space. In this case, the DSP reads or writes to its own internal memory and does not make an access on the external system bus. Note that such self-accesses through multiprocessor memory space may only be accomplished with processor-core-generated addresses, not I/O processor-generated addresses. For more information on memory accesses in multiprocessor systems, see “External Port”.

External Memory

The ADSP-21160’s external memory space appears in Figure 5-9.

Figure 5-9. ADSP-21160 External Memory Space (0x0080 0000 through 0xFFFF FFFF, always addressed as Normal word: Bank 0 (MS0), Bank 1 (MS1), Bank 2 (MS2), Bank 3 (MS3), the unbanked region, and boot EPROM memory (BMS); optional DRAM pages map to Bank 0 and require an external DRAM controller)

The DSP accesses external memory space through the external port, which multiplexes the processor core’s PM and DM buses and the I/O processor’s EP bus. To address this space, the DSP’s DAG1, DAG2, and I/O processor generate 32-bit addresses over the DM, PM, and EP address buses, allowing the DSP access to the complete 4 Gword memory map. However, the program sequencer generates only 24-bit addresses over the PM bus, limiting sequencing to the low 12 Mwords of the memory map.

As shown in Figure 5-9, the external memory space has five regions: four banks (Bank 0-3) and an unbanked region. The DSP controls access to the banked regions with memory select lines (MS3-0) in addition to the memory address. Access to the unbanked region is controlled only by the memory address. Each region of external memory may be configured for address range and waitstates. For more information on configuring external memory banks, see “Setting Data Access Modes” on page 5-28.
For more information on accessing external memory, see “External Port”. The external memory space can also accommodate an optional boot memory EPROM and optional paged DRAM. For more information, see “Using Boot Memory” on page 5-29 and “External (Bank 0) DRAM Page Size” on page 5-38.

Shadow Write FIFO

Because the DSP’s internal memory operates at high speeds, writes to the memory do not go directly into the memory array, but rather to a two-deep FIFO called the shadow write FIFO. This FIFO uses a non-read cycle (either a write cycle, or a cycle in which there is no access of internal memory) to load data from the FIFO into internal memory. When an internal memory write cycle occurs, the FIFO loads any data from a previous write into memory and accepts new data. FIFO operation is normally transparent, but there is one case in which programs need to intervene in the operation of the shadow write FIFO: mixing 48-bit and 32-bit word accesses to the same locations in memory.

The shadow FIFO cannot differentiate between the mapping of 48-bit words and the mapping of 32-bit words. Examples of these mappings appear in Figure 5-10 on page 5-23, Figure 5-11 on page 5-24, Figure 5-12 on page 5-25, and Figure 5-13 on page 5-26. If a program writes a 48-bit word to memory and then tries to read the data with a 16-, 32-, or 64-bit word access, or writes a 16-, 32-, or 64-bit word to memory and then tries to read the data with a 48-bit access, the shadow FIFO does not intercept the read and returns incorrect data.

If a program must mix 48-bit or 40-bit accesses and 16-, 32-, or 64-bit accesses to the same locations, the program must ensure that the FIFO is flushed before attempting to read the data. The program flushes the FIFO by performing two dummy writes or executing two instructions that do not access the internal memory.
These operations force the FIFO to automatically use the non-access cycles to push the write data.

Memory Organization and Word Size

The DSP’s internal memory is organized as four 16-bit wide by 32K high columns. These columns of memory are addressable as a variety of word sizes:

• 64-bit Long word data (4 columns)
• 48-bit instruction words or 40-bit extended precision Normal word data (3 columns)
• 32-bit Normal word data (2 columns)
• 16-bit Short word data (1 column)

Extended precision Normal word data is left-justified within a 3-column location, using bits 47-8 of the location.

Placing 32-Bit Words and 48-Bit Words

When the processor core or I/O processor addresses memory, the word width of the access determines which columns within the memory are accessed. For instruction words (48-bit) or extended precision Normal word data (40-bit), the word width is 48 bits, and the access selects from the memory’s 16-bit columns in groups of three. Because these sets of 3-column accesses are packed in a 4-column matrix, there are four rotations of the columns for storing 48-bit data. The 3-column word rotations within the 4-column matrix appear in Table 5-1 and Figure 5-10.

Figure 5-10. 48-bit Word Rotations (Rotation 0 through Rotation 3 of 48-bit words across the 16-bit wide Column 0 through Column 3)

For Long word (64-bit), Normal word (32-bit), and Short word (16-bit) memory accesses, the DSP selects from fixed columns in memory. No rotations of words within columns occur for these data types.

Figure 5-7 on page 5-15 shows the memory ranges for each data size in the DSP’s internal memory. Figure 5-10 describes 48-bit word rotations.
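Because Long word, Normal word, and Short word accesses select fixed columns, their addresses are related by simple scaling. The sketch below assumes that the 1:2:4 relationship of the manual’s own example (Long word address 0x0002 0000, Normal word addresses 0x0004 0000 and 0x0004 0001, Short word addresses 0x0008 0000 through 0x0008 0003) holds for every Long word address in internal memory. The function names are hypothetical.

```python
# Sketch of address aliasing between addressing modes, assuming the 1:2:4
# scaling of the manual's example applies throughout internal memory.
# All function names here are hypothetical.


def normal_from_long(lw_addr):
    """Two Normal word (32-bit) addresses that alias one Long word address."""
    first = 2 * lw_addr
    return [first, first + 1]


def short_from_long(lw_addr):
    """Four Short word (16-bit) addresses that alias one Long word address."""
    first = 4 * lw_addr
    return [first + i for i in range(4)]


def long_from_normal(nw_addr):
    """Long word address whose 64-bit location contains this Normal word."""
    return nw_addr >> 1
```

Under that assumption, both Normal word addresses of a pair map back to the same Long word address, which is why an even Normal word address marks a 64-bit boundary.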
Mixing 32-Bit and 48-Bit Words

The DSP’s memory organization lets programs freely place memory words of all sizes (see “Memory Organization and Word Size” on page 5-22) with few restrictions (see “Restrictions on Mixing 32-Bit and 48-Bit Words” on page 5-26). This memory organization also lets programs mix (place in adjacent addresses) words of all sizes. This section discusses how to mix odd (3-column) and even (4-column) data words in the DSP’s memory.

Transition boundaries between 48-bit (3-column) data and any other data size can occur at any 64-bit address boundary within either internal memory block. Depending on the ending address of the 48-bit words, there are zero, one, or two empty locations at the transition between the 48-bit (3-column) words and the 64-bit (4-column) words. These empty locations result from the column rotation for storing 48-bit words. The three possible transition arrangements appear in Figure 5-11, Figure 5-12 on page 5-25, and Figure 5-13 on page 5-26.

Transitioning from 48-bit to 32-bit data with zero empty locations: (48-bit word top address) MOD 4 = 3

Figure 5-11. Mixed Instructions and Data With No Unused Locations

Transitioning from 48-bit to 32-bit data with one empty location: (48-bit word top address) MOD 4 = 0

Figure 5-12. Mixed Instructions and Data With One Unused Location

Transitioning from 48-bit to 32-bit data with two empty locations: (48-bit word top address) MOD 4 = 1

Figure 5-13. Mixed Instructions and Data With Two Unused Locations

Restrictions on Mixing 32-Bit and 48-Bit Words

There are some restrictions that stem from the memory column rotations for 3-column data (48- or 40-bit words) and relate to the way that 3-column data can mix with 4-column data (32-bit words) in memory. These restrictions apply to mixing 48- and 32-bit words, because the DSP uses a Normal word address to access both of these types of data even though 48-bit data maps onto 3 columns of memory and 32-bit data maps onto 2 columns of memory.

When a system has a range of 3-column (48-bit) words followed by a range of 2-column (32-bit) words, there is often a gap of empty 16-bit locations between the two address ranges. The size of the address gap varies with the ending address of the range of 48-bit words. Because the addresses within the gap alias to both 48- and 32-bit words, a 48-bit write into the gap corrupts 32-bit locations, and a 32-bit write into the gap corrupts 48-bit locations. The locations within the gap are only accessible with Short word (16-bit) accesses.

Calculating the starting address for 4-column data that minimizes the gap after 3-column data is a useful calculation for programs that are mixing 3- and 4-column data.
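As a hedged illustration, the two gap-minimizing calculations that this subsection defines, m = B + 2 [(n MOD 43,690) - TRUNC((n MOD 43,690) / 4)] and n = TRUNC{4[(m - B) / 2] / 3} + W, can be sketched in Python. The function names are hypothetical, and the second function assumes the divisions are exact, so that TRUNC{4[(m - B) / 2] / 3} equals the integer quotient of 2(m - B) by 3.

```python
# Sketch of the two gap-minimizing calculations defined in this subsection.
# Function names are hypothetical; constants come from the manual's formulas.

WORDS_48_PER_BLOCK = 43_690  # 48-bit words that fit in one internal memory block


def first_32bit_address(n):
    """First 32-bit Normal word address to use after n contiguous 48-bit words."""
    if not 0 < n < 87_381:
        raise ValueError("n must satisfy 0 < n < 87,381")
    base = 0x40000 if n < 43_691 else 0x50000
    k = n % WORDS_48_PER_BLOCK
    return base + 2 * (k - k // 4)


def count_48bit_words(m):
    """Number of 48-bit words to allocate before 32-bit data starting at m."""
    if not 0x3FFFF < m < 0x60000:
        raise ValueError("m must satisfy 0x3FFFF < m < 0x60000")
    base = 0x40000 if m < 0x50000 else 0x50000
    offset = WORDS_48_PER_BLOCK if base == 0x50000 else 0
    # Assumes exact division inside the brackets: TRUNC(4 * ((m - B) / 2) / 3)
    return (2 * (m - base)) // 3 + offset
```

For instance, four 48-bit words fill exactly three 64-bit rows, so first_32bit_address(4) lands at 0x40006, and count_48bit_words(0x40006) returns 4 again.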
Given the last address of the 3-column (48-bit) data, the starting address of the 32-bit range that most efficiently uses memory can be determined as follows:

• n is the number of contiguous 48-bit words the system has allocated in the internal memory block (n < 87,381)
• B is the base Normal word address of the internal memory block; if {0 < n < 43,691} then B = 0x40000 else B = 0x50000
• m is the first 32-bit Normal word address to use after the end of 48-bit words
• m = B + 2 [(n MOD 43,690) - TRUNC((n MOD 43,690) / 4)]

Another useful calculation for programs that are mixing 3- and 4-column data is to calculate the amount of 3-column data that minimizes the gap before starting 4-column data. Given the starting address of the 4-column (32-bit) data, the number of 48-bit words to allocate that most efficiently uses memory can be determined as follows:

• m is the first 32-bit Normal word address after the end of 48-bit words (0x3FFFF < m < 0x60000)
• B is the base Normal word address of the internal memory block; if {0x3FFFF < m < 0x50000} then B = 0x40000 else B = 0x50000
• W is the number of offset words; if {B = 0x50000} then W = 43,690 else W = 0
• n is the number of contiguous 48-bit words the system should allocate in the internal memory block
• n = TRUNC{4[(m - B) / 2] / 3} + W

Setting Data Access Modes

The SYSCON, MODE1, MODE2, and WAIT registers control the operating mode of the DSP’s memory. Table A-17 on page A-45 lists all the bits in SYSCON, Table A-2 on page A-3 lists all the bits in MODE1, Table A-3 on page A-7 lists all the bits in MODE2, and Table A-18 on page A-48 lists all the bits in WAIT.

The following bits in the SYSCON, MODE1, MODE2, and WAIT registers control memory access modes:

• Boot Select Override. SYSCON Bit 1 (BSO) overrides normal usage of MSx chip select lines in favor of the BMS select line for access to boot memory instead of external memory (if 1) or allows normal access to external memory with the MSx chip select lines (if 0).
• Internal Interrupt Vector Table. SYSCON Bit 2 (IIVT) forces placement of the interrupt vector table at address 0x0004 0000 regardless of booting mode (if 1) or allows placement of the interrupt vector table as selected by the booting mode (if 0).
• Internal Memory Block Data Width. SYSCON Bit 9 (IMDW0) and Bit 10 (IMDW1) select the Normal word data access size for internal memory Block 0 and Block 1. A block’s Normal word access size is fixed as 2-column (if IMDWx=0) or 3-column (if IMDWx=1).
• Memory Bank Size. SYSCON Bits 15-12 (MSIZE). This bit field selects the size of the four external memory banks (Bank 3-0). The external memory that is not allotted to a bank is part of the unbanked external memory region.
• External Bus Priority. SYSCON Bits 18-17 (EBPRx). This bit field selects the priority for the I/O processor’s EP bus when arbitrating access to the DSP’s external port.
• Secondary processor element (PEy). MODE1 Bit 21 (PEYEN) enables computations in PEy, selecting SIMD mode (if 1), or disables PEy, selecting SISD mode (if 0).
• Broadcast register loads. MODE1 Bit 22 (BDCST9) and Bit 23 (BDCST1) enable broadcast register loads for memory transfers indexed with I1 (if BDCST1=1) or indexed with I9 (if BDCST9=1).
• Illegal IOP register access enable. MODE2 Bit 20 (IIRAE) enables detection of illegal I/O processor register access (if 1) or disables detection (if 0).
• Unaligned 64-bit memory access enable. MODE2 Bit 21 (U64MAE) enables detection of uneven address memory access (if 1) or disables detection (if 0).
• External bank X access mode. WAIT Bits 1-0 (EB0AM), Bits 6-5 (EB1AM), Bits 11-10 (EB2AM), Bits 16-15 (EB3AM), and Bits 21-20 (UBAM). These bit fields select the access modes for the external memory banks.
• External bank X waitstates. WAIT Bits 4-2 (EB0WS), Bits 9-7 (EB1WS), Bits 14-12 (EB2WS), Bits 19-17 (EB3WS), and Bits 24-22 (UBWS). These bit fields select the waitstates for the external memory banks.
• External bank 0 DRAM page size. WAIT Bits 27-25 (PAGSZ). This bit field selects the page size for DRAM (allowed in Bank 0 only).

Using Boot Memory

As shown in Figure 5-9 on page 5-20, the DSP supports an external boot EPROM mapped to external memory and selected with the BMS pin. The boot EPROM provides one of the methods for automatically loading a program into the internal memory of the DSP after power-up or after a software reset. This process is called booting. For information on boot options and the booting process, see “Bootloading Through The External Port” on page 6-76 or “Bootloading Through The Link Port” on page 6-87. For information on systems with a boot EPROM, see “Booting Single and Multiple Processors” on page 11-48.

Reading from Boot Memory

When the DSP boots from an EPROM, the DSP’s I/O processor only loads 256 instructions automatically from EPROM. If the whole program must be loaded into internal memory from the EPROM, the DSP must gain access to the boot EPROM after the I/O processor completes the automatic boot process.

To manage continuing access to boot memory, the DSP uses the Boot Select Override (BSO) bit in the SYSCON register. Setting (=1) the BSO bit overrides the external memory selects and asserts the DSP’s BMS pin for an external memory DMA transfer.

For accessing boot memory, the program first sets the BSO bit in SYSCON and then sets up an external port DMA channel to read the EPROM’s contents. The program must unmask the DMA channel’s interrupt in the IMASK register; if using external port DMA buffer zero (EP0I), the program could enable this interrupt by initializing IMASK to 0x00008003.
For more information on external port DMA, see “Setting I/O Processor—EPort Modes” on page 6-14.

While a program may use any external port DMA channel for accessing boot memory, it is important to note that only DMA channel 10 has a special packing mode for boot memory reads. By using DMA channel 10 to complete initial program loading, a program can take advantage of this special packing mode. When a program sets BSO, the DSP ignores the DMA channel’s packing mode (PMODE) bits and forces 8-to-48 bit packing for reads. This special 8-bit packing mode is only available on DMA channel 10 during EPROM booting or on DMA reads when BSO is set.

While one of the external port DMA channels is making a DMA access to boot memory with the BSO bit set, none of the other three channels may make a DMA access to external (not boot) memory. Only external port DMA transfers assert BMS when BSO is set; processor core accesses to external memory always use the MSx pins. Because the processor core only accesses external (not boot) memory, programs can access external memory in between DMA accesses to boot memory.

Writing to Boot Memory

In systems using writable EEPROM or FLASH memory for boot memory, programs can write new data to the DSP’s boot memory using the Boot Select Override (BSO) bit. As described in “Reading from Boot Memory” on page 5-30, setting (=1) the BSO bit overrides the external memory selects and asserts the DSP’s BMS pin for an external memory DMA transfer.

To write to boot memory with BMS asserted, programs must use DMA channels 11, 12, or 13, but not DMA channel 10. With the BSO bit set, programs should only use DMA channel 10 for reads. When BSO is set, programs can use DMA channels 11-13 with any settings in the channel’s DMACx register, any packing mode, and any data or instruction.
Because boot memory is 8 bits wide and no 8-bit packing mode is available for these writes, programs must use the shifter to place data in the correct location for each write.

Internal Interrupt Vector Table

The default location of the ADSP-21160’s interrupt vector table depends on the DSP’s booting mode. When the processor boots from an external source (EPROM, host port, or link port booting), the vector table starts at address 0x0004 0000 (Normal word). When the processor is in “no boot” mode (runs from external memory location 0x0080 0000 without loading), the interrupt vector table starts at address 0x0080 0000.

The Internal Interrupt Vector Table (IIVT) bit in the SYSCON register overrides the default placement of the vector table. If IIVT is set (=1), the interrupt table starts at address 0x0004 0000 (internal memory) regardless of the booting mode.

Internal Memory Block Data Width

The DSP’s internal memory blocks use Normal word addressing to access either single-precision 32-bit data or extended-precision 40-bit data. Programs select the data width independently for each internal memory block using the Internal Memory Data Width (IMDW0 and IMDW1) bits in the SYSCON register. If a block’s IMDWx bit is cleared (=0), Normal word addressed accesses to the block access Normal word (32-bit) data. If a block’s IMDWx bit is set (=1), Normal word addressed accesses to the block access extended precision Normal word (40-bit) data.

Reading or writing 40-bit data using a Normal word access to a memory block whose IMDWx bit is cleared (=0) has the following results:

• If a program tries to write 40-bit data (for example, a data register-to-memory transfer), the transfer truncates the lower 8 bits from the register, writing only 32 bits.
• If a program tries to read 40-bit data (for example, a memory-to-data register transfer), the transfer zero-fills the lower 8 bits of the register, reading only 32 bits.

The Program Memory Bus Exchange (PX) register is the only exception to these transfer rules: all loads and stores of the PX register are performed as 48-bit accesses unless forced to 64-bit access with the LW mnemonic. If any 40-bit data must be stored in a memory block configured for 32-bit words, the program should use the PX register to access the 40-bit data in 48-bit words. Programs should take care not to corrupt any 32-bit data with this type of access. For more information, see “Restrictions on Mixing 32-Bit and 48-Bit Words” on page 5-26.

The Long word (LW) mnemonic only affects Normal word address accesses and overrides all other factors (SIMD, IMDWx).

Memory Bank Size

The DSP’s external memory space has four banks of equal, programmable size. The remaining area of external memory that is not assigned to a bank is called unbanked. Mapping peripherals into different banks lets systems accommodate I/O devices with different timing requirements, because the banked and unbanked regions have associated waitstate and access mode settings. For more information, see “External Bank X Access Mode” on page 5-36 and “External Bank X Waitstates” on page 5-37.

As shown in Figure 5-9 on page 5-20, Bank 0 starts at address 0x0080 0000 in external memory, and the Bank 1, Bank 2, Bank 3, and unbanked regions follow. Whenever the DSP generates an address that is located within one of the four banks, the DSP asserts the corresponding memory select line (MS3-0).

The size of the memory banks ranges from 8 Kwords to 256 Mwords and is always a power of two. The Memory Size (MSIZE) field of the SYSCON register selects the memory bank size.
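As an illustrative sketch, the MSIZE relation that this section gives, MSIZE = log2(desired bank size in words) - 13, maps the 8 Kword to 256 Mword range onto the 4-bit field in SYSCON Bits 15-12. The Python helpers below are hypothetical names chosen for this example; they only show the arithmetic and do not touch any hardware register.

```python
# Sketch of the MSIZE computation and its placement in SYSCON bits 15-12.
# msize_for_bank and syscon_with_msize are hypothetical helper names.

MSIZE_SHIFT = 12   # MSIZE occupies SYSCON bits 15-12
MSIZE_MASK = 0xF   # four-bit field


def msize_for_bank(words):
    """Return the MSIZE value for a bank size given in words."""
    smallest, largest = 8 * 1024, 256 * 1024 * 1024
    if not smallest <= words <= largest or words & (words - 1):
        raise ValueError("bank size must be a power of two, 8K to 256M words")
    return (words.bit_length() - 1) - 13  # log2(words) - 13


def syscon_with_msize(syscon, msize):
    """Return a copy of a SYSCON value with only the MSIZE field replaced."""
    cleared = syscon & ~(MSIZE_MASK << MSIZE_SHIFT)
    return cleared | ((msize & MSIZE_MASK) << MSIZE_SHIFT)
```

The smallest bank (8 Kwords) therefore yields MSIZE = 0 and the largest (256 Mwords) yields MSIZE = 15, which is why the field needs exactly four bits.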
The value in MSIZE is: • MSIZE = log2 (desired bank size in words) - 13 External Bus Priority The DSP’s internal bus architecture lets the PM bus, DM bus, and I/O processor’s EP bus try to access multiprocessor memory space or external memory space in the same cycle. This contending access produces a conflict that the DSP resolves with a two level arbitration policy. The processor core’s DM bus always has priority over the PM bus. External Bus Priority (EBPRx) bits in the SYSCON register control the further ADSP-21160 SHARC DSP Hardware Reference 5-33 Setting Data Access Modes arbitration between the winning core bus and the I/O processor’s EP bus. The EBPRx field assigns priority as follows: • If EBPR is 00, priority rotates between core and I/O processor buses. • If EBPR is 01, the winning core bus has priority over the I/O processor bus. • If EBPR is 10, the I/O processor bus has priority over the winning core bus. Secondary Processor Element (PEy) When the PEYEN bit in the MODE1 register is set (=1), the DSP is in Single-Instruction, Multiple-Data (SIMD) mode. In SIMD mode, many data access operations differ from the DSP’s default Single-Instruction, Single-Data (SISD) mode. These differences relate to doubling the amount of data transferred for each data access. Accesses in SIMD mode transfer both an explicit (named) location and an implicit (un-named, complementary) location. The explicit transfers is a data transfers between the explicit register and the explicit address, and the implicit transfer is between the implicit register and the implicit address. For information on complementary (implicit) registers in SIMD mode accesses, see “Secondary Processing Element (PEy)” on page 2-35. For more information on complementary (implicit) memory locations in SIMD mode accesses, see “Accessing Memory” on page 5-39. Broadcast Register Loads The DSP’s BDCST1 and BDCST9 bits in the MODE1 register control broadcast register loading. 
When broadcast loading is enabled, the DSP writes to complementary registers or complementary register pairs in each processing element on writes that are indexed with DAG1 register I1 (if BDCST1 =1) or DAG2 register I9 (if BDCST9 =1). Broadcast load accesses are similar to SIMD mode accesses in that the DSP transfers both an explicit (named) location and an implicit (un-named, complementary) location, but broadcast loading only influences writes to registers and writes identical data to these registers. Broadcast mode is independent of SIMD mode.

Table 5-3 on page 5-35 shows examples of explicit and implicit effects of broadcast register loads to both processing elements. Note that broadcast loading only affects loads of data registers (register file); broadcast loading does not affect register stores or loads to other system registers. Also, broadcast loads only work on register loads; broadcast loading cannot be used for memory writes. For more information on broadcast loading, see “Accessing Memory” on page 5-39.

Table 5-3. Register Load Dual PE Broadcast Operation

Instruction (Explicit, PEx Operation)    (Implicit, PEy Operation)
Rx = dm(i1,ma);                          Sx = dm(i1,ma);
Rx = pm(i9,mb);                          Sx = pm(i9,mb);
Rx = dm(i1,ma), Ry = pm(i9,mb);          Sx = dm(i1,ma), Sy = pm(i9,mb);

Illegal I/O Processor Register Access

The DSP monitors for I/O processor register access if the Illegal I/O processor Register Access (IIRAE) bit in the MODE2 register is set (=1). When detected, this condition is an input that can cause an Illegal Input Condition Detected (IICDI) interrupt if the interrupt is enabled in the IMASK register. The I/O processor’s DMA controller cannot generate the IICDI interrupt. Only master (not slave) I/O register accesses are detectable. For more information, see “Mode Control 2 Register (MODE2)” on page A-6.
Unaligned 64-bit Memory Access

The DSP monitors for unaligned 64-bit memory accesses if the Unaligned 64-bit Memory Accesses (U64MAE) bit in the MODE2 register is set (=1). An unaligned access is an odd numbered address Normal word access that is forced to 64-bit with the LW mnemonic. When detected, this condition is an input that can cause an Illegal Input Condition Detected (IICDI) interrupt if the interrupt is enabled in the IMASK register. For more information, see “Mode Control 2 Register (MODE2)” on page A-6.

External Bank X Access Mode

The DSP has four modes for accessing external memory space. The External Bank Access Mode (EBxAM) fields in the WAIT register select how the DSP uses waitstates and the acknowledge (ACK) pin to access each external memory bank and unbanked region. The external bank access modes appear in Table 5-4.

Table 5-4. External Bank Access Mode

EBxAM Field    External Bank Access Mode
00             Asynchronous: DSP RDH/L and WRH/L strobes change before CLKOUT’s edge; accesses use the waitstate count setting from EBxWS and require external acknowledge (ACK), allowing a deasserted ACK to extend the access time.
01             Synchronous: DSP RDH/L and WRH/L strobes change on CLKOUT’s edge; reads use the waitstate count setting from EBxWS (minimum EBxWS=001) and require external acknowledge (ACK), allowing a deasserted ACK to extend the read access time; writes are 0-wait state.
10             Synchronous: DSP RDH/L and WRH/L strobes change on CLKOUT’s edge; reads use the waitstate count setting from EBxWS (minimum EBxWS=001) and require external acknowledge (ACK), allowing a deasserted ACK to extend the read access time; writes are 1-wait state.
11             Reserved

External Bank X Waitstates

The DSP applies waitstates to each external memory access depending on the bank’s external memory access mode (EBxAM).
The External Bank Waitstate (EBxWS) field in the WAIT register sets the number of waitstates for each bank as shown in Table 5-5.

Table 5-5. External Bank Waitstates

EBxWS    # of Waitstates    Hold Time Cycle? (see note 1)
000      0                  no
001      1                  no
010      2                  yes
011      3                  yes
100      4                  yes
101      5                  yes
110      6                  yes
111      7                  yes

Note 1: Hold Cycle applies to asynchronous mode only.

Table 5-5 lists the hold time settings that EBxWS associates with external memory accesses. A hold time cycle is an inactive bus cycle that the DSP inserts automatically at the end of a read or write, allowing a longer hold time for address and data. The address and data remain unchanged and are driven for one cycle after the DSP deasserts the read or write strobes.

The DSP applies hold time cycles regardless of the external bank access mode (EBxAM). For example, the asynchronous (ACK plus waitstate) mode could also have an associated hold time cycle.

External (Bank 0) DRAM Page Size

As shown in Figure 5-9 on page 5-20, the DSP supports a region of paged DRAM mapped to the Bank 0 region of external memory. Systems placing DRAM in this region require an external DRAM controller to manage page access to the DRAM. For more information, see “DRAM Page Boundary Detection” on page 7-15.

To support DRAM accesses in this region, the DSP detects page boundary crossings and outputs the PAGE signal to the system’s DRAM controller. The page boundaries depend on the type of DRAM in the system. For correct operation, programs must configure the page size in the Page Size (PAGSZ) field of the WAIT register. Table 5-6 shows the available PAGSZ settings.

Table 5-6. External DRAM (Bank 0) Page Size

PAGSZ Field    DRAM Page Size
000            256 words
001            512 words
010            1024 words (1K)
011            2048 words (2K)
100            4096 words (4K)
101            8192 words (8K)
110            16384 words (16K)
111            32768 words (32K)

Using Memory Access Status

As described in “Illegal I/O Processor Register Access” on page 5-35 and “Unaligned 64-bit Memory Access” on page 5-36, the DSP can provide illegal access information for Long word or I/O register accesses. When these conditions occur, the DSP updates an illegal condition flag in a sticky status (STKYx) register. Either of these two conditions can also generate a maskable interrupt. Two ways to use illegal access information are:

• Interrupts. Enable interrupts and use an interrupt service routine to handle the illegal access condition immediately. This method is appropriate if it is important to handle all illegal accesses as they occur.
• STKYx registers. Use the Bit Tst instruction to examine illegal condition flags in the STKY register after an interrupt to determine which illegal access condition occurred.

Accessing Memory

The word width of DSP processor core accesses to internal memory varies according to the following rules:

• 48-bit access for instruction words, extended precision Normal word (40-bit) data, and PX register
• 64-bit access for Long word data, and Normal word (32-bit) or PX register data with the LW mnemonic
• 32-bit access for Normal word (32-bit) data
• 16-bit access for Short word data

The DSP determines whether a Normal word access is 32- or 40-bit from the internal memory block’s IMDWx setting. For more information, see “Internal Memory Block Data Width” on page 5-32.

While mixed accesses of 48-bit words and 16-, 32-, or 64-bit words at the same address are not allowed, mixed reads and writes of 16-, 32-, and 64-bit words to the same address are allowed. For more information, see “Restrictions on Mixing 32-Bit and 48-Bit Words” on page 5-26.
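The access-width rules listed under “Accessing Memory” can be restated as a small decision function. This Python sketch is a hedged restatement of those rules only; the function name access_width and the string labels for the access kinds are hypothetical, and the sketch assumes, as the manual states, that the LW mnemonic overrides the IMDWx setting for Normal word and PX accesses.

```python
# Sketch of the internal-memory access-width rules. The function name and
# the 'kind' labels are hypothetical; widths come from the rules listed
# under "Accessing Memory".


def access_width(kind, imdw=0, lw=False):
    """Return the access width in bits.

    kind: 'instruction', 'px', 'long', 'normal', or 'short'
    imdw: the block's IMDWx bit (0 = 32-bit, 1 = 40-bit Normal word data)
    lw:   True when the LW mnemonic forces a Long word access
    """
    if kind == "instruction":
        return 48
    if kind == "long":
        return 64
    if kind == "short":
        return 16
    if kind == "px":
        return 64 if lw else 48
    if kind == "normal":
        if lw:
            return 64  # LW overrides the IMDWx setting
        return 48 if imdw else 32  # 40-bit data travels as a 48-bit access
    raise ValueError("unknown access kind: " + repr(kind))
```

For example, a Normal word access to a block with IMDWx set uses a 48-bit access, but the same access with the LW mnemonic becomes a 64-bit access.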
The DSP’s DM and PM buses support 24 combinations of register-to-memory data access options. The following factors influence the data access type:

• Size of words: Short word, Normal word, extended precision Normal word, or Long word
• Number of words: single- or dual-data move
• Mode of DSP: SISD, SIMD, or broadcast load

Access Word Size

The DSP’s internal memory accommodates the following word sizes:

• 48-bit instruction words
• 40-bit extended precision Normal word data
• 32-bit Normal word data
• 16-bit Short word data

The DSP’s external memory accommodates the following word sizes:

• 48-bit instruction words
• 32-bit Normal word data

To access words of memory, the DSP supports the following memory access word sizes:

• 64-bit accesses, consisting of two consecutive 32-bit data words
• 48-bit accesses, for instruction fetches only
• 40-bit data word accesses
• 32-bit data word accesses
• 16-bit data word accesses

Long Word (64-Bit) Accesses

A program makes a Long word (64-bit) access to internal memory using an access to a Long word address. Programs can also make a 64-bit access through Normal word addressing with the LW mnemonic or through a PX register move with the LW mnemonic. Programs may not use Long word addressing to access multiprocessor memory space or external memory. The address ranges for internal memory accesses appear in Figure 5-7 on page 5-15.

When data is accessed using Long word addressing, the data is always Long word aligned on 64-bit boundaries in internal memory space. When data is accessed using Normal word addressing and the LW mnemonic, the program should maintain this alignment by using an even Normal word address (least significant bit of address =0). This register selection aligns the Normal word address with a 64-bit boundary (Long word address).

All Long word accesses load or store two consecutive 32-bit data values.
The register file source or destination of a Long word access is a set of two neighboring data registers in a processing element. In a forced Long word access (one that uses the LW mnemonic), the even (Normal word address) location moves to or from the explicit register in the neighbor-pair, and the odd (Normal word address) location moves to or from the implicit register in the neighbor-pair. For example, the following Long word moves could occur:

    DM(0x40000) = R0 (LW);
    {The data in R0 moves to location DM(0x40000), and the data in R1 moves to location DM(0x40001).}

    R0 = DM(0x40003) (LW);
    {The data at location DM(0x40002) moves to R0, and the data at location DM(0x40003) moves to R1.}

The example shows that R0 and R1 are neighbor registers in the same processing element. Table 5-7 lists the other neighbor register assignments that apply to Long word accesses.

In un-forced Long word accesses, the DSP places the lower 32 bits of the Long word in the named (explicit) register and places the upper 32 bits of the Long word in the neighbor (implicit) register.

Table 5-7. Neighbor Registers for Long Word Accesses

PEx neighbor registers    PEy neighbor registers
r0 neighbors r1           s0 neighbors s1
r2 neighbors r3           s2 neighbors s3
r4 neighbors r5           s4 neighbors s5
r6 neighbors r7           s6 neighbors s7
r8 neighbors r9           s8 neighbors s9
r10 neighbors r11         s10 neighbors s11
r12 neighbors r13         s12 neighbors s13
r14 neighbors r15         s14 neighbors s15

Programs can monitor for unaligned 64-bit accesses by enabling the U64MAE bit. For more information, see “Unaligned 64-bit Memory Access” on page 5-36.

The Long word (LW) mnemonic only affects Normal word address accesses and overrides all other factors (PEYEN, IMDWx).

Instruction Word (48-Bit) and Extended Precision Normal Word (40-Bit) Accesses

The sequencer uses a 48-bit memory access for instruction fetches.
Programs can make 48-bit accesses with PX register moves, which default to 48-bit unless the LW mnemonic is part of the instruction.

A program makes an extended precision Normal word (40-bit) access to internal memory using an access to a Normal word address when that internal memory block’s IMDWx bit is set (=1) for 40-bit words. Programs may not use extended precision Normal word addressing to access multiprocessor memory space or external memory. The address ranges for internal memory accesses appear in Figure 5-7 on page 5-15. For more information on configuring memory for extended precision Normal word accesses, see “Internal Memory Block Data Width” on page 5-32.

The DSP transfers the 40-bit data to internal memory as a 48-bit value, zero-filling the least significant 8 bits on stores and truncating these 8 bits on loads. The register file source or destination of such an access is a single 40-bit data register.

Normal Word (32-Bit) Accesses

A program makes a Normal word (32-bit) access to internal memory using an access to a Normal word address when that internal memory block’s IMDWx bit is cleared (=0) for 32-bit words. Programs use Normal word addressing to access all DSP memory spaces: internal, multiprocessor, and external memory space. The address ranges for memory accesses appear in Figure 5-7 on page 5-15, Figure 5-8 on page 5-18, and Figure 5-9 on page 5-20.

The register file source or destination of a Normal word access is a single 40-bit data register. The DSP zero-fills the least significant 8 bits on loads and truncates these bits on stores.

External memory space accesses using Normal word addressing and the LW mnemonic perform a forced 64-bit access.

Short Word (16-Bit) Accesses

A program makes a Short word (16-bit) access to internal memory, using an access to a Short word address. Programs may not use Short word addressing to access multiprocessor memory space or external memory.
The address ranges for internal memory accesses appear in Figure 5-7 on page 5-15.

The register file source or destination of such an access is a single 40-bit data register. The DSP zero-fills the least significant 8 bits on loads and truncates these bits on stores. Depending on the value of the SSE bit in the MODE1 system register, the DSP loads the register’s upper 16 bits by either:

• Zero-filling these bits if SSE=0
• Sign-extending these bits if SSE=1

SISD, SIMD, and Broadcast Load Modes

These three processing element modes influence memory accesses. For a comparison of their effects, see the examples in “Data Access Options” on page 5-45. For more information on SISD and SIMD modes, see “Secondary Processing Element (PEy)” on page 2-35.

Broadcast load mode is a hybrid between SISD and SIMD modes, transferring dual-data under special conditions. For examples of broadcast transfers, see “Data Access Options” on page 5-45. For more information on broadcast load mode, see “Broadcast Register Loads” on page 5-34.

Single- and Dual-Data Accesses

The number of transfers that occur in a cycle influences the data access operation. As described in “Overview” on page 5-1, the DSP supports single-cycle, dual-data accesses to internal memory for register-to-memory transfers. Though only available for transfers to data registers, dual-data transfers are extremely useful, because they double the data throughput over single-data transfers. For examples of data flow paths for single- and dual-data transfers, see “Data Access Options” on page 5-45.

Data Access Options

Table 5-8 lists the DSP’s 24 possible memory transfer modes that stem from the DSP’s data access options. When looking at Table 5-8, it is important to note that Long and Short word addressing may not target multiprocessor memory space or external memory space.

Table 5-8.
The 24 Possible Memory Transfer Modes

Access Type    DSP Mode       Long Word      Extended Precision   Normal Word    Short Word
               Address Space  PM     DM      PM     DM            PM     DM      PM     DM
Single Data    SISD mode      LW     none    EW     none          NW     none    SW     none
Access                        none   LW      none   EW            none   NW      none   SW
               SIMD mode      LW     none    EW     none          LW     none    SWx2   none
                              none   LW      none   EW            none   LW      none   SWx2
               B-cast Load    LW     none    EW     none          NW     none    SW     none
                              none   LW      none   EW            none   NW      none   SW
Dual Data      SISD mode      LW     LW      EW     EW            NW     NW      SW     SW
Access         SIMD mode      LW     LW      EW     EW            LW     LW      SWx2   SWx2
               B-cast Load    LW     LW      EW     EW            NW     NW      SW     SW

Symbols: LW = 64-bit data value (two 32-bit values), EW = 40-bit data value (48-bit value), NW = 32-bit data value, SW = 16-bit data value, and SWx2 = two 16-bit data values.

Table 5-8 shows the transfer modes that stem from the following data access options:

• The mode of the DSP: SISD, SIMD, or Broadcast Load
• The size of access words: Long, extended precision Normal word, Normal word, or Short word
• The number of transferred words: single- or dual-data

Table 5-9 provides a cross reference to examples of each memory access option.

Table 5-9. Memory Transfer Modes Cross Reference

Access Type    DSP Mode      Long Word      Extended Precision   Normal Word    Short Word
Single Data    SISD mode     on page 5-69   on page 5-62         on page 5-55   on page 5-47
Access         SIMD mode     on page 5-69   on page 5-64         on page 5-57   on page 5-48
               B-cast Load   Figure 5-23    Figure 5-23          Figure 5-21    Figure 5-16
Dual Data      SISD mode     on page 5-69   on page 5-59         on page 5-59   on page 5-51
Access         SIMD mode     on page 5-69   on page 5-64         on page 5-59   on page 5-53
               B-cast Load   Figure 5-26    Figure 5-24          Figure 5-21    Figure 5-17

Short Word Addressing of Single Data in SISD Mode

Figure 5-14 on page 5-49 displays one possible SISD mode, single data, Short word addressed access. For Short word addressing, the DSP treats the data buses as four 16-bit Short word lanes.
The 16-bit value for the Short word access transfers using the least significant Short word lane of the PM or DM data bus. The DSP drives the other Short word lanes of the data buses with zeros.

In Figure 5-14 on page 5-49, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 whose Short word address has “00” for its least significant two bits of address. Other accesses within this 4-column location have addresses with least significant two bits of “01”, “10”, or “11” and select WORD X1, WORD X2, or WORD X3 from memory respectively. The syntax targets register RX in PEx. The example would target a PEy register if using the syntax SX.

The cross (†) in Figure 5-14 indicates that the DSP zero-fills or sign-extends the most significant 16 bits of the data register while loading the Short word value into a 40-bit data register. The selection depends on the state of the SSE bit in the MODE1 system register. The least significant 8 bits of the data register are always zero.

Short Word Addressing of Single Data in SIMD Mode

Figure 5-15 displays one possible SIMD mode, single data, Short word addressed access. For Short word addressing, the DSP treats the data buses as four 16-bit Short word lanes. The explicitly addressed (named in the instruction) 16-bit value transfers using the least significant Short word lane of the PM or DM data bus. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) Short word value transfers using the bits 47-32 Short word lane of the PM or DM data bus. The DSP drives the other Short word lanes of the PM or DM data buses with zeros.

In Figure 5-15, the explicit access targets the named register, RX, and the implicit access targets that register’s complementary register, SX. This case uses a PEx register with an RX mnemonic.
If the syntax named a PEy register, SX, as the explicit target, the DSP would use that register’s complement, RX, as the implicit target. For more information on complementary registers, see “Secondary Processing Element (PEy)” on page 2-35.

The cross (†) in Figure 5-15 indicates that the DSP zero-fills or sign-extends the most significant 16 bits of the data register while loading the Short word value into a 40-bit data register. The selection depends on the state of the SSE bit in the MODE1 system register. The least significant 8 bits of the data register are always zero.

Figure 5-14. Short Word Addressing of Single Data in SISD Mode
[Data-flow diagram. Example instruction: RX = DM(Short word X0 address); The DM data bus carries 0x0000, 0x0000, 0x0000, WORD X0 on bits 63-48, 47-32, 31-16, 15-0, and the PM data bus is not accessed. RX receives 0x0000† (bits 39-24), WORD X0 (bits 23-8), and 0x00 (bits 7-0). Other instructions with similar data flows for SISD, Short word, single-data transfers are: UREG = PM(Short word address); UREG = DM(Short word address); PM(Short word address) = UREG; DM(Short word address) = UREG;]

Figure 5-15. Short Word Addressing of Single Data in SIMD Mode
[Data-flow diagram. Example instruction: RX = DM(Short word X0 address); The DM data bus carries 0x0000, WORD X2, 0x0000, WORD X0 on bits 63-48, 47-32, 31-16, 15-0, and the PM data bus is not accessed. RX receives 0x0000†, WORD X0, 0x00 and SX receives 0x0000†, WORD X2, 0x00. Other instructions with similar data flows for SIMD, Short word, single-data transfers are: UREG = PM(Short word address); UREG = DM(Short word address); PM(Short word address) = UREG; DM(Short word address) = UREG;]

Figure 5-15 shows the data path for one transfer, but in this mode it is also important to note the pattern of iterative accesses. For Short word accesses, the DSP accesses Short words sequentially in memory. Table 5-10 shows the pattern of SIMD mode Short word accesses. For more information on arranging data in memory to take advantage of this access pattern, see “Arranging Data in Memory” on page 5-84.

Table 5-10. Short Word Addressing in SIMD Mode

Explicit Short Word Accessed          Implicit Short Word Accessed
Word X0 (Address two LSBs = 00)       Word X2 (Address two LSBs = 10)
Word X1 (Address two LSBs = 01)       Word X3 (Address two LSBs = 11)
Word X2 (Address two LSBs = 10)       Word X4 (Address two LSBs = 00)
Word X3 (Address two LSBs = 11)       Word X5 (Address two LSBs = 01)

Short Word Addressing of Dual-Data in SISD Mode

Figure 5-16 displays one possible SISD mode, dual-data, Short word addressed access. For Short word addressing, the DSP treats the data buses as four 16-bit Short word lanes. The 16-bit values for Short word accesses transfer using the least significant Short word lanes of the PM and DM data buses. The DSP drives the other Short word lanes of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of Short word, Normal word, extended precision Normal word, or Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SISD Mode” on page 5-72.

In Figure 5-16, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 in block 1 and WORD Y0 in block 0. Each of these words has a Short word address with “00” for its least significant two bits of address.
Other accesses within these 4-column locations have addresses with least significant two bits of “01”, “10”, or “11” and select WORD X/Y1, WORD X/Y2, or WORD X/Y3 from memory respectively. The syntax targets registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.

The cross (†) in Figure 5-16 indicates that the DSP zero-fills or sign-extends the most significant 16 bits of the data register while loading a Short word value into a 40-bit data register. The selection depends on the state of the SSE bit in the MODE1 system register. The least significant 8 bits of the data register are always zero.

Figure 5-16. Short Word Addressing of Dual Data in SISD Mode
[Data-flow diagram. Example instruction: RX = DM(Short word X0 address), RY = PM(Short word Y0 address); The DM data bus carries WORD X0 and the PM data bus carries WORD Y0 on bits 15-0, and the other lanes carry 0x0000. RX receives 0x0000†, WORD X0, 0x00 and RY receives 0x0000†, WORD Y0, 0x00. Other instructions with similar data flows for SISD, Short word, dual-data transfers are: DREG = PM(Short word address), DREG = DM(Short word address); PM(Short word address) = DREG, DM(Short word address) = DREG;]

Short Word Addressing of Dual-Data in SIMD Mode

Figure 5-17 displays one possible SIMD mode, dual-data, Short word addressed access.
For Short word addressing, the DSP treats the data buses as four 16-bit Short word lanes. The explicitly addressed (named in the instruction) 16-bit values transfer using the least significant Short word lanes of the PM and DM data buses. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) Short word values transfer using the bits 47-32 Short word lanes of the PM and DM data buses. The DSP drives the other Short word lanes of the PM and DM data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SIMD mode dual-data accesses can handle combinations of Short word and Normal word or extended precision Normal word and Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SIMD Mode” on page 5-74.

In Figure 5-17, the explicit accesses target the named registers RX and RA, and the implicit accesses target those registers’ complementary registers, SX and SA. This case uses PEx registers with the RX and RA mnemonics. If the syntax named PEy registers SX and SA as the explicit targets, the DSP would use those registers’ complements, RX and RA, as the implicit targets.
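The lane assignment described above can be illustrated by packing two Short words into a 64-bit bus image. The following Python sketch is a hypothetical model for illustration only. The function names are invented here, and the bus is represented as a plain integer whose bit 0 corresponds to data bus bit 0.

```python
def pack_simd_short_words(explicit_word, implicit_word):
    """Build a 64-bit bus image for a SIMD Short word transfer.

    The explicit Short word occupies bits 15-0 and the implicit Short word
    occupies bits 47-32; the remaining two lanes are driven with zeros.
    """
    return ((implicit_word & 0xFFFF) << 32) | (explicit_word & 0xFFFF)


def short_word_lanes(bus_image):
    """Split a 64-bit bus image into its four 16-bit lanes.

    Returns the lanes in the order bits 63-48, 47-32, 31-16, 15-0.
    """
    return [(bus_image >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]
```

In a dual-data transfer, the same packing would apply independently to the PM and DM data buses.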
For more information on complementary registers, see “Secondary Processing Element (PEy)” on page 2-35.

The cross (†) in Figure 5-17 indicates that the DSP zero-fills or sign-extends the most significant 16 bits of the data registers while loading the Short word values into the 40-bit data registers. The selection depends on the state of the SSE bit in the MODE1 system register. The least significant 8 bits of the data register are always zero.

Figure 5-17. Short Word Addressing of Dual Data in SIMD Mode
[Data-flow diagram. Example instruction: RX = DM(Short word X0 address), RA = PM(Short word Y0 address); The DM data bus carries 0x0000, WORD X2, 0x0000, WORD X0 and the PM data bus carries 0x0000, WORD Y2, 0x0000, WORD Y0 on bits 63-48, 47-32, 31-16, 15-0. RX receives 0x0000†, WORD X0, 0x00 and SX receives 0x0000†, WORD X2, 0x00; WORD Y0 and WORD Y2 load a PEx register and its PEy complement in the same way. Other instructions with similar data flows for SIMD, Short word, dual-data transfers are: DREG = PM(Short word address), DREG = DM(Short word address); PM(Short word address) = DREG, DM(Short word address) = DREG;]

Figure 5-17 shows the data path for one transfer, but in this mode it is also important to note the pattern of iterative accesses. For Short word accesses, the DSP accesses Short words sequentially in memory. Table 5-10 on page 5-51 shows the pattern of SIMD mode Short word accesses.
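The access pattern in Table 5-10 and the register load behavior marked by the cross (†) can be modeled together. The following Python sketch is illustrative only: the function names and the sample address are hypothetical, and the 40-bit register is represented as an integer with the Short word in bits 23-8.

```python
def simd_short_word_addresses(explicit_addr):
    """Return (explicit, implicit) Short word addresses per Table 5-10.

    The implicit Short word sits two Short word locations above the
    explicit one (X0 pairs with X2, X1 with X3, and so on).
    """
    return explicit_addr, explicit_addr + 2


def load_short_word(value, sse):
    """Model loading a 16-bit Short word into a 40-bit data register.

    Bits 39-24 are sign-extended when sse is True (SSE=1) and zero-filled
    otherwise; bits 23-8 hold the Short word; bits 7-0 are always zero.
    """
    value &= 0xFFFF
    upper = 0xFFFF if (sse and value & 0x8000) else 0x0000
    return (upper << 24) | (value << 8)
```

For instance, with a hypothetical explicit address of 0x80001, the model gives 0x80003 as the implicit address, matching the "01" and "11" row of the table.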
For more information on arranging data in memory to take advantage of this access pattern, see “Arranging Data in Memory” on page 5-84.

32-Bit Normal Word Addressing of Single Data in SISD Mode

Figure 5-18 displays one possible SISD mode, single data, 32-bit Normal word addressed access. For Normal word addressing, the DSP treats the data buses as two 32-bit Normal word lanes. The 32-bit value for the Normal word access transfers using the least significant Normal word lane of the PM or DM data bus. The DSP drives the other Normal word lanes of the data buses with zeros.

In Figure 5-18, the access targets a PEx register in a SISD mode operation. This case accesses WORD X0 whose Normal word address has “0” for its least significant address bit. The other access within this 4-column location has an address with a least significant bit of “1” and selects WORD X1 from memory. The syntax targets register RX in PEx. The example would target a PEy register if using the syntax SX.

For Normal word accesses, the DSP zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.
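The zero-fill and truncate behavior for Normal word accesses amounts to an 8-bit shift between the 32-bit memory word and the 40-bit data register. The following Python sketch models that relationship. The function names are hypothetical, and registers and memory words are represented as plain integers.

```python
def load_normal_word(mem_word):
    """Model a 32-bit Normal word load into a 40-bit data register.

    The memory word lands in bits 39-8; bits 7-0 are zero-filled.
    """
    return (mem_word & 0xFFFFFFFF) << 8


def store_normal_word(reg40):
    """Model a Normal word store from a 40-bit data register.

    The least significant 8 bits of the register are truncated.
    """
    return (reg40 >> 8) & 0xFFFFFFFF
```

One consequence visible in the model is that a store followed by a load does not preserve the low 8 bits of a 40-bit register value.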
Figure 5-18. Normal Word Addressing of Single Data in SISD Mode
[Data-flow diagram.]

32-Bit Normal Word Addressing of Single Data in SIMD Mode

Figure 5-19 displays one possible SIMD mode, single data, Normal word addressed access.

Figure 5-19 shows the data path for one transfer, but in this mode it is also important to note the pattern of iterative accesses. For Normal word accesses, the DSP accesses Normal words sequentially in memory. Table 5-11 on page 5-57 shows the pattern of SIMD mode Normal word accesses.

Table 5-11. Normal Word Addressing in SIMD Mode

Explicit Normal Word Accessed     Implicit Normal Word Accessed
Word X0 (Address LSB = 0)         Word X1 (Address LSB = 1)
Word X1 (Address LSB = 1)         Word X2 (Address LSB = 0)

For Normal word addressing, the DSP treats the data buses as two 32-bit Normal word lanes.
The explicitly addressed (named in the instruction) 32-bit value transfers using the least significant Normal word lane of the PM or DM data bus. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) Normal word value transfers using the most significant Normal word lane of the PM or DM data bus. The DSP drives the other Normal word lanes of the data buses with zeros.

In Figure 5-19, the explicit access targets the named register RX, and the implicit access targets that register’s complementary register SX. This case uses a PEx register with an RX mnemonic. If the syntax named a PEy register SX as the explicit target, the DSP would use that register’s complement, RX, as the implicit target. For more information on complementary registers, see “Secondary Processing Element (PEy)” on page 2-35.

For Normal word accesses, the DSP zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.

For more information on arranging data in memory to take advantage of this access pattern, see “Arranging Data in Memory” on page 5-84.
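The SIMD mode Normal word pattern can be sketched as an address pair plus a two-lane bus image. The following Python model is illustrative only: the function names and the sample values are hypothetical, and the bus is represented as an integer whose bit 0 corresponds to data bus bit 0.

```python
def simd_normal_word_addresses(explicit_addr):
    """Return (explicit, implicit) Normal word addresses per Table 5-11.

    The implicit Normal word is the next sequential Normal word in memory.
    """
    return explicit_addr, explicit_addr + 1


def pack_simd_normal_words(explicit_word, implicit_word):
    """Build a 64-bit bus image for a SIMD Normal word transfer.

    The explicit word uses the least significant 32-bit lane (bits 31-0)
    and the implicit word uses the most significant lane (bits 63-32).
    """
    return ((implicit_word & 0xFFFFFFFF) << 32) | (explicit_word & 0xFFFFFFFF)
```

Because each explicit access also consumes the following Normal word, a loop that steps its index by two per access would visit every word exactly once in this model.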
Figure 5-19. Normal Word Addressing of Single Data in SIMD Mode
[Data-flow diagram. Example instruction: RX = DM(Normal word X0 address); The DM data bus carries WORD X1 on bits 63-32 and WORD X0 on bits 31-0, and the PM data bus is not accessed. RX receives WORD X0 with 0x00 in bits 7-0, and SX receives WORD X1 with 0x00 in bits 7-0. Other instructions with similar data flows for SIMD, Normal word, single-data transfers are: UREG = PM(Normal word address); UREG = DM(Normal word address); PM(Normal word address) = UREG; DM(Normal word address) = UREG;]

32-Bit Normal Word Addressing of Dual Data in SISD Mode

Figure 5-20 displays one possible SISD mode, dual data, 32-bit Normal word addressed access. For Normal word addressing, the DSP treats the data buses as two 32-bit Normal word lanes. The 32-bit values for Normal word accesses transfer using the least significant Normal word lanes of the PM and DM data buses. The DSP drives the other Normal word lanes of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of Short word, Normal word, extended precision Normal word, or Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SISD Mode” on page 5-72.

In Figure 5-20, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 in block 1 and WORD Y0 in block 0. Each of these words has a Normal word address with “0” for its least significant address bit.
Other accesses within these 4-column locations have addresses with a least significant bit of “1” and select WORD X/Y1 from memory. The syntax targets registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.

For Normal word accesses, the DSP zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.

Figure 5-20. Normal Word Addressing of Dual Data in SISD Mode
[Data-flow diagram. Example instruction: RX = DM(Normal word X0 address), RY = PM(Normal word Y0 address); The DM data bus carries WORD X0 and the PM data bus carries WORD Y0 on bits 31-0, with 0x0000 on the upper lanes. RX receives WORD X0 and RY receives WORD Y0, each with 0x00 in bits 7-0. Other instructions with similar data flows for SISD, Normal word, dual-data transfers are: DREG = PM(Normal word address), DREG = DM(Normal word address); PM(Normal word address) = DREG, DM(Normal word address) = DREG;]

32-Bit Normal Word Addressing of Dual Data in SIMD Mode

Figure 5-21 displays one possible SIMD mode, dual data, 32-bit Normal word addressed access. For Normal word addressing, the DSP treats the data buses as two 32-bit Normal word lanes. The explicitly addressed (named in the instruction) 32-bit values transfer using the least significant Normal word lanes of the PM and DM data buses. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) Normal word values transfer using the most significant Normal word lanes of the PM and DM data buses. The DSP drives the other Normal word lanes of the PM and DM data buses with zeros.

Figure 5-21. Normal Word Addressing of Dual Data in SIMD Mode
[Data-flow diagram. Example instruction: RX = DM(Normal word X0 address), RY = PM(Normal word Y0 address); The DM data bus carries WORD X1 on bits 63-32 and WORD X0 on bits 31-0, and the PM data bus carries WORD Y1 on bits 63-32 and WORD Y0 on bits 31-0. RX receives WORD X0 and SX receives WORD X1, each with 0x00 in bits 7-0; WORD Y0 and WORD Y1 load a PEx register and its PEy complement in the same way. Other instructions with similar data flows for SIMD, Normal word, dual-data transfers are: DREG = PM(Normal word address), DREG = DM(Normal word address); PM(Normal word address) = DREG, DM(Normal word address) = DREG;]

Note that the accesses on both buses do not have to be the same word width. SIMD mode dual-data accesses can handle combinations of Short word and Normal word or extended precision Normal word and Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SIMD Mode” on page 5-74.

In Figure 5-21, the explicit accesses target the named registers RX and RA, and the implicit accesses target those registers’ complementary registers SX and SA. This case uses PEx registers with the RX and RA mnemonics. If the syntax named PEy registers SX and SA as the explicit targets, the DSP would use those registers’ complements RX and RA as the implicit targets.
For more information on complementary registers, see “Secondary Processing Element (PEy)” on page 2-35.

For Normal word accesses, the DSP zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.

Figure 5-21 shows the data path for one transfer, but in this mode it is also important to note the pattern of iterative accesses. For Normal word accesses, the DSP accesses Normal words sequentially in memory. Table 5-11 on page 5-57 shows the pattern of SIMD mode Normal word accesses. For more information on arranging data in memory to take advantage of this access pattern, see “Arranging Data in Memory” on page 5-84.

Extended Precision Normal Word Addressing of Single Data

Figure 5-22 displays one possible single data, 40-bit extended precision Normal word addressed access. For extended precision Normal word addressing, the DSP treats each data bus as a 40-bit extended precision Normal word lane. The 40-bit value for the extended precision Normal word access transfers using the most significant 40 bits of the PM or DM data bus. The DSP drives the lower 24 bits of the data buses with zeros.

Figure 5-22. Extended Precision Normal Word Addressing of Single Data
[Data-flow diagram. Example instruction: RX = DM(Extended precision Normal word X0 address); The DM data bus carries WORD X0 on its most significant 40 bits and zeros on its lower 24 bits, and the PM data bus is not accessed. RX receives WORD X0. Other instructions with similar data flows for SISD or SIMD, extended precision Normal word, single-data transfers are: UREG = PM(Extended precision Normal word address); UREG = DM(Extended precision Normal word address); PM(Extended precision Normal word address) = UREG; DM(Extended precision Normal word address) = UREG;]

In Figure 5-22, the access targets a PEx register in a SISD or SIMD mode operation; extended precision Normal word single-data accesses operate the same in SISD or SIMD mode. This case accesses WORD X0 with syntax that targets register RX in PEx. The example would target a PEy register if using the syntax SX.

Extended Precision Normal Word Addressing of Dual Data in SISD Mode

Figure 5-23 displays one possible SISD mode, dual data, 40-bit extended precision Normal word addressed access. For extended precision Normal word addressing, the DSP treats each data bus as a 40-bit extended precision Normal word lane. The 40-bit values for the extended precision Normal word accesses transfer using the most significant 40 bits of the PM and DM data buses. The DSP drives the lower 24 bits of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of Short word, Normal word, extended precision Normal word, or Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SISD Mode” on page 5-72.

In Figure 5-23, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 in block 1 and WORD Y0 in block 0 with syntax that targets registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.

Extended Precision Normal Word Addressing of Dual Data in SIMD Mode

Figure 5-24 displays one possible SIMD mode, dual data, 40-bit extended precision Normal word addressed access.
For extended precision Normal word addressing, the DSP treats each data bus as a 40-bit extended precision Normal word lane. 5-64 ADSP-21160 SHARC DSP Hardware Reference Memory MEMORY BLOCK 0 (PM) … … … … … … … … … … … … … … … … … … … … … … … … WORD Y3 WORD X3 WORD Y2 WORD Y2 ADDRESS ADDRESS BLOCK 1 (DM) WORD Y1 WORD Y1 WORD Y0 WORD X2 63-48 47-32 31-16 WORD Y0 PEX REGISTERS 39-24 0X00 7-0 63-48 DM DATA BUS 0X0000 23-8 47-32 31-16 WORD X0 RA 39-24 WORD X0 EXTENDED PRECISION NORMAL WORD ACCESS 15-0 RB 23-8 WORD X1 WORD X1 EXTENDED PRECISION NORMAL WORD ACCESS PM DATA BUS WORD X2 RY 7-0 39-24 23-8 7-0 RX 39-24 WORD Y0 PEY REGISTERS 39-24 SB 23-8 7-0 SA 39-24 23-8 15-0 0X00 0X0000 23-8 WORD X0 SY 7-0 39-24 23-8 7-0 7-0 SX 39-24 23-8 7-0 THE ABOVE EXAMPLE SHOWS THE DATA FLOW FOR INSTRUCTION: RX = DM(EP NORMAL WORD X0 ADDR.), RY = PM(EP NORMAL WORD Y0 ADDR.); OTHER INSTRUCTIONS WITH SIMILAR DATA FLOWS FOR SISD, EXTENDED PRECISION NORMAL WORD, DUAL-DATA TRANSFERS ARE: DREG = PM(EXT. PREC. NORMAL WORD ADDRESS), PM(EXT. PREC. NORMAL WORD ADDRESS) = DREG, DREG = DM(EXT. PREC. NORMAL WORD ADDRESS); DM(EXT. PREC. NORMAL WORD ADDRESS) = DREG; Figure 5-23. 
Extended Precision Normal Word Addressing of Dual Data in SISD Mode ADSP-21160 SHARC DSP Hardware Reference 5-65 Accessing Memory MEMORY BLOCK 0 (PM) … … … … … … … … … … … … … … … … … … … … … … … … WORD Y3 WORD X3 WORD Y2 WORD Y2 ADDRESS ADDRESS BLOCK 1 (DM) WORD Y1 WORD Y1 WORD Y0 WORD X2 63-48 PEX REGISTERS 39-24 47-32 31-16 WORD Y0 7-0 63-48 15-0 DM DATA BUS 23-8 47-32 31-16 WORD X0 RA 39-24 WORD X0 EXTENDED PRECISION NORMAL WORD ACCESS 0X00 0X0000 RB 23-8 WORD X1 WORD X1 EXTENDED PRECISION NORMAL WORD ACCESS PM DATA BUS WORD X2 15-0 0X00 0X0000 RY 7-0 39-24 23-8 7-0 RX 39-24 23-8 7-0 WORD X0 PEY REGISTERS 39-24 SB 23-8 7-0 SA 39-24 23-8 SY 7-0 39-24 23-8 7-0 SX 39-24 23-8 7-0 WORD Y0 THE ABOVE EXAMPLE SHOWS THE DATA FLOW FOR INSTRUCTION: RX = DM(EP NORMAL WORD X0 ADDR.), SX = PM(EP NORMAL WORD Y0 ADDR.); OTHER INSTRUCTIONS WITH SIMILAR DATA FLOWS FOR SIMD, EXTENDED PRECISION NORMAL WORD, DUAL-DATA TRANSFERS ARE: PEY DREG = PM(EP NORMAL WORD ADDRESS), PM(EP NORMAL WORD ADDRESS) = PEY DREG, PEX DREG = DM(EP NORMAL WORD ADDRESS); DM(EP NORMAL WORD ADDRESS) = PEX DREG; Figure 5-24. Extended Precision Normal Word Addressing of Dual Data in SIMD Mode 5-66 ADSP-21160 SHARC DSP Hardware Reference Memory Because this word size approaches the limit of the data buses capacity, this SIMD mode transfer only moves the explicitly addressed locations and restricts data bus usage. The explicitly addressed (named in the instruction) 40-bit values transferred over the DM bus must source or sink a PEx data register, and the explicitly addressed (named in the instruction) 40-bit values transferred over the PM bus must source or sink a PEy data register; there are no implicit transfers in this mode. The 40-bit values for the extended precision Normal word accesses transfer using the most significant 40 bits of the PM and DM data bus. The DSP drives the lower 24 bits of the data buses with zeros. accesses on both buses do not have to be the same word width. 
The This special case of SIMD mode dual-data accesses can handle any combination of extended precision Normal word or Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SIMD Mode” on page 5-74. In Figure 5-24, the access targets PEx and PEy registers in a SIMD mode operation. This case accesses WORD X0 in block 1 with syntax that targets register RX in PEx and accesses WORD Y0 in block 0 with syntax that targets register SX in PEy. Long Word Addressing of Single Data Figure 5-25 displays one possible single data, Long word addressed access. For Long word addressing, the DSP treats each data bus as a 64-bit Long word lane. The 64-bit value for the Long word access transfers using the full width of the PM or DM data bus. In Figure 5-25, the access targets a PEx register in a SISD or SIMD mode operation; Long word single-data access operate the same in SISD or SIMD mode. This case accesses WORD X0 with syntax that explicitly targets register RX and implicitly targets its neighbor register RY in PEx. The example would target PEy registers if using the syntax SX. For more ADSP-21160 SHARC DSP Hardware Reference 5-67 Accessing Memory MEMORY BLOCK 0 (PM) BLOCK 1 (DM) … … … … … … … … … … … … … … … … … … … … … … … … WORD X2 ADDRESS ADDRESS WORD Y2 WORD Y1 WORD Y0 WORD X1 WORD X0 NO ACCESS 63-48 47-32 31-16 LONG WORD ACCESS 15-0 63-48 PM DATA BUS PEX REGISTERS 39-24 RB 23-8 PEY REGISTERS 39-24 47-32 DM DATA BUS 7-0 RA 39-24 23-8 SB 23-8 7-0 7-0 23-8 7-0 15-0 RY 39-24 23-8 7-0 WORD X0, 63-32 0X00 39-24 7-0 SA 39-24 31-16 WORD X0 RX 39-24 23-8 WORD X0, 31-0 SY 23-8 7-0 0X00 SX 39-24 23-8 7-0 THE ABOVE EXAMPLE SHOWS THE DATA FLOW FOR INSTRUCTION: RX = DM(LONG WORD X0 ADDRESS); OTHER INSTRUCTIONS WITH SIMILAR DATA FLOWS FOR SISD OR SIMD, LONG WORD, SINGLE-DATA TRANSFERS ARE: UREG = PM(LONG WORD ADDRESS); UREG = DM(LONG WORD ADDRESS); PM(LONG WORD ADDRESS) = UREG; DM(LONG WORD ADDRESS) = UREG; Figure 5-25. 
Long Word Addressing of Single Data 5-68 ADSP-21160 SHARC DSP Hardware Reference Memory information on how neighbor registers (listed in Table 5-7 on page 5-42) work, see “Long Word (64-Bit) Accesses” on page 5-41. Long Word Addressing of Dual Data in SISD Mode Figure 5-26 displays one possible SISD mode, dual data, long word addressed access. For Long word addressing, the DSP treats each data bus as a 64-bit Long word lane. The 64-bit values for the Long word accesses transfer using the full width of the PM or DM data bus. In Figure 5-26, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 and WORD Y0 with syntax that explicitly targets registers RX registers RA and implicitly targets their neighbor registers RY and RB in PEx. The example would target PEy registers if using the syntax SX and SA. For more information on how neighbor registers (listed in Table 5-7 on page 5-42) work, see “Long Word (64-Bit) Accesses” on page 5-41. Programs must be careful not to explicitly target neighbor registers in this case. While the syntax lets programs target these registers, one of the explicit accesses targets the other access’s implicit target. The DSP resolves this conflict by performing only the access with higher priority. For more information on the priority order of data register file accesses, see “Data Register File” on page 2-28. Long Word Addressing of Dual Data in SIMD Mode Figure 5-27 displays one possible SIMD mode, dual data, Long word addressed access targeting internal memory space. For Long word addressing, the DSP treats each data bus as a 64-bit Long word lane. The 64-bit values for the Long word accesses transfer using the full width of the PM or DM data bus. 
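The bus and register placement described above for extended precision Normal word and Long word accesses can be sketched in C. This is only an illustrative model, not device code: the function and type names are hypothetical, and the assignment of Long word bits 31-0 to the named register and bits 63-32 to its neighbor is an assumption read from the RX = DM(LONG WORD X0 ADDRESS) example in Figure 5-25.

```c
#include <assert.h>
#include <stdint.h>

/* Bus image of a 40-bit extended precision Normal word: the value
   occupies bus bits 63-24 and bits 23-0 are driven as zeros. */
uint64_t ep_word_to_bus(uint64_t value40)
{
    assert((value40 >> 40) == 0);  /* hypothetical precondition: 40 bits max */
    return value40 << 24;
}

/* Images of the two 40-bit data registers after a Long word load. */
typedef struct {
    uint64_t named_reg;     /* explicit register: Long word bits 31-0  */
    uint64_t neighbor_reg;  /* implicit neighbor: Long word bits 63-32 */
} long_word_regs;

long_word_regs long_word_to_regs(uint64_t long_word)
{
    long_word_regs r;
    /* Each 32-bit half lands in register bits 39-8; bits 7-0 are 0x00. */
    r.named_reg    = (long_word & 0xFFFFFFFFull) << 8;
    r.neighbor_reg = (long_word >> 32) << 8;
    return r;
}
```

For example, under these assumptions a Long word of 0xAABBCCDD11223344 would leave 0x1122334400 in the named register and 0xAABBCCDD00 in its neighbor.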
Figure 5-26. Long Word Addressing of Dual Data in SISD Mode
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS), RA = PM(LONG WORD Y0 ADDRESS); Other instructions with similar data flows for SISD, Long word, dual-data transfers are: DREG = PM(LONG WORD ADDRESS); PM(LONG WORD ADDRESS) = DREG; DREG = DM(LONG WORD ADDRESS); DM(LONG WORD ADDRESS) = DREG;)

Figure 5-27. Long Word Addressing of Dual Data in SIMD Mode
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS), SX = PM(LONG WORD Y0 ADDRESS); Other instructions with similar data flows for SIMD, Long word, dual-data transfers are: PEY DREG = PM(LONG WORD ADDRESS); PM(LONG WORD ADDRESS) = PEY DREG; PEX DREG = DM(LONG WORD ADDRESS); DM(LONG WORD ADDRESS) = PEX DREG;)

Because this word size approaches the limit of the data buses' capacity, this SIMD mode transfer only moves the explicitly addressed locations and restricts data bus usage. The explicitly addressed (named in the instruction) 64-bit values transferred over the DM bus must source or sink a PEx data register, and the explicitly addressed (named in the instruction) 64-bit values transferred over the PM bus must source or sink a PEy data register; there are no implicit transfers in this mode.

In Figure 5-27, the access targets PEx and PEy registers in a SIMD mode operation. This case accesses WORD X0 in block 1 with syntax that targets register RX and its neighbor register RY in PEx and accesses WORD Y0 in block 0 with syntax that targets register SX and its neighbor register SY in PEy. For more information on how neighbor registers (listed in “Neighbor Registers for Long Word Accesses” on page 5-42) work, see “Long Word (64-Bit) Accesses” on page 5-41.

Note: The accesses on both buses do not have to be the same word width. This special case of SIMD mode dual-data accesses can handle any combination of extended precision Normal word or Long word accesses. For more information, see “Mixed Word Width Addressing of Dual Data in SIMD Mode” on page 5-74.

Mixed Word Width Addressing of Dual Data in SISD Mode

Figure 5-28 displays an example of a mixed word width, dual-data, SISD mode access. This example shows how the DSP transfers a Long word access on the DM bus and transfers a Short word access on the PM bus. The memory architecture permits mixing all other combinations of dual-data SISD mode Short word, Normal word, extended precision Normal word, and Long word accesses.
Figure 5-28. Mixed Word Width Addressing of Dual Data in SISD Mode
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS), RA = PM(SHORT WORD Y0 ADDRESS); Other instructions with similar data flows for SISD, mixed word, dual-data transfers are: DREG = PM(SHORT, NORMAL, EP NORMAL, LONG ADD); PM(SHORT, NORMAL, EP NORMAL, LONG ADD) = DREG; DREG = DM(SHORT, NORMAL, EP NORMAL, LONG ADD); DM(SHORT, NORMAL, EP NORMAL, LONG ADD) = DREG;)

Note: In case of conflicting dual access to the data register file, the DSP only performs the access with higher priority. For more information on how the DSP prioritizes accesses, see “Data Register File” on page 2-28.

Mixed Word Width Addressing of Dual Data in SIMD Mode

Figure 5-29 displays an example of a mixed word width, dual-data, SIMD mode access. This example shows how the DSP transfers a Long word access on the DM bus and transfers an extended precision Normal word access on the PM bus.

Note: The memory architecture permits mixing SIMD mode dual-data Short word and Normal word accesses or extended precision Normal word and Long word accesses. No other combinations of mixed word dual-data SIMD mode accesses are permissible.
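The mixing rules for dual-data accesses stated in the two sections above can be summarized as a small check. The sketch below is a model of those rules only; the enum and function names are hypothetical and do not correspond to any tool or register on the DSP.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { WORD_SHORT, WORD_NORMAL, WORD_EP_NORMAL, WORD_LONG } word_width;
typedef enum { MODE_SISD, MODE_SIMD } pe_mode;

/* Short and Normal words form one group; extended precision Normal
   and Long words form the other. */
static bool is_short_or_normal(word_width w)
{
    return w == WORD_SHORT || w == WORD_NORMAL;
}

/* Returns true when the DM-bus and PM-bus word widths may be combined
   in one dual-data access, following the rules stated in the text. */
bool dual_access_mix_allowed(pe_mode mode, word_width dm, word_width pm)
{
    assert((int)dm >= 0 && (int)dm <= (int)WORD_LONG);
    assert((int)pm >= 0 && (int)pm <= (int)WORD_LONG);

    if (mode == MODE_SISD) {
        return true;  /* SISD mode permits every combination */
    }
    /* SIMD mode: both widths must come from the same group. */
    return is_short_or_normal(dm) == is_short_or_normal(pm);
}
```

With this model, the Figure 5-28 combination (Long word on DM, Short word on PM) is accepted in SISD mode and rejected in SIMD mode, while the Figure 5-29 combination (Long word on DM, extended precision Normal word on PM) is accepted in SIMD mode.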
Figure 5-29. Mixed Word Width Addressing of Dual Data in SIMD Mode
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS), SX = PM(EP NORMAL WORD Y0 ADDRESS); Other instructions with similar data flows for SIMD, mixed word, dual-data transfers are: DREG = PM(ADDRESS); PM(ADDRESS) = DREG; DREG = DM(ADDRESS); DM(ADDRESS) = DREG; For a list of permissible mixed dual access combinations, see the discussion in the text.)

Broadcast Load Access

Figure 5-30, Figure 5-31 on page 5-77, Figure 5-32 on page 5-78, Figure 5-33 on page 5-79, Figure 5-34 on page 5-80, Figure 5-35 on page 5-81, Figure 5-36 on page 5-82, and Figure 5-37 on page 5-83 provide examples of broadcast load accesses for single- and dual-data transfers. These examples show that the broadcast load’s memory and register access is a hybrid of the corresponding non-broadcast SISD and SIMD mode accesses. The exceptions to this relation are broadcast load dual-data, extended precision Normal word and Long word accesses. These broadcast accesses differ from their corresponding non-broadcast mode accesses.

Figure 5-30. Short Word Addressing of Single Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(SHORT WORD X0 ADDRESS); in which WORD X0 loads into both RX and SX. Other instructions with similar data flows for broadcast, Short word, single-data transfers are: UREG = PM(SHORT WORD ADDRESS); UREG = DM(SHORT WORD ADDRESS); PM(SHORT WORD ADDRESS) = UREG; DM(SHORT WORD ADDRESS) = UREG;)

Figure 5-31. Short Word Addressing of Dual Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(SHORT WORD X0 ADDRESS), RY = PM(SHORT WORD Y0 ADDRESS); Other instructions with similar data flows for broadcast, Short word, dual-data transfers are: DREG = PM(SHORT WORD ADDRESS); PM(SHORT WORD ADDRESS) = DREG; DREG = DM(SHORT WORD ADDRESS); DM(SHORT WORD ADDRESS) = DREG;)

Figure 5-32. Normal Word Addressing of Single Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(NORMAL WORD X0 ADDRESS); Other instructions with similar data flows for broadcast, Normal word, single-data transfers are: UREG = PM(NORMAL WORD ADDRESS); UREG = DM(NORMAL WORD ADDRESS); PM(NORMAL WORD ADDRESS) = UREG; DM(NORMAL WORD ADDRESS) = UREG;)

Figure 5-33. Normal Word Addressing of Dual Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(NORMAL WORD X0 ADDRESS), RY = PM(NORMAL WORD Y0 ADDRESS); Other instructions with similar data flows for broadcast, Normal word, dual-data transfers are: DREG = PM(NORMAL WORD ADDRESS); PM(NORMAL WORD ADDRESS) = DREG; DREG = DM(NORMAL WORD ADDRESS); DM(NORMAL WORD ADDRESS) = DREG;)

Figure 5-34. Extended Precision Normal Word Addressing of Single Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(EXTENDED PRECISION NORMAL WORD X0 ADDRESS); Other instructions with similar data flows for broadcast, extended precision Normal word, single-data transfers are: UREG = PM(EP NORMAL WORD ADDRESS); UREG = DM(EP NORMAL WORD ADDRESS); PM(EP NORMAL WORD ADDRESS) = UREG; DM(EP NORMAL WORD ADDRESS) = UREG;)

Figure 5-35. Extended Precision Normal Word Addressing of Dual Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(EP NORMAL WORD X0 ADDR.), RY = PM(EP NORMAL WORD Y0 ADDR.); Other instructions with similar data flows for broadcast, extended precision Normal word, dual-data transfers are: DREG = PM(EP NORMAL WORD ADDRESS); DREG = DM(EP NORMAL WORD ADDRESS); PM(EP NORMAL WORD ADDRESS) = DREG; DM(EP NORMAL WORD ADDRESS) = DREG;)

Figure 5-36. Long Word Addressing of Single Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS); Other instructions with similar data flows for broadcast, Long word, single-data transfers are: UREG = PM(LONG WORD ADDRESS); UREG = DM(LONG WORD ADDRESS); PM(LONG WORD ADDRESS) = UREG; DM(LONG WORD ADDRESS) = UREG;)

Figure 5-37. Long Word Addressing of Dual Data in Broadcast Load
(Figure not reproduced. It shows the data flow for the instruction RX = DM(LONG WORD X0 ADDRESS), RA = PM(LONG WORD Y0 ADDRESS); Other instructions with similar data flows for broadcast, Long word, dual-data transfers are: DREG = PM(LONG WORD ADDRESS); PM(LONG WORD ADDRESS) = DREG; DREG = DM(LONG WORD ADDRESS); DM(LONG WORD ADDRESS) = DREG;)

Arranging Data in Memory

Each DSP access to internal memory gets data from either a 4-column (Long, Normal, or Short word) or 3-column (instruction or extended precision Normal word) memory location. For more information on how the DSP accesses 4- or 3-column data, see “Memory Organization and Word Size” on page 5-22.

To take advantage of the DSP’s data accesses to 4- and 3-column locations, programs must adjust the interleaving of data into memory locations to accommodate the memory access mode. For more information and examples, see the ADSP-21160 SHARC DSP Instruction Set Reference. The following guidelines provide an overview of how programs should interleave data in memory locations:

• Programs can use odd or even modify values (1, 2, 3, …) to step through a buffer in single- or dual-data, SISD or Broadcast load mode regardless of the data word size (Long word, extended precision Normal word, Normal word, or Short word).

• Programs should use multiple of 4 modify values (4, 8, 12, …) to step through a buffer of Short word data in single- or dual-data, SIMD mode.

• Programs should use multiple of 2 modify values (2, 4, 6, …) to step through a buffer of Normal word data in single- or dual-data, SIMD mode.

• Programs can use odd or even modify values (1, 2, 3, …) to step through a buffer of Long word or extended precision Normal word data in single- or dual-data, SIMD mode.

6 I/O PROCESSOR

The DSP’s I/O processor manages Direct Memory Accessing (DMA) of DSP memory through the external, link, and serial ports. Each DMA operation transfers an entire block of data. By managing DMA, the I/O processor lets programs move data as a background task while using the processor core for other DSP operations.
Overview

The I/O processor’s architecture, which appears in Figure 6-1 on page 6-4, supports a number of DMA operations. These operations include the following transfer types:

• Internal memory ↔ external memory or external peripherals

• Internal memory ↔ internal memory of other DSPs

• Internal memory ↔ host processor

• Internal memory ↔ serial port I/O

• Internal memory ↔ link port I/O

• External memory ↔ external peripherals

Note: This chapter describes the I/O processor and how the I/O processor controls external port, link port, and serial port operations. For information on connecting external devices to the external port, link ports, or serial ports, see “External Port”, “Link Ports”, or “Serial Ports”.

DMA transfers between internal memory and external memory, multiprocessor memory, or a host use the DSP’s external port. For these types of transfers, a program sets up the DMA controller with the internal memory buffer size and address, the address modifier, and the direction of transfer. These DMA setup parameters are the Transfer Control Block (TCB) for the DMA transfer. After setup, the DMA transfer begins when the program enables the channel and continues until the I/O processor transfers the entire buffer to or from DSP memory.

Similarly, DMA transfers between internal memory and link or serial ports have DMA parameters (a TCB). When the I/O processor performs DMA between internal memory and one of these ports, the program sets up the parameters and the I/O goes through the port instead of the external bus.

The direction (receive or transmit) of the I/O port determines the direction of data transfer. When the port receives data, the I/O processor automatically transfers the data to internal memory. When the port needs to transmit a word, the I/O processor automatically fetches the data from internal memory.
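The setup parameters described above (buffer address, buffer size, address modifier, and direction) can be pictured with a small software model. The sketch below is a host-side simulation under stated assumptions, not code for the I/O processor: the structure layout, the names, and the use of array indices in place of real memory addresses are all hypothetical, and the special meaning of a zero count on the hardware is deliberately not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DMA_TO_INTERNAL, DMA_FROM_INTERNAL } dma_direction;

/* Hypothetical stand-in for the parameters a program sets up. */
typedef struct {
    size_t        index;   /* next internal word (array index in this model) */
    ptrdiff_t     modify;  /* signed stride applied after each word */
    size_t        count;   /* words remaining */
    dma_direction dir;     /* direction of transfer */
} dma_params;

/* Moves words between the internal buffer and a port-side stream until
   the whole block has been transferred; returns the words moved. */
size_t dma_run(dma_params *p, uint32_t *internal, size_t internal_len,
               uint32_t *stream)
{
    size_t moved = 0;

    assert(p->count > 0);  /* zero count is not modeled in this sketch */
    while (p->count != 0) {
        assert(p->index < internal_len);
        if (p->dir == DMA_TO_INTERNAL) {
            internal[p->index] = stream[moved];
        } else {
            stream[moved] = internal[p->index];
        }
        p->index = (size_t)((ptrdiff_t)p->index + p->modify);  /* post-modify */
        p->count--;
        moved++;
    }
    return moved;
}
```

A transmit-style block of three words starting at index 1 with a stride of 2 would, in this model, read internal words 1, 3, and 5 and leave the count at zero.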
The I/O processor also lets the DSP system perform DMA transfers between an external device and external memory. This external-to-external transfer only uses the external port and I/O processor.

External devices can control external port DMA transfers in two ways. If the external device can handle bus mastership, the external device can master reads or writes to DMA buffers on the DSP. External devices can also assert a DMA Request input (DMARx) to request service.

To further minimize loading on the processor core, the I/O processor supports chained DMA operations. When using chained DMA, a program can set up a DMA transfer to automatically set up and start the next DMA transfer after the current one completes.

Figure 6-1 shows the DSP’s I/O processor, related ports, and buses. Figure 6-8 on page 6-68 shows more detail on DMA channel data paths.

The Data Buffer Registers column in Figure 6-1 shows the data buffer registers for each port. These registers include:

• External Port Buffer registers (EPBx). These 64-bit buffers for the external port have eight-position FIFOs for transmitting or receiving data when interfacing with a host or external devices such as memory and memory-mapped devices.

• Link Port Buffer registers (LBUFx). These buffers for the link ports have two-position FIFOs for transmitting or receiving DMA data when connected to another link port.

• Serial Port Receive Buffer registers (RXx). These receive buffers for the serial ports have two-position FIFOs for receiving data when connected to another serial device.

• Serial Port Transmit Buffer registers (TXx). These transmit buffers for the serial ports have two-position FIFOs for transmitting data when connected to another serial device.
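The chained DMA operation mentioned earlier in this overview, in which each transfer sets up and starts the next one, behaves much like walking a linked list of parameter blocks. The following sketch illustrates only that idea: the structure, its fields, and the use of a NULL pointer to end the chain are assumptions made for this model and are not the device's TCB layout or chain pointer encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block for one DMA sequence in a chain. */
typedef struct dma_block {
    uint32_t words;                 /* words moved by this sequence */
    const struct dma_block *next;   /* next block; NULL ends the chain (assumption) */
} dma_block;

/* Walks the chain the way chained DMA proceeds: when one sequence
   completes, the next block's parameters are loaded and it starts. */
uint32_t dma_chain_total_words(const dma_block *first)
{
    uint32_t total = 0;
    const dma_block *b;

    for (b = first; b != NULL; b = b->next) {
        assert(b->words > 0);  /* each sequence in this model moves data */
        total += b->words;
    }
    return total;
}
```

The point of the sketch is that the processor core only has to prepare the blocks and start the first sequence; the remaining sequences follow without further core intervention.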
Figure 6-1. I/O Processor Block Diagram
(Figure not reproduced. The diagram connects the ports to internal memory address and data over the IOA and IOD buses, shows the IRPTL, LIRPTL, and DMASTAT registers and the DMARx and DMAGx signals, and groups the registers for each port into three columns, as labeled in the figure:
Serial ports 1-0: DMA parameter registers II3-0, IM3-0, C3-0, CP3-0, GP3-0, DA3-0, DB3-0; port, buffer, and DMA control registers SRCTL1-0, STCTL1-0; buffer data registers TX1-0, RX1-0.
Link ports 5-0: DMA parameter registers II9-4, IM9-4, C9-4, CP9-4, GP9-4, DA9-4, DB9-4; port, buffer, and DMA control registers LCTL1-0, LAR, LCOM; buffer data registers LBUF5-0.
External port: DMA parameter registers II13-10, IM13-10, C13-10, CP13-10, GP13-10, EI13-10, EM13-10, EC13-10; port, buffer, and DMA control registers SYSCON, WAIT, DMAC13-10; buffer data registers EP3-0.)

The Port, Buffer, and DMA Control Registers column in Figure 6-1 on page 6-4 shows the control registers for the ports and DMA channels. These registers include:

• System Configuration register (SYSCON). This register configures packing, priority, and word order for the external port.

• Waitstate and Access Mode register (WAIT). This register configures handshake, idle cycle insertion, and waitstate insertion for external memory DMA accesses.

• External Port DMA Control registers (DMACx). These control registers for each external port DMA channel select the direction, format, and handshake and enable chaining, transfer mode, and DMA start.

• Link Port Common Controls register (LCOM). This register indicates link buffer packing and error status for link port operations.

• Link Port Assignment register (LAR). This register assigns link buffers to link ports for link port operations.

• Link Port Control registers (LCTLx). These control registers (each register controls three link buffers) select the direction, word width, and transfer rate and enable chaining, 2-D DMA mode, and DMA start.

• Serial Port Receive Control registers (SRCTLx). These control registers for each port select the receive format; monitor FIFO status; and enable chaining, 2-D DMA mode, and DMA start.

• Serial Port Transmit Control registers (STCTLx). These control registers for each port select the transmit format; monitor FIFO status; and enable chaining, 2-D DMA mode, and DMA start.

The DMA Parameter Registers column in Figure 6-2 on page 6-7 shows the parameter registers for each DMA channel. These registers function similarly to data address generator registers and include:

• Internal Index registers (IIx). An index register provides an internal memory address, acting as a pointer to the next internal memory DMA read or write location.

• Internal Modify registers (IMx). A modify register provides the signed increment by which the DMA controller post-modifies the corresponding internal memory index register after the DMA read or write.

• Count registers (Cx). A count register indicates the number of words remaining to be transferred to or from internal memory on the corresponding DMA channel.

• Chain Pointer registers (CPx). A chain pointer register holds the starting address of the Transfer Control Block (parameter register values) for the next DMA operation on the corresponding channel. These registers also control whether the I/O processor generates an interrupt when the current DMA process ends.

• General Purpose registers (GPx). A general purpose DMA register holds an address or other value.

• Dimension A and B registers (DAx and DBx). Dimension registers hold the counts for the A and B dimensions of a two-dimensional DMA. For more information on two-dimensional DMA, see “Using Two-Dimensional Link Port DMA” on page 6-83 or “Using Two-Dimensional Serial Port DMA” on page 6-91.

• External Index registers (EIx). An index register provides an external memory address, acting as a pointer to the next external memory DMA read or write location.

• External Modify registers (EMx). A modify register provides the increment by which the DMA controller post-modifies the corresponding external memory index register after the DMA read or write.

• External Count registers (ECx). An external count register indicates the number of words remaining to be transferred to or from external memory on the corresponding DMA channel.

Register   Function                     Width      Description
IIx        Internal Index Register      18 bits*   Address of buffer in internal memory
IMx        Internal Modify Register     16 bits    Stride for internal buffer
Cx         Internal Count Register      16 bits    Length of internal buffer
CPx        Chain Pointer Register       19 bits*   Chain pointer for DMA chaining
GPx        General Purpose Register     18 bits    User definable
DBx        Dimension B Register         16 bits    Count of dimension B buffer
DAx        Dimension A Register         16 bits    Count of dimension A buffer
EIx        External Index Register      32 bits    Address of buffer in external memory
EMx        External Modify Register     32 bits    Stride for external buffer
ECx        External Count Register      32 bits    Length of external buffer

EIx, EMx, and ECx apply to external port DMA channels only.
* Offset by 0x40000 for internal addressing in Normal word space.

Figure 6-2. ADSP-21160 DSP’s DMA Parameter Registers

Figure 6-3 on page 6-8 shows a block diagram of the I/O processor’s address generator (DMA controller). Table 6-1 on page 6-12 lists the parameter registers for each DMA channel. The parameter registers are uninitialized following a processor reset.

Figure 6-3. DMA Address Generator
(Figure not reproduced. For internal addresses, the diagram shows the IIx index (address) register post-modified by the IMx modifier to form the internal memory address, the Cx count register decremented by 1 as the DMA word counter, the CPx chain pointer and GPx general purpose registers, and the DAx and DBx registers with a working register that are used only for 2-D DMA. For external addresses, it shows the EIx external index register post-modified by the EMx external modifier to form the external memory address and the ECx external count register decremented by 1.)

The I/O processor generates addresses for DMA channels much the same way that the Data Address Generators (DAGs) generate addresses for data memory accesses. Each channel has a set of parameter registers including an index register (IIx) and modify register (IMx) that the I/O processor uses to address a data buffer in internal memory. The index register must be initialized with a starting address for the data buffer. As part of the DMA operation, the I/O processor outputs the address in the index register onto the DSP’s IOA (I/O Address) bus and applies the address to internal memory during each DMA cycle (a clock cycle in which a DMA transfer is taking place).

All addresses in the index (IIx) registers are offset by a value matching the DSP’s first internal Normal word addressed RAM location before the I/O processor uses the addresses. For the ADSP-21160, this offset value is 0x0004 0000.

While DMA addresses must always be in Normal word (32-bit) memory, the internal memory data transfer sizes may be 64, 48, or 32 bits. External memory data transfer sizes may be 64, 32, or 16 bits. The I/O processor can transfer Short word data (16-bit) using the packing capability of the external port and serial port DMA channels.

After transferring each data word to or from internal memory, the I/O processor adds the modify value to the index register to generate the address for the next DMA transfer and writes the modified index value to the index register. The modify value in the IMx register is a signed integer, which allows both increment and decrement modifies.

Note: If the I/O processor modifies the index register past the maximum 18-bit value to indicate an address out of internal memory, the index wraps around to zero. With the offset for the ADSP-21160 DSP, the wraparound address is 0x0004 0000.

Each DMA channel has a count register (Cx) that programs load with a word count to be transferred.
The I/O processor decrements the count register after each DMA transfer on that channel. When the count reaches zero, the I/O processor generates the interrupt for that DMA channel. For more information on DMA interrupts, see “Using I/O Processor Status” on page 6-53.

If a program loads the count (Cx) register with zero, the I/O processor does not disable DMA transfers on that channel. The I/O processor interprets the zero as a request for 2^16 transfers. This count occurs because the I/O processor starts the first transfer before testing the count value. The only way to disable a DMA channel is to clear its DMA enable bit. For more information, see “External Port Channel Transfer Modes” on page 6-21, “Link Port Channel Transfer Modes” on page 6-48, or “Serial Port Channel Transfer Modes” on page 6-52.

Each DMA channel also has a chain pointer register (CPx) and a general-purpose register (GPx). Chained DMA sequences are a set of multiple DMA sequences, each autoinitializing the next in line. The location of the parameters for the next sequence comes from the CPx register. These parameters are called a Transfer Control Block (TCB), and they set up DMA parameter values for autoinitializing the next DMA sequence in the chain. Programs can use the GP register for any purpose, but usually programs store the address of the previous TCB in this register during chained DMA. For more information, see “Chaining DMA Processes” on page 6-69.

The external port DMA channels each contain three additional parameter registers: the external index register (EIx), external modify register (EMx), and external count register (ECx). These three registers are not available for the serial port and link port DMA channels. The I/O processor generates 32-bit external memory addresses using the EI, EM, and EC registers during DMA transfers between internal memory and external memory or devices.
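The index and count behavior described earlier in this section can be illustrated with a small behavioral model in Python. This is only a sketch under stated assumptions, not vendor code: the function name dma_addresses and the constant names are hypothetical, the model assumes the 0x0004 0000 offset is added to the 18-bit index value, and it assumes the index wraps modulo 2^18 in both directions (the text describes only the wrap past the maximum value).

```python
INTERNAL_OFFSET = 0x0004_0000   # first internal Normal word address (from the text)
INDEX_MASK = (1 << 18) - 1      # IIx is 18 bits wide
COUNT_MASK = (1 << 16) - 1      # Cx is 16 bits wide


def dma_addresses(ii, im, c):
    """Return the internal addresses one DMA sequence would touch.

    ii: starting index (18-bit value, without the offset)
    im: signed modify value added after every transfer
    c:  word count; 0 behaves as 2**16 because the first transfer
        happens before the count is tested
    """
    addresses = []
    while True:
        addresses.append(INTERNAL_OFFSET + ii)  # transfer at the current index
        ii = (ii + im) & INDEX_MASK             # post-modify with 18-bit wrap (assumed)
        c = (c - 1) & COUNT_MASK                # decrement after the transfer
        if c == 0:                              # the test happens after the transfer
            break
    return addresses
```

Under these assumptions, a count of 3 produces three consecutive addresses, an index of 0x3FFFF with a modify value of 1 wraps to 0x0004 0000 on the next transfer, and a count of 0 produces 65,536 transfers.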
Programs must load the EC register with the count of external bus transfers in the DMA. If the external port is using word packing, the EC count differs from the number of words transferred in the DMA.

Instead of the EI, EM, and EC registers, the serial port and link port DMA channels have the Dimension-A (DA) and Dimension-B (DB) registers. The I/O processor uses these registers for dimension indices during two-dimensional DMA operations. In one-dimensional DMA operations, programs may also use DA and DB as general-purpose registers. For more information, see “Using Two-Dimensional Link Port DMA” on page 6-83 or “Using Two-Dimensional Serial Port DMA” on page 6-91.

Memory mapped devices can communicate with the I/O processor using an internal DMA request/grant handshake on an external port DMA channel. Each channel has a single request and a single grant. When a particular I/O port needs to perform transfers to or from internal memory, the channel asserts a request. The I/O processor prioritizes this request with all other valid DMA requests. The default channel priority is DMA channel 0 as highest and DMA channel 13 as lowest. Table 6-1 on page 6-12 lists the DMA channels in priority order. For more information, see “Managing DMA Channel Priority” on page 6-67.

When a channel becomes the highest priority requester, the I/O processor services the channel’s request. In the next clock cycle, the I/O processor starts the DMA transfer.

If a DMA channel is disabled, the I/O processor does not service requests for that channel, whether or not the channel has data to transfer.

The DSP’s 14 DMA channels are numbered as shown in Table 6-1. This table also shows the control, parameter, priority (DMA channel zero is highest and channel 13 lowest) and data buffer registers that correspond to each channel.

Table 6-1. DMA Channel Registers: Controls, Parameters and Buffers

DMA Chan#  Control Registers  Parameter Registers                            Buffer Register  Description
0          SRCTL0             II0, IM0, C0, CP0, GP0, DB0, DA0               RX0              Serial Port 0 Receive (highest priority)
1          SRCTL1             II1, IM1, C1, CP1, GP1, DB1, DA1               RX1              Serial Port 1 Receive
2          STCTL0             II2, IM2, C2, CP2, GP2, DB2, DA2               TX0              Serial Port 0 Transmit
3          STCTL1             II3, IM3, C3, CP3, GP3, DB3, DA3               TX1              Serial Port 1 Transmit
4          LCTL0, LAR, LCOM   II4, IM4, C4, CP4, GP4, DB4, DA4               LBUF0            Link Buffer 0
5          LCTL0, LAR, LCOM   II5, IM5, C5, CP5, GP5, DB5, DA5               LBUF1            Link Buffer 1
6          LCTL0, LAR, LCOM   II6, IM6, C6, CP6, GP6, DB6, DA6               LBUF2            Link Buffer 2
7          LCTL1, LAR, LCOM   II7, IM7, C7, CP7, GP7, DB7, DA7               LBUF3            Link Buffer 3
8          LCTL1, LAR, LCOM   II8, IM8, C8, CP8, GP8, DB8, DA8               LBUF4            Link Buffer 4
9          LCTL1, LAR, LCOM   II9, IM9, C9, CP9, GP9, DB9, DA9               LBUF5            Link Buffer 5
-          -                  -                                              -                TCB Chain Loading Requests (1)
-          -                  -                                              -                External Accesses of Internal Memory (Direct Reads, Direct Writes) and IOP Registers (2)
10         DMAC10             II10, IM10, C10, CP10, GP10, EI10, EM10, EC10  EPB0             Ext. Port FIFO Buffer 0
11 (3)     DMAC11             II11, IM11, C11, CP11, GP11, EI11, EM11, EC11  EPB1             Ext. Port FIFO Buffer 1
12 (4)     DMAC12             II12, IM12, C12, CP12, GP12, EI12, EM12, EC12  EPB2             Ext. Port FIFO Buffer 2
13         DMAC13             II13, IM13, C13, CP13, GP13, EI13, EM13, EC13  EPB3             Ext. Port FIFO Buffer 3 (lowest priority)

(1) TCB chain loading is not associated with a specific DMA channel. TCB chain loading uses the I/O bus and requires prioritization.
(2) Direct reads and writes are not associated with a specific DMA channel. Direct reads and writes use the I/O bus and require prioritization.
(3) The DMAR1 and DMAG1 pins are handshake controls for DMA channel 11.
(4) The DMAR2 and DMAG2 pins are handshake controls for DMA channel 12.
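The default arbitration summarized in Table 6-1 can be sketched as a lookup plus a selection rule. The Python below is a simplified model with hypothetical names (CHANNEL_BUFFERS, next_channel): it covers only the 14 numbered channels with fixed priority, and it leaves out TCB chain loading, direct accesses, and rotating priority.

```python
# Buffer register for each DMA channel, copied from Table 6-1.
CHANNEL_BUFFERS = {
    0: "RX0", 1: "RX1", 2: "TX0", 3: "TX1",
    4: "LBUF0", 5: "LBUF1", 6: "LBUF2",
    7: "LBUF3", 8: "LBUF4", 9: "LBUF5",
    10: "EPB0", 11: "EPB1", 12: "EPB2", 13: "EPB3",
}


def next_channel(requesting, enabled):
    """Pick the channel to service under the default fixed priority.

    requesting: set of channel numbers currently asserting a request
    enabled:    set of channel numbers whose DMA enable bit is set
    Returns the lowest-numbered enabled requester, or None if there is none.
    Requests on disabled channels are never serviced.
    """
    candidates = [ch for ch in sorted(requesting) if ch in enabled]
    return candidates[0] if candidates else None
```

For example, if channels 4, 11, and 13 request service and all are enabled, the model selects channel 4 (LBUF0); if channel 0 requests service but is disabled, it is skipped.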
All of the I/O processor’s registers are memory-mapped in the DSP’s internal memory, ranging from address 0x0000 0000 to 0x0000 00FF. For more information on these registers, see “I/O Processor Registers” on page A-33.

Because the I/O processor registers are memory-mapped, the DSP and external processors (host or multiprocessor DSPs) have access to program DMA operations. A processor sets up a DMA channel by writing the transfer’s parameters to the DMA parameter registers. After the IIx, IMx, and Cx registers (among others) are loaded with a starting source or destination address, an address modifier, and a word count, the processor is ready to start the DMA.

The external ports, link ports, and serial ports each have a DMA enable bit (DEN, LxDEN, or SDEN) in their channel control register. Setting this bit for a DMA channel with configured DMA parameters starts the DMA on that channel. If the parameters configure the channel to receive, the I/O processor transfers data words received at the buffer to the destination in internal memory. If the parameters configure the channel to transmit, the I/O processor transfers a word automatically from the source memory to the channel’s buffer register. These transfers continue until the I/O processor transfers the selected number of words (count parameter).

To start a new (non-chained) DMA sequence after the current one is finished, programs must disable the channel (clear its DEN bit); write new parameters to the II, IM, and C registers; then enable the channel (set its DEN bit). For chained DMA operations, this disable-enable process is not necessary. For more information, see “Chaining DMA Processes” on page 6-69.

Setting I/O Processor—EPort Modes

The SYSCON, WAIT, and DMACx registers control the external port operating mode for the I/O processor.
Table A-17 on page A-45 lists all the bits in SYSCON, Table A-19 on page A-49 lists all the bits in WAIT, and Table A-21 on page A-55 lists all the bits in DMACx. The following bits control external port I/O processor modes.

Except for the FLSH bit, the control bits in the DMACx registers have a one cycle effect latency (take effect on the second cycle after change). The FLSH bit has a two cycle effect latency. Programs should not modify an active DMA channel’s DMACx register, other than to disable the channel by clearing the DEN bit. For information on verifying a channel’s status with the DMASTAT register, see “Using I/O Processor Status” on page 6-53.

Some other bits in SYSCON, WAIT, and DMACx set up non-DMA external port features. For information on these features, see “Setting External Port Modes” on page 7-2.

• Boot Select Override. SYSCON Bit 1 (BSO). This bit enables (if set, =1) or disables (if cleared, =0) access to Boot Memory Space. When BSO is set, the DSP uses the BMS select line (instead of MS3-0) to perform DMA channel 10 accesses of external memory. When BSO is set, BMS is asserted low for an external port DMA transfer and all memory selects (MSx) are disabled for DMA transfers. However, the memory selects are available (not disabled) to the DSP core for external memory accesses.
• Host Packing Mode. SYSCON Bits 6-5 (HPM). These bits select the external bus packing mode for host accesses: 000=no packing, 001=16-to-32/64, 010=16-to-48 (reset value), 011=32-to-48, 100=32-to-32/64.
• Host Most Significant Word First Packing Select. SYSCON Bit 7 (HMSWF). This bit selects the word packing order for host accesses as most-significant-word first (if set, =1) or least-significant-word first (if cleared, =0).
• Buffer Hang Disable. SYSCON Bit 16 (BHD). This bit controls whether the processor core proceeds (hang disabled if set, =1) or is held off (hang enabled if cleared, =0) when the core tries to read from an empty EPBx, TXx, or LBUFx buffer or tries to write to a full EPBx, RXx, or LBUFx buffer.
• External Port DMA Channel Priority Rotation Enable. SYSCON Bit 19 (DCPR). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among external port DMA channels (channels 10-13).
• Handshake and Idle for DMA Enable. WAIT Bit 30 (HIDMA). This bit enables (if set, =1) or disables (if cleared, =0) adding an idle cycle after every memory access for DMAs with handshaking (DMARx-DMAGx).
• External Port DMA Enable. DMACx Bit 0 (DEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA for the corresponding external port FIFO buffer (EPBx).
• External Port DMA Chaining Enable. DMACx Bit 1 (CHEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for the corresponding external port FIFO buffer (EPBx).
• External Port Transmit/Receive Select. DMACx Bit 2 (TRAN). This bit selects the transfer direction (transmit if set, =1) (receive if cleared, =0) for the corresponding external port FIFO buffer (EPBx).
• External Port Data Type Select. DMACx Bit 5 (DTYPE). This bit selects the transfer data type (40/48-bit, 3-column if set, =1) (32/64-bit, 4-column if cleared, =0) for the corresponding external port FIFO buffer (EPBx).
• External Port Packing Mode. DMACx Bits 8-6 (PMODE). These bits select the packing mode for the corresponding external port FIFO buffer (EPBx) as follows: 000=No pack, 001=16 external to 32/64 internal packing, 010=16 external to 48 internal packing, 011=32 external to 48 internal packing, 100=32 external to 32/64 internal packing, 101=110=111=reserved.
• Most Significant 16-bit Word First during packing. DMACx Bit 9 (MSWF). When the buffer’s PMODE is 001 or 010, this bit selects the packing order of 16-bit words (most significant first if set, =1) (least significant first if cleared, =0) for the corresponding external port FIFO buffer (EPBx).
• Master Mode Enable. DMACx Bit 10 (MASTER). This bit enables (if set, =1) or disables (if cleared, =0) master mode for the corresponding external port FIFO buffer (EPBx).
• Handshake Mode Enable. DMACx Bit 11 (HSHAKE). This bit enables (if set, =1) or disables (if cleared, =0) handshake mode for the corresponding external port FIFO buffer (EPBx).
• External Handshake Mode Enable. DMACx Bit 13 (EXTERN). This bit enables (if set, =1) or disables (if cleared, =0) external handshake mode for the corresponding external port FIFO buffer (EPBx).
• External Port Bus Priority. DMACx Bit 15 (PRIO). This bit selects the external bus access priority level (high if set, =1) (low if cleared, =0) for the corresponding external port FIFO buffer (EPBx).

Boot Memory DMA Mode

The BSO bit in the SYSCON register enables Boot Memory Select Override, a mode in which the I/O processor supports DMA access to boot memory space. When BSO is set, the DSP uses the BMS select line (instead of MS3-0) to perform DMA channel 10 accesses of external memory.

When reading from 8-bit boot memory space, the DSP uses 8-to-48-bit packing. Programs most often use this feature to finish loading programs and data after the DSP completes its automatic 256-instruction boot-load.

When writing to 8-bit boot memory space, programs must use the shifter to place ordered bytes for the transfer in bits 39-32 of each long word to be written. Programs must use the shifter because there is no 8-to-48-bit packing mode for external port writes. Programs most often use this feature to update writable boot memory data (flash memory or EEPROM).
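The byte placement required for boot memory writes can be shown with a short Python model. It only illustrates the bit arithmetic that the shifter step has to achieve. The function name boot_write_words is hypothetical, and the sketch assumes one byte per long word with all other bits left at zero.

```python
def boot_write_words(data):
    """Place each byte of `data` in bits 39-32 of its own long word.

    data: a bytes-like object holding the bytes in the order they
          should be written to 8-bit boot memory
    Returns one integer per byte, with the byte in bits 39-32.
    """
    return [(byte & 0xFF) << 32 for byte in data]
```

For instance, the bytes 0x12 and 0xAB become the long words 0x12_0000_0000 and 0xAB_0000_0000. On the DSP itself the same positioning would be done with shifter instructions before the data is handed to the DMA channel.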
External Port Buffer Modes

The HPM, HMSWF, PMODE, MSWF, and BHD bits in the SYSCON and DMACx registers select a buffer’s packing mode and disable buffer not-ready processor core stalls.

The packing mode bits (PMODE for DSP and HPM for host) select the external bus width and word size for transfers. Table 6-2 shows the available settings.

Packed data or instructions are arranged in external memory according to the memory address that stems from their word size. For more information, see “Memory Organization and Word Size” on page 5-22. When data or instructions in external memory are not packed, the words are arranged in memory according to the external bus’ data alignment. This data alignment appears in Figure 7-1 on page 7-2.

Table 6-2. DSP (PMODE) and Host (HPM) Packing Modes

PMODE or HPM  Packing Mode
000           No word packing (64-bit bus, 64-bit words)
001           16-to-32/64-bit packing (16-bit bus, 32- or 64-bit words)
010           16-to-40/48-bit packing (16-bit bus, 40- or 48-bit words)
011           32-to-40/48-bit packing (32-bit bus, 40- or 48-bit words)
100           32-to-32/64-bit packing (32-bit bus, 32- or 64-bit words)

When packing is enabled, the DSP only uses the RDH and WRH strobes for accessing external memory, regardless of the least-significant bit of the address.

The DSP (PMODE) and host (HPM) packing modes must match for correct word-packing operations in host systems.

When the packing mode (PMODE or HPM) is set for a 16-bit bus, programs should set up the 16-bit word order. The 16-bit word order bits (MSWF for DSP and HMSWF for host) control the order of 16-bit words being packed or unpacked in the 32-, 48-, or 64-bit word being transferred. If the MSWF or HMSWF bit is set, the packing and unpacking is most significant 16-bit word first.

In addition to selecting the packing mode for external port DSP transfers, programs must indicate the type of data in the transfer, using the Data Type (DTYPE) bit.
For more information, see “External Port Channel Transfer Modes” on page 6-21.

The Buffer Hang Disable (BHD) bit lets the processor core proceed if the core tries to read from an empty EPBx, TXx, or LBUFx buffer or tries to write to a full EPBx, RXx, or LBUFx buffer. The processor core still performs buffer accesses when buffer hang is disabled (BHD=1). If the processor core attempts to read from an empty receive buffer, the core gets a repeat of the last value that was in the buffer. If the processor core attempts to write to a full buffer, the core overwrites the last value that was written to the buffer. Because these buffers are not initialized at reset, a read from a buffer that has not been filled since the reset returns an undefined value.

If an external port buffer’s INTIO bit is set and DMA for that channel is not enabled, the external port channel is in single-word, interrupt-driven transfer mode. For more information, see “Using I/O Processor Status” on page 6-53.

External Port Channel Priority Modes

The DCPR and PRIO bits in the SYSCON and DMACx registers influence priority levels for an external port buffer and the external port in relation to external port DMA channels and external bus arbitration. For more information on prioritization operations, see “Managing DMA Channel Priority” on page 6-67.

Priority for DMA requests from external port channels can be fixed or rotated. When the DMA Channel Priority Rotate (DCPR) bit is cleared, the lowest number external port channel has the highest priority, ranging from highest-priority channel 10 to lowest-priority channel 13.

When the DCPR bit is set, the priority levels rotate. High priority shifts to a new channel after each single-word transfer. The I/O processor services a single-word transfer then rotates priority to the next higher numbered channel. Rotation continues until the I/O processor services all four external port channels.
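The rotation rule can be modeled as a list rotation. The Python sketch below uses a hypothetical function name, rotate_priority. It assumes, as in the Figure 6-4 example, that the channel just serviced moves to the lowest-priority slot and the channel that followed it in the old order becomes the highest.

```python
def rotate_priority(order, serviced):
    """Return the new priority order after one single-word transfer.

    order:    list of external port channels, highest priority first
    serviced: the channel that just completed a single-word transfer
    The serviced channel ends up last (lowest priority); the channel
    that followed it in the old order becomes first (highest priority).
    """
    i = order.index(serviced)
    return order[i + 1:] + order[:i + 1]
```

Starting from the reset order 10, 11, 12, 13, a transfer on channel 11 gives the order 12, 13, 10, 11 in this model, which matches the Figure 6-4 example.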
Figure 6-4 illustrates this process as described in the following steps:

1. At reset, external port channels have the priority order, from high to low, of 10, 11, 12, and 13.
2. The external port performs a single transfer on channel 11.
3. The I/O processor rotates channel priority, changing it to 12, 13, 10, and 11 (because rotating priority is enabled for this example, DCPR=1).

[Figure 6-4: channel priority order at Step 2 (10 highest, then 11, 12, and 13 lowest) and at Step 3 (12 highest, then 13, 10, and 11 lowest). One transfer occurs on channel 11 (Step 2), rotating channel 11’s priority to the lowest priority slot (Step 3).]

Figure 6-4. Rotating External Port DMA Channel Priority

Even though the external port channel DMA priority can rotate, the interrupt priorities of all DMA channels are fixed.

When external port DMA channel priority is fixed (DCPR=0), channel 10 has the highest priority, and channel 13 has the lowest priority. Programs can, however, redefine this priority order by assigning one of the other channels the highest priority. To change the fixed priority sequence of the external port DMA channels, a program could use the following procedure:

1. Disable all external port DMA channels except the one that is to have lowest priority.
2. Select rotating priority.
3. Cause at least one transfer to occur on the enabled channel.
4. Disable rotating priority and re-enable all of the external port DMA channels.

After completing this procedure, the channel immediately after the selected channel has the highest fixed priority.

In systems where multiple processors are using the external bus, the PRIO bit raises the priority level for external port DMA transfers. When a channel’s PRIO bit is set, the I/O processor asserts the Priority Access (PA) pin when that channel uses the external bus.
The channel gets higher priority in bus arbitration, allowing the DMA to complete more quickly.

Programs can also rotate priority between external port and link port DMA channels. For more information, see “Link Port Channel Priority Modes” on page 6-45.

External Port Channel Transfer Modes

The DEN, CHEN, TRAN, and DTYPE bits in the DMACx register enable DMA and chained DMA and select the transfer direction and data type.

The DMA enable (DEN) and Chained DMA enable (CHEN) bits work together to select an external port DMA channel’s transfer mode. Table 6-3 lists the modes.

Table 6-3. External Port DMA Enable Modes

CHEN  DEN  DMA Enable Mode Description
0     0    Channel disabled (chaining disabled, DMA disabled)
0     1    Single DMA mode (chaining disabled, DMA enabled)
1     0    Chain insertion mode (chaining enabled, DMA enabled, auto-chaining disabled); for more information, see “Chaining DMA Processes” on page 6-69.
1     1    Chained DMA mode (chaining enabled, DMA enabled, auto-chaining enabled)

Because the external port is bidirectional, the I/O processor uses the Transmit select (TRAN) bit to determine the transfer direction (transmit or receive). Data flows from internal to external memory when in transmit mode. In transmit mode, the I/O processor fills the channel’s EPBx buffer when the channel’s DEN bit is set.

The Data Type (DTYPE) bit determines how the DMA channel accesses columns of internal memory. If DTYPE is set, the data is 40- or 48-bit words, and the I/O processor makes 3-column internal memory accesses. If DTYPE is cleared, the data is 32- or 64-bit words, and the I/O processor makes 4-column internal memory accesses. For more information, see “Memory Organization and Word Size” on page 5-22.

The DTYPE for the transfer overrides the Internal Memory Data Width (IMDWx) setting for the internal memory block.

External Port Channel Handshake Modes

The MASTER, HSHAKE, EXTERN, and HIDMA bits in the DMACx and WAIT registers select the channel’s DMA handshake and enable the hold cycles for host DMA. Table 6-4 shows how the MASTER, HSHAKE, and EXTERN bits work to select the channel’s DMA handshake mode.

Table 6-4. External Port DMA Handshake Modes: DMACx MASTER (M), HSHAKE (H), and EXTERN (E) Bits

EHM  DMA Mode of Operation
000  Slave Mode. The DSP responds to the buffer’s internal memory transfer activity based on the buffer status in the FS field, generating a DMA request whenever the buffer is not empty (on receive) or is not full (on transmit). During transmit (TRAN=1), the DSP fills the EPBx buffer when the program enables the buffer (DEN=1). For more information, see “Slave Mode” on page 6-31.
001  Master Mode. The DSP attempts the internal memory DMA transfers indicated by the DMA counter (Cx) based on the buffer status in the FS field, making transfers whenever the buffer is not empty (on receive) or is not full (on transmit). Systems using Master Mode should deassert corresponding DMA request inputs, deasserting DMAR1 if channel 11 is in master mode and deasserting DMAR2 if channel 12 is in master mode. For more information, see “Master Mode” on page 6-25.
010  Handshake Mode. When in this mode, the DSP generates a DMA request whenever the external device asserts the DMARx pin, then the DSP asserts the DMAGx pin, transferring the data (and deasserting DMAGx) when the external device deasserts the DMARx pin. Note that this mode only applies to external port buffers EPB1 and EPB2 and only applies to DMA channels 11 and 12. For more information, see “Handshake Mode” on page 6-34.
011  Paced Master Mode. The DSP attempts the internal memory DMA transfers indicated by the DMA counter (Cx), making transfers based on external DMA request inputs. The DSP generates a DMA request whenever the external device asserts the DMARx pin and controls the data transfer using the RDH/L or WRH/L and ACK pins and applying the selected number of waitstates. Note that this mode only applies to DMA channels 11 and 12. For more information, see “Paced Master Mode” on page 6-30.
100  Reserved
101  Reserved
110  External-Handshake Mode. The DSP responds to external memory DMA requests based on external DMA request inputs. This mode is identical to Handshake Mode, but applies to transfers between external memory and external devices. When in this mode, the DSP generates a DMA request whenever the external device asserts the DMARx pin, then the DSP asserts the DMAGx pin, transferring the data (and deasserting DMAGx) when the external device deasserts the DMARx pin. Note that this mode only applies to external port buffers EPB1 and EPB2 and only applies to DMA channels 11 and 12. For more information, see “External-Handshake Mode” on page 6-40.
111  Reserved

For the Handshake and External-handshake modes shown in Table 6-4, programs can insert an added idle cycle after every memory access. The Handshake and Idle for DMA (HIDMA) bit in the WAIT register enables this added cycle, which reduces bus contention from devices with slow three-state timing or long recovery times.

Because external port DMA transfers can go between DSP internal memory and external memory, the I/O processor must generate addresses for both memory spaces. The external port DMA channels have additional parameter registers (EIx, EMx, ECx) for external memory access. To support data packing options for external memory DMA transfers, the EI and EM registers can generate addresses at a different rate than the internal address registers (II and IM).
This separation is shown in Figure 6-8 on page 6-68, which shows that the I/O processor has separate address generators for internal and external addresses. For this reason, when packing is used for external memory DMA, the external count (EC) register indicates the number of external port transfers, not necessarily the number of internal memory words being transferred.

The DMA mode and other factors determine the size of the DMA data transfer on the external port. These other factors include the EI, EM, and EC parameters, the PMODE, DTYPE, and MAXBL values in DMACx, and the transfer capacity available in the EPBx data buffer employed in the transfer. The internal I/O processor bus transfer size varies with the II, IM, and C parameters, and the PMODE, DMA mode, DTYPE, and INT32 values in DMACx.

The following sections describe these DMA modes and transfer sizes in more detail:

• “Master Mode” on page 6-25
• “Paced Master Mode” on page 6-30
• “Slave Mode” on page 6-31
• “Handshake Mode” on page 6-34
• “External-Handshake Mode” on page 6-40

Master Mode

When the MASTER bit is set (=1) and the EXTERN and HSHAKE bits are cleared (=0) in the channel’s DMACx register, the DMA channel is in master mode. A channel in this mode can independently initiate internal or external memory transfers.

Master mode applies to all external port DMA channels: 10, 11, 12, and 13.

To initiate a master mode DMA transfer, the DSP sets up the channel’s parameter registers and sets the channel’s DMA enable (DEN) bit. A master mode DMA channel performing internal memory to external memory data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a master mode DMA channel also performs enough transfers from external memory to keep the EPBx buffer full.
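The DMACx bit positions given earlier in this section, together with the mode codes of Table 6-4, can be collected into a small encoder and decoder. The Python below is a sketch only: the names FIELDS, encode_dmac, and handshake_mode are hypothetical, and only the fields whose bit positions are stated in this section are included (FLSH, INTIO, MAXBL, INT32, and FS are left out).

```python
# (shift, width) for each DMACx field whose position is given in the text.
FIELDS = {
    "DEN": (0, 1), "CHEN": (1, 1), "TRAN": (2, 1), "DTYPE": (5, 1),
    "PMODE": (6, 3), "MSWF": (9, 1), "MASTER": (10, 1),
    "HSHAKE": (11, 1), "EXTERN": (13, 1), "PRIO": (15, 1),
}

# Table 6-4, keyed by (EXTERN, HSHAKE, MASTER); other codes are reserved.
MODES = {
    (0, 0, 0): "Slave Mode",
    (0, 0, 1): "Master Mode",
    (0, 1, 0): "Handshake Mode",
    (0, 1, 1): "Paced Master Mode",
    (1, 1, 0): "External-Handshake Mode",
}


def encode_dmac(**fields):
    """Build a DMACx value from named fields, for example DEN=1, PMODE=0b100."""
    value = 0
    for name, field_value in fields.items():
        shift, width = FIELDS[name]
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name}={field_value} does not fit in {width} bit(s)")
        value |= field_value << shift
    return value


def handshake_mode(dmac):
    """Decode the handshake mode selected by the E, H, and M bits."""
    key = ((dmac >> 13) & 1, (dmac >> 11) & 1, (dmac >> 10) & 1)
    return MODES.get(key, "Reserved")
```

In this model, a master mode transmit channel with DMA enabled is encode_dmac(DEN=1, TRAN=1, MASTER=1), which evaluates to 0x405 and decodes as "Master Mode".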
The I/O processor uses the EI, EM, and EC registers to access external DSP memory in master mode DMA.

External Transfer Controls In Master Mode. In master mode, the DSP determines the size of the external transfer from the channel’s PMODE bits and EIx, EMx, and ECx registers. Table 6-2 on page 6-18 shows the packing mode selected by the PMODE bits, and Table 6-5 on page 6-26 shows the external transfer size in master mode that results from the combination of the PMODE bits.

Table 6-5. Master Mode External Transfer Size

Transfer Size  64-bit (1)                    64-bit (2)         32-bit             16-bit
PMODE          000                           000                000 (3), 011, 100  001, 010
EI             64-bit aligned (4)            64-bit aligned     X (5)              X
EM             0 or 1                        2                  X                  X
EC             even # of 32-bit words, >= 2  # of 48-bit words  X                  # of 16-bit xfers
DTYPE          0                             1                  X                  X
EPBx Depth     >1                            >1                 >=1                >=1

(1) Including packed instructions.
(2) Including unpacked instructions or 40-bit data.
(3) For PMODE=000, even 32-bit addresses (EI[0]=0) access the lower 32 bits of the data bus.
(4) For a 64-bit aligned address, EI[0]=0.
(5) An X indicates any supported value.

64-bit External Transfers. To enable 64-bit transfers, PMODE must be set to 000. EI must be a 64-bit aligned Normal word address, because unaligned 64-bit external transfers are not supported. EM is restricted to values of 0 (to address a memory-mapped data FIFO such as the EPBx data buffer of another DSP), or 1 (to increment through contiguous memory). EC contains the number of 32-bit words to transfer. For 64-bit transfers (only), EC should be programmed to an even number. If EC is set to an odd value, the last transfer is a 32-bit-only transfer. There must be at least two 32-bit EPBx FIFO entries available to support the 64-bit external transfer.

64-bit External Burst Transfers. Burst transfers are a subset of 64-bit transfers.
In addition to the 64-bit transfer requirements described above, bursting must be enabled by setting the MAXBL field in the DMAC. Also, the burst truncates (or does not start) if the least significant bits of the 64-bit address (ADDR bits 2-1) are both set (EI bits 2-1=11). See the SBSRAM discussion in the External Memory chapter for more discussion on burst address boundaries. Note that the external memory addressed by the burst transfer must map to a memory bank configured for synchronous access mode (with the WAIT register). The DMA programmer must ensure that the burst transfer does not straddle, or cross, the external memory bank boundaries.

The following information applies to 64-bit burst transfers from Table 6-5, with MAXBL=01:

• EI must address a memory bank configured for synchronous access modes (with the WAIT register), and EIx must be 64-bit aligned (EIx bits 2-1 may not be = 11).
• Burst writes are only supported in 1-wait write access mode.
• Bursts are truncated at modulo-4 boundaries of the 64-bit address.
• EPBx depth must be >3 to support burst transfers.

64-bit External Transfers of 48-bit Data/Instructions. Because the DSP’s external bus does not support a 48-bit transfer size, programs must use 64-bit transfer sizes to move instructions. The two 64-bit transfer columns in Table 6-5 describe how to transfer packed instructions or unpacked instructions.

Instructions are “packed” in external memory when the 3-column instructions are stored using all four 16-bit memory columns (same arrangement as in internal memory). When instructions are packed in external memory, the DSP cannot fetch and execute these instructions. The advantages of using packed instructions are that they take up less memory space and need fewer bus transfers to DMA a block of instructions: unpacked instructions take 1/3 more memory space and 1/3 more bus transfers than the same instructions packed.
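The storage difference between packed and unpacked instructions follows from simple arithmetic, sketched below in Python. The function name words_64bit is hypothetical. The model assumes that each unpacked 48-bit instruction occupies one full 64-bit location, while packed instructions are stored back to back across all four 16-bit columns.

```python
def words_64bit(num_instructions, packed):
    """Return the number of 64-bit external words needed for a block
    of 48-bit instructions.

    packed=True:  instructions are stored back to back (48 bits each)
    packed=False: each instruction occupies one 64-bit location
    """
    if packed:
        total_bits = num_instructions * 48
        return -(-total_bits // 64)   # ceiling division
    return num_instructions
```

Four instructions need three 64-bit words when packed and four when unpacked, so a block of 1000 instructions needs 750 words packed against 1000 unpacked.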
Instructions are “unpacked” in external memory when the 3-column instructions are stored left-aligned, using three of the 16-bit memory columns. This arrangement matches the external port bus alignment shown in Figure 7-1 on page 7-2. When instructions are unpacked in external memory, the DSP can fetch and execute these instructions. For more information, see “Executing Instructions From External Memory” on page 7-48.

32-bit External Transfers. The DSP performs 32-bit transfers when PMODE=000 (no hardware packing mode), 011 (32-to-48-bit internal), or 100 (32-bit external-to-32-bit/64-bit internal). In PMODE=000 mode, the external bus operation transfers 32 bits instead of 64 bits if EIx, ECx, or EMx do not match the 64-bit transfer conditions in Table 6-5.

For 32-bit transfers in the PMODE=000 case, consecutive 32-bit transfers access alternating high and low halves of the 64-bit data bus. In PMODE=011 or 100, all data transfers occur across the upper word of the data bus (DATA63-32), as indicated in Figure 7-1 on page 7-2. This mode supports all values of EI, EM, and EC. EC contains the number of 32-bit words to transfer. There must be at least one 32-bit EPBx FIFO entry available to support the 32-bit external transfer.

16-bit External Transfers. The DSP performs 16-bit transfers when PMODE=001 (16-bit external-to-32/64-bit internal) or 010 (16-bit external-to-48-bit internal). This mode supports all values of EI, EM, and EC. EC is programmed to the number of 16-bit words to transfer. There must be at least one 32-bit EPBx FIFO entry available to support the 16-bit external transfer. In PMODE=001 or 010, all data transfers occur across DATA47-32, as indicated in Figure 7-1 on page 7-2.

Internal Address/Transfer Size Generation. In master mode, the DSP determines the size of the internal transfer from the channel’s PMODE bits and IIx, IMx, and Cx registers.
Table 6-2 on page 6-18 shows the packing mode selected by the PMODE bits, and Table 6-6 shows the internal transfer size in master mode that results from the combination of the PMODE bits.

Table 6-6. Master Mode Internal Transfer Size Determination

Transfer Size   64-bit [1]               48-bit              32-bit
PMODE           000, 001, 100            000, 010, 011       000, 001, 100
II              depends on IM [2]        X [3]               X
IM              -1 or 1                  X                   X
C               even # of 32-bit words   # of 48-bit words   X
DTYPE           0                        0 or 1 [4]          0
EPBx Depth      >1                       >1                  >=1
INT32           0                        0                   0 or 1

[1] Including packed instructions.
[2] If IMx is 1 for increment, IIx must be an even, 64-bit aligned Normal word address. If IMx is -1 for decrement, IIx must be an odd Normal word address.
[3] X indicates any supported value.
[4] DTYPE=1 for PMODE=000, 48-bit instruction transfers (unpacked). DTYPE=0 for 48-bit packing modes.

64-bit Internal Transfers. To enable internal 64-bit transfers and increment the internal IIx pointer, programs must set IIx to match the IMx selection as shown in Table 6-6. Cx contains the number of 32-bit words to transfer and should be set to an even number of 32-bit words. The DSP decrements Cx by 2 for each 64-bit transfer. For 64-bit transfers, PMODE must be set to 000, 001 (16-bit external-to-32/64-bit internal), or 100 (32-bit external-to-32/64-bit internal). DTYPE and INT32 must be cleared. There must be at least two 32-bit EPBx FIFO entries available to support the 64-bit external transfer.

48-bit Internal Transfers. The DSP can perform 48-bit internal transfers for DMA of packed or unpacked 48-bit instructions. For more information on packed and unpacked instructions, see the discussion on page 6-27.

Many applications can use internal 64-bit transfers for 48-bit instructions. This technique can provide greater throughput than 48-bit internal transfers, but there are some restrictions. For more information on internal 64-bit transfers, see Table 6-6 and the discussion on page 6-29.
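
The 64-bit column of Table 6-6 can be read as a checklist. The following Python sketch is only an illustration of that reading, not a description of the DSP's internal decision logic. The function and argument names are hypothetical, addresses are treated as Normal word (32-bit) addresses, and the even-count test applies the table's "even # of 32-bit words" entry strictly.

```python
def qualifies_for_64bit_internal(pmode, ii, im, c, dtype, int32, epb_depth):
    """Return True if the settings match the 64-bit column of Table 6-6.

    pmode is the three-bit PMODE field as a string such as "000".
    All names here are illustrative, not register-access code.
    """
    if pmode not in ("000", "001", "100"):
        return False
    if dtype != 0 or int32 != 0:      # DTYPE and INT32 must be cleared
        return False
    if epb_depth < 2:                 # table entry: EPBx depth > 1
        return False
    if c < 2 or c % 2 != 0:           # even number of 32-bit words
        return False
    if im == 1:                       # increment: IIx even (64-bit aligned)
        return ii % 2 == 0
    if im == -1:                      # decrement: IIx odd
        return ii % 2 == 1
    return False                      # any other IMx falls outside this column
```

For example, an incrementing transfer that starts on an odd Normal word address does not match the 64-bit column, while the same settings with an even start address do.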
In either of the 48-bit internal transfer modes in Table 6-6 (PMODE=000 and DTYPE=1, or PMODE=010 or 011 and DTYPE=0), the DSP accesses the memory using instruction alignment (3-column read or write) for the EPBx buffer. In this case, IIx points to 48-bit words, and Cx counts the number of 48-bit internal transfers.

32-bit Internal Transfers. The DSP performs 32-bit internal transfers according to the conditions in Table 6-6. Under these additional conditions, the DSP performs 32-bit transfers instead of 64- or 48-bit transfers: PMODE=000 (no hardware packing), 001 (16-bit external-to-32-bit internal), or 100 (32-bit external-to-32-bit internal), and II is not aligned to a 64-bit boundary, or IM is < -1 or > 1, or C is < 2, or EPBx depth < 2, or INT32=1, and DTYPE=0.

Paced Master Mode

When the MASTER and HSHAKE bits are set (=1) and the EXTERN bit is cleared (=0) in the channel’s DMACx register, the DMA channel is in Paced Master mode. A channel in this mode can independently initiate internal or external memory transfers.

Paced Master mode applies only to external port DMA channels 11 and 12.

In Paced Master mode, the DSP has the same control for address generation and transfer size as in master mode. For more information, see “Master Mode” on page 6-25. The difference between these modes is that in Paced Master mode external transfers are controlled and initiated (paced) by the DMARx signal, as in Handshake mode. For more information, see “Handshake Mode” on page 6-34.

The DSP responds to the DMARx request only with the RDH/L or WRH/L strobes, depending on direction and data alignment. DMAGx is not asserted in Paced Master mode. This method lets the DSP share the same buffer between the I/O processor and processor core without external gating.

Paced Master mode accesses can be extended by the ACK input, by waitstates programmed in the WAIT register, and by holding the DMARx input low.
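
The mode descriptions in this section each name a combination of the MASTER, HSHAKE, and EXTERN bits in DMACx. The Python sketch below collects only the combinations stated in this section (Paced Master, slave, Handshake, and External-Handshake). It is an illustration under stated assumptions: the function name is hypothetical, the bits are passed as booleans because their positions in DMACx are not given here, and any other combination is reported as not covered rather than guessed.

```python
def external_port_dma_mode(master, hshake, extern):
    """Name the DMA mode for a (MASTER, HSHAKE, EXTERN) bit combination.

    Only combinations stated in this section are listed; this is a
    lookup aid, not register-access code.
    """
    modes = {
        (True, True, False): "Paced Master",
        (False, False, False): "Slave",
        (False, True, False): "Handshake",
        (False, True, True): "External-Handshake",
    }
    return modes.get((bool(master), bool(hshake), bool(extern)),
                     "not covered in this sketch")
```

A lookup table keeps the sketch aligned with the prose: each entry corresponds to one sentence of the form "when the ... bits are set/cleared, the DMA channel is in ... mode".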
Slave Mode

When the MASTER, HSHAKE, and EXTERN bits in the channel’s DMACx register are cleared (=0), the DMA channel is in slave mode. A channel in this mode cannot independently initiate external memory transfers.

To initiate a slave mode DMA transfer, an external device must read or write the channel’s EPBx buffer. A slave mode DMA channel performing internal to external data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a slave mode DMA channel does not initiate any internal DMA transfers until the external device writes data to the channel’s EPBx buffer.

The I/O processor does not use the EI, EM, and EC registers in slave mode DMA.

The following sequence describes a typical external to internal slave mode DMA operation where an external device transfers a block of data into the DSP’s internal memory:

1. The external device writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and DMACx control register, initializing the channel.
2. The external device begins writing data to the EPBx buffer.
3. The EPBx buffer detects data is present and asserts an internal DMA request to the I/O processor.
4. The I/O processor grants the request and performs the internal DMA transfer, emptying the EPBx buffer FIFO.

If the internal DMA transfer is held off, the external device can continue writing to the EPBx buffer because of its eight-deep FIFO. When the EPBx FIFO becomes full, the DSP holds off the external device with the ACK signal (for synchronous accesses) or with the REDY signal (for asynchronous, host-driven accesses). This hold-off state continues until the I/O processor finishes the internal DMA transfer, freeing space in the EPBx buffer.
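
The hold-off behavior in the sequence above can be pictured with a small software model. This Python sketch is a rough illustration only: the class and method names are hypothetical, the model ignores timing and word packing entirely, and a return value of False merely stands in for the ACK or REDY hold-off. The only figure taken from the text is the eight-deep FIFO.

```python
from collections import deque

EPB_FIFO_DEPTH = 8  # eight-deep EPBx FIFO, as stated in the text


class EpbSlaveModel:
    """Toy model of an external-to-internal slave mode transfer."""

    def __init__(self):
        self.fifo = deque()
        self.internal_memory = []

    def external_write(self, word):
        """External device writes one word; False models a hold-off."""
        if len(self.fifo) >= EPB_FIFO_DEPTH:
            return False              # FIFO full: device is held off
        self.fifo.append(word)
        return True

    def service_internal_dma(self, max_words=1):
        """I/O processor moves up to max_words from the FIFO to memory."""
        moved = 0
        while self.fifo and moved < max_words:
            self.internal_memory.append(self.fifo.popleft())
            moved += 1
        return moved


model = EpbSlaveModel()
accepted = [model.external_write(w) for w in range(9)]  # ninth write is held off
model.service_internal_dma(1)                           # frees one FIFO entry
retry_ok = model.external_write(8)                      # the retry now succeeds
```

In the model, the external device can run ahead of the internal DMA by at most eight words, which is the point the text makes about the FIFO depth.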
The following sequence describes a typical internal to external slave mode DMA operation where an external device transfers a block of data from the DSP’s internal memory:

1. The external device writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and DMACx control register, initializing the channel and automatically asserting an internal DMA request to the I/O processor.
2. The I/O processor grants the request and performs the internal DMA transfer, filling the EPBx buffer’s FIFO.
3. The external device begins reading data from the EPBx buffer.
4. The EPBx buffer detects that there is room in the buffer (it is now “partially empty”) and asserts another internal DMA request to the I/O processor, continuing the process.

If the internal DMA transfers cannot fill the EPBx FIFO buffer at the same rate as the external device empties it, the DSP holds off the external device with the ACK signal (for synchronous accesses) or with the REDY signal (for asynchronous, host-driven accesses) until valid data can be transferred to the EPBx buffer.

The DSP only deasserts the ACK (or REDY) signal when the EPBx FIFO buffer (or pad data buffer) is full during a write. The ACK (or REDY) signal remains asserted at the end of a completed block transfer if the EPBx buffer is not full. For reads, the DSP deasserts the ACK (or REDY) signal for each read to handle the latency of the read versus posting the write to a buffer.

In slave mode, the DSP determines the size of the transfer by decoding the read and write (RDH/L and WRH/L) signals in addition to the channel’s PMODE bits. Table 6-2 on page 6-18 shows the packing mode selected by the PMODE bits, and Table 6-7 shows the transfer size in slave mode that results from the combination of the read and write signals and PMODE bits.

Table 6-7.
Slave Mode Transfer Size Determination

Transfer Size (external <-> internal)   PMODE   Notes
64-bit <-> 64-bit                       000
32-bit <-> 32-bit                       000
32-bit [1] <-> 32/64-bit                100
32-bit <-> 48-bit                       011     Not supported
16-bit <-> 48-bit                       010     Not supported
16-bit [2] <-> 32/64-bit                001

The table also has RDH, RDL, WRH, WRL, and DTYPE rows.

[1] External device must be connected to the upper half of the data bus (Data[63:32]).
[2] External device must be connected to Data[47:32].

The DSP does not support 48-bit accesses to internal memory in slave mode.

Handshake Mode

When the MASTER and EXTERN bits are cleared (=0) and the HSHAKE bit is set (=1) in the channel’s DMACx register, the DMA channel is in Handshake mode. A channel in this mode cannot independently initiate external memory transfers.

Handshake mode only applies to DMA channels 11 and 12.

To initiate a Handshake mode DMA transfer, an external device must assert an external DMA request, asserting DMAR1 for access to EPB1 or DMAR2 for access to EPB2. The buffers pass these requests to the I/O processor, which prioritizes them with other internal DMA requests. When the external DMA request has the highest priority, the I/O processor asserts an external DMA grant, asserting DMAG1 for EPB1 or DMAG2 for EPB2. The grant signals the external device to read or write the EPBx buffer.

A Handshake mode DMA channel performing internal to external data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a Handshake mode DMA channel does not initiate any internal DMA transfers until the external device writes data to the channel’s EPBx buffer.

The I/O processor does not use the EI or EM registers in Handshake mode DMA.

Other than the DMARx/DMAGx handshake, Handshake mode DMA operations follow almost the same process as slave mode DMA operations.
The exception is that in Handshake mode DMAs from internal to external memory, the external device must load the channel’s ECx register with the number of external bus transfers.

In Handshake mode, the DSP determines the size of the transfer from the channel’s parameter registers and PMODE bits. Table 6-2 on page 6-18 shows the packing mode selected by the PMODE bits, and Table 6-8 shows the transfer size in Handshake mode that results from the combination of the parameter registers and PMODE bits.

Table 6-8. Handshake Mode Transfer Size Determination

Transfer Size (external <-> internal)   PMODE   IIx              IMx   Cx                      ECx                      DTYPE
64-bit <-> 64-bit                       000     64-bit aligned   1     # of 32-bit words [3]   # of 32-bit words [1]    0
64-bit <-> 48-bit                       000     X                X     # of 48-bit words [4]   3/4 * Cx                 1
32-bit <-> 32/64-bit [1]                100     X                X     # of 32-bit words       # of 32-bit words        0
32-bit <-> 48-bit [2]                   011     X                X     # of 48-bit words       # of 32-bit words        0
16-bit <-> 32/64-bit [2]                001     X                X     # of 32-bit words       # of 16-bit words        0
16-bit <-> 48-bit [2]                   001     X                X     # of 48-bit words       # of 16-bit words        0

[1] External device must be connected to the upper half of the data bus (Data[63:32]).
[2] External device must be connected to Data[47:32].
[3] Must be an even number of 32-bit words.
[4] Should be a multiple of 4.

Signal timing for Handshake mode does not delay the DMA operation. The DMARx/DMAGx handshake operates asynchronously at up to the full CLKIN speed of the DSP.

For Handshake mode DMA, the DSP does not assert the MS3-0 memory select lines (address strobes). For information on DMARx/DMAGx handshake timing, see Figure 6-5 on page 6-36.

The I/O processor uses the rising and falling edges of DMARx in the handshake as prompts for DMA operations. The falling edge of DMARx signals the I/O processor to begin a DMA access.
The rising edge of DMARx signals the I/O processor to complete the DMA access.

Figure 6-5. Handshake DMA Timing (Asynchronous Requests)
[Timing diagram showing CLKIN, DMARx, DMAGx, and DATA63-0 for a first and a second DMA request. Annotations: the DMAR rising edge allows the first DMAG to complete; DMAG has a waitstate because DMAR remained asserted in the cycle prior to the DMAG assertion; a bus transition cycle occurs if the DSP is not bus master.]

The following sequence describes the process for requesting access to an EPBx buffer in Handshake mode:

1. The external device asserts the buffer’s DMARx signal, placing an external DMA request for access to the EPBx buffer.
2. The EPBx buffer detects the falling edge of the DMARx signal and passes the external DMA request to the I/O processor, synchronizing the DMA operation with the processor’s system clock.
3. To be recognized in a particular cycle, the DMARx low transition must meet the signal setup time from the DSP’s data sheet. If the transition is slower than the setup time, the signal may not take effect until the following cycle.
4. The I/O processor prioritizes the external DMA request with other internal DMA requests. If the DSP is not already bus master, the DSP arbitrates for the external bus when the external DMA request has the highest priority, unless the EPBx buffer is blocked.
5. If the EPBx buffer is full during a write or empty during a read, the buffer is blocked. The DSP does not begin external bus arbitration until the I/O processor services the EPBx buffer, returning it to the unblocked state (empty for writing or full for reading).
6. The DSP becomes bus master and asserts DMAGx.
7. The DSP keeps DMAGx asserted until the cycle after the external device deasserts DMARx.
By holding DMARx asserted, the external device holds the DSP until the external device is ready to proceed. If the external device does not need to extend the DMA grant cycle, the external device can deassert DMARx immediately (not waiting for DMAGx), provided that the DMARx assertion time meets the timing requirements from the DSP data sheet. The responding DMAGx in this case is a short pulse, and the DSP only uses the external bus for one cycle.

Because the I/O processor has a three-cycle DMA pipeline and a seven-deep external request counter, the DSP can execute DMARx/DMAGx handshake operations at up to the full CLKIN rate of the DSP.

The I/O processor has a three-cycle DMA pipeline, similar to the program sequencer’s fetch–decode–execute instruction pipeline. The I/O processor processes the DMA pipeline in the following stages:

• Recognizes the DMA request and arbitrates internal DMA priority during the DMA fetch cycle.
• Generates the DMA address and arbitrates external bus access during the DMA decode cycle.
• Transfers DMA data during the DMA execute cycle.

Because the I/O processor has a three-cycle DMA pipeline, there is a minimum delay of three cycles before the DSP asserts DMAGx. This delay is in addition to any delay from internal DMA arbitration, so the external device must not assume that the DMA grant can arrive within two cycles, even if higher priority DMA operations are disabled and the external bus is available for the transfer.

The I/O processor’s external request counter increments each time the external device asserts DMARx and decrements each time the DSP replies by asserting DMAGx. The external request counter records up to seven requests, so the external device can make up to seven requests before the first one has been serviced.
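
The request counter described above behaves like a bounded up/down counter. The Python sketch below is a minimal model of that bookkeeping, useful mainly for reasoning about how far an external device may run ahead of the grants. The class and method names are hypothetical, and raising an exception on an eighth outstanding request is purely a modeling choice; it does not represent what the hardware does in that situation.

```python
MAX_OUTSTANDING_REQUESTS = 7  # seven-deep external request counter, per the text


class ExternalRequestCounter:
    """Toy model of the DMARx/DMAGx external request counter."""

    def __init__(self):
        self.count = 0

    def dmar_asserted(self):
        """External device asserts DMARx: one more outstanding request."""
        if self.count >= MAX_OUTSTANDING_REQUESTS:
            # Modeling choice only: flag the overrun instead of imitating hardware.
            raise OverflowError("more than seven requests without a grant")
        self.count += 1

    def dmag_asserted(self):
        """DSP asserts DMAGx: one outstanding request is serviced."""
        if self.count == 0:
            raise RuntimeError("no outstanding request to grant")
        self.count -= 1


counter = ExternalRequestCounter()
for _ in range(7):
    counter.dmar_asserted()   # seven requests may be queued before any grant
counter.dmag_asserted()       # first grant arrives
counter.dmar_asserted()       # so one more request can be queued
```

An external device that tracks the same count on its side can pace its requests so that the limit of seven is never exceeded.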
If the DSP cannot immediately service the DMA requests in the external request counter, the DSP services the requests on a prioritized basis. The external DMA device is responsible for keeping track of requests, monitoring grants, and pipelining the data when operating at full speed.

If the external device makes more than seven DMARx requests without receiving a grant, the delayed grant causes unpredictable results.

The DSP only asserts DMAGx for the number of DMARx requests indicated by the external request counter. If the external device makes more requests than the count indicates, the DSP DMAGx assertions cannot match the number of external device requests. To clear this mismatch, programs can clear the buffer and the external request counter using the flush bit (FLSH) in the channel’s DMACx control register.

To prevent holding off the DSP, the external device must service the DSP’s data requirements when the DSP asserts the DMAGx grant signal. The external device should immediately supply data for writes to the DSP or immediately accept data on reads from the DSP. External interfaces can handle this I/O by placing the data in an external FIFO.

When performing DMA operations at the full CLKIN speed of the DSP, the system may need a three-deep external FIFO to handle the latency between request and grant. Programs on the external device can optimize operation of this FIFO by issuing three requests rapidly and making the next requests conditional on when the DSP issues a grant.

The external device must follow the conditions in Figure 6-6 on page 6-40 when enabling or disabling Handshake mode for an external port DMA channel. The DSP ignores a disabled (transitioning from disabled to enabled) DMA channel’s DMARx and DMAGx pins and ignores internal DMARx assertions for up to two processor core clock cycles after the instruction that enables the channel in Handshake mode.
• The external device must keep DMARx deasserted (held high, not low or changing) during the instruction that enables DMA in Handshake mode. Before using the channel for the first time, programs flush the DMA channel, asserting the FLSH bit in the DMACx control register. This action is not required during chain insertion.
• The DSP deasserts DMAGx if a program disables the channel while DMARx and DMAGx are asserted (=0). This action clears the channel’s active status bit, avoiding a potential deadlock condition.

Figure 6-6. DMARx Delay After Enabling Handshake DMA
[Timing diagram showing CCLK (core clock), DMARx, and the executing instruction. Annotations: DMAR must be kept high during the instruction that enables DMA by setting DEN=1 and HSHAKE=1 in DMAC11 or DMAC12; DMARx is ignored during the following instructions.]

DSPs in a multiprocessing cluster may share a DMAGx signal, because only the bus master drives DMAGx. On the bus slaves, DMAGx is three-stated. This state eliminates the need for external gating if more than one DSP or the host needs to drive the DMA buffer. Systems may need a pull-up resistor on this line if the host is not connected to the pin or does not drive it when it acquires the bus.

DMAGx has the same timing and transitions as the RDH/L and WRH/L strobes in asynchronous access mode. DMAGx responds to the SBTS and HBR signals in the same way as the read and write strobes. For more information, see “External Port”.

External-Handshake Mode

When the MASTER bit is cleared (=0) and the HSHAKE and EXTERN bits are set (=1) in the channel’s DMACx register, the DMA channel is in External-Handshake mode. A channel in this mode cannot independently initiate external memory transfers.

External-Handshake mode is identical to Handshake mode, except that External-Handshake mode transfers data between external memory and an external device. This section covers the differences between Handshake mode and External-Handshake mode.
For more information, see “Handshake Mode” on page 6-34.

Like Handshake mode, External-Handshake mode only applies to DMA channels 11 and 12.

To initiate an External-Handshake mode DMA transfer, an external device must assert an external DMA request, asserting DMAR1 for access to DMA channel 11 or DMAR2 for access to DMA channel 12. The channels pass these requests to the I/O processor, which prioritizes them with other internal DMA requests. When the external DMA request has the highest priority, the I/O processor asserts an external DMA grant, asserting DMAG1 for channel 11 or DMAG2 for channel 12. The grant signals the external device to read or write the external bus.

An External-Handshake mode DMA channel performing external to external data transfer automatically generates external memory addresses and strobes for transfers between external memory and the external device.

Unlike Handshake mode, the I/O processor must use the EIx, EMx, and ECx registers in External-Handshake mode DMA. Also unlike Handshake mode, the data for DMA channels 11 and 12 does not pass through the EPB1 or EPB2 buffers.

During External-Handshake mode transfers, the I/O processor generates external memory access cycles. DMARx and DMAGx operate the same as in Handshake mode, but the DSP also outputs addresses, MS3-0 memory selects, and RDH/L and WRH/L strobes, and responds to ACK. On external memory writes, the DSP asserts DMAGx until the external device releases the ACK line or any DSP waitstates expire. The external memory responds to the access by the external device as if the DSP processor core were making the access. See “External Port” for information on connecting external devices to the external port.

Because the I/O processor accesses external memory in External-Handshake mode, programs must load the DMA channel’s EIx, EMx, and ECx parameter registers, and the DMACx PMODE bits.
These settings let the I/O processor generate the external memory addresses and word count.

External-Handshake mode does not support chained DMA interrupts. Because no internal DMA transfers occur in External-Handshake mode, the PCI bit in the channel’s CPx register cannot disable the DMA interrupt. Programs must use the IMASK register to mask this interrupt.

In External-Handshake mode, the DSP does not perform packing. The DSP does determine the size of the transfer from the channel’s parameter registers and PMODE bits. Table 6-9 shows the transfer size in External-Handshake mode that results from the combination of the parameter registers and PMODE bits.

Table 6-9. External-Handshake Mode Transfer Size Determination

Transfer Size (memory <-> device)   PMODE   EIx        EMx          ECx        DTYPE
64-bit Memory <-> 64-bit Device     000     X          1 [2]        Even [4]   0
64-bit Memory <-> 64-bit Device     000     constant   0 [3]        Even [4]   0
32-bit Memory <-> 32-bit Device     000     X          Not 0 or 1   X [5]      0
32-bit Memory <-> 32-bit Device     100     X [1]      X            X [5]      0

• For 64-bit transfers, the device must be connected to DATA63-0, and both RDH/L and WRH/L are used.
• For 32-bit transfers, the device may be connected to either DATA63-32 or DATA31-0, and the corresponding RDH/L or WRH/L is used depending on the address.
• X indicates any legal value.

[1] For packed 32-bit transfers (PMODE=100), all ADDR bits are decoded.
[2] For EMx=1, EIx is incremented by 2.
[3] For EMx=0, EIx remains constant.
[4] For 64-bit transfers, ECx is decremented by 2.
[5] For 32-bit transfers, ECx is decremented by 1.

Setting I/O Processor—LPort Modes

The SYSCON, LCOM, LAR, and LCTLx registers control the link ports’ operating modes for the I/O processor. Table A-17 on page A-45 lists all the bits in SYSCON, Table A-24 on page A-65 lists all the bits in LCOM, Table A-25 on page A-67 lists all the bits in LAR, and Table A-23 on page A-63 lists all the bits in LCTLx.
This section contains:

• “Link Port Buffer Modes” on page 6-45
• “Link Port Channel Priority Modes” on page 6-45
• “Link Port Channel Transfer Modes” on page 6-48

The following bits control link port I/O processor modes. The control bits in the LCTLx registers have a one-cycle effect latency (take effect on the second cycle after change). Programs should not modify an active DMA channel’s bits in the LCTLx register, other than to disable the channel by clearing the LxDEN bit. For information on verifying a channel’s status with the DMASTAT register, see “Using I/O Processor Status” on page 6-53.

• Link Port DMA Channel Priority Rotation Enable. SYSCON Bit 10 (LDCPR). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among link port DMA channels (channels 4-9).
• Link–External Port DMA Channel Priority Rotation Enable. SYSCON Bit 12 (PRROT). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels (channels 4-9) and external port DMA channels (channels 10-13).
• Link Port Assignment for LBUFx. LAR Bits 2-0, 5-3, 8-6, 11-9, 14-12, and 17-15 (AxLB). Each 3-bit set assigns a link port to the corresponding link buffer (LBUFx) as shown in Table 6-10 on page 6-45.
• Link Buffer Enable. LCTL0 Bits 0, 10, and 20 and LCTL1 Bits 0, 10, and 20 (LxEN). This bit enables (if set, =1) or disables (if cleared, =0) the corresponding link buffer (LBUFx).
• Link Buffer DMA Enable. LCTL0 Bits 1, 11, and 21 and LCTL1 Bits 1, 11, and 21 (LxDEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers for the corresponding link buffer (LBUFx).
• Link Buffer DMA Chaining Enable. LCTL0 Bits 2, 12, and 22 and LCTL1 Bits 2, 12, and 22 (LxCHEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for the corresponding link buffer (LBUFx).
• Link Buffer Transfer Direction.
LCTL0 Bits 3, 13, and 23 and LCTL1 Bits 3, 13, and 23 (LxTRAN). This bit selects the transfer direction (transmit if set, =1; receive if cleared, =0) for the corresponding link buffer (LBUFx).
• Link Buffer Extended Word Size. LCTL0 Bits 4, 14, and 24 and LCTL1 Bits 4, 14, and 24 (LxEXT). This bit selects the transfer extended word size (48-bit if set, =1; 32-bit if cleared, =0) for the corresponding link buffer (LBUFx). Programs must not change a buffer’s LxEXT setting while the buffer is enabled.
• Link Buffer DMA 2-Dimensional. LCTL0 Bits 7, 17, and 27 and LCTL1 Bits 7, 17, and 27 (LxDMA2D). This bit enables (if set, =1) or disables (if cleared, =0) two-dimensional DMA transfers for the corresponding link buffer (LBUFx).

Some other bits in LCTLx set up non-DMA link port features. For information on these features, see “Setting Link Port Modes” on page 8-5.

Link Port Buffer Modes

The AxLB and LxEN bits in the LAR and LCTLx registers assign link ports to link buffers and enable link buffers. Table 6-10 shows how the AxLB bits in the LAR register assign link buffers to link ports.

To enable a link buffer, a program sets the buffer’s LxEN bit in LCTL0 or LCTL1. To disable a link buffer, a program clears the buffer’s LxEN bit in LCTL0 or LCTL1. The bit locations appear on page 6-43.

When the DSP disables the buffer (LxEN transitions from high to low), the DSP clears the corresponding LxSTAT and LxRERR bits.

Table 6-10.
DSP Link Buffer-to-Link Port Assignments (AxLB)

Link Buffers           Link Ports
Buffer#   LAR Bits     0     1     2     3     4     5     NA [1]
0         2-0          000   001   010   011   100   101   111
1         5-3          000   001   010   011   100   101   111
2         8-6          000   001   010   011   100   101   111
3         11-9         000   001   010   011   100   101   111
4         14-12        000   001   010   011   100   101   111
5         17-15        000   001   010   011   100   101   111

[1] NA indicates Not Assigned.

Link Port Channel Priority Modes

The LDCPR and PRROT bits in the SYSCON register select priority levels for the link port buffers in relation to the priority of other link port buffers and the other I/O ports.

The Link Port DMA Channel Priority Rotation Enable (LDCPR) bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among link port DMA channels (channels 4-9). Rotating priority distributes link port DMA channel access to the IO bus.

When channel priority is rotating, the DSP arbitrates IO bus access between contending link port DMA channels, forcing the channels to take turns. When channel priority is fixed, the lower numbered link port DMA channel (4-9) always has priority over the higher numbered channel when contending for IO bus access.

When LDCPR is set (rotating priority), high priority shifts to a new channel after each single-word transfer. The order of channel priority then rotates. After a single-word transfer, the I/O processor rotates priority to the next higher-numbered channel, and so on until all six are serviced. Figure 6-7 illustrates this process, according to the following example:

Figure 6-7. Link Port Channel Rotating Priority Example
[Diagram of the priority order from highest to lowest at Step 2 and Step 3. One transfer occurs on channel 7 (Step 2), rotating channel 7’s priority to the lowest priority slot (Step 3).]

• At reset, link port channels have priority order (from high to low) 4, 5, 6, 7, 8, and 9.
• The link port performs a single transfer on channel 7.
• The I/O processor rotates channel priority, changing it to 8, 9, 4, 5, 6, and 7 (because rotating priority is enabled for this example, LDCPR=1).

Even though the link port channel DMA priority can rotate, the interrupt priorities of all DMA channels are fixed.

When a program uses fixed priority for the link port DMA channels, the I/O processor assigns the highest priority to Channel 4 and the lowest priority to Channel 9. For a list of all fixed priority assignments, see the list of DMA channels in Table 6-1 on page 6-12.

Programs can change the fixed priority order, assigning a different channel to the highest priority. The following example shows how to change the fixed priority sequence of the link port DMA channels:

• Disable all link port DMA channels except the one immediately above the channel that is to have highest priority.
• Select rotating priority (by setting the LDCPR bit).
• Cause at least one transfer to occur on the enabled channel.
• Disable rotating priority and re-enable all of the link port DMA channels.
• The channel immediately after the selected channel now has the highest fixed priority.

Programs can also rotate priority between the link port and external port DMA channels. The Link–External Port DMA Channel Priority Rotation Enable (PRROT) bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels (channels 4-9) and external port DMA channels (channels 10-13). Rotating priority distributes link port and external port DMA channels’ access to the I/O bus.

When channel priority is rotating, the DSP arbitrates IO bus access between contending link port and external port DMA channels, forcing the channel types to take turns.
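
The link port rotation example for LDCPR (Figure 6-7) amounts to rotating a list so that the channel that just transferred moves to the lowest-priority slot. The Python sketch below reproduces only that bookkeeping. The function name is hypothetical, and the sketch says nothing about arbitration timing or about the PRROT rotation between channel types.

```python
def rotate_after_transfer(order, channel):
    """Return the new priority order (highest first) after a transfer.

    The channel that just transferred becomes lowest priority, and the
    channel that followed it in the old order becomes highest.
    """
    idx = order.index(channel)
    return order[idx + 1:] + order[:idx + 1]


reset_order = [4, 5, 6, 7, 8, 9]                   # priority order at reset
after_ch7 = rotate_after_transfer(reset_order, 7)  # -> [8, 9, 4, 5, 6, 7]
```

The same function illustrates the procedure for changing the fixed priority order: forcing one transfer on channel 5 while rotation is enabled leaves channel 6 at the head of the list, which is the order that remains once rotation is disabled again.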
When channel priority is fixed, any link port DMA channel (4-9) always has priority over any external port DMA channel (10-13) when contending for I/O bus access.

Link Port Channel Transfer Modes

The LxDEN, LxCHEN, LxTRAN, LxEXT, and LxDMA2D bits in the LCTLx register enable link port DMA, chained DMA, and two-dimensional DMA and select the transfer direction and format.

The link DMA enable (LxDEN) and link Chained DMA enable (LxCHEN) bits work together to select a link port DMA channel’s transfer mode. Table 6-11 lists the modes.

Table 6-11. Link Port DMA Enable Modes

LxCHEN  LxDEN  DMA Enable Mode Description
0       0      Channel disabled (chaining disabled, DMA disabled)
0       1      Single DMA mode (chaining disabled, DMA enabled)
1       0      Chain insertion mode (chaining enabled, DMA enabled, auto-chaining disabled). For more information, see “Chaining DMA Processes” on page 6-69.
1       1      Chained DMA mode (chaining enabled, DMA enabled, auto-chaining enabled)

Because link ports are bidirectional, the I/O processor uses the link Transmit select (LxTRAN) bit to determine the transfer direction (transmit or receive). Data flows from internal to external memory when in transmit mode. In transmit mode, the I/O processor fills the channel’s LBUFx buffer when the channel’s LxDEN bit is set.

The link Extend Word Size (LxEXT) bit determines how the DMA channel accesses columns of internal memory. If LxEXT is set, the data is 40- or 48-bit words, and the I/O processor makes 3-column internal memory accesses. If LxEXT is cleared, the data is 32-bit words, and the I/O processor makes 2-column internal memory accesses. For more information, see “Memory Organization and Word Size” on page 5-22.

The LxEXT setting for the transfer overrides the Internal Memory Data Width (IMDWx) setting for the internal memory block.

The link buffer’s DMA 2-Dimensional (LxDMA2D) bit enables (if set, =1) or disables (if cleared, =0) two-dimensional DMA transfers for the corresponding link buffer (LBUFx). When LxDMA2D is set, the channel’s DAx and DBx registers control the two-dimensional DMA process. For more information, see “Using Two-Dimensional Link Port DMA” on page 6-83.

A link buffer’s LxDMA2D bit must be cleared (=0) for standard (one-dimensional) DMA operation.

Setting I/O Processor—SPort Modes

The SRCTLx and STCTLx registers control the serial port operating mode for the I/O processor. Table A-28 on page A-74 lists all the bits in SRCTLx and Table A-27 on page A-71 lists all the bits in STCTLx.

This section contains:

• “Serial Port Buffer Modes” on page 6-51
• “Serial Port Channel Priority Modes” on page 6-52
• “Serial Port Channel Transfer Modes” on page 6-52

The following bits control serial port I/O processor modes. The control bits in the SRCTLx and STCTLx registers have a one cycle effect latency (take effect on the second cycle after change). Programs should not modify an active DMA channel’s bits in the SRCTLx or STCTLx registers, other than to disable the channel by clearing the SDEN bit. To change an inactive serial port’s operating mode, programs should clear the serial port’s control register before writing new settings to it. For information on verifying a channel’s status with the DMASTAT register, see “Using I/O Processor Status” on page 6-53.

Some other bits in SRCTLx and STCTLx set up non-DMA serial port features. For information on these features, see “Setting Serial Port Modes” on page 9-6.

• Serial Port Enable. SRCTLx and STCTLx Bit 0 (SPEN). This bit enables (if set, =1) or disables (if cleared, =0) the corresponding serial port.
• Data Type Select.
SRCTLx and STCTLx Bits 2-1 (DTYPE). These bits select the data type formatting for normal and multichannel reception as follows (normal/multichannel=format): 00/x0=Right-justify and zero-fill unused MSBs, 01/x1=Right-justify and sign-extend unused MSBs, 10/0x=Compand using μ-law, 11/1x=Compand using A-law.
• Serial Word Endian Select. SRCTLx and STCTLx Bit 3 (SENDN). This bit selects little endian words (LSB first, if set, =1) or big endian words (MSB first, if cleared, =0).
• Serial Word Length Select. SRCTLx and STCTLx Bits 8-4 (SLEN). These bits select the word length in bits, minus one. Word sizes can be from 3-bit (SLEN=2) to 32-bit (SLEN=31).
• 16-bit to 32-bit Word Packing Enable. SRCTLx and STCTLx Bit 9 (PACK). This bit enables (if set, =1) or disables (if cleared, =0) 16- to 32-bit word packing.
• Serial Port Receive DMA Enable. SRCTLx and STCTLx Bit 18 (SDEN). This bit enables (if set, =1) or disables (if cleared, =0) the serial port’s receive DMA.
• Serial Port Receive DMA Chaining Enable. SRCTLx and STCTLx Bit 19 (SCHEN). This bit enables (if set, =1) or disables (if cleared, =0) serial port DMA chaining.
• Two Dimensional DMA Array Enable. SRCTLx and STCTLx Bit 21 (D2DMA). This bit enables (if set, =1) or disables (if cleared, =0) two-dimensional serial DMA. This bit must be cleared for multichannel operation.

Serial Port Buffer Modes

The SPEN, SENDN, SLEN, and PACK bits in the SRCTLx and STCTLx registers enable the serial port and select the transfer format.

To enable a serial port transmit or receive buffer, a program sets the buffer’s SPEN bit in the SRCTLx or STCTLx register. To disable a serial port transmit or receive buffer, a program clears the buffer’s SPEN bit in the SRCTLx or STCTLx register.

If a serial port buffer is enabled and DMA for that channel is not enabled, the serial port is in single-word, interrupt-driven transfer mode.
For more information, see “Using I/O Processor Status” on page 6-53.

Each serial port buffer allows independent settings for the three transfer format features: bit order, word length, and word packing.

For transferring words to or from little endian devices, the serial port buffers have a Serial Word Endian Select (SENDN) bit. This bit selects little endian words (LSB first, if set, =1) or big endian words (MSB first, if cleared, =0).

The Serial Word Length Select (SLEN) bit field selects the transfer word length in bits, minus one. Word sizes can be from 3-bit (SLEN=2) to 32-bit (SLEN=31).

If the serial word length is 16 bits or smaller, the serial port can pack two of these words into the serial port buffer. The 16-bit to 32-bit Word Packing Enable (PACK) bit can enable this packing because the I/O processor performs 32-bit transfers between the serial port buffers and DSP memory.

In addition to selecting the endian, length, and packing modes for serial port DSP transfers, programs must indicate the type of data in the transfer, using the Data Type (DTYPE) bits. For more information, see “Serial Port Channel Transfer Modes” on page 6-52.

Serial Port Channel Priority Modes

Serial port DMA transfers always take priority over external port or link port DMA transfers. For more information on prioritization operations, see “Managing DMA Channel Priority” on page 6-67.

Serial Port Channel Transfer Modes

The SDEN, SCHEN, DTYPE, and D2DMA bits in the SRCTLx and STCTLx registers enable serial port DMA, chained DMA, and two-dimensional DMA and select the format.

The DMA enable (SDEN) and Chained DMA enable (SCHEN) bits work together to select a serial port DMA channel’s transfer mode. Table 6-12 lists the modes.

Table 6-12.
Serial Port DMA Enable Modes

SCHEN  SDEN  DMA Enable Mode Description
0      0     Channel disabled (chaining disabled, DMA disabled)
0      1     Single DMA mode (chaining disabled, DMA enabled)
1      0     Chain insertion mode (chaining enabled, DMA enabled, auto-chaining disabled). For more information, see “Chaining DMA Processes” on page 6-69.
1      1     Chained DMA mode (chaining enabled, DMA enabled, auto-chaining enabled)

Because each serial port buffer is unidirectional (a dedicated transmit TXx or receive RXx buffer), the I/O processor does not need an indicator to determine the transfer direction (transmit or receive). Data flows from internal memory to external devices through a transmit (TXx) buffer. When transmitting serial data as DMA, the I/O processor fills the channel’s TXx buffer when the channel’s SDEN bit is set.

The serial port buffer’s DMA 2-Dimensional (D2DMA) bit enables (if set, =1) or disables (if cleared, =0) two-dimensional DMA transfers for the corresponding serial port buffer (TXx or RXx). When D2DMA is set, the channel’s DAx and DBx registers control the two-dimensional DMA process. For more information, see “Using Two-Dimensional Serial Port DMA” on page 6-91.

A serial port buffer’s D2DMA bit must be cleared (=0) for standard (one-dimensional) DMA operation.

Using I/O Processor Status

The I/O processor monitors the status of data transfers on DMA channels and indicates status in the DMASTAT, IRPTL, and LIRPTL registers. Table A-9 on page A-18 lists all the bits in IRPTL, Table A-10 on page A-24 lists all the bits in LIRPTL, and a discussion of DMASTAT appears in “DMA Channel Status Register (DMASTAT)” on page A-60.

The I/O processor reports on DMA in progress, DMA complete, or DMA channel not ready status:

• All DMA channels can be active or inactive. If a channel is active (a DMA is in progress on that channel), the I/O processor indicates the active status by setting the channel’s bit in the DMASTAT register.
• When an unchained (single-block) DMA process reaches completion on any DMA channel, the I/O processor generates that DMA channel’s interrupt (sets the DMA channel’s interrupt latch bit in the IRPTL or LIRPTL register). A DMA process is complete when:
  • The count in Cx=0 (for slave mode and handshake modes).
  • The count in ECx=0 (for external handshake mode).
  • The count in Cx=0 and ECx=0 (for master mode and paced master mode).
• When a DMA process in a chained DMA sequence reaches completion (the count in Cx=0) on any DMA channel, the I/O processor generates an interrupt if the PCI bit in the channel’s CPx register is set, except in external-handshake mode. The I/O processor also generates that DMA channel’s interrupt when the last block in a chained DMA reaches completion, whether or not PCI is set.
• When a DMA channel’s buffer is not being used for a DMA process, the I/O processor can generate an interrupt on single-word writes or reads of the buffer. This interrupt service differs slightly for each port. For more information on single-word interrupt-driven transfers, see “External Port Status” on page 6-57, “Link Port Status” on page 6-60, and “Serial Port Status” on page 6-63.

Using the DMA Channel Status Register (DMASTAT), programs can check which DMA channels are performing a DMA or chained DMA. For each channel, the I/O processor sets the channel’s active status bit if DMA for that channel is enabled and a DMA sequence is in progress on that channel. The I/O processor sets the channel’s chaining status bit if a chained DMA sequence is in progress or pending on that channel.

There is a one cycle latency between a change in DMA channel status and the status update in the DMASTAT register.

As an alternative to interrupt-driven DMA, programs can poll the DMASTAT register to determine when a single DMA sequence is done. To poll channel status, programs read DMASTAT.
If both status bits for the channel are cleared, the DMA sequence has completed.

If chaining is enabled on a DMA channel, programs should not use polling to determine channel status. Polling could provide inaccurate information in this case because the next DMA sequence might be under way by the time the polled status is returned.

During interrupt-driven DMA, programs use the interrupt mask bits in the IMASK and LIRPTL registers to selectively mask DMA channel interrupts that the I/O processor latches into the IRPTL and LIRPTL registers. Descriptions of these conditions appear on page 6-53.

The I/O processor only generates a DMA complete interrupt when the channel’s count register decrements to zero as a result of actual DMA transfers. Writing zero to a count register does not generate the interrupt.

A channel interrupt mask in IMASK and LIRPTL masks out DMA complete interrupts for a channel, but other types of interrupt masking are also available. These other types of interrupt masking include:

• By clearing a channel’s PCI bit during chained DMA, programs mask the DMA complete interrupt for DMA processes within a chained DMA sequence.
• By masking the LPISUM interrupt, programs mask out the logical ORing of link port interrupt status.
• By masking the LSRQ interrupt, programs mask out link port service requests to link ports that do not have an assigned link buffer.

These lower levels of interrupt masking let programs limit some of the conditions that can cause DMA channel interrupts.

Each DMA channel has its own interrupt. Although the external port and link port channel access priority can rotate, the interrupt priorities of all DMA channels are fixed.

In DSP systems using I/O processor interrupts, an external device may need to change the DSP’s interrupt mask.
This task presents a challenge because the IMASK register is not memory-mapped and is not directly accessible to external devices through the external port. To let an external device read or write IMASK, programs can set up an interrupt vector routine that performs the access; the VIRPT vector interrupt register may be used for this purpose.

The I/O processor can also generate DMA channel interrupts for I/O port operations that do not use DMA. In this case, the I/O processor generates a DMA interrupt when data becomes available at the receive buffer or when the transmit buffer does not have new data to transmit. Generating DMA interrupts in this fashion lets programs implement interrupt-driven I/O under control of the processor core. Care is needed because multiple interrupts can occur if several I/O ports transmit or receive data in the same cycle. For more information on these types of single-word interrupt-driven transfers, see the external port discussion on page 6-59, link port discussion on page 6-63, or serial port discussion on page 6-65.

External Port Status

The I/O processor monitors the status of data transfers on the external port. DMA channel status for the external port is described in “Using I/O Processor Status” on page 6-53. This section describes external port-specific status features, such as buffer status, buffer control, and single-word interrupt-driven transfers.

Bits in the SYSTAT, SYSCON, and DMACx registers indicate and control status of external port buffers. Table A-20 on page A-51 lists all the bits in SYSTAT, Table A-17 on page A-45 lists all the bits in SYSCON, and Table A-21 on page A-55 lists all the bits in DMACx. The following bits influence external port buffer status:

• Host Packing Status. SYSTAT Bits 15-14 (HPS). These bits indicate the host’s packing status.
• Host Packing Status Flush. SYSCON Bit 8 (HPFLSH). This bit flushes (when set, =1) settings for the direct write FIFO.
• External Port Packing Status. DMACx Bits 4-3 (PS). These bits indicate the corresponding FIFO buffer’s packing status.
• Single-Word Interrupt Enable. DMACx Bit 12 (INTIO). This bit enables (if set, =1) or disables (if cleared, =0) single-word, non-DMA, interrupt-driven transfers for the corresponding external port FIFO buffer (EPBx). To avoid spurious interrupts, programs must not change a buffer’s INTIO setting while the buffer is enabled.
• Flush DMA Buffers and Status. DMACx Bit 14 (FLSH). This bit flushes (when set, =1) settings for the corresponding external port FIFO buffer (EPBx).
• External Port FIFO Buffer Status. DMACx Bits 17-16 (FS). These bits indicate the corresponding FIFO buffer’s status.

The HPS and PS bits in the SYSTAT and DMACx registers indicate an external buffer’s packing status. These bits are read-only, and the DSP clears these bits when DEN is cleared (changes from 1 to 0). Table 6-13 shows the available settings.

Table 6-13. DSP (PS) and Host (HPS) Packing Status

PS or HPS  Packing Status
00         pack complete (reset value)
01         1st stage pack/unpack
10         2nd stage multi-stage pack/unpack
11         reserved

The FS bits in the DMACx registers indicate an external buffer’s FIFO status. These bits are read-only. The DSP clears these bits when DEN is cleared (changes from 1 to 0). Table 6-14 shows the available settings.

Table 6-14. External Port Buffer FIFO Status

FS  FIFO Buffer Status
00  buffer empty
01  buffer-not-full
10  buffer-not-empty
11  buffer full

For transmit (TRAN=1), buffer-not-full means that the buffer has space for one Normal word, and buffer-not-empty means that the buffer has space for two-or-more Normal words. For receive (TRAN=0), buffer-not-full means that the buffer contains one Normal word, and buffer-not-empty means that the buffer contains two-or-more Normal words.
Any type of full status (01, 10, or 11) in receive mode indicates that new (unread) data is in the buffer.

The HPFLSH and FLSH bits in the SYSCON and DMACx registers flush an external buffer’s packing status, but these bits work differently. The HPFLSH bit flushes (when set, =1) settings for the direct write FIFO. Flushing these settings clears the HPS status bits in the SYSTAT register, clears (=0) the channel’s DMA request counter, and clears (=0) any partially packed words. By comparison, the FLSH bit flushes (when set, =1) settings for the corresponding external port FIFO buffer (EPBx). Flushing these settings does the following:

• Clears (=0) the FS and PS status bits
• Clears (=0) the FIFO buffer and DMA request counter
• Clears (=0) any partially packed words

When a program sets (=1) either HPFLSH or FLSH, the DSP flushes the settings and clears (=0) the flush bit. There is a two-cycle effect latency in completing the flush operation. DSP programs must not set a buffer’s FLSH bit during the same write that enables the buffer. Also, programs must not set a buffer’s FLSH bit while the DMA channel is active. Programs should determine the channel’s active status by reading the corresponding bit in the DMASTAT register.

Status does not change on the master DSP during external port DMA until the external portion is completed (that is, the EPBx buffers are emptied). If in chain insertion mode (DEN=0, CHEN=1), then channel chaining status will never go to 1. Programs should test channel status to see if it is ready before re-writing the chain pointer (CPx).

The INTIO bit in the DMACx registers supports single-word interrupt-driven transfers for each corresponding external port buffer. These non-DMA transfers are available under the following conditions:

• The external port DMA channel’s DEN bit is cleared (DMA disabled).
• The external port DMA channel’s INTIO bit is set (enabling interrupt-driven I/O).
• The external port DMA channel’s buffer is “not empty” on an external read or “not full” on an external write.

Under these conditions, the I/O processor generates that DMA channel’s interrupt on the single-word transfer to that channel’s external port buffer.

Link Port Status

The I/O processor monitors the status of data transfers on the link ports. DMA channel status for the link ports is described in “Using I/O Processor Status” on page 6-53. This section describes link port-specific status features, such as buffer status, buffer control, and single-word interrupt-driven transfers.

Bits in the LCOM and LSRQ registers indicate and control status of link port buffers. Table A-24 on page A-65 lists all the bits in LCOM and Table A-26 on page A-68 lists all the bits in LSRQ. The following bits influence link port buffer status:

• Link Buffer x Status. LCOM Bits 1-0, 3-2, 5-4, 7-6, 9-8, 11-10 (LxSTAT). These bits indicate the corresponding buffer’s status.
• Link Buffer x Receive Pack Error Status. LCOM Bits 26, 27, 28, 29, 30, 31 (LRERRx). These bits indicate the buffer’s packing status.
• Link Port x transmit mask. LSRQ Bits 4, 6, 8, 10, 12, 14 (LxTM). These bits mask (if set, =1) or unmask (if cleared, =0) the L0TRQ through L5TRQ status bits.
• Link Port x receive mask. LSRQ Bits 5, 7, 9, 11, 13, 15 (LxRM). These bits mask (if set, =1) or unmask (if cleared, =0) the L0RRQ through L5RRQ status bits.
• Link Port x transmit request status (read-only). LSRQ Bits 20, 22, 24, 26, 28, 30 (LxTRQ). If set (=1), these bits indicate that a link port (0 through 5) is disabled, but has a request to transmit data.
• Link Port x receive request status (read-only). LSRQ Bits 21, 23, 25, 27, 29, 31 (LxRRQ). If set (=1), these bits indicate that a link port (0 through 5) is disabled, but has a request to receive data.
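Because the LCOM and LSRQ fields listed above sit at regular bit positions, they can be extracted with simple shift-and-mask helpers. The following C sketch assumes the program has already read the register into a 32-bit value; how the register is read is not shown. The helper names are hypothetical, and only the bit positions come from the list above.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_COUNT 6u   /* link buffers and link ports 0 through 5 */

/* LxSTAT: two-bit FIFO status of link buffer x (LCOM bits 2x+1..2x). */
unsigned link_buf_status(uint32_t lcom, unsigned x)
{
    assert(x < LINK_COUNT);
    return (unsigned)((lcom >> (2u * x)) & 0x3u);
}

/* LRERRx: receive pack error status of link buffer x (LCOM bit 26+x). */
unsigned link_rx_pack_error(uint32_t lcom, unsigned x)
{
    assert(x < LINK_COUNT);
    return (unsigned)((lcom >> (26u + x)) & 0x1u);
}

/* LxTRQ: transmit request status of link port x (LSRQ bit 20+2x). */
unsigned link_tx_request(uint32_t lsrq, unsigned x)
{
    assert(x < LINK_COUNT);
    return (unsigned)((lsrq >> (20u + 2u * x)) & 0x1u);
}

/* LxRRQ: receive request status of link port x (LSRQ bit 21+2x). */
unsigned link_rx_request(uint32_t lsrq, unsigned x)
{
    assert(x < LINK_COUNT);
    return (unsigned)((lsrq >> (21u + 2u * x)) & 0x1u);
}

/* Bit masks that select LxTM (LSRQ bit 4+2x) and LxRM (LSRQ bit 5+2x). */
uint32_t link_tx_mask_bit(unsigned x)
{
    assert(x < LINK_COUNT);
    return (uint32_t)1 << (4u + 2u * x);
}

uint32_t link_rx_mask_bit(unsigned x)
{
    assert(x < LINK_COUNT);
    return (uint32_t)1 << (5u + 2u * x);
}
```

A service routine could, for example, loop x from 0 to 5 and test link_tx_request and link_rx_request to find which link port raised a request.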
The LRERRx bits in the LCOM register indicate a link port buffer’s receive packing status. When the buffer is ready to receive and pack a new word, the DSP clears (=0) LRERRx. If this bit remains set (=1) after the buffer receives a word, a link transfer error (for example, a clock glitch) has occurred. These bits are read-only, and the DSP clears these bits when LxEN is cleared (changes from 1 to 0). Table 6-15 shows the available settings.

Table 6-15. Link Port Buffer Receive Packing Status

LRERRx  Receive Packing Status
0       pack complete (reset value)
1       pack not complete

The LxSTAT bits in the LCOM register indicate a link buffer’s FIFO status. When transmitting, these bits indicate when the buffer has space for more data. When receiving, these status bits indicate when the buffer contains new (unread) data. These bits are read-only. The DSP clears these bits when LxEN is cleared (changes from 1 to 0) and empties the buffer. Table 6-16 shows the available settings.

Table 6-16. Link Port Buffer FIFO Status

LxSTAT  FIFO Buffer Status
00      buffer empty
01      reserved
10      one word
11      buffer full

The LAR register lets programs assign link buffers to link ports. Table 6-10 on page 6-45 shows how the AxLB bits in the LAR register assign link buffers to link ports. Because this mapping allows link ports to be unassigned (no buffer), the I/O processor has an interrupt (LSRQI) to notify programs that an external device has made a read or write request on a disabled link port.

When an LSRQI interrupt is latched into the IRPTL register, programs use the transmit (LxTRQ) and receive (LxRRQ) request bits in the LSRQ register to determine which port has a request. The LSRQ register’s bits indicate that:

• For a transmit request (LxTRQ=1), the LSRQI interrupt indicates that the link port (0 through 5) is disabled, but another DSP has requested more data by setting the link port’s acknowledge (LxACK=1).
• For a receive request (LxRRQ=1), the LSRQI interrupt indicates that the link port (0 through 5) is disabled, but another DSP has requested to send data by setting the link port’s clock (LxCLK=1).

To control sources of link port service requests, the I/O processor lets programs mask these service requests. The LSRQ register provides mask bits for transmit (LxTM) and receive (LxRM) link service requests.

The LxEN bits in the LCTLx register support single-word interrupt-driven transfers for each corresponding link port buffer. These non-DMA transfers are available under these conditions:

• The link port DMA channel’s LxDEN bit is cleared (DMA disabled).
• The link port DMA channel’s LxEN bit is set (enabling the link buffer).
• The link port DMA channel’s buffer is “not empty” on receive or “not full” on transmit.

Under these conditions, the I/O processor generates that DMA channel’s interrupt on the single-word transfer to that channel’s link port buffer.

Serial Port Status

The I/O processor monitors the status of data transfers on the serial ports. DMA channel status for the serial ports is described in “Using I/O Processor Status” on page 6-53. This section describes serial port-specific status features, such as buffer status, transmit buffer underflow, receive buffer overflow, and single-word interrupt-driven transfers.

Bits in the STCTLx and SRCTLx registers indicate and control status of serial port buffers. Table A-27 on page A-71 lists all the bits in STCTLx and Table A-28 on page A-74 lists all the bits in SRCTLx. The following bits influence serial port buffer status:

• Transmit Underflow Status (sticky, read-only). STCTLx Bit 29 (TUVF). This bit indicates whether the serial transmit operation has underflowed (if set, =1).
• Transmit Data Buffer Status (read-only). STCTLx Bits 31-30 (TXS). These bits indicate the status of the serial port’s transmit buffer (TXx).
• Receive Overflow Status (sticky, read-only). SRCTLx Bit 29 (ROVF). This bit indicates whether the receive operation has overflowed (if set, =1).
• Receive Data Buffer Status (read-only). SRCTLx Bits 31-30 (RXS). These bits indicate the status of the serial port’s receive buffer (RXx).

The TXS and RXS bits in the STCTLx and SRCTLx registers indicate a serial port transmit (TXx) or receive (RXx) buffer’s FIFO status. Status bits are read-only. Disabling the serial port (setting SPEN=0) clears the status bits and empties the buffer. TXS and RXS may change state if the data is read or written by the processor core while the serial port is disabled. Table 6-17 shows the available settings.

Table 6-17. Serial Port Transmit and Receive Buffer FIFO Status

TXS or RXS  FIFO Buffer Status
00          buffer empty
01          reserved
10          partially full
11          buffer full

The TUVF and ROVF bits in the STCTLx and SRCTLx registers indicate a serial port transmit underflow or receive overflow to the buffer’s FIFO. Status bits are read-only. Disabling the serial port (setting SPEN=0) clears the status bits and empties the buffer. These overflow and underflow bits are sticky; once set, they remain set regardless of buffer status until the serial port is disabled.

The SPEN bit in the STCTLx or SRCTLx register supports single-word interrupt-driven transfers for each corresponding serial port transmit or receive buffer. These non-DMA transfers are available under these conditions:

• The serial port DMA channel’s SDEN bit is cleared (DMA disabled).
• The serial port DMA channel’s SPEN bit is set (enabling the serial port transmit or receive buffer).
• The serial port DMA channel’s buffer is “not empty” on receive or “not full” on transmit.
Under these conditions, the I/O processor generates that DMA channel’s interrupt on the single-word transfer to that channel’s serial port buffer.

DMA Controller Operation

DMA sequences start in different ways depending on whether DMA chaining is enabled. When chaining is not enabled, only the DMA enable bit (DEN) allows DMA transfers to occur.

A DMA sequence starts when one of the following occurs:

• Chaining is disabled and the DMA enable bit (DEN) transitions from low to high.
• Chaining is enabled, DMA is enabled (DEN=1), and the CPx register address field is written with a non-zero value. In this case, TCB chain loading of the channel parameter registers occurs first.
• Chaining is enabled, the CPx register address field is non-zero, and the current DMA sequence finishes. Again, TCB chain loading occurs.

A DMA sequence ends when one of the following occurs:

• The count register decrements to zero (both C and EC for external port channels).
• Chaining is disabled and the channel’s DEN bit transitions from high to low. If the DEN bit goes low (=0) and chaining is enabled, the channel enters chain insertion mode and the DMA sequence continues. For more information, see “Inserting a TCB in an Active Chain” on page 6-73.

When a program sets the DEN bit (=1) after a single DMA finishes, the DMA sequence continues from where it left off (for non-chained operations only).

To start a new DMA sequence after the current one is finished, a program must first clear the DEN enable bit, write new parameters to the II, IM, and C registers, then set the DEN bit to re-enable DMA. For chained DMA operations, these steps are not necessary. For more information, see “Chaining DMA Processes” on page 6-69.

If a DMA operation completes and the count register is rewritten before the DMA enable bit is cleared, the DMA transfer will restart at the new count.
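The restart sequence for a non-chained DMA (clear DEN, write II, IM, and C, then set DEN) can be sketched in C as follows. This is an illustration under stated assumptions rather than device code: the dma_channel_regs structure stands in for one channel’s control and parameter registers, the DMA_DEN bit position is a placeholder, and the function name is hypothetical. A real program would use the register definitions supplied with the development tools.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_DEN (1u << 0)   /* placeholder position for the DEN bit */

/* Hypothetical stand-in for one DMA channel's registers. */
typedef struct {
    volatile uint32_t ctl;  /* channel control register (holds DEN) */
    volatile uint32_t ii;   /* II: internal index (start address) */
    volatile uint32_t im;   /* IM: internal modifier (address stride) */
    volatile uint32_t c;    /* C: internal count (words to transfer) */
} dma_channel_regs;

/* Start a new single (non-chained) DMA sequence on a channel whose
 * previous sequence has finished. */
void dma_restart_single(dma_channel_regs *ch, uint32_t index,
                        uint32_t modifier, uint32_t count)
{
    assert(ch != 0 && count != 0);

    ch->ctl &= ~DMA_DEN;    /* 1. clear DEN before touching parameters */
    ch->ii = index;         /* 2. write the new II, IM, and C values */
    ch->im = modifier;
    ch->c  = count;
    ch->ctl |= DMA_DEN;     /* 3. set DEN to start the new sequence */
}
```

Clearing DEN first matters because, as noted above, rewriting the count register while DEN is still set restarts the transfer at the new count before the other parameters are in place.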
Once a program starts a DMA process, the process is influenced by two external controls: DMA channel priority and DMA chaining. For more information, see “Managing DMA Channel Priority” on page 6-67 or “Chaining DMA Processes” on page 6-69.

Managing DMA Channel Priority

The DMA channels for each of the DSP’s I/O ports negotiate channel priority with the I/O processor using an internal DMA request/grant handshake. Each I/O port (link ports, serial ports, and external ports) has one or more DMA channels, with each channel having a single request and a single grant.

When a particular channel needs to read or write data to internal memory, the channel asserts an internal DMA request. The I/O processor prioritizes the request with all other valid DMA requests. When a channel becomes the highest priority requester, the I/O processor asserts the channel’s internal DMA grant. In the next clock cycle, the DMA transfer starts. Figure 6-8 shows the paths for internal DMA requests within the I/O processor.

If a DMA channel is disabled (DEN, LxDEN, or SDEN bit =0), the I/O processor does not issue internal DMA grants to that channel, whether or not the channel has data to transfer.

Because more than one DMA channel can make a DMA request in a particular cycle, the I/O processor prioritizes DMA channel service. DMA channel prioritization determines which channel can use the IOD (I/O Data) bus to access memory. Default DMA channel priority is fixed prioritization by DMA channel type (serial ports, link ports, or external port). Within the DMA channel types, the serial port DMA channels are always fixed priority, the external port DMA channels may be either fixed or rotated priority, and the link port DMA channels may be either fixed or rotated priority. Table 6-1 on page 6-12 lists the DMA channels in descending order of priority.
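The default fixed prioritization can be modeled as choosing the lowest-numbered channel among the enabled channels that are requesting service. The C sketch below assumes 14 channels in which a lower channel number means a higher priority, and it assumes that channels 0-3 are the serial port channels (the text gives 4-9 for the link ports and 10-13 for the external port). The function name and the bit-per-channel encoding are hypothetical, and rotating priority is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_CHANNELS 14   /* assumed: 0-3 serial, 4-9 link, 10-13 external */

/* Return the channel that receives the internal DMA grant under fixed
 * priority, or -1 if no enabled channel is requesting.
 * Bit n of `requests` is set when channel n asserts a DMA request;
 * bit n of `enabled` is set when channel n's DMA enable bit is set. */
int dma_fixed_grant(uint16_t requests, uint16_t enabled)
{
    /* Only the low 14 bits correspond to DMA channels. */
    assert((requests >> DMA_CHANNELS) == 0);
    assert((enabled >> DMA_CHANNELS) == 0);

    /* A disabled channel never receives a grant, even if it requests. */
    uint16_t valid = (uint16_t)(requests & enabled);

    for (int ch = 0; ch < DMA_CHANNELS; ch++) {
        if (valid & (1u << ch)) {
            return ch;    /* lowest-numbered requester wins */
        }
    }
    return -1;
}
```

With link port channels 4 and 9 both requesting, the grant goes to channel 4; a request from any serial port channel would win over both.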
For information on programming link port or external port priority modes, see “External Port Channel Priority Modes” on page 6-19 or “Link Port Channel Priority Modes” on page 6-45.

The I/O processor determines which DMA channel has the highest priority internal DMA request during every cycle between each data transfer.

[Figure 6-8. I/O Processor Internal Request and Grant Paths. Block diagram labels include the DSP core, internal memory, external port, the I/O address (IOA) and I/O data (IOD) buses, the internal and external DMA prioritizers with their requests and grants, the external port FIFOs (EPB, 8 deep), link port FIFOs (LBUF, 2 deep), serial port FIFOs (RX, TX, 2 deep), and direct write FIFO (8 deep). On asynchronous writes, the slave write FIFO is 4 deep; on synchronous writes, it is 2 deep.]

Internal DMA channel arbitration differs from external bus arbitration. For more information on external bus arbitration, see “Multiprocessor Bus Arbitration” on page 7-98.

Processor core accesses of I/O processor registers, external direct accesses of internal memory, and TCB chain loading are subject to the same prioritization scheme as the DMA channels. Applying this scheme uniformly prevents I/O bus contention, because these accesses are also performed over the internal I/O bus. TCB chain loading has a higher priority than external port accesses. This TCB priority permits chained serial port DMA, even when the external port is attempting an access in every cycle.
For more information, see “Chaining DMA Processes” on page 6-69.

If a DSP has all six link ports enabled and active at the same time, the default priority scheme could hold off external port DMA channels for extended periods of time. Because this hold-off could have a significant negative impact on external bus performance, the I/O processor permits rotating DMA channel priority between the link port channel group and external port channel group. For more information on using the PRROT bit to rotate priority between link ports and the external port, see “Link Port Channel Priority Modes” on page 6-45.

Chaining DMA Processes

DMA chaining lets the I/O processor automatically load DMA parameters and start the next DMA when the current DMA finishes. This feature permits unlimited multiple DMA transfers without processor core intervention. Using chaining, programs can set up multiple DMA operations, and each operation can have different attributes.

When a chain starts, the I/O processor responds by auto-initializing the channel’s parameter registers with the first TCB and starting the first transfer.

To chain together multiple DMA operations, the I/O processor must load the next Transfer Control Block (DMA parameters) into the DMA parameter registers when the current DMA finishes (DMA count =0). The chain pointer register (CPx) points to the next set of DMA parameters, which are stored in internal memory. This process of loading the TCB into the parameter registers is called TCB chain loading. The chain pointer register should be cleared before enabling chaining.

Two controls enable chained DMA. Each DMA channel has a chaining enable bit (CHEN) in the channel’s control register. When set, the CHEN bit directs the I/O processor to use the CPx register for chained DMA. Programs start the chained DMA by writing a non-zero address to the CPx register, directing the I/O processor to start the DMA with TCB chain loading.
Programs can disable chained DMA by writing all zeros to the address field of the CPx register.

Chained DMA operations may only occur within the same channel. The DSP does not support cross-channel chaining.

The CPx register is 19 bits wide, of which the lower 18 bits are the memory address field. Like other I/O processor address registers, the CPx register’s value is offset to match the starting address of internal memory before being used by the I/O processor. On the ADSP-21160 DSP, this offset value is 0x0004 0000.

Bit 18 (the 19th bit) of the CPx register is the Program Controlled Interrupts (PCI) bit. If set, the PCI bit enables a DMA channel interrupt to occur at the completion of the current DMA sequence. (The PCI bit only affects DMA channels that have chaining enabled, CHEN=1.) Also, interrupt requests enabled by the PCI bit are maskable with the IMASK register.

Because the PCI bit is not part of the memory address in the CPx register, programs must be careful when writing and reading addresses to and from the register. To prevent errors, programs should mask out the PCI bit (bit 18, the 19th bit) when copying the address in CPx to another address register.

During chained DMA, the channel’s General Purpose (GP) register is a useful place to point to the last completed DMA sequence. This practice lets programs determine where the last full (or empty) data buffer is located.

Transfer Control Block (TCB) Chain Loading

During TCB chain loading, the I/O processor loads the DMA channel parameter registers with values retrieved from internal memory. The address in the CPx register points to the highest address of the TCB (containing the IIx parameter). The TCB values reside in consecutive memory locations. Table 6-18 shows the TCB-to-register loading sequence for the external port, link port, and serial port DMA channels.
The parameter order in the table is the order that the I/O processor reads each word of the TCB and loads it into the corresponding register. Programs must set up the TCB in memory in the order shown in Table 6-18, placing the IIx parameter at the address pointed to by the CPx register of the previous DMA operation of the chain.

Table 6-18. TCB Chain Loading Sequence

Address¹                   External Port   Link Ports and Serial Ports
CPx + 0x0004 0000          IIx             IIx
CPx – 1 + 0x0004 0000      IMx             IMx
CPx – 2 + 0x0004 0000      Cx              Cx (and DAx for 2D DMA)²
CPx – 3 + 0x0004 0000      CPx             CPx
CPx – 4 + 0x0004 0000      GPx             GPx
CPx – 5 + 0x0004 0000      EIx             DBx (loaded during 2D DMA only)
CPx – 6 + 0x0004 0000      EMx             LPATH1 (mesh multiproc. links only)³
CPx – 7 + 0x0004 0000      ECx             LPATH2 (mesh multiproc. links only)³
CPx – 8 + 0x0004 0000      –               LPATH3 (mesh multiproc. links only)³

1 An “x” denotes the DMA channel number.
2 The DAx and DBx registers are not loaded during chaining in normal, one-dimensional DMA. In 2D DMA operations, only DBx is loaded. The DAx register is automatically loaded with the same value as the Cx register.
3 The link transmit chain also downloads the LPATH1, LPATH2, and LPATH3 registers when the LMSP bit in the LCOM control register is set, enabling mesh multiprocessing.

A TCB chain load request is prioritized like all other DMA operations. The I/O processor latches a TCB loading request and holds it until the load request has the highest priority. If multiple chaining requests are present, the I/O processor services the TCB registers for the highest priority DMA channel first. A channel that is in the process of chain loading cannot be interrupted by a higher priority channel. For a list of DMA channels in priority order, see Table 6-1 on page 6-12. For more information on DMA priority, see “Managing DMA Channel Priority” on page 6-67.
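As a worked illustration of the CPx field layout and the TCB ordering in Table 6-18, the following Python sketch models internal memory as a dictionary. It is a sketch under stated assumptions, not vendor code: the helper names (encode_cp, cp_to_address, place_ep_tcb), the TCB addresses, and all parameter values in the usage lines are hypothetical. Only the 18-bit address field, the PCI bit position, the 0x0004 0000 offset, and the word order come from the text.

```python
INTERNAL_BASE = 0x0004_0000   # start of internal memory (from the text)
ADDR_MASK = 0x3FFFF           # CPx address field, bits 17-0
PCI_BIT = 1 << 18             # Program Controlled Interrupts bit


def encode_cp(tcb_address: int, pci: bool = False) -> int:
    """Build a CPx value: 18-bit address field plus an optional PCI bit."""
    value = tcb_address & ADDR_MASK          # clear the upper bits first
    return (value | PCI_BIT) if pci else value


def cp_to_address(cp_value: int) -> int:
    """Recover the internal memory address that a CPx value points to."""
    return (cp_value & ADDR_MASK) + INTERNAL_BASE


# External port TCB word order, from the CPx address downward (Table 6-18).
EP_TCB_ORDER = ["II", "IM", "C", "CP", "GP", "EI", "EM", "EC"]


def place_ep_tcb(memory: dict, top_address: int, params: dict) -> None:
    """Store one external port TCB so that II sits at top_address."""
    for offset, name in enumerate(EP_TCB_ORDER):
        memory[top_address - offset] = params[name]


# Hypothetical two-TCB chain; every value below is made up for illustration.
memory = {}
first_tcb, second_tcb = 0x0004_1007, 0x0004_1017
place_ep_tcb(memory, second_tcb, dict(II=0x0004_2000, IM=1, C=64, CP=0,
                                      GP=0, EI=0x0080_0000, EM=1, EC=64))
place_ep_tcb(memory, first_tcb, dict(II=0x0004_3000, IM=1, C=64,
                                     CP=encode_cp(second_tcb, pci=True),
                                     GP=0, EI=0x0080_0040, EM=1, EC=64))
```

Because internal memory starts at 0x0004 0000, a full internal address already has bit 18 set. Masking the address before adding the PCI bit, as encode_cp does, keeps the interrupt request an explicit choice rather than a side effect of the address. The second TCB uses CP=0 to end the chain.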
Setting Up and Starting The Chain

To set up and initiate a chain of DMA operations, programs use the following steps:

1. Set up all TCBs in internal memory.
2. Write to the appropriate DMA control register, setting the DEN DMA enable bit to 1 and the CHEN chaining enable bit to 1.
3. Write the address containing the IIx register value of the first TCB to the CPx register, starting the chain (see Figure 6-9).

The I/O processor responds by auto-initializing the channel’s parameter registers with the first TCB and starting the first transfer. When the transfer finishes, the I/O processor begins the next TCB chain load if the current chain pointer address is non-zero. The CPx address points to the next TCB.

The address field of the CPx registers is only 18 bits wide. If a program writes a symbolic address to CPx that sets bit 18, there may be a conflict with the PCI bit. Programs should clear the upper bits of the address, then OR in the PCI bit separately if needed.

[Figure 6-9. CPx Register. Bits 17-0 hold the address field. Bit 18 is the PCI (Program Controlled Interrupt) bit; if this bit is set, the I/O processor generates a DMA interrupt on completion of a chained DMA.]

Inserting a TCB in an Active Chain

It is possible to insert a single DMA operation or another DMA chain within an active DMA chain. Programs may need to perform insertion when a high priority DMA requires service and cannot wait for the current chain to finish.

When DMA on a channel is disabled (DEN=0) and chaining on the channel is enabled (CHEN=1), the DMA channel is in chain insertion mode. This mode lets a program insert a new DMA or DMA chain within the current chain without affecting the current DMA transfer. Programs should use the following sequence to insert a DMA subchain while another chain is active:

1. Enter chain insertion mode by setting CHEN=1 and DEN=0 in the channel’s DMA control register.
2. Wait for the DMA interrupt, which indicates when the current DMA sequence has completed.
3. Write the CPx register value into the CP position of the last TCB in the new chain.
4. Enter chained DMA mode by setting DEN=1 and CHEN=1.
5. Write the start address of the first TCB of the new chain into the CPx register.

Chain insertion mode operates the same as chained DMA mode (DEN=1, CHEN=1), except that when the current DMA transfer ends, automatic chaining is disabled and an interrupt request occurs. This interrupt request is independent of the PCI bit state.

Chain insertion should not be set up as an initial mode of operation. This mode should only be used to insert a DMA within an active DMA chaining operation.

External Port DMA

The DSP supports a number of DMA modes for external port DMA. The following sections provide overviews of typical external port DMA processes:

• “Setting up External Port DMA” on page 6-74
• “Bootloading Through The External Port” on page 6-76

Setting up External Port DMA

The method for setting up and starting an external port DMA sequence varies slightly with the selection of transfer and DMA handshake modes for the channel. For more information on transfer and DMA handshake modes, see “External Port Channel Transfer Modes” on page 6-21 and “External Port Channel Handshake Modes” on page 6-22. For more detailed information on external port DMA features, see “Setting I/O Processor—EPort Modes” on page 6-14.

In general, the following sequence describes a typical external to internal DMA operation where an external device transfers a block of data into the DSP’s internal memory:

1. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and DMACx control register, initializing the channel for receive (TRAN=0).
2.
The DSP or host (depends on mode) sets (=1) the channel’s DEN bit, enabling the DMA process.
3. The external device begins writing data to the EPBx buffer (through the external port).
4. Whether or not the DSP signals for this transfer to begin depends on the mode.
5. The EPBx buffer detects data is present and asserts an internal DMA request to the I/O processor.
6. The I/O processor grants the request and performs the internal DMA transfer, emptying the EPBx buffer FIFO.

In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the DSP’s internal memory:

1. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and DMACx control register, initializing the channel for transmit (TRAN=1).
2. The DSP or host (depends on mode) sets (=1) the channel’s DEN bit, enabling the DMA process. Because this is a transmit, setting DEN automatically asserts an internal DMA request to the I/O processor.
3. The I/O processor grants the request and performs the internal DMA transfer, filling the EPBx buffer’s FIFO.
4. The external device begins reading data from the EPBx buffer (through the external port).
5. Whether or not the DSP signals for this transfer to begin depends on the mode.
6. The EPBx buffer detects that there is room in the buffer (it is now “partially empty”) and asserts another internal DMA request to the I/O processor, continuing the process.

Bootloading Through The External Port

The DSP can boot from an EPROM or host processor through the external port. The DMAC10 control register is specially initialized for booting in each case. Each booting mode packs boot data into 48-bit instructions. EPROM and host boot use channel 10 of the I/O processor’s DMA controller to transfer the instructions to internal memory. For EPROM booting, the DSP reads data from an 8-bit external EPROM.
For host booting, the DSP accepts data from a 16- or 32-bit host microprocessor (or other external device).

After the boot process loads 256 words into memory locations 0x40000 through 0x400FF, the DSP begins executing instructions. Because most DSP programs require more than 256 words of instructions and initialization data, the 256 words typically serve as a loading routine for the application. Analog Devices supplies loading routines (Loader Kernels) that can load entire programs. These routines come with the development tools. For more information on Loader Kernels, see the development tools documentation.

It is important to note that DMA channel differences between the ADSP-21160 DSP and previous SHARC DSPs (ADSP-2106x DSPs) introduce some booting differences. Even with these differences, the ADSP-21160 DSP supports the same boot capability and configuration as the ADSP-2106x DSPs.

DMAC default values differ because the ADSP-21160 DSP has additional parameters and different DMA channel assignments. The EPROM and host boot modes use EPB0, DMA channel 10. In EPROM booting, the alignment of the 8-bit port differs due to the new 64-bit data path. The ADSP-21160 DSP boots from DATA39-32 instead of DATA23-16 as on the ADSP-2106x DSPs.

For EPROM or host booting of the ADSP-21160 DSP, the program sequencer automatically unmasks the DMA channel 10 interrupt, initializing the IMASK register to 0x00008003.

The DSP determines the booting mode at reset from the EBOOT, LBOOT, and BMS pin inputs. When EBOOT=1 and LBOOT=0, the DSP boots from an EPROM through the external port and uses BMS as the memory select output. When EBOOT=0, LBOOT=0, and BMS=1, the DSP boots from a host through the external port. For a list showing how to select different boot modes, see the Boot Memory Select pin description in “Pin Descriptions” on page 11-3.
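The pin decoding described in this chapter can be summarized as a small lookup. The Python sketch below is only an illustration: the function name boot_mode and the returned strings are assumptions for this example. It covers the two external port modes given above plus the link port combination (EBOOT=0, LBOOT=1, BMS=1) given in “Bootloading Through The Link Port”; any other pin combination is left to the pin descriptions rather than guessed.

```python
def boot_mode(eboot: int, lboot: int, bms: int) -> str:
    """Decode the boot mode sampled at reset from EBOOT, LBOOT, and BMS."""
    if eboot == 1 and lboot == 0:
        # BMS is the memory select output in this mode, not an input.
        return "EPROM boot (external port)"
    if eboot == 0 and lboot == 0 and bms == 1:
        return "host boot (external port)"
    if eboot == 0 and lboot == 1 and bms == 1:
        return "link port boot"
    # Remaining combinations are documented with the pin descriptions.
    return "see pin descriptions"


print(boot_mode(eboot=0, lboot=0, bms=1))
```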
When using any of the power-up booting modes, address 0x0004 0004 should not contain a valid instruction since it is not executed during the booting sequence. A NOP or IDLE instruction should be placed at this location.

In EPROM booting through the external port, an 8-bit wide boot EPROM must be connected to data bus pins 39-32 (DATA39-32). The lowest address pins of the DSP should be connected to the EPROM’s address lines. The EPROM’s chip select should be connected to BMS and its output enable should be connected to RDH.

In a multiprocessor system, the BMS output is only driven by the ADSP-21160 DSP bus master. This allows wire-OR’ing of multiple BMS signals for a single common boot EPROM.

Systems can boot any number of ADSP-21160 DSPs from a single EPROM, using the same code for each processor or differing code for each.

During reset, the DSP’s ACK line is internally pulled high with a 2 kΩ equivalent resistor and is held high with an internal keeper latch. It is not necessary to use an external pull-up resistor on the ACK line during booting or at any other time.

When EPROM boot mode is configured, the External Port DMA Channel 10 (DMAC10) becomes active following reset; it is initialized to 0x04A1, which enables external port DMA and selects DTYPE for instruction words. The packing mode bits (PMODE) are ignored, BSO is set in SYSCON, and 8-to-48 bit packing is forced with least-significant-word first.

The UBWS and UBAM fields of the WAIT register are initialized to perform asynchronous access and generate seven wait states (eight cycles total) for the EPROM access in unbanked external memory space. (Note that wait states defined for unbanked memory are applied to BMS-asserted accesses.)

Table 6-19 shows how the DMA Channel 10 parameter registers are initialized at reset for EPROM booting. The count register (C10) is initialized to 0x0100 for transferring 256 words to internal memory.
The external count register (EC10), which is used when external addresses are generated by the DMA controller, is initialized to 0x0600 (that is, 0x0100 words with six bytes per word).

Table 6-19. DMA Channel 10 Parameter Register Initialization For EPROM Booting

Parameter Register   Initialization Value
II10                 0x0004 0000
IM10                 uninitialized (increment by 1 is automatic)
C10                  0x0100 (256 instruction words)
CP10                 uninitialized
GP10                 uninitialized
EI10                 0x0080 0000
EM10                 uninitialized (increment by 1 is automatic)
EC10                 0x0600 (256 words x 6 bytes/word)

At system start-up, when the DSP’s RESET input goes inactive, the following sequence occurs:

1. The DSP goes into an idle state, identical to that caused by the IDLE instruction. The program counter (PC) is set to address 0x0004 0004.
2. The DMA parameter registers for channel 10 are initialized (as shown in Table 6-19).
3. BMS becomes the boot EPROM chip select.
4. 8-bit Master Mode DMA transfers from EPROM to internal memory begin, on the external port data bus lines 39-32.
5. The external address lines (ADDR31-0) start at 0x0080 0000 and increment after each access.
6. The RDH strobe asserts as in a normal memory access, with seven wait states (eight cycles).

The DSP’s DMA controller reads the 8-bit EPROM words, packs them into 48-bit instruction words, and transfers them to internal memory until 256 words have been loaded. The EPROM is automatically selected by the BMS pin; other memory select pins are disabled. The DMA external count register (EC10) decrements after each EPROM transfer. When EC10 reaches zero, the following wake-up sequence occurs:

1. The DMA transfers stop.
2. The External Port DMA Channel 10 interrupt (EP0I) is activated.
3. BMS is deactivated and normal external memory selects are activated.
4. The DSP vectors to the EP0I interrupt vector at 0x0004 0050.
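The 8-to-48 bit packing and the EC10 arithmetic used during EPROM booting can be checked with a short model. The Python sketch below is illustrative only: the function name pack_eprom_bytes and the sample byte values are assumptions, and it reads “least-significant-word first” as meaning that the first byte read from the EPROM fills bits 7-0 of the instruction word.

```python
from typing import Iterable, List

BYTES_PER_WORD = 6       # one 48-bit instruction takes six 8-bit EPROM reads
BOOT_WORDS = 0x0100      # C10 reset value: 256 instruction words


def pack_eprom_bytes(data: Iterable[int]) -> List[int]:
    """Pack 8-bit reads into 48-bit words, least significant byte first."""
    data = list(data)
    if len(data) % BYTES_PER_WORD:
        raise ValueError("byte count must be a multiple of 6")
    words = []
    for start in range(0, len(data), BYTES_PER_WORD):
        word = 0
        for position, byte in enumerate(data[start:start + BYTES_PER_WORD]):
            word |= (byte & 0xFF) << (8 * position)   # byte 0 -> bits 7-0
        words.append(word)
    return words


# EC10 counts EPROM reads: 256 words x 6 bytes per word.
external_count = BOOT_WORDS * BYTES_PER_WORD
print(hex(external_count))

# Hypothetical byte stream for one instruction word.
print(hex(pack_eprom_bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06])[0]))
```

The first print shows the EC10 reset value of 0x600 from Table 6-19; the second shows how six consecutive EPROM bytes land in one 48-bit word under the stated byte-order assumption.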
At this point the DSP has completed its booting mode and is executing instructions normally. The first instruction at the EP0I interrupt vector location, address 0x0004 0050, should be an RTI (Return From Interrupt). This process returns execution to the reset routine at location 0x0004 0005 where normal program execution can resume. After reaching this point, a program can write a different service routine at the EP0I vector location 0x0004 0050.

Link Port DMA

The DSP supports a number of DMA modes for link port DMA. The following sections provide overviews of typical link port DMA processes:

• “Setting up Link Port DMA” on page 6-81
• “Using Two-Dimensional Link Port DMA” on page 6-83
• “Bootloading Through The Link Port” on page 6-87

Setting up Link Port DMA

The method for setting up and starting a link port DMA sequence varies slightly with the transfer mode for the channel. For more information on DMA transfer modes, see “Link Port Channel Transfer Modes” on page 6-48. For more detailed information on link port DMA features, see “Setting I/O Processor—LPort Modes” on page 6-43.

In general, the following sequence describes a typical external to internal DMA operation where an external device transfers a block of data into the DSP’s internal memory using a link port:

1. The DSP or host (depends on mode) assigns the DMA channel’s link buffer to a link port using the channel’s AxLB bits in the LAR register.
2. The DSP or host (depends on mode) enables the DMA channel’s link buffer, setting the buffer’s LxEN bit in the channel’s LCTLx register. The DSP or host selects a word size (32 or 40/48 bits) using the LxEXT bit in the channel’s LCTLx register.
3. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and LCTLx control register, initializing the channel for receive (LxTRAN=0).
4.
The DSP or host (depends on mode) sets (=1) the channel’s LxDEN bit, enabling the DMA process.
5. The external device begins writing data to the LBUFx buffer (through the link port).
6. The LBUFx buffer detects data is present and asserts an internal DMA request to the I/O processor.
7. The I/O processor grants the request and performs the internal DMA transfer, emptying the LBUFx buffer FIFO.

In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the DSP’s internal memory using a link port:

1. The DSP or host (depends on mode) assigns the DMA channel’s link buffer to a link port using the channel’s AxLB bits in the LAR register.
2. The DSP or host (depends on mode) enables the DMA channel’s link buffer, setting the buffer’s LxEN bit in the channel’s LCTLx register. The DSP or host selects a word size (32 or 40/48 bits) using the LxEXT bit in the channel’s LCTLx register.
3. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and LCTLx control register, initializing the channel for transmit (LxTRAN=1).
4. The DSP or host (depends on mode) sets (=1) the channel’s LxDEN bit, enabling the DMA process. Because this is a transmit, setting LxDEN automatically asserts an internal DMA request to the I/O processor.
5. The I/O processor grants the request and performs the internal DMA transfer, filling the LBUFx buffer’s FIFO.
6. The external device begins reading data from the LBUFx buffer (through the link port).
7. The LBUFx buffer detects that there is room in the buffer (it is now “partially empty”) and asserts another internal DMA request to the I/O processor, continuing the process.

Using Two-Dimensional Link Port DMA

Two-dimensional DMA is available on all link port and serial port DMA channels.
This DMA mode lets programs transfer data to or from memory while treating the data as an array. Programs can use this row and column data arrangement in DSP algorithms that perform array operations. In this mode, the DMA channel’s DAx, DBx, and GPx registers direct data placement operations, and the operation of the Cx register changes slightly from single-dimension DMA transfers.

To place a link port or serial port in two-dimensional DMA mode, a program must set the link port channel’s LxDMA2D bit in the LCTLx register or the serial port channel’s D2DMA bit in the SRCTLx or STCTLx register. Figure 6-10 and Figure 6-11 show how registers operate for two-dimensional DMA. These figures also show how the I/O processor places 16- or 32-bit data from a two-dimensional DMA in memory.

In two-dimensional DMA, the I/O processor’s parameter registers operate as follows:

• The Index register (IIx) initially holds the first address in the data array. As the DMA progresses, the I/O processor updates IIx to hold the current address by adding the data array X increment after each transfer.
• The Modify register (IMx) holds the data array X (column) increment. The I/O processor uses IMx to modify the current address (IIx) to point to the next element in the data array X dimension (next column of array, not necessarily the next memory column).
• The Dimension-A (DAx) register holds the data array X initial count (column count). At the beginning of each new row, the I/O processor uses DAx to load the Cx register with the number of columns in the data array. For programming convenience, the I/O processor writes the DAx register automatically whenever the processor core writes a count to the Cx register.

Viewing DSP memory as four 16-bit columns, the 4x4 data array on the right would 2-D DMA into the memory space shown below. The callouts on this diagram indicate parameter register values for the 2-D DMA operation, transferring 16-bit words.
[Figure 6-10 diagram: a 4x4 array with elements 0 through F mapped into memory locations 0 through F, with axes labeled X direction (row) and Y direction (column). Callouts:]

GPx: The General Purpose (GPx) register contains the count of array elements remaining in the Y direction. Here, GPx=4 to start and decrements as each row is processed.
DBx: The Dimension-B (DBx) register contains the offset from the address of the last row element to the element at the start of the next row. Here, DBx=1.
Cx: The Count (Cx) register contains the number of array elements left in the X direction (row). Cx is reloaded from DAx with each new row. Here, Cx starts as 4.
DAx: The Dimension-A (DAx) register contains the number of array elements in the X direction. Here, DAx=4.
IMx: The Modify (IMx) register contains the X direction increment. Here, IMx=1.
IIx: The Index (IIx) register initially contains the first address in the array and is updated to indicate the current address by adding the X increment after each transfer.

Figure 6-10. Two-Dimensional DMA of a 4x4 Array of 16-bit Words

• The Count (Cx) register contains the number of data elements left in the current row. At the beginning of each new row, the I/O processor loads Cx from DAx. When Cx decrements to zero, the I/O processor goes to the next row.

Viewing DSP memory as two 32-bit columns, the 4x4 data array on the right would 2-D DMA into the memory space shown below. The callouts on this diagram indicate parameter register values for the 2-D DMA operation, transferring 32-bit words.

[Figure 6-11 diagram: a 4x4 array with elements 0 through F mapped into memory locations 0 through F. Callouts:]

GPx: The General Purpose (GPx) register contains the count of array elements remaining in the Y direction. Here, GPx=4 to start and decrements as each row is processed.
DBx: The Dimension-B (DBx) register contains the offset from the address of the last row element to the element at the start of the next row. Here, DBx=1.
[Figure 6-11 diagram, continued: axes labeled X direction (row) and Y direction (column). Remaining callouts:]

Cx: The Count (Cx) register contains the number of array elements left in the X direction (row). Cx is reloaded from DAx with each new row. Here, Cx starts as 4.
DAx: The Dimension-A (DAx) register contains the number of array elements in the X direction. Here, DAx=4.
IMx: The Modify (IMx) register contains the X direction increment. Here, IMx=1.
IIx: The Index (IIx) register initially contains the first address in the array and is updated to indicate the current address by adding the X increment after each transfer.

Figure 6-11. Two-Dimensional DMA of a 4x4 Array of 32-bit Words

• The Dimension-B (DBx) register contains the data array Y increment (row increment). To start a new row, the I/O processor adds the offset in DBx to the current address to point to the next element in the Y dimension (first location in next row).
• The General Purpose (GPx) register contains the data array Y Count. Initially, GPx contains the number of data elements in the Y dimension (number of rows). The I/O processor decrements this value each time the X count register reaches zero. When Y Count reaches zero, the two-dimensional DMA is finished. To do one-dimensional DMA transfers in two-dimensional DMA mode, programs must set the Y Count (GPx) to one.

Because a number of parameter registers must update, two DMA cycles are required for a row change.

For two-dimensional DMA transfers, these register operations occur in the following order:

1. During the first DMA cycle, the I/O processor performs the following steps:
a. Outputs the current address from the IIx register and starts a DMA memory cycle
b. Adds the X Increment value from the IMx register to the current address in the IIx register
c. Decrements the X Count in the Cx register
d.
Checks whether the X Count has decremented to zero; if so, performs the second DMA cycle

2. During the second DMA cycle, the I/O processor performs the following steps:
a. Restores the X Count in the Cx register from the DAx register
b. Adds the Y Increment value in the DBx register to the current address in the IIx register
c. Decrements the Y Count in the GPx register
d. Checks whether the Y Count has decremented to zero; if so, the DMA sequence is finished (the channel becomes inactive)

If a program loads the X Count (Cx) register or Y Count (GPx) register with zero, the I/O processor does not disable DMA transfers on that channel. The I/O processor interprets the zero as a request for 2^16 (65,536) transfers. This count occurs because the I/O processor starts the first transfer before testing the count value. The only way to disable a DMA channel is to clear its DMA enable bit.

For more information, see “External Port Channel Transfer Modes” on page 6-21, “Link Port Channel Transfer Modes” on page 6-48, or “Serial Port Channel Transfer Modes” on page 6-52.

Bootloading Through The Link Port

One of the DSP’s booting modes is booting through the link port. Link port booting uses DMA channel 8 of the I/O processor to transfer the instructions to internal memory. In this boot mode, the DSP receives 4-bit wide data in link buffer 4.

After the boot process loads 256 words into memory locations 0x40000 through 0x400FF, the DSP begins executing instructions. Because most DSP programs require more than 256 words of instructions and initialization data, the 256 words typically serve as a loading routine for the application. Analog Devices supplies loading routines (Loader Kernels) that load an entire program through the selected port. These routines come with the development tools. For more information on Loader Kernels, see the development tools documentation.
DMA channel differences between the ADSP-21160 DSP and previous SHARC DSPs (ADSP-2106x DSPs) introduce some booting differences. Even with these differences, the ADSP-21160 DSP supports the same boot capability and configuration as the ADSP-2106x DSPs.

For link booting of the ADSP-21160 DSP, the program sequencer automatically unmasks the DMA channel 8 interrupt, initializing the LIRPTL register to 0x00100000 and the IMASK register to 0x00004003.

The DSP determines the booting mode at reset from the EBOOT, LBOOT, and BMS pin inputs. When EBOOT=0, LBOOT=1, and BMS=1, the DSP boots through the link port. For a list showing how to select different boot modes, see the Boot Memory Select pin description in Table 11-1 on page 11-3.

When using any of the power-up booting modes, address 0x0004 0004 should not contain a valid instruction since it is not executed during the booting sequence. A NOP or IDLE instruction should be placed at this location.

In link port booting, the DSP gets boot data from another DSP’s link port or a 4-bit wide external device after system power-up.

The external device must provide a clock signal to the link port assigned to link buffer 4. The clock can be any frequency, up to a maximum of the DSP clock frequency. The clock’s falling edges strobe the data into the link port. The most significant 4-bit nibble of the 48-bit instruction must be downloaded first.

Table 6-20 shows how the DMA Channel 8 parameter registers are initialized at reset for link port booting. The count register (C8) is initialized to 0x0100 for transferring 256 words to internal memory. The LCTL and LCOM link port control registers are overridden during link port booting to allow link buffer 4 to receive 48-bit data.

Table 6-20.
DMA Channel 8 Parameter Register Initialization For Link Port Booting

Parameter Register   Initialization Value
II8                  0x0004 0000
IM8                  uninitialized (increment by 1 is automatic)
C8                   0x0100 (256 instruction words)
CP8                  uninitialized
GP8                  uninitialized
DA                   uninitialized
DB                   uninitialized

In systems where multiple DSPs are not connected by the parallel external bus, booting can be accomplished from a single source through the link ports. To simultaneously boot all of the DSPs, a parallel common connection should be made to Link Buffer 4 on each of the processors. If only a daisy chain connection exists between the processors’ link ports, then each DSP can boot the next one in turn. Link Buffer 4 must always be used for booting.

Serial Port DMA

The DSP supports a number of DMA modes for serial port DMA. The following sections provide overviews of typical serial port DMA processes:

• “Setting up Serial Port DMA” on page 6-90
• “Using Two-Dimensional Serial Port DMA” on page 6-91

Setting up Serial Port DMA

The method for setting up and starting a serial port DMA sequence varies slightly with the transfer mode for the channel. For more information on DMA transfer modes, see “Serial Port Channel Transfer Modes” on page 6-52. For more detailed information on serial port DMA features, see “Setting I/O Processor—SPort Modes” on page 6-49.

In general, the following sequence describes a typical external to internal DMA operation where an external device transfers a block of data into the DSP’s internal memory using a serial port:

1. The DSP or host (depends on mode) enables the DMA channel’s serial port, setting the port’s SPEN bit in the port’s SRCTLx register. The DSP or host selects a word size using the DTYPE field in the port’s SRCTLx register.
2. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and SRCTLx control register, initializing the channel for receive.
3.
The DSP or host (depends on mode) sets (=1) the channel’s SDEN bit enabling the DMA process. 4. The external device begins writing data to the RXx buffer (through the serial port). 5. The RXx buffer detects data is present and asserts an internal DMA request to the I/O processor. 6. The I/O processor grants the request and performs the internal DMA transfer, emptying the RXx buffer. 6-90 ADSP-21160 SHARC DSP Hardware Reference I/O Processor In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the DSP’s internal memory using a serial port: 1. The DSP or host (depends on mode) enables the DMA channel’s serial port, setting the port’s SPEN bit in the port’s STCTLx register. The DSP or host selects a words size using the DTYPE in the port’s STCTLx register. 2. The DSP or host (depends on mode) writes the DMA channel’s parameter registers (IIx, IMx, and Cx) and STCTLx control register, initializing the channel for transmit. 3. The DSP or host (depends on mode) sets (=1) the channel’s SDEN bit enabling the DMA process. Because this is a transmit, setting SDEN automatically asserts an internal DMA request to the I/O processor. 4. The I/O processor grants the request and performs the internal DMA transfer, filling the TXx buffer. 5. The external device begins reading data from the TXx buffer (through the serial port). 6. The TXx buffer detects that there is room in the buffer (it is now “partially empty) and asserts another internal DMA request to the I/O processor, continuing the process. Using Two-Dimensional Serial Port DMA Two-dimensional DMA is available on all link port and serial port DMA channels. This DMA mode lets programs DMA data in memory treating the data as an array. For more information, see “Using Two-Dimensional Link Port DMA” on page 6-83. 
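The receive sequence in “Setting up Serial Port DMA” can be sketched as a small software model. This is an illustrative Python simulation, not DSP code. The class RxDmaChannel, its fields ii, im, and c (stand-ins for the IIx, IMx, and Cx parameter registers), and the example address 0x50000 are hypothetical. The model assumes that each granted request moves one word, adds the modifier to the index, and decrements the count until it reaches zero.

```python
class RxDmaChannel:
    """Toy model of one serial port receive DMA channel (hypothetical, for illustration)."""

    def __init__(self, ii, im, c):
        self.ii = ii        # index: next internal memory address (models IIx)
        self.im = im        # modifier: address increment per word (models IMx)
        self.c = c          # count: words left to transfer (models Cx)
        self.sden = False   # channel DMA enable (models the SDEN bit)

    def enable(self):
        """Step 3: set SDEN to enable the DMA process."""
        self.sden = True

    def word_received(self, memory, word):
        """Steps 4-6: a word arrives in RXx, a request is raised and granted.

        Returns True if the word was moved into memory, False if the channel
        is disabled or the programmed block is already complete.
        """
        if not self.sden or self.c == 0:
            return False
        memory[self.ii] = word
        self.ii += self.im
        self.c -= 1
        return True


memory = {}
chan = RxDmaChannel(ii=0x50000, im=1, c=4)   # step 2: write the parameter registers
chan.enable()                                # step 3: set SDEN
accepted = [chan.word_received(memory, w) for w in (10, 11, 12, 13, 14)]
```

The fifth word is not transferred because the count has reached zero. What the hardware does with further incoming data at that point is not modeled here.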
Optimizing DMA Throughput

This section discusses overall DMA throughput when several DMA channels are trying to access internal or external memory at the same time. Table 6-21 on page 6-93 summarizes the advantages of different system configurations.

Internal Memory DMA

The DMA channels arbitrate for access to the DSP’s internal memory. The DMA controller determines, on a cycle-by-cycle basis, which channel is allowed access to the internal I/O bus and consequently which channel will read or write to internal memory. The priority order of the DMA channels appears in Table 6-1 on page 6-12.

Each DMA transfer takes one clock cycle even when different DMA channels are being allowed access on sequential cycles; that is, there is no overall throughput loss in switching between channels. Thus, four link port DMA channels, each transferring one byte per cycle, would have the same I/O transfer rate as one external port DMA channel transferring data to internal memory on every cycle. Any combination of link ports, serial ports, and external port transfers has the same maximum transfer rate.

External Memory DMA

When the DMA transfer is between DSP internal memory and external memory, the external memory may have one or more wait states. Figure 6-12 shows an example DMA hardware interface. External memory wait states, however, do not reduce the overall internal DMA transfer rate if other channels have data available to transfer. In other words, the DSP’s internal I/O data bus will not be held up by an incomplete external transfer.

Table 6-21. Configurations for DSP—DSP (ADSP-2116x) DMA

Columns: DSP Configuration (Data Source); DSP Configuration (Data Destination); C/T (note 1); Advantages, Disadvantages

Configuration 1
  Data Source: Bus Master, DMA Master Mode (MASTER=1), TRAN=1, EIx=address of EPBx buffer in destination, EMx=0
  Data Destination: Bus Slave, DMA Slave Mode (MASTER=0), TRAN=0
  C/T: 1
  Advantage: Destination automatically generates interrupt upon completion.
  Disadvantage: DMA must be programmed on both source and destination.

Configuration 2
  Data Source: Bus Master, DMA Master Mode (MASTER=1), TRAN=1, EIx=MMS address in destination (note 2), EMx=1
  Data Destination: Bus Slave, Direct Write
  C/T: 1
  Advantage: No programming required for destination.
  Disadvantage: No interrupt generated upon completion; source must issue vector interrupt to destination.

Configuration 3
  Data Source: Bus Slave, DMA Slave Mode (MASTER=0), TRAN=1
  Data Destination: Bus Master, DMA Master Mode (MASTER=1), TRAN=0, EIx=address of EPBx buffer in source, EMx=0
  C/T: 3 (note 3)
  Advantage: Source automatically generates interrupt upon completion.
  Disadvantages: Slower throughput. DMA must be programmed on both source and destination.

Configuration 4
  Data Source: Bus Slave, Direct Read
  Data Destination: Bus Master, DMA Master Mode (MASTER=1), TRAN=0, EIx=MMS address in source (note 2), EMx=1
  C/T: 4 (note 3)
  Advantage: No programming required for source.
  Disadvantages: Slowest throughput. No interrupt generated upon completion; destination must issue vector interrupt to source.

Notes:
1 C/T is throughput in cycles/transfer.
2 MMS = Multiprocessor Memory Space.
3 Maximum burst throughput: 3-2-2-2, 4-2-2-2.

Figure 6-12. Example DMA Hardware Interface

Notes on Figure 6-12:

• Because DMARx and DMAGx are tied together, only one of the DSPs may have DMA enabled at a time.
• DMAGx is only driven by the DSP bus master.
• The DMA Write Grant signal can be the combination of RDH/L and MS3-0 instead of DMAG2 if paced master mode is used.
• The DMA Read Grant signal can be the combination of WRH/L and MS3-0 instead of DMAG1 if paced master mode is used.
• DMA transfers may be to either DSP or to external memory (in external handshake mode).

Figure 6-13. DMAR and DMAG Timing

Notes on Figure 6-13:

• DMARx setup times relate to the use of the signal in that cycle by the DSP. DMA requests may be asserted asynchronously to CLKIN.
• DMAGx drives DATA63-0 if the DSP is receiving. DMAGx latches DATA63-0 if the DSP is transmitting.

When data is to be transferred from internal to external memory, the internal memory data is first placed in the external port’s EPBx buffer by the DMA controller; the external memory access is then begun independently. (Likewise for external-to-internal DMA, the internal DMA request will not be made until the external memory data is in the EPBx buffer.) In both cases, the external DMA address generator (the EI and EM parameter registers) maintains the external address until the data transfer is completed.

The internal and external address generators of a DMA channel are decoupled and operate independently.

When EXTERN mode DMA transfers occur between an external device and external memory, no internal resources of the DSP are utilized and internal DMA throughput is not affected.

System-Level Considerations

Slave mode DMA is useful in systems with a host processor because it allows the host to access any DSP internal memory location while limiting the address space the host must recognize to the address space of the DSP’s I/O processor registers. Slave mode DMA is also useful for DSP-to-DSP DMA transfers.
Slave mode DMA has one drawback when interfacing to a slow host: the external bus is held up during the transfer (whether initiated by the DSP or the host) and no other transactions can proceed. To overcome this, the handshake DMA mode may be used.

In Handshake mode, the host does not have to master the bus in order to make a DMA request, nor does the DSP (in master mode) have to wait on the bus for the transfer to complete. Instead, the host asserts the DMARx pin. When the DSP is ready to make the transfer, it can complete it in one bus cycle. For more information, see “Handshake Mode” on page 6-34.

7 EXTERNAL PORT

The DSP’s external port extends the DSP’s address and data buses off-chip. Using these buses and external control lines, systems can interface the DSP with external memory, 16- or 32-bit host processors, and other DSPs. Because many of the external port operations relate to external memory access or I/O processing, this chapter refers to the memory and I/O processor chapters (“Memory” and “I/O Processor”) frequently.

Overview

This chapter describes connection and timing issues for the external port. The main sections of this chapter describe the interfaces that are available through the external port. These interfaces include:

• “External Memory Interface” on page 7-3
• “Host Processor Interface” on page 7-49
• “Multiprocessor (DSPs) Interface” on page 7-91

Data alignment through the external port is identical for these interfaces. Figure 7-1 on page 7-2 shows the external port’s data alignment.

Figure 7-1. External Port Word Alignment
(Alignments shown across DATA63-0, byte 7 through byte 0, with RDH/WRH and RDL/WRL lanes: 64-bit long word, SIMD, or DMA transfers; 64-bit transfer for 48-bit instruction fetch; 64-bit transfer for 40-bit extended precision; 32-bit normal word, even address; 32-bit normal word, odd address. Restricted DMA, host, and EPROM data alignments: 32-bit packed, 16-bit packed, EPROM.)

Setting External Port Modes

The SYSCON, WAIT, and DMACx registers control the external port operating mode. Table A-17 on page A-45 lists all the bits in SYSCON, Table A-19 on page A-49 lists all the bits in WAIT, and Table A-21 on page A-55 lists all the bits in DMACx.

For information about setting up memory access modes (synchronous versus asynchronous interface), see “Setting Data Access Modes” on page 5-28.

For information on setting DMA through the external port, see “Setting I/O Processor—EPort Modes” on page 6-14. For information on using external port interrupts, see “Using I/O Processor Status” on page 6-53.

There is a 3:1 conflict resolution ratio at the external port interface (three internal buses to one external bus) in addition to the 2:1 or greater clock ratio between the DSP’s internal clock and the external system clock. Systems that fetch instructions or data through the external port must tolerate at least one cycle, and possibly many additional cycles, of latency.

External Memory Interface

In addition to its on-chip SRAM, the DSP provides addressing of up to 4 gigawords of off-chip memory through its external port. This external address space includes multiprocessor memory space (the on-chip memory of all other DSPs connected in a multiprocessor system) as well as external memory space (the region for standard addressing of off-chip memory).

Figure 7-2 shows how the buses and control signals extend off-chip, connecting to external memory. The DSP’s memory control signals permit direct connection to fast static RAM devices. Memory mapped peripherals and slower memories can also connect to the DSP using a user-defined combination of programmable waitstates and hardware acknowledge signals.
External memory can hold instructions and data. The external data bus (DATA63-0) must be 64 bits wide to transfer 48-bit instructions and 40-bit extended-precision floating-point data without data packing. If external memory contains only data or packed instructions for transfer by DMA, the external data bus can be either 16 or 32 bits wide. In a 16- or 32-bit bus system, the DSP’s on-chip I/O processor unpacks incoming data and packs outgoing data. Figure 7-1 shows how the DSP transfers different data word sizes over the external port.

Figure 7-2. ADSP-21160 Processor System

Table 7-1 defines the DSP pins used for interfacing to external memory.

Table 7-1. External Memory Interface Signals

Each entry gives the pin, its type in parentheses, and its function.

ADDR31-0 (I/O/T)
External Bus Address. The DSP outputs addresses for external memory and peripherals on these pins. In a multiprocessor system, the bus master outputs addresses for read/writes of the internal memory or IOP registers of other DSPs. The DSP inputs addresses when a host processor or multiprocessing bus master is reading or writing its internal memory or I/O processor registers.

DATA63-0 (I/O/T)
External Bus Data. The DSP inputs and outputs data and instructions on these pins. 32-bit single-precision floating-point data and 32-bit fixed-point data are transferred over bits 63-32 or 31-0 of the bus. 40-bit extended-precision floating-point data is transferred over bits 63-24 of the bus. 16-bit short word data is transferred over bits 47-32 of the bus. Pull-up resistors on unused DATA pins are not necessary. In asynchronous access mode, read data is sampled by the rising edge of the read strobe (DATA63-32 sampled with RDH, DATA31-0 sampled with RDL). On write operations, the data is driven from the rising edge of CLKIN, before the write strobes are asserted.

MS3-0 (O/T)
Memory Select Lines. These lines are asserted (low) as chip selects for the corresponding banks of external memory. Memory bank size must be defined in the DSP’s system control register (SYSCON). The MS3-0 lines are decoded memory address lines that change at the same time as the other address lines. When no external memory access is occurring, the MS3-0 lines are inactive. In asynchronous access mode, the MSx signal is asserted for the whole access. In synchronous access mode, the MSx signal is only asserted until ACK is sampled asserted. MS0 can be used with the PAGE signal to implement a bank of DRAM memory (Bank 0). In a multiprocessing system, the MS3-0 lines are output by the bus master. Unlike previous SHARC DSPs, strobe assertion for conditional instructions occurs only when the instruction condition code evaluates as true.

CLKOUT (O/T)
Synchronous output clock. Output clock signal at the same rate as CLKIN. Output by the current bus master.

PAGE (O/T)
DRAM Page Boundary. The DSP asserts this pin to signal that an external DRAM page boundary has been crossed. DRAM page size must be defined in the DSP’s memory control register (WAIT). DRAM can only be implemented in external memory Bank 0; the PAGE signal can only be activated for Bank 0 accesses. In a multiprocessing system, PAGE is output by the bus master.

RDH/L (I/O/T)
Read High and Read Low Strobes. RDH indicates that a read of the high word of the data bus (DATA63-32) is in progress. RDL indicates that a read of the low word of the data bus (DATA31-0) is in progress. As a master, the DSP asserts the strobe after the ADDR31-0 and MS3-0 assert, unless the following bus operation is to the same bank or multiprocessor memory and asserts the same strobe. Timing of the deassertion of the strobe depends upon the access mode. In asynchronous access mode, the strobe is deasserted before the rising edge of CLKIN. For an access to a bank in synchronous access mode, the strobe is deasserted on the rising edge of CLKIN. As a slave, the DSP samples this input to determine the type of bus operation, as well as the size and data alignment for the transfer.

WRH/L (I/O/T)
Write High and Write Low Strobes. WRH indicates that a write on the high word of the data bus (DATA63-32) is in progress. WRL indicates that a write on the low word of the data bus (DATA31-0) is in progress. As a master, the DSP asserts the strobe after the ADDR31-0 and MS3-0 assert, unless the following bus operation is to the same bank or multiprocessor memory and asserts the same strobe. Timing of the deassertion of the strobe depends upon the access mode. In asynchronous access mode, the strobe is deasserted before the rising edge of CLKIN. For an access to a bank in synchronous access mode, the strobe is deasserted on the rising edge of CLKIN. As a slave, the DSP samples this input to determine the type of bus operation, as well as the size and data alignment for the transfer.

CIF (O/T)
Core Instruction Fetch. As a master, the DSP asserts (low) this output only when the program sequencer of the DSP is making an off-chip instruction fetch (read). The address generated for this request is a 48-bit instruction pointer. If the instruction fetch is to an address in one of the external memory banks, the MSx output for that bank is also asserted. This output has timing similar to the MS3-0 signals.

BRST (I/O/T)
Burst Transfer. This signal is asserted (high) by a bus master to indicate that the current bus read or write transfers a block of data to contiguous, incrementing, 64-bit aligned addresses, over multiple cycles. Each individual data transfer requires an acknowledgment (ACK assertion) from the slave addressed by the transfer. BRST is asserted as an output by the DSP bus master in the cycle after the first cycle in which ACK is sampled asserted. As a synchronous slave, the DSP samples the BRST input to determine if a burst read transfer is in progress. The DSP slave does not support burst write transfers. When interfacing to SBSRAM gluelessly, this output should be connected to the ADSC input of the SBSRAMs (not ADV).

ACK (I/O/S)
Memory Acknowledge. External devices can deassert ACK (low) to add waitstates to an external memory access (including individual transfers within a burst access). ACK is used by I/O devices, memory controllers, or other peripherals to hold off completion of an external memory access. As a bus master, the DSP samples this input. In asynchronous access mode, ACK is not sampled until the programmed number of waitstates for the access has been counted. For an access to a bank in synchronous access mode, ACK is sampled each CLKIN cycle, even during the programmed waitstate count. The DSP has a keeper latch on its ACK pin that maintains the input at the level it was last driven. ACK must be sampled high by the DSP before it asserts the strobe(s) for a bus operation. Slaves must assert ACK before three-stating this signal. As a slave, the DSP deasserts ACK as an output to add waitstates to a synchronous access of its internal memory or IOP register space.

Type key: I (Input), S (Synchronous), o/d (Open Drain), O (Output), A (Asynchronous), a/d (Active Drive), T (Three-state, when SBTS or HBR is asserted, or when the DSP is a bus slave)

For maximum flexibility when interfacing the DSP to 32-bit wide memory, connect the memory’s data lines to the DSP’s DATA63-32 pins; do not connect the A0 pin. This alignment permits more packing options and supports easier DMA to the external memory. In DMA accesses to such memory, the DMA uses a stride of two.

Figure 7-1 also shows how the DSP stores unpacked data and instructions in the 64-bit wide external memory. The external memory map is organized such that consecutive addresses access adjacent 32-bit memory locations. For off-chip instruction fetches, the program sequencer accesses adjacent 48-bit wide memory locations.

The ADSP-21160 external memory interface differs from previous SHARC DSPs. Compared to previous SHARC DSPs, the interface has added signals that support burst transfers and the 64-bit data bus. The synchronous interface delivers greater performance, while the asynchronous interface remains similar to previous SHARC DSPs.

The external interface provides glueless support for many asynchronous and/or synchronous devices, including other DSPs. The DSP’s burst transfer protocol supports Synchronous Burst SRAMs (SBSRAMs).
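The stride-of-two note for 32-bit wide memory on DATA63-32 can be illustrated with a short sketch. This is a Python model under stated assumptions, not a register-accurate description: the helper names dma_addresses and memory_location are hypothetical, the start address is an arbitrary example, and the model assumes the memory’s lowest address input is wired to ADDR1 because the least significant address bit is left unconnected.

```python
def dma_addresses(ei, em, count):
    """External addresses produced from a start index (EIx) and a modifier (EMx)."""
    return [ei + n * em for n in range(count)]


def memory_location(dsp_addr):
    """Location seen by the 32-bit memory when ADDR0 is unconnected (assumed wiring)."""
    return dsp_addr >> 1


# Hypothetical transfer: four words starting at the first external address.
addrs = dma_addresses(ei=0x00800000, em=2, count=4)
locations = [memory_location(a) for a in addrs]
```

With a modifier of two, successive DMA accesses land on consecutive locations of the 32-bit memory, which is the effect the stride-of-two recommendation is after.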
Because the memory sub-system uses a 64-bit wide data bus, the DSP has high and low read and write strobes (RDH, RDL, WRH, WRL) to mask and enable 32-bit normal word lanes on the DATA63-0 bus. Note that the least significant bit, ADDR0, of the ADDR31-0 bus may be disregarded during DSP external memory space accesses of 32-bit locations (CIF deasserted), as this information is redundant with the strobes. For more information on packing modes in which the DSP only uses the RDH and WRH pins for accesses, see Table 6-2 on page 6-7.

Systems require the least significant address bit to support off-chip instruction execution by the core (Core Instruction Fetch, CIF, asserted), DMA packing modes (including EPROM booting), and host-DSP accesses.

External memory can hold both instructions and data. The external memory must support the full width of the data bus (DATA63-0) to achieve maximum performance. If the DSP DAGs generate external accesses to Long word data (including 48-bit instructions or 40-bit Extended Precision Normal word data) or if the DSP accesses external memory while in SIMD mode, the system must implement the full 64-bit external data bus. Also, the system must support the full 64-bit external data bus if the DSP makes burst DMA transfers.

The ADSP-21160 DSP does not support direct data transfers of 48-bit instructions or 40-bit extended precision data to or from external memory. For example:

    dm(0x800100) = r0;        // moves 32 MSBs of r0
    dm(0x800100) = r0 (LW);   // moves 32 bits from r0
                              // and 32 bits from r1

To move instructions or 40-bit extended precision data to or from external memory, programs should use the PX register as an intermediate 64-bit holding register. Also, programs can use the I/O processor to transfer this data through an EPBx FIFO. For example:

    dm(0x800100) = px;        // moves 48 MSBs of px

The ADSP-21160’s external PM address bus is 32 bits wide. The DSP’s DM address, PM address, and I/O processor can address the entire 4-gigaword external memory space. The ADSP-21160’s program sequencer, like previous SHARC DSPs, can only address the low 24 bits of address space.

Banked External Memory

The DSP divides external memory into four equal-size, programmable banks. By mapping peripherals into different banks, systems can accommodate I/O devices with different timing requirements. For information on configuring these memory banks for waitstates and synchronous or asynchronous access modes, see “Setting Data Access Modes” on page 5-28.

On the ADSP-21160 DSP, Bank 0 starts at address 0x0080 0000 in external memory and is followed in order by Banks 1, 2, and 3. When the DSP generates an address located within one of the four banks, the DSP asserts the corresponding memory select line, MS3-0.

The MS3-0 outputs serve as chip selects for memories or other external devices, eliminating the need for external decoding logic. MS0 provides a select line for an optional bank of DRAM memory, when used in combination with the PAGE signal. For more information, see “DRAM Page Boundary Detection” on page 7-15.

The MS3-0 lines are decoded memory address lines that change at the same time as the other address lines. When no external memory access is occurring, the MS3-0 lines are inactive.

Unlike previous SHARC DSPs, strobe assertion for conditional instructions occurs only when the instruction condition code evaluates as true.

Unbanked External Memory

The region of external memory above Banks 0-3 is called unbanked external memory space. No MSx memory select line is asserted for accesses in this address space. For information on configuring this unbanked memory for waitstates and synchronous or asynchronous access modes, see “Setting Data Access Modes” on page 5-28.
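The banked and unbanked address ranges described above can be sketched as a small decode function. This is an illustrative Python model, not a description of the DSP’s decode logic: the function name decode_external_address is hypothetical, and the bank size is passed in as a plain word count (the example value of 0x100000 words is an assumption; how the size is encoded in SYSCON is not modeled).

```python
BANK0_START = 0x00800000  # Bank 0 start address on the ADSP-21160 (from the text)


def decode_external_address(addr, bank_size):
    """Classify an address relative to the four external memory banks.

    Returns 'MS0'..'MS3' for an address inside Banks 0-3, 'unbanked' for an
    address above Bank 3, and None for an address below the start of Bank 0.
    bank_size is the number of words per bank (all four banks are equal size).
    """
    if addr < BANK0_START:
        return None
    bank = (addr - BANK0_START) // bank_size
    return f"MS{bank}" if bank < 4 else "unbanked"


BANK_SIZE = 0x100000  # assumed example bank size, in words
selects = [
    decode_external_address(a, BANK_SIZE)
    for a in (0x00100000, 0x00800000, 0x008FFFFF, 0x00900000, 0x00BFFFFF, 0x00C00000)
]
```

In this model the first address below Bank 0 returns None, the next four fall in Banks 0, 0, 1, and 3, and the last falls in unbanked space, where no MSx line is asserted.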
Boot Memory

Most often, the DSP only asserts the BMS memory select line when the DSP is reading from a boot EPROM. This line allows access to a separate external memory space for booting. Unbanked memory waitstates and mode are applied to BMS-selected accesses. The BMS output is only driven by the DSP bus master. For more information on booting, see “Bootloading Through The External Port” on page 6-76 or “Bootloading Through The Link Port” on page 6-87.

It is also possible to write to boot memory using BMS. For more information, see “Using Boot Memory” on page 5-29.

Idle Cycle

A bus idle cycle is an inactive bus cycle that the DSP automatically generates to avoid data bus driver conflicts. Such a conflict can occur when a device with a long output disable time continues to drive after RDH/L is deasserted while another device begins driving on the following cycle.

Idle cycles are also required to provide time for a slave in one bank to three-state its ACK driver before the slave in the next bank enables its ACK driver in the synchronous access modes.

Figure 7-3 shows idle cycle insertion between a synchronous read and a zero-wait, synchronous write in cycle 3.

Figure 7-3. Idle Cycle Example
(Signals shown over cycles 1-5, with a read operation, an idle cycle, and a write operation: CLKIN, ADDRESS[31:0], MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0], ACK.)

To avoid this conflict, the DSP generates an idle cycle in the following cases.

• On a transition from a read operation to a write operation in the same bank.
• On a transition from one bank or multiprocessor memory ID space to any other bank or multiprocessor slave ID space, independent of access mode.

Unlike previous SHARC DSPs, the ADSP-21160 DSP does not support idle cycle insertion on a page boundary crossing.
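The two idle cycle cases listed above can be expressed as a simple predicate. This is a minimal Python sketch, assuming each bus operation is described by a hypothetical (space, op) pair, where space names a bank or a multiprocessor ID space and op is "read" or "write". The function name and the space labels are illustrative only.

```python
def needs_idle_cycle(prev, nxt):
    """Return True if an idle cycle is generated between two bus operations.

    prev and nxt are (space, op) tuples, for example ("bank0", "read").
    """
    prev_space, prev_op = prev
    next_space, next_op = nxt
    if prev_space != next_space:
        # Transition between banks or multiprocessor ID spaces,
        # independent of access mode.
        return True
    # Same bank: only a read followed by a write needs an idle cycle.
    return prev_op == "read" and next_op == "write"


cases = [
    needs_idle_cycle(("bank0", "read"), ("bank0", "write")),
    needs_idle_cycle(("bank0", "write"), ("bank0", "read")),
    needs_idle_cycle(("bank0", "read"), ("bank0", "read")),
    needs_idle_cycle(("bank0", "read"), ("bank1", "read")),
    needs_idle_cycle(("mms_id2", "write"), ("bank0", "write")),
]
```

Page boundary crossings are deliberately absent from the predicate, since the ADSP-21160 DSP does not insert idle cycles for them.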
Data Hold Cycle

The data hold cycle is another configurable memory access feature for adding cycles, much like waitstates, as discussed in “Setting Data Access Modes” on page 5-28. A hold time cycle is an inactive bus cycle that the DSP automatically generates at the end of a read or write to allow a longer hold time for address and data. The address, data (if a write), and bank select (if in banked external memory) remain unchanged and are driven for one cycle after the read or write strobes are deasserted.

The DSP inserts the data hold cycle only in asynchronous mode and only if the programmed waitstate code is 010–111. Figure 7-4 demonstrates a hold time cycle appended to an asynchronous write access (EBxWS=011).

Figure 7-4. Hold Time Cycle Example
(Signals shown over cycles 1-5, with a write operation followed by a hold time cycle: CLKIN, ADDRESS[31:0], MS[3:0], WRH, WRL, DATA[63:0].)

The ADSP-21160 DSP does not append an Idle cycle after a Hold cycle.

Multiprocessor Memory Space Waitstates and Acknowledge

Multiprocessor memory space uses only the synchronous transfer protocols, using the zero-waitstate access for writes and a minimum 1-waitstate access for reads. Slave DSPs deassert ACK if more access time is required. DMA burst transfers are only defined for direct read access of a DSP slave’s internal memory and reads from the external port buffers (EPBx). For more information, see “Multiprocessor (DSPs) Interface” on page 7-91.

The ADSP-21160 DSP does not support the MMSWS bit from previous SHARC DSPs.

DRAM Page Boundary Detection

Applications with large amounts of data may want to use DRAM for bulk storage. To simplify interfacing to page-mode or static-column DRAMs, the DSP detects page boundary crossings and outputs the PAGE signal to an external DRAM controller. Figure 7-5 shows an example DSP system with DRAM.
Figure 7-5. Example DRAM Interface

Different interfacing methods may be needed in some applications, especially if buffers are needed for the DRAM.

Page boundaries are user-defined. Boundaries must be programmed in the WAIT register. For more information, see “Setting Data Access Modes” on page 5-28.

Automatic page boundary detection is provided by the DSP’s PAGE signal. Systems may only place DRAM memory in bank 0 of external memory; the PAGE signal is only active within bank 0. Programs write the page size for page boundary detection in the PAGSZ field of the WAIT register.

The DSP asserts the PAGE pin when an external access crosses a page boundary and the address is within bank 0. The processor detects a boundary crossing by comparing each address output for bank 0 to the address of the last successful external access, which is stored in the I/O processor ELAST register. If a memory access is aborted (for example, due to a conditional write), the DSP does not assert the PAGE pin and does not update the current page in ELAST. Also, the DSP does not assert the PAGE pin or update the current page if the access is to multiprocessor memory space or to any memory space other than bank 0 of external memory space.

The PAGE pin remains asserted as long as the access is active. PAGE is not asserted if no access is performed. The current page is automatically invalidated and the PAGE pin asserted upon the next external access if: 1) the DSP loses mastership of the external bus to another DSP or to a host processor, or 2) the processor is reset.
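The page boundary detection rules above can be modeled in a few lines. This is an illustrative Python sketch, not a description of the DSP’s internal logic: the class PageDetector is hypothetical, page_size is a plain word count standing in for the PAGSZ field (the value 256 is an arbitrary example), and last_page stands in for the page information the DSP derives from ELAST.

```python
class PageDetector:
    """Toy model of PAGE assertion for bank 0 accesses (hypothetical, for illustration)."""

    def __init__(self, page_size):
        self.page_size = page_size  # words per DRAM page (stands in for PAGSZ)
        self.last_page = None       # page of the last successful bank 0 access; None = invalid

    def access(self, addr, in_bank0=True, aborted=False):
        """Return True if PAGE would be asserted for this access."""
        if aborted or not in_bank0:
            return False            # PAGE not asserted, current page not updated
        page = addr // self.page_size
        crossed = page != self.last_page
        self.last_page = page
        return crossed

    def invalidate(self):
        """Bus mastership lost or processor reset: the next bank 0 access asserts PAGE."""
        self.last_page = None


detector = PageDetector(page_size=256)
page_pin = [
    detector.access(0x800000),                # first access after reset
    detector.access(0x800001),                # same page
    detector.access(0x800100),                # next page
    detector.access(0x800200, aborted=True),  # aborted access, page not updated
    detector.access(0x800101),                # still on the page recorded earlier
]
detector.invalidate()
after_bus_loss = detector.access(0x800102)
```

The aborted access neither asserts PAGE nor changes the recorded page, so the following access to the previously recorded page does not assert PAGE either.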
Programs should not read ELAST in the cycle immediately after it is written, because it may be in the process of updating.

The host bus request pin (HBR) is disabled when the PAGE pin is asserted. Disabling HBR prevents the possibility of the DSP becoming a bus slave through deadlock resolution while the DRAM controller is servicing a page change.

In page-mode DRAM systems, the DSP may need to recover from DRAM page fault conditions, using the Suspend Bus Three-State (SBTS) pin. External devices can assert the DSP’s SBTS input to place the external bus address, data, selects, burst, and strobes in a high-impedance state. This input is sampled by the DSP on the rising edge of CLKIN. The DSP external bus outputs three-state later in the cycle in which SBTS is sampled asserted. If the DSP core attempts to access external memory while SBTS is asserted, the processor halts and the memory access does not complete until SBTS is sampled deasserted.

SBTS should only be used to recover from DRAM page faults or host processor/DSP deadlock conditions. For more information, see “Deadlock Resolution” on page 7-84.

In the case of DRAM page faults, SBTS allows the external DRAM controller to take control of the external bus. SBTS three-states the signals listed in Table 7-2.

Table 7-2. Signals SBTS Three-States

ADDR31-0    RDH/L    BRST     DMAG1
MS3-0       CIF      PAGE     DATA63-0
WRH/L       DMAG2    BMS      CLKOUT

When the DSP uses SBTS for resolving bus deadlock, SBTS operates differently than when a host processor uses SBTS and HBR. For more information, see how the host processor uses SBTS and HBR as discussed at the end of the section “Synchronous Burst Read Transfers” on page 7-65.

When SBTS is asserted, the DSP places the external bus address, data, selects, and strobes in a high-impedance state for the following cycle.
If an external access is underway when SBTS is asserted, the access is held off (as if ACK were deasserted), the bus is three-stated, and the memory access continues in the cycle after the deassertion of SBTS. If SBTS is asserted while no external access is occurring, the external bus pins are three-stated and the DSP continues running until it tries to perform an external access. The DSP then halts. In this case, the memory access begins in the cycle after the deassertion of SBTS.

When SBTS is deasserted, the DSP reasserts the RDH/L, WRH/L, and DMAGx strobes (if they had been asserted prior to SBTS) after the external address has become valid, asserting them at their normal timing within the cycle. The waitstate counter is reset.

SBTS differs from HBR in that SBTS takes effect in the next cycle, even if an external access is occurring but not finished. Systems should only use SBTS when the external access is to a device, such as a DRAM or cache memory, where the access must be held off in order to prepare for the access. Using SBTS at other times, such as during DSP-to-DSP accesses or when DMAGx is asserted, results in incorrect operation.

Timing External Memory Accesses

Memory access timing for external memory space and multiprocessor space is the same. For exact timing specifications, refer to the ADSP-21160 DSP Microcomputer Data Sheet.

The DSP can interface to external memories and memory-mapped peripherals that operate asynchronously with respect to CLKIN. The DSP also supports synchronous external memories and memory-mapped peripherals. Synchronous devices derive all of their bus timing with respect to CLKIN of the DSP. The synchronous interface mode supports DMA burst transfers, which can significantly improve bus throughput for large, contiguous block transfers. The synchronous interface protocols are compatible with Synchronous Burst SRAMs (SBSRAMs) from a variety of vendors.
In a multiprocessing system, the DSP must be the bus master in order to access external memory.

Asynchronous Mode Interface Timing

Figure 7-6 shows typical timing for an asynchronous read or write of external memory. Here, the CLKIN clock signal appears only to indicate that the access occurs within a single CLKIN cycle. All timing for the master DSP is derived synchronously from CLKIN. The asynchronous slave mode modifies the basic synchronous access to better support slaves whose timing is not derived from CLKIN.

[Figure 7-6. External Memory Asynchronous Access Cycle: timing diagram of CLKIN, ADDRESS[31:0], MS[3:0], RDH/L or WRH/L, DATA[63:0] (write), DATA[63:0] (read), and ACK]

Figure 7-7 shows timing relationships employed by the asynchronous external access mode.

[Figure 7-7. Asynchronous Access Timing Derivation: timing diagram over two CLKIN cycles of ADDRESS[31:0], MS[3:0], BRST, RDH, RDL, WRH, WRL, ACK, and DATA[63:0]]

In this mode, the following occurs.

• The strobes assert and deassert based on timing derived from an internal clock whose frequency is twice that of the core clock. (This differs from synchronous mode where the strobes assert from the same edge.) The trailing edge timing is derived from the rising edge of the internal version of CLKIN.

• The MSx memory select lines are held stable for the entire access. (This differs from the synchronous read and synchronous write, minimum 2-cycle, modes, where the memory select lines are deasserted after the first ACK-ed cycle of the transfer.)

• For read operations, DATA63-32 are sampled by the DSP on the rising edge of RDH. DATA31-0 are sampled on the rising edge of RDL. (This differs from synchronous mode where DATA63-0 are sampled by the internal version of CLKIN.)
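The relationship between access size, alignment, strobes, and data lanes can be summarized with a small helper. This Python sketch is an illustration under stated assumptions: the function names are hypothetical, only the 32-bit and 64-bit cases that this chapter spells out are modeled (a 32-bit access to an even address uses the L strobe and to an odd address uses the H strobe), and the same selection is assumed to apply in asynchronous mode. Figure 7-1 remains the authoritative reference for data alignment.

```python
def asserted_strobes(width_bits, word_addr, write=False):
    """Return the set of strobes asserted for one external access.

    word_addr is the address in 32-bit words (hypothetical parameter name).
    """
    prefix = "WR" if write else "RD"
    if width_bits == 64:
        return {prefix + "H", prefix + "L"}  # full bus width uses both strobes
    if width_bits == 32:
        # Odd 32-bit address -> high strobe, even 32-bit address -> low strobe.
        return {prefix + ("H" if word_addr & 1 else "L")}
    raise ValueError("only 32-bit and 64-bit accesses are modeled in this sketch")


def sampled_lanes(strobes):
    """Return the data lanes that go with the given strobes."""
    lanes = []
    if any(s.endswith("H") for s in strobes):
        lanes.append("DATA63-32")  # sampled on the rising edge of RDH for reads
    if any(s.endswith("L") for s in strobes):
        lanes.append("DATA31-0")   # sampled on the rising edge of RDL for reads
    return lanes
```

A slave that supports only half of the data bus would therefore only ever see one of the two strobes in this model.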
Asynchronous Mode Read – Bus Master

DSP bus master reads of external memory, in asynchronous mode, occur with the following sequence of events as shown in Figure 7-6 on page 7-19.

1. The DSP samples ACK synchronously. If ACK is asserted, the DSP drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank. A memory select signal is not deasserted between successive accesses of the same memory bank. If ACK is sampled deasserted, the DSP waits one CLKIN cycle and samples ACK again.

2. The DSP asserts the read strobes. Strobe assertion is determined by the size and alignment of the data transfer. For more information on data alignment, see Figure 7-1 on page 7-2.

3. The DSP checks whether waitstates are needed. If so, the memory select and read strobe remain active for additional cycles. Waitstates are determined by a combination of the state of the external acknowledge signal (ACK) and the internally programmed waitstate count.

4. The DSP deasserts the read strobe(s) in the cycle where no further waitstates are indicated. The data bus (DATA63-0) is sampled on the rising edge of the read strobe(s).

5. If a Hold cycle is programmed for the accessed bank (via the EBxWS parameter of the WAIT register), the address bus and memory selects are held stable for an additional cycle. If initiating another read memory access to the same bank, the DSP drives the address and memory select for that access in the next cycle.

Asynchronous Mode Write – Bus Master

DSP bus master writes to external memory, in asynchronous mode, occur with the following sequence of events as shown in Figure 7-6 on page 7-19.

1. The DSP samples ACK synchronously. If ACK is asserted, the DSP drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. A memory select signal is not deasserted between successive accesses of the same memory bank.
The DSP also drives the write data (DATA63-0). If ACK is sampled deasserted, the DSP waits one CLKIN cycle and samples ACK again.

2. The DSP asserts the write strobes. Strobe assertion is determined by the size and alignment of the data transfer. For more information, see Figure 7-1 on page 7-2.

3. The DSP checks whether waitstates are needed. If so, the memory select and write strobes remain active for additional cycles. Waitstates are determined by the state of the external acknowledge signal (ACK) and the internally programmed waitstate count.

4. The DSP deasserts the write strobes near the end of the cycle where no further waitstates are indicated.

5. The DSP three-states its data outputs, unless the next access is also a write to the same bank, or a Hold cycle is programmed for the accessed bank using the EBxWS parameter of the WAIT register. If a Hold cycle is inserted, the address bus, data bus, and memory selects are held stable for an additional cycle. If initiating another memory access to the same bank, the DSP drives the address and memory select for the next access in the following cycle.

Synchronous Mode Interface Timing

Any slave addressed by a DSP in a bank configured for synchronous transfer mode must use a clock with the same frequency and phase characteristics as the clock that drives CLKIN on the DSP. The slave samples all inputs and drives all outputs on the rising edge of this clock. Except for zero-waitstate writes, the slave must assert ACK at least twice for each access: once to acknowledge the address/command (strobe assertion) and once (if not a burst) or more to acknowledge the data transfer.

The following notes apply to all synchronous access modes:

• A slave recognizes the start of a valid bus operation by synchronously sampling one or more of the strobes asserted together with ACK asserted, provided that ACK is not being asserted by this slave itself (which would indicate the end of a transfer).
• For each of the non-burst, synchronous read/write accesses (except zero-waitstate writes), the master recognizes the end of the access as the cycle in which 1) the slave samples or drives data in response to a valid operation driven by the master (read or write), 2) the slave asserts ACK to the master (except for zero-waitstate write operations), and 3) the number of waitstates for read or write access to that bank has occurred. Asserting ACK does not terminate the wait count early.

• The program must select a number of waitstates that is consistent with the access time for the slave addressed by that external memory bank.

• For zero-waitstate writes, the access can only be extended beyond one clock cycle by deasserting ACK in the cycle of the transfer. This extension can occur on back-to-back writes in which ACK is deasserted due to full write buffer capacity from the previous write, or slaves can asynchronously deassert ACK in the first cycle.

• Deasserting ACK during the initial command phase inhibits waitstate counting and any change of bus signals. After the first ACK assertion, deasserting ACK for the data phase does not inhibit waitstate counting.

• Only one slave (or driver for ACK) should be allocated per external memory bank. More than one slave may introduce ACK drive contention.

• The read/write strobes for an access do not assert until ACK is sampled asserted. This conditional strobe assertion delays the start of an access until ACK is asserted by the previous slave. This sampling is required because the slave target of a single-cycle write operation may have to deassert ACK in the cycle after the bus cycle, to stall further writes to that slave. To provide a cycle for the previous slave to three-state its ACK driver before the next slave drives ACK, the next operation to a new bank must not launch on the bus until that cycle has elapsed.
• Write/read access stalls (no state change, other than internal waitstate counting) on the bus if ACK is deasserted in cycle(s) of data transfer.

• The last read/write operation must be ACK-ed before a transition to a new bus master (BTC), bank, or multiprocessor space slave occurs. The master always inserts an Idle cycle on this transition. No pipelining can occur across these boundaries.

Synchronous Mode Read – Bus Master

An example synchronous read cycle appears in Figure 7-8.

[Figure 7-8. Typical Synchronous Read Timing: timing diagram over two CLKIN cycles of ADDRESS31-0, MS3-0, RDH, RDL, WRH, WRL, BRST, DATA63-0, and ACK]

Propagation delays are not shown in this timing diagram. Because a synchronous access requires a rising clock edge for the slave to sample the asserted signals of the master (and for the master to sample the slave), the minimum read access in the synchronous mode is two cycles.

In synchronous access mode, the waitstate selection in the WAIT register (EBxWS) must be 001 or greater. EBxWS=000 is not supported in synchronous access mode.

This example demonstrates a minimum latency, one-waitstate, 32-bit (normal word) read from an even address in external memory. (Had the 32-bit access been to an odd 32-bit address, RDH would have asserted instead of RDL.)

Slaves that do not support the entire 64-bit data bus width do not have to connect to both read strobes. Also, slaves that do not support bursting protocols do not need to connect to the BRST signal.

Bus master synchronous reads from external memory occur with the following sequence of events as shown in Figure 7-8.

1. (cycle 1 in Figure 7-8 on page 7-25) If ACK is sampled as asserted at the beginning of cycle 1, the DSP drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank. The DSP asserts the RDH/RDL strobes to indicate the size and alignment of the requested data.
The read strobes are not deasserted between successive read accesses of the same memory bank. If the size or alignment changes, strobe assertion also changes. Strobe assertion is determined by the size and alignment of the data transfer. For more information on data alignment, see Figure 7-1 on page 7-2.

2. (cycle 2) If ACK was sampled as deasserted at the beginning of the cycle, the MSx signals would remain asserted. If ACK was sampled asserted (as shown in Figure 7-8), the MSx signals would deassert. The slave must be capable of detecting that MSx was asserted in cycle 1 and retain this information internally. If ACK was deasserted by the previous slave (for a single-cycle write), deassertion of the MSx signals is delayed.

3. (cycle 2) The DSP checks whether more than one waitstate is needed. If so, the read strobes remain active for additional cycle(s). Waitstates are determined by a combination of the state of the external acknowledge signal (ACK) and the programmed waitstate count.

4. (end of cycle 2) The data bus (DATA63-0) is sampled on the rising edge of CLKIN.

5. (cycle 3) If initiating another read memory access to the same bank, the DSP drives the address, memory select, and strobes for the next access.

Figure 7-1 on page 7-HIDDEN shows back-to-back reads to the same bank with the second access stalled for one cycle by the slave deasserting ACK. This example assumes that EBxWS=001 for this bank, indicating one internal waitstate.

Synchronous Write, Zero-Waitstate Mode

Figure 7-9 shows typical synchronous write cycle timing. Propagation delays are not shown in this timing diagram. Synchronous access requires a rising clock edge for the slave to sample the asserted signals of the master (and for the master to sample the slave). In the case of writes, the latency can be reduced to a single cycle if the slave always latches the bus signals on each clock cycle (it does not sample ACK).
For example, the slave cannot sample the bus, decode that it is being addressed as a slave, and sample the write data of the bus in the following cycle.

[Figure 7-9. Typical Synchronous Write Example: timing diagram over three CLKIN cycles of ADDRESS[31:0] (write #1, write #2), MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0] (write #1, write #2), and ACK, with ACK stalling the second write]

The slave samples the bus each cycle and decodes the sampled value to determine if that slave was addressed by the write operation. If the slave’s write queue becomes full with that write, the slave deasserts ACK in the cycle after the write operation transferred on the bus. Any subsequent bus operation (read or write) stalls until ACK is sampled asserted, as shown in Figure 7-9.

The example demonstrates a minimum latency, zero-waitstate, 64-bit (long word) write in cycle 1 followed by a write to the same bank that stalls because ACK is deasserted in cycle 2 in response to the write in cycle 1. The second access is a 32-bit write to an odd address in external memory. If the 32-bit access went to an even 32-bit address, WRL would have asserted instead of WRH.

The zero-waitstate write mode provides the highest performance if the slave has sufficient write buffer storage. Systems should use this mode where the slave can always accept one write transfer (unless it has ACK deasserted) and can generally accept more than one write. If the slave has only one store buffer, such that it always deasserts ACK after the first write, the one-waitstate write mode may be the better choice. The zero-waitstate write mode is targeted towards ASIC/FPGA designs, which can likely implement multiple write buffers (including the DSP as a slave), and fully pipelined synchronous devices such as SBSRAMs.

Slaves that do not support the entire 64-bit data bus width do not have to connect to both write strobes.
Also, slaves that do not support bursting protocols do not need to connect to the BRST signal.

Bus master synchronous writes to external memory occur with the following sequence of events as shown in Figure 7-9 on page 7-28.

1. (cycle 1 in Figure 7-9) If ACK is sampled asserted at the start of cycle 1, the DSP bus master drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. The DSP asserts the WRH/WRL strobe(s) to indicate the size and alignment of the requested data. The write strobes are not deasserted between successive write accesses of the same memory bank. If the size or alignment changes, strobe assertion also changes. Strobe assertion is determined by the size and alignment of the data transfer. For more information on data alignment, see Figure 7-1 on page 7-2.

2. (cycle 1) The previous slave three-states ACK. The keeper latch on the DSP master keeps ACK at the asserted value until driven by the next slave. Note that the slave could have driven ACK through cycle 1. Only one slave is supported per bank, and any bank transition has an Idle cycle inserted to provide time for the slave to three-state ACK.

3. (cycle 2) The DSP is initiating another write memory access to the same bank. It drives the address, memory select, and strobes for the next access.

4. (cycle 2) The slave, having decoded that it received a valid write operation in the previous cycle, detects that it cannot accept further bus operations until the write queue (or an element in it) becomes available, so it deasserts ACK.

5. (cycle 3) The DSP samples ACK deasserted by the slave. It inserts waitstates until ACK is sampled asserted. The write ends in the cycle in which ACK is sampled asserted by the slave (end of cycle 3).

Figure 7-10 shows a zero-waitstate write, followed by a synchronous read from the same bank.
The slave addressed by both accesses determines in cycle 2 that it has no more write capacity. It deasserts ACK in this cycle, in response to the write in cycle 1. In cycle 3, the slave determines that it is now addressed by the master to perform a read and asserts ACK to acknowledge the transfer. The slave asserts ACK in cycle 4 when read data is available to complete the data transfer. The memory select for the read access is held asserted by the master until cycle 4, because ACK was deasserted in cycle 2. In this example, both operations use the full data bus width, as indicated by both WRH/L strobes asserted for the write and both RDH/L strobes asserted for the read.

[Figure 7-10. Synchronous Write Followed by Synchronous Read Example: timing diagram over five CLKIN cycles of ADDRESS[31:0] (write address, read address), MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0] (write data, read data), and ACK]

Synchronous Write, One Waitstate Mode

Because some synchronous slaves cannot support a free-running latch function to capture zero-wait bus writes, the DSP also supports a minimum two-cycle (minimum one-waitstate) write access. This mode is set using the bank Access Mode bits (EBxAM). For more information on access modes, see Table A-19 on page A-49.

The one-waitstate, synchronous write access is shown in the second write of Figure 7-11.

[Figure 7-11. Asynchronous Write Followed by Synchronous Write One-Waitstate Mode: timing diagram over four CLKIN cycles (write #1, idle, write #2 to a different bank) of ADDRESS[31:0], MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0], and ACK]

In this example, the first access is to a bank configured for asynchronous writes (cycle 1). In Figure 7-11, this condition is shown by the deassertion of the write strobes before the rising edge of CLKIN for cycle 2. In cycle 2, a bank transition occurs, and an idle cycle is inserted to allow the slaves to transition ownership of ACK.
In cycle 3, the second write begins, to a new bank configured for one-waitstate write access. The address and data are held for a minimum of two cycles. Similar to the synchronous read, the MSx signals deassert in the next cycle (cycle 4), and the waitstate counter decrements if ACK is sampled asserted. The access can be held off the bus by deasserting ACK in cycle 2, or extended by deasserting ACK in cycle 3 (unlikely for a synchronous slave) or cycle 4.

Synchronous Burst Mode Interface Timing

Synchronous burst mode provides improved performance on synchronous operations, read operations in particular. The DSP supports a burst mode that is mastered by DMA only. If the addressed slave supports this burst transfer, then after the one or more waitstates associated with access to the first 64-bit read data transfer, contiguous data can transfer on each subsequent clock cycle, up to a maximum of four 64-bit transfers.

Burst accesses support only 64-bit data transfers. Partial data bus width transfers are not supported.

For burst transfers, the master drives the address of the first access on the bus during the entire burst transfer. The master does not increment the address for the slave. The maximum length of the burst transfer is four, so slaves only need a 2-bit address incrementer to generate the offset address from the address driven by the master on the bus. Burst length determination as a function of initial address is shown in Table 7-3.

Table 7-3. Linear Burst Address Order

First Address[2:1]   Second Address       Third Address        Fourth Address
(external)           (internal)           (internal)           (internal)
00                   01                   10                   11
01                   10                   11                   Burst Terminated (1)
10                   11                   Burst Terminated (1)
11                   Burst Terminated (2)

(1) Master always terminates burst when internal address[2:1] = 11
(2) Master transfers this case as a single synchronous access

If the DMA channel has sufficient data to transfer, it initiates a new burst transfer starting at ADDR2-1=00, 01, or 10 when it wins bus arbitration. Bursts always terminate when ADDR2-1=11.

An example of a synchronous burst read of length three appears in Figure 7-12. Here, the bank employed in the transfer has 2 waitstates.

[Figure 7-12. External Memory Synchronous Burst Read Example: timing diagram over five CLKIN cycles of ADDRESS[31:0] (ADDRESS[2:1] = 01), MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0], and ACK]

Burst Length Determination

The DMA arbitration logic amortizes the initial access latency by bursting up to the maximum burst length of four when possible, assuming the channel is burst enabled. When a DMA channel wins internal I/O processor arbitration, the channel drives the internal buses as with a non-burst transfer. At the same time, the I/O processor detects whether it can perform a burst transfer, according to the following criteria.

1. The DMAC burst enable (MAXBL1-0) control bit field is set for that DMA channel. For more information on setting up a burst transfer, see the 64-bit External Burst Transfers discussion on page 6-27.

2. The EI register points to a 64-bit aligned address.

3. The EM register is set to 0 or 1. A value of 0 does not increment EI. This feature is useful when bursting to or from a registered data port, buffer, or register, such as the EPBx FIFOs of another DSP.

4. The EC register is >= 4 (four 32-bit words equal two 64-bit transfers).

5. The EPB FIFO for that channel has at least four 32-bit words to transfer for an external burst write or has at least four empty 32-bit elements to receive data for an external burst read.

6. The two least significant bits of the 64-bit DMA channel external address are not both set (ADDR2-1 does not equal 11).

Burst Stall Criteria

If the I/O processor determines that it can perform a burst transfer (according to the burst length criteria), the arbitration between the processor core and I/O processor locks or parks the effective arbitration grant to that DMA channel until one of the following occurs:

1. The DMA channel external ADDR2-1 = 11. By disconnecting the burst on this boundary, a modulo-4 function of ADDR31-1 is effectively implemented, which is required by SBSRAMs and other slaves with limited address incrementing capability. For DSP-based systems, slaves only need a 2-bit counter to support the address incrementing function of the burst.

2. Space in the EPB FIFO drops to less than four 32-bit elements (for an external bus read), or the FIFO holds less than four valid 32-bit elements (for an external bus write). This almost full or almost empty detection is required by the master logic to deassert BRST on the cycle before the end of the burst.

3. EC goes to < 4; the burst pin must negate at EC=2.

4. HBR and SBTS are both asserted on the external bus, indicating the deadlock resolution case in which the DSP must three-state its outputs and switch into slave mode. For more information, see “Deadlock Resolution” on page 7-84. Assertion of either signal alone does not terminate the burst early. HBR assertion does not receive an HBG until the burst finishes. SBTS assertion causes the master to three-state outputs and insert waitstates.

If any of these conditions occur, normal arbitration between the processor core and I/O processor for the external bus occurs.
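The burst length rule of Table 7-3 and the burst start criteria listed above can be expressed compactly. The Python sketch below is illustrative only: the function and parameter names are hypothetical, `ei`, `em`, and `ec` are plain integers standing in for the EI, EM, and EC register values, `ei` is assumed to be an address in 32-bit words so that bit 0 reflects 64-bit alignment and bits 2-1 correspond to ADDR2-1, and `fifo_words` stands for valid words (burst write) or empty elements (burst read) in the EPB FIFO.

```python
def burst_length(addr_2_1):
    """Number of 64-bit transfers when a burst starts at ADDR2-1 = addr_2_1.

    Follows Table 7-3: bursts always end when the internal address[2:1]
    reaches 11, so a start at 11 is a single synchronous access.
    """
    if not 0 <= addr_2_1 <= 3:
        raise ValueError("ADDR2-1 is a 2-bit field")
    return 4 - addr_2_1


def burst_offsets(addr_2_1):
    """Sequence of address[2:1] values visited by the slave's 2-bit incrementer."""
    return list(range(addr_2_1, 4))


def can_start_burst(burst_enabled, ei, em, ec, fifo_words):
    """Check the six burst start criteria (register values passed as integers)."""
    return bool(
        burst_enabled                        # 1. MAXBL1-0 burst enable is set
        and (ei & 1) == 0                    # 2. EI is 64-bit aligned
        and em in (0, 1)                     # 3. EM is 0 or 1
        and ec >= 4                          # 4. at least four 32-bit words remain
        and fifo_words >= 4                  # 5. FIFO has data (write) or room (read)
        and ((ei >> 1) & 0b11) != 0b11       # 6. ADDR2-1 is not 11
    )
```

In this model a burst that starts at ADDR2-1 = 01 visits offsets 01, 10, and 11, which matches the length-three example of Figure 7-12.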
If the same bursting channel wins arbitration again, a new burst is initiated, introducing at least one lost or dead cycle in the burst throughput for reads.

When arbitration occurs, the DMA channel loses arbitration if any of the following conditions are detected:

1. Higher priority external request for the bus:
   a. HBR asserted.
   b. BRx asserted and BMAX time out has occurred.
   c. BRx asserted and PA asserted, but not by this master.

2. Higher priority internal I/O processor requester:
   a. Processor core request (DAGs or program sequencer).
   b. A higher priority request from another DMA channel or direct read/write access causes this channel to lose arbitration. For more information, see “I/O Processor”.

Synchronous Burst Reads

External memory synchronous burst reads occur with the following sequence of events as shown in Figure 7-12 on page 7-34:

1. (cycle 1 in Figure 7-12) If ACK is sampled asserted at the beginning of cycle 1, the DSP drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank.

2. (cycle 1) The DSP asserts both RDH/RDL strobes to indicate a 64-bit read request of the slave.

3. (cycle 2) As with the non-burst synchronous read, the DSP deasserts the MSx output signal, asserts the BRST output signal, and enables waitstate counting if ACK is sampled asserted at the end of cycle 1.

4. (cycle 2) The DSP checks whether more than one waitstate (2 waitstates for this example) is needed. If so, BRST and the read strobes remain active for additional cycle(s).

5. (cycle 3) The slave samples BRST asserted, informing it that the master requests at least one more 64-bit transfer after the current transfer is ACK-ed by the slave.

6. (cycle 3) The programmed number of waitstates (for example, 2) has been counted, and the slave is driving 64 bits of valid data and asserting the ACK signal.
This ends the first access.

7. (cycle 4) The slave drives the next 64 bits of contiguous data and asserts ACK. If the slave needs more time to service any one transfer within the burst, it can deassert ACK to stall the bus transfer.

8. (cycle 4) The slave samples BRST asserted, informing it that the master requests at least one more 64-bit transfer.

9. (cycle 5) The master deasserts BRST to inform the slave that this is the last transfer of the burst. In this example, the master deasserts BRST due to the address modulo-4 function. The two LSBs of the initial 64-bit address = 01. The slave increments the address as 01->10->11, the maximum offset it needs to support from the initial address.

10. (cycle 5) The slave drives valid data for the last transfer, and asserts ACK.

11. (cycle 6) If initiating another burst read memory access to the same bank, the DSP asserts the address, memory select, and strobes for the next access. This introduces at least two dead cycles in the back-to-back burst throughput, because the initial waitstate count applies to the first access of the second burst.

12. (cycle 6) With BRST sampled deasserted, the slave concludes its service of the burst request by three-stating the DATA63-0 and ACK drivers.

As a master, the DSP supports burst reads on each of the four external port DMA channels. Each channel has an independent burst enable control field (MAXBL1-0). For more information on setting up a burst transfer, see the 64-bit External Burst Transfers discussion on page 6-27.

As a slave, the DSP supports read bursts from internal memory or the EPBx buffers (with the EPBx read). For more information, see “Multiprocessor (DSPs) Interface” on page 7-91 and “Host Processor Interface” on page 7-49.
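From the slave's point of view, the burst read sequence above reduces to a 2-bit incrementer that advances for as long as BRST is sampled asserted. The following Python sketch models only that address generation; cycle-level timing, waitstates, and ACK stalls are deliberately left out. The function name and the representation of BRST as a list of per-transfer samples are assumptions made for illustration.

```python
def slave_burst_addresses(initial_addr64, brst_samples):
    """Return the 64-bit word addresses a slave serves during one burst.

    initial_addr64: value of ADDR31-1 driven by the master for the whole burst.
    brst_samples:   BRST value the slave samples along with each transfer;
                    a deasserted sample marks the last transfer.
    """
    base = initial_addr64 & ~0b11    # upper bits stay fixed during the burst
    offset = initial_addr64 & 0b11   # ADDR2-1, the only bits the slave increments
    addresses = []
    for brst in brst_samples:
        addresses.append(base | offset)
        if not brst:
            break                        # master signaled the final transfer
        offset = (offset + 1) & 0b11     # 2-bit incrementer
    return addresses
```

With an initial ADDR2-1 of 01 and BRST sampled asserted for the first two transfers, the slave serves offsets 01, 10, and 11, as in the example sequence.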
Because reads of the EPBx FIFO are destructive, the DSP slave must deassert ACK on each transfer of the burst to guarantee that it samples the deasserted BRST input before committing the EPBx FIFO read. If the system design employs a similar destructive read data buffer, similar precautions should be employed if burst reads of the buffer are supported.

Synchronous Burst Writes

The DSP can master burst read and write operations in the one-waitstate write access mode (EBxAM=10) if one or more DMA channels are configured appropriately. In the zero-waitstate write mode, the DSP can master non-burst writes every cycle, and burst write transfers are not supported.

Synchronous external devices that require at least one cycle of write access latency (for example, bus bridges, SDRAM controllers, and others) may be able to optimize throughput for burst write operations, based on the contiguous, incrementing block transfer information conveyed by the burst protocol.

Burst accesses support only 64-bit data transfers. Partial data bus width transfers are not supported.

An example of a synchronous burst write appears in Figure 7-13. Here, the bank employed in the transfer uses the one-waitstate mode for the first write of the burst.

[Figure 7-13. External Memory Synchronous Burst Write Example: timing diagram over five CLKIN cycles of ADDRESS[31:0] (ADDRESS[2:1] = 00), MS[3:0], RDH, RDL, WRH, WRL, BRST, DATA[63:0], and ACK]

External memory synchronous burst writes occur with the following sequence of events as shown in Figure 7-13 on page 7-40.

1. (cycle 1) If ACK is sampled asserted at the start of cycle 1, the DSP drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. The DSP also drives valid data in this cycle. The DSP asserts both WRH/WRL strobes to indicate a 64-bit write command to the slave.

2. (cycle 2) The slave samples the write command and address. At this point, the slave does not see that a burst write is in progress, because the access looks identical to a non-burst synchronous write. If the slave cannot accept the write command, it deasserts ACK in this cycle to stall the bus until it can. In this example, it has buffer capacity to accept all of the data of the burst, so ACK stays asserted.

3. (cycle 2) If ACK was sampled asserted at the start of the cycle, the DSP asserts the BRST output signal and deasserts the MSx output signal.

4. (cycle 3) The DSP samples ACK asserted by the slave at the start of the cycle, so it advances the data bus to the second of four data transfers within the burst.

5. (cycle 3) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 64-bit transfer. The slave samples the second of four data transfers within the burst and asserts ACK.

6. (cycle 4) The DSP samples ACK asserted by the slave at the start of the cycle, so it advances the data bus to the third of four data transfers within the burst.

7. (cycle 4) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 64-bit transfer. The slave also samples the third of four data transfers within the burst, and asserts ACK. If the slave needs more time to service any one transfer within the burst, it can deassert ACK to stall the bus transfer.

8. (cycle 5) The DSP samples ACK asserted by the slave at the start of the cycle, so it advances the data bus to the last of four data transfers within the burst. The master deasserts BRST to inform the slave that this is the last transfer of the burst.

9. (cycle 5) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 64-bit transfer.
The slave samples the fourth of four data transfers within the burst and asserts ACK.
10. (cycle 6) If initiating another write burst memory access to the same bank, the DSP asserts the address, memory select, and strobes for the next access. This introduces at least one dead cycle in the back-to-back burst throughput, because the initial waitstate count applies to the first access of the second burst.
11. (cycle 6) With BRST sampled deasserted, the slave concludes its service of the burst request by three-stating the ACK driver.

As a master, the DSP supports burst writes on each of the four external port DMA channels. Each channel has an independent burst enable control field (MAXBL1-0). For more information on setting up a burst transfer, see the 64-bit External Burst Transfers discussion on page 6-27.

As a slave, the ADSP-21160 DSP does not support burst writes. The DSP slave supports single cycle writes, so burst writes would provide no added performance improvement.

Using External SBSRAM

The DSP can connect to a variety of synchronous burst static RAMs (SBSRAMs) with a glueless interface; no external logic is required. These synchronous memories can provide high throughput, especially when employing the burst read transfer modes. The DSP has features to support SBSRAMs from a number of memory vendors.

The DSP can support flow-through, pipelined and ZBT SBSRAMs. Where bus frequency and system organization features like trace lengths, capacitive loading, and termination characteristics allow, using flow-through devices delivers lower latency and higher system performance.

The DSP can support SBSRAMs on any of the four external memory banks. The DSP supports SBSRAM single transfer reads and writes and SBSRAM burst read transfer operations. Single cycle burst write transfers are not supported.
SBSRAM support is enabled by configuring the bank access mode (EBxAM) bits for synchronous, 1-cycle writes and the waitstate (EBxWS) bits for 1 waitstate (flow-through SBSRAMs) or 2 waitstates (fully pipelined SBSRAMs). For more information on programming access modes and waitstates, see the WAIT register bits in Table A-19 on page A-49.

If burst read transfer capability is needed, one or more of the external port DMA channels must be configured appropriately. For more information on setting up a burst transfer, see the 64-bit External Burst Transfers discussion on page 6-27. Because burst transfers are controlled at the DMA channel, the DMA sequence must make sure that the DMA burst transfer addresses a memory bank or slave that supports the read burst transfer.

Figure 7-14 and Table 7-4 on page 7-45 show how the DSP I/O should be connected to the SBSRAM I/O. Table 7-4 assumes a 512KByte SBSRAM array consisting of one bank of two 3.3V, 32K x 32 devices. The names of the SBSRAM signals may vary from one vendor to another.

[Figure 7-14. SBSRAM System Interface. Block diagram of the ADSP-21160 connected to two 32Kx32 SBSRAMs: ADDR[31:1] to ADDR[15:0], MS0 to CE1, BRST to ADSC, RDH and RDL to OE, WRH and WRL to GW, DATA[63:32] and DATA[31:0] to DATA[31:0] of each device, and a common CLKIN. The ADSP, CE, CE2, BWE, BW[4:1], LBO, ADV, and ZZ inputs of each SBSRAM are tied off.]

Figure 7-14 is for illustrative purposes; actual system designs may differ and must be carefully analyzed to determine the actual system topology.

Table 7-4.
ADSP-21160 to SBSRAM Signal Mapping

DSP         SBSRAM     Comment
CLKIN       CLK        Both devices driven by same input clock
ADDR16-1    ADDR15-0   Read/Write strobes decode bit 0 of address
MSx         CE         Chip Enable, active low
BRST        ADSC       Address Status Controller, active low
RDH         OE         Asynchronous Output Enable of SBSRAM #1, active low
RDL         OE         Asynchronous Output Enable of SBSRAM #2, active low
WRH         GW         Global Write Enable of SBSRAM #1, active low
WRL         GW         Global Write Enable of SBSRAM #2, active low
DATA63-32   DATA31-0   I/O of SBSRAM #1 (High word of bus, odd address)
DATA31-0    DATA31-0   I/O of SBSRAM #2 (Low word of bus, even address)
No connect  CE         Chip Enable, active high, always asserted (Vdd)
No connect  CE2        Second Chip Enable, always asserted (GND)
No connect  ADSP       Always deasserted (Vdd)
No connect  ADV        Always asserted (GND)
No connect  BWE        Byte Write Enable, always deasserted (Vdd)
No connect  BW4-1      Byte Write Selects, always deasserted (Vdd)
No connect  LBO        Linear Burst Order, active low, always asserted (GND)
No connect  ZZ         Sleep Mode Enable, active high, always deasserted (GND)

The SBSRAM devices are fully synchronous devices, except for the output enable. The DSP issues commands and updates the SBSRAM address latches, as a controller, using the ADSC input of the SBSRAMs, rather than the ADSP processor input. Using the ADSC SBSRAM input enables single cycle writes and simplifies SBSRAM deselect operations.

By always asserting the ADV (advance address) input to the SBSRAM, the device is always attempting to burst. This input is a don’t care when ADSC is asserted. Because the BRST/ADSC signal is always low for a single access or the first access of a burst, the SBSRAM always updates its address latches correctly. For the subsequent transfers (up to three, after the initial access) of a read burst, the SBSRAM samples BRST/ADSC high. The asserted ADV correctly advances the internal address count of the SBSRAM.
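The address-latching behavior described above can be sketched as a small software model. This is an illustration only, not vendor code: the class and method names are hypothetical, and the wrap of the burst counter within a four-location block is an assumption based on typical linear-burst SBSRAM devices, not a statement from this manual.

```python
class SbsramAddressCounter:
    """Toy model of an SBSRAM address latch with ADV tied low (always asserted)."""

    def __init__(self):
        self.address = None  # nothing latched until the first ADSC-low cycle

    def clock(self, adsc_low, address=None):
        """Apply one rising clock edge and return the address being accessed.

        adsc_low -- True when BRST/ADSC is sampled low (load a new address)
        address  -- address on the bus; only used when adsc_low is True
        """
        if adsc_low:
            # ADSC asserted: ADV is a don't care, the latch takes the bus address.
            self.address = address
        else:
            # ADSC high with ADV asserted: advance the internal burst counter.
            # Assumption: linear burst order, only the two low bits count and wrap.
            base = self.address & ~0x3
            self.address = base | ((self.address + 1) & 0x3)
        return self.address


sram = SbsramAddressCounter()
# First access of a burst: BRST/ADSC low, address latched from the bus.
burst = [sram.clock(adsc_low=True, address=0x100)]
# Up to three further transfers: BRST/ADSC high, counter advances.
burst += [sram.clock(adsc_low=False) for _ in range(3)]
# A following single access reloads the latch.
single = sram.clock(adsc_low=True, address=0x2A0)
```

Because every single access and every first access of a burst drives ADSC low, the model never advances from a stale address, which is the property the text relies on.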
Figure 7-15 on page 7-46 demonstrates a burst read of the flow-through SBSRAM.

[Figure 7-15. SBSRAM – Burst Read, Single Write, Single Read. Timing diagram over cycles 1-10 showing CLKIN, ADDRESS[31:0] (A0, B1, C1), MS0 (CE), BRST (ADSC), RDH/L (OE), WRH/L (GW), ACK, and DATA[63:0] (A0, A1, A2, A3, B1, C1), with the deselect and idle cycles marked.]

The DSP issues four types of bus operations to the SBSRAMs, as shown in Table 7-5.

Table 7-5. SBSRAM Partial Truth Table

SBSRAM Operation            CE1 (MSx)  ADSC (BRST)  ADV(1)  GW (WRx)  OE (RDx)  I/O
Read cycle, begin burst     L(2)       L            X       H         L         Data
Write cycle, begin burst    L          L            X       L         H         Hi-Z
Read cycle, continue burst  X          H            L       H         L         Data
Deselect cycle              H          L            X       X         X         Hi-Z

All other signal inputs held static per Figure 7-7.
(1) ADV statically held asserted, low.
(2) L=low, H=high, X=don’t care, Hi-Z=three-stated, high impedance output.

Single read or write transfers, and the first transfer of a burst read, employ the read or write cycle, begin burst bus operation. Burst write transfers are not supported. The subsequent transfers (up to three) of a read burst employ the read cycle, continue burst bus operation. The last cycle of any read access performs a deselect bus operation to make sure that the SBSRAM data buffers remain three-stated for accesses to other banks.

The write operations are achieved by configuring the appropriate bank of the DSP to synchronous minimum one-cycle write mode. The synchronous read waitstate count should be programmed to one for flow-through SBSRAMs, or two for fully pipelined SBSRAMs.

The DSP’s page detection function is not needed for SRAM memory systems.

SBSRAMs are not stalled, or suspended, by ACK in this configuration. Systems should not deassert ACK during any SBSRAM access. The DSP has a weak pull-up device on ACK; ACK does not need to be driven during an access to a slave which does not or cannot control ACK.
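The four bus operations of Table 7-5 can be written as a small lookup, which may help when checking a logic analyzer trace against the table. The sketch below is a hypothetical helper, not part of any Analog Devices tool; it assumes the table columns are read in the order CE1, ADSC, ADV, GW, OE, I/O and that ADV is tied low as in Table 7-4.

```python
def sbsram_operation(ce1, adsc, gw, oe):
    """Classify one bus cycle using the rows of Table 7-5.

    Each argument is a sampled pin level: 0 for low, 1 for high.
    Returns (operation, io_state), or None for a combination the
    partial truth table does not list.
    """
    if adsc == 0:
        if ce1 == 1:
            # Deselect: GW and OE are don't cares.
            return ("deselect", "Hi-Z")
        if gw == 1 and oe == 0:
            return ("read, begin burst", "Data")
        if gw == 0 and oe == 1:
            return ("write, begin burst", "Hi-Z")
        return None
    # ADSC high: CE1 is a don't care and the tied-low ADV advances the burst.
    if gw == 1 and oe == 0:
        return ("read, continue burst", "Data")
    return None


# A four-transfer burst read followed by the deselect cycle.
trace = [
    sbsram_operation(ce1=0, adsc=0, gw=1, oe=0),
    sbsram_operation(ce1=0, adsc=1, gw=1, oe=0),
    sbsram_operation(ce1=0, adsc=1, gw=1, oe=0),
    sbsram_operation(ce1=0, adsc=1, gw=1, oe=0),
    sbsram_operation(ce1=1, adsc=0, gw=1, oe=1),
]
```

Returning None for unlisted combinations keeps the helper honest about the table being partial, instead of guessing at device behavior.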
In Figure 7-15, the burst read is followed by a single write to the SBSRAM, which is followed by a single read of the SBSRAM. For burst operations, deasserting BRST is not required in the last cycle of the burst transfer. The DSP’s burst protocols also support ASIC/FPGA systems in which the pipelined end-of-burst indicator may be of value.

It is possible to increase the SBSRAM array size from the example. This increase can come from using higher density devices or implementing multiple banks of SBSRAM. Multiple banks are possible using the depth expansion capability of the SBSRAMs and the multiple memory select outputs of the DSP.

Executing Instructions From External Memory

For systems that execute instructions from external memory, the system must include a bank of 48-bit or 64-bit wide memory that is allocated specifically to program memory. This dedicated bank for instructions is required because fetch addresses from the program sequencer are pointers to 48-bit locations; the DSP does not translate the fetch address into a 32-bit address for external memory. The system can select the instruction bank using one of the memory select (MS3-0) pins or the Core Instruction Fetch (CIF) pin.

DSP performance is reduced significantly when executing instructions directly from external memory.

For the instruction bank, the system can use CIF as a separate bank select. CIF has the same timing as the MS3-0 outputs, but CIF asserts only for an instruction fetch from external memory (it depends on the fetch from the sequencer, not on the address in memory). If the instruction fetch occurs to the address range of one of the external banks, the DSP also asserts the memory select for that bank.

The ADSP-21160 DSP supports the Core Instruction Fetch (CIF) pin for executing instructions from external memory. This pin is not available on the previous SHARC DSPs.
To fetch instructions from external memory, the system uses either of the following methods:

• Connect a dedicated bank of 48-bit wide memory to the DATA63-16 pins and use CIF or MS3-0 as the memory selects. The DSP uses the full address bus (ADDR31-0, including the LSB of the address) to address this bank. A bank of 64-bit wide memory also can be used this way.
• Connect a bank of 64-bit wide memory, store the 48-bit instructions MSB aligned in this bank, and use external address translation to generate the appropriate address on an instruction fetch. The system can use CIF as an indication of the fetch.

Using either of the above methods, the DSP asserts both RDH/L for instruction fetch accesses.

The address translation in the second method is required to accommodate the unpacked instruction locations. The Program Sequencer issues sequential addresses for each fetch, but each unpacked instruction word uses two (32-bit) memory locations.

The DSP only asserts CIF during instruction fetches from external memory. Other types of external memory accesses (such as 40-bit data accesses or DMA transfers of packed instructions) do not use CIF. For more information on 40-bit data accesses in external memory, see “Internal Data Bus Exchange” on page 5-7. For more information on packed data transfers from external memory, see the packed data discussion on page 6-27.

Host Processor Interface

The DSP’s host interface supports connecting the DSP to 16- or 32-bit microprocessor buses. By providing an address, a data bus, and memory control signals (such as read, write, and chip select), a host may access any device on the DSP bus as if it were a memory. Figure 7-16 shows an example of how to connect a host processor to the DSP.
[Figure 7-16. Example DSP-to-Host System Interface. Block diagram of the ADSP-21160 (ID2-0=000) with external memory on the DSP bus (ADDR31-0, DATA63-0, RDH/L, WRH/L, MS3-0, ACK, BR1-BR6) and a system bus interface containing address and data buffers (OE, T/R) enabled by HBG and an address comparator ("address valid") that generates HBR and CS, with REDY returned to the system bus.]

The DSP accommodates either synchronous or asynchronous data transfers, allowing the host to use a different clock frequency. Transfers at speeds up to the full CLKIN clock rate are supported.

Figure 7-16 shows all lines needed for an asynchronous host interface. Systems using a synchronous interface do not need CS or REDY, and the address comparator might not be needed, depending on the host processor’s requirements.

Table 7-6 defines the DSP pins used in host processor interfacing.

Table 7-6. Host Interface Signals

Signal  Type     Definition

HBR     I/A      Host Bus Request. The host processor must assert HBR to request control of the DSP’s external bus. When HBR is asserted in a multiprocessing system, the DSP that is bus master relinquishes the bus and asserts HBG. To relinquish the bus, the DSP places the address, data, select, and strobe lines in a high-impedance state. HBR has priority over all DSP bus requests (BR1-6) in a multiprocessing system.

HBG     I/O      Host Bus Grant. HBG acknowledges an HBR bus request, indicating that the host processor may take control of the external bus. HBG is asserted (held low) by the DSP until HBR is released. In a multiprocessing system, HBG is output by the DSP bus master and is monitored by all others.

CS      I/A      Chip Select. The host processor asserts CS to select the DSP (for asynchronous transfer protocol).

REDY    O (o/d)  Host Bus Acknowledge. The DSP deasserts REDY (low) to add waitstates to an asynchronous access of its internal memory or IOP registers by a host.
This pin is an open-drain output (o/d) by default, but can be programmed with the ADREDY bit of the SYSCON register to be active drive (a/d). REDY is only output if the CS and HBR inputs are asserted.

SBTS    I/S      Suspend Bus Three-state. External devices can assert SBTS (low) to place the external bus address, data, selects, and strobes in a high-impedance state for the following cycle. If the DSP attempts to access external memory while SBTS is asserted, the processor halts and the memory access does not complete until SBTS is deasserted. SBTS should only be used to recover from PAGE faults or host processor/DSP deadlock.

I=Input, O=Output, S=Synchronous, A=Asynchronous, (o/d)=Open Drain, (a/d)=Active Drive

The host accesses the DSP through the DSP’s external port. Figure 6-8 on page 6-68 shows a block diagram of the external port, I/O processor, and FIFO data buffers, illustrating the on-chip data paths for host-driven transfers. The four external port DMA channels are available for use by the host; DMA transfers of code and data can be performed with low software overhead.

The ADSP-21160 DSP supports the host interface protocols of the previous SHARC DSPs. Also, the ADSP-21160 DSP provides new synchronous interface protocols that support the 64-bit data bus and burst transfers of sequential data.

The host processor requests and controls the DSP’s external bus with the host bus request (HBR) and host bus grant (HBG) signals. Host logic does not need to duplicate the distributed multiprocessor arbitration protocol of the DSPs.

After the host gets control of the DSP bus, the host may transfer data either synchronously or asynchronously. The host bus may be 16, 32, or 64 bits wide for synchronous transfers, but only 16 or 32 bits wide for asynchronous transfers. For asynchronous transfers, the host also uses the chip select (CS) and ready (REDY) signals.
After getting control of the bus, the host can directly read and write the internal memory of the DSP. The host can also read and write to any of the DSP’s I/O processor registers, including the EPBx FIFO buffers. The host uses certain I/O processor registers to control and monitor the DSP (such as SYSCON and SYSTAT) and to set up DMA transfers. DMA transfers are controlled by the DSP’s I/O processor after they are set up by the host. In a multiprocessor system, the host can access the internal memory and I/O processor registers of every DSP.

Data written to and read from the DSP can be packed or unpacked into different word widths. When the width of the host bus is 16 or 32 bits, the DSP can pack data into 32- or 48-bit words. The DSP attempts to gather two 32-bit words into one single 64-bit internal transfer where possible. When the width of the host bus is 64 bits (synchronous transfer modes only), the DSP can pack 48-bit instructions so four instructions are transferred in three 64-bit transfers (maximum throughput), or the DSP handles unpacked data so only 48 bits of the 64-bit transfer are treated as valid data. The host packing mode control bits (HPM) in the SYSCON register configure data packing and unpacking.

Acquiring the Bus

For a host processor to gain access to the DSP, the host must first assert HBR, the host bus request signal. HBR has priority over all BRx multiprocessor bus requests. When asserted, HBR causes the current DSP master to give up the bus to the host after the DSP finishes the current bus operation. If the current operation is a burst transfer, the change in bus mastership interrupts the transfer on a modulo4 boundary.

The current DSP bus master signals that it is transferring ownership of the bus by asserting HBG (low) when the current bus operation ends. The cycle in which control of the bus is transferred to the host is called a Host Transition Cycle (HTC).
Figure 7-17 on page 7-54 shows the timing for the host acquiring the bus. HBG is asserted while the bus master releases control of the bus and remains asserted until HBR is sampled deasserted by the DSP. The cycle in which control of the bus is released by the bus master is called the DSP’s Bus Transition Cycle (BTC). HBG freezes DSP multiprocessor bus arbitration during the time that the host owns the bus. HBG may be used to enable the host’s signal buffers, as shown in Figure 7-16 on page 7-50, Figure 7-22 on page 7-81, and Figure 7-23 on page 7-83.

While HBG is asserted in a multiprocessor system, the DSPs continue to assert their BRx outputs, as in normal operation, but no BTCs occur. The current DSP bus master keeps its BRx output asserted throughout the entire time the host controls the bus.

[Figure 7-17. Example Timing for Host Acquisition of Bus. Timing diagram over cycles 1-8 showing CLKIN, HBR, CS, REDY, HBG, BRx, ADDR[31:0] (A0, host address, A1), MSx, BRST, WRH, WRL, RDH, RDL, DATA[63:0] (D0, host data, D1), and ACK.]

After the host gets control of the bus, the host can choose to perform synchronous or asynchronous transfers with the DSP or other system components.

To initiate asynchronous transfers, the host asserts (low) the CS and HBR inputs of the DSP that it intends to access and performs the read or write. The DSP does not respond to CS until HBG is asserted.

To initiate synchronous transfers, the host keeps all DSP CS pins deasserted (high) and reads or writes to the DSPs’ multiprocessor memory space.

The host may also communicate directly with system peripherals, such as SBSRAMs. These transfers occur using the protocol of the peripheral or using the external handshake mode of DMA channels 11 and 12 to control the memory or peripheral. With DMA handshaking, the host only needs to source or sink the data with the correct timing.
Either of these solutions may require additional hardware support for the host.

The host is responsible for driving the following signals during the HTC in which it gains control of the bus: ADDR31-0, RDH, RDL, WRH, WRL, DMAGx (if employed in the system), and PAGE (if the PAGE function is employed in the system application). These signals must be driven by the host while the host is bus master. Also, the host must drive or weakly pull up or down the MS3-0, BRST, CLKOUT, DMAG1, and DMAG2 signals as required. The DSP bus master three-states these lines, letting the host use them.

The DSP with device ID=000 or 001 enables internal pull-up devices on the MS3-0, CIF, RDL, RDH, WRL, WRH, DMAR1, DMAR2, DMAG1, and DMAG2 signals. The pull-up provides a weak current source to hold these signals in the deasserted state when driven to that state.

Excessive system noise can cause these weakly driven signals (MS3-0, CIF, RDL, RDH, WRL, WRH, DMAR1, DMAR2, DMAG1, and DMAG2) to be sampled asserted.

The DSP with device ID=000 or 001 enables its keeper latches on ADDR31-0, DATA63-0, BRST, PAGE, and CLKOUT, so these signals are weakly pulled to the last value driven on them if any of these signals remain undriven for multiple cycles.

During read-modify-write operations, the host should keep HBR asserted to avoid temporary loss of bus mastership. HBR must remain asserted until after the host completes the last data transfer.

The following restrictions apply to bus acquisition by the host.

• If HBR is asserted while the DSP is in reset, the DSP does not respond with HBG until after reset and multiprocessor synchronization is completed.
• The host should keep HBR asserted until after the host completes its last data transfer and is ready to give up bus ownership.
• If SBTS is asserted after HBR, the DSP enters slave mode and suspends any unfinished access to the external bus.
• In uniprocessor systems (with ID2-0=000), the host must assert CS in the same cycle as HBR to initiate an asynchronous access.
• If the host is to execute both synchronous and asynchronous accesses during a single bus grant, it must allow at least one cycle to pass after the last access before switching CS.
• Synchronous accesses may not be used in systems with only one DSP (with ID2-0=000).
• Synchronous burst writes to DSPs are not supported. Synchronous burst reads of DSP internal memory and the I/O processor’s EPBx FIFOs are supported.

After the host finishes its task, it can relinquish control of the bus by deasserting HBR. The DSP bus master responds by deasserting HBG in the cycle after sampling HBR deasserted. In the cycle following deassertion of HBG, the DSP bus master assumes control of the bus and normal multiprocessor arbitration resumes.

Asynchronous Transfers

To initiate asynchronous transfers after acquiring control of the DSP’s external bus, the host must assert the CS input of the DSP to be accessed. The host then drives the address of the memory location or I/O processor register to access. To simplify the hardware requirements for external interface logic, only the address bits shown in Table 7-7 need to be driven.

Table 7-7. Address Fields For Asynchronous Host Accesses

Address Bits(1)  Comments
ADDR7-0          Must be driven in all cases
ADDR16-8         Must be driven only if the S field indicates an internal memory access
ADDR19-17        S field(2): Must be driven “000” for IOP register accesses, “01m” for internal memory normal word accesses, or “1nn” for short word accesses
and either
ADDR22-20        M field(2): Must be driven “000” to deselect other DSPs, if present
or
ADDR31-23        E field(2): One of the lines 31-23 driven as “1”

(1) Setup and hold times for these address lines are specified in the DSP’s data sheet.
(2) For a complete description of these address fields, see “ADSP-21160 DSP Memory Map” on page 5-12.
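The S-field encoding in Table 7-7 amounts to a few bit operations. The following sketch shows one way a host driver might compose ADDR19-0 for an asynchronous access; the function name and the offsets used are hypothetical, and the field widths are inferred from the table (an 8-bit IOP register address, with m and nn taken as the most significant bits of 18-bit normal word and 19-bit short word addresses).

```python
def host_async_address(space, offset):
    """Compose ADDR19-0 for an asynchronous host access per Table 7-7.

    space  -- "iop", "normal", or "short"
    offset -- register or word address inside the selected DSP
    The M field (ADDR22-20) and E field (ADDR31-23) are left at zero here.
    """
    limits = {"iop": 8, "normal": 18, "short": 19}  # address width in bits
    if space not in limits:
        raise ValueError("unknown address space: %r" % (space,))
    if not 0 <= offset < (1 << limits[space]):
        raise ValueError("offset does not fit the %s address space" % space)

    if space == "iop":
        return offset                      # S field = 000
    if space == "normal":
        return (0b01 << 18) | offset       # S field = 01m, m = offset bit 17
    return (0b1 << 19) | offset            # S field = 1nn, nn = offset bits 18-17


def s_field(address):
    """Extract ADDR19-17 from a composed address."""
    return (address >> 17) & 0x7


iop_addr = host_async_address("iop", 0x1C)          # hypothetical register offset
normal_addr = host_async_address("normal", 0x20000)
short_addr = host_async_address("short", 0x40000)
```

Range-checking the offset first keeps address bits from spilling into the S field, which would silently select a different address space.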
Table 7-7 applies to all asynchronous host access cases, including multiprocessor systems. Fewer address bits may need to be driven depending on the system. For example, in a uniprocessor DSP system, the host need not drive the ADDR22-20 address pins.

Host direct reads and writes are possible for Normal word or Short word data. Host access to Long word data on DSP slaves is not supported. Normal words are accessed if the ADDR19-17 address pins are 01m, where m is the most significant bit of the Normal word address. Short words are accessed if the ADDR19-17 address pins are 1nn, where nn are the two most significant bits of the Short word address.

When using asynchronous transfers and direct access to internal memory is not required, only the lower 8 bits, ADDR7-0, need be supplied by the host. The upper address bits can be configured as explained in Table 7-7.

The ADSP-21160 DSP does not support the Instruction Word Transfer (IWT) function from previous SHARC DSPs. 48-bit instructions can be transferred by configuring the host packing mode to one of the 48-bit internal transfer modes. For direct write operations, the correct size information is tracked through the FIFO.

Asynchronous write operations are latched at the I/O pads in a four-deep FIFO buffer; this buffer is called the slave write FIFO and appears in Figure 6-8 on page 6-68. This buffering allows previously written words to be re-synchronized while a new word is being written and allows asynchronous writes to occur at up to the full CLKIN clock rate of the DSP.

A host may write to several DSPs simultaneously (a broadcast write) by asserting each of their CS pins. Each DSP accepts the write as if it were the only device being addressed. Because the REDY output is wire-OR’ed (if configured as an open-drain output), REDY only appears asserted when all selected DSPs are ready, unless REDY is actively pulled up.
ACK is not active when CS is asserted.

To eliminate the need for a host to drive the multiprocessor address lines (ADDR22-20) in systems with only one DSP (ID2-0=000), such a DSP does not recognize synchronous accesses to these addresses. The host must drive these address lines if the DSP’s ID2-0 is anything other than 000.

To account for buffer delays when sampling the REDY signal, systems must make sure that REDY is properly re-synchronized by the host.

Asynchronous Transfer Timing

When a DSP’s CS chip select is asserted (low), the selected DSP deasserts the REDY signal. Refer to the ADSP-21160 DSP Microcomputer Data Sheet for exact timing specifications.

As shown in Figure 7-18, the DSP deasserts REDY in response to CS. The host can assert CS before or after HBR is asserted, but the DSP does not reassert REDY until after HBG is asserted and a RDH/L or WRH/L strobe is applied. This condition is true only if a RDH/L or WRH/L strobe is active when HBG is asserted. Otherwise, this timing is determined by the tTRDYHG switching characteristic specified in the “Multiprocessor Bus Request and Host Bus Request” timing data in the ADSP-21160 DSP Microcomputer Data Sheet.

REDY is asserted prior to the RDH/L or WRH/L being asserted and becomes deasserted only if the DSP is not ready for the read or write to complete; the only exception is when CS is first asserted. The REDY pin is an open-drain output to facilitate interfacing to common buses. It can be changed to an active-drive output by setting the ADREDY bit in the SYSCON register.

The ADSP-21160 DSP’s asynchronous transfer timing is similar to that of previous SHARC DSPs with one important difference. The DSP has two read strobes (RDH and RDL) and two write strobes (WRH and WRL). Each of these strobes enables or masks 32 bits of the 64-bit data bus. Only RDH and WRH are employed on asynchronous transfers using the host packing mode support for 16/32-bit transfers.
See “External Port Buffer Modes” on page 6-17.

[Figure 7-18. Example Timing For Host Read and Write Cycles. Timing diagram of a host write followed by a host read, showing HBR, CS, the host address (driven by host, address valid), HBG, RDH/L, WRH/L, MSx (driven by the DSP bus master, driven inactive before three-state), BRx, ACK (driven by each DSP), REDY (deasserted for a minimum of 1 cycle), and data, with the bus transition cycle (BTC) and host transition cycle (HTC) marked. Data from the host is latched into the DSP on the WR rising edge; data from the DSP is latched in the host on the RD rising edge; the host three-states its data before asserting RD.]

Figure 7-18 shows the timing of a host write cycle, including details of data packing and unpacking. This timing applies to the example host interface hardware shown in Figure 7-23 on page 7-83 and has the following sequence.

1. The host asserts the address. HBR and CS are decoded from the host bus interface address comparator and do not need to be supplied directly by the host. The selected DSP deasserts REDY immediately.
2. The host asserts WRH/L and drives data according to the timing requirements specified in the ADSP-21160 DSP Microcomputer Data Sheet.
3. The selected DSP asserts REDY when it is ready to accept the data. This transition occurs after the current bus master has completed its current transfer and has asserted HBG. HBG enables the host interface buffers to drive onto the DSP bus.
4. The host deasserts WRH/L when REDY is high and stops driving data.
5. The selected DSP latches data on the rising edge of WRH/L.

After the first word, the write sequence is:

1. The host asserts WRH/L and drives data according to the timing requirements specified in the ADSP-21160 DSP Microcomputer Data Sheet.
2. The DSP deasserts REDY if it is not ready to accept data.
3. The host deasserts WRH/L when REDY is high and stops driving data.
4.
The selected DSP latches data on the rising edge of WRH/L.

More than one DSP may have its CS pin asserted at any one time during a write, but not during a read because of bus conflicts.

Figure 7-18 on page 7-60 also shows the timing of a host read cycle. This timing applies to the bus interface hardware in Figure 7-23 on page 7-83 and has the following sequence:

1. The host asserts the address. HBR and the appropriate CS line are decoded by the host bus interface address comparator. The selected DSP deasserts REDY immediately and asserts HBG.
2. The host asserts RDH/L.
3. The selected DSP drives data onto the bus and asserts REDY when the data is available.
4. The host latches the data and deasserts RDH/L.

After the first word, the read sequence is:

1. The host asserts RDH/L.
2. The selected DSP deasserts REDY then asserts REDY, driving data when it becomes available.
3. The host deasserts RDH/L when REDY is high and latches the data.

Synchronous Transfers

Synchronous transfers are defined by both master and slave deriving bus timing (sampling bus inputs and driving bus outputs) from the same clock input (same clock frequency and phase). Synchronous transfers potentially offer significantly higher throughput than asynchronous transfers, but may require additional synchronization logic.

To perform synchronous transfers, the CS input is not asserted and the host must act like another DSP in a multiprocessor system. The host must generate an address in the multiprocessor memory space of a DSP, assert the RDH/L or WRH/L strobes (and BRST if a burst read transfer), and sink or source data.

For examples of synchronous transfers, see “External Memory Interface” on page 7-3 and “Multiprocessor (DSPs) Interface” on page 7-91.

Synchronous accesses are not supported in uniprocessor systems in which the ID2-0=000.
Synchronous accesses are possible in uniprocessor systems in which the ID2-0=001 or in multiprocessor systems.

To perform synchronous accesses in a multiprocessor system, the host must drive the address pins ADDR22-20 with a value of 1-7 to select one of the DSPs (by its ID2-0), or one of the ADDR31-23 address pins must be driven high to select an address in external memory. For more information on using these address pins, see Table 7-7 on page 7-57.

For synchronous host transfers, the DSP uses its ACK signal (instead of REDY) to add waitstates to an access. The host must wait for the DSP to assert ACK.

Synchronous accesses are not recognized during the Host Transition Cycle. This prevents any spurious access from occurring while the external host buffers are starting to drive the address, data, and strobes.

ACK may glitch during the HTC and should not be relied on until the following cycle.

When a DSP is responding to a synchronous read access, it drives valid data only in the cycle in which it asserts ACK. The DSP three-states the data bus in the cycles following assertion of ACK in response to a synchronous host read, even if the host continues to assert RDH/RDL.

Synchronous Broadcast Writes

The timing in Figure 7-19 on page 7-64 demonstrates two synchronous broadcast writes from the host. Broadcast writes address multiple slaves. Master and slaves employ a different protocol for ACK acknowledgment of the broadcast write, relative to other bus operations. For broadcast writes, slaves employ a wired-OR protocol in which they drive ACK deasserted (only if required) in alternate cycles, starting with the cycle after the write is driven on the bus. The master pre-charges ACK high in alternate cycles, starting with the second cycle after the broadcast write.
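The alternating ACK ownership described above can be tabulated with a few lines of code. This is a bookkeeping sketch only, with hypothetical function names and an arbitrary example cycle number; it models which party is allowed to drive ACK in each cycle, not the electrical behavior of the wired-OR line.

```python
def broadcast_ack_roles(write_cycle, last_cycle):
    """Map each cycle after a broadcast write to the party that may drive ACK.

    write_cycle -- cycle in which the broadcast write is driven on the bus
    last_cycle  -- last cycle to include in the result
    """
    roles = {}
    for cycle in range(write_cycle + 1, last_cycle + 1):
        if (cycle - write_cycle) % 2 == 1:
            # First, third, ... cycle after the write: slaves may stall.
            roles[cycle] = "slaves"
        else:
            # Second, fourth, ... cycle after the write: master pre-charges.
            roles[cycle] = "master"
    return roles


def ack_level(role, slave_stalls):
    """ACK level in one cycle: low only when a slave stalls in a slave cycle."""
    if role == "slaves" and any(slave_stalls):
        return 0
    return 1


roles = broadcast_ack_roles(write_cycle=3, last_cycle=6)
# One of three slaves stalls in the first slave cycle, none in the next one.
levels = [
    ack_level(roles[4], [False, True, False]),
    ack_level(roles[5], [False, False, False]),
    ack_level(roles[6], [False, False, False]),
]
```

Keeping the slave and master cycles disjoint is what lets several open-drain style drivers share ACK without contention.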
Figure 7-19. Synchronous Broadcast Write Example (timing diagram covering CLKIN cycles 1 through 9, including a BTC and an HTC, with traces for HBR, HBG, BRx, ADDR31:0, MSx, BRST, WRH/L, RDH/L, ACK, ACK (master out), ACK (slaves), and DATA63:0)

In this example, the first broadcast write is accepted by all slaves at the end of cycle 3. One or more slaves must stall any further access until one has capacity to accept the next write operation. This stall is accomplished by the slaves deasserting ACK in cycle 4. The master must pre-charge ACK in cycle 5, the second cycle after the first broadcast write. All slaves can accept more operations by cycle 6, and none of the slaves drive ACK deasserted again.

The host samples ACK asserted in cycle 6 and completes the second broadcast write as shown by deasserting the write strobes in cycle 7.

Even though the host deasserted HBR in cycle 7, the DSP does not respond by deasserting HBG until cycle 9.

Synchronous Burst Read Transfers

As a slave, the DSP supports synchronous burst read transfers from the EPBx FIFOs or direct reads from internal memory. Burst write transfers are not supported, because single-cycle, non-burst writes can provide as much write bandwidth as burst writes.

Burst reads are supported as contiguous, aligned, 64-bit data transfers up to a maximum length of four 64-bit transfers. The DSP slave increments the address if the burst read access is from internal memory. The slave address increment function is only supported for ADDR2-1. The host cannot burst across a modulo4 (ADDR2-0) address boundary as shown in Table 7-3 on page 7-33.

To perform a burst read transfer from an EPBx buffer, the host issues a starting burst address pointing to one of the EPBx buffers. The DSP slave does not increment an EPBx burst read address. The modulo4 (ADDR2-0) address boundary restriction does not apply in this case.
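As a rough illustration of the boundary rule for burst reads from internal memory, the following Python sketch computes how many 64-bit transfers a host could request from a given starting address. It is a host-side planning aid under stated assumptions, not a description of the DSP hardware. The function name max_burst_length is hypothetical, and the sketch assumes that each 64-bit transfer advances the address by two, so that ADDR0 is 0 on an aligned access and ADDR2-1 select one of four positions inside the ADDR2-0 block. Burst reads from an EPBx buffer are not subject to this limit and are not modeled.

```python
MAX_BURST_TRANSFERS = 4  # from the text: at most four 64-bit transfers per burst
WORDS_PER_TRANSFER = 2   # ASSUMPTION: one 64-bit transfer spans two word addresses
BLOCK_MASK = 0x7         # ADDR2-0, the address bits that define the burst boundary


def max_burst_length(start_addr):
    """Return how many 64-bit transfers fit before the ADDR2-0 boundary.

    Hypothetical helper for a host driver; applies only to burst reads
    from internal memory, not to EPBx buffer reads.
    """
    if start_addr & 0x1:
        # ASSUMPTION: "aligned" means ADDR0 = 0 for a 64-bit access.
        raise ValueError("burst start address must be 64-bit aligned")
    offset = start_addr & BLOCK_MASK
    remaining_words = (BLOCK_MASK + 1) - offset
    return min(MAX_BURST_TRANSFERS, remaining_words // WORDS_PER_TRANSFER)
```

Under these assumptions, a host that needs more data than the returned count would end the burst at the boundary and start a new access at the next address.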
Long burst transfers tie up the external bus and may prevent or significantly degrade system response to potentially higher priority events. The DSP has no way of truncating or disconnecting a host access.

If SBTS and HBR are asserted while an external DMA access is occurring, HBG is not asserted until the access is completed.

The DSP also supports burst transfers, which can be truncated by assertion of HBR and SBTS. If the DMA transaction was a burst transfer, when the host relinquishes control of the local bus, the DSP resumes the burst transfer, starting at the address of the last operation that did not complete.

Slave Direct Reads and Writes

The host can directly access the internal memory and I/O processor registers of a DSP by reading or writing the appropriate address in multiprocessor memory. This is known as a direct read or direct write access. Each DSP bus slave monitors addresses driven on the external bus and responds to any that fall within its region of multiprocessor memory.

These accesses are invisible to the slave DSP’s processor core. They do not degrade internal memory or internal bus performance. Direct access is important, because it lets the processor core continue program execution uninterrupted.

The host can directly read or write the I/O processor registers to control and configure the DSP or to set up DMA transfers.

IOP Shadow Registers

To ease host and multiprocessor system operations, the DSP I/O processor registers include registers that shadow or mirror some processor core system registers, including the program counter (PC) and MODE2 registers. These registers facilitate system start up and debug, by letting the host (or another DSP in a multiprocessor system) interrogate these processor core registers. These shadow registers are read only and lag the value of the registers they shadow by one internal core clock.
For more information, see “PC Shadow Register (PC_SHDW)” on page A-53 and “MODE2 Shadow Register (MODE2_SHDW)” on page A-53.

The silicon revision field of the MODE2 shadow register (MODE2_SHDW) is now used for differentiating between silicon revisions. The corresponding bits in the MODE2 (foreground) register are now reserved.

The application program must read the MODE2_SHDW register bits [31:25] to identify the silicon revision. MODE2_SHDW is a memory-mapped IOP register whose address is 0x11. See Figure 7-20 on page 7-67.

Figure 7-20. Mode2 Shadow Register (MODE2_SHDW, IOP address 0x11)

Bits     Field
31-30    PID4-3, Processor Identification (Read Only)
29-28    Silicon Revision* (Read Only - 21160M):
         Revision 0.0 = 00, Revision 0.1 = 01, Revision 1.0/1.1 = 10, Revision 1.2 = 11
27-25    PID2-0**, Processor Identification (Read Only - 21160M)

The figure shows bits 31-16 as 0100 0010 0000 0000.

*These revisions apply to 21160M silicon only. For 21160N silicon, revision bits 29:28 are 00.
**This processor ID applies to 21160M silicon only. For 21160N silicon, the processor ID bits 27:25 are 100.

Instruction Transfers

For 16- or 32-bit host interfaces, the DSP can pack and unpack 48-bit instruction or 40-bit Extended Precision Normal word data based on the host packing mode selected with the HPM bits in the SYSCON register.

Host processors can achieve maximum throughput by transferring packed instructions to or from internal memory, using the full 64-bit data bus width synchronously. For more information, see the Packed versus Unpacked instruction discussion on page 6-27.

In packed 64-bit direct reads or writes of 48-bit memory, the program must translate the addresses from an instruction word (Iword) pointer to a Normal word (Nword) pointer as follows.

1. Nword address = [(Iword address - Block base) * 3/2] + Block base
2. If the host transfers a group of instructions to the DSP, the instructions go into DSP memory starting at an internal memory address (for example, 0x40100). In internal memory, the instructions are packed instructions (48-bit instructions, 48-bit aligned), not padded to align to 64-bit boundaries.
3. Programs can find the Normal word starting address of the host transfer to the DSP by scaling the least significant 16 bits of the internal 48-bit instruction address by 3/2. For example, the Normal word address that corresponds to instruction address 0x40100 is:

   Nword address = [(0x40100 - 0x40000) * 3/2] + 0x40000 = 0x40180

The ADSP-21160 DSP does not support the IWT bit from previous SHARC DSPs.

Host Direct Writes and Reads

The DSP supports synchronous and asynchronous direct reads and writes by the host. Synchronous burst writes to ADSP-21160 slaves are not supported.

Direct Writes

When a direct write to a slave DSP occurs, the address, data, and control are latched into the direct write FIFO. The FIFO supports up to eight 64-bit wide direct writes, which can be performed with no stalls. If additional direct writes are attempted when the FIFO buffer is full, the DSP deasserts ACK (or REDY) until the buffer is no longer full. The direct write FIFO may be held off for up to four processor core cycles if all of the serial port DMA channels are active, or up to nine core cycles per chain if DMA chaining is occurring.

Direct Write Latency

The DSP handles asynchronous and synchronous direct writes differently. This difference influences the latency for the direct writes.

When a DSP bus slave receives data from an asynchronous direct write, the DSP latches the data and address in a four-level FIFO buffer. For synchronous direct writes, this buffer is two levels deep.
This buffer is called the slave write FIFO and appears in Figure 6-8 on page 6-68. In the following cycle, the slave write FIFO attempts to complete the write internally. This buffering lets the host (or DSP bus master) perform writes at the full clock rate.

The slave DSP’s core cannot explicitly read the slave write FIFO. Also, the DSP cannot determine the slave write FIFO’s status.

Writes to the I/O processor registers from the slave write FIFO usually occur in the following one or two cycles or when any current DMA transfer is completed. The write takes more than two cycles only if a direct write in the previous cycle was held off by a full buffer.

If the slave write FIFO is full when a write is attempted, the DSP deasserts ACK (or REDY) until the FIFO is not full. Unless higher priority on-chip DMA transfers are occurring, the slave write FIFO usually empties out within one cycle, creating a one-cycle write latency.

Slave reads are held off when there is data in the slave write FIFO. This prevents false data reads and out-of-sequence operations.

The Direct Write Pending (DWPD) bit of the SYSTAT register indicates when a direct write to internal memory is pending in the I/O processor’s direct write FIFO or data is pending in the slave write FIFO (at the external port I/O pins).

Direct writes and I/O processor register accesses may be completed in different sequences. If the host performs a direct memory write then writes to an I/O processor register on the DSP, the I/O processor register write may complete before the direct write.

Direct Reads

When a direct read of a DSP occurs, the address is latched on-chip by the I/O processor. If the access is asynchronous (CS asserted), REDY is deasserted asynchronously. If the access is synchronous (CS deasserted), ACK is deasserted in the following CLKIN cycle. When the data is available, the I/O processor drives the data and asserts REDY (or ACK).
Direct reads cannot be pipelined like direct writes; direct reads only occur one at a time. Direct reads have a maximum throughput of one access every three CLKIN cycles for synchronous I/O processor register reads or one access every four CLKIN cycles for synchronous internal memory reads.

As a slave, the DSP supports synchronous burst direct read accesses, which improve throughput for internal memory reads. Maximum throughput for synchronous burst direct read accesses is summarized in Table 7-8.

Table 7-8. Direct Read Latencies for a 1:2 Clock Ratio

Access Type                                                 Latency (CLKIN cycles)
Single Direct Read of I/O processor register                3
Single Direct Read of internal memory                       4
Burst Direct Read of I/O processor registers (EPBx only)    3-2-2-2
Burst Direct Read of internal memory                        4-2-2-2

Even with the throughput advantage of burst read transfers, direct reads are not the most efficient method of transferring data out of a slave DSP. The highest throughput is achieved by forcing the slave to become the master and having it write (or push) the data out. The advantage of direct reads is that no programming of the slave I/O processor is required.

Broadcast Writes

Broadcast writes allow simultaneous transmission of data to all of the DSPs in a multiprocessing system. The host processor (or master DSP) can perform broadcast writes to the same memory location or I/O processor register on all of the slaves. Broadcast writes can be used to implement reflective semaphores in a multiprocessing system. For more information, see “Broadcast Writes” on page 7-115.

Shadow Write FIFO

Because the DSP’s internal memory must operate at high speeds, writes to the memory do not go directly into the internal memory. Instead, writes go to a two-deep FIFO called the shadow write FIFO.
When an internal memory write cycle occurs, the DSP loads the data in the FIFO from the previous write into memory, and the new data goes into the FIFO. This operation is transparent, because any reads of the last two locations written are intercepted and routed to the FIFO.

Because the ADSP-21160 DSP’s shadow write FIFO automatically pushes the write to internal memory as soon as the write does not compete with a read, this FIFO’s operation is completely transparent to programs. Unlike previous SHARC DSPs, there is no need for dummy writes to clear the FIFO when mixing 32- and 48-bit writes.

Data Transfers Through the EPBx Buffers

In addition to direct reads and writes, the host processor can transfer data to and from the DSP through the external port FIFO buffers, EPB0, EPB1, EPB2, and EPB3. Each of these buffers, which are part of the I/O processor register set, is an eight-location FIFO. These buffers support single-word transfers, DMA transfers, and sequential burst accesses.

DMA transfers are handled internally by the DSP’s I/O processor, but single-word transfers must be handled by the processor core. For information on external port transfers, see “External Port Channel Transfer Modes” on page 6-21. For information on external port handshaking, see “External Port Channel Handshake Modes” on page 6-22. For information on single-word interrupt-driven transfers through the external port, see “Link Port Status” on page 6-60.

To support debugging buffer transfers, the DSP has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page 6-18.

DMA Transfers

The host processor can also set up DMA transfers to and from the DSP.
After the host gets control of the DSP, the host can access the on-chip DMA control and parameter registers to set up an external port DMA operation. DMA is the most efficient way to transfer blocks of data.

• DMA Transfers to Internal Memory. The host can set up external port DMA channels to transfer data to and from DSP internal memory.
• DMA Transfers to External Memory. The host can set up an external port DMA channel to transfer data directly to external memory using the DMA request and grant lines (DMARx, DMAGx). For more information, see “Setting up External Port DMA” on page 6-74.

The host may also use the DMARx/DMAGx handshake signals for a DMA transfer as a bus slave, but may not use DMA as a bus master while HBR retains control of the bus.

Host Data Packing

The host interface uses the same data packing features as the I/O processor uses. For more information, see “External Port Buffer Modes” on page 6-17. The “32-bit Data Packing” on page 7-73 and “48-Bit Instruction Packing” on page 7-76 sections describe timing for these data packing operations.

For direct reads and writes of DSP internal memory, the HPM bits determine the packing mode. For transfers to or from the EPBx data buffers, the packing mode is determined by the setting of the PMODE bits in the DMACx control register of each external port buffer.

32-bit Data Packing

For a 16-bit host bus, the DSP latches incoming data on pins DATA47-32. Similarly, the DSP drives outgoing data on DATA47-32 with the other lines equal to zeroes. The sequence of events for 32-bit packing and unpacking is different for writes and reads.

For a 16-bit host bus, the endian format of the transfers is controlled by the HMSWF bit in the SYSCON register. If HMSWF=0, the least significant 16-bit word is packed first. If HMSWF=1, the most significant 16-bit word is packed first.

Figure 7-21 on page 7-74 shows example timing for host interface data packing.
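The effect of HMSWF on the 16-bit packing order can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, and the code covers just the word ordering and the placement of each 16-bit word on DATA47-32 described above. It does not model bus timing or the REDY handshake.

```python
def unpack_32_to_16(word32, hmswf):
    """Split a 32-bit word into two 16-bit words in host transfer order.

    hmswf=0: least significant 16-bit word goes first.
    hmswf=1: most significant 16-bit word goes first.
    """
    lo = word32 & 0xFFFF
    hi = (word32 >> 16) & 0xFFFF
    return [hi, lo] if hmswf else [lo, hi]


def pack_16_to_32(halves, hmswf):
    """Rebuild the 32-bit word from two 16-bit words given in transfer order."""
    first, second = halves
    hi, lo = (first, second) if hmswf else (second, first)
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def on_data_bus(half16):
    """Place a 16-bit word on DATA47-32 of the 64-bit bus; other lines are zero."""
    return (half16 & 0xFFFF) << 32
```

A host driver written along these lines would pick the transfer order once, from the HMSWF setting it programmed into SYSCON, and use the same setting for both directions.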
Figure 7-21. Example Timing for Host Interface Data Packing (timing diagrams of the host address, WR, RD, REDY, and data lines for a host write and a host read with 16/32-bit packing, and for a host write and a host read with 32/48-bit packing)

When a host reads a 32-bit word with 16-bit unpacking using the typical bus interface hardware shown in Figure 7-23 on page 7-83, the following sequence of events occurs as illustrated in Figure 7-21 on page 7-74.

• The host initiates a read cycle by driving an address, asserting CS if the access is asynchronous, and asserting RDH (low).
• The selected DSP deasserts REDY, latches the address, and performs an internal direct read to get the data.
• When the DSP has the data, it asserts REDY and drives the first 16-bit word.
• The host latches the data and deasserts RDH (high).
• The host initiates another read access, driving the address of the data to be accessed then asserting RDH.
• The DSP transmits the second 16-bit word.

When a host writes a 32-bit word with 16-bit packing using the typical bus interface hardware shown in Figure 7-23 on page 7-83, the following sequence of events occurs as illustrated in Figure 7-21 on page 7-74.

• The host initiates a write cycle by driving the write address, asserting CS if the access is asynchronous, and asserting WRH (low).
• The DSP asserts REDY when it is ready to accept data.
• The host drives the address and the first 16-bit word and deasserts WRH (high).
• The DSP latches the first 16-bit word.
• The host drives the address and initiates another write cycle for the second 16-bit word by asserting WRH.
• When the DSP has accepted the second word, it performs an internal direct write to its memory (or memory-mapped I/O processor register).

If the DSP’s internal write has not completed by the time another host access occurs, the DSP holds off that access with REDY.

If the DSP is waiting for another 16-bit word from the host to complete the packed word, the HPS field in the SYSTAT register is non-zero. For more information, see “Host Interface Status” on page 7-77.

Because there is only one packing buffer for the host interface, the host must complete each packed transfer (or end the transfer and purge the FIFO with the HPFLSH bit) before another is begun. For more information, see “External Port Status” on page 6-57.

48-Bit Instruction Packing

The host can also download and upload 48-bit instructions over its 16- or 32-bit bus. The packing sequence for downloading DSP instructions from a 32-bit host bus (HPM=11) takes 3 cycles for every 2 words, as illustrated in Table 7-9. The 32-bit data is transferred on data bus lines 63-32 (DATA63-32). If an odd number of instruction words are transferred, the packing buffer must be flushed by a dummy access to remove the unused word.

Table 7-9. 32-to-48-Bit Word Packing (Host Bus DSP)

Transfer           Data Bus Lines 63-48    Data Bus Lines 47-32
First transfer     Word1, bits 47-32       Word1, bits 31-16
Second transfer    Word2, bits 15-0        Word1, bits 15-0
Third transfer     Word2, bits 47-32       Word2, bits 31-16

The HMSWF bit of SYSCON is ignored for 32-to-48-bit packing.
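The transfer order in Table 7-9 can be expressed as a short Python sketch. The function names are hypothetical, and each value in the returned list is the 32-bit quantity that would appear on DATA63-32 for one transfer. The sketch handles only a complete pair of 48-bit words; the dummy access needed after an odd number of words is not modeled.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def pack_48_pair(word1, word2):
    """Return the three 32-bit transfers for two 48-bit words, per Table 7-9."""
    first = (word1 >> 16) & MASK32                        # Word1 bits 47-16
    second = ((word2 & MASK16) << 16) | (word1 & MASK16)  # Word2 15-0, Word1 15-0
    third = (word2 >> 16) & MASK32                        # Word2 bits 47-16
    return [first, second, third]


def unpack_48_pair(transfers):
    """Rebuild the two 48-bit words from three 32-bit transfers."""
    first, second, third = transfers
    word1 = (first << 16) | (second & MASK16)
    word2 = (third << 16) | ((second >> 16) & MASK16)
    return word1, word2
```

Note that the second transfer carries the low 16 bits of both words, which is why three transfers are enough for two instructions.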
The packing sequence for downloading or uploading DSP instructions over a 16-bit host bus takes 3 cycles for every word, as shown in Table 7-10.

Table 7-10. 16-to-48-Bit Word Packing, HMSWF=1 (Host Bus DSP)

Transfer           Data Bus Pins 47-32
First transfer     Word1, bits 47-32
Second transfer    Word1, bits 31-16
Third transfer     Word1, bits 15-0

The HMSWF bit in SYSCON determines whether the most significant 16-bit word or least significant 16-bit word is packed first.

40-bit extended precision data may be transferred using the 48-bit packing mode. For more information on memory allocation for different word widths, see “Memory Organization and Word Size” on page 5-22.

Host Interface Status

The SYSTAT register provides status information for host and multiprocessor systems. Table 7-14 on page 7-122 lists the status bits in the SYSTAT register. For more information on the SYSTAT register, see Table A-20 on page A-51.

Interprocessor Messages and Vector Interrupts

After getting control of the DSP, the host processor communicates with the DSP by writing messages to the memory-mapped I/O processor registers. In a multiprocessor system, the host can access the internal memory and I/O processor registers of every DSP.

The MSGRx registers are general-purpose registers that can be used for message passing between the host and DSP. They are also useful for semaphores and resource sharing between multiple DSPs. The MSGRx and VIRPT registers can be used for message passing in the following ways.

• Message Passing. The host can use any of the 8 message registers, MSGR0 through MSGR7, to communicate with the DSP.
• Vector Interrupts. The host can issue a vector interrupt to the DSP by writing the address of an interrupt service routine to the VIRPT register. When serviced, this high priority interrupt causes the DSP to branch to the service routine at that address.
The MSGRx and VIRPT registers also support shared-bus multiprocessing through the external port. Because these registers may be shared resources within a single DSP, conflicts may occur, and your system software must prevent them. For further discussion of I/O processor register access conflicts, see “I/O Processor Registers” on page A-33.

Message Passing (MSGRx)

Three possible software protocols that the host can use for communicating with the DSP through the MSGRx message registers are: (1) vector-interrupt-driven, (2) register handshake, and (3) register write-back.

For the vector-interrupt-driven method, the host fills predetermined MSGRx registers with data and triggers a vector interrupt by writing the address of the service routine to VIRPT. The service routine should read the data from the MSGRx registers and then write “0” into VIRPT to tell the host it is done. The service routine also could use one of the DSP’s FLAG3-0 pins to tell the host it has finished.

For the register handshake method, four of the MSGRx registers should be designated as follows: a receive register (R), a receive handshake register (RH), a transmit register (T), and a transmit handshake register (TH). To pass data to the DSP, the host would write data into T and then write a “1” into TH. When the DSP sees a “1” in TH, it reads the data from T and then writes back a “0” into TH. When the host sees a “0” in TH, it knows that the transfer is complete. A similar sequence of events occurs when the DSP passes data to the host through R and RH.

The register write-back method is similar to register handshaking, but uses only the T and R data registers. The host writes data to T. When the DSP sees a non-zero value in T, it retrieves it and writes back a “0” to T. A similar sequence occurs when the host is receiving data.
This simpler method works well when the data to be passed does not include “0.”

Host Vector Interrupts (VIRPT)

Vector interrupts are used for interprocessor commands between the host and a DSP or between two DSPs. When the external processor writes an address to the DSP’s VIRPT register, the write triggers a vector interrupt. For more information, see “Multiprocessing Interrupts” on page 3-47.

To use the DSP’s vector interrupt feature, the host could perform the following sequence of actions.

1. Poll the DSP’s VIRPT register until the host reads a certain token value (for example, zero).
2. Write the vector interrupt service routine address to VIRPT.
3. When the service routine is finished, the DSP should write the token back into VIRPT to indicate that it is finished and that another vector interrupt can be initiated.

A host using direct writes and vector interrupts should use the Direct Write Pending (DWPD) bit to determine when a direct write to internal memory is pending, because direct writes and I/O processor register accesses may be completed in different sequences. If the host performs a direct memory write to a DSP then writes to an I/O processor register on the DSP, the I/O processor register write may complete before the direct write. Because of this FIFO latency, direct writes performed just before vector interrupt writes (to VIRPT) may be delayed until after the branch to the interrupt vector.

To prevent unacceptable, latency-related sequencing, the host should check that all direct writes have completed before writing to the DSP’s VIRPT register. By polling the DSP’s DWPD bit and waiting for direct writes to clear, the host can write to VIRPT with correct sequencing.

System Bus Interfacing

A DSP subsystem, consisting of several DSPs with local memory, may be viewed as one of several processors connected together by a system bus.
Examples of such systems are the EISA bus, PCI bus, or several DSP subsystems. The processors in such a system arbitrate for the system bus using an arbitration unit. Each device on the bus that needs to become a bus master must be able to drive a bus request signal and respond to a bus grant signal. The arbitration unit determines which request to grant in any given cycle.

Access to the DSP Bus—Slave DSP

Figure 7-22 on page 7-81 shows an example of an interface to a system bus that isolates the local DSP bus from the system bus. When the system is not accessing the DSPs, the local bus supports transfers between other local DSPs and local external memory or devices.

Figure 7-22. Slave DSP System Bus Interface (block diagram: ADSP-21160 #1 with ID2-0 = 001, ADSP-21160 #2 with ID2-0 = 010, and external memory on the cluster bus, connected to the system address and data buses through a system bus interface with an address comparator that drives HBR, CS1, and CS2)

When the system needs to access a DSP, the system executes a read or write to the address range of the DSP subsystem. The external address comparator detects a local access and asserts HBR and one of the appropriate CS lines. The DSP holds off the system bus with REDY until the DSP is ready to accept the data. The HBG signal enables the system bus buffers. The buffers’ direction for data is controlled by the read or write signals.

To avoid glitching the HBR line when addresses are changing, the address comparator may be qualified by an enable signal from the system or qualified by the system read or write signals. These methods cause HBR to be deasserted each time system read or write is deasserted or the address is changed.
Because these techniques deassert HBR with each access, the overhead of an HTC occurs as part of each access. To avoid this type of overhead, systems can latch HBG during long sequences of bus accesses.

Access to the System Bus—Master DSP

Figure 7-23 on page 7-83 shows a bidirectional system interface in which the DSP subsystem can access the system bus by becoming a bus master. Before beginning the access, the DSP first requests permission to become the bus master by generating the System Bus Request signal (SBR). A bus arbitration unit determines when to respond with the System Bus Grant signal (SBG). Here, each system bus master generates and responds to its own unique pair of signals.

The method a DSP uses to arbitrate for the system bus depends on whether the access is from the DSP processor core or I/O processor. For more information, see “Processor Core Access To System Bus” on page 7-84 and “DSP DMA Access To System Bus” on page 7-88.

Figure 7-23. Bidirectional System Bus Interface (block diagram: ADSP-21160 #1 with ID2-0 = 001, ADSP-21160 #2 with ID2-0 = 010, and external memory on the cluster bus, connected to the system address and data buses through a system bus interface with an address comparator; the FLAG0, SBTS, ACK, and MS3-0 signals of both DSPs connect to the interface, which generates System Bus Request and receives System Bus Grant)

Processor Core Access To System Bus

The DSP core may arbitrate for the system bus by setting a flag and waiting for SBG on another flag. This technique has the benefit of not stalling the local bus while waiting. If SBG is tied to an interrupt pin, the DSP can continue processing while waiting.
Another method is for the DSP to attempt the access assuming that the system bus is available. The DSP then either waits or aborts the access if the bus is not available. The DSP begins the access to the system bus by asserting one of the memory select lines, MS3-0. This assertion also asserts SBR. If the system bus is not available (for example, SBG is deasserted), the DSP should be held off with ACK. This approach is simple, but stalls the DSP and the local bus when the system bus is accessed while it is busy.

To overcome this stall, programs can use the Type 10 instruction:

IF condition JUMP(addr), ELSE compute, DM(addr)=dreg;

This instruction aborts the bus access if the condition (SBG) is not true and causes a branch to a try-again-later routine. This method works well if SBG is asserted most of the time. If the Type 10 instruction is not used, a deadlock condition can result if an access is attempted before the bus is granted.

The DSP samples FLAG inputs at the CLKIN frequency, and FLAG outputs must be held stable for at least one full CLKIN cycle.

Deadlock Resolution

When both the DSP subsystem and a host processor try to access the system bus in the same cycle, a bus deadlock occurs in which neither access can complete.

Normally, the master DSP responds to an HBR request by asserting HBG after the completion of the current access. If the DSP is accessing the system bus at the same time, HBG is not asserted, because this current access cannot complete. This condition results in a deadlock in which neither access can complete.

The deadlock may be broken by asserting the Suspend Bus Three-state (SBTS) input for one or more cycles after the deadlock is detected, that is, when the system bus to local bus buffer is requested from both sides. The combination of SBTS and HBR forces the master DSP to suspend its core access and assert HBG. This lets the system access to the local bus proceed.
The combination of HBR and SBTS should be applied only when a deadlock is caused by a DSP access to the system bus. The SBTS signal should not be used when there is a local bus transfer, because the WRH/L signal is asserted twice: once before SBTS is asserted and once after the access resumes. For DSP-to-DSP transfers on the local bus, this double assertion violates the slave timing requirements. The signals ACK, HBG, REDY, and the data bus are all active in slave mode. If the DSP was performing an external access (which did not complete) in the same cycle that SBTS and HBR were asserted, the access is suspended until SBTS and HBR are both deasserted again. Figure 7-24 on page 7-86 shows example timing for the deadlock resolution case on a synchronous host interface.

As with previous SHARC DSPs, SBTS and HBR can be used together for host/DSP deadlock resolution. However, if SBTS and HBR are asserted while bus lock is set, the DSP three-states its bus signals, but does not go into slave mode.

When a host processor uses SBTS and HBR for deadlock resolution, SBTS operates differently than when the DSP uses only SBTS. For more information on how the DSP uses SBTS, see the discussion on page 7-17.

The SBTS and HBR signals can be used together for host/DSP core deadlock resolution but not for DMA transfers or bursts.

Figure 7-24. Deadlock Resolution Example Timing
(Timing diagram of CLKIN, HBR, SBTS, HBG, BRx, ADDR[31:0], MSx, RDH/L, WRH/L, ACK, and DATA[63:0] across the DSP read, the deadlock, the host access, and the resumed DSP read.)

The following sequence of actions allows the host processor to suspend an ongoing DSP access and gain access to its internal resources, provided that: (1) the access originates from the DSP’s core, not the DMA controller, (2) a DRAM page miss is not detected for that memory access, and (3) bus lock is not enabled.

1. The host interface asserts HBR.

2. The host interface determines whether the request has created a deadlock. If a deadlock is detected, the host asserts SBTS for one or more cycles. HBR should be asserted first. SBTS should be asserted only after detecting the deadlock. SBTS should remain asserted at least until the DSP asserts HBG. SBTS should be deasserted before HBR deasserts. SBTS should be a stable signal that meets the setup and hold requirements described in the ADSP-21160 DSP Microcomputer Data Sheet.

3. The DSP sees asserted SBTS with HBR. If SBTS is asserted in cycle 0, the DSP samples it in cycle 1 and aborts the ongoing access in the same cycle.

4. The host three-states both RDH/L and WRH/L in cycle 1.

5. In cycle 3, HBG is asserted. The host acquires full control of the bus and may access any of the DSPs or peripherals on the bus. Remove SBTS before HBR is deasserted.

6. The host performs the accesses as bus master. After completing the accesses, the host deasserts HBR.

7. The DSP waits for ACK to be asserted. If ACK is sampled asserted in cycle 0, HBG is deasserted in cycle 1. In cycle 2, the DSP restarts the access (which it aborted to give mastership to the host) as a fresh access. All the strobes fire as if it were a new access. The wait counter is also reset.

8. The DSP becomes bus master again.

When SBTS is used for deadlock resolution, an asserted ACK is not needed to assert HBG. The host, however, may need a valid ACK signal for synchronous accesses.
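The ordering rules that step 2 places on HBR and SBTS can be checked against a recorded or simulated signal trace. The following Python sketch is illustrative only: the function names `make_trace` and `check_sbts_hbr_ordering`, the per-cycle trace format, and the use of True for "asserted" (regardless of pin polarity) are assumptions made for this example and are not part of the DSP or its tools. The rule that SBTS is asserted only after a deadlock is detected, and the setup and hold timing, are not modeled.

```python
def make_trace(hbr, sbts, hbg):
    """Build a per-cycle trace from three strings of '0'/'1' (1 = asserted)."""
    return [
        {"hbr": h == "1", "sbts": s == "1", "hbg": g == "1"}
        for h, s, g in zip(hbr, sbts, hbg)
    ]


def check_sbts_hbr_ordering(trace):
    """Return a list of (cycle, message) tuples for each rule violation.

    Rules taken from step 2 of the sequence above:
      - HBR is asserted before SBTS.
      - SBTS stays asserted at least until HBG is asserted.
      - SBTS is deasserted before HBR is deasserted.
    """
    errors = []
    prev = {"hbr": False, "sbts": False, "hbg": False}
    hbg_seen = False  # HBG observed since SBTS was last asserted
    for cycle, now in enumerate(trace):
        rising = now["sbts"] and not prev["sbts"]
        falling = prev["sbts"] and not now["sbts"]
        if rising:
            hbg_seen = False
            if not prev["hbr"]:
                errors.append((cycle, "SBTS asserted before HBR"))
        if now["hbg"]:
            hbg_seen = True
        if falling and not hbg_seen:
            errors.append((cycle, "SBTS released before HBG was asserted"))
        if now["sbts"] and not now["hbr"]:
            errors.append((cycle, "SBTS still asserted after HBR deasserted"))
        prev = now
    return errors


# A host sequence that follows the rules: HBR first, SBTS held until HBG,
# SBTS removed before HBR.
good = make_trace(hbr="11111100", sbts="01111000", hbg="00001110")
print(check_sbts_hbr_ordering(good))
```

A checker of this kind is most useful in a host-interface testbench, where it can flag a sequence that removes SBTS too early or holds it past the end of the HBR request.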
ACK must be asserted to deassert HBG and return the bus mastership back to the DSP. ACK is then used for normal DSP transfer requirements.

DSP DMA Access To System Bus

Using the SBTS and HBR inputs to resolve a system bus deadlock, as described in “Deadlock Resolution” on page 7-84, cannot be done for DMA transfers because after a DMA word transfer has begun in the DSP, it must be completed (for example, it must receive the ACK signal). If SBTS and HBR are asserted during a DMA access, the HBG pin is not asserted until the access cycle has completed. If the single DMA access is not allowed to complete, a deadlock condition may result.

To prevent system bus deadlock when using DMA, programs must make sure that SBG has been asserted before the DMA sequence begins. If a higher priority access is needed, the DMA sequence may be held off (by asserting HBR) at any time after a word has been transferred. Systems must make sure that SBG is asserted before HBR is deasserted to prevent the possibility of another deadlock occurring. When the DMA sequence is complete, the DMA interrupt service routine should clear the external SBR flag.

Because the system bus is likely to be considerably slower than the DSP local bus, performance on the local bus may be considerably improved by using handshake mode DMA. In this case, the SBG signal is tied to the DMA request line, DMARx. The local and system bus accesses are only initiated when the system bus is available.

Using a FIFO in the system interface unit, to allow DMA data from the local bus to be posted, may also increase performance on the local bus when using a slow system bus.

Multiprocessing with Local Memory

Figure 7-25 shows how several DSP subsystems may be connected together on a system bus for high throughput. The gate array implements bus arbitration when the system bus is accessed. The buffers isolate the DSP local buses from the system bus.
The example system in Figure 7-25 works in the following way.

• A DSP requests the system bus with SBR when it asserts the MS2 line. The gate array arbitrates between the SBR lines and then enables the highest priority group by asserting SBG, which is tied to FLAG0.

• The master DSP may connect to system memory or to other DSP groups. When the bus buffer is enabled, the read or write strobe enables should be asserted with a delay to allow the address to stabilize.

• To access a DSP slave in another group, the master DSP addresses that group’s multiprocessor memory space. The gate array detects group multiprocessor memory space from three high-order address bits and asserts the HBR line for the selected group. When HBG is asserted, the gate array enables the slave’s bus buffer. The high-order group address bits are cleared by the buffer to allow the group to decode the address as local multiprocessor memory space. The access is synchronous because the CS line is not asserted. The single waitstate option for the bus should be enabled.

• If two groups access each other in the same cycle, a deadlock may occur. The SBTS pin may be used to clear the deadlock.

Figure 7-25. DSP Subsystems On A System Bus
(Block diagram: ADSP-21160 #1 and #2, each with local memory and a local bus buffer, connected through SBR/MS2, SBG/FLAG0, HBR, HBG, SBTS, ACK, ADDR, and DATA to the system bus arbitration gate array and system memory.)

DSP To Microprocessor Interface

A DSP without external memory may connect to a host microprocessor’s bus. Depending on the microprocessor’s I/O capabilities, the interface may not require any buffers. This type of connection assumes that the DSP can execute its application from internal memory most of the time and only occasionally needs to request an external access.
The host microprocessor should always keep the HBR request asserted unless it sees BR1 asserted (for the BRx line of the DSP with ID=001). The host can then deassert HBR to allow the DSP to perform an external access when the host is ready to give up its bus. Usually, the host can read or write to the DSP as needed. The host accesses the DSP by asserting the CS signal and handshaking with REDY. Host Bus Grant (HBG) need not be used in this system.

Multiprocessor (DSPs) Interface

The ADSP-21160 supports connecting to other ADSP-21160s to create multiprocessing DSP systems. This support includes:

• Distributed, on-chip arbitration for the shared external bus

• A unified multiprocessor address space that makes the internal memory and I/O processor registers of all DSPs directly accessible to each DSP (and host interface)

• Dedicated hardware support for interprocessor communication (for example, reflective semaphores)

• Dedicated, point-to-point communication channels between DSPs using the link ports

Figure 7-26 on page 7-93 illustrates a basic multiprocessing system. In a multiprocessor system with several DSPs sharing the external bus, any of the processors can become the bus master. The bus master has control of the bus, which consists of the DATA63-0, ADDR31-0, and associated control lines.

Table 7-11 on page 7-92 shows the external port signals that are needed in multiprocessor DSP arbitration and communication.

Table 7-11. Signal For Cluster Multiprocessor Systems

Signal Types            Signals
Synchronization:        CLKIN, RESET
Arbitration:            BR6-1, PA [1]
Bused Information:      ADDR31-0, DATA63-0
Master Controls:        RDH, RDL, WRH, WRL, BRST
Slave Control:          ACK
Host Interface: [2]     HBR, HBG, CS [3], REDY [3], SBTS

[1] Optional, only needed if Priority Access function is used.
[2] Optional, only needed if Host Interface is used.
[3] Optional, only needed if asynchronous Host Interface employed.

The internal memory and I/O processor registers of the system’s DSPs comprise the multiprocessor memory space. Multiprocessor memory space is mapped into the unified address space of each DSP. For more information, see the multiprocessor memory map in Figure 5-8 on page 5-18.

After a DSP becomes the bus master, it can directly read and write the internal memory of any other slave DSP. The master can also read and write to any of the slave’s I/O processor registers, including their external port FIFO data buffers. For example, the master DSP may write to a slave’s I/O processor registers to set up DMA transfers or to send a vector interrupt.

Multiprocessing System Architectures

Multiprocessor systems typically use one of two schemes to communicate between processor nodes. One scheme uses dedicated point-to-point communication channels. In the other scheme, nodes communicate through a single shared global memory over a parallel bus. The DSP supports point-to-point communication (data flow multiprocessing) through its six link ports. Also, the DSP supports a shared parallel bus communication (cluster multiprocessing) through its link ports and external port.

Figure 7-26. ADSP-21160 Multiprocessor System
(Block diagram: ADSP-2116x #1 through #6 with ID2-0 set to 001 through 110, sharing CLKIN, RESET, RPBA, ADDR31-0, DATA63-0, control lines, PA, and BR1-6, with optional global memory and peripherals, boot EPROM, and host processor interface.)
For more information on data flow multiprocessing, see “Data Flow Multiprocessing” below and “Data Flow Multiprocessing With Link Ports” on page 8-22. For more information on cluster multiprocessing, see “Cluster Multiprocessing” on page 7-95.

Data Flow Multiprocessing

Data flow multiprocessing works for applications requiring high computational bandwidth, but requiring only limited flexibility. The program partitions its algorithm sequentially across multiple processors and passes data through a line of processors, as shown in Figure 7-27.

Figure 7-27. Data Flow Multiprocessing
(Block diagram: three ADSP-21160s connected in a line through their link ports.)

The DSP provides complete support for data flow multiprocessing applications, because the DSP eliminates the need for interprocessor data FIFOs and external memory. The internal memory of the DSP is usually large enough to contain both code and data for most applications using a data flow system topology. All that a data flow system requires is a number of DSPs and point-to-point signals connecting them. This design yields savings in complexity, board space, and system cost. For more information on connecting multiple DSPs using link ports, see “Host Processor Access To Link Buffers” on page 8-11.

Cluster Multiprocessing

Cluster multiprocessing works for applications where a fair amount of flexibility is required. This flexibility is needed when a system must be able to support a variety of different tasks, some of which may be running concurrently. The cluster multiprocessing configuration is shown in Figure 7-28. Also, the DSP has an on-chip host interface that lets a cluster be interfaced to a host processor or another cluster.

Figure 7-28. Cluster Multiprocessing
(Block diagram: three ADSP-21160s, each with link ports, sharing bulk memory through their external ports.)

Cluster multiprocessing systems include multiple DSPs connected to a parallel bus that supports interprocessor access of on-chip memory and access to shared global memory. In a typical cluster of DSPs, up to six processors and a host can arbitrate for the bus. The on-chip bus arbitration logic lets these processors share the common bus. The DSP’s features (such as large internal memory, link ports, and external port FIFOs) help eliminate the need for any extra hardware in the cluster multiprocessor configuration. External memory, both local and global, can frequently be eliminated in this type of system.

The DSP supports fixed and rotating priority schemes. Other supported techniques include bus locking, timed release, DMA prioritization, and core processor access preemption of background DMA transfers. The on-chip arbitration logic lets transitions in bus mastership take at most one cycle of overhead. Bus requests are generated implicitly when a processor accesses an external address. Because each processor monitors all bus requests and applies the same priority logic to the requests, each can independently determine who is the next bus master.

After getting mastership of the bus, a DSP can access external memory and the internal memory and I/O processor registers of all other DSPs (slaves) in the system. A DSP can directly transfer data to another DSP or set up a DMA channel to transfer the data.

The DSPs are mapped into a common memory map, which identifies the address space of each DSP within the unified memory map of the system cluster. Also, each DSP has a unique ID. The DSP’s I/O processor registers, internal memory, and external memory are all part of the unified address space. This shared on-chip memory eliminates the need to use external memory for message passing between DSPs and simplifies software communications.
DSPs can write directly into each other’s memory, saving an extra transfer step. The multiprocessor communication between DSPs is eased with the broadcast write feature, which lets a DSP write to all processors simultaneously. This broadcast supports reflective semaphores, where a processor polls its own internal copy of the semaphore and only uses the external bus for a broadcast write to all other processors when it wants to change it. This reduces communications traffic on the external bus.

The cluster configuration allows the DSPs to have a very fast node-to-node data transfer rate. Clusters also allow a simple, efficient software-communication model. For example, all of the required setup operations for a DMA transfer can be accomplished by a single DSP on one side of the transfer. The other processor is not interrupted until the DMA transfer is complete.

The DSP’s internal memory facilitates I/O in multiprocessor systems. The on-chip, dual-ported RAM supports full-speed inter-DSP transfers concurrent with dual accesses by the DSP’s processor core. No cycles are stolen from the processor core, and the processor’s full performance is maintained during these accesses.

Link Port Data Transfers In A Cluster. A bottleneck exists within the cluster because only two DSPs can communicate over the shared bus during each cycle; other DSPs are held off until the bus is released. Because the DSP can also perform point-to-point link port transfers within a cluster, systems can eliminate this bottleneck by setting up data communication through the link ports. Data links between DSPs can be dynamically set up and initiated over the common bus. All six link ports can operate simultaneously on each DSP. A disadvantage of the link ports is that individual transfers occur at a much lower rate than that of the shared parallel bus.
Because the link ports’ 8-bit data path is smaller than the processor’s native word size, the transfer of each word requires multiple clock cycles. Link ports may also require more software overhead and complexity because they must be set up on both sides of the transfers before they can occur.

SIMD Multiprocessing. For certain classes of applications such as radar imaging, a Single-Instruction, Multiple-Data (SIMD) array of DSPs may be the most efficient topology to coordinate a large number of DSPs in a single system. The SIMD array of Figure 7-28 on page 7-95 consists of multiple DSPs connected in a two- or three-dimensional mesh. The data link ports provide nearest neighbor communications and through-routing of data. A single master DSP provides the instruction stream that the array executes. Data flow in and out of the array can be managed through multiple serial port streams.

Multiprocessor Bus Arbitration

Multiple DSPs can share the external bus with no additional arbitration logic. Arbitration logic is included on-chip to allow the connection of up to six DSPs and a host processor.

The DSP accomplishes bus arbitration through the BR1-6, HBR, and HBG signals. BR1-6 arbitrate between multiple DSPs, and HBR/HBG pass control of the bus from the DSP bus master to the host and back. The priority scheme for bus arbitration is determined by the setting of the RPBA pin. Table 7-12 defines the DSP pins used in multiprocessing systems.

Table 7-12. Multiprocessing DSP Pins

Signal    Type          Definition
BR6-1     I/O/S         Multiprocessing Bus Requests. Used by multiprocessing DSPs to arbitrate for bus mastership. A DSP only drives its own BRx line (corresponding to the value of its ID2-0 inputs) and monitors all others. In a multiprocessor system with fewer than six DSPs, the unused BRx pins should be tied high; the processor’s own BRx line must not be tied high or low because it is an output. Note that ID=00x device enables keeper latch/pull-up devices on certain signals. For more information, see Table 11-1 on page 11-3.
ID2-0     I             Multiprocessing ID. Determines which multiprocessing bus request (BR1 through BR6) is used by the DSP. ID=001 corresponds to BR1, ID=010 corresponds to BR2, etc. (ID=000 is used in single-processor systems.) These lines are a system configuration selection which should be hardwired or only changed at reset.
RPBA      I             Rotating Priority Bus Arbitration Select. When RPBA is high, rotating priority for multiprocessor bus arbitration is selected. When RPBA is low, fixed priority is selected. This signal is a system configuration selection which must be set to the same value on every DSP. If the value of RPBA is changed during system operation, it must be changed in the same CLKIN cycle on every DSP.
PA        I/O/S (a/d)   Priority Access. The DSP slave may assert the PA signal to interrupt background DMA transfers and gain access to the external bus. This signal is asserted when a DSP slave’s processor core requests the bus or if an external DMA channel requests the bus with the DMACx PRIO control bit set. The PA signal is an active drive output, which may be asserted (low) by one or more slaves. It is deasserted (high) by the master. A protocol is employed to avoid driver contention.

I = Input, S = Synchronous, (o/d) = Open Drain; O = Output, A = Asynchronous, (a/d) = Active Drive

The ID2-0 pins provide a unique identity for each DSP in a multiprocessing system. The first DSP should be assigned ID=001, the second should be assigned ID=010, and so on. One of the DSPs must be assigned ID=001 in order for the bus synchronization scheme to function properly.

The DSP with ID=001 holds the external bus control lines stable during reset.

When the ID2-0 inputs of a DSP are equal to 001, 010, 011, 100, 101, or 110, the DSP configures itself for a multiprocessor system and maps its internal memory and I/O processor registers into the multiprocessor memory space. ID=000 configures the DSP for a single-processor system. ID=111 is reserved and should not be used.

A DSP in a multiprocessor system can determine which processor is the current bus master by reading the CRBM2-0 bits of the SYSTAT register. These bits give the value of the ID2-0 inputs of the current bus master.

Conditional instructions can be written that depend on whether the DSP is the current bus master in a multiprocessor system. The assembly language mnemonic for this condition code is Bm, and its complement is Not Bm (not bus master). For more information, see “Conditional Sequencing” on page 3-52. To use the bus master condition, the condition code select (CSEL) field in the MODE1 register must be zero; otherwise, the condition always evaluates as false.

Bus Arbitration Protocol

The Bus Request (BR1-6) pins are connected between each DSP in a multiprocessing system, with the number of BRx lines used equal to the number of DSPs in the system. Each processor drives the BRx pin that corresponds to its ID2-0 inputs and monitors all others. If fewer than six DSPs are used in the system, the unused BRx pins should be tied high.

When one of the slave DSPs needs to become bus master, it automatically initiates the bus arbitration process by asserting its BRx line at the beginning of the cycle. Later in the same cycle, the DSP samples the value of the other BRx lines.

The cycle in which mastership of the bus is passed from one DSP to another is called a Bus Transition Cycle (BTC).
A bus transition cycle occurs when the current bus master’s BRx pin is deasserted and one or more of the slaves’ BRx pins are asserted. The bus master can retain bus mastership by keeping its BRx pin asserted. Also, the bus master does not always lose bus mastership when it deasserts its BRx line: another BRx line must be asserted by one or more of the slaves at the same time. When no other BRx is asserted, the master does not lose any bus cycles.

By observing all of the BRx lines, each DSP can detect when a bus transition cycle occurs and which processor has become the new bus master. A bus transition cycle is the only time that bus mastership is transferred.

After conditions determine that a bus transition cycle is going to occur, every DSP in the system evaluates the priority of the BRx lines asserted within that cycle. For a description of bus arbitration priority, see “Bus Arbitration Priority (RPBA)” on page 7-104. The DSP with the highest priority request becomes the bus master on the following cycle, and all of the DSPs update their internal record to indicate which DSP is the current bus master. This information can be read from the current bus master field, CRBM, of the SYSTAT register. Figure 7-29 shows typical timing for bus arbitration.

Figure 7-29. Bus Arbitration Timing
(Timing diagram: CLKIN, BR1, and BR2, with the BRx sample point marked, plus the execute flow and bus activity of the ADSP-21160s with ID=1 and ID=2 as bus mastership passes from #1 to #2 across bus transition cycles.)

The actual transfer of bus mastership is accomplished by the current bus master three-stating the external bus (DATA63-0, ADDR31-0, RDH, RDL, WRH, WRL, BRST, MS3-0, CIF, PAGE, HBG, DMAG2-1) at the end of the bus transition cycle and the new bus master driving these signals at the beginning of the next cycle. The bus strobes (RDH, RDL, WRH, and WRL) and MS3-0 are driven high (inactive) before three-stating occurs. ACK must be sampled high by the new master before it starts a new bus operation. For more information, see Figure 7-30 on page 7-103.

During bus transition cycle delays, execution of external accesses is delayed. When one of the slave DSPs needs to perform an external read or write, it automatically initiates the bus arbitration process by asserting its BRx line. This read or write is delayed until the processor receives bus mastership. If the read or write was generated by the DSP’s processor core (not the I/O processor), program execution stops on that DSP until the instruction is completed.

The following steps occur as a slave acquires bus mastership and performs an external read or write over the bus, as shown in Figure 7-30.

1. The slave determines that it is executing an instruction which requires an off-chip access. It asserts its BRx line at the beginning of the cycle. Extra cycles are generated by the core processor (or I/O processor) until the slave acquires bus mastership.

2. To acquire bus mastership, the slave waits for a bus transition cycle in which the current bus master deasserts its BRx line. If the slave has the highest priority request in the bus transition cycle, it becomes the bus master in the next cycle. If not, it continues waiting.

3. At the end of the bus transition cycle the current bus master releases the bus, and the new bus master starts driving.
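The arbitration decision that every DSP makes in parallel can be sketched as a small cycle-level model. The Python below is an illustration under stated assumptions, not a description of the on-chip logic: the function names `priority_order`, `next_master`, and `run` are hypothetical, each cycle is reduced to the set of processor IDs whose BRx line is asserted, and the intra-cycle timing (BRx driven at the start of the cycle, sampled later) is not modeled. The fixed and rotating orderings follow “Bus Arbitration Priority (RPBA)”, and the request sequence is read off Table 7-13.

```python
def priority_order(master_id, rotating, num_ids=6):
    """Return the non-master processor IDs, highest priority first."""
    if rotating:
        # Rotating scheme: the processor one place down from the current
        # master gets highest priority, wrapping around from ID6 to ID1.
        return [((master_id - 1 + k) % num_ids) + 1 for k in range(1, num_ids)]
    # Fixed scheme: the lowest ID among the requesters wins.
    return [pid for pid in range(1, num_ids + 1) if pid != master_id]


def next_master(master_id, br_asserted, rotating):
    """Return the ID of the bus master for the following cycle.

    br_asserted is the set of processor IDs whose BRx line is asserted
    in the current cycle.
    """
    slave_requests = set(br_asserted) - {master_id}
    # A bus transition cycle needs the master's BRx deasserted and at
    # least one slave's BRx asserted.
    is_btc = master_id not in br_asserted and bool(slave_requests)
    if not is_btc:
        return master_id
    for pid in priority_order(master_id, rotating):
        if pid in slave_requests:
            return pid
    return master_id  # not reached for IDs in 1..6


def run(initial_master, requests_per_cycle, rotating):
    """Return the bus master in each cycle of the request sequence."""
    masters = []
    master = initial_master
    for br_asserted in requests_per_cycle:
        masters.append(master)
        master = next_master(master, br_asserted, rotating)
    return masters


# BRx activity read off Table 7-13 (M-BR in cycle 2 means that the master,
# ID3, keeps its own BR3 asserted, so no bus transition cycle occurs).
table_7_13_requests = [{3}, {2, 3}, {2}, {1, 6}, {1}]
print(run(1, table_7_13_requests, rotating=True))  # [1, 3, 3, 2, 6]
```

Because every DSP sees the same BRx lines and applies the same rule, each one running this decision independently arrives at the same new master, which is why no external arbiter is needed.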
Figure 7-30. Bus Request and Read/Write Timing
(Timing diagram over six CLKIN cycles: BRx, ADDR, MSx, RDH, RDL, WRH, WRL, BRST, ACK, and DATA for a synchronous write and a synchronous read. Annotations: the highest priority requester becomes bus master; MS and strobes are driven inactive before three-state; a pull-up holds ACK asserted; a synchronous read takes a minimum of two cycles, with the slave deasserting ACK in the second cycle if needed; a BTC does not occur if no other BRx lines are asserted.)

During the CLKIN cycle in which the bus master deasserts its BRx output, it three-states its outputs in case another bus master wins arbitration and enables its drivers in the next CLKIN cycle. If the current bus master retains control of the bus in the next cycle, it enables its bus drivers, even if it has no bus operation to run.

The DSP with ID=00x enables internal keeper latches, or pull-up devices, on key signals, including the address and data buses, strobes, and ACK. These devices provide a weak current source or sink (approximately 20 kΩ impedance) to keep these signals from drifting near input receiver thresholds when all drivers are three-stated. For more information, see Table 11-1 on page 11-3.

When the bus master stops using the bus, its BRx line is deasserted, allowing other DSPs to arbitrate for mastership if they need it. If no other DSPs are asserting their BRx line when the master deasserts its BRx, the master retains control of the bus and continues to drive the memory control signals until: 1) it needs to use the bus again, or 2) another DSP asserts its BRx line.

While a slave waits to be a master for a DMA transfer, it asserts BRx. If that slave’s core accesses the DA group registers, the BRx is deasserted during that access.

Bus Arbitration Priority (RPBA)

To resolve competing bus requests, there are two available priority schemes: fixed and rotating. The RPBA pin selects the scheme.
When RPBA is high, rotating priority bus arbitration is selected, and when RPBA is low, fixed priority is selected.

The RPBA pin must be set to the same value on each DSP in a multiprocessing system. If the value of RPBA is changed during system operation, it must be changed synchronously to CLKIN and must meet a setup time that lets all DSPs recognize the change in the same cycle. The priority scheme changes in that (same) cycle.

In the fixed priority scheme, the DSP with the lowest ID number among the competing bus requests becomes the bus master. If, for example, the processor with ID=010 and the processor with ID=100 request the bus simultaneously, the processor with ID=010 becomes bus master in the following cycle.

Each DSP knows the ID of the other processors requesting the bus, because their ID corresponds to the BRx line being used.

The rotating priority scheme gives roughly equal priority to each DSP. When rotating priority is selected, the priority of each processor is reassigned after every transfer of bus mastership. Highest priority is rotated from processor to processor as if they were arranged in a circle: the DSP located next to (one place down from) the current bus master is the one that receives highest priority. Table 7-13 shows an example of how rotating priority changes on a cycle-by-cycle basis.

Table 7-13. Rotating Priority Arbitration Example

                Hardwired Processor IDs and Priority [1]
Cycle Number    ID1     ID2     ID3     ID4     ID5     ID6
1 [2]           M       1       2-BR    3       4       5
2               4       5-BR    M-BR    1       2       3
3               4       5-BR    M       1       2       3
4               5-BR    M       1       2       3       4-BR
5 [3]           1-BR    2       3       4       5       M

[1] The following symbols appear in these cells: 1-5 = assigned priority, M = bus mastership (in that cycle), BR = requesting bus mastership with BRx
[2] Initial priority assignments
[3] Final priority assignments

Bus Mastership Timeout

In either the fixed or rotating priority scheme, systems may need to limit how long a bus master can retain the bus. Systems can limit bus mastership by forcing the bus master to deassert its BRx line after a specified number of CLKIN cycles and giving the other processors a chance to acquire bus mastership.

To set up a bus master timeout, a program must load the BMAX register with the maximum number of CLKIN cycles (minus 2) that the DSP can retain bus mastership:

BMAX = (maximum # of bus mastership CLKIN cycles) – 2

Internal processor clock cycles are a multiple of CLKIN cycles. The minimum value for BMAX is 2, which lets the processor retain bus mastership for 4 CLKIN cycles. Setting BMAX=1 is not allowed. To disable the bus master timeout function, set BMAX=0.

Each time a DSP acquires bus mastership, its BCNT register is loaded with the value in BMAX. BCNT is then decremented in every CLKIN cycle in which the master performs a read or write over the bus and any other (slave) DSPs are requesting the bus. Any time the bus master deasserts its BRx line, BCNT is reloaded from BMAX.

When BCNT decrements to zero, the bus master first completes its off-chip read/write and then deasserts its own BRx (any new off-chip accesses are delayed), which allows transfer of bus mastership. If the ACK signal is holding off an access when BCNT reaches zero, bus mastership is not relinquished until the access can complete.
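The BMAX formula and the BCNT countdown rules above can be worked through in a short model. This Python sketch is an assumption-laden illustration: the names `bmax_value` and `first_timeout_cycle` are hypothetical, the per-cycle input is a simplified tuple, and the model only reports the cycle in which BCNT reaches zero. It does not model when BRx is actually released afterward (which depends on the access in progress, ACK, bursts, bus lock, and HBR).

```python
def bmax_value(max_mastership_cycles):
    """Return the BMAX setting for a mastership limit given in CLKIN cycles.

    Pass None to disable the timeout (BMAX=0). The smallest legal limit
    is 4 CLKIN cycles, because BMAX=1 is not allowed and BMAX=0 disables
    the function.
    """
    if max_mastership_cycles is None:
        return 0
    if max_mastership_cycles < 4:
        raise ValueError("limit must be at least 4 CLKIN cycles (BMAX >= 2)")
    return max_mastership_cycles - 2


def first_timeout_cycle(bmax, cycles):
    """Return the index of the cycle in which BCNT reaches zero, or None.

    cycles is a sequence of (accessing, slave_requesting, br_asserted)
    booleans, one tuple per CLKIN cycle while this DSP is bus master:
      accessing        - the master performs a read or write over the bus
      slave_requesting - at least one other DSP asserts its BRx
      br_asserted      - the master's own BRx is asserted
    """
    if bmax == 0:
        return None  # timeout function disabled
    bcnt = bmax  # loaded when mastership is acquired
    for n, (accessing, slave_requesting, br_asserted) in enumerate(cycles):
        if not br_asserted:
            bcnt = bmax  # reloaded whenever the master deasserts BRx
            continue
        if accessing and slave_requesting:
            bcnt -= 1
            if bcnt == 0:
                return n
    return None


# A 100-cycle mastership limit corresponds to BMAX = 98.
print(bmax_value(100))
```

Note that cycles in which no slave is requesting the bus do not count against the master, so an uncontended master is never forced off the bus by the timeout.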
If BCNT reaches zero while a burst transfer is in progress, the bus master completes the burst transfer before deasserting its BRx output. If BCNT reaches zero while bus lock is active, the bus master does not deassert its BRx line until bus lock is removed. If HBR is being serviced, BCNT stops decrementing and continues only after HBR is deasserted.

Bus lock is enabled by the BUSLK bit in the MODE2 register. For more information, see “Bus Lock and Semaphores” on page 7-118.

Priority Access

The Priority Access signal (PA) lets external bus accesses by a slave DSP take priority over ongoing DMA transfers. Normally when external port DMA transfers are in progress, the slave DSPs cannot use the external bus until the DMA transfer is finished. By asserting its PA pin, the slave DSP can acquire the bus without waiting for the DMA operation to complete. The PA signal can also be asserted by a slave with a high-priority DMA access pending on the external bus.

If the PA signal is not used in a multiprocessor system, the DSP bus master does not give up the bus to another DSP until: (1) a cycle in which it does not perform an external bus access or (2) a bus timeout. If a slave DSP needs to send a high priority message or perform an important data transfer, it normally must wait until any DMA operation completes. Using the PA signal lets the slave perform its higher priority bus access with less delay.

Each of the DMACx registers has a PRIO bit that raises that DMA channel to a higher priority than all other internal DMA channels that do not have the PRIO bit set. Unless configured differently with the EBPR bit in the SYSCON register, this channel still has lower priority (internally) than the core.
Programs should minimize the number of DMA channels enabled to high priority status in the multiprocessor system, because both core and (external) high priority DMA requests from slaves are arbitrated at the same priority level. For example, a slave core cannot arbitrate bus ownership away from a high priority DMA transfer unless the bus timeout (BMAX function) occurs.

When PA is asserted, the current DSP bus master deasserts its BRx output, and gives up the bus, provided:

• Its core does not have an external access pending, and
• None of its external bus DMA channels have pending high-priority bus requests.

All DSP slaves also deassert their BRx outputs, if each slave meets the same provisions.

The current bus master never asserts PA, because it already has control of the bus. If the current master detects a condition that would assert PA while it is bus master, it performs that high priority operation before giving up bus ownership.

In the CLKIN cycle after PA has been asserted, only the DSP slaves with a pending high priority access have their bus requests asserted. Bus arbitration proceeds as usual with the highest priority device becoming the master when the previous bus master releases its BRx output.

The new master samples all BRx inputs after gaining bus mastership, during the cycle that follows the BTC. If no other bus requests are asserted, the master is the only device driving PA, and the master deasserts and three-states PA in this cycle as shown in Figure 7-31. If the master samples other BRx inputs as asserted, multiple devices are driving PA, and the new bus master cannot deassert PA. The new bus master three-states its PA driver in this case.

All DSP slaves recognize the cycle following the BTC. They do not assert PA during this cycle, unless they were already driving their BRx and PA outputs in the BTC.
This behavior is demonstrated in Figure 7-32.

Figure 7-31. Example PA Deassertion (timing diagram over cycles 1-4 showing BR1-5, BR6, and PA, with the BTC marked; annotations: "All ADSP-21160s that do not have core access pending remove their BRx", "Slaves cannot assert PA in this cycle", "Bus Master samples all other BRx negated and negates PA")

Figure 7-32. Example of PA Driven by Multiple Slaves (timing diagram over cycles 1-4 showing BR1-6 and PA, with the BTC marked; annotations: "All ADSP-21160s that do not have core access pending remove their BRx", "Bus Master samples other BRx asserted and three-states (only) PA", "Slaves continue to assert PA in this cycle")

Bus Synchronization After Reset

When a multiprocessing system is reset (RESET asserted), the bus arbitration logic on each processor must synchronize, making sure that only one DSP drives the external bus. One DSP must become the bus master, and all other processors must recognize which one it is before actively arbitrating for the bus. The bus synchronization scheme also lets the system safely bring individual DSPs into and out of reset.

The only difference between the soft and hard reset is that the external bus arbitration is not affected by a soft reset (no bus synchronization at soft reset). The PLL also is not reset at soft reset.

One of the DSPs in the system must be assigned ID=001 in order for the bus synchronization scheme to function properly. This processor also holds the external bus control lines stable during reset. Bus arbitration synchronization is disabled if the DSP is in a single-processor system (ID=000).

To synchronize their bus arbitration logic and define the bus master after a system reset, the multiple DSPs obey the following rules:

• All DSPs except the one with ID=001 deassert their BRx line during reset. They keep their BRx deasserted for at least two cycles after reset and until their bus arbitration logic is synchronized.
• After reset, a DSP considers itself synchronized when it detects a cycle in which only one BRx line is asserted. The DSP identifies the bus master by recognizing which BRx is asserted and updates its internal record to indicate the current master.
• The DSP with ID=001 asserts its BRx (BR1) during reset and for at least two cycles after reset. If no other BRx lines are asserted during these cycles, the DSP with ID=001 drives the memory control signals to prevent them from glitching. Although it is asserting its BRx and driving the memory control signals during these cycles, this DSP does not perform reads or writes over the bus.

If the DSP with ID=001 is synchronized by the end of the two cycles following reset, it becomes the bus master. If it is not synchronized at this time, it deasserts its BRx (BR1) and waits until it is synchronized.

When a DSP has synchronized itself, it sets the BSYN bit in the SYSTAT register.

If one DSP comes out of reset after the others have synchronized and started program execution, that DSP may not be able to synchronize immediately (for example, if it detects more than one BRx line asserted). If the unsynchronized processor tries to execute an instruction with an off-chip read or write, it cannot assert its BRx line to request the bus, and execution is delayed until it can synchronize and correctly arbitrate for the bus.

Synchronization cannot occur while HBG is asserted, because bus arbitration is suspended while the bus is controlled by a host. If HBR is asserted immediately after reset and no bus arbitration has taken place, the DSP with ID=001 is considered to be the last bus master.

The DSP with ID=001 maintains correct logic levels on the RDH/L, WRH/L, MS3-0, CIF, PAGE, and HBG signals during reset.
Because the “001” processor can be accidentally reset by an erroneous write to the soft reset bit (SRST) of the SYSCON register, it behaves in the following manner during reset.

• While it is in reset, the DSP with ID=001 attempts to gain control of the bus by asserting BR1.
• While it is in reset, the DSP with ID=001 drives the RDH/L, WRH/L, MS3-0, CIF, DMAG1, DMAG2, PAGE, and HBG signals only if it determines that it has control of the bus. For the DSP to decide it has control of the bus, two conditions must be true: 1) BR1 was asserted and no other BRx lines were asserted in the previous cycle, and 2) HBG was deasserted in the previous cycle.

Timing differences occur during processor reset (RESET) or software reset (SRST bit in SYSCON register = 1) for signal deassertion (MS3-0, HBG, DMAGx, RDx, WRx, CIF, PAGE, BRST) and three-stating (FLAG3-0, LxCLK, LxACK, LxDAT7-0, ACK, REDY, PA, TFSx, RFSx, TCLKx, RCLKx, DTx, BMS, TDO, EMU, DATA). These timings occur asynchronously to CLKIN and may not meet the specifications published in the ADSP-21160 DSP Microcomputer Data Sheet Timing Requirements and Switching Characteristics tables. Refer to the ADSP-21160 DSP Microcomputer Data Sheet for more information.

The DSP with ID=001 continues to drive the RDH/L, WRH/L, MS3-0, CIF, DMAG1, DMAG2, PAGE, and HBG signals for two cycles after reset, as long as neither HBG nor any other BRx lines are asserted. At the end of the second cycle it assumes bus mastership (if it is synchronized), and normal bus arbitration begins in the following cycle. If it is not synchronized, it deasserts BR1, stops driving the memory control signals, and does not arbitrate for the bus until it becomes synchronized.

Although the bus synchronization scheme allows individual processors to be reset, the DSP with ID=001 may fail to drive the memory control signals if it is in reset while any other processors are asserting their BRx line.
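The synchronization rule (exactly one BRx line asserted) and the two conditions under which the DSP with ID=001 decides it has control of the bus can be expressed as simple predicates. The following C sketch is illustrative only: the function names and the bit-mask encoding of the BRx lines (bit n-1 set when BRn is asserted, regardless of the electrical polarity of the pins) are assumptions, and the code models the decision logic rather than any register interface.

```c
#include <assert.h>
#include <stdbool.h>

/* br_mask: hypothetical encoding, bit n-1 is 1 when BRn is asserted (n = 1..6). */

/* Returns the ID (1..6) of the bus master when exactly one BRx line is
 * asserted, or 0 when the observer cannot synchronize in this cycle. */
int sync_master_id(unsigned br_mask)
{
    int id;

    assert((br_mask & ~0x3Fu) == 0);       /* only BR1..BR6 exist */
    if (br_mask == 0 || (br_mask & (br_mask - 1)) != 0)
        return 0;                          /* no request, or more than one */
    for (id = 1; (br_mask >> (id - 1)) != 1u; id++)
        ;                                  /* find the single set bit */
    return id;
}

/* True when the ID=001 DSP may treat itself as controlling the bus:
 * only BR1 was asserted, and HBG was deasserted, in the previous cycle. */
bool id1_controls_bus(unsigned prev_br_mask, bool prev_hbg_asserted)
{
    return prev_br_mask == 0x01u && !prev_hbg_asserted;
}
```

With this encoding, a mask of 0x20 identifies the DSP with ID 6 as the master, while a mask of 0x03 (BR1 and BR2 both asserted) leaves the observer unsynchronized.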
If the DSP with ID=001 has asserted HBG while it is in reset, it is synchronized when RESET is deasserted. This lets the host start using the bus while the DSPs are still in reset. If a host processor attempts to reset the DSP bus master (which is driving the HBG output), the host immediately loses control of the bus.

During reset, the ACK line is pulled high internally by the DSP bus master with a 2 kΩ equivalent resistor.

Booting Another DSP

If the system uses one DSP to boot another DSP over the cluster bus, the master DSP must do the following to communicate with the slave DSP (the DSP that boots) through the external port interface:

1. Program the PMODE field in DMAC10 of the booting DSP for no packing.
2. Make 48-bit writes to EPB0 on the booting DSP.

Multiprocessor Direct Writes and Reads

A DSP bus master has the same type of access as a host processor to read or write the internal memory and I/O processor registers of a slave DSP. A DSP bus master or host processor can access the slave by reading and writing to the appropriate address in multiprocessor memory space; this is called a direct read or direct write access. For more information, see “Slave Direct Reads and Writes” on page 7-66.

Each DSP bus slave monitors addresses driven on the external bus and responds to any that fall within its region of multiprocessor memory space. These accesses are invisible to the slave DSP’s processor core. They do not degrade internal memory or internal bus performance as seen by the core. This feature lets the processor core continue program execution uninterrupted.

The DSP bus master can directly read and write the slave’s I/O processor registers (for example, SYSCON, SYSTAT) to send a vector interrupt or to set up DMA transfers.

The DSP supports 64-bit direct writes through normal word (32-bit) address space.
The master can perform a 64-bit DMA to a slave by asserting Normal word addresses in multiprocessor slave space with the stride set to 2. The master can also do 64-bit direct writes by asserting Normal word addresses in multiprocessor slave space with the LW mnemonic set or with SIMD enabled.

IOP Shadow Registers

In a multiprocessing system, read access to another DSP’s PC or MODE2 register provides useful information. The DSP’s I/O processor registers include registers that shadow or mirror the program counter (PC) and MODE2 registers. For more information, see “IOP Shadow Registers” on page 7-66.

Instruction Transfers

Multiprocessor instruction transfers to or from the internal memory of a DSP should use 64-bit transfers for maximum performance (described below). If 48-bit internal transfers are required, one of the slave EPBx FIFOs must be employed, using the packing mode function (PMODE) of the DMA channel. Maximum throughput is achieved by transferring packed instructions to or from internal memory, using the full 64-bit data bus width synchronously. For more information, see the Packed versus Unpacked instruction discussion on page 6-27.

For packed 64-bit direct reads or writes of 48-bit memory, the addresses must be translated as during a host processor transfer. For more information, see “Instruction Transfers” on page 7-67.

Direct Writes

When a direct write to a slave DSP occurs, the address, data, and control are latched into a dedicated direct write FIFO. For more information, see “Direct Writes” on page 7-68.

Direct Write Latency

The DSP handles asynchronous and synchronous direct writes differently. This difference influences the latency for the direct writes. For more information, see “Direct Write Latency” on page 7-69.
Direct Reads

When a direct read of a DSP occurs, the address is latched on-chip by the I/O processor at the end of the first CLKIN cycle. ACK is deasserted in the following CLKIN cycle. When the data is available, the I/O processor drives the data and asserts REDY (or ACK). Direct reads cannot be pipelined like direct writes; they only occur one at a time. See “Direct Reads” on page 7-70.

Broadcast Writes

Broadcast writes allow simultaneous transmission of data to all of the DSPs in a multiprocessing system. The master DSP can perform broadcast writes to the same memory location or I/O processor register on all of the slaves. During broadcast writes, the master also writes to itself unless the broadcast is a DMA write.

Broadcast writes can be used to implement reflective semaphores in a multiprocessing system. For more information, see “Bus Lock and Semaphores” on page 7-118. Broadcast writes also can simultaneously transfer code or data to multiple processors.

The highest region of multiprocessor memory space, addresses 0x0070 0000 to 0x007F FFFF, is used for broadcast writes. When a write address falls within this region, each DSP slave responds by accepting the access; the master DSP also accepts its own broadcast write. A read cycle generated in the broadcast write region reads the corresponding location in that processor’s internal memory and does not assert the processor’s BRx.

Figure 7-33 shows the timing for a typical broadcast write. In this example, the first broadcast write is accepted by all slaves in cycle 1. This broadcast write fills the write buffer capacity of one or more of the DSP slaves for less than three CLKIN cycles. The second broadcast write stalls on the bus until write capacity is available in all of the slaves, as indicated by none of the slaves deasserting ACK in cycle 4. The master, having sampled ACK deasserted at the end of cycle 2, pre-charges ACK in cycle 3.
The master samples ACK asserted at the end of cycle 4, indicating that all slaves have accepted the second broadcast write.

Figure 7-33. Broadcast Write Timing Example (timing diagram over CLKIN cycles 1-4 showing ADDR, DATA, WRH/L, ACK, ACK (master), and ACK (slaves) for broadcast writes 1 and 2)

Because the master DSP must wait for a broadcast write to complete on all of the slaves, the acknowledge signal is handled differently to prevent drive conflicts on the ACK line. A wired-OR acknowledge signal is implemented to respond to broadcast writes.

This protocol operates as follows.

1. The DSP does not assert the strobes for broadcast write, unless it samples ACK asserted at the end of the previous cycle.
2. The synchronous broadcast write completes on the bus in the first cycle if ACK is sampled asserted at the end of that cycle.
3. In the first cycle of the broadcast write and in all succeeding odd cycles, a slave DSP deasserts ACK if it is not ready to allow the broadcast write to complete on the bus. If it is ready, it does not drive the ACK line.
4. During all succeeding even cycles in which the broadcast write is not finished, the slave DSPs do not drive ACK. Instead, the master DSP drives (that is, pre-charges) ACK high and must continue the write. (Iterate steps 3 and 4.)

In most cases, the ACK signal is high, and the DSP slaves are ready to accept data at the start of the broadcast write, so the write completes in one cycle. If the ACK signal is low or one of the slaves is not ready to accept the data, the broadcast write takes a minimum of three cycles.

The DSP with ID=00x enables a keeper latch on the ACK line to prevent the signal from drifting. This eliminates any power consumption caused by the signal drifting to the switching point and improves the robustness of broadcast writes.

Multiprocessor systems that use broadcast writes should keep the ACK signal line as free of noise as possible.
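The broadcast region given earlier in this section (0x0070 0000 to 0x007F FFFF) can be recognized with a range check. The C sketch below is a hedged illustration: the macro and function names are hypothetical, and treating the offset within the region as selecting the "corresponding location" in each processor is an assumption based on the description above, not a statement of the address decoder's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BCAST_FIRST 0x00700000u  /* first address of the broadcast region */
#define BCAST_LAST  0x007FFFFFu  /* last address of the broadcast region */

/* True when a write to addr falls in the broadcast write region. */
bool is_broadcast_addr(uint32_t addr)
{
    return addr >= BCAST_FIRST && addr <= BCAST_LAST;
}

/* Offset of addr within the broadcast region (assumed here to select
 * the corresponding location in each DSP). */
uint32_t bcast_offset(uint32_t addr)
{
    assert(is_broadcast_addr(addr));
    return addr - BCAST_FIRST;
}
```

A bus monitor or test bench could use `is_broadcast_addr()` to decide whether every slave, rather than a single addressed slave, is expected to accept a given write.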
Shadow Write FIFO

Because the DSP’s internal memory must operate at high speeds, writes to the memory do not go directly into the memory array, but rather to a two-deep FIFO called the shadow write FIFO. The operation of this FIFO is transparent to program execution. For more information, see “Shadow Write FIFO” on page 7-71.

Data Transfers Through the EPBx Buffers

The DSP bus master can transfer data to and from the slave DSPs through the external port FIFO buffers, EPB0, EPB1, EPB2, and EPB3. Each of these buffers, which are part of the I/O processor register set, is an eight-location FIFO. Both single-word transfers and DMA transfers can be performed through the EPBx buffers. DMA transfers are handled internally by the DSP’s I/O processor, but single-word transfers must be handled by the DSP core. For more information, see “Data Transfers Through the EPBx Buffers” on page 7-71.

The DSP supports synchronous burst read transfers from the EPBx FIFOs, or direct read from internal memory, as a slave. Burst write transfers are not supported. Burst reads are supported as contiguous, aligned, 64-bit data transfers up to a maximum length of four 64-bit transfers. The DSP slave increments the address if the burst read access is from internal memory space only. The slave address increment function is only supported for ADDR2-1 (similar to SBSRAMs).

To perform a burst read transfer from an EPBx buffer, the DSP master issues a starting burst address pointing to one of the EPBx buffer addresses in I/O processor control register space. The DSP slave does not increment an EPBx burst read address, and the master DSP limits the burst transfer length to the modulo-4 address boundary restriction.

Bus Lock and Semaphores

Semaphores can be used in multiprocessor systems to allow the processors to share resources such as memory or I/O.
A semaphore is a flag that can be read and written by any of the processors sharing the resource. The value of the semaphore tells the processor when it can access the resource. Semaphores are also useful for synchronizing the tasks being performed by different processors in a multiprocessing system.

With the use of its bus lock feature, the DSP can read and modify a semaphore in a single indivisible operation, a key requirement of multiprocessing systems.

Because both external memory and each DSP’s internal memory (and I/O processor registers) are accessible by every other DSP, semaphores can be located almost anywhere. Read-modify-write operations on semaphores can be performed if all of the DSPs obey two simple rules:

• A DSP must not write to a semaphore unless it is the bus master. This is especially important if the semaphore is located in the DSP’s own internal memory or I/O processor registers.
• When attempting a read-modify-write operation on a semaphore, the DSP must have bus mastership for the duration of the operation.

Both rules apply when a DSP uses its bus lock feature, which retains its mastership of the bus and prevents the other processors from simultaneously accessing the semaphore.

Bus lock is requested by setting the BUSLK bit in the MODE2 register. When this happens, the DSP initiates the bus arbitration process in the usual fashion, by asserting its BRx line. When it becomes bus master, it locks the bus (that is, retains bus mastership) by keeping its BRx line asserted even when it is not performing an external read or write. Host Bus Request (HBR) is also ignored during a bus lock. When the BUSLK bit is cleared, the DSP gives up the bus by deasserting its BRx line.
While the BUSLK bit is set, the DSP can determine if it has acquired bus mastership by executing a conditional instruction with the Bus Master (Bm) or Not Bus Master (Not Bm) condition codes, for example:

    IF NOT BM JUMP(PC,0); /* wait for bus mastership */

If it has become the bus master, the DSP can proceed with the external read or write. If not, it can clear its BUSLK bit and try again later.

A read-modify-write operation is accomplished with the following steps.

1. Request bus lock by setting the BUSLK bit in MODE2.
2. Wait for bus mastership to be acquired.
3. Wait until Direct Write Pending (DWPD) is zero.
4. Read the semaphore, test it, then write to it.

Locking the bus prevents other processors from writing to the semaphore while the read-modify-write is occurring. After bus mastership is acquired, the Direct Write Pending (DWPD) bit’s status in SYSTAT must be checked to ensure that a semaphore write by another processor is not pending.

If the semaphore is reflective, located in the DSP’s internal memory or an I/O processor register, the processor must write to it only when it has bus lock.

Bus lock can be used in combination with broadcast writes to implement reflective semaphores in a multiprocessing system. The reflective semaphore should be located at the same address in internal memory or I/O processor register of each DSP. To check the semaphore, each DSP simply reads from its own internal memory. To modify the semaphore, a DSP requests bus lock and then performs a broadcast write to the semaphore address on every DSP, including itself. Before modifying the semaphore, the DSP must re-check it to verify that another processor has not changed it.

With reflective semaphores, the external bus is used only for updating the semaphore, not for reading it. This technique reduces bus traffic.
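The four read-modify-write steps listed in this section can be modeled in C as shown below. This is a host-side sketch under stated assumptions, not DSP code: the `sem_model` structure stands in for the BUSLK bit in MODE2, the BM condition, the DWPD bit in SYSTAT, and the semaphore location, and all names are hypothetical. The convention that 0 means "free" and a nonzero tag identifies the owner is also an assumption. In place of waiting indefinitely for mastership, the sketch follows the alternative described above of clearing BUSLK and retrying later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for hardware state; all names are hypothetical. */
typedef struct {
    volatile bool     buslk;       /* BUSLK bit in MODE2 */
    volatile bool     bus_master;  /* result of the BM condition */
    volatile bool     dwpd;        /* DWPD bit in SYSTAT */
    volatile uint32_t semaphore;   /* assumed: 0 = free, nonzero = owner tag */
} sem_model;

typedef enum { SEM_TAKEN, SEM_BUSY, SEM_NOT_MASTER } sem_result;

/* One attempt to take the semaphore for owner_tag (must be nonzero). */
sem_result sem_try_take(sem_model *m, uint32_t owner_tag)
{
    sem_result r;

    assert(owner_tag != 0);
    m->buslk = true;                /* step 1: request bus lock */
    if (!m->bus_master) {           /* step 2: mastership not acquired */
        m->buslk = false;           /* clear BUSLK and try again later */
        return SEM_NOT_MASTER;
    }
    while (m->dwpd) { }             /* step 3: wait until DWPD is zero */
    if (m->semaphore == 0) {        /* step 4: read and test ... */
        m->semaphore = owner_tag;   /* ... then write while the bus is locked */
        r = SEM_TAKEN;
    } else {
        r = SEM_BUSY;
    }
    m->buslk = false;               /* release the bus lock */
    return r;
}
```

A caller would typically retry on `SEM_NOT_MASTER` or `SEM_BUSY` after doing other work, which keeps the bus lock short and matches the goal of minimizing the time other processors are held off the bus.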
Interprocessor Messages and Vector Interrupts

The DSP bus master can communicate with slave DSPs by writing messages to their I/O processor registers. The MSGR0-7 registers are general-purpose registers which can be used for convenient message passing between DSPs. They are also useful for semaphores and resource sharing between multiple DSPs. The MSGRx and VIRPT registers can be used for message passing in the following ways.

• Message Passing. The master DSP can communicate with a slave DSP by writing and/or reading any of the 8 message registers, MSGR0-MSGR7, on the slave.
• Vector Interrupts. The master DSP can issue a vector interrupt to a slave by writing the address of an interrupt service routine to the slave’s VIRPT register. This causes an immediate high-priority interrupt on the slave which, when serviced, causes it to branch to the specified service routine.

The MSGRx and VIRPT registers also support the host processor interface. Because these registers may be shared resources within a single DSP, conflicts may occur; system software must prevent this. For further discussion of I/O processor register access conflicts, see “I/O Processor Registers” on page A-33.

Message Passing (MSGRx)

There are three methods by which the DSP bus master can communicate with a slave through the MSGRx message registers: 1) vector-interrupt driven, 2) register handshake, and 3) register write-back. These techniques are the same as for a host processor. For more information, see “Message Passing (MSGRx)” on page 7-78.

Vector Interrupts (VIRPT)

Vector interrupts are used for interprocessor commands between two DSPs or between a host and the DSP. When the external processor writes an address to the DSP’s VIRPT register, a vector interrupt is caused. Vector interrupts operate the same for host and multiprocessor systems. For more information, see “Host Vector Interrupts (VIRPT)” on page 7-79.
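A processor that issues vector interrupts can tell when the slave has serviced the previous one by examining the VIPD bit that Table 7-14 places at bit 13 of the slave’s SYSTAT register. The following C sketch decodes a SYSTAT value that has already been read; how the value is obtained (for example, through a direct read of the slave’s I/O processor registers) is outside the sketch. The function names are hypothetical; only the bit positions are taken from Table 7-14.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extracts bits hi..lo (inclusive) of a 32-bit register value. */
uint32_t reg_field(uint32_t reg, unsigned hi, unsigned lo)
{
    assert(lo <= hi && hi < 32);
    uint32_t width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (reg >> lo) & mask;
}

/* Bit positions per Table 7-14; names are illustrative. */
bool     systat_vipd(uint32_t systat) { return reg_field(systat, 13, 13) != 0; }
bool     systat_dwpd(uint32_t systat) { return reg_field(systat, 12, 12) != 0; }
uint32_t systat_idc(uint32_t systat)  { return reg_field(systat, 10, 8); }
uint32_t systat_crbm(uint32_t systat) { return reg_field(systat, 6, 4); }

/* A new vector interrupt may be issued once the previous one is serviced. */
bool can_issue_vector_interrupt(uint32_t slave_systat)
{
    return !systat_vipd(slave_systat);
}
```

The same extractor covers the other SYSTAT fields; for instance, comparing `systat_crbm()` with `systat_idc()` indicates whether the DSP that owns the register is the current bus master, subject to the validity note in Table 7-14 for ID=000.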
Multiprocessor Interface Status

The SYSTAT register provides status information for host and multiprocessor systems. Table 7-14 shows the status bits in this register. For more information on the SYSTAT register, see Table A-20 on page A-51.

Table 7-14. SYSTAT System Status Register

Bit #      Name       Definition
0          HSTM       Host Mastership. Indicates whether the host processor has been granted control of the bus.
                      1=Host is bus master
                      0=Host is not bus master
1          BSYN       Bus Synchronization. Indicates when the DSP’s bus arbitration logic is synchronized after reset. For more information, see “Bus Synchronization After Reset” on page 7-110.
                      1=Bus arbitration logic is synchronized
                      0=Bus arbitration logic is not synchronized
[3:2]      reserved
[6:4]      CRBM       Current Bus Master. Indicates the ID of the DSP that is the current bus master. If CRBM is equal to the ID of this DSP then it is the current bus master. CRBM is only valid for ID2-0 > 0 (greater than zero). When ID2-0=000, CRBM is always 1.
7          reserved
[10:8]     IDC        ID Code. Indicates the ID2-0 inputs of this DSP.
11         reserved
12         DWPD       Direct Write Pending. Indicates when a direct write to the DSP’s internal memory is pending. The DWPD bit is cleared when the direct write has been completed. (Direct writes may be delayed for several cycles if DMA chaining is underway or if higher priority DMA requests occur. Maximum delay is 12 cycles.)
                      1=Direct write pending
                      0=No direct write pending
13         VIPD       Vector Interrupt Pending. Indicates that a pending vector interrupt has not yet been serviced. The VIPD bit is set when the VIRPT register is written to and is cleared upon return from the interrupt service routine. The host processor (or other DSP) that issued the vector interrupt should monitor this bit to determine when the service routine has been completed (and when a new vector interrupt may be issued).
                      1=Vector interrupt pending
                      0=No vector interrupt pending
[15:14]    HPS        Host Packing Status. Indicates when host word packing is completed or, if not, what stage of the packing/unpacking process is taking place.
                      00=Process complete
                      01=First stage of process
                      10=Second stage of process
[19:16]    CRAT       Processor Core Clock (CCLK)-to-CLKIN clock ratio, as determined by the CLK_CFG0-3 inputs
[31:20]    reserved

8 LINK PORTS

This chapter describes the ADSP-21160 DSP’s link ports. The DSP has six 8-bit wide link ports, which can connect to other DSPs’ or peripherals’ link ports.

Overview

Each of the ADSP-21160 DSP’s six bidirectional link ports has eight data lines, an acknowledge line, and a clock line. Link ports can operate at frequencies up to the same speed as the DSP’s internal clock, letting each port transfer up to 8 bits of data per internal clock cycle. Link ports also:

• Operate independently and simultaneously.
• Pack data into 32- or 48-bit words; this data can be directly read by the DSP or DMA-transferred to or from on-chip memory.
• Are accessible by the external host processor, using direct reads and writes.
• Have double-buffered transmit and receive data registers.
• Include programmable clock/acknowledge controls for link port transfers. Each link port has its own dedicated DMA channel.
• Provide high-speed, point-to-point data transfers to other DSP processors, allowing differing types of interconnections between multiple DSPs, including 1-, 2- and 3-dimensional arrays.
ADSP-21160 DSP’s link ports are logically (but not electrically) compatible with previous SHARC DSP (ADSP-2106x DSPs) link ports. For more information, see “Link Data Path (and Compatibility) Modes” on page 8-7.

Table 8-1 on page 8-2 lists the pins associated with each link port. Each link port consists of eight data lines (LxDAT7-0), a link clock line (LxCLK), and a link acknowledge line (LxACK). The LxCLK line allows asynchronous data transfers and the LxACK line provides handshaking. When configured as a transmitter, the port drives both the data and LxCLK lines. When configured as a receiver, the port drives the LxACK line. Figure 8-1 shows link port connections.

Table 8-1. Link Port Pins

Link Port Pin(s)     Link Port Function
LxDAT7-0             Link Port x Data
LxCLK                Link Port x Clock
LxACK                Link Port x Acknowledge

“x” denotes the link port number, 0-5

Figure 8-1. Link Port Pin Connections (diagram: a transmitter link port connected to a receiver link port by the 8-bit LxDAT7-0 bus, LxCLK, and LxACK; “x” denotes the link port number, 0-5)

Link Port To Link Buffer Assignment

There are six buffers, LBUF0-5, that buffer the data flow through the link ports. These buffers are independent of the link ports and may be connected to any of the six link ports. The link ports receive and transmit data on their LxDAT7-0 data pins. Any of the six link buffers may be assigned to handle data for a particular link port. The data in the link buffers can be accessed with DMA or processor core control.

“Link Port x” does not necessarily connect to “Link Buffer x.”

Figure 8-2 shows a block diagram of the link ports and link buffers.
Figure 8-2. Link Ports and Buffers (block diagram: link buffers LBUF0-5 connect through a cross-bar connection, controlled by the Link Assignment Register (LAR), to link ports LP0-5; also shown are the external packing register, the internal register, the link clock (1x, .5x, .33x, or .25x), and the DM, PM, and I/O data buses)

The Link Assignment Register (LAR) assigns the link buffer-to-port connections. Memory-to-memory transfers may be accomplished by assigning the same link port to two buffers, setting up a loopback mode. For details on the LAR register, see “Link Port Assignment Register (LAR)” on page A-67. Assigning more than two buffers to one port disables the port.

Link Port DMA Channels

DMA channels 4-9 support buffers 0-5. The buffer channel pairings are listed in Table 8-2.

Table 8-2. DMA Channel/Link Buffer Pairing

DMA Channel #        Link Buffer Supported
DMA Channel 4        Link Buffer 0
DMA Channel 5        Link Buffer 1
DMA Channel 6        Link Buffer 2
DMA Channel 7        Link Buffer 3
DMA Channel 8        Link Buffer 4
DMA Channel 9        Link Buffer 5

For more information, see “Setting I/O Processor—LPort Modes” on page 6-43.

Link Port Booting

Systems may boot the DSP through a link port. For more information, see “Bootloading Through The Link Port” on page 6-87.

Setting Link Port Modes

The SYSCON, LCOM, LAR, and LCTLx registers control the link port operating modes for the I/O processor. Table A-17 on page A-45 lists all the bits in SYSCON, Table A-24 on page A-65 lists all the bits in LCOM, Table A-25 on page A-67 lists all the bits in LAR, and Table A-22 on page A-63 lists all the bits in LCTLx.

The following bits control link port modes. Some other bits in the SYSCON, LCOM, LAR, and LCTLx registers set up DMA and I/O processor related link port features.
For information on these features, see “Setting I/O Processor—LPort Modes” on page 6-43.

• Link Buffer Mesh Multiprocessing. LCOM Bit 20 (LMSP). This bit enables (if set, =1) or disables (if cleared, =0) mesh multiprocessing.
• Link Path (Mesh Multiprocessing) Delay. LCOM Bits 22-21 (LPATHD). These bits apply changeover delays when changing to the next LPATH register as follows: 00=no additional delay, 01=1 additional delay, 10=2 additional delays, 11=3 additional delays.
• Link Port Clock Divisor. LCTL0 Bits 6-5, 16-15, and 26-25 and LCTL1 Bits 6-5, 16-15, and 26-25 (LxCLKD). These bits select the transfer clock divisor for the corresponding link buffer (LBUFx). The transfer clock equals the DSP’s internal clock (CCLK) divided by LxCLKD, where LxCLKD is: 01=1, 10=2, 11=3, or 00=4.
• Link Port pull-down Resistor Disable/Enable. LCTL0 Bits 8, 18, and 28 and LCTL1 Bits 8, 18, and 28 (LxPDRDE). This bit disables (if set, =1) or enables (if cleared, =0) the internal pull-down resistors on the LxCLK, LxACK, and LxDAT7-0 pins of the corresponding link port; this bit applies to the port, which is not necessarily the port assigned to the corresponding link buffer (LBUFx).

If multiple link ports are bussed together and you have the Link Port pull-down Resistor enabled on all the processors, you will heavily load the line. Ensure that you have only one DSP enabling this functionality.

• Link Port Data Path Width. LCTL0 Bits 9, 19, and 29 and LCTL1 Bits 9, 19, and 29 (LxDPWID). This bit selects the link port data path width (8-bit if set, =1) (4-bit if cleared, =0) for the corresponding link buffer (LBUFx).

The DSP’s internal clock (CCLK) is the CLKIN frequency multiplied by a clock ratio (CLK_CFG3-0). For more information, see the clock ratio discussion on page 11-8.
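The LxCLKD and LxDPWID encodings above lend themselves to a short decoding example. The C sketch below extracts both fields from an LCTL0 or LCTL1 value and estimates a peak link rate. Several points are assumptions rather than statements from this manual: the function names are hypothetical, the "slot" index (0, 1, 2 for the bit groups starting at bits 5, 15, and 25) is a convenience of the sketch and its mapping to a specific link buffer should be taken from Table A-22, and the rate estimate assumes one data-path-width transfer per transfer clock cycle with no handshake stalls.

```c
#include <assert.h>
#include <stdint.h>

/* Divisor selected by a 2-bit LxCLKD field: 01=1, 10=2, 11=3, 00=4. */
unsigned lclkd_divisor(unsigned field)
{
    assert(field < 4);
    return field == 0 ? 4u : field;
}

/* LxCLKD field for slot 0, 1, or 2 (bits 6-5, 16-15, 26-25). */
unsigned lctl_clkd(uint32_t lctl, unsigned slot)
{
    assert(slot < 3);
    return (lctl >> (5u + 10u * slot)) & 0x3u;
}

/* Data path width in bits from LxDPWID (bits 9, 19, 29): 1=8-bit, 0=4-bit. */
unsigned lctl_width_bits(uint32_t lctl, unsigned slot)
{
    assert(slot < 3);
    return ((lctl >> (9u + 10u * slot)) & 1u) ? 8u : 4u;
}

/* Estimated peak rate in bits per second (integer division truncates). */
uint64_t link_peak_bps(uint64_t cclk_hz, uint32_t lctl, unsigned slot)
{
    return cclk_hz / lclkd_divisor(lctl_clkd(lctl, slot))
                   * lctl_width_bits(lctl, slot);
}
```

As an illustration with made-up numbers, an 80 MHz CCLK with a divisor of 2 and an 8-bit data path gives an estimated peak of 320 Mbit/s under these assumptions; actual throughput depends on LxACK handshaking and buffer service latency.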
When link buffers are enabled or disabled, the I/O processor may generate unwanted interrupt service requests if Link Service Request (LSRQ) interrupts are unmasked. To avoid unwanted interrupts, programs should mask the LSRQ interrupts while enabling or disabling link buffers. For more information, see “Using Link Port Interrupts” on page 8-12.

Link Data Path (and Compatibility) Modes

The link ports can transmit and receive data using all 8 of the link port’s data pins (LxDAT7-0) or the 4 lower data pins (LxDAT3-0). The LxDPWID bit in the LCTLx register selects the link port data path width (8-bit if set, =1) (4-bit if cleared, =0).

When LxDPWID is cleared (4-bit data path), the ADSP-21160 DSP can be connected to link ports of previous SHARC DSPs (ADSP-2106x DSP family).

The link port receiver must run at the same speed or faster than the transmitter. Connecting to an ADSP-2106x DSP may require that the ADSP-21160 DSP be configured for 1/2 core clock rate operation. For more information, see “Using Link Port Handshake Signals” on page 8-7.

Using Link Port Handshake Signals

The LxCLK and LxACK pins of each link port allow handshaking for asynchronous data communication between DSPs. Other devices that follow the same protocol may also communicate with these link ports. The DSP link ports are backward compatible with the SHARC link ports for basic transfers, including LSRQ functions.

A link-port-transmitted word consists of 4 bytes (for a 32-bit word) or 6 bytes (for a 48-bit word). The transmitter asserts the clock (LxCLK) high with each new byte of data. The falling edge of LxCLK is used by the receiver to latch the byte. The receiver asserts LxACK when it is ready to accept another word in the buffer. The transmitter samples LxACK at the beginning of each word transmission (that is, after every 4 or 6 bytes).
If LxACK is deasserted at that time, the transmitter does not transmit the new word. For more information, see Figure 8-3 on page 8-8. The transmitter leaves LxCLK high and continues to drive the first byte if LxACK is deasserted. When LxACK is eventually asserted again, LxCLK goes low and begins transmission of the next word. If the transmit buffer is empty, LxCLK remains low until the buffer is refilled, regardless of the state of LxACK.

Figure 8-3. Link Port Handshake Timing (32-Bit Mode) (timing diagram of LxCLK, LxACK, and LxDAT7-0 for bytes 0 through 3, with byte 0 carrying the MSBs). Annotations on the diagram:

• LxCLK stays high at byte 0 if LxACK is sampled low on the previous LxCLK rising edge. LxCLK high indicates a stall.
• The transmitter samples LxACK at byte 0, subject to a minimum LxACK set-up time, to determine whether to transmit the next word.
• LxACK may deassert after byte 0. The receiver accepts the remaining bytes in the current word even if LxACK is deasserted; the transmitter does not send the following word.
• Transmit data for the next word is held until LxACK is asserted.
• LxACK reasserts as soon as the link buffer is "not full".

The receive buffer may fill if a higher priority DMA, core I/O processor register access, direct read, direct write, or chain loading operation is occurring. LxACK may deassert when it anticipates the buffer may fill. LxACK is reasserted by the receiver as soon as the internal DMA grant signal has occurred, freeing a buffer location.

Data is latched in the receive buffer on the falling edge of LxCLK. The receive operation is purely asynchronous and can occur at any frequency up to the processor clock frequency.

When a link port is not enabled, LxDAT7-0, LxCLK, and LxACK are three-stated. When a link port is enabled to transmit, the data pins are driven with whatever data is in the output buffer, LxCLK is driven high, and LxACK is three-stated.
When a port is enabled to receive, the data pins and LxCLK are three-stated and LxACK is driven high.

To allow a transmitter and a receiver to be enabled (assigned and link buffer enabled) at different times, LxACK, LxCLK, and LxDAT7-0 may be held low with the 50 kΩ internal pull-down resistors if LxPDRDE is cleared when the link port is disabled. If the transmitter is enabled before the receiver, LxACK is low and the transmission is held off. If the receiver is enabled before the transmitter, LxCLK is held low by the pull-down and the receiver is held off. If many link ports are bused together, the system may need to enable only one of the internal resistors to pull down each bused pin, so the bused lines are not pulled down too strongly or too heavily loaded.

LxACK, LxCLK, and LxDAT7-0 should not be left unconnected unless external pull-down resistors are used.

Using Link Buffers

Each link buffer consists of an external and an internal 48-bit register. For more information, see Figure 8-2 on page 8-4. When transmitting, the internal register is used to accept core data or DMA data from internal memory. When receiving, the external register performs the packing and unpacking for the link port, most significant nibble or byte first. These two registers form a two-stage FIFO for the LBUFx buffer. Two writes (32- or 48-bit) can occur to the register by the DMA or the core before it signals a full condition. As each word is unpacked and transmitted, the next location in the FIFO becomes available and a new DMA request is made. If the register becomes empty, the LxCLK signal is deasserted. When transmitting, only the number of words written are transmitted.

Full/empty status for the link buffer FIFOs is given by the LxSTAT bits of the LCOM register. This status is cleared for a link buffer when its LxEN enable bit is cleared in the LCTLx register.
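The packing order described above (most significant nibble or byte first, 32- or 48-bit words, 4- or 8-bit data path) can be illustrated with a short host-side model. It is a sketch under stated assumptions rather than a description of the hardware implementation: `unpack_word` and `pack_units` are hypothetical names, and the model ignores the FIFO, clocking, and handshake.

```python
def unpack_word(word, word_bits=32, path_bits=8):
    """Split a link buffer word into path-width units, most significant first."""
    if word_bits not in (32, 48) or path_bits not in (4, 8):
        raise ValueError("word must be 32 or 48 bits, data path 4 or 8 bits")
    count = word_bits // path_bits
    mask = (1 << path_bits) - 1
    # Unit 0 is the most significant nibble or byte of the word.
    return [(word >> (path_bits * (count - 1 - i))) & mask for i in range(count)]


def pack_units(units, path_bits=8):
    """Reassemble a word from units received most significant first."""
    mask = (1 << path_bits) - 1
    word = 0
    for unit in units:
        word = (word << path_bits) | (unit & mask)
    return word
```

A 32-bit word on the 8-bit path yields 4 units and a 48-bit word yields 6, matching the 4- or 6-byte word size given for the handshake; on the 4-bit path the counts are 8 and 12 nibbles.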
During receiving, the external buffer is used to pack the receive link port data (most significant nibble or byte first) and pass it to the internal register before DMA-transferring it to internal memory. This buffer is a two-deep FIFO. If the DSP’s DMA controller does not service it before both locations are filled, the LxACK signal is deasserted.

The link buffer width may be selected to be either 32 or 48 bits. This selection is made individually for each buffer with the LxEXT bits in the LCTLx register. For 40-bit extended precision data or 48-bit instruction transfers, the width must be set to 48 bits.

Core Processor Access To Link Buffers

In applications where the latency of link port DMA transfers to and from internal memory is too long, or where a process is continuous and has no block boundaries, the DSP processor core may read or write link buffers directly, using the full or empty status bit of the link buffer to automatically pace the operation. The full or empty status of a particular LBUFx buffer can be determined by reading the LCOM control/status register. DMA should be disabled when using this capability (LxDEN=0).

If a read is attempted from an empty receive buffer, the core stalls (hangs) until the link port completes reception of a word. If a write is attempted to a full transmit buffer, the core stalls until the external device accepts the complete word. Up to four words (2 in the receiver and 2 in the transmitter) may be sent without a stall before the receiver core or DMA must read a link buffer register.

To support debugging buffer transfers, the DSP has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page -18.
Host Processor Access To Link Buffers

The link buffers can also be accessed by the external host processor, using direct reads and writes. When the host reads or writes to these buffers, the word width is determined only by the host packing mode, as selected by the HPM bits in the SYSCON register.

Using Link Port DMA

DMA channels 4-9 support link buffers 0-5. A maskable interrupt is generated when the DMA block transfer has completed. For more information on link port interrupts, see “Using Link Port Interrupts” on page 8-12. For more information on link port DMA, see “Link Port DMA” on page 6-80.

Unlike previous SHARC DSPs, there are no shared DMA channels on the ADSP-21160 DSP. Each link port buffer has its own dedicated DMA channel.

In chained DMA operations, the DSP automatically sets up another DMA transfer when the current DMA operation completes. The chain pointer register (CPx) is used to point to the next set of buffer parameters stored in memory. The DSP’s DMA controller automatically downloads these buffer parameters to set up the next DMA sequence. For information on setting up DMA chaining, see “Chaining DMA Processes” on page 6-69.

Using Link Port Interrupts

Three types of interrupts are dedicated to the link ports:

• The I/O processor generates a DMA channel interrupt when a DMA block transfer through the link port with DMA enabled (LxDEN=1) finishes.
• The I/O processor generates a DMA channel interrupt when DMA for the link buffer channel is disabled (LxDEN=0) and the buffer is not full (for transmit) or the buffer is not empty (for receive).
• The I/O processor generates a Link Service Request (LSRQ) interrupt when an external source accesses a disabled link port (an unassigned link port or an assigned port with its buffer disabled).

Registers control link port interrupt latching and masking.
The LIRPTL register is the individual link port interrupt latch/mask register, and the IRPTL and IMASK registers control global link port DMA interrupt latching and masking. For more information, see “Link Port Interrupt Register (LIRPTL)” on page A-24, “Interrupt Latch Register (IRPTL)” on page A-18, and “Interrupt Mask Register (IMASK)” on page A-23.

Link Port Interrupts With DMA Enabled

A link port interrupt is generated when the DMA operation is done, that is, when the block transfer has completed and the DMA count register is zero.

One way programs can use this interrupt is to send additional control information at the end of a block transfer. Because the receive DMA buffer is empty when the DMA block has completed, the external bus master can send up to two additional words to the slave DSP’s buffer, which has space for the two words. The slave’s interrupt service routine, entered through the same interrupt vector associated with the completion of the link port DMA, could then read the buffer and use these control words to determine the next course of action.

Link Port Interrupts With DMA Disabled

If DMA is disabled for a link port buffer, then the buffer may be written or read by the DSP core as a memory-mapped I/O processor register. If the DMA is disabled but the associated link buffer is enabled, then a maskable interrupt is generated whenever a receive buffer is not empty or when a transmit buffer is not full. This interrupt uses the same interrupt vector associated with the completion of the DMA block transfers.

The interrupt latch bit in LIRPTL may be unmasked by the corresponding mask bit in the same register. When initially enabling the mask bit, the corresponding latch bit in LIRPTL should be cleared first to clear out any request that may have been inadvertently latched.
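The clear-then-unmask ordering recommended above can be sketched as two successive register values. This is a host-side illustration only: `enable_link_interrupt` is a hypothetical helper, and the latch and mask bit positions are passed in as parameters because the actual LIRPTL layout is given in the register description on page A-24, not here.

```python
def enable_link_interrupt(lirptl, latch_bit, mask_bit):
    """Return the two LIRPTL values to write, in order.

    Step 1 clears any request that was latched inadvertently.
    Step 2 sets the mask bit so later requests can be recognized.
    """
    after_clear = lirptl & ~(1 << latch_bit) & 0xFFFFFFFF
    after_unmask = (after_clear | (1 << mask_bit)) & 0xFFFFFFFF
    return after_clear, after_unmask
```

Writing the values in the opposite order would let a stale latched request trigger an interrupt as soon as the mask bit is set, which is the situation the text warns about.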
The interrupt service routine should test the buffer status after each read or write to check when the buffer is empty or full, in order to determine when it should return from interrupt. This reduces the number of interrupts it must service.

Link Port Service Request Interrupts (LSRQ)

Link port service requests let a disabled (unassigned or assigned with buffer disabled) link port cause an interrupt when an external access is attempted. The transmit and receive request status bits of the LSRQ register let a DSP determine if another DSP is attempting to send or receive data through a particular link port. This lets two processors communicate without prior knowledge of the transfer direction, link port number, or exactly when the transfer is to occur.

When LxACK or LxCLK is asserted externally, a link service request (LSR) is generated in a disabled (unassigned or assigned with buffer disabled) link port. LSRs are not generated for a link port that is disabled by loopback mode. Each LSR is gated by mask bits before being latched in the LSRQ register. The six possible receive LSRs and the six possible transmit LSRs are gated by mask bits and then OR’ed together to generate the link service request interrupt.

The LSRQ interrupt request may be masked by the LSRQI mask bit of the IMASK register. When the mask bit is set, the interrupt is allowed to pass into the interrupt priority encoder. A diagram of this logic appears in Figure 8-4 on page 8-14.

Figure 8-4. Logic For Link Port Interrupts (the LxRRQ and LxTRQ status bits of LSRQ are gated by the LSR mask bits, latched in the LSRQI bit of IRPTL, and gated by the LSRQI bit of IMASK and the IRPTEN bit of MODE1 to produce the link service request interrupt)

In Figure 8-4, the LxTRQ and LxRRQ inputs stand for status bits in the LSRQ register. For transmit requests, LxTRQ=1 indicates the following status: LxACK=1, LxTM=1, and LxEN=0.
For receive requests, LxRRQ=1 indicates the following status: LxCLK=1, LxRM=1, and LxEN=0.

The interrupt routine must read the LSRQ register to determine which link port to service and whether it is a transmit or receive request. LSR interrupts have a latency of two cycles. Note that the link service request interrupt is different from the link receive and transmit interrupt; this is also true in IMASK.

The 32-bit LSRQ register holds the masked link status of each link port and the corresponding interrupt mask bits. The link service request status of the port is set whenever the port is not enabled and one of LxACK or LxCLK is asserted high. The LSRQ status bits are read-only. Table A-26 on page A-68 shows the individual bits of the LSRQ register.

To determine which link port to service, programs can transfer LSRQ to a register Rx (in the register file), then use the leading 0s detect instruction:

    Rn=LEFTZ Rx

Here, Rn indicates which link port is active in order of priority.

If link service requests are in use, they should be masked out when the assigned link buffers are being enabled or disabled, or when the link port is being unassigned in LAR; otherwise spurious service requests may be generated. This need for masking is due to a delay before the LxCLK or LxACK signals (if already asserted) are pulled (if pull-downs are enabled) or driven externally (if pull-downs are disabled) below logic threshold. During this delay, these signals are sampled asserted and generate an LSRQ.

To avoid the possibility of spurious interrupts, programs should mask the LSRQ interrupt or the appropriate request bit in the LSRQ register and allow an appropriate delay before unmasking. Alternatively, programs can mask the LSRQ interrupt and poll the appropriate request status bit until it is cleared and then unmask the interrupt.
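The leading 0s detect step mentioned above (Rn=LEFTZ Rx) can be modeled on a host as a count of leading zero bits in a 32-bit value. The sketch below is an assumption-labeled illustration, not the instruction's formal definition: `leftz` is a hypothetical helper, the result of 32 for an all-zero input is an assumption of this model, and the mapping from the returned count to a specific link port depends on the LSRQ bit layout in Table A-26, which is not reproduced here.

```python
def leftz(rx):
    """Count leading zero bits of a 32-bit value (host-side model of LEFTZ)."""
    rx &= 0xFFFFFFFF
    # int.bit_length() is the position of the highest set bit plus one,
    # so 32 minus it is the number of zero bits above that position.
    return 32 - rx.bit_length()
```

Because the count stops at the highest set bit, the request stored in the most significant position of the status field is always found first, which is what gives the fixed priority order described in the text.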
Detecting Errors On Link Transmissions

Transmission errors on the link ports may be detected by reading the LRERRx bits in the LCOM register. These bits reflect the status of each nibble or byte counter. The LRERRx bit is zero if the pack counter of the corresponding link buffer is zero, that is, if a multiple of 8 or 12 nibbles or bytes has been received. If LRERRx is high when a transmission has completed, then an error occurred during transmission.

The DMA word count provides an exact count of the number of words to be transferred.

To allow checking of this status, the transmitter and receiver should follow a protocol such as the following:

• Transmitter Protocol. To make use of the LRERRx status, one additional dummy word should always be transmitted at the end of a block transmission. The transmitter must then deselect the link port and re-enable it as a receiver to allow the receiver to send an appropriate message back to the transmitter.
• Receiver Protocol. When the receiver has received the data block, indicated by the interrupt associated with the completion of the link port DMA, it checks that it has received an additional word in the link buffer and then reads the LRERRx bit. The receiver may then clear the link buffer (LxEN=0) and transmit the appropriate message back to the transmitter on the same, or a different, link port.

Using Token Passing With Link Ports

Two DSPs that communicate using a link port need to know which of them is currently the transmitter and which is the receiver; otherwise they might both try to transmit at the same time. Token passing is a protocol that can help the DSPs alternate control. Figure 8-5 on page 8-18 shows a Token Passing Flow Chart describing a software protocol for token passing through the link ports.
A good example of this protocol is available in the Engineer-to-Engineer Note EE-16: Using Token Passing to Control SHARC Link Port Bi-directional Communication on the Analog Devices website.

In token passing, the token is a software flag that passes between the processors. At reset, the token (flag) is set to reside in the link port of one device, making it the master and the transmitter. When a receiver link port (slave) wants to become the master, it may assert its LxACK line (request data) to get the master’s attention. The master knows, through software protocol, whether it is supposed to respond with actual data or whether it is being asked for the token.

Figure 8-5. Token Passing Flow Chart (parallel flows for the original slave and the original master)

The token release word can be any user-defined value.
Since both the transmitter and receiver are expecting a code word, this need not be exclusive of normal data transmission.

If the master wishes to give up the token, it may send back a user-defined token release word and thereafter clear its token flag. Simultaneously, the slave examines the data sent back and, if it is the token release word, the slave sets its token and can thereafter transmit. If the received data is not the token release word, then the slave must assume the master was beginning a new transmission. Through software protocol, the master can also ask to receive data by sending the token release word without the LxACK (data request) going low first.

Figure 8-5 shows a flow chart of the example code’s protocol. To use the example, load the example code on both the original master and the original slave. The code uses the processor ID in multiprocessor systems: ID1 is the original master (transmitter) and ID2 is the original slave (receiver). The master transmits a buffer via DMA through LPORT0 using LBUF3, and the slave receives through LPORT0 using LBUF2. The slave then requests the token by generating an LSRQ interrupt in the disabled link port of the master (LPORT0). The master responds by sending the token release word and waiting to see if it is accepted. The slave checks that it is the token release word and accepts the token by emptying the master’s link buffer FIFO within a predetermined amount of time. If the token is accepted, the slave becomes the master and transmits a buffer of data to the new slave. If the token is rejected, the master transmits a second buffer. When complete, the original master finishes by setting up LBUF2 to receive without DMA, and the original slave sets up LBUF3 to transmit without DMA.
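The token flag handling described above can be sketched as a small software model. It is deliberately simplified and is not the EE-16 example code: the class `Node`, its method names, and the token release word value are hypothetical, and the model leaves out the LSRQ request, the FIFO-emptying acceptance test, and its time limit.

```python
# Hypothetical value; the text only requires a user-defined word.
TOKEN_RELEASE_WORD = 0xA5A5A5A5


class Node:
    """One DSP taking part in token passing over a link port."""

    def __init__(self, has_token):
        self.has_token = has_token  # software flag: True means master/transmitter

    def release_token(self):
        """Master side: send the token release word, then clear the flag."""
        if not self.has_token:
            raise RuntimeError("only the token holder can release the token")
        self.has_token = False
        return TOKEN_RELEASE_WORD

    def on_word(self, word):
        """Slave side: take the token on a release word, else treat as data."""
        if not self.has_token and word == TOKEN_RELEASE_WORD:
            self.has_token = True
            return None  # token accepted, nothing to pass on as data
        return word  # the master was beginning a new transmission
```

In this model a slave that receives an ordinary data word keeps its flag cleared, while a release word moves the flag from one node to the other, so at most one node holds the token at any time.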
The following is a list of the major areas of concern when a program implements a software protocol scheme for token passing:

• The program must make sure that both link buffers are not enabled to transmit at the same time. If this is allowed, data may be transmitted and lost because neither link port is driving LxACK.

• In the example, the LSRQ register status bits are polled to ensure that the master becomes the slave before the slave becomes the master, avoiding the two-transmitter conflict.

• The program must make sure that the link interrupt selection matches the application. If a status detection scheme using the status bits of the LSRQ register is to be used, it is important to note the following:

  If a link port that is configured to receive is disabled while LxACK is asserted, there is an RC delay before the 50 kΩ pull-down resistor on LxACK (if enabled) can pull the value below logic threshold.

  If a link port that is configured to transmit is disabled while LxCLK is asserted, there is an RC delay before the 50 kΩ pull-down resistor on LxCLK (if enabled) can pull the value below logic threshold.

  If the appropriate request status bit is unmasked in the LSRQ register (in this instance), then an LSR is latched and the LSRQ interrupt, if enabled, may be serviced even though it is unintended.

• The program must make sure that synchronization is not disrupted by unrelated influences at critical sections where timing control loops are used to synchronize parallel code execution. Disabling nested interrupts is one technique to control this.

Designing Link Port Systems

The DSP’s link ports support I/O with peripherals and other DSP link ports. While link ports require few connections, there are a number of design issues that systems using these ports must accommodate.
Terminations For Link Transmission Lines

The link ports are designed to allow long distance connections to be made between the driver and the receiver. This is possible because the links are self-synchronizing: the clock and data are transmitted together. Only relative delay, not absolute delay, between clock and data is of importance. In addition, the LxACK signal inhibits transmission of the next word, not of the current nibble or byte. For example, the current word is always allowed to complete transmission. This allows delays of 3 to 5 cycles for the LxACK signal to reach the transmitter.

The links are designed to drive transmission lines with characteristic impedances of 50 Ω or greater. A higher transmission line impedance reduces the on-chip effect of driver impedance variations, for distances longer than about six inches. It is recommended that an external series termination resistor be used at each link port pin to absorb reflections from the open circuit at the destination. The external resistor should be selected such that its value (plus the internal resistance of the driver) is equal to the characteristic impedance of the transmission line.

For example, a system with a typical internal drive resistance of 10 Ω and a characteristic impedance of 50 Ω should use a link port pin resistor of 40 Ω.

Peripheral I/O Using Link Ports

The example shown in Figure 8-6 on page 8-23 shows how a multiprocessing system can use link ports to connect to local memories and I/O devices. An ASIC implements the interface between the link port and DRAM or an I/O device. This minimal hardware solution frees the DSP’s external bus for other shared-bus communication. The DRAM and ASIC may be implemented on a single 10-pin SIMM module.

Accesses to the DRAM over a link are most efficient under DMA control. The ASIC receives DMA control information from the link port and sets up the access to the DRAM.
It unpacks 16-bit data words from the DRAM or packs 8-bit bytes from the link. At the end of the DMA transfer, an interrupt lets the DSP send new control information to the ASIC. The ASIC always reverts to receive mode at the end of a transfer. The LxACK signal is deasserted by the ASIC whenever a page change, memory refresh cycle, or any other access to the DRAM occurs.

Memory modules may be shared by multiple DSPs when the link port is bused. Each link port supports 100 Mbyte per second access throughput for either instructions or data. The ASIC is responsible for generating the clock when transmitting to the DSP. The ASIC is also responsible for generating sequential DMA addresses based on a start address and word count.

Figure 8-6. Local DRAM With Link Ports (two ADSP-21160 DSPs whose link ports 0 and 1 connect over link buses 0 and 1 to link interface ASICs with DRAM 0, DRAM 1, and an I/O device, while the external ports connect to external memory, a DMA device, and a host)

Data Flow Multiprocessing With Link Ports

Figure 8-7 on page 8-24 shows examples of different link port communications schemes. For more information on the multiprocessor interface, see “Multiprocessing System Architectures” on page 7-92.

Figure 8-7. Link Port Communication Examples (panels: Dataflow, SHARC Cluster, Expanding Clusters, 2D Mesh)

9 SERIAL PORTS

This chapter describes the ADSP-21160 DSP’s serial ports.

Overview

The DSP has two independent, synchronous serial ports, SPORT0 and SPORT1, that provide an I/O interface to a wide variety of peripheral devices. Each serial port has its own set of control registers and data buffers. With a range of clock and frame synchronization options, the SPORTs allow a variety of serial communication protocols and provide a glueless hardware interface to many industry-standard data converters and CODECs.

The serial ports can operate at 1/2 the full clock rate of the processor, providing each with a maximum data rate of n/2 Mbit/s, where n equals the processor clock frequency in MHz. Independent transmit and receive functions provide greater flexibility for serial communications. Serial port data can be automatically transferred to and from on-chip memory using DMA block transfers. Each of the serial ports offers a TDM (time division multiplexed) multichannel mode.

Serial port clocks and frame syncs can be internally generated by the DSP or received from an external source. The serial ports can operate with little-endian or big-endian transmission formats, with word lengths selectable from 3 to 32 bits. They offer selectable synchronization and transmit modes and optional μ-law or A-law companding in hardware.

The serial ports offer the following features and capabilities:

• Provides independent transmit and receive functions
• Transfers data words up to 32 bits in length, either MSB-first or LSB-first
• Double-buffers data: both receive and transmit functions have a data buffer register and a shift register, and the double-buffering provides additional time to service the SPORT
• Performs A-law and μ-law hardware companding (compression/decompression) on transmitted and received words
• Internally generates serial clock and frame sync signals in a wide range of frequencies, or accepts clock and frame sync input from an external source
• Performs interrupt-driven, single-word transfers to and from on-chip memory controlled by the DSP core
• Executes DMA transfers to and from on-chip memory; each SPORT can automatically receive and transmit an entire block of data
• Permits chaining of DMA operations for multiple data blocks
• Has a multichannel mode for TDM interfaces; each SPORT can receive and transmit data selectively from channels of a time-division-multiplexed serial bitstream, which can be useful for T1 interfaces

Table 9-1 shows the pins of each serial port:

Table 9-1. Serial Port Pins

SPORT0 Pins    SPORT1 Pins    Description
DT0            DT1            Transmit Data
TCLK0          TCLK1          Transmit Clock
TFS0           TFS1           Transmit Frame Sync
DR0            DR1            Receive Data
RCLK0          RCLK1          Receive Clock
RFS0           RFS1           Receive Frame Sync

A serial port receives serial data on its DR input and transmits serial data on its DT output. It can receive and transmit simultaneously for full duplex operation.

Serial communications are synchronized to a clock signal: every data bit must be accompanied by a clock pulse. Each serial port can generate or receive its own transmit clock signal (TCLK) and receive clock signal (RCLK). Internally-generated serial clock frequencies are configured in the TDIVx and RDIVx registers.

In addition to the serial clock signal, data may be signalled by a frame synchronization signal. The framing signal can occur either at the beginning of an individual word or at the beginning of a block of words. The configuration of frame sync signals depends upon the type of serial device connected to the DSP. Each serial port can generate or receive its own transmit frame sync signal (TFS) and receive frame sync signal (RFS). Internally-generated frame sync frequencies are configured in the TDIVx and RDIVx registers.
Figure 9-1 shows a block diagram of a serial port. Data to be transmitted is written to the TX buffer. The data is (optionally) compressed in hardware, then automatically transferred to the transmit shift register. The data in the shift register is then shifted out on the SPORT’s DT pin, synchronous to the TCLK transmit clock. If framing signals are used, the TFS signal indicates the start of the serial word transmission. The DT pin is always driven (that is, not three-stated) if the serial port is enabled (SPEN=1 in the STCTLx control register), unless it is in multichannel mode and an inactive time slot occurs. For more information, see “Multichannel Operation” on page 9-28.

Figure 9-1. Serial Port Block Diagram (the TXx transmit buffer and RXx receive data buffer connect to the DM, PM, and I/O data buses; hardware companding sits between each buffer and its shift register; the transmit shift register drives DTx with TCLKx and TFSx, and the receive shift register takes DRx with RCLKx and RFSx, under serial port control)

The receive portion of the SPORT shifts in data from the DR pin, synchronous to the RCLK receive clock. If framing signals are used, the RFS signal indicates the beginning of the serial word being received. When an entire word is shifted in, the data is (optionally) expanded, then automatically transferred to the RX buffer.

The DSP SPORTs are not UARTs and cannot be used to communicate with an RS-232 device or any other asynchronous communications protocol. One way to implement RS-232-compatible communications with the DSP is to use two of the FLAG pins as asynchronous data receive and transmit signals. For an example of how to do this, see Chapter 11 “Software UART” in the Digital Signal Processing Applications Using The ADSP-2100 Family, Volume 2.
SPORT Interrupts

Each serial port has a transmit DMA interrupt and a receive DMA interrupt. When serial port DMA is not enabled, interrupts occur based on the SPORT transmit or receive FIFO status: an interrupt is generated when the transmit FIFO is empty or the receive FIFO is full. The priority of the serial port interrupts is shown in Table 9-2.

Table 9-2. SPORT Interrupts

Interrupt Name   Interrupt                      Priority
SPR0I            SPORT0 Receive DMA Channel     Highest
SPR1I            SPORT1 Receive DMA Channel
SPT0I            SPORT0 Transmit DMA Channel
SPT1I            SPORT1 Transmit DMA Channel    Lowest

The interrupt names are defined in the def21160.h include file supplied with the ADSP-21xxx DSP Development Software.

SPORT interrupts occur on the second system clock (CLKIN) after the last bit of the serial word is latched in or driven out.

SPORT Reset

There are two ways to reset the serial ports: a hardware reset using the RESET pin of the processor, and a software reset accomplished by clearing the serial port’s enable bit (SPEN) in the STCTLx and SRCTLx control registers. Each method has a different effect on the serial port.

A hardware reset disables the serial ports by clearing the STCTLx and SRCTLx control registers (including the SPEN enable bits) and the TDIVx and RDIVx frame sync divisor registers. Any ongoing operations are aborted.

A software reset of the SPEN enable bit(s) disables the serial port(s) and aborts any ongoing operations. Status bits are also cleared. The serial ports are ready to start transmitting or receiving data two CLKIN cycles after they are enabled (in the STCTLx or SRCTLx control register). No serial clocks are lost from this point on.

The only difference between the soft reset (setting the SRST bit in the SYSCON register) and the hard reset (RESET pin) is that external bus arbitration is not affected by a soft reset. That is, there is no bus synchronization at soft reset.
The PLL also does not get reset at soft reset.

Setting Serial Port Modes

The registers used to control and configure the serial ports are part of the IOP register set. Each SPORT has its own set of control registers and data buffers, as shown in Table 9-3.

Table 9-3. SPORT Registers

Register Name*   Function
STCTLx           SPORT Transmit Control Register
TXx              Transmit Data Buffer
TDIVx            Transmit Clock and Frame Sync Divisors
MTCSx            Multichannel Transmit Select
MTCCSx           Multichannel Transmit Compand Select
SRCTLx           SPORT Receive Control Register
RXx              Receive Data Buffer
RDIVx            Receive Clock and Frame Sync Divisors
MRCSx            Multichannel Receive Select
MRCCSx           Multichannel Receive Compand Select
SPATHx           SPORT Path Length (for mesh multiprocessing)
KEYWDx           SPORT Receive Comparison
KEYMASKx         SPORT Receive Comparison Mask

* x = 0, 1.

These control registers are described in detail in the following sections:

• “SPORT Serial Transmit Control Registers (STCTLx)” on page A-71
• “SPORT Serial Receive Control Registers (SRCTLx)” on page A-73
• “SPORT Transmit Buffer Registers (TXx)” on page A-73
• “SPORT Receive Buffer Registers (RXx)” on page A-76
• “SPORT Transmit Divisor Registers (TDIVx)” on page A-76
• “SPORT Transmit Count Registers (TCNTx)” on page A-77
• “SPORT Receive Divisor Registers (RDIVx)” on page A-77
• “SPORT Receive Count Registers (RCNTx)” on page A-78
• “SPORT Transmit Select Registers (MTCSx)” on page A-78
• “SPORT Receive Select Registers (MRCSx)” on page A-78
• “SPORT Transmit Compand Registers (MTCCSx)” on page A-79
• “SPORT Receive Compand Register (MRCCSx)” on page A-79
• “SPORT Receive Comparison and Mask Registers (KEYWDx and KEYMASKx)” on page A-80
• “SPORT Serial Path Length Registers (SPATHx)” on page A-80

These sections show the memory-mapped address and reset initialization value of each SPORT control register.
All of the registers are 32 bits wide.

The SPORT control registers are programmed by writing to the appropriate address in memory. The symbolic names of the registers and individual control bits can be used in DSP programs; the #define definitions for these symbols are contained in the file def21160.h, which is provided in the INCLUDE directory of the ADSP-21xxx DSP Development Software. The def21160.h file is shown in the Control/Status Registers appendix of this manual. All control and status bits in the SPORT registers are active high unless otherwise noted.

Because the SPORT registers are memory-mapped, they cannot be written with data coming directly from memory. They must instead be written from (or read into) DSP core registers, usually one of the general-purpose universal registers of the register file (R15–R0). The SPORT control registers can also be written or read by external devices (for example, another DSP or a host processor), for instance to set up a serial port DMA operation.

Transmit and Receive Control Registers (STCTL, SRCTL)

The main control registers for each serial port are the transmit control register, STCTLx, and the receive control register, SRCTLx. These registers are defined in Table A-27 on page A-71 and Table A-28 on page A-74.

When changing operating modes, a serial port control register should be cleared (that is, written with all zeros) before the new mode is written to the register.

The Transmit Underflow Status bit (TUVF) is set whenever the TFS signal occurs (from either an external or internal source) while the TX buffer is empty. The internally generated TFS may be suppressed whenever TX is empty by clearing the DITFS control bit (DITFS=0).

When DITFS=0, the default, the transmit frame sync signal (TFS) depends on new data being present in the TX buffer; the TFS signal is only generated for new data. Setting DITFS to 1 selects data-independent frame syncs.
This causes the TFS signal to be generated whether or not new data is present, transmitting the contents of the TX buffer regardless. Serial port DMA typically keeps the TX buffer full, and when the DMA operation is complete the last word in TX is continuously transmitted.

The TXS status bits indicate whether the TX buffer is full (11), empty (00), or partially full (10). To test for space in TX, test for TXS0 (bit 30) equal to zero. To test for the presence of any data in TX, test for TXS1 (bit 31) equal to one.

The SRCTLx and STCTLx registers control the serial ports’ operating modes for the I/O processor. Table A-28 on page A-74 lists all the bits in SRCTLx and Table A-27 on page A-71 lists all the bits in STCTLx.

The following bits control serial port modes. Some other bits in the SRCTLx and STCTLx registers are used to set up DMA and I/O processor related serial port features. For information on these features, see “Setting I/O Processor—SPort Modes” on page 6-49.

• Internal Receive Clock Select. SRCTLx Bit 10 (ICLK). This bit selects the internal receive clock (if set, =1) or external receive clock (if cleared, =0).
• Clock Rising Edge Select. SRCTLx Bit 12 (CKRE). This bit selects whether the serial port uses the rising edge (if set, =1) or falling edge (if cleared, =0) of the clock signal for sampling data and the frame sync.
• Receive Frame Sync Required Select. SRCTLx Bit 13 (RFSR). This bit selects whether the serial port requires (if set, =1) or does not require (if cleared, =0) a receive frame sync.
• Internal Receive Frame Sync Select. SRCTLx Bit 14 (IRFS). This bit selects whether the serial port uses an internal RFS (if set, =1) or uses an external RFS (if cleared, =0).
• Data Independent Receive Frame Sync Select.
SRCTLx Bit 15 (DIRFS). This bit selects whether the serial port uses a data-independent RFS (sync at selected interval, if set, =1) or uses a data-dependent RFS (sync when data in RX, if cleared, =0).
• Active Low Receive Frame Sync Select. SRCTLx Bit 16 (LRFS). This bit selects an active low RFS (if set, =1) or active high RFS (if cleared, =0).
• Late Receive Frame Sync Select. SRCTLx Bit 17 (LAFS). This bit selects a late RFS (RFS during first bit, if set, =1) or an early RFS (RFS before first bit, if cleared, =0). This bit must be cleared for multichannel operation.
• Serial Port Loopback Enable. SRCTLx Bit 22 (SPL). This bit enables (if set, =1) or disables (if cleared, =0) serial port loopback mode. This bit must be cleared for multichannel operation.
• Multichannel Enable. SRCTLx Bit 23 (MCE). This bit enables (if set, =1) or disables (if cleared, =0) multichannel serial port mode.
• Number of Multi Channels (–1) Select. SRCTLx Bits 28-24 (NCHN). These bits select the number of channels (–1) for a multichannel serial port. The number of channels can be from 1 (NCHN=0) to 32 (NCHN=31).
• Transmit Frame Sync Required Select. STCTLx Bit 13 (TFSR). This bit selects whether the serial port requires (if set, =1) or does not require (if cleared, =0) a transmit frame sync.
• Internal Transmit Frame Sync Select. STCTLx Bit 14 (ITFS). This bit selects whether the serial port uses an internal TFS (if set, =1) or uses an external TFS (if cleared, =0).
• Data Independent Transmit Frame Sync Select. STCTLx Bit 15 (DITFS). This bit selects whether the serial port uses a data-independent TFS (sync at selected interval, if set, =1) or uses a data-dependent TFS (sync when data in TX, if cleared, =0).
• Active Low Transmit Frame Sync Select. STCTLx Bit 16 (LTFS). This bit selects an active low TFS (if set, =1) or active high TFS (if cleared, =0).
• Late Transmit Frame Sync Select. STCTLx Bit 17 (LAFS). This bit selects a late TFS (TFS during first bit, if set, =1) or an early TFS (TFS before first bit, if cleared, =0).
• Multichannel Transmit Frame Sync Delay Select. STCTLx Bits 23-20 (MFD). These bits select the delay in serial clock cycles between the TFS and the first data bit. When MFD=0, the TFS and first data bit are concurrent. The maximum value of this 4-bit field is MFD=15.
• Current Channel Selected (read-only). STCTLx Bits 28-24 (CHNL). These bits indicate which channel the DSP has selected for the serial port’s transmission in multichannel mode.

Register Writes and Effect Latency

SPORT register writes are internally completed at the end of the same CLKIN cycle in which they occur. The newly written value can be read back from the register on the very next cycle. When a read of one of the STCTLx or SRCTLx control registers is immediately followed by a write to that register, the write may take two cycles to complete.

After a write to a SPORT register, control and mode bit changes generally take effect in the second CLKIN cycle after the write is completed. The serial ports are ready to start transmitting or receiving two CLKIN cycles after they are enabled (in the STCTLx or SRCTLx control register). No serial clocks are lost from this point on.

Transmit and Receive Data Buffers (TX, RX)

TX0 and TX1 are the transmit data buffers for SPORT0 and SPORT1. They are 32-bit buffers which must be loaded with the data to be transmitted; the data is loaded either by the DMA controller or by the program running on the DSP core. RX0 and RX1 are the receive data buffers for SPORT0 and SPORT1. They are 32-bit buffers which are automatically loaded from the receive shifter when a complete word has been received.

Word lengths of less than 32 bits are right-justified in the receive and transmit buffers.
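The bit positions listed above can be collected into masks, and the TXS test described earlier then reduces to two bit checks. The following is a minimal sketch in C that operates on plain 32-bit values rather than on the real IOP registers. All macro and function names are hypothetical (they are not the symbols from def21160.h); only the bit positions come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask names; bit positions are taken from the bit list above. */
#define SRCTL_LAFS        (1u << 17)   /* late receive frame sync        */
#define SRCTL_SPL         (1u << 22)   /* serial port loopback enable    */
#define SRCTL_MCE         (1u << 23)   /* multichannel enable            */
#define SRCTL_NCHN_SHIFT  24           /* NCHN occupies bits 28-24       */
#define SRCTL_NCHN_MASK   (0x1Fu << SRCTL_NCHN_SHIFT)
#define STCTL_TXS0        (1u << 30)   /* TXS status, low bit            */
#define STCTL_TXS1        (1u << 31)   /* TXS status, high bit           */

/* TX has room for another word when TXS0 (bit 30) is zero. */
int sport_tx_has_space(uint32_t stctl)
{
    return (stctl & STCTL_TXS0) == 0;
}

/* TX holds at least one word when TXS1 (bit 31) is one. */
int sport_tx_has_data(uint32_t stctl)
{
    return (stctl & STCTL_TXS1) != 0;
}

/* Build an SRCTLx value for multichannel mode with 1 to 32 channels.
 * LAFS and SPL are cleared because the bit list requires that for
 * multichannel operation; NCHN is written as (channels - 1). */
uint32_t sport_srctl_multichannel(uint32_t srctl, unsigned channels)
{
    assert(channels >= 1 && channels <= 32);
    srctl &= ~(SRCTL_LAFS | SRCTL_SPL | SRCTL_NCHN_MASK);
    srctl |= SRCTL_MCE;
    srctl |= (uint32_t)(channels - 1u) << SRCTL_NCHN_SHIFT;
    return srctl;
}
```

On the target, the value passed to the status helpers would be read from STCTLx into a core register first, since the SPORT registers cannot be tested directly in memory.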
The TX buffers act like a two-location FIFO because they have a data register plus an output shift register, as shown in Figure 9-1 on page 9-4. Two 32-bit words may be stored in TX at any one time. When the TX buffer is loaded and any previous word has been transmitted, the buffer contents are automatically loaded into the output shifter. An interrupt is generated when the output shifter has been loaded, signifying that the TX buffer is ready to accept the next word (that is, the TX buffer is “not full”). This interrupt does not occur if serial port DMA is enabled or if the corresponding mask bit in the IMASK register is set.

The transmit underflow status bit (TUVF) is set in the transmit control register when a transmit frame sync occurs and no new data has been loaded into TX. The TUVF status bit is “sticky” and is only cleared by disabling the serial port.

The RX buffers act like a three-location FIFO because they have two data registers plus an input shift register. Two complete 32-bit words can be stored in RX while a third word is being shifted in. The third word overwrites the second if the first word has not been read out (by the DSP core or the DMA controller). When this happens, the receive overflow status bit (ROVF) is set in the receive control register. Almost three complete words can be received without the RX buffer being read before overflow occurs. The overflow status is generated on the last bit of the third word. The ROVF status bit is “sticky” and is only cleared by disabling the serial port.

An interrupt is generated when the RX buffer has been loaded with a received word (that is, the RX buffer is “not empty”). This interrupt is masked out if serial port DMA is enabled or if the corresponding bit in the IMASK register is set.

If your DSP program causes the core processor to attempt a read from an empty RX buffer or a write to a full TX buffer, the access is delayed until the buffer is accessed by the external I/O device.
(This delay is called a core processor hang.) If it is not known whether the core processor can access the RX or TX buffer without a hang, the buffer’s full or empty status should be read first (in STCTLx or SRCTLx) to determine if the access can be made.

To support debugging buffer transfers, the DSP has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page -18.

The status bits in STCTLx and SRCTLx are updated during reads and writes from the core processor even when the serial port is disabled. The serial port should be disabled when writing to the RX buffer or reading from the TX buffer.

Clock and Frame Sync Frequencies (TDIV, RDIV)

The TDIVx and RDIVx registers contain divisor values which determine the frequencies for internally generated clocks and frame syncs. These registers are defined in “SPORT Transmit Divisor Registers (TDIVx)” on page A-76 and “SPORT Receive Divisor Registers (RDIVx)” on page A-77.

\[ f_{RCLK} = \frac{f_{CCLK}}{2\,(\mathrm{RCLKDIV} + 1)} \qquad f_{TCLK} = \frac{f_{CCLK}}{2\,(\mathrm{TCLKDIV} + 1)} \]

The maximum serial clock frequency is equal to 1/2 the DSP’s internal clock (CCLK) frequency, which occurs when xCLKDIV is set to zero. Use the following equations to determine the value of xCLKDIV to use, given the CCLK frequency and desired serial clock frequency:

\[ \mathrm{RCLKDIV} = \frac{f_{CCLK}}{2\,f_{RCLK}} - 1 \qquad \mathrm{TCLKDIV} = \frac{f_{CCLK}}{2\,f_{TCLK}} - 1 \]

The DSP’s internal clock (CCLK) is the CLKIN frequency multiplied by a clock ratio (CLK_CFG3-0). For more information, see the clock ratio discussion on page 11-8.
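The clock divisor relation above can be checked with a few lines of integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 80 MHz CCLK value in the usage note is an assumed example figure, and rounding the quotient up (so that the resulting serial clock never exceeds the requested one) is a design choice of this sketch, not something the manual specifies.

```c
#include <assert.h>
#include <stdint.h>

/* xCLKDIV = f_CCLK / (2 * f_xCLK) - 1.
 * The quotient is rounded up so the generated clock is at or below
 * the requested frequency (an assumption made for this sketch). */
uint32_t sport_clkdiv(uint32_t f_cclk_hz, uint32_t f_sclk_hz)
{
    uint64_t denom = 2u * (uint64_t)f_sclk_hz;

    /* The serial clock cannot exceed half of CCLK (xCLKDIV = 0). */
    assert(f_sclk_hz > 0 && denom <= f_cclk_hz);
    return (uint32_t)((f_cclk_hz + denom - 1u) / denom) - 1u;
}

/* f_xCLK = f_CCLK / (2 * (xCLKDIV + 1)), truncated to whole hertz. */
uint32_t sport_sclk_hz(uint32_t f_cclk_hz, uint32_t clkdiv)
{
    return (uint32_t)(f_cclk_hz / (2u * ((uint64_t)clkdiv + 1u)));
}
```

For example, with an assumed CCLK of 80 MHz, requesting 40 MHz yields a divisor of 0, and requesting 2.048 MHz yields a divisor of 19, which produces an actual serial clock of 2 MHz. When an exact rate is required, the CCLK frequency has to be an even multiple of it.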
TFSDIV and RFSDIV specify how many transmit or receive clock cycles are counted before generating a TFS or RFS pulse (when the frame sync is internally generated). In this way a frame sync can be used to initiate periodic transfers. The counting of serial clock cycles applies to either internally or externally generated serial clocks.

The formula for the number of cycles between frame sync pulses is:

# of serial clocks between frame syncs = xFSDIV + 1

Use the following equations to determine the value of xFSDIV to use, given the serial clock frequency and desired frame sync frequency:

\[ \mathrm{TFSDIV} = \frac{f_{TCLK}}{f_{TFS}} - 1 \qquad \mathrm{RFSDIV} = \frac{f_{RCLK}}{f_{RFS}} - 1 \]

The frame sync is continuously active if xFSDIV=0. The value of xFSDIV should not be less than the serial word length minus one (the value of the SLEN field in the transmit or receive control register), as this may cause an external device to abort the current operation or cause other unpredictable results. If the serial port is not being used, the xFSDIV divisor can be used as a counter for dividing an external clock or for generating a periodic pulse or periodic interrupt. The serial port must be enabled for this mode of operation to work.

Caution should be exercised when operating with externally generated transmit clocks near the frequency of 1/2 the DSP’s internal clock. There is a delay between when the clock arrives at the TCLKx pin and when data is output; this delay may limit the receiver’s speed of operation. Refer to the data sheet for exact timing specifications. For reliable operation, it is recommended that full-speed serial clocks only be used when receiving with an externally generated clock and externally generated frame sync (ICLK=0, IRFS=0).

Externally generated late transmit frame syncs also experience a delay from when they arrive to when data is output; this can also limit the maximum serial clock speed.
Refer to the data sheet for exact timing specifications.

The serial ports handle word lengths of 3 to 32 bits, but transmitting or receiving words smaller than 7 bits at 1/2 the full clock rate of the DSP may cause incorrect operation when DMA chaining is enabled. Chaining disables the DSP’s internal I/O bus for several cycles while the new TCB parameters are being loaded. Receive data may be lost (that is, overwritten) during this period.

Data Word Formats

The format of the data words transmitted over the serial ports is configured by the DTYPE, SENDN, SLEN, and PACK bits of the STCTLx and SRCTLx control registers.

Word Length

The serial ports handle word lengths of 3 to 32 bits. The word length is configured in the 5-bit SLEN field in the STCTLx and SRCTLx control registers. The value of SLEN is equal to the word length minus one:

SLEN = Serial Word Length – 1

The SLEN value should not be set to zero or one. Words smaller than 32 bits are right-justified in the RX and TX buffers, residing in the least significant bit positions.

Transmitting or receiving words smaller than 7 bits at 1/2 the full clock rate of the DSP may cause incorrect operation when DMA chaining is enabled. Chaining disables the DSP’s internal I/O bus for several cycles while the new TCB parameters are being loaded. Receive data may be lost (that is, overwritten) during this period.

Endian Format

Endian format determines whether the serial word is transmitted MSB-first or LSB-first. Endian format is selected by the SENDN bit in the STCTLx and SRCTLx control registers. When SENDN=0, serial words are transmitted (or received) MSB-first. When SENDN=1, serial words are transmitted (or received) LSB-first.
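The SLEN encoding described under Word Length, together with the frame sync divisor relation given earlier (xFSDIV = f_xCLK / f_xFS – 1) and the rule that xFSDIV should not be less than SLEN, can be sketched as follows. This is a minimal illustration with hypothetical function names; the integer division assumes that the serial clock is a whole multiple of the frame sync rate.

```c
#include <assert.h>
#include <stdint.h>

/* SLEN = serial word length - 1; word lengths of 3 to 32 bits are
 * supported, so SLEN values of 0 and 1 are never produced. */
uint32_t sport_slen(unsigned word_bits)
{
    assert(word_bits >= 3 && word_bits <= 32);
    return word_bits - 1u;
}

/* xFSDIV = f_xCLK / f_xFS - 1 (integer division; assumes the serial
 * clock is a whole multiple of the frame sync frequency). */
uint32_t sport_fsdiv(uint32_t f_sclk_hz, uint32_t f_fs_hz)
{
    assert(f_fs_hz > 0 && f_fs_hz <= f_sclk_hz);
    return f_sclk_hz / f_fs_hz - 1u;
}

/* The divisor should not be less than SLEN, otherwise a new frame
 * sync would arrive before the current word has finished. */
int sport_fsdiv_fits_word(uint32_t fsdiv, uint32_t slen)
{
    return fsdiv >= slen;
}
```

With a 2.048 MHz serial clock and an 8 kHz frame sync, the divisor is 255, which leaves room for a 16-bit word (SLEN = 15). A divisor of 7 with the same word length would fail the check.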
Data Packing and Unpacking

Received data words of 16 bits or less may be packed into 32-bit words, and 32-bit words being transmitted may be unpacked into 16-bit words. Word packing and unpacking is selected by the PACK bit in the SRCTLx and STCTLx control registers.

When PACK=1 in the receive control register (SRCTLx), two successive words received are packed into a single 32-bit word. When PACK=1 in the transmit control register (STCTLx), each 32-bit word is unpacked and transmitted as two 16-bit words.

The first 16-bit (or smaller) word is right-justified in bits 15-0 of the packed word, and the second 16-bit (or smaller) word is right-justified in bits 31-16. This applies for both receive (packing) and transmit (unpacking) operations. Companding may be used when word packing or unpacking is being used.

When serial port data packing is enabled, the transmit and receive interrupts are generated for the 32-bit packed words, not for each 16-bit word.

When 16-bit received data is packed into 32-bit words and stored in normal word space in DSP internal memory, the 16-bit words can be read or written with short word space addresses.

Data Type

The DTYPE field of the STCTLx and SRCTLx control registers specifies one of four data formats (for non-multichannel operation):

Table 9-4. DTYPE and Data Formatting (non-multichannel)

DTYPE   Data Formatting
00      Right-justify, zero-fill unused MSBs
01      Right-justify, sign-extend into unused MSBs
10      Compand using μ-law
11      Compand using A-law

These formats are applied to serial data words loaded into the RX and TX buffers. TX data words are not actually zero-filled or sign-extended, because only the significant bits are transmitted.

For multichannel operation, the companding selection and MSB-fill selection are independent:

Table 9-5.
DTYPE and Data Formatting (multichannel)

DTYPE   Data Formatting
x0      Right-justify, zero-fill unused MSBs
x1      Right-justify, sign-extend into unused MSBs
0x      Compand using μ-law
1x      Compand using A-law

Linear transfers occur if the channel is active but companding is not selected for that channel. Companded transfers occur if the channel is active and companding is selected for that channel. The multichannel compand select registers, MTCCSx and MRCCSx, are used to specify which transmit and receive channels are companded. For more information, see “Channel Selection Registers” on page 9-32.

Transmit sign extension is selected by bit 0 of DTYPE in the STCTLx register and is common to all transmit channels. Receive sign extension is selected by bit 0 of DTYPE in the SRCTLx register and is common to all receive channels. If bit 0 of DTYPE is set, sign extension occurs on selected channels that do not have companding selected. If this bit is not set, the word contains 0s in the MSBs.

Companding

Companding (compressing/expanding) is the process of logarithmically encoding and decoding data to minimize the number of bits that must be sent. The DSP serial ports support the two most widely used companding algorithms, A-law and μ-law, performed according to the CCITT G.711 specification. The type of companding can be selected independently for each SPORT. Companding is selected by the DTYPE field of the STCTLx and SRCTLx control registers.

When companding is enabled, the data in the RX0 or RX1 buffer is the right-justified, sign-extended expanded value of the eight LSBs received. A write to TX0 or TX1 causes the 32-bit value to be compressed to eight LSBs (sign-extended to the width of the transmit word) before it is transmitted. If the 32-bit value is greater than the 13-bit A-law or 14-bit μ-law maximum, it is automatically compressed to the maximum value.
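For the two non-companded DTYPE settings in Table 9-4, the effect on a received word can be modeled in software. The sketch below is an illustration only, with hypothetical names; it assumes DTYPE is passed as a 2-bit value whose encodings match Table 9-4, and it does not model the companding hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Non-multichannel DTYPE encodings from Table 9-4 (names are hypothetical). */
enum {
    SPORT_DTYPE_ZERO_FILL   = 0,  /* 00: right-justify, zero-fill MSBs   */
    SPORT_DTYPE_SIGN_EXTEND = 1,  /* 01: right-justify, sign-extend MSBs */
    SPORT_DTYPE_MU_LAW      = 2,  /* 10: compand using mu-law            */
    SPORT_DTYPE_A_LAW       = 3   /* 11: compand using A-law             */
};

/* Bit 1 of DTYPE distinguishes the companded formats in Table 9-4. */
int sport_dtype_is_companded(unsigned dtype)
{
    return (dtype & 2u) != 0;
}

/* Model how a right-justified word of word_bits bits appears in the
 * 32-bit RX buffer for the two linear (non-companded) formats. */
uint32_t sport_fill_msbs(uint32_t raw, unsigned word_bits, unsigned dtype)
{
    uint32_t mask, value;

    assert(word_bits >= 3 && word_bits <= 32);
    assert(dtype == SPORT_DTYPE_ZERO_FILL || dtype == SPORT_DTYPE_SIGN_EXTEND);
    if (word_bits == 32)
        return raw;                      /* no unused MSBs to fill */

    mask  = (1u << word_bits) - 1u;
    value = raw & mask;                  /* zero-fill by default */
    if (dtype == SPORT_DTYPE_SIGN_EXTEND && (value & (1u << (word_bits - 1u))))
        value |= ~mask;                  /* copy the sign bit upward */
    return value;
}
```

An 8-bit word of 0x80 therefore reads as 0x00000080 with zero fill and as 0xFFFFFF80 with sign extension, while 0x7F reads as 0x0000007F in both cases.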
Because the values in the TX and RX buffers are actually companded in-place, the companding hardware can be used without transmitting (or receiving) any data, for example during testing or debugging. This operation requires a single cycle of overhead, as described below. For companding to execute properly, program the SPORT registers prior to loading data values into the SPORT buffers.

To compand data in-place, without transmitting, use the following sequence of operations:

1. Enable companding in the DTYPE field of the STCTLx transmit control register.
2. Write a 32-bit data word to TX. (The companding is calculated in this cycle.)
3. Wait one cycle. A NOP instruction can be used to do this; if a NOP is not inserted, the DSP core is held off for one cycle anyway. This allows the serial port companding hardware to reload TX with the companded value.
4. Read the 8-bit companded value from TX.

To expand data in-place, the same sequence of operations is used but with RX rather than TX. When expanding data in this way, be sure that the serial word length (SLEN) is set appropriately in the SRCTLx control register.

With companding enabled, interfacing the DSP serial port to a codec requires little additional programming effort. If companding is not selected, there are two formats available for received data words of fewer than 32 bits: one that fills unused MSBs with zeros, and another that sign-extends the MSB into the unused bits. For more information, see “Data Type” on page 9-19.

Clock Signal Options

Each serial port has a transmit clock signal (TCLKx) and a receive clock signal (RCLKx). The clock signals are configured by the ICLK and CKRE bits of the STCTLx and SRCTLx control registers. Serial clock frequency is configured in the TDIVx and RDIVx registers.

The receive clock pin may be tied to the transmit clock if a single clock is desired for both input and output.
Both transmit and receive clocks can be independently generated internally or input from an external source. The ICLK bit of the STCTLx and SRCTLx control registers determines the clock source.

When ICLK=1, the clock signal is generated internally by the DSP and the TCLKx or RCLKx pins are outputs. The clock frequency is determined by the value of the serial clock divisor (TCLKDIV or RCLKDIV) in the TDIVx or RDIVx registers.

When ICLK=0, the clock signal is accepted as an input on the TCLKx or RCLKx pins, and the serial clock divisors in the TDIVx/RDIVx registers are ignored. The externally generated serial clock need not be synchronous with the DSP system clock.

Frame Sync Options

Framing signals indicate the beginning of each serial word transfer. The framing signals for each serial port are TFS (transmit frame synchronization) and RFS (receive frame synchronization). A variety of framing options are available; these options are configured in the serial port control registers. The TFS and RFS signals of a serial port are independent and are separately configured in the control registers.

Framed Versus Unframed

The use of frame sync signals is optional in serial port communications. The TFSR (transmit frame sync required) and RFSR (receive frame sync required) control bits determine whether frame sync signals are required. These bits are located in the STCTLx and SRCTLx control registers.

When TFSR=1 or RFSR=1, a frame sync signal is required for every data word. To allow continuous transmitting from the DSP, each new data word must be loaded into the TX buffer before the previous word is shifted out and transmitted. For more information, see “Data-Independent Transmit Frame Sync” on page 9-27.

When TFSR=0 or RFSR=0, the corresponding frame sync signal is not required.
A single frame sync is needed to initiate communications but is ignored after the first bit is transferred. Data words are then transferred continuously, unframed.

When DMA is enabled in this mode, with frame syncs not required, DMA requests may be held off by chaining or may not be serviced frequently enough to guarantee continuous unframed data flow.

Figure 9-2 illustrates framed serial transfers, which have the following characteristics:

Figure 9-2. Framed Versus Unframed Data (signals shown: xCLK, framed data, unframed data; bits B3 to B0 of each word)

• TFSR and RFSR bits in the STCTLx and SRCTLx control registers determine framed or unframed mode.
• Framed mode requires a framing signal for every word. Unframed mode ignores the framing signal after the first word.
• Unframed mode is appropriate for continuous reception.
• Active-low or active-high frame syncs are selected with the LTFS and LRFS bits of the STCTLx and SRCTLx control registers.

Internal Versus External Frame Syncs

Both transmit and receive frame syncs can be independently generated internally or input from an external source. The ITFS and IRFS bits of the STCTLx and SRCTLx control registers determine the frame sync source.

When ITFS=1 or IRFS=1, the corresponding frame sync signal is generated internally by the DSP and the TFSx pin or RFSx pin is an output. The frequency of the frame sync signal is determined by the value of the frame sync divisor (TFSDIV or RFSDIV) in the TDIVx or RDIVx registers.

When ITFS=0 or IRFS=0, the corresponding frame sync signal is accepted as an input on the TFSx pin or RFSx pin, and the frame sync divisors in the TDIVx/RDIVx registers are ignored.

All of the various frame sync options are available whether the signal is generated internally or externally.

Active Low Versus Active High Frame Syncs

Frame sync signals may be either active high or active low (that is, inverted).
The LTFS and LRFS bits of the STCTLx and SRCTLx control registers determine the frame syncs’ logic level:

• When LTFS=0 or LRFS=0, the corresponding frame sync signal is active high.
• When LTFS=1 or LRFS=1, the corresponding frame sync signal is active low.

Active high frame syncs are the default. The LTFS and LRFS bits are initialized to 0 after a processor reset.

Sampling Edge For Data and Frame Syncs

Data and frame syncs can be sampled on either the rising or falling edges of the serial port clock signals. The CKRE bit of the STCTLx and SRCTLx control registers selects the sampling edge.

For transmit data and frame syncs, setting CKRE=1 in STCTLx selects the rising edge of TCLKx. CKRE=0 selects the falling edge. Note that data and frame sync signals change state on the clock edge that is not selected.

For receive data and frame syncs, setting CKRE=1 in SRCTLx selects the rising edge of RCLKx. CKRE=0 selects the falling edge.

For example, the transmit and receive functions of two serial ports connected together should always select the same value for CKRE, so that any internally generated signals are driven on one edge and any received signals are sampled on the opposite edge.

Early Versus Late Frame Syncs

Frame sync signals can occur during the first bit of each data word (“late”) or during the serial clock cycle immediately preceding the first bit (“early”). The LAFS bit of the STCTLx and SRCTLx control registers configures this option.

When LAFS=0, early frame syncs are configured; this is the normal mode of operation. In this mode, the first bit of the transmit data word is available (and the first bit of the receive data word is latched) in the serial clock cycle after the frame sync is asserted, and the frame sync is not checked again until the entire word has been transmitted (or received). (In multichannel operation, this is the case when frame delay is 1.)
If data transmission is continuous in early framing mode (that is, the last bit of each word is immediately followed by the first bit of the next word), then the frame sync signal occurs during the last bit of each word. Internally generated frame syncs are asserted for one clock cycle in early framing mode.

When LAFS=1, late frame syncs are configured; this is the alternate mode of operation. In this mode, the first bit of the transmit data word is available (and the first bit of the receive data word is latched) in the same serial clock cycle that the frame sync is asserted. (In multichannel operation, this is the case when frame delay is zero.) Receive data bits are latched by serial clock edges, but the frame sync signal is only checked during the first bit of each word. Internally generated frame syncs remain asserted for the entire length of the data word in late framing mode. Externally generated frame syncs are only checked during the first bit.

Figure 9-3 illustrates the two modes of frame signal timing:

• The LAFS bits of the STCTLx and SRCTLx control registers select the mode: LAFS=0 for early frame syncs, LAFS=1 for late frame syncs.
• Early framing: frame sync precedes data by one cycle. Late framing: frame sync checked on first bit only.
• Data transmitted MSB-first (SENDN=0) or LSB-first (SENDN=1).
• Frame sync and clock generated internally or externally.

Figure 9-3. Normal Versus Alternate Framing (signals shown: xCLK, late frame sync, early frame sync, data bits B3 to B0)

Data-Independent Transmit Frame Sync

Normally the internally generated transmit frame sync signal (TFS) is output only when the TX buffer has data ready to transmit. The DITFS mode (data-independent transmit frame sync) allows the continuous generation of the TFS signal, with or without new data. The DITFS bit of the STCTLx control register configures this option.
When DITFS=0, the internally generated TFS is only output when a new data word has been loaded into the TX buffer. Once data is loaded into TX, it is not transmitted until the next TFS is generated. This mode of operation allows data to be transmitted only at specific times.

When DITFS=1, the internally generated TFS is output at its programmed interval regardless of whether new data is available in the TX buffer. Whatever data is present in TX is retransmitted with each assertion of TFS. The TUVF transmit underflow status bit (in the STCTLx control register) is set when this occurs (that is, when old data is retransmitted). The TUVF status bit is also set if the TX buffer does not have new data when an externally generated TFS occurs. Note that in this mode of operation, the first internally generated TFS is delayed until data has been loaded into the TX buffer.

If the internally generated TFS is used, a single write to the TX data register is required to start the transfer.

SPORT Loopback

When the SPL bit (SPORT loopback) is set in the SRCTLx receive control register, the serial port is configured in an internal loopback connection. The loopback configuration allows the serial ports to be tested internally.

When loopback is configured, the DRx, RCLKx, and RFSx signals of the receive section of the SPORT are internally connected to the DTx, TCLKx, and TFSx signals of the transmit section. The DTx, TCLKx, and TFSx signals are active and are available at their respective pins, while the DRx, RCLKx, and RFSx pins are ignored by the DSP.

Only transmit clock and transmit frame sync options may be used in loopback mode; programs must ensure that the serial port is set up correctly in the STCTLx and SRCTLx control registers. Multichannel mode is not allowed.
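The DITFS behavior described under “Data-Independent Transmit Frame Sync” can be modeled in a few lines of C. The following is a software sketch only, not driver code: the structure and function names are hypothetical, and it assumes an internally generated TFS that is evaluated once per programmed frame interval.

```c
#include <stdbool.h>

/* Hypothetical software model of internally generated TFS.
   This is not a register map; names are for illustration only. */
typedef struct {
    bool ditfs;        /* models the DITFS bit of STCTLx */
    bool tx_new_data;  /* TX buffer holds a word not yet transmitted */
    bool started;      /* at least one word has been loaded into TX */
    bool tuvf;         /* models the TUVF transmit underflow status bit */
} tfs_model;

/* Models a write of one word to the TX buffer. */
void tfs_load_tx(tfs_model *m) {
    m->tx_new_data = true;
    m->started = true;
}

/* Called once per programmed frame interval.
   Returns true if TFS is asserted for this interval. */
bool tfs_interval(tfs_model *m) {
    if (m->tx_new_data) {          /* new word: framed in either mode */
        m->tx_new_data = false;
        return true;
    }
    if (m->ditfs && m->started) {  /* DITFS=1: old word is retransmitted */
        m->tuvf = true;
        return true;
    }
    return false;                  /* DITFS=0, or first TFS still held off */
}
```

In this model a port with DITFS=0 stays silent until `tfs_load_tx` is called, while a port with DITFS=1 keeps framing after the first load and flags underflow whenever it repeats a word.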
Multichannel Operation

The DSP serial ports offer a multichannel mode of operation which allows the SPORT to communicate in a time-division-multiplexed (TDM) serial system. In multichannel communications, each data word of the serial bit stream occupies a separate channel; each word belongs to the next consecutive channel so that, for example, a 24-word block of data contains one word for each of 24 channels.

The serial port can automatically select words for particular channels while ignoring the others. Up to 32 channels are available for transmitting or receiving; each SPORT can receive and transmit data selectively from any of the 32 channels. In other words, the SPORT can do any of the following on each channel:

• transmit data
• receive data
• transmit and receive data, or
• do nothing

Data companding and DMA transfers can also be used in multichannel mode.

The DT pin is always driven (that is, not three-stated) if the serial port is enabled (SPEN=1 in the STCTLx control register), unless it is in multichannel mode and an inactive time slot occurs.

Note that (in multichannel mode) the TCLKx pin is always an input and must be connected to its corresponding RCLKx pin.

Figure 9-4 shows example timing for a multichannel transfer, which has the following characteristics:

[Figure 9-4. Multichannel Operation. Signals shown across WORD 0, WORD 1, and WORD 2: SCLK, DR (bits B3 through B0, with the ignored word marked), RFS, DT, and TFS.]

• Uses TDM method where serial data is sent or received on different channels sharing the same serial bus.
• The number of channels is selected with the NCH bits of SRCTLx: NCH=(# of channels) – 1
• Can independently select transmit and receive channels.
• RFS signals start of frame.
• TFS is used as “Transmit Data Valid” for external logic; active only during transmit channels.
• Example: Receive on channels 0 and 2. Transmit on channels 1 and 2.
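The example in Figure 9-4 (receive on channels 0 and 2, transmit on channels 1 and 2) can be written as two per-channel enable masks. The sketch below is illustrative only: the helper names are hypothetical, and it assumes one enable bit per channel with bit n corresponding to channel n.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: one enable bit per channel, bit n = channel n (0..31). */
static inline uint32_t channel_bit(unsigned channel) {
    return (uint32_t)1 << (channel & 31u);
}

typedef enum { CH_IDLE, CH_RX, CH_TX, CH_RX_TX } channel_action;

/* What the port does in one time slot for given transmit and receive masks. */
channel_action slot_action(uint32_t tx_mask, uint32_t rx_mask, unsigned channel) {
    bool tx = (tx_mask & channel_bit(channel)) != 0;
    bool rx = (rx_mask & channel_bit(channel)) != 0;
    if (tx && rx) return CH_RX_TX;
    if (tx) return CH_TX;
    if (rx) return CH_RX;
    return CH_IDLE;  /* inactive slot: word ignored, DT not driven */
}
```

For the figure's example the receive mask works out to 0x5 and the transmit mask to 0x6, so channel 2 is the only slot that is both transmitted and received.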
Frame Syncs in Multichannel Mode

All receiving and transmitting devices in a multichannel system must have the same timing reference. The RFS signal is used for this reference, indicating the start of a block (or frame) of multichannel data words.

When multichannel mode is enabled on a SPORT, both the transmitter and receiver use RFS as a frame sync. This is true whether RFS is generated internally or externally. The RFS signal is used to synchronize the channels and restart each multichannel sequence. RFS assertion occurs at the beginning of the channel 0 data word.

TFS is used as a transmit data valid signal which is active during transmission of an enabled word. Because the serial port’s DTx pin is three-stated when the time slot is not active, the TFS signal specifies whether or not DTx is being driven by the DSP. The DSP drives TFS in multichannel mode whether or not ITFS is cleared.

After the TX transmit buffer is loaded, transmission begins and the TFS signal is generated. When serial port DMA is being used, this may happen several cycles after the multichannel transmission is enabled. If a deterministic start time is required, the TX buffer should be preloaded.

TFS is normally left unconnected in multichannel mode, and the RFS pins of the serial port(s) are usually connected together.

Multichannel Control Bits in STCTL, SRCTL

The STCTLx and SRCTLx control registers contain several bits used to enable and configure multichannel operations.

Multichannel mode is enabled by setting the MCE bit in the SRCTLx control register:

• When MCE=1, multichannel operation is enabled.
• When MCE=0, all multichannel operations are disabled.

Multichannel operation is activated three cycles after MCE is set. Internally generated frame sync signals activate four cycles after MCE is set.
Setting the MCE bit enables multichannel operation for both receive and transmit sides of the SPORT. A transmitting SPORT must be in multichannel mode if the receiving SPORT is in multichannel mode.

The SLEN bits determine the serial bit length of the transmit and receive data words. The SLEN bit settings in the STCTLx register (bits 4-8) should match the SLEN bit settings in the SRCTLx register.

The number of channels used in multichannel operation is selected by the 5-bit NCH field in the SRCTLx control register. NCH should be set to the actual number of channels minus one:

NCH = Number of Channels – 1

The 5-bit CHNL field in the STCTLx control register indicates which channel is currently selected during multichannel operation. This field is a read-only status indicator. CHNL(4:0) increments modulo NCH(4:0) as each channel is serviced.

The 4-bit MFD field in the STCTLx control register specifies a delay between the frame sync pulse and the first data bit in multichannel mode. The value of MFD is the number of serial clock cycles of the delay. Multichannel frame delay allows the processor to work with different types of T1 interface devices.

A value of zero for MFD causes the frame sync to be concurrent with the first data bit. The maximum value allowed for MFD is 15. A new frame sync may occur before data from the last frame has been received, because blocks of data occur back to back.

A multichannel frame delay of at least one should be used when the DSP is generating frame syncs for the multichannel system and the serial clock of the system is equal to CLKIN (the processor clock). If MFD is not set to at least one, the master DSP in a multiprocessing system does not recognize the first frame sync after multichannel operation is enabled. All succeeding frame syncs are recognized normally.
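The NCH and MFD rules above lend themselves to a small range check before the control registers are written. The following is a minimal sketch; the function and macro names are hypothetical, and it computes only the field values, leaving the field positions within SRCTLx and STCTLx to the register descriptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define MAX_CHANNELS 32u  /* up to 32 channels per SPORT */
#define MFD_MAX      15u  /* largest value of the 4-bit MFD field */

/* NCH = number of channels - 1. Returns false if the count is out of range. */
bool nch_from_channels(unsigned channels, uint8_t *nch) {
    if (channels < 1u || channels > MAX_CHANNELS) return false;
    *nch = (uint8_t)(channels - 1u);
    return true;
}

/* Checks a multichannel frame delay value. A delay of at least one is
   required when this DSP generates the frame sync and the serial clock
   equals CLKIN. */
bool mfd_is_valid(unsigned mfd, bool dsp_generates_fs, bool sclk_equals_clkin) {
    if (mfd > MFD_MAX) return false;
    if (dsp_generates_fs && sclk_equals_clkin && mfd == 0u) return false;
    return true;
}
```

For the 24-channel block used as an example earlier in this chapter, `nch_from_channels` yields an NCH value of 23.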
Channel Selection Registers

Specific channels can be individually enabled or disabled to select which words are received and transmitted during multichannel communications. Data words from the enabled channels are received or transmitted, while disabled channel words are ignored. Up to 32 channels are available for transmitting and up to 32 channels for receiving.

The multichannel selection registers are used to enable and disable individual channels. The registers for each serial port are as shown in Table 9-6.

Table 9-6. Multichannel Selection Registers

Register Name   Function
MTCSx           Multichannel Transmit Select: specifies the active transmit channels
MRCSx           Multichannel Receive Select: specifies the active receive channels
MTCCSx          Multichannel Transmit Compand Select: specifies which active transmit channels are companded
MRCCSx          Multichannel Receive Compand Select: specifies which active receive channels are companded

Each register has 32 bits, corresponding to the 32 channels. Setting a bit enables that channel so that the serial port selects its word from the multiple-word block of data (for either receive or transmit). For example, setting bit 0 selects word 0, setting bit 12 selects word 12, and so on.

Setting a particular bit to 1 in the MTCSx register causes the serial port to transmit the word in that channel’s position of the data stream. Clearing the bit to 0 in the MTCSx register causes the serial port’s DT (data transmit) pin to three-state during the time slot of that channel.

Setting a particular bit to 1 in the MRCSx register causes the serial port to receive the word in that channel’s position of the data stream; the received word is loaded into the RX buffer. Clearing the bit to 0 in the MRCSx register causes the serial port to ignore the data.

Companding may be selected on a per-channel basis. The MTCCSx and MRCCSx registers are used to specify companding for any active channels.
Setting a bit to 1 in these registers causes the data to be companded. A-law or μ-law companding is selected with the DTYPE bit 1 in the STCTLx and SRCTLx control registers.

SPORT Receive Comparison Registers

On the DSP, two sets of registers aid multiprocessor communications when using multichannel mode (MCE=1) through the serial ports. These 32-bit registers are the Receive Comparison (KEYWDx) registers and the Receive Comparison Mask (KEYMASKx) registers. Table 9-7 shows the MCE setting as well as the bits in the SRCTL register that control the operation of Receive Comparison.

Table 9-7. Receive Comparison Selection

IMODE (Bit 15)   IMAT (Bit 20)   Operation
0                x               Receive comparison disabled
1                0               Accept receive data if the KEYWD comparison is false
1                1               Accept receive data if the KEYWD comparison is true

The KEYWD0 or KEYWD1 register stores the pattern to be matched with the incoming data. The corresponding KEYMASK0 or KEYMASK1 register specifies which of the bits in the received data should be compared. Setting a KEYMASKx bit (=1) masks the corresponding bit in the KEYWDx register, disabling its comparison.

The processor receiving the data compares it with the data in the KEYWDx register. Depending on the comparison results, the received data is accepted or ignored. If accepted, the receiver requests, based on the settings in the SRCTL register, a DMA transfer to internal memory or generates an interrupt.

When receive comparison is enabled, companding is disabled on the transmitter and receiver. The MTCCSx register, which selects multichannel companding when receive comparison is disabled, determines whether the DSP performs a KEYWD comparison for the enabled received channels. If the MTCCSx bit for a particular channel is '0,' the processor does not perform a comparison and always accepts the receive data on that channel.
If the MTCCSx bit for a particular channel is '1,' the processor performs the comparison and accepts (or rejects) the receive data, depending on the result of the comparison and IMAT setting in the SRCTLx register.

The receive comparison feature lets the DSP's SPORTs generate a DMA request or an interrupt when the received data matches a specified condition on a specified channel in multichannel mode. Without this feature, the SPORT would interrupt the processor every time data was received and the processor would be required to check if the data was meant for it or not. It is possible that most of the time the data being sent is not meant for the processor. With the receive comparison feature, the SPORT on a particular processor can be programmed to interrupt only on messages meant for that processor.

As a receive comparison example, consider four DSPs (A, B, C, and D) which use SPORT0 (in multichannel mode) for interprocessor communication. Channels 0, 1, 2, and 3 are used respectively by A, B, C, and D to transmit control information between the processors. Channels 4 through 10, 11 through 17, 18 through 24, and 25 through 31 are used respectively by A, B, C, and D to transmit data.

Because channels 0 through 3 are used to send control information between the processors, comparison of incoming data is enabled only for these channels. Initially, channels 4 through 31 may have receive disabled. For this example, consider communication between processors A and B only.

The keyword for comparison is programmable; in this example, processor B can check for the keyword START TRANSMIT TO B. Processor B can check for this keyword as follows.

1. Set the KEYWD register to START TRANSMIT TO B.
2. Clear bits 31:16 of the KEYMASK register to 0 and set the other bits to 1. This step enables comparison only for bits 31:16. This assumes that the code for START TRANSMIT TO B uses only bits 31:16, and that bits 15:0 indicate the source of the transmission and the data channels.
3. Set bits 15 and 20 of the SRCTL register to 1. This step enables the SPORT to generate an interrupt or DMA request only if the incoming data matches the KEYWD.
4. Set bits 0 through 3 of the Multichannel Transmit Compand Select (MTCCSx) register to 1 and clear the remaining bits to 0. This step enables comparison only on channels 0 through 3.

Until it receives the START TRANSMIT TO B keyword, processor B ignores all transmissions that it receives. When processor A wants to send data to B, it sends this keyword on channel 0. When receive comparison on processor B recognizes the START TRANSMIT TO B keyword, the SPORT interrupts processor B. Then, processor B analyzes the remaining 16 bits, determining that the source is processor A and the data is on channels 4 through 10.

Because processor A is using channels 4 through 10 to transmit data, processor B enables receive channels 4 through 10 and sends a “READY TO RECEIVE DATA” message to processor A, using channel 1. After processor A receives this message, it sends the data on channels 4 through 10.

If the transfer protocol uses a fixed number of bytes in each message, processor B could send back a checksum message to processor A after receiving A's message, confirming that the data transferred accurately.

Moving Data Between SPORTS and Memory

Transmit and receive data can be transferred between the DSP serial ports and on-chip memory in one of two ways, with single-word transfers or with DMA block transfers. Both methods are interrupt-driven, using the same internally generated interrupts.

When serial port DMA is not enabled in the STCTLx or SRCTLx control registers, the SPORT generates an interrupt every time it has received a data word or has started to transmit a data word.
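Before going further into data movement, the receive comparison rules given under “SPORT Receive Comparison Registers” (steps 1 through 4 of the example) can be summarized as a small software model. The hardware performs this check itself; the sketch only restates the decision. The structure and function names are hypothetical, and the check of whether a channel is enabled for receive at all (MRCSx) is left out.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of receive comparison; names are illustrative. */
typedef struct {
    uint32_t keywd;    /* KEYWDx: pattern to match */
    uint32_t keymask;  /* KEYMASKx: 1 = bit excluded from comparison */
    bool     imode;    /* SRCTL bit 15: receive comparison enabled */
    bool     imat;     /* SRCTL bit 20: accept on match (1) or mismatch (0) */
    uint32_t mtccs;    /* MTCCSx: per-channel comparison enable when IMODE=1 */
} rx_compare;

/* Returns true if a word received on the given channel is accepted. */
bool rx_accept(const rx_compare *c, unsigned channel, uint32_t word) {
    if (!c->imode) return true;                 /* comparison disabled */
    if (((c->mtccs >> (channel & 31u)) & 1u) == 0u)
        return true;                            /* channel not compared */
    bool match = ((word ^ c->keywd) & ~c->keymask) == 0u;
    return match == c->imat;
}
```

For processor B in the example, `keymask` would be 0x0000FFFF, `mtccs` would be 0xF, and both `imode` and `imat` would be set; the 16-bit keyword value itself is protocol-defined and is not specified here.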
SPORT DMA provides a mechanism for receiving or transmitting an entire block of serial data before the interrupt is generated. The DSP’s on-chip DMA controller handles the DMA transfer, allowing the processor core to continue running until the entire block of data is transmitted or received. Service routines can then operate on the block of data rather than on single words, significantly reducing overhead.

DMA Block Transfers

The DSP’s on-chip DMA controller allows automatic DMA transfers between internal memory and the two serial ports. There are four DMA channels for serial port operations; each SPORT has one channel for receiving data and one for transmitting data. The serial port DMA channels are numbered as follows:

• DMA Channel 0 – SPORT0 Receive
• DMA Channel 1 – SPORT1 Receive
• DMA Channel 2 – SPORT0 Transmit
• DMA Channel 3 – SPORT1 Transmit

The SPORT DMA channels are assigned higher priority than all other DMA channels (for example, link ports and the external port) because of their relatively low service rate and their inability to hold off incoming data. Having higher priority causes the SPORT DMA transfers to be performed first when multiple DMA requests occur in the same cycle.

Although the DMA transfers are always performed with 32-bit words, the serial ports can handle word sizes from 3 to 32 bits. If the serial words are 16 bits or smaller, they can be packed into 32-bit words for each DMA transfer; this is configured by the PACK bit of the STCTLx and SRCTLx control registers. When serial port data packing is enabled (PACK=1), the transmit and receive interrupts are generated for the 32-bit packed words, not for each 16-bit word.

The following sections present an overview of serial port DMA operations; some additional details are covered in the DMA chapter of this manual.

• For information on SPORT DMA Channel Setup, see “Setting up Serial Port DMA” on page 6-90.
• For information on SPORT DMA Parameter Registers, see “Setting I/O Processor—SPort Modes” on page 6-49.
• For information on SPORT DMA Chaining, see “Chaining DMA Processes” on page 6-69.

Single-Word Transfers

Individual data words may also be transmitted and received by the serial ports, with interrupts occurring as each 32-bit word is transmitted or received. When a serial port is enabled and DMA is disabled (in the STCTLx or SRCTLx control registers), the SPORT DMA interrupts are generated in this way: whenever a complete 32-bit word has been received in the RX buffer, or whenever the TX buffer is not full. Single-word interrupts can be used to implement interrupt-driven I/O on the serial ports.

Whenever the DSP core’s program reads a word from a serial port’s RX buffer or writes a word to its TX buffer, the buffer’s full/empty status should first be checked in order to avoid hanging the DSP core. (This can also happen to an external device, for example a host processor, when it is reading or writing a serial port buffer.) The full/empty status can be read in the RXS bits of the SRCTLx register or the TXS bits of the STCTLx register. Reading from an empty RX buffer or writing to a full TX buffer causes the DSP (or external device) to hang, waiting for the status to change.

To support debugging buffer transfers, the DSP has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page -18.

Multiple interrupts can occur if both SPORTs transmit or receive data in the same cycle. Any interrupt can be masked out in the IMASK register; if the interrupt is later enabled in IMASK, the corresponding interrupt latch bit in IRPTL must be cleared in case the interrupt has occurred in the meantime.
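The advice to check buffer status before each access can be sketched as a pair of guarded, non-blocking accessors. This is an illustration under stated assumptions: the `buf_status` type is a hypothetical, already-decoded view of the RXS or TXS bits (their actual encodings are given in the register descriptions and are not assumed here), and the register pointers stand in for whatever access mechanism the program uses.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical decoded buffer status; not the raw RXS/TXS bit encoding. */
typedef enum { BUF_EMPTY, BUF_PARTIAL, BUF_FULL } buf_status;

/* Reads one word from RX only if the buffer is not empty.
   Returns false (and leaves *out untouched) instead of stalling. */
bool rx_try_read(buf_status rxs, const volatile uint32_t *rx_reg, uint32_t *out) {
    if (rxs == BUF_EMPTY) return false;
    *out = *rx_reg;
    return true;
}

/* Writes one word to TX only if the buffer is not full.
   Returns false instead of stalling. */
bool tx_try_write(buf_status txs, volatile uint32_t *tx_reg, uint32_t word) {
    if (txs == BUF_FULL) return false;
    *tx_reg = word;
    return true;
}
```

A caller that gets `false` back can retry later or wait for the corresponding interrupt, which avoids the core hang described above.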
When serial port data packing is enabled (PACK=1 in the STCTLx or SRCTLx control registers), the transmit and receive interrupts are generated for the 32-bit packed words, not for each 16-bit word.

SPORT Pin/Line Terminations

The DSP has very fast drivers on all output pins including the serial ports. If connections on the data, clock, or frame sync lines are longer than six inches, you should consider using a series termination for strip lines on point-to-point connections. This may be necessary even when using low-speed serial clocks, because of the edge rates.

10 JTAG TEST EMULATION PORT

A boundary scan allows a system designer to test interconnections on a printed circuit board with minimal test-specific hardware. The scan is made possible by the ability to control and monitor each input and output pin on each chip through a set of serially scannable latches. Each input and output is connected to a latch, and the latches are connected as a long shift register so that data can be read from or written to them through a serial test access port (TAP).

The ADSP-21160 DSP contains a test access port compatible with the industry-standard IEEE 1149.1 (JTAG) specification. Only the IEEE 1149.1 features specific to the ADSP-21160 DSPs are described here. For more information, see the IEEE 1149.1 specification and the other documents listed in “References” on page 10-55.

The boundary scan allows a variety of functions to be performed on each input and output signal of the ADSP-21160 DSP. Each input has a latch that monitors the value of the incoming signal and can also drive data into the chip in place of the incoming value. Similarly, each output has a latch that monitors the outgoing signal and can also drive the output in place of the outgoing value. For bidirectional pins, the combination of input and output functions is available.
Every latch associated with a pin is part of a single serial shift register path. Each latch is a master/slave type latch with the controlling clock provided externally. This clock (TCK) is asynchronous to the ADSP-21160 system clock (CLKIN).

The ADSP-21160 emulation features let you halt the processor at a pre-defined point. You then examine the state of the processor, execute arbitrary code, restore the original state, and continue execution.

The ADSP-21160 emulation features are a superset of the ADSP-21060 emulation features. All emulation features supported by previous SHARC DSPs are supported on the ADSP-21160 DSP, except the ICSA output signal and function. The set of features on which EZ-ICE® designs rely are supported in an identical fashion on the ADSP-21160 DSP. The ADSP-21160 DSP can be used with the ADSP-2106x SHARC EZ-ICE hardware.

There are several changes/extensions to the base functionality of the ADSP-21060 emulation capability, which require changes in the EZ-ICE software for ADSP-21160 DSP support. These extensions include:

• The emulation breakpoint address start/end registers have moved from UREG space to IOP register space. This change did not affect the TSTEMU block directly, only the address decodes to gain access to it.
• EMU64PX has been added to the IR decode space. This shift register provides access to the full 64-bit wide PX register of the ADSP-21160 DSP.
• A memory test shift register has been added to the IR decode space. This feature is for Analog Devices internal use ONLY.
• Addition of the MTST (Memory TeST) bit in the EMUCTL register. This feature is for Analog Devices internal use ONLY. EMUCTL is 40 bits wide on the ADSP-21160 DSP.

Several on-chip facilities are directly accessed through the JTAG interface. These facilities are listed in Table 10-2 on page 10-4. Other emulation facilities are only indirectly accessible.
To indirectly access the facilities that do not appear in Table 10-2, scan the instruction which moves data of interest to/from the PX register, scan the PX data (if the instruction is a PX read), let the core execute the instruction, then scan the PX register out (if the instruction was a PX write).

The breakpoint start/end registers are mapped into the IOP register space of the ADSP-21160 DSP. For specific addresses, see “Register and Bit #Defines File (def21160.h)” on page A-81. The EMUN, EMUCLK, and EMUCLK2 registers occupy the same UREG address space as on the ADSP-2106x DSPs. These facilities are read only by the ADSP-21160 processor core in normal operation.

JTAG Test Access Port

The emulator uses JTAG boundary scan logic for ADSP-21160 communications and control. This JTAG logic consists of a state machine, a five pin Test Access Port (TAP), and shift registers. The state machine and pins conform to the IEEE 1149.1 specification. The TAP pins appear in Table 10-1.

Table 10-1. JTAG Test Access Port (TAP) Pins

Pin             Function
TCK (input)     Test Clock: pin used to clock the TAP state machine. (See note 1.)
TMS (input)     Test Mode Select: pin used to control the TAP state machine sequence. (See note 2.)
TDI (input)     Test Data In: serial shift data input pin.
TDO (output)    Test Data Out: serial shift data output pin.
TRST (input)    Test Logic Reset: resets the TAP state machine.

Note 1: Asynchronous with CLKIN
Note 2: Synchronous to CLKIN

A BSDL file for the ADSP-21160 DSP is available on Analog Devices’ website. Set your browser to: http://www.analog.com/dsp

Refer to the IEEE 1149.1 JTAG specification for detailed information on the JTAG interface. Many sections of this chapter assume a working knowledge of the JTAG specification.

Instruction Register

The instruction register allows an instruction to be shifted into the processor.
This instruction selects the test to be performed and/or the test data register to be accessed. The instruction register is 5 bits long with no parity bit. A value of 10000 binary is loaded (LSB nearest TDO) into the instruction register whenever the TAP reset state is entered.

Table 10-2 lists the binary code for each instruction. Bit 0 is nearest TDO and bit 4 is nearest TDI. No data registers are placed into test modes by any of the public instructions. The instructions affect the ADSP-21160 DSP as defined in the 1149.1 specification. The optional instructions RUNBIST, IDCODE and USERCODE are not supported by the ADSP-21160 DSP.

Table 10-2. JTAG Instruction Register Codes

43210        Register   Instruction   Comment                 Type
11111        Bypass     BYPASS                                Public
00000        Boundary   EXTEST                                Public
10000        Boundary   SAMPLE                                Public
01000        EMUPMD     EMULATION     48-bit scan length      Private
11000        Boundary   INTEST                                Public
00100        EMUCTL     EMULATION                             Private
10100        EMUPX      EMULATION     48-bit shift register   Private
10110        EMU64PX    EMULATION     64-bit shift register   Private
01100        EMUSTAT    EMULATION                             Private
11100        BRKSTAT    EMULATION                             Private
00010        EMUPC      EMULATION                             Private
10101        MEMTST     TEST          Memory test             Private
All others   Reserved   Reserved                              Private

The entry under “Register” is the serial scan path, either Boundary or Bypass in this case, enabled by the instruction. Figure 10-1 on page 10-6 shows these register paths. The 1-bit Bypass register is fully defined in the 1149.1 specification. For more information on the Boundary register, see “Boundary Register” on page 10-17.

No special values need be written into any register prior to selection of any instruction. As Table 10-2 on page 10-4 shows, certain instructions are reserved for emulator use. For more information, see Table 10-7 on page 10-18.
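The codes in Table 10-2 can be captured in a lookup that test software might use to decide which scan path an instruction selects. The sketch below is illustrative rather than part of any Analog Devices tool: the type and function names are hypothetical, and it assumes the “43210” column is read with bit 4 as the most significant bit. Because bit 0 is nearest TDO, the bit presented on TDI at the first shift clock is bit 0.

```c
#include <stdint.h>
#include <stddef.h>

/* Scan paths named in Table 10-2 (enum names are hypothetical). */
typedef enum {
    REG_BYPASS, REG_BOUNDARY, REG_EMUPMD, REG_EMUCTL, REG_EMUPX,
    REG_EMU64PX, REG_EMUSTAT, REG_BRKSTAT, REG_EMUPC, REG_MEMTST,
    REG_RESERVED
} jtag_reg;

typedef struct { uint8_t code; jtag_reg reg; } jtag_instr;

static const jtag_instr JTAG_TABLE[] = {
    { 0x1F, REG_BYPASS   },  /* 11111 BYPASS  */
    { 0x00, REG_BOUNDARY },  /* 00000 EXTEST  */
    { 0x10, REG_BOUNDARY },  /* 10000 SAMPLE  */
    { 0x08, REG_EMUPMD   },  /* 01000         */
    { 0x18, REG_BOUNDARY },  /* 11000 INTEST  */
    { 0x04, REG_EMUCTL   },  /* 00100         */
    { 0x14, REG_EMUPX    },  /* 10100         */
    { 0x16, REG_EMU64PX  },  /* 10110         */
    { 0x0C, REG_EMUSTAT  },  /* 01100         */
    { 0x1C, REG_BRKSTAT  },  /* 11100         */
    { 0x02, REG_EMUPC    },  /* 00010         */
    { 0x15, REG_MEMTST   },  /* 10101         */
};

/* Returns the scan path selected by a 5-bit instruction code. */
jtag_reg jtag_register_for(uint8_t code) {
    code &= 0x1Fu;
    for (size_t i = 0; i < sizeof JTAG_TABLE / sizeof JTAG_TABLE[0]; i++)
        if (JTAG_TABLE[i].code == code) return JTAG_TABLE[i].reg;
    return REG_RESERVED;  /* all other codes */
}

/* Bit to present on TDI at shift clock 0..4: bit 0 goes in first. */
unsigned jtag_tdi_bit(uint8_t code, unsigned clock) {
    return (code >> (clock % 5u)) & 1u;
}
```

Any code not listed in the table falls through to `REG_RESERVED`, matching the “All others” row.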
EMUPMD Shift Register

The EMUPMD serial shift register is located in the system unit. EMUPMD is 48 bits wide and is accessed by the emulator through the TAP. When the TAP enters the UPDATE state and EMUPMD is selected, a 48-bit slave register is updated from EMUPMD.

EMUPMD’s purpose is to force the ADSP-21160 DSP to execute emulator supplied instructions. The register accomplishes this by driving the instruction bus while in emulation space.

[Figure 10-1. Serial Scan Paths. Shows the boundary register, the 1-bit bypass register, and the 5-bit instruction register connected between TDI and TDO.]

EMUPX Shift Register

The EMUPX serial shift register is located in the system unit. EMUPX is 48 bits wide and is accessed by the emulator through the TAP. When the TAP goes into the UPDATE state and EMUPX is selected, the most significant 48 bits of PX are updated from EMUPX. When the TAP goes into the CAPTURE state and EMUPX is selected, EMUPX is updated with the most significant 48 bits of PX. The EMUPX register is used to transfer data between the emulator and the target system.

EMUPX is provided for backwards compatibility with the SHARC ICE hardware. PX is a 64-bit wide register. To provide compatibility, only the most significant 48 bits of PX are mapped to EMUPX. 48-bit instructions, and 40-bit extended precision data, are always aligned to the most significant bit. When transferring 32-bit data to/from the PX register, PX2 must be specified as the source/destination to ensure that the 32 bits are aligned to the most significant bit.

EMU64PX Shift Register

The EMU64PX serial shift register is located in the system unit. EMU64PX is 64 bits wide and is accessed by the emulator through the TAP. When the TAP goes into the UPDATE state and EMU64PX is selected, PX is updated from EMU64PX.
When the TAP goes into the CAPTURE state and EMU64PX is selected, EMU64PX is updated from PX. The EMU64PX register transfers data between the emulator and the target system. The most significant 48 bits of EMU64PX are redundantly available in EMUPX.

EMUPC Shift Register

The EMUPC serial shift register is located in the system unit. EMUPC is 24 bits wide and is accessed by the emulator through the TAP. The EMUPC register captures addresses from the PC register. This data can be used to statistically profile the user’s code. Addresses cannot be forced into the PC register from the EMUPC register.

EMUCTL Shift Register

The EMUCTL serial shift register is located in the system unit. EMUCTL is 40 bits wide and is accessed by the emulator through the TAP. EMUCTL controls all of the ADSP-21160 emulation functionality. Table 10-3 lists EMUCTL’s bits and describes their functionality.

Table 10-3. EMUCTL (Emulation Control) Register Definition

Bit #   Name       Function
0       EMUENA     Emulator Function Enable. The EMUENA bit enables ADSP-21160 emulation functions. (0=ignore breakpoints and emulator interrupts, 1=respond to breakpoints and emulator interrupts)
1       EIRQENA    Emulator Interrupt Enable. The EIRQENA bit enables the emulation logic to recognize external emulator interrupts. (0=disable, 1=enable)
2       BKSTOP     Enable Autostop on Breakpoint. The BKSTOP bit enables the ADSP-21160 DSP to generate an external emulator interrupt when any breakpoint event occurs. (0=disable, 1=enable)
3       SS         Enable Single Step Mode. The SS bit enables single-step operation. (0=disable, 1=enable)
4       SYSRST     Software Reset of the ADSP-21160 DSP. The SYSRST bit resets the ADSP-21160 DSP in the same manner as the external RESET pin. The SYSRST bit must be cleared by the emulator. (0=normal operation, 1=reset)
5       ENBRKOUT   Enable the BRKOUT pin. The ENBRKOUT bit enables the BRKOUT pin operation. (0=BRKOUT pin at high-impedance state, 1=BRKOUT pin enabled)
6       IOSTOP     Stop IOP DMAs in EMU space. The IOSTOP bit disables all DMA requests when the DSP is in emulation space. Data that is currently in the EP, LINK, or SPORT DMA buffers is held there unless the internal DMA request was already granted. IOSTOP causes incoming data to be held off and outgoing data to cease. Because SPORT receive data cannot be held off, it is lost and the overrun bit is set. The direct write buffer (internal memory write) and the EP pad buffer are allowed to flush any remaining data to internal memory. (0=IO continues, 1=IO stops)
7       EPSTOP     Stop I/O Processor EP operation in emulation space. The EPSTOP bit disables all EP requests when the DSP is in emulation space. After an emulation interrupt is acknowledged, EPSTOP deasserts ACK (deasserts REDY if host access) to prevent further data from being accepted if the EP is accessed. The emulator may clear this bit, allowing I/O to continue and the bus to clear, so that the emulator may use the EP (through BR and bus lock). Note that the EP bus clears only if accesses are direct writes or IOP register writes, because all other IOP functions are halted. The EP bus does not clear if accesses to any of the DMA buffers are extended due to a buffer full or empty condition. (0=EP IO continues, 1=EP IO stops)
8       NEGPA1     Negate program memory data address breakpoint. The NEG bits enable breakpoint events if the address is greater than the end register value OR less than the start register value. This function is useful to detect index range violations in user code. (0=disable breakpoint, 1=enable breakpoint)
9       NEGDA1     Negate data memory address breakpoint #1. For more information, see NEGPA1 bit description on page 10-9.
10      NEGDA2     Negate data memory address breakpoint #2. For more information, see NEGPA1 bit description on page 10-9.
11      NEGIA1     Negate instruction address breakpoint #1. For more information, see NEGPA1 bit description on page 10-9.
12      NEGIA2     Negate instruction address breakpoint #2. For more information, see NEGPA1 bit description on page 10-9.
13      NEGIA3     Negate instruction address breakpoint #3. For more information, see NEGPA1 bit description on page 10-9.
14      NEGIA4     Negate instruction address breakpoint #4. For more information, see NEGPA1 bit description on page 10-9.
15      NEGIO1     Negate I/O address breakpoint. For more information, see NEGPA1 bit description on page 10-9.
16      NEGEP1     Negate EP address breakpoint. For more information, see NEGPA1 bit description on page 10-9.
17      ENBPA      Enable program memory data address breakpoints. The ENB bits enable each breakpoint group. Note that when the ANDBKP bit is set, breakpoint types not involved in the generation of the effective breakpoint must be disabled. (0=disable breakpoints, 1=enable breakpoints)
18      ENBDA      Enable data memory address breakpoints. For more information, see ENBPA bit description on page 10-10.
19      ENBIA      Enable instruction address breakpoints. For more information, see ENBPA bit description on page 10-10.
20      ENBIO      Enable I/O address breakpoint. For more information, see ENBPA bit description on page 10-10.
21      ENBEP      Enable external port address breakpoint. For more information, see ENBPA bit description on page 10-10.
22-23   PA1MODE    PA1 breakpoint triggering mode. The breakpoint triggering mode bits trigger on the following conditions:
                   Mode   Triggering condition
                   00     Breakpoint is disabled
                   01     WRITE accesses only
                   10     READ accesses only
                   11     Any access
24-25   DA1MODE    DA1 breakpoint triggering mode. For more information, see PA1MODE bit description on page 10-10.
26-27   DA2MODE    DA2 breakpoint triggering mode. For more information, see PA1MODE bit description on page 10-10.
28-29   IO1MODE    IO1 breakpoint triggering mode. For more information, see PA1MODE bit description on page 10-10.
30-31   EP1MODE    EP1 breakpoint triggering mode. For more information, see PA1MODE bit description on page 10-10.
32      ANDBKP     AND composite breakpoints. The ANDBKP bit enables AND’ing of each breakpoint type to generate an effective breakpoint from the composite breakpoint signals. (0=OR breakpoint types, 1=AND breakpoint types)
33      RESERVED   The ICSA function and the DMDSEL bit used by that function are not supported on the ADSP-21160 DSP.
34      NOBOOT     No power-up boot on reset. The NOBOOT bit forces the ADSP-21160 DSP into the No boot mode. In this mode, the processor does not boot load, but begins fetching instructions from 0x0080 0004 in external memory. (0=disable, 1=force No boot mode)
35      TMODE      Test mode enable. The TMODE bit is for Analog Devices’ usage only. Do NOT set this bit. (0=normal operation)
36      BHO        Buffer Hang Override bit. The BHO control bit overrides the BHD bit in SYSCON, disabling BHD’s control over core access of data buffer behavior. Note that the default (reset) state of BHD is now set for the ADSP-21160 DSP, a change from ADSP-2106x DSPs. (0=normal BHD operation, 1=override BHD operation)
37      MTST       Memory Test Enable Bit. The MTST bit enables scanning of data to the latches used for memory test. (0=normal operation, 1=enable memory test)
38, 39  Reserved   Reserved

EMUSTAT Shift Register

The EMUSTAT serial shift register is located in the system unit. EMUSTAT is 8 bits wide and is accessed by the emulator through the TAP. This register is updated by the ADSP-21160 DSP when the TAP is in the CAPTURE state.
The emulator reads EMUSTAT to determine the state of the ADSP-21160 DSP. None of the bits in this register can be written by the emulator. All bits are active high. Table 10-4 lists the EMUSTAT register’s bits.

Table 10-4. EMUSTAT (Emulation Status) Register Definition

Bit  Name  Function (If bit=1...)
0  EMUSPACE  Indicates that the next instruction is to be fetched from the emulator.
1  EMUREADY  Indicates that the ADSP-21160 has finished executing the previous emulator instruction.
2  INIDLE  Indicates that the ADSP-21160 was in IDLE prior to the latest emulator interrupt.
3  COMHALT  Indicates a core access to a SPORT or a LINK is hung because of an external device.
4  EPHALT  Indicates a core access to a DMA buffer is hung because of the external port.
5-7  Reserved

BRKSTAT Shift Register

The BRKSTAT serial shift register is located in the system unit. BRKSTAT is 16 bits wide and is accessed by the emulator through the TAP. This register monitors the status of the emulation breakpoints and is updated on every clock cycle. None of the bits of this register can be written by the emulator. Table 10-5 lists the BRKSTAT register’s bits.

A high bit indicates a breakpoint hit. When a breakpoint hit occurs, the register ceases updating. Stopping allows the emulator to see which breakpoint was triggered. When the ADSP-21160 DSP leaves emulation space, the BRKSTAT register is cleared and resumes updating. All status bits are synchronized to TCLK before being scanned out.

Table 10-5. BRKSTAT (Breakpoint Status) Register Definition

Bit #  Name  Function (If bit=1...)
0  STATPA  Program Memory Data breakpoint hit
1  STATDA0  Data Memory breakpoint hit
2  STATDA1  Data Memory breakpoint hit
3  STATIA0  Instruction Address breakpoint hit
4  STATIA1  Instruction Address breakpoint hit
5  STATIA2  Instruction Address breakpoint hit
6  STATIA3  Instruction Address breakpoint hit
7  STATIO  I/O Address breakpoint hit
8  STATEP  EP Address breakpoint hit
9-15  Reserved

MEMTST Shift Register

The MEMTST serial shift register is for Analog Devices’ usage only. Do not attempt to use this register; incorrect usage of this feature can result in permanent damage to the ADSP-21160 DSP being tested.

PSx, DMx, IOx, and EPx (Breakpoint) Registers

The PSx, DMx, IOx, and EPx (Breakpoint) registers are located in the I/O Processor register set. The emulation breakpoint registers are not user accessible and can be written only when the ADSP-21160 DSP is in emulation space or test mode. The breakpoint registers vary in size according to the address type: instruction (24-bit address), data (32-bit address), or I/O data (19-bit address). Table 10-6 shows the sizes.

The ADSP-21160 DSP contains nine sets of emulation breakpoint registers. Each set consists of a start and an end register that describe an address range, with the start register setting the lower end of the address range. Each breakpoint set monitors a particular address bus. When a valid address falls within the address range, a breakpoint signal is generated. The address range includes the start and end addresses.

The nine breakpoint sets are grouped into five types: instruction (IA), DM data (DA), PM data (PA), IO data (IO), and EP data (EP). The individual breakpoint signals in each type are OR’ed together to create five composite breakpoint signals. These composite signals can be optionally AND’ed or OR’ed together to create the effective breakpoint event signal used to generate an emulator interrupt.
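The range, negation, and composite rules described above can be modeled in a few lines of Python. This is a minimal software sketch for illustration only, not emulator code: the BreakpointSet class, the effective_breakpoint function, and the example addresses are hypothetical names and values chosen here, and the READ/WRITE triggering modes are left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BreakpointSet:
    """One start/end register pair plus its NEG bit (hypothetical model)."""
    start: int            # start register: lower end of the range
    end: int              # end register: upper end of the range
    negate: bool = False  # models the matching NEGxxx bit in EMUCTL

    def hit(self, address: int) -> bool:
        # The range includes both the start and the end address.
        in_range = self.start <= address <= self.end
        # With negation, the breakpoint fires outside the range instead.
        return (not in_range) if self.negate else in_range


def effective_breakpoint(groups: Dict[str, List[BreakpointSet]],
                         addresses: Dict[str, int],
                         enabled: Dict[str, bool],
                         and_mode: bool) -> bool:
    """Combine per-type composites the way ANDBKP selects (AND or OR)."""
    composites = []
    for name, sets in groups.items():
        if not enabled.get(name, False):
            continue  # a disabled type takes no part in the result
        # Individual breakpoints of one type are OR'ed together.
        composites.append(any(s.hit(addresses[name]) for s in sets))
    if not composites:
        return False
    return all(composites) if and_mode else any(composites)


# Example with made-up addresses: one IA range and one negated DA range.
groups = {
    "IA": [BreakpointSet(0x040000, 0x0400FF)],
    "DA": [BreakpointSet(0x050000, 0x050FFF, negate=True)],
}
addresses = {"IA": 0x040010, "DA": 0x050010}
enabled = {"IA": True, "DA": True}
print(effective_breakpoint(groups, addresses, enabled, and_mode=False))
print(effective_breakpoint(groups, addresses, enabled, and_mode=True))
```

In this example the instruction address is inside its range, while the data address is inside a negated range and therefore does not hit, so OR'ing the composites yields a breakpoint and AND'ing them does not.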
The ANDBKP bit in the EMUCTL register selects whether the composite signals are AND’ed or OR’ed. Each breakpoint type has an enable bit in the EMUCTL register. When set, these bits add the specified breakpoint type into the generation of the effective breakpoint signal. If cleared, the specified breakpoint type is not used in the generation of the effective breakpoint signal. This allows the user to trigger the effective breakpoint from a subset of the breakpoint types.

To provide further flexibility, each individual breakpoint can be programmed to trigger if the address is in range AND one of these conditions is met: READ access, WRITE access, ANY access, or NO access. The control bits for this feature are also located in EMUCTL. For more information, see PA1MODE bit description on page 10-10.

The address ranges of the emulation breakpoint registers are negated by setting the appropriate negation bits in the EMUCTL register. For more information, see NEGPA1 bit description on page 10-9. Each breakpoint can be disabled by setting the start address larger than the end address.

Four of the breakpoints monitor the instruction address. Two monitor the data memory address. One monitors the program memory data address, one monitors the I/O address bus, and one monitors the EP address bus.

The instruction address breakpoints monitor the address of the instruction being executed, not the address of the instruction being fetched. If the current execution is aborted, the breakpoint signal does not occur even if the address is in range. Data address breakpoints (DA and PA only) are also ignored during aborted instructions. The nine breakpoint sets appear in Table 10-6.

Table 10-6. PSx, DMx, IOx, and EPx (Breakpoint) Registers

Register  Function  Group [1]
PSA1S  Instruction Address Start #1  IA
PSA1E  Instruction Address End #1  IA
PSA2S  Instruction Address Start #2  IA
PSA2E  Instruction Address End #2  IA
PSA3S  Instruction Address Start #3  IA
PSA3E  Instruction Address End #3  IA
PSA4S  Instruction Address Start #4  IA
PSA4E  Instruction Address End #4  IA
DMA1S  Data Address Start #1  DA
DMA1E  Data Address End #1  DA
DMA2S  Data Address Start #2  DA
DMA2E  Data Address End #2  DA
PMDAS  Program Data Address Start  PA
PMDAE  Program Data Address End  PA
IOAS  I/O Address Start  IO
IOAE  I/O Address End  IO
EPAS  External Port Address Start  EP
EPAE  External Port Address End  EP

[1] Group IA=24-bit addresses; Groups DA, PA, and EP=32-bit addresses; Group IO=19-bit addresses.

EMUN Register

The EMUN (Nth event counter) register is located in the I/O Processor register set. EMUN is not writable by user code: it is read-only from normal space and can be written only when the ADSP-21160 DSP is in emulation space.

The Nth event counter allows an emulation breakpoint to occur on the Nth occurrence of the breakpoint event. This is accomplished by writing the desired Nth value to the EMUN register in UREG space. The counter decrements on each occurrence of the breakpoint event, asserting the interrupt when the counter is equal to zero and the hardware breakpoint event occurs.

EMUCLK and EMUCLK2 Registers

The EMUCLK (clock counter) and EMUCLK2 (clock counter scaling) registers are located in the universal (UREG) register set.
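As a rough sketch of how the two registers combine into one count: EMUCLK2 increments each time the 32-bit EMUCLK value rolls over to zero, so the total cycle count is EMUCLK2 × 2^32 + EMUCLK. The function names and the 100 MHz clock rate below are assumptions made for this illustration, and the sketch assumes each pair of register values was read as a consistent snapshot.

```python
from typing import Tuple


def total_cycles(emuclk: int, emuclk2: int) -> int:
    """Combine the two 32-bit register values into one cycle count."""
    if not (0 <= emuclk < 2**32 and 0 <= emuclk2 < 2**32):
        raise ValueError("EMUCLK and EMUCLK2 are 32-bit registers")
    # EMUCLK2 counts the number of times EMUCLK rolled over to zero.
    return (emuclk2 << 32) | emuclk


def elapsed_seconds(start: Tuple[int, int],
                    stop: Tuple[int, int],
                    clock_hz: float) -> float:
    """Time between two (EMUCLK, EMUCLK2) snapshots at a given clock rate."""
    return (total_cycles(*stop) - total_cycles(*start)) / clock_hz


# Hypothetical snapshots taken before and after a section of code;
# EMUCLK rolled over once in between. The 100 MHz rate is an assumption.
before = (0xFFFFFFF0, 0)
after = (0x00000010, 1)
print(total_cycles(*after) - total_cycles(*before))
print(elapsed_seconds(before, after, clock_hz=100e6))
```

Treating the pair as a single 64-bit quantity keeps the subtraction correct across an EMUCLK rollover, which is the case the example snapshots exercise.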
EMUCLK and EMUCLK2 are not writable by user code: they are read-only from normal space and can be written only when the ADSP-21160 DSP is in emulation space.

The Emulation Clock Counter consists of a 32-bit count register (EMUCLK) and a 32-bit scaling register (EMUCLK2). EMUCLK counts clock cycles while the user has control of the ADSP-21160 DSP and stops counting when the emulator gains control. These registers let you gauge the amount of time spent executing a particular section of code. The EMUCLK2 register extends the time EMUCLK can count by incrementing each time the EMUCLK value rolls over to zero. The combined emulation clock counter can count accurately for thousands of hours.

EMUIDLE Instruction

The EMUIDLE instruction places the ADSP-21160 DSP in the IDLE state and triggers an emulator interrupt. This lets you use the EMUIDLE instruction as a software breakpoint. When EMUIDLE is executed, the emulation clock counter immediately halts.

In-Circuit Signal Analyzer (ICSA) Function

This function is NOT supported in the ADSP-21160 DSP.

Boundary Register

The Boundary register is 655 bits long. Table 10-7 defines the latch type and function of each position in the scan path. The positions are numbered with 0 being the first bit output (closest to TDO) and 654 being the last (closest to TDI).

Notes on Boundary registers:
• Scan position 0 (L0DAT(0)) is the end closest to TDO (scan in first)
• Scan position 654 (SPARE) is the end closest to TDI (scan in last)
• Output Enables:
  • 1 = Drive the associated signals during the EXTEST and INTEST instructions
  • 0 = Three-state the associated signals during the EXTEST and INTEST instructions
• CLKIN can be sampled but not controlled (read-only). CLKIN continues to clock the ADSP-21160 DSP no matter which instruction is enabled.

Table 10-7. JTAG Boundary Register

Scan #  Signal Name  Latch Type
0  L0DAT(0)  Output
1  L0DAT(0)  Output enable
2  L0DAT(0)  Input
3  L0DAT(1)  Output
4  L0DAT(1)  Output enable
5  L0DAT(1)  Input
6  L0DAT(2)  Output
7  L0DAT(2)  Output enable
8  L0DAT(2)  Input
9  L0DAT(3)  Output
10  L0DAT(3)  Output enable
11  L0DAT(3)  Input
12  L0ACK  Output
13  L0ACK  Output enable
14  L0ACK  Input
15  L0CLK  Output
16  L0CLK  Output enable
17  L0CLK  Input
18  L0DAT(4)  Output
19  L0DAT(4)  Output enable
20  L0DAT(4)  Input
21  L0DAT(5)  Output
22  L0DAT(5)  Output enable
23  L0DAT(5)  Input
24  L0DAT(6)  Output
25  L0DAT(6)  Output enable
26  L0DAT(6)  Input
27  L0DAT(7)  Output
28  L0DAT(7)  Output enable
29  L0DAT(7)  Input
30  DT0  Output
31  DT0  Output enable
32  DT0  No function
33  TCLK0  Output
34  TCLK0  Output enable
35  TCLK0  Input
36  TFS0  Output
37  TFS0  Output enable
38  TFS0  Input
39  RFS0  Output
40  RFS0  Output enable
41  RFS0  Input
42  RCLK0  Output
43  RCLK0  Output enable
44  RCLK0  Input
45  DR0  Output
46  DR0  Output enable
47  DR0  Input
48  DR1  Output
49  DR1  Output enable
50  DR1  Input
51  RCLK1  Output
52  RCLK1  Output enable
53  RCLK1  Input
54  RFS1  Output
55  RFS1  Output enable
56  RFS1  Input
57  TFS1  Output
58  TFS1  Output enable
59  TFS1  Input
60  TCLK1  Output
61  TCLK1  Output enable
62  TCLK1  Input
63  DT1  Output
64  DT1  Output enable
65  DT1  No function
66  TIMEXP  Output
67  TIMEXP  No function
68  TIMEXP  No function
69  FLAG0  Output
70  FLAG0  Output enable
71  FLAG0  Input
72  FLAG1  Output
73  FLAG1  Output enable
74  FLAG1  Input
75  FLAG2  Output
76  FLAG2  Output enable
77  FLAG2  Input
78  FLAG3  Output
79  FLAG3  Output enable
80  FLAG3  Input
81  IRQ0_B  No function
82  IRQ0_B  No function
83  IRQ0_B  Input
84  IRQ1_B  No function
85  IRQ1_B  No function
86  IRQ1_B  Input
87  IRQ2_B  No function
88  IRQ2_B  No function
89  IRQ2_B  Input
90  RPBA  No function
91  RPBA  No function
92  RPBA  Input
93  RESET_B  No function
94  RESET_B  No function
95  RESET_B  Input
96  EMU_B  Output
97  EMU_B  Output enable
98  EMU_B  No function
99  DATA(0)  Output
100  DATA(0)  Output enable
101  DATA(0)  Input
102  DATA(1)  Output
103  DATA(1)  Output enable
104  DATA(1)  Input
105  DATA(2)  Output
106  DATA(2)  Output enable
107  DATA(2)  Input
108  DATA(3)  Output
109  DATA(3)  Output enable
110  DATA(3)  Input
111  DATA(4)  Output
112  DATA(4)  Output enable
113  DATA(4)  Input
114  DATA(5)  Output
115  DATA(5)  Output enable
116  DATA(5)  Input
117  DATA(6)  Output
118  DATA(6)  Output enable
119  DATA(6)  Input
120  DATA(7)  Output
121  DATA(7)  Output enable
122  DATA(7)  Input
123  DATA(8)  Output
124  DATA(8)  Output enable
125  DATA(8)  Input
126  DATA(9)  Output
127  DATA(9)  Output enable
128  DATA(9)  Input
129  DATA(10)  Output
130  DATA(10)  Output enable
131  DATA(10)  Input
132  DATA(11)  Output
133  DATA(11)  Output enable
134  DATA(11)  Input
135  DATA(12)  Output
136  DATA(12)  Output enable
137  DATA(12)  Input
138  DATA(13)  Output
139  DATA(13)  Output enable
140  DATA(13)  Input
141  DATA(14)  Output
142  DATA(14)  Output enable
143  DATA(14)  Input
144  DATA(15)  Output
145  DATA(15)  Output enable
146  DATA(15)  Input
147  DATA(16)  Output
148  DATA(16)  Output enable
149  DATA(16)  Input
150  DATA(17)  Output
151  DATA(17)  Output enable
152  DATA(17)  Input
153  DATA(18)  Output
154  DATA(18)  Output enable
155  DATA(18)  Input
156  DATA(19)  Output
157  DATA(19)  Output enable
158  DATA(19)  Input
159  DATA(20)  Output
160  DATA(20)  Output enable
161  DATA(20)  Input
162  DATA(21)  Output
163  DATA(21)  Output enable
164  DATA(21)  Input
165  DATA(22)  Output
166  DATA(22)  Output enable
167  DATA(22)  Input
168  DATA(23)  Output
169  DATA(23)  Output enable
170  DATA(23)  Input
171  DATA(24)  Output
172  DATA(24)  Output enable
173  DATA(24)  Input
174  DATA(25)  Output
175  DATA(25)  Output enable
176  DATA(25)  Input
177  DATA(26)  Output
178  DATA(26)  Output enable
179  DATA(26)  Input
180  DATA(27)  Output
181  DATA(27)  Output enable
182  DATA(27)  Input
183  DATA(28)  Output
184  DATA(28)  Output enable
185  DATA(28)  Input
186  DATA(29)  Output
187  DATA(29)  Output enable
188  DATA(29)  Input
189  DATA(30)  Output
190  DATA(30)  Output enable
191  DATA(30)  Input
192  DATA(31)  Output
193  DATA(31)  Output enable
194  DATA(31)  Input
195  DATA(32)  Output
196  DATA(32)  Output enable
197  DATA(32)  Input
198  DATA(33)  Output
199  DATA(33)  Output enable
200  DATA(33)  Input
201  DATA(34)  Output
202  DATA(34)  Output enable
203  DATA(34)  Input
204  DATA(35)  Output
205  DATA(35)  Output enable
206  DATA(35)  Input
207  DATA(36)  Output
208  DATA(36)  Output enable
209  DATA(36)  Input
210  DATA(37)  Output
211  DATA(37)  Output enable
212  DATA(37)  Input
213  DATA(38)  Output
214  DATA(38)  Output enable
215  DATA(38)  Input
216  DATA(39)  Output
217  DATA(39)  Output enable
218  DATA(39)  Input
219  DATA(40)  Output
220  DATA(40)  Output enable
221  DATA(40)  Input
222  DATA(41)  Output
223  DATA(41)  Output enable
224  DATA(41)  Input
225  DATA(42)  Output
226  DATA(42)  Output enable
227  DATA(42)  Input
228  DATA(43)  Output
229  DATA(43)  Output enable
230  DATA(43)  Input
231  DATA(44)  Output
232  DATA(44)  Output enable
233  DATA(44)  Input
234  DATA(45)  Output
235  DATA(45)  Output enable
236  DATA(45)  Input
237  DATA(46)  Output
238  DATA(46)  Output enable
239  DATA(46)  Input
240  DATA(47)  Output
241  DATA(47)  Output enable
242  DATA(47)  Input
243  CLK_CFG0  No function
244  CLK_CFG0  No function
245  CLK_CFG0  Input
246  CLK_CFG1  No function
247  CLK_CFG1  No function
248  CLK_CFG1  Input
249  CLK_CFG2  No function
250  CLK_CFG2  No function
251  CLK_CFG2  Input
252  CLK_CFG3  No function
253  CLK_CFG3  No function
254  CLK_CFG3  Input
255  CLKOUT  Output
256  CLKOUT  Output enable
257  CLKOUT  No function
258  DATA(48)  Output
259  DATA(48)  Output enable
260  DATA(48)  Input
261  DATA(49)  Output
262  DATA(49)  Output enable
263  DATA(49)  Input
264  DATA(50)  Output
265  DATA(50)  Output enable
266  DATA(50)  Input
267  DATA(51)  Output
268  DATA(51)  Output enable
269  DATA(51)  Input
270  DATA(52)  Output
271  DATA(52)  Output enable
272  DATA(52)  Input
273  DATA(53)  Output
274  DATA(53)  Output enable
275  DATA(53)  Input
276  DATA(54)  Output
277  DATA(54)  Output enable
278  DATA(54)  Input
279  DATA(55)  Output
280  DATA(55)  Output enable
281  DATA(55)  Input
282  DATA(56)  Output
283  DATA(56)  Output enable
284  DATA(56)  Input
285  DATA(57)  Output
286  DATA(57)  Output enable
287  DATA(57)  Input
288  DATA(58)  Output
289  DATA(58)  Output enable
290  DATA(58)  Input
291  DATA(59)  Output
292  DATA(59)  Output enable
293  DATA(59)  Input
294  DATA(60)  Output
295  DATA(60)  Output enable
296  DATA(60)  Input
297  DATA(61)  Output
298  DATA(61)  Output enable
299  DATA(61)  Input
300  DATA(62)  Output
301  DATA(62)  Output enable
302  DATA(62)  Input
303  DATA(63)  Output
304  DATA(63)  Output enable
305  DATA(63)  Input
306  ADDR(2)  Output
307  ADDR(2)  Output enable
308  ADDR(2)  Input
309  ADDR(3)  Output
310  ADDR(3)  Output enable
311  ADDR(3)  Input
312  ADDR(4)  Output
313  ADDR(4)  Output enable
314  ADDR(4)  Input
315  ADDR(5)  Output
316  ADDR(5)  Output enable
317  ADDR(5)  Input
318  ADDR(6)  Output
319  ADDR(6)  Output enable
320  ADDR(6)  Input
321  ADDR(7)  Output
322  ADDR(7)  Output enable
323  ADDR(7)  Input
324  ADDR(8)  Output
325  ADDR(8)  Output enable
326  ADDR(8)  Input
327  ADDR(9)  Output
328  ADDR(9)  Output enable
329  ADDR(9)  Input
330  ADDR(10)  Output
331  ADDR(10)  Output enable
332  ADDR(10)  Input
333  ADDR(11)  Output
334  ADDR(11)  Output enable
335  ADDR(11)  Input
336  ADDR(12)  Output
337  ADDR(12)  Output enable
338  ADDR(12)  Input
339  ADDR(13)  Output
340  ADDR(13)  Output enable
341  ADDR(13)  Input
342  ADDR(14)  Output
343  ADDR(14)  Output enable
344  ADDR(14)  Input
345  ADDR(15)  Output
346  ADDR(15)  Output enable
347  ADDR(15)  Input
348  ADDR(16)  Output
349  ADDR(16)  Output enable
350  ADDR(16)  Input
351  ADDR(17)  Output
352  ADDR(17)  Output enable
353  ADDR(17)  Input
354  ADDR(18)  Output
355  ADDR(18)  Output enable
356  ADDR(18)  Input
357  ADDR(19)  Output
358  ADDR(19)  Output enable
359  ADDR(19)  Input
360  ADDR(20)  Output
361  ADDR(20)  Output enable
362  ADDR(20)  Input
363  ADDR(21)  Output
364  ADDR(21)  Output enable
365  ADDR(21)  Input
366  ADDR(22)  Output
367  ADDR(22)  Output enable
368  ADDR(22)  Input
369  ADDR(23)  Output
370  ADDR(23)  Output enable
371  ADDR(23)  Input
372  ADDR(24)  Output
373  ADDR(24)  Output enable
374  ADDR(24)  Input
375  ADDR(25)  Output
376  ADDR(25)  Output enable
377  ADDR(25)  Input
378  ADDR(26)  Output
379  ADDR(26)  Output enable
380  ADDR(26)  Input
381  ADDR(27)  Output
382  ADDR(27)  Output enable
383  ADDR(27)  Input
384  ADDR(28)  Output
385  ADDR(28)  Output enable
386  ADDR(28)  Input
387  ADDR(29)  Output
388  ADDR(29)  Output enable
389  ADDR(29)  Input
390  ADDR(30)  Output
391  ADDR(30)  Output enable
392  ADDR(30)  Input
393  ADDR(31)  Output
394  ADDR(31)  Output enable
395  ADDR(31)  Input
396  ID0  No function
397  ID0  No function
398  ID0  Input
399  ID1  No function
400  ID1  No function
401  ID1  Input
402  ID2  No function
403  ID2  No function
404  ID2  Input
405  ADDR(0)  Output
406  ADDR(0)  Output enable
407  ADDR(0)  Input
408  ADDR(1)  Output
409  ADDR(1)  Output enable
410  ADDR(1)  Input
411  BRST  Output
412  BRST  Output enable
413  BRST  Input
414  BMS_B  Output
415  BMS_B  Output enable
416  BMS_B  Input
417  MS_B(0)  Output
418  MS_B(0)  Output enable
419  MS_B(0)  No function
420  MS_B(1)  Output
421  MS_B(1)  Output enable
422  MS_B(1)  No function
423  MS_B(2)  Output
424  MS_B(2)  Output enable
425  MS_B(2)  No function
426  MS_B(3)  Output
427  MS_B(3)  Output enable
428  MS_B(3)  No function
429  CIF_B  Output
430  CIF_B  Output enable
431  CIF_B  No function
432  CS_B  No function
433  CS_B  No function
434  CS_B  Input
435  WRH_B  Output
436  WRH_B  Output enable
437  WRH_B  Input
438  WRL_B  Output
439  WRL_B  Output enable
440  WRL_B  Input
441  RDH_B  Output
442  RDH_B  Output enable
443  RDH_B  Input
444  RDL_B  Output
445  RDL_B  Output enable
446  RDL_B  Input
447  DMAG1_B  Output
448  DMAG1_B  Output enable
449  DMAG1_B  Input
450  DMAG2_B  Output
451  DMAG2_B  Output enable
452  DMAG2_B  Input
453  DMAR1_B  Output
454  DMAR1_B  Output enable
455  DMAR1_B  Input
456  DMAR2_B  Output
457  DMAR2_B  Output enable
458  DMAR2_B  Input
459  LBOOT  No function
460  LBOOT  No function
461  LBOOT  Input
462  EBOOT  No function
463  EBOOT  No function
464  EBOOT  Input
465  L5DAT(0)  Output
466  L5DAT(0)  Output enable
467  L5DAT(0)  Input
468  L5DAT(1)  Output
469  L5DAT(1)  Output enable
470  L5DAT(1)  Input
471  L5DAT(2)  Output
472  L5DAT(2)  Output enable
473  L5DAT(2)  Input
474  L5DAT(3)  Output
475  L5DAT(3)  Output enable
476  L5DAT(3)  Input
477  L5ACK  Output
478  L5ACK  Output enable
479  L5ACK  Input
480  L5CLK  Output
481  L5CLK  Output enable
482  L5CLK  Input
483  L5DAT(4)  Output
484  L5DAT(4)  Output enable
485  L5DAT(4)  Input
486  L5DAT(5)  Output
487  L5DAT(5)  Output enable
488  L5DAT(5)  Input
489  L5DAT(6)  Output
490  L5DAT(6)  Output enable
491  L5DAT(6)  Input
492  L5DAT(7)  Output
493  L5DAT(7)  Output enable
494  L5DAT(7)  Input
495  L4DAT(0)  Output
496  L4DAT(0)  Output enable
497  L4DAT(0)  Input
498  L4DAT(1)  Output
499  L4DAT(1)  Output enable
500  L4DAT(1)  Input
501  L4DAT(2)  Output
502  L4DAT(2)  Output enable
503  L4DAT(2)  Input
504  L4DAT(3)  Output
505  L4DAT(3)  Output enable
506  L4DAT(3)  Input
507  L4ACK  Output
508  L4ACK  Output enable
509  L4ACK  Input
510  L4CLK  Output
511  L4CLK  Output enable
512  L4CLK  Input
513  L4DAT(4)  Output
514  L4DAT(4)  Output enable
515  L4DAT(4)  Input
516  L4DAT(5)  Output
517  L4DAT(5)  Output enable
518  L4DAT(5)  Input
519  L4DAT(6)  Output
520  L4DAT(6)  Output enable
521  L4DAT(6)  Input
522  L4DAT(7)  Output
523  L4DAT(7)  Output enable
524  L4DAT(7)  Input
525  L3DAT(0)  Output
526  L3DAT(0)  Output enable
527  L3DAT(0)  Input
528  L3DAT(1)  Output
529  L3DAT(1)  Output enable
530  L3DAT(1)  Input
531  L3DAT(2)  Output
532  L3DAT(2)  Output enable
533  L3DAT(2)  Input
534  L3DAT(3)  Output
535  L3DAT(3)  Output enable
536  L3DAT(3)  Input
537  L3ACK  Output
538  L3ACK  Output enable
539  L3ACK  Input
540  L3CLK  Output
541  L3CLK  Output enable
542  L3CLK  Input
543  L3DAT(4)  Output
544  L3DAT(4)  Output enable
545  L3DAT(4)  Input
546  L3DAT(5)  Output
547  L3DAT(5)  Output enable
548  L3DAT(5)  Input
549  L3DAT(6)  Output
550  L3DAT(6)  Output enable
551  L3DAT(6)  Input
552  L3DAT(7)  Output
553  L3DAT(7)  Output enable
554  L3DAT(7)  Input
555  SBTS_B  No function
556  SBTS_B  No function
557  SBTS_B  Input
558  PAGE  Output
559  PAGE  Output enable
560  PAGE  No function
561  PA_B  Output
562  PA_B  Output enable
563  PA_B  Input
564  REDY  Output
565  REDY  Output enable
566  REDY  No function
567  ACK  Output
568  ACK  Output enable
569  ACK  Input
570  BR_B(1)  Output
571  BR_B(1)  Output enable
572  BR_B(1)  Input
573  BR_B(2)  Output
574  BR_B(2)  Output enable
575  BR_B(2)  Input
576  BR_B(3)  Output
577  BR_B(3)  Output enable
578  BR_B(3)  Input
579  BR_B(4)  Output
580  BR_B(4)  Output enable
581  BR_B(4)  Input
582  BR_B(5)  Output
583  BR_B(5)  Output enable
584  BR_B(5)  Input
585  BR_B(6)  Output
586  BR_B(6)  Output enable
587  BR_B(6)  Input
588  HBR_B  No function
589  HBR_B  No function
590  HBR_B  Input
591  HBG_B  Output
592  HBG_B  Output enable
593  HBG_B  Input
594  L2DAT(0)  Output
595  L2DAT(0)  Output enable
596  L2DAT(0)  Input
597  L2DAT(1)  Output
598  L2DAT(1)  Output enable
599  L2DAT(1)  Input
600  L2DAT(2)  Output
601  L2DAT(2)  Output enable
602  L2DAT(2)  Input
603  L2DAT(3)  Output
604  L2DAT(3)  Output enable
605  L2DAT(3)  Input
606  L2ACK  Output
607  L2ACK  Output enable
608  L2ACK  Input
609  L2CLK  Output
610  L2CLK  Output enable
611  L2CLK  Input
612  L2DAT(4)  Output
613  L2DAT(4)  Output enable
614  L2DAT(4)  Input
615  L2DAT(5)  Output
616  L2DAT(5)  Output enable
617  L2DAT(5)  Input
618  L2DAT(6)  Output
619  L2DAT(6)  Output enable
620  L2DAT(6)  Input
621  L2DAT(7)  Output
622  L2DAT(7)  Output enable
623  L2DAT(7)  Input
624  L1DAT(0)  Output
625  L1DAT(0)  Output enable
626  L1DAT(0)  Input
627  L1DAT(1)  Output
628  L1DAT(1)  Output enable
629  L1DAT(1)  Input
630  L1DAT(2)  Output
631  L1DAT(2)  Output enable
632  L1DAT(2)  Input
633  L1DAT(3)  Output
634  L1DAT(3)  Output enable
635  L1DAT(3)  Input
636  L1ACK  Output
637  L1ACK  Output enable
638  L1ACK  Input
639  L1CLK  Output
640  L1CLK  Output enable
641  L1CLK  Input
642  L1DAT(4)  Output
643  L1DAT(4)  Output enable
644  L1DAT(4)  Input
645  L1DAT(5)  Output
646  L1DAT(5)  Output enable
647  L1DAT(5)  Input
648  L1DAT(6)  Output
649  L1DAT(6)  Output enable
650  L1DAT(6)  Input
651  L1DAT(7)  Output
652  L1DAT(7)  Output enable
653  L1DAT(7)  Input
654  SPARE  No function

Device Identification Register

No device identification register is included in the ADSP-21160 DSP.

Built-in Self-test Operation (BIST)

No self-test functions are supported by the ADSP-21160 DSP.

Private Instructions

Table 10-2 on page 10-4 lists the private instructions that are reserved for emulation and memory test. The ADSP-21160 EZ-ICE emulator uses the TAP and boundary scan as a way to access the processor in the target system.
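The boundary register layout in Table 10-7 places the cells of each signal at consecutive scan positions (output, output enable, input). The following Python sketch, using only a three-signal excerpt of the table, shows one way test software might index a 655-entry bit list by scan position. The names CELLS, drive_pins, and read_pin are hypothetical, and the sketch assumes the list is indexed by scan position (index 0 nearest TDO); it does not perform any TAP shifting.

```python
BOUNDARY_LENGTH = 655  # scan positions 0 through 654

# Excerpt of Table 10-7: signal -> (output, output enable, input) positions.
# None marks a cell listed as "No function".
CELLS = {
    "FLAG0": (69, 70, 71),
    "FLAG1": (72, 73, 74),
    "IRQ0_B": (None, None, 83),
}


def drive_pins(levels):
    """Build a bit list that drives the named pins to the given levels."""
    # All output enables start at 0, which three-states the signals.
    bits = [0] * BOUNDARY_LENGTH
    for signal, level in levels.items():
        out_pos, oe_pos, _ = CELLS[signal]
        if out_pos is None or oe_pos is None:
            raise ValueError(f"{signal} has no output cell")
        bits[out_pos] = level & 1
        bits[oe_pos] = 1  # 1 = drive the signal during EXTEST and INTEST
    return bits


def read_pin(captured, signal):
    """Return the value of a signal's input cell from a captured bit list."""
    in_pos = CELLS[signal][2]
    if in_pos is None:
        raise ValueError(f"{signal} has no input cell")
    return captured[in_pos]


vector = drive_pins({"FLAG0": 1})
print(vector[69], vector[70], vector[73])  # FLAG0 out, FLAG0 enable, FLAG1 enable
```

Keeping the table excerpt as data rather than hard-coded indices means the remaining rows of Table 10-7 could be added without changing either function.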
The EZ-ICE emulator requires a target board connector for access to the TAP. For more information, see “Designing for JTAG Emulation” on page 11-24.

References

• IEEE Standard 1149.1-1990. Standard Test Access Port and Boundary-Scan Architecture. To order a copy, contact IEEE at 1-800-678-IEEE.
• Maunder, C.M. and R. Tulloss. Test Access Ports and Boundary Scan Architectures. IEEE Computer Society Press, 1991.
• Parker, Kenneth. The Boundary Scan Handbook. Kluwer Academic Press, 1992.
• Bleeker, Harry, P. van den Eijnden, and F. de Jong. Boundary-Scan Test: A Practical Approach. Kluwer Academic Press, 1993.
• Hewlett-Packard Co. HP Boundary-Scan Tutorial and BSDL Reference Guide. (HP part# E1017-90001.) 1992.

11 SYSTEM DESIGN

The DSP supports many system design options. The options implemented in a system are influenced by cost, performance, and system requirements. This chapter provides the following system design information:

• “DSP Pin Descriptions” on page 11-2
• “Designing for JTAG Emulation” on page 11-24
• “Conditioning Input Signals” on page 11-34
• “Designing For High Frequency Operation” on page 11-36
• “Booting Single and Multiple Processors” on page 11-48

Other chapters also discuss system design issues. Some other locations for system design information include:

• “Setting External Port Modes” on page 7-2
• “Setting Link Port Modes” on page 8-5
• “Setting Serial Port Modes” on page 9-6

DSP Pin Descriptions

This section describes the pins of the DSP and shows how these signals can be used in a DSP system. Figure 11-1 illustrates how the pins are used in a single-processor system. Figure 7-26 on page 7-93 shows a system diagram illustrating pin connections in a DSP multiprocessor cluster.
Figure 11-1. Single-Processor DSP System (block diagram: the ADSP-2116x connected to the clock, optional link devices (6 max), optional serial devices, an optional boot EPROM, optional memory and peripherals, an optional DMA device, an optional host processor interface, and JTAG)

DSP pin definitions are listed in Table 11-1. The symbols that appear in the Type column are defined after the table.

Table 11-1. Pin Descriptions

Pin (Type): Function

ADDR31-0 (I/O/T): External Bus Address. The DSP outputs addresses for external memory and peripherals on these pins. In a multiprocessor system, the bus master outputs addresses for read/writes of the internal memory or I/O processor registers of other DSPs. The DSP inputs addresses when a host processor or multiprocessing bus master is reading or writing its internal memory or I/O processor registers. A keeper latch on the DSP’s ADDR31-0 pins maintains the input at the level it was last driven (only enabled on the DSP with ID2-0=00x).

DATA63-0 (I/O/T): External Bus Data. The DSP inputs and outputs data and instructions on these pins. Pull-up resistors on unused DATA pins are not necessary. A keeper latch on the DSP’s DATA63-0 pins maintains the input at the level it was last driven (only enabled on the DSP with ID2-0=00x).

MS3-0 (O/T): Memory Select Lines. These outputs are asserted (low) as chip selects for the corresponding banks of external memory. Memory bank size must be defined in the SYSCON control register. The MS3-0 outputs are decoded memory address lines.
In asynchronous access mode, the MS3-0 outputs transition with the other address outputs. In synchronous access modes, the MS3-0 outputs assert with the other address lines. They deassert after the first CLKIN cycle in which ACK is sampled asserted. MS3-0 has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

CIF (O): Core Instruction Fetch. Signal is active low when an external instruction fetch is performed. Driven by bus master only. Three-state when host is bus master. CIF has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

RDL (I/O/T): Memory Read Low Strobe. RDL is asserted whenever the DSP reads from the low word of external memory or from the internal memory of other DSPs. External devices, including other DSPs, must assert RDL for reading from the low word of DSP internal memory. In a multiprocessing system, RDL is driven by the bus master. RDL has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

RDH (I/O/T): Memory Read High Strobe. RDH is asserted whenever the DSP reads from the high word of external memory or from the internal memory of other DSPs. External devices, including other DSPs, must assert RDH for reading from the high word of DSP internal memory. In a multiprocessing system, RDH is driven by the bus master. RDH has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

WRL (I/O/T): Memory Write Low Strobe. WRL is asserted when the DSP writes to the low word of external memory or internal memory of other DSPs. External devices must assert WRL for writing to the DSP's low word of internal memory. In a multiprocessing system, WRL is driven by the bus master. WRL has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

WRH (I/O/T): Memory Write High Strobe.
WRH is asserted when the DSP writes to the high word of external memory or internal memory of other DSPs. External devices must assert WRH for writing to the DSP's high word of internal memory. In a multiprocessing system, WRH is driven by the bus master. WRH has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

BRST (I/O/T): Sequential burst access. BRST is asserted by the DSP or a host to indicate that data associated with consecutive addresses is being read or written. A slave device samples the initial address and increments an internal address counter after each transfer. The incremented address is not pipelined on the bus. If the burst access is a read from the DSP by a host, the DSP increments the address automatically as long as BRST is asserted. BRST is asserted after the initial access of a burst transfer. It is asserted for every cycle after that, except for the last data request cycle (denoted by RDx or WRx asserted and BRST negated). A keeper latch on the DSP’s BRST pin maintains the input at the level it was last driven (only enabled on the DSP with ID2-0=00x).

PAGE (O/T): DRAM Page Boundary. The DSP asserts this pin to signal that an external DRAM page boundary has been crossed. DRAM page size must be defined in the DSP's memory control register (WAIT). DRAM can only be implemented in external memory Bank 0. The PAGE signal can only be activated for Bank 0 accesses. In a multiprocessing system, PAGE is output by the bus master. A keeper latch on the DSP’s PAGE pin maintains the output at the level it was last driven (only enabled on the DSP with ID2-0=00x).

ACK (I/O/S): Memory Acknowledge. External devices can deassert ACK (low) to add wait states to an external memory access. ACK is used by I/O devices, memory controllers, or other peripherals to hold off completion of an external memory access.
The DSP deasserts ACK as an output to add wait states to a synchronous access of its internal memory. ACK has a 2 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

SBTS (I/S): Suspend Bus and Three-State. External devices can assert SBTS (low) to place the external bus address, data, selects, and strobes in a high impedance state for the following cycle. If the DSP attempts to access external memory while SBTS is asserted, the processor halts and the memory access does not complete until SBTS is deasserted. SBTS should only be used to recover from host processor/DSP deadlock or used with a DRAM controller.

IRQ2-0 (I/A): Interrupt Request Lines. These are sampled on the rising edge of CLKIN and may be either edge-triggered or level-sensitive.

FLAG3-0 (I/O/A): Flag Pins. Each is configured via control bits as either an input or output. As an input, it can be tested as a condition. As an output, it can be used to signal external peripherals.

TIMEXP (O): Timer Expired. Asserted for four core clock cycles when the timer is enabled and TCOUNT decrements to zero.

HBR (I/A): Host Bus Request. Must be asserted by a host processor to request control of the DSP's external bus. When HBR is asserted in a multiprocessing system, the DSP that is bus master relinquishes the bus and asserts HBG. To relinquish the bus, the DSP places the address, data, select, and strobe lines in a high impedance state. HBR has priority over all DSP bus requests (BR6-1) in a multiprocessing system.

HBG (I/O): Host Bus Grant. Acknowledges an HBR bus request, indicating that the host processor may take control of the external bus. HBG is asserted (held low) by the ADSP-21160 until HBR is released. In a multiprocessing system, HBG is output by the DSP bus master and is monitored by all others. After HBR is asserted and before HBG is deasserted, HBG will float for one CLKIN cycle. To avoid erroneous grants, HBG should be pulled up with a 20 kΩ to 50 kΩ external resistor.
CS (I/A): Chip Select. Asserted by host processor to select the DSP.

REDY (O (o/d)): Host Bus Acknowledge. The DSP deasserts REDY (low) to add wait states to a host access when CS and HBR inputs are asserted.

DMAR1 (I/A): DMA Request 1 (DMA Channel 11). Asserted by external port devices to request DMA services. DMAR1 has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

DMAR2 (I/A): DMA Request 2 (DMA Channel 12). Asserted by external port devices to request DMA services. DMAR2 has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

DMAG1 (O/T): DMA Grant 1 (DMA Channel 11). Asserted by the DSP to indicate that the requested DMA starts on the next cycle. Driven by bus master only. DMAG1 has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

DMAG2 (O/T): DMA Grant 2 (DMA Channel 12). Asserted by the DSP to indicate that the requested DMA starts on the next cycle. Driven by bus master only. DMAG2 has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

BR6-1 (I/O/S): Multiprocessing Bus Requests. Used by multiprocessing DSPs to arbitrate for bus mastership. Each DSP only drives its own BRx line (corresponding to the value of its ID2-0 inputs) and monitors all others. In a multiprocessor system with fewer than six DSPs, the unused BRx pins should be pulled high. The processor's own BRx line must not be pulled high or low because it is an output.

ID2-0 (I): Multiprocessing ID. Determines which multiprocessing bus request (BR1-BR6) is used by each DSP. ID = 001 corresponds to BR1, ID = 010 corresponds to BR2, and so on. Use ID = 000 in single-processor systems. These lines are a system configuration selection which should be hardwired or only changed at reset.

RPBA (I/S): Rotating Priority Bus Arbitration Select. When RPBA is high, rotating priority for multiprocessor bus arbitration is selected.
When RPBA is low, fixed priority is selected. This signal is a system configuration selection that must be set to the same value on every DSP. If the value of RPBA is changed during system operation, it must be changed in the same CLKIN cycle on every DSP.

PA (I/O/T): Priority Access. Asserting its PA pin allows a DSP bus slave to interrupt background DMA transfers and gain access to the external bus. PA is connected to all DSPs in the system. If access priority is not required in a system, the PA pin should be left unconnected. PA has a 20 kΩ internal pull-up resistor that is enabled on the DSP with ID2-0=00x.

DTx (O): Data Transmit (Serial Ports 0, 1). Each DT pin has a 50 kΩ internal pull-up resistor.

DRx (I): Data Receive (Serial Ports 0, 1). Each DR pin has a 50 kΩ internal pull-up resistor.

TCLKx (I/O): Transmit Clock (Serial Ports 0, 1). Each TCLK pin has a 50 kΩ internal pull-up resistor.

RCLKx (I/O): Receive Clock (Serial Ports 0, 1). Each RCLK pin has a 50 kΩ internal pull-up resistor.

TFSx (I/O): Transmit Frame Sync (Serial Ports 0, 1).

RFSx (I/O): Receive Frame Sync (Serial Ports 0, 1).

LxDAT7-0 (I/O): Link Port Data (Link Ports 0-5). Each LxDAT pin has an internal pull-down resistor that is enabled or disabled by the LPDRD bit of the LCTL0-1 register.

LxCLK (I/O): Link Port Clock (Link Ports 0-5). Each LxCLK pin has an internal pull-down resistor that is enabled or disabled by the LPDRD bit of the LCTL0-1 register.

LxACK (I/O): Link Port Acknowledge (Link Ports 0-5). Each LxACK pin has an internal pull-down resistor that is enabled or disabled by the LPDRD bit of the LCTL0-1 register.

EBOOT (I): EPROM Boot Select. For a description of how this pin operates, see the BMS pin description. This signal is a system configuration selection that should be hardwired.

LBOOT (I): Link Boot. For a description of how this pin operates, see the BMS pin description.
This signal is a system configuration selection that should be hardwired.

BMS (I/O/T): Boot Memory Select. Serves as an output or input as selected with the EBOOT and LBOOT pins as shown below. This input is a system configuration selection that should be hardwired.

EBOOT | LBOOT | BMS | Booting Mode
1 | 0 | Output | EPROM (Connect BMS to EPROM chip select.)
0 | 0 | 1 (Input) | Host Processor
0 | 1 | 1 (Input) | Link Port
0 | 0 | 0 (Input) | No Booting. Processor executes from external memory
0 | 1 | 0 (Input) | Reserved
1 | 1 | x (Input) | Reserved

CLKIN (I): Local Clock In. CLKIN is the DSP clock input. The DSP external port cycles at the frequency of CLKIN. The CLKIN frequency is multiplied by a ratio (CLK_CFG3-0) to select the instruction cycle rate, which is programmable at powerup. CLKIN may not be halted, changed, or operated below the specified frequency.

CLK_CFG3-0 (I): Core/CLKIN Ratio Control. DSP core clock (instruction cycle) rate is equal to n x CLKIN where n is user selectable to 2, 3, or 4, using the CLK_CFG3-0 inputs.

CLKOUT (O/T): Local Clock Out. CLKOUT is driven at the CLKIN frequency by the DSP. This output is three-stated by setting the COD bit in the SYSCON register. A keeper latch on the DSP’s CLKOUT pin maintains the output at the level it was last driven (only enabled on the DSP with ID2-0=00x).

RESET (I/A): Processor Reset. Resets the DSP to a known state and begins execution at the program memory location specified by the hardware reset vector address. The RESET input must be asserted (low) at power-up. The only difference between the soft and hard reset is that the external bus arbitration is not affected by a soft reset. The PLL does not get reset by a soft reset.

TCK (I): Test Clock (JTAG). Provides a clock for JTAG boundary scan.

TMS (I/S): Test Mode Select (JTAG). Used to control the test state machine. TMS has an internal pull-up resistor.

TDI (I/S): Test Data Input (JTAG).
Provides serial data for the boundary scan logic. TDI has an internal pull-up resistor.

TDO (O): Test Data Output (JTAG). Serial scan output of the boundary scan path.

TRST (I/A): Test Reset (JTAG). Resets the test state machine. TRST must be asserted (pulsed low) after power-up or held low for proper operation of the DSP. TRST has an internal pull-up resistor.

EMU (O (o/d)): Emulation Status. Must be connected to the DSP target board connector only.

VDDINT (P): Processor Core Power Supply. Nominally +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP). (40 pins).

VDDEXT (P): I/O Power Supply. Nominally +3.3V DC. (43 pins).

AVDD (P): Analog Power Supply. Nominally +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP). It supplies the DSP’s internal PLL (clock generator). This pin has the same specifications as VDDINT, except that added filtering circuitry is required. For more information on supply specifications, see ADSP-21160 DSP Microcomputer Data Sheet.

AGND (G): Analog Power Supply Return.

GND (G): Power Supply Return. (82 pins).

NC: Do Not Connect. Reserved pins which must be left open and unconnected. (9 pins).

Symbols used in the Type column of Table 11-1:

A | Asynchronous
G | Ground
I | Input
O | Output
P | Power Supply
S | Synchronous
(a/d) | Active Drive
(o/d) | Open Drain
T | Three-State (when SBTS is asserted or DSP is bus slave)

Inputs identified as synchronous (S) must meet timing requirements with respect to CLKIN (or with respect to TCK for TMS, TDI). Inputs identified as asynchronous (A) can be asserted asynchronously to CLKIN (or to TCK for TRST).

Figure 11-2 shows how different data word sizes are transferred over the external port.

Figure 11-2. External Port Data Alignment (diagram: byte lanes 7 through 0 of DATA63-0 and the RDH/WRH and RDL/WRL strobes, for 64-bit long word, SIMD, or DMA transfers; 64-bit transfer for 48-bit instruction fetch; 64-bit transfer for 40-bit extended precision; 32-bit normal word at even and odd addresses; and the restricted DMA, host, and EPROM data alignments: 32-bit packed, 16-bit packed, EPROM)

Unused inputs should be tied or pulled to VDD or GND, except for the following:

• ADDR31-0, DATA63-0, BRST, PAGE, CLKOUT: These pins have a logic-level hold circuit enabled on the DSP with ID2-0=00x that prevents the input from floating internally.
• PA, ACK, MS3-0, CIF, RDH/L, WRH/L, DMARx, DMAGx: These pins have a pull-up resistor enabled on the DSP with ID2-0=00x.
• LxDAT7-0 (LxPRDE=0), LxCLK, LxACK: These pins have an internal pull-down resistor that is controlled by bit settings in the LCTLx register.
• DTx, DRx, TCLKx, RCLKx, EMU: These pins have a 50 kΩ pull-up resistor.
• TMS, TRST, TDI: These pins have a 20 kΩ pull-up resistor.

The TRST input of the JTAG interface must be asserted (pulsed low) or held low after power-up for proper operation of the DSP. Do not leave this pin unconnected.

Additional Notes:

• In single-processor systems, the DSP owns the external bus during reset and does not perform bus arbitration to gain control of the bus.
• Operation of the RDH/L and WRH/L signals changes when CS is asserted by a host processor. For more information, see “Asynchronous Transfers” on page 7-57 and “Synchronous Transfers” on page 7-62.
• Except during a Host Transition Cycle (HTC), the RDH/L and WRH/L strobes should not be deasserted (low-to-high transition) while ACK or REDY are deasserted (low); the DSP hangs if this happens.
• In multiprocessor systems, the ACK signal is an input to the DSP bus master and does not float when it is not being driven, because the bus master maintains a pull-up on the pin. During reset, the ACK pin is pulled high internally with a 2 kΩ equivalent resistor by the DSP bus master and is held high with the internal pull-up resistor.
It is not necessary to use an external pull-up resistor on the ACK line during booting or at any other time.
• For multiprocessor systems, PAGE is guaranteed to be asserted for the first true access after acquiring bus mastership. PAGE is not updated or asserted for multiprocessor memory space accesses or external memory space accesses to any bank other than Bank 0.
• The HBR input is disabled during any access in which the PAGE signal is asserted. This prevents the possibility of the DSP becoming a bus slave while a DRAM controller is servicing a page change.

Pin States At Reset

Table 11-2 shows the DSP pin states during and after reset.

Table 11-2. Post RESET Pin States

Pin | Type | State During and After RESET
ADDR31-0 | I/O/T | Driven [1]
MS3-0 | O/T | Driven High [1]
CIF | O/T | Driven High [1]
RDH | I/O/T | Driven High [1]
RDL | I/O/T | Driven High [1]
WRH | I/O/T | Driven High [1]
WRL | I/O/T | Driven High [1]
BRST | I/O/T | Driven Low [1]
PAGE | O/T | Driven Low [1]
CLKOUT | O/T | Driven
ACK | I/O/S/T | Pulled High by Bus Master (w/ 2 kΩ internal pull-up resistor) [1]
HBG | I/O/S/T | Driven High [1]
DMAG1 | O/T | Driven High [1]
DMAG2 | O/T | Driven High
BR6-1 | I/O | BR1
DATA63-0 | I/O/T | Three-state [1]
SBTS | I/S | Input; causes the master to three-state during reset [2]
IRQ2-0 | I/A | Inputs [2]
FLAG3-0 | I/O/A | Inputs [2]
TIMEXP | O | Driven Low [2]
HBR | I/A | Input [2]
CS | I | Input [2]
REDY | O (o/d) | Three-state [2]
DMAR1 | I | Input [2]
DMAR2 | I | Input [2]
ID2-0 | I | Inputs [2]
RPBA | I/S | Input [2]
 | I/O | Three-state [2]
EBOOT | I | Input [2]
LBOOT | I | Input [2]
BMS | I/O/T | Input [2]
PA | (o/d) | Driven Low if Bus Master, Otherwise Driven High [1]
CLKIN | I | Input
RESET | I/A | Input [2]
DTx | O | Three-state (for multichannel) [3]
DRx | I | Input [3]
TCLKx | I/O | Three-state [3]
RCLKx | I/O | Three-state [3]
TFSx | I/O | Three-state [3]
RFSx | I/O | Three-state [3]
LxDAT7-0 | I/O | Three-state [3]
LxCLK | I/O | Three-state [3]
LxACK | I/O | Three-state [3]
TCK | I | Input [4]
TMS | I/S | Input [4]
TDI | I/S | Input [4]
TDO | O | Three-state [4]
TRST | I/A | Input [4]
EMU | O (o/d) | Three-state [4]

[1] Driven only by DSP bus master, otherwise three-stated
[2] Bus master independent
[3] Serial ports and link ports
[4] JTAG interface

Clock Derivation

The DSP employs an on-chip phase-locked loop (PLL) to provide clocks that switch at higher frequencies than the system clock (CLKIN). The PLL-based clocking methodology employed on the DSP influences the clock frequencies and behavior for the serial, link, and external ports, in addition to the processor core and internal memory. In each case, the DSP PLL provides a de-skewed clock to the port logic and I/O pins. For the external port, this clock is fed back to the PLL, such that the external port clock always switches at the CLKIN frequency.

The PLL provides internal clocks that switch at multiples of the CLKIN frequency for the internal memory, processor core, link and serial ports, and to modify the external port timing as required (for example, read/write strobes in asynchronous modes). The ratio of processor core clock frequency to CLKIN/external port clock frequency is determined by the CLK_CFG3-0 pins (as shown in Table 11-3 on page 11-17) during reset.

The core clock ratio cannot be altered dynamically. The DSP must be reset to alter the clock ratio.

The PLL provides a clock that switches at the processor core frequency to the serial and link ports. Each of the serial and link ports can be programmed to operate at clock frequencies derived from this clock.
The two serial ports' transmit and receive clocks are divided down from the processor core clock frequency by setting the TDIVx and RDIVx registers appropriately. For more information, see “SPORT Transmit Divisor Registers (TDIVx)” on page A-76 and “SPORT Receive Divisor Registers (RDIVx)” on page A-77. Each of the six link port clock frequencies is determined by programming the LxCLKDx parameters in the LCTL registers. For more information, see “Link Port Buffer Control Registers (LCTLx)” on page A-62. The following shows an example clock derivation:

Definition of terms:

tCK = CLKIN clock period (and external port clock period)
tCCLK = (processor) core clock period
tLCLK = link port clock period
tSCLK = serial port clock period

Clock ratios:

cRTO = core/CLKIN ratio (2, 3, or 4:1, determined by CLK_CFG)
lRTO = lport/core clock ratio (1:4, 1:2, 1:3, or 1:1, determined by LxCLKD)
sRTO = sport/core clock ratio (wide range determined by xCLKDIV)

Determining clock period:

tCCLK = (tCK) / cRTO
tLCLK = (tCCLK) * lRTO
tSCLK = (tCCLK) * sRTO

RESET and CLKIN

The DSP receives its clock input on the CLKIN pin. The processor uses an on-chip phase-locked loop to generate its internal clock, which is a multiple of the CLKIN frequency. Because the phase-locked loop requires some time to achieve phase lock, CLKIN must be valid for a minimum time period during reset before the RESET signal can be deasserted. For information on minimum clock setup, see the DSP’s data sheet.

Table 11-3 describes the internal clock to CLKIN frequency ratios supported by the DSP:

Table 11-3. Clock Configuration Definition

CLK_CFG3-0 | Core/CLKIN ratio
0010 | 2:1
0011 | 3:1
0100 | 4:1
1111 | Reserved
All others | Reserved

Table 11-4 demonstrates the internal core clock switching frequency across a range of CLKIN (external port bus) frequencies.
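As a rough illustration of the clock relations and the CLK_CFG3-0 encodings above, the following Python sketch computes the CLKIN, core, link port, and serial port clock periods. It is not taken from the manual or from any Analog Devices tool: the function name, the dictionary name, and the choice to express lRTO and sRTO as integer divisors of the core clock (so that a 1:4 link ratio becomes a divisor of 4) are assumptions made for this example.

```python
# Hypothetical helper for illustration only.
# CLK_CFG3-0 encodings from Table 11-3; every other value is reserved.
CORE_TO_CLKIN_RATIO = {0b0010: 2, 0b0011: 3, 0b0100: 4}


def derive_clock_periods(clkin_mhz, clk_cfg, link_divisor=1, sport_divisor=1):
    """Return (tCK, tCCLK, tLCLK, tSCLK) in nanoseconds.

    link_divisor and sport_divisor are assumed to be the number of core
    clock periods in one link port or serial port clock period.
    """
    if clk_cfg not in CORE_TO_CLKIN_RATIO:
        raise ValueError("reserved CLK_CFG3-0 value: {:04b}".format(clk_cfg))
    if link_divisor not in (1, 2, 3, 4):
        raise ValueError("link port divisor must be 1, 2, 3, or 4")
    if sport_divisor < 1:
        raise ValueError("serial port divisor must be at least 1")

    c_rto = CORE_TO_CLKIN_RATIO[clk_cfg]
    t_ck = 1000.0 / clkin_mhz          # CLKIN period in ns
    t_cclk = t_ck / c_rto              # tCCLK = tCK / cRTO
    t_lclk = t_cclk * link_divisor     # link clock is slower than core clock
    t_sclk = t_cclk * sport_divisor    # serial clock is slower than core clock
    return t_ck, t_cclk, t_lclk, t_sclk
```

For example, derive_clock_periods(25.0, 0b0100) gives a 40 ns CLKIN period and a 10 ns core clock period, that is, a 100 MHz core clock from a 25 MHz CLKIN at a 4:1 ratio.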
The minimum operational range for any given frequency is constrained by the operating range of the phase-locked loop. Note that the goal in selecting a particular clock ratio for the DSP application is to provide the highest internal frequency, given a CLKIN frequency.

Table 11-4. Selecting Core to CLKIN Ratio

Core CLK (MHz) for each clock ratio and CLKIN input (MHz):

Clock Ratio | CLKIN 25 | CLKIN 33.3 | CLKIN 40 | CLKIN 50
2:1 | 50 | 66.6 | 80 | 80 (21160M) 100 (21160N)
3:1 | 75 | 100 | N/A | N/A
4:1 | 100 | N/A | N/A | N/A

Input Synchronization Delay

The DSP has several asynchronous inputs: RESET, TRST, HBR, CS, DMAR1, DMAR2, IRQ2-0, and FLAG3-0 (when configured as inputs). These inputs can be asserted in arbitrary phase to the processor clock, CLKIN. The DSP synchronizes the inputs prior to recognizing them. The delay associated with recognition is called the synchronization delay.

Any asynchronous input must be valid prior to the recognition point to be recognized in a particular cycle. If an input does not meet the setup time on a given cycle, it may be recognized in the current cycle or during the next cycle.

To ensure recognition of an asynchronous input, it must be asserted for at least one full processor cycle plus setup and hold time, except for RESET, which must be asserted for at least four processor cycles. The minimum time prior to recognition (the setup and hold time) is specified in the DSP’s data sheet.

Interrupt and Timer Pins

The DSP’s external interrupt pins, flag pins, and timer pin can be used to send and receive control signals to and from other devices in the system. Hardware interrupt signals are received on the IRQ2-0 pins. Interrupts can come from devices that require the DSP to perform some task on demand. A memory-mapped peripheral, for example, can use an interrupt to alert the processor that it has data available. For more information, see “Interrupts and Sequencing” on page 3-31.

The TIMEXP output is generated by the on-chip timer.
It indicates to other devices that the programmed time period has expired. For more information, see “Timer and Sequencing” on page 3-48.

Flag Pins

The FLAG3-0 pins allow single-bit signalling between the DSP and other devices. For example, the DSP can raise an output flag to interrupt a host processor. Each flag pin can be programmed to be either an input or output. In addition, many DSP instructions can be conditioned on a flag’s input value, enabling efficient communication and synchronization between multiple processors or other interfaces.

The flags are bidirectional pins, each with the same functionality. The FLGxO bits in the MODE2 register program the direction of each flag pin. For more information, see “Mode Control 2 Register (MODE2)” on page A-6.

Flag Inputs

When a flag pin is programmed as an input, its value is stored in a bit in the FLAGS register. The bit is updated in each cycle with the input value from the pin. Flag inputs can be asynchronous to the DSP clock, so there is a one-cycle delay before a change on the pin appears in FLAGS (if the rising edge of the input misses the setup requirement for that cycle). For more information, see “Flag Value Register (FLAGS)” on page A-27.

A flag bit is read-only if the flag is configured as an input. Otherwise, the bit is readable and writable. The flag bit states are conditions that code can specify in conditional instructions.

Flag Outputs

When a flag is configured as an output, the value on the pin follows the value of the corresponding bit in the FLAGS register. A program can set or clear the flag bit to provide a signal to another processor or peripheral.

The FLAG outputs transition on the rising edge of CLKIN. Because the processor core operates at two or more times the frequency of CLKIN, the programmer must hold the FLAG state stable for at least one full CLKIN period to ensure that the output changes state.
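The hold requirement for flag outputs can be illustrated with a small Python model. This is a simplified sketch, not a description of the actual pin logic: it assumes the pin copies the FLAGS bit once per CLKIN rising edge, that one CLKIN period spans `ratio` core cycles (2, 3, or 4), and that the alignment between instruction cycles and CLKIN edges is unknown. The function names are hypothetical.

```python
def pin_samples(bit_per_core_cycle, ratio, phase):
    """Values the output pin takes at successive CLKIN rising edges.

    Simplified model (an assumption of this sketch): the pin copies the
    FLAGS bit once every `ratio` core cycles, starting at core cycle `phase`.
    """
    if ratio not in (2, 3, 4):
        raise ValueError("core/CLKIN ratio must be 2, 3, or 4")
    if not 0 <= phase < ratio:
        raise ValueError("phase must be in range(ratio)")
    return bit_per_core_cycle[phase::ratio]


def pulse_reaches_pin(hold_cycles, ratio):
    """True if a high pulse held for hold_cycles core cycles appears on the
    pin for every possible alignment of CLKIN to the core clock."""
    bits = [0] * ratio + [1] * hold_cycles + [0] * ratio
    return all(1 in pin_samples(bits, ratio, phase) for phase in range(ratio))
```

Under this model, pulse_reaches_pin(1, 2) is False while pulse_reaches_pin(2, 2) is True: at a 2:1 ratio, a flag bit held for a single core cycle can fall between CLKIN edges, whereas one held for a full CLKIN period (two core cycles) is always seen.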
Figure 11-3 describes the relationship between instruction execution and a Flag pin, when the processor core to bus clock ratio is set to 2:1. Figure 11-3 also describes the flag in/out process. Note that at least two instructions execute each CLKIN cycle.

    bit set MODE2 FLG0;   /* 1st cycle: set FLAG0 to output in Mode2 */
    bit clr FLAGS FLG0;   /* clear FLAG0 */
    bit set FLAGS FLG0;   /* 1st cycle: set FLAG0 output high */
    nop;                  /* 2nd cycle: FLAG register updated here */
                          /* A NOP indicates a NOP or another instruction not related to FLAG. */
    bit clr FLAGS FLG0;   /* 2nd cycle: clear FLAG0 output */
                          /* earliest assertion of FLAG0 output, depends on CLKIN phase */
    bit clr MODE2 FLG0;   /* 3rd cycle: set FLAG0 back to input */
    nop;                  /* 3rd cycle: */
    nop;                  /* 4th cycle: earliest deassertion of FLAG0 output */

Figure 11-3. Flag Timing (At 2:1 Clock Ratio) (timing diagram: CLKIN cycles 1 through 5 against the FLAGx pin, showing output enabled, flag high, flag low, output valid, and output disabled with input sampled)

JTAG Interface Pins

The JTAG test access port consists of the TCK, TMS, TDI, TDO, and TRST pins. The JTAG port can be connected to a controller that performs a boundary scan for testing purposes. This port is also used by the Analog Devices (ADI) family of emulators to access on-chip emulation features. To allow the use of the emulator, a connector for its in-circuit probe must be included in the target system. For more information, see “Designing for JTAG Emulation” on page 11-24.

If TRST is not asserted (or held low) at power-up, the JTAG port is in an undefined state that may cause the DSP to drive out on I/O pins that would normally be three-stated at reset. TRST can be held low with a jumper to ground on the target board connector.
For more information, see Figure 11-9 on page 11-29. Dual-Voltage Power-up Sequencing The ADSP-21160 dual-voltage processor has special considerations related to power-up. Note that these are general recommendations, and specifics details on dual voltage power supply systems is beyond the scope of this book. When the system power is activated through the DSP's dual power supply system, both supplies should be brought up as quickly as possible. Ideally, the two supplies, VDDEXT and VDDINT should be powered up simultaneously. Many commercially available dual supply regulators address simultaneous power-up requirements of the core and I/O. When designing a dual supply system, the designer should consider the relative voltage and ramp-up timing between the core and I/O voltages in order to avoid potential issues with system and DSP long-term reliability. The ADSP-21160 DSP’s I/O pads have a network of internal diodes to protect the DSP from damage by electrostatic discharge. These protection diodes connect the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) core and 3.3V I/O supplies internally. Figure 11-4 shows how a network of protection diodes isolates the internal supplies and provides ESD protection for the I/O pins. During the power-up sequence of the DSP, differences in the ramp up rates and activation time between the two supplies can cause current to flow in the I/O ESD protection circuitry. When applying power separately to the VDDINT or VDDEXT pins, take precautions to prevent or limit the maximum current and duration conducted through the isolation diodes if the unpowered pins are at ground potential. 
Since the ESD protection diodes connect the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) core and 3.3V I/O supplies internally, these diodes can be damaged any time the core supply voltage is present without the 3.3V I/O supply.

[Diagram: protection diodes between VDDINT (2.5V for 21160M, 1.9V for 21160N), VDDEXT (3.3V), and the I/O pin of the ADSP-21160 DSP, with the input and output paths to the internal logic.]

Figure 11-4. Protection Diodes and IO Pin ESD Protection

Note: The ESD protection diodes connect the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) core and the 3.3V I/O supplies internally. Improper supply sequencing can cause damage to the ESD protection circuitry.

If the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) supply is active for prolonged periods of time before the 3.3V I/O supply is activated, there is a significant amount of loading on the I/O pins. Damage occurs because the I/O is powered from the core supply through the ESD diodes.

To prevent this damage to the ESD diode protection circuitry, Analog Devices recommends including a bootstrap Schottky diode. The bootstrap Schottky diode, connected between the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) and 3.3V power supplies, protects the ADSP-21160 DSPs from partially powering the 3.3V supply. Including a Schottky diode shortens the delay between the supply ramps and thus prevents damage to the ESD diode protection circuitry. With this technique, if the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) rail rises ahead of the 3.3V rail, the Schottky diode pulls the 3.3V rail along with the core rail.
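As a rough illustration of this bootstrap behavior, the sketch below models the Schottky diode as an ideal fixed forward drop between the core rail (anode) and the 3.3V rail (cathode). The function name, the ramp values, and the ideal-diode model are assumptions made for illustration only; the 0.6V default is the maximum forward voltage this section specifies for the diode.

```python
def io_rail_with_bootstrap(core_v, io_regulator_v, diode_vf=0.6):
    """Idealized 3.3V-rail voltage with a Schottky diode from core to I/O.

    Assumption: the diode is a fixed forward drop and the I/O regulator
    simply wins once its output exceeds the level the diode can hold.
    """
    pulled_up = core_v - diode_vf  # level the diode can hold the I/O rail at
    return max(io_regulator_v, pulled_up, 0.0)


# Hypothetical ramp: the 2.5V core supply comes up before the 3.3V regulator.
for core_v, io_reg_v in [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (2.5, 2.0), (2.5, 3.3)]:
    io_v = io_rail_with_bootstrap(core_v, io_reg_v)
    print(f"core={core_v:.1f}V  regulator={io_reg_v:.1f}V  I/O rail={io_v:.1f}V")
```

In this simplified model the I/O rail never trails the core rail by more than the diode's forward drop, which is the condition the bootstrap diode is intended to enforce. Real supplies, loads, and diode curves will differ.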
Figure 11-5 shows a basic block diagram of the Schottky diode connected between the core and I/O voltage regulators and the DSP.

[Diagram: a DC input source feeds a 3.3V I/O voltage regulator (VDDEXT) and a 2.5V (21160M) or 1.9V (21160N) core voltage regulator (VDDINT), both supplying the ADSP-21160.]

Figure 11-5. Dual +2.5V DC (21160M) or +1.9V DC (21160N)/3.3V Supplies with Schottky Diode

The anode of the diode must be connected to the +2.5V DC (ADSP-21160M DSP) or +1.9V DC (ADSP-21160N DSP) supply. The diode must have a forward-biased voltage of 0.6V or less and must have a current rating sufficient to supply the 3.3V rail of the system. The use of a Schottky diode is the method recommended by Analog Devices. For recommendations on managing power-up sequencing for the core and I/O dual-voltage supply, refer to the “Power-up Sequencing” specifications in the ADSP-21160 SHARC DSP Microcomputer Data Sheet.

Designing for JTAG Emulation

The Analog Devices DSP Tools product line of JTAG emulators is a set of development tools for debugging programs running in real time on DSP target system hardware. These emulators provide a controlled environment for observing, debugging, and testing activities in a target system by connecting directly to the target processor through its JTAG interface.

Because the emulator controls the target system’s DSP through the processor’s IEEE 1149.1 JTAG Test Access Port (TAP), non-intrusive in-circuit emulation is assured. The emulator uses the TAP to access the internals of the DSP, allowing the developer to load code, set breakpoints, observe variables, observe memory, examine registers, and so on. The DSP must be halted to send data and commands, but once an operation is completed by the emulator, the DSP system is set running at full speed with no impact on system timing. The emulator does not impact target loading or timing.
The emulator’s in-circuit probe connects to a variety of host computers (PCI bus or USB) with plug-in boards.

Target systems must have a 14-pin connector in order to accept the in-circuit probe of the Analog Devices DSP Tools product line of JTAG emulators, which is a 14-pin plug. Designs must add this connector to the target board if the board is intended for use with the ADSP-21160 JTAG emulator. The total trace length between the JTAG connector and the furthest device sharing the emulator’s JTAG pins should be limited to 15 inches maximum for guaranteed operation. This length restriction must include the emulator’s JTAG signals, which are routed to one or more ADSP-21160 devices, or to a combination of ADSP-21160 devices and other JTAG devices on the chain.

Target Board Connector

The emulator interface to an Analog Devices JTAG DSP is a 14-pin header, as shown in Figure 11-6. The customer must supply this header on the target board in order to communicate with the emulator. The interface consists of a standard dual-row 0.025" square post header, set on 0.1" x 0.1" spacing, with a minimum post length of 0.235". Pin 3 is the key position used to prevent the pod from being inserted backwards. This pin must be clipped on the target board.

The clearance (length, width, and height) around the header must be as shown in Figure 11-10. Maintain a minimum length of 0.15" and width of 0.10" for the target board header. The pod connector attaches to the target board header in this area, so there must be clearance to attach and detach this connector. See “DSP JTAG Pod Connector” on page 11-31 for detailed drawings of the pod connector.

Figure 11-6. Emulator Interface for Analog Devices’ JTAG DSPs

As can be seen in Figure 11-6, there are two sets of signals on the header, including the standard JTAG signals TMS, TCK, TDI, TDO, TRST, and EMU used for emulation purposes (via an emulator).
Secondary JTAG signals BTMS, BTCK, BTDI, and BTRST are provided for optional use for board-level (boundary scan) testing. The “B” signals are rarely used; if they are used, connect them to a separate on-board JTAG boundary scan controller. If the “B” signals will not be used, tie all of them to ground as shown in Figure 11-7.

Note: BTCK can alternately be pulled up (for some older silicon) to VCC (+5V, +3.3V, or +2.5V) using a 4.7 kΩ resistor, as described in previous documents. Tying the signal to ground is universal and will work for all silicon.

When the emulator is not connected to this header, jumpers should be placed across BTMS, BTCK, BTRST, and BTDI as shown in Figure 11-7. This holds the JTAG signals in the correct state to allow the DSP to run freely. All the jumpers should be removed when connecting the emulator to the JTAG header.

Figure 11-7. JTAG Target Board Connector With No Local Boundary Scan

For a list of the state of each standard JTAG signal, refer to Table 11-5. Use the following legend: O = Output, I = Input, and NU = Not Used.

Table 11-5. State of Standard JTAG Signals

Signal   Description           Emulator   DSP
TMS      Test Mode Select      O          I
TCK      Test Clock (10 MHz)   O          I
TRST     Test Reset            O          I
TDI      Test Data In          O          I
TDO      Test Data Out         I          O
EMU      Emulation Pin         I          O (Open Drain)
CLKIN    DSP Clock Input       NU         I

The DSP CLKIN signal is the clock signal line (typically 30 MHz or greater) that connects an oscillator to all DSPs in multiple-DSP systems requiring synchronization. For synchronous DSP operations to work correctly, the CLKIN signal on all the DSPs must be the same signal and the skew between them must be minimal (use clock drivers or other means). See the DSP user guide for more CLKIN details. Note that the CLKIN signal is not used by the emulator and can cause noise problems if connected to the JTAG header.
Legacy documents show it connected to pin 4 of the JTAG header. Pin 4 should be tied to ground on the 14-pin JTAG header (do not connect the JTAG header pin to the DSP CLKIN signal). If you have already connected it to the JTAG header pin and are experiencing noise from this signal, simply clip this pin on the 14-pin JTAG header.

The final connections between a single DSP target and the emulation header (within 6 inches) are shown in Figure 11-8. A 4.7 kΩ pull-up resistor has been added on TCK, TDI, and TMS for increased noise immunity.

Figure 11-8. Single DSP Connection to the JTAG Header

If a design uses more than one DSP (or other JTAG device in the scan chain), or if the JTAG header is more than 6 inches from the DSP, use a buffered connection scheme as shown in Figure 11-9 (no local boundary scan mode shown). To keep signal skew to a minimum, be sure the buffers are all in the same physical package (typical chips have 6, 8, or 16 drivers). Using a buffer that includes series resistors, such as the 74ABT2244 family, can reduce ringing on the JTAG signal lines. For low-voltage applications (3.3V, 2.5V, and 1.9V I/O), the 74ALVT and 74AVC logic families are useful.

Also, note the position of the pull-up resistor on EMU. This is required because the EMU line is an open-drain signal.

If more than one DSP (or JTAG device) is on the target (in the scan chain), you must buffer the JTAG header. This keeps the signals clean and avoids the noise problems that occur with longer signal traces, which ultimately results in reliable emulator operation.

Figure 11-9. Multiple DSP Connection to JTAG Header

Although the theoretical number of devices that the software can support in one JTAG scan chain is large (50 devices or more), it is not recommended that you use more than eight physical devices in one scan chain.
A physical device could, however, contain many JTAG devices, such as inside a multi-chip module. The recommendation of not more than eight physical devices is mostly due to the transmission line effects that appear in long signal traces, and is based on some field-collected empirical data. The best approach for large numbers of physical devices is to break the chain into several smaller independent chains, each with its own JTAG header and buffer. If this is not possible, at least add some jumpers that can reduce the number of devices in one chain for debug purposes, and pay special attention to transmission line effects in the layout stage.

Layout Requirements

All JTAG signals (TCK, TMS, TDI, TDO, EMU, TRST) should be treated as critical route signals. Specify a controlled impedance requirement for each route (the value depends on your circuit board, typically 50 Ω to 75 Ω). It is also important to keep crosstalk and inductance to a minimum on these lines by using a good ground plane and by routing away from other high-noise signals such as clock lines. Keep these routes as short and clean as possible, and keep the bused signals (TMS, TCK, TRST, EMU) as close to the same length as possible.

Note: The JTAG TAP relies on the state of the TMS line and the TCK clock signal. If these signals have glitches (due to ground bounce, crosstalk, etc.), unreliable emulator operation will result. When experiencing emulator problems, look at these signals using a high-speed digital oscilloscope. These lines must be clean, and may require special termination schemes. If you are buffering the JTAG header (most customers will), you must provide signal termination appropriate for your target board (series, parallel, R/C, etc.).

Power Sequence for Emulation

The power-on sequence for your target and emulation system is as follows:

1. Apply power to the emulator first, then to the target board.
This ensures that the JTAG signals are in the correct state for the DSP to run free.
2. Upon power-on, the emulator drives the TRST signal low, keeping the DSP TAP in the test-logic-reset state, until the emulation software takes control.

Removal of power should be done in reverse: turn off power to the target board, then to the emulator.

Additional JTAG Emulator References

The IEEE 1149.1 JTAG standard is sponsored by the Test Technology Standards Committee of the IEEE Computer Society, and published by the IEEE. The latest versions at the time of this publication are IEEE Standard 1149.1-1990 and IEEE Standard 1149.1a-1993. To order a copy, call the IEEE at 1 800 678 4333 in the US and Canada, or 1 908 981 1393 outside of the US and Canada. Visit the IEEE standards web site at http://standards.ieee.org/.

Pod Specifications

This section contains design details on various emulator pod designs from the Analog Devices DSP Tools product line. The emulator pod is the device that connects directly to the DSP target board 14-pin JTAG header. See also Engineer-to-Engineer Note EE-68.

DSP JTAG Pod Connector

Figure 11-10 details the dimensions of the JTAG pod connector at the 14-pin target end. Figure 11-11 displays the keep-out area for a target board header. The keep-out area allows the pod connector to seat properly onto the target board header. This board area should contain no components (chips, resistors, capacitors, etc.). The dimensions are referenced to the center of the 0.025" square post pin.

Figure 11-10. JTAG Pod Connector (14-pin Target End)

Figure 11-11. Keep-out Area For a Target Board Header

DSP 3.3V Pod Logic

A portion of the Analog Devices DSP Tools product line 3.3V emulator pod interface is shown in Figure 11-12. This figure describes the driver circuitry of the emulator pod.
As can be seen, TMS, TCK, and TDI are driven with a 33 Ω series resistor. TRST is driven with a 100 Ω series resistor. TDO and CLKIN are terminated with an optional 91 Ω/120 Ω parallel terminator. EMU is pulled up with a 4.7 kΩ resistor. The 74LVT244 chip drives the signals at 3.3V, with a maximum current rating of ±32mA.

Figure 11-12. 3.3V JTAG Pod Driver Logic

Parallel terminate the TMS, TCK, TRST, and TDI lines locally on your target board, if needed, since they are driven by the pod with sufficient current drive (±32mA). In order to use the terminators on the TDO line (CLKIN is not used), you MUST have a buffer on your target board JTAG header. The DSP is not capable of driving the parallel terminator load directly with TDO. Assuming the proper buffers are included, use the optional parallel terminators by placing a jumper on J2.

DSP 2.5V Pod Logic

A portion of the Analog Devices DSP Tools product line 2.5V emulator pod interface is shown in Figure 11-13. This figure describes the driver circuitry of the emulator pod. As can be seen, TMS, TCK, and TDI are driven with a 33 Ω series resistor. TRST is driven with a 100 Ω series resistor. TDO is pulled up with a 4.7 kΩ resistor and terminated with an optional parallel terminator that can be configured by the user. EMU is pulled up with a 4.7 kΩ resistor. The CLKIN signal is not used and not connected inside the pod. The 74ALVT16244 chip drives the signals at 2.5V, with a maximum current rating of ±8mA.

Figure 11-13. 2.5V JTAG Pod Driver Logic

You can terminate the TMS, TCK, TRST, and TDI lines locally on your target board, if needed, as long as the terminator's current draw does not exceed the driver's maximum current supply (±8mA). In order to use the terminator on the TDO line, include a buffer on your target board JTAG header.
The DSP is not capable of driving a parallel terminator load (typically 50 Ω to 75 Ω) directly with TDO. Assuming you have the proper buffers, you may use the optional parallel terminator by adding the appropriate resistors and placing a jumper on J2.

Conditioning Input Signals

The DSP is a CMOS device. It has input conditioning circuits that simplify system design by filtering or latching input signals to reduce susceptibility to glitches or reflections.

The following sections describe why these circuits are needed and their effect on input signals.

A typical CMOS input consists of an inverter with specific N and P device sizes that cause a switching point of approximately 1.4V. This level is selected to be the midpoint of the standard TTL interface specification of VIL = 0.8V and VIH = 2.0V. This input inverter, unfortunately, has a fast response to input signals and to external glitches wider than about 1 ns. Filter circuits and hysteresis are added after the input inverter on some DSP inputs, as described in the following sections.

Link Port Input Filter Circuits

The DSP’s link port input signals have on-chip filter circuits rather than glitch rejection circuits. Filtering is not used on most signals because it delays the incoming signal and so degrades the timing specifications. Filtering is implemented only on the link port data and clock inputs. This is possible because the link ports are self-synchronized: the clock and data are sent together. It is not the absolute delay but rather the relative delay between clock and data that determines the performance margin.

By filtering both LxCLK and LxDAT7-0 with identical circuits, the response to glitches and reflections is reduced while the relative delay is unaffected. The filter has the effect of ignoring a full-strength pulse (a glitch) narrower than approximately 2 ns. Glitches that are not full strength can be somewhat wider.
The link ports do not use glitch rejection circuits because they can be used with longer, series-terminated transmission lines where the reflections do not occur near the signal transitions.

RESET Input Hysteresis

Hysteresis is used only on the RESET input signal. Hysteresis causes the switching point of the input inverter to be slightly above 1.4V for a rising edge and slightly below 1.4V for a falling edge. The value of the hysteresis is approximately ±0.1V. The hysteresis is intended to prevent multiple triggering on signals that are allowed to rise slowly, as might be expected on a reset line with a delay implemented by an RC input circuit. Hysteresis is not used to reduce the effect of ringing on DSP input signals with fast edges, because the amount of hysteresis that can be used on a CMOS chip is too small to make much difference. The small amount of hysteresis allowable is due to the restrictions on the tolerance of the VIL and VIH TTL input levels under worst-case conditions. Refer to the DSP’s data sheet for exact specifications.

Designing For High Frequency Operation

Because the DSP can operate at very fast clock frequencies, signal integrity and noise problems must be considered in circuit board design and layout. The following sections discuss these topics and suggest various techniques to use when designing and debugging DSP systems.

Initial versions of the DSP are specified for operation at 50 MHz and 40 MHz internal clocks; the following information is based on these CLKIN frequencies. Refer to the DSP’s data sheet for current clock speed specifications.

All DSP synchronous behavior is specified to CLKIN. System designers are encouraged to clock synchronous peripherals/memory (which are attached to the DSP external port) with this same clock source (or a different low-skew output from the same clock driver).
Alternatively, the clock output (CLKOUT) from the DSP may be employed to clock synchronous peripherals/memory.

Note the following behavior for CLKOUT:

• The DSP whose ID2-0=000 (uniprocessor) or 001 drives CLKOUT during reset.
• CLKOUT is specified relative to CLKIN in the DSP’s data sheet. When using this output to clock system components, the phase and jitter terms associated with this output must be treated as additional derating factors in determining specifications.
• The bus master drives CLKOUT. In an MP system, this clock has multiple sources, including host logic, if present. The system component clocked by CLKOUT must be able to tolerate one or more cycles in which CLKOUT is held high as bus ownership changes. Also, if the host logic operates asynchronously to the DSP CLKOUT frequency, the system component clocked by CLKOUT must be able both to operate at that host frequency and to handle the electrical characteristics of the CLKOUT transition to that asynchronous frequency domain.
• For systems not needing CLKOUT as a clock source, CLKOUT may be used to identify the current bus master. This requires that the outputs not be tied together. If and when this debug feature is not needed, the CLKOUT output can be disabled by setting the COD bit in the SYSCON register.

Clock Specifications and Jitter

The clock signal must be free of ringing and jitter. Clock jitter can easily be introduced in a system where more than one clock frequency exists. High-frequency jitter on the clock to the DSP may result in abbreviated internal cycles.

[Diagram: a clock buffer driven at a single frequency (FREQUENCY 1) supplies the ADSP-21160 clock; the unused buffer outputs are left unconnected (NO CONNECT).]

Figure 11-14. Reducing Clock Jitter and Ring

Note: Never share a clock buffer IC with a signal of a different clock frequency. This introduces excessive jitter.
As shown in Figure 11-14, keep the portions of the system that operate at different frequencies as physically separate as possible.

Clock Distribution

There must be low clock skew between DSPs in a multiprocessor cluster when communicating synchronously on the external bus. The clock must be routed in a controlled-impedance transmission line that can be properly terminated at either the end of the line or the source.

Figure 11-15 illustrates end-of-line termination for the clock. End-of-line termination is not usually recommended unless the distance between the processors is extremely small, because devices at different wire distances along the line receive a skewed clock. This is due to the propagation delay of a PCB transmission line, along which a signal typically travels 5 to 6 inches per ns.

[Diagram: a clock drives a 50 Ω transmission line past three ADSP-21160 DSPs; the end of the line is terminated with a 180 Ω/70 Ω resistor pair from +5V, biased at 1.4V.]

Figure 11-15. Do Not Use End-Of-Line Termination for the Clock

Figure 11-16 illustrates source termination for the clock. Source termination allows delays in each path to be identical. Each device must be at the end of the transmission line because only there does the signal have a single transition. The traces must be routed so that the delay through each is matched to the others. Line impedance higher than 50 Ω may be used, but clock signal traces should be in the PCB layer closest to the ground plane to keep delays stable and crosstalk low. More than one device may be at the end of the line, but the wire length between them must be short and the impedance (capacitance) of these must be kept high. The matched inverters must be in the same IC and must be specified for a low skew (< 1 ns) with respect to each other. This skew should be as small as possible because it subtracts from the margin on most specifications.
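To get a feel for how trace-length mismatch turns into clock skew, the following sketch converts trace lengths into arrival-time differences. It assumes a propagation velocity of 5.5 inches/ns, the midpoint of the 5 to 6 inches/ns range quoted above; the function names and the example trace lengths are hypothetical.

```python
def trace_delay_ns(length_in, velocity_in_per_ns=5.5):
    """Propagation delay of a PCB trace, given its length in inches."""
    return length_in / velocity_in_per_ns


def clock_skew_ns(lengths_in, velocity_in_per_ns=5.5):
    """Largest arrival-time difference between the listed clock traces."""
    delays = [trace_delay_ns(length, velocity_in_per_ns) for length in lengths_in]
    return max(delays) - min(delays)


# Hypothetical source-terminated clock traces to three DSPs, in inches.
lengths = [4.0, 4.0, 6.75]
print(f"skew = {clock_skew_ns(lengths):.2f} ns")  # prints "skew = 0.50 ns"
```

Under this assumption, a 2.75-inch mismatch already consumes half of a 1 ns skew budget, which is why the source-terminated traces should be length-matched.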
[Diagram: a clock feeds an octal inverter (ACTQ240 from National Semiconductor, IDT49FCT805/A, or CY7C992) with a buffer drive impedance of 10 Ω; each output drives its own ADSP-21160 through a 40 Ω series resistor and a 50 Ω transmission line. A separate buffer and transmission line is needed for each group of processors that are further than 4 inches from each other.]

Figure 11-16. Use Source Termination to Distribute the Clock

Point-To-Point Connections

A series termination resistor may be added near the pin for point-to-point connections. This is typically used for link port applications when distances are greater than 6 inches, as shown in Figure 11-17. For more specific guidance on related issues, see the reference source in “Recommended Reading” on page 11-47 for suggestions on transmission line termination, and see the ADSP-21160 SHARC DSP Microcomputer Data Sheet for the output drivers' rise and fall time data.

For link port operation at the full internal clock rate, it is important to maintain low skew between the data (LxDAT7-0) and clock (LxCLK).

Although the DSP’s serial ports may be operated at a slow rate, the output drivers still have fast edge rates and may require source termination for longer distances.

[Diagram: two ADSP-21160 link ports joined by a 50 Ω transmission line longer than 6" (15.24 cm). Each end has a link port transmitter (driver impedance = 17 Ω when on, open circuit when off) with a 33 Ω series resistor. The reflected wave from the link port receiver is absorbed at the source.]

Figure 11-17. Source Termination For Point-To-Point Connections

Signal Integrity

The capacitive loading on high-speed signals should be reduced as much as possible.
Loading of buses can be reduced by using a buffer for devices that operate with wait states, for example DRAMs. This reduces the capacitance on signals tied to the zero-wait-state devices, allowing these signals to switch faster and reducing noise-producing current spikes.

Signal run length (inductance) should also be minimized to reduce ringing. Extra care should be taken with certain signals such as the read and write strobes (RDH/L, WRH/L) and acknowledge (ACK).

In a multiprocessor cluster, each DSP can drive the read or write strobes. In this case, some damping resistance should be put in the signal path if the line length is greater than 6 inches. This is at the expense of additional signal delay. The time budget for these signals should be carefully analyzed. Two possible damping arrangements between four DSPs are shown in Figure 11-18 and Figure 11-19.

In Figure 11-18, a star connection of resistors is used. Each DSP can drive the signal (for example, the RDH/L or WRH/L strobe). Trace lengths should be minimized. Experiment with the optimal resistance value and placement; for example, near the processor or near the common connection. This adds signal delay.

[Diagram: four ADSP-21160 DSPs, each connected through a 10 Ω damping resistor to a common star point.]

Figure 11-18. Star Connection Damping Resistors

In the example of Figure 11-19, where processors 1 and 2, and 3 and 4, are close to each other, a single damping resistor between the processor pairs helps damp out reflections. Experiment with the resistor value. The two processor groups have a skew with respect to each other.

[Diagram: two groups of two ADSP-21160 DSPs joined by a single 20 Ω damping resistor between the processor groups.]

Figure 11-19. Single Damping Resistor Between Processor Groups

Another solution to multiple drivers where longer distances are involved is to have a single transmission line that is terminated at both ends. This arrangement is shown in Figure 11-20. The stubs to the processors must be kept as short as possible. Each device driver sees an impedance of 25 Ω, but this resistance is biased at 1.4V, so the drive from the DSPs is sufficient for TTL levels. To reduce power dissipation in the system and in each DSP, this arrangement should be used only if necessary, for signals such as the RDH/L or WRH/L strobe. With this arrangement, the signals are skewed, but well behaved.

[Diagram: six ADSP-21160 DSPs tapped along a 50 Ω transmission line longer than 10" (25.4 cm); each end of the line is terminated with a 180 Ω/70 Ω resistor pair from +5V, biased at 1.4V.]

Figure 11-20. Single Transmission Line Terminated At Both Ends

Other Recommendations and Suggestions

These recommendations and suggestions are:

• Use more than one ground plane on the PCB to reduce crosstalk. Be sure to use plenty of vias between the ground planes. One VDD plane for each supply is sufficient. These planes should be in the center of the PCB.
• Keep critical signals such as clocks, strobes, and bus requests on a signal layer next to a ground plane, and route them away from or perpendicular to other non-critical signals to reduce crosstalk. For example, data outputs switch at the same time that BRx inputs are sampled; if the layout permits crosstalk between them, the system could have problems with bus arbitration.
• Position the processors on both sides of the board to reduce area and distances if possible.
• Design for lower transmission line impedances to reduce crosstalk and to allow better control of impedance and delay.
• Use 3.3V peripheral components and power supplies to help reduce transmission line problems, because the receiver switching voltage of 1.4V is close to the middle of the voltage swing. In addition, ground bounce and noise coupling are reduced.
• Experiment with the board and isolate crosstalk and noise issues from reflection issues. This can be done by driving a signal wire from a pulse generator and studying the reflections while other components and signals are passive.

Decoupling Capacitors and Ground Planes

Ground planes must be used for the ground and power supplies. Designs should use a minimum of eight bypass capacitors (six 0.1 µF and two 0.01 µF ceramic). The capacitors should be placed very close to the VDDEXT and VDDINT pins of the package as shown in Figure 11-21. Use short and fat traces for this. The ground end of the capacitors should be tied directly to the ground plane inside the package footprint of the DSP (underneath it, on the bottom of the board), not outside the footprint. A surface-mount capacitor is recommended because of its lower series inductance.

Connect the power plane to the power supply pins directly with minimum trace length. The ground planes must not be densely perforated with vias or traces, as this reduces their effectiveness. In addition, there should be several large tantalum capacitors on the board.

[Diagram: two bypass capacitor placements for the ADSP-21160. Case 1: bypass capacitors on the non-component (bottom) side of the board, beneath the DSP package. Case 2: bypass capacitors on the component (top) side of the board, around the DSP package.]

Figure 11-21. Bypass Capacitor Placement

Note: Designs can use either bypass placement case shown in Figure 11-21, or combinations of the two. Designs should try to minimize signal feedthroughs that perforate the ground plane.
Oscilloscope Probes

When making high-speed measurements, be sure to use a “bayonet” type or similarly short (< 0.5 inch) ground clip, attached to the tip of the oscilloscope probe. The probe should be a low-capacitance active probe with 3 pF or less of loading. The use of a standard ground clip with 4 inches of ground lead causes ringing to be seen on the displayed trace and makes the signal appear to have excessive overshoot and undershoot. A 1 GHz or better sampling oscilloscope is needed to see the signals accurately.

Recommended Reading

The text High-Speed Digital Design: A Handbook of Black Magic is recommended for further reading. This book is a technical reference that covers the problems encountered in state-of-the-art, high-frequency digital circuit design, and is an excellent source of information and practical ideas. Topics covered in the book include:

• High-Speed Properties of Logic Gates
• Measurement Techniques
• Transmission Lines
• Ground Planes and Layer Stacking
• Terminations
• Vias
• Power Systems
• Connectors
• Ribbon Cables
• Clock Distribution
• Clock Oscillators

High-Speed Digital Design: A Handbook of Black Magic, Johnson and Graham, Prentice Hall, Inc., ISBN 0-13-395724-1

Booting Single and Multiple Processors

Programs can be automatically downloaded to the internal memory of a DSP after power-up or after a software reset. This process is called booting. The DSP supports three booting modes: EPROM, host, and link port. For cases when the DSP must execute instructions from external memory without booting, a “No boot” mode may also be configured.

For information on the setup and DMA processes for booting a single processor, see “Bootloading Through The External Port” on page 6-76 and “Bootloading Through The Link Port” on page 6-87.
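As a quick aid for checking a board's strap settings, the sketch below decodes the boot mode from the EBOOT, LBOOT, and BMS pins. The host setting (EBOOT=0, LBOOT=0, BMS=1) and the EBOOT-high setting for EPROM booting are the ones given in the multiprocessor booting discussion in this section; the LBOOT=0 value for EPROM booting and the link port and no-boot encodings are assumptions that should be confirmed against the data sheet. The function name and return strings are hypothetical.

```python
def boot_mode(eboot, lboot, bms=None):
    """Decode the boot mode from the pin straps (1 = high, 0 = low).

    bms is ignored for EPROM booting, where the DSP drives BMS as the
    EPROM chip select rather than sampling it as an input.
    """
    if eboot == 1 and lboot == 0:
        return "EPROM"      # LBOOT=0 here is an assumption
    if eboot == 0 and lboot == 0 and bms == 1:
        return "host"       # setting stated in this section
    if eboot == 0 and lboot == 1 and bms == 1:
        return "link port"  # assumption, verify in the data sheet
    if eboot == 0 and lboot == 0 and bms == 0:
        return "no boot"    # assumption, verify in the data sheet
    return "unknown"        # any other combination is not decoded here


# Example: processors strapped for host booting stay idle until booted.
print(boot_mode(eboot=0, lboot=0, bms=1))
```

Keeping the decode in one function makes it easy to compare each processor's straps against the intended scheme before power is applied.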
Multiprocessor systems can be booted from a host processor, from external EPROM, through a link port, or from external memory.

Multiprocessor Host Booting

To boot multiple DSP processors from a host, each DSP must have its EBOOT, LBOOT, and BMS pins configured for host booting: EBOOT=0, LBOOT=0, and BMS=1. After system powerup, each DSP is in the idle state and the BRx bus request lines are deasserted. The host must assert the HBR input and boot each DSP by:
• Asserting its CS pin (for asynchronous)
• Writing to the multiprocessor memory space location (for synchronous)
• Downloading instructions as described in “Booting Another DSP” on page 7-113

Multiprocessor EPROM Booting

There are two methods of booting a multiprocessor system from an EPROM. Processors perform the following steps in these methods:
• Arbitrate for the bus
• DMA the 256 word boot stream, after becoming bus master
• Release the bus
• Execute the loaded instructions

All DSPs Boot in Turn from a Single EPROM

The BMS signals from each DSP may be wire-OR’ed together to drive the chip select pin of the EPROM. Each DSP can boot in turn, according to its priority. When the last one has finished booting, it must inform the others (which may be in the idle state) that program execution can begin (if all DSPs are to begin executing instructions simultaneously). An example system that uses this processors-take-turns technique appears in Figure 11-22.

When multiple DSPs boot from one EPROM, the DSPs can boot either identical code or different code from the EPROM. If the processors load differing code, a jump table (based on processor ID) can be used to select the code for each processor.

[Figure 11-22. DSPs-Take-Turns Booting from an EPROM. ADSP-21160 processors S1 through S6 share the address, data, and control buses to one EPROM, with their BMS outputs driving the EPROM CS pin. Figure notes: Here, multiple SHARCs boot from the same EPROM. For this configuration, the loader routine uses a jump table. This table indicates the address of the image that loads into each processor. The processors can load the same image or individual images.]

One DSP is Booted, which then Boots the Others

The EBOOT pin of the DSP with IDx=1 must be set high for EPROM booting. All other DSPs should be configured for host booting (EBOOT=0, LBOOT=0, and BMS=1), which leaves them in the idle state at startup and allows the DSP with IDx=1 to become bus master and boot itself. Only the BMS pin of DSP #1 is connected to the chip select of the EPROM. When DSP #1 has finished booting, it can boot the remaining DSPs by writing to their external port DMA buffer 0 (EPB0) via multiprocessor memory space. An example system that uses this one-boots-others technique appears in Figure 11-23.

[Figure 11-23. DSP-Boots-Others from an EPROM. ADSP-21160 S1 (master) connects its BMS output to the EPROM CS pin; ADSP-21160 S2 through S6 (slaves) share the address, data, and control buses.]

Multiprocessor Link Port Booting

In systems where multiple DSPs are not connected by the parallel external bus, booting can be accomplished from a single source through the link ports. To simultaneously boot all of the DSPs, a parallel common connection should be made to link port buffer 4 (LBUF4) on each of the processors. If only a daisy chain connection exists between the processors’ link ports, then each DSP can boot the next one in turn. Link Buffer 4 must always be used for booting.
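For the processors-take-turns EPROM technique described earlier in this section, the jump table that selects an image by processor ID might be sketched in C as follows. This is only an illustration of the lookup: the function name, the table name, and every offset value are hypothetical, and the ID range of 1 through 6 is assumed from the processors labeled S1 through S6 in Figure 11-22. It is not the loader routine itself.

```c
#include <stdint.h>

/* Highest multiprocessor ID in this sketch (Figure 11-22 shows S1 through S6). */
#define MAX_PROCESSOR_ID 6u

/* Hypothetical EPROM offsets of each boot image; the values are placeholders.
 * Index 0 is unused because the IDs in this sketch start at 1.
 * Processors 5 and 6 deliberately share one image to show identical code. */
static const uint32_t boot_image_offset[MAX_PROCESSOR_ID + 1u] = {
    0x0000u, 0x1000u, 0x2000u, 0x3000u, 0x4000u, 0x5000u, 0x5000u
};

/* Looks up the image offset for a processor ID.
 * Returns 0 and writes *offset on success, or -1 if the ID is out of range
 * or the output pointer is null. */
static int select_boot_image(unsigned processor_id, uint32_t *offset)
{
    if (processor_id < 1u || processor_id > MAX_PROCESSOR_ID || offset == 0) {
        return -1;
    }
    *offset = boot_image_offset[processor_id];
    return 0;
}
```

Letting two table entries hold the same offset covers the case in which some processors load identical code while others load individual images.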
Multiprocessor Booting From External Memory

If external memory contains a program after reset, then the DSP with IDx=1 should be set up for no boot mode. It begins executing from address 0x0080 0004 in external memory. When booting has completed, the other DSPs may be booted by DSP #1 if they are set up for host booting, or they can begin executing out of external memory if they are set up for no boot mode. Multiprocessor bus arbitration allows this booting to occur in an orderly manner. The bus arbitration sequence after reset is described in “Multiprocessor Bus Arbitration” on page 7-98.

A REGISTERS

The DSP has general purpose and dedicated registers in each of its functional blocks. The register reference information for each functional block includes bit definitions, initialization values, and (for I/O processor registers) memory mapped address. Information on each type of register is available at the following locations:
• “Control and Status System Registers” on page A-2
• “Processing Element Registers” on page A-15
• “Program Sequencer Registers” on page A-17
• “Data Address Generator Registers” on page A-32
• “I/O Processor Registers” on page A-33

When writing DSP programs, it is often necessary to set, clear, or test bits in the DSP’s registers. While these bit operations can all be done by referring to the bit’s location within a register or (for some operations) the register’s address with a hexadecimal number, it is much easier to use symbols that correspond to the bit’s or register’s name. For convenience and consistency, Analog Devices provides a header file that provides these bit and register definitions. CrossCore Embedded Studio provides processor-specific header files in the SHARC/include directory. An #include file is provided with the VisualDSP++ tools and can be found in the VisualDSP/processortype/include directory. For more information, see the “Register and Bit #Defines File (def21160.h)” on page A-81.

Many registers have reserved bits. When writing to a register, programs may only clear (write zero to) the register’s reserved bits.

Control and Status System Registers

The DSP’s control and status system registers configure how the processor core operates and indicate the status of many processor core operations. In the ADSP-21160 SHARC DSP Instruction Set Reference, these registers are referred to as System Registers (SREG), which are a subset of the DSP’s Universal Registers (UREG). Not all registers are valid in all assembly language instructions. In the assembly syntax descriptions, the register group name (UREG, SREG, and others) indicates which type of register is valid within the instruction’s context.

Table A-1 lists the processor core’s control and status registers with their initialization values. Descriptions of each register follow. Other system registers (SREG) are in the I/O processor. For more information, see “I/O Processor Registers” on page A-33.

Table A-1. Control and Status System Registers (SREG and UREG)

Register Name and Page Reference    Initialization After Reset
“Mode Control 1 Register (MODE1)” on page A-3    0x0000 0000
“Mode Mask Register (MMASK)” on page A-5    0x0020 0000
“Mode Control 2 Register (MODE2)” on page A-6    0xnn00 0000 (note 1)
“Arithmetic Status Registers (ASTATx and ASTATy)” on page A-8    0x0000 0000
“Sticky Status Registers (STKYx and STKYy)” on page A-12    0x0540 0000
“User-Defined Status Registers (USTATx)” on page A-15    0x0000 0000

Note 1: MODE2 bits 31-25 are the processor ID and silicon revision number, so the initialization value varies with the DSP’s ID2-0 pins’ input and the silicon revision.
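As noted at the start of this appendix, symbolic names make bit operations easier to read than raw hexadecimal masks, and reserved bits may only be written as zero. A minimal C sketch of both ideas for MODE1 follows. The macro and function names are hypothetical and are not the symbols defined in def21160.h; the bit positions are taken from Table A-2. The helpers operate on a plain 32-bit value only, because MODE1 is not memory mapped and this sketch does not show how a program reads or writes the register.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Illustrative names (not the def21160.h symbols); positions from Table A-2. */
#define MODE1_IRPTEN BIT(12) /* Global Interrupt Enable */
#define MODE1_ALUSAT BIT(13) /* ALU Saturation Select */
#define MODE1_PEYEN  BIT(21) /* Processor Element Y Enable */
#define MODE1_CBUFEN BIT(24) /* Circular Buffer Addressing Enable */

/* MODE1 reserved bits per Table A-2: 9-8, 20-19, and 31-25. */
#define MODE1_RESERVED \
    (UINT32_C(0x00000300) | UINT32_C(0x00180000) | UINT32_C(0xFE000000))

/* Sets the bits in mask and forces every reserved bit to zero. */
static inline uint32_t mode1_set(uint32_t value, uint32_t mask)
{
    return (value | mask) & (uint32_t)~MODE1_RESERVED;
}

/* Clears the bits in mask and forces every reserved bit to zero. */
static inline uint32_t mode1_clear(uint32_t value, uint32_t mask)
{
    return value & (uint32_t)~mask & (uint32_t)~MODE1_RESERVED;
}

/* Returns 1 only if all bits in mask are set in value, otherwise 0. */
static inline int mode1_test(uint32_t value, uint32_t mask)
{
    return (value & mask) == mask;
}
```

For example, mode1_set(0, MODE1_PEYEN | MODE1_IRPTEN) produces a value with only bits 21 and 12 set, and any reserved bit passed in by mistake is written back as zero.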
Mode Control 1 Register (MODE1)

This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 0000.

Table A-2. Mode Control 1 Register (MODE1) Bit Definitions

Bit(s)    Name    Definition
0    BR8    Bit Reverse Addressing For Index I8 Enable. This bit enables (bit reversed if set, =1) or disables (normal if cleared, =0) bit reversed addressing for accesses that are indexed with DAG2 register I8.
1    BR0    Bit Reverse Addressing For Index I0 Enable. This bit enables (bit reversed if set, =1) or disables (normal if cleared, =0) bit reversed addressing for accesses that are indexed with DAG1 register I0.
2    SRCU    Secondary Registers For Computational Units Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary result (MR) registers in the computational units.
3    SRD1H    Secondary Registers For DAG1 High Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary DAG1 registers for the upper half (I, M, L, B7-4) of the address generator.
4    SRD1L    Secondary Registers For DAG1 Low Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary DAG1 registers for the lower half (I, M, L, B3-0) of the address generator.
5    SRD2H    Secondary Registers For DAG2 High Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary DAG2 registers for the upper half (I, M, L, B15-12) of the address generator.
6    SRD2L    Secondary Registers For DAG2 Low Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary DAG2 registers for the lower half (I, M, L, B11-8) of the address generator.
7    SRRFH    Secondary Registers For Register File High Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary data registers for the upper half (R15-8) of the computational units.
9-8    Reserved
10    SRRFL    Secondary Registers For Register File Low Enable. This bit enables (use secondary if set, =1) or disables (use primary if cleared, =0) secondary data registers for the lower half (R7-0) of the computational units.
11    NESTM    Nesting Multiple Interrupts Enable. This bit enables (nest if set, =1) or disables (no nesting if cleared, =0) interrupt nesting in the interrupt controller. When interrupt nesting is disabled, a higher priority interrupt cannot interrupt a lower priority interrupt’s service routine. Other interrupts are latched as they occur, but the DSP processes them after the active routine finishes. When interrupt nesting is enabled, a higher priority interrupt can interrupt a lower priority interrupt’s service routine. Lower interrupts are latched as they occur, but the DSP processes them after the nested routines finish.
12    IRPTEN    Global Interrupt Enable. This bit enables (if set, =1) or disables (if cleared, =0) all maskable interrupts.
13    ALUSAT    ALU Saturation Select. This bit selects whether the computational units saturate results on positive or negative fixed-point overflows (if 1) or return unsaturated results (if 0).
14    SSE    Fixed-point Sign Extension Select. This bit selects whether the computational units sign extend short-word, 16-bit data (if 1) or zero-fill the upper 32 bits (if 0).
15    TRUNC    Truncation Rounding Mode Select. This bit selects whether the computational units round results with round-to-zero (if 1) or round-to-nearest (if 0).
16    RND32    Rounding For 32-bit Floating-point Data Select. This bit selects whether the computational units round floating-point data to 32 bits (if 1) or round to 40 bits (if 0).
18-17    CSEL    Bus Master Code Selection. These bits indicate whether the DSP processor has control of the external bus as follows: 00=DSP is bus master or 01, 10, 11=DSP is not bus master.
20-19    Reserved
21    PEYEN    Processor Element Y Enable. This bit enables computations in PEy (SIMD mode, if 1) or disables PEy (SISD mode, if 0). When set, Processing Element Y (computation units and register files) accepts instruction dispatches. When cleared, Processing Element Y goes into a low power mode.
22    BDCST9    Broadcast Register Loads Indexed With I9 Enable. This bit enables (broadcast I9 if set, =1) or disables (no I9 broadcast if cleared, =0) broadcast register loads for loads that use the data address generator I9 index. When the BDCST9 bit is set, data register loads from the PM data bus that use the I9 DAG2 index register are “broadcast” to a register or register pair in each PE.
23    BDCST1    Broadcast Register Loads Indexed With I1 Enable. This bit enables (broadcast I1 if set, =1) or disables (no I1 broadcast if cleared, =0) broadcast register loads for loads that use the data address generator I1 index. When the BDCST1 bit is set, data register loads from the DM data bus that use the I1 DAG1 index register are “broadcast” to a register or register pair in each PE.
24    CBUFEN    Circular Buffer Addressing Enable. This bit enables (circular if set, =1) or disables (linear if cleared, =0) circular buffer addressing for buffers with loaded I, M, B, and L data address generator registers.
31-25    Reserved

Mode Mask Register (MMASK)

This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0020 0000.
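The masking that MMASK applies to MODE1, described in this section, amounts to clearing in MODE1 every bit that is set in MMASK whenever the status stack is pushed. A minimal C sketch of that arithmetic follows. The function name is hypothetical, and the sketch models only the resulting MODE1 value, not the stack push or the interrupt hardware.

```c
#include <stdint.h>

/* Hypothetical helper: models the MODE1 value left after a status stack push. */
static inline uint32_t mode1_after_status_push(uint32_t mode1, uint32_t mmask)
{
    /* Every bit set in MMASK is cleared in MODE1; all other bits are unchanged. */
    return mode1 & ~mmask;
}
```

Using the values from the example in this section, mode1_after_status_push(0x01202811, 0x00202001) returns 0x01000810. With the MMASK reset value of 0x0020 0000, only bit 21 (PEYEN) is cleared.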
Each bit in the MMASK register corresponds to a bit in the MODE1 register. Bits that are set in MMASK are used to clear bits in MODE1 when the DSP's status stack is pushed. This effectively disables different modes upon servicing an interrupt, or when executing a Push Sts instruction.

The DSP’s status stack is pushed in two cases:
• When you execute a Push Sts instruction explicitly in your code.
• When an IRQ2-0, timer expired, or VIRPT interrupt occurs.

Example: Before the Push Sts instruction, MODE1 is set to 0x0120 2811. This MODE1 value corresponds to the following settings being enabled:
• Bit Reversing for I8
• Secondary Registers for DAG1 (low)
• Interrupt Nesting
• ALU Saturation
• Processor Element Y (SIMD)
• Circular Buffering

MMASK is set to 0x0020 2001, indicating that you want to disable ALU Saturation, SIMD, and bit reversing for I8 after pushing the status stack. The value in MODE1 after Push Sts is 0x0100 0810. The other settings that were previously in MODE1 remain the same. The only bits that are affected are those that are set both in MMASK and in MODE1. These bits are cleared after the status stack is pushed.

Note also that the reset value of MMASK is 0x0020 0000. If you do not make any changes to the MMASK register, the default setting will automatically disable SIMD when servicing any of the hardware interrupts mentioned above, or during any push of the status stack.

Mode Control 2 Register (MODE2)

This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0xmm00 0000. Because bits 31-25 in this register are the DSP ID and silicon revision, the reset value varies with the system setting and silicon revision. Bits 31-25 of the MODE2 register are also readable in the MODE2_SHDW register.
For more information, see “MODE2 Shadow Register (MODE2_SHDW)” on page A-53.

Table A-3. Mode Control 2 Register (MODE2) Bit Definitions

Bit    Name    Definition
0    IRQ0E    IRQ0 Sensitivity Select. This bit selects sensitivity for IRQ0 as edge-sensitive (if set, =1) or level-sensitive (if cleared, =0).
1    IRQ1E    IRQ1 Sensitivity Select. This bit selects sensitivity for IRQ1 as edge-sensitive (if set, =1) or level-sensitive (if cleared, =0).
2    IRQ2E    IRQ2 Sensitivity Select. This bit selects sensitivity for IRQ2 as edge-sensitive (if set, =1) or level-sensitive (if cleared, =0).
3    Reserved
4    CADIS    Cache Disable. This bit disables the instruction cache (if set, =1) or enables the cache (if cleared, =0).
5    TIMEN    Timer Enable. This bit enables the timer (starts, if set, =1) or disables the timer (stops, if cleared, =0).
6    BUSLK    Bus Lock Request. This bit requests bus lock (DSP retains bus mastership, if set, =1) or does not request bus lock (normal bus mastering, if cleared, =0).
14-7    Reserved
15    FLG0O    FLAG0 Output Select. This bit selects the I/O direction for FLAG0 as an output (if set, =1) or an input (if cleared, =0).
16    FLG1O    FLAG1 Output Select. This bit selects the I/O direction for FLAG1 as an output (if set, =1) or an input (if cleared, =0).
17    FLG2O    FLAG2 Output Select. This bit selects the I/O direction for FLAG2 as an output (if set, =1) or an input (if cleared, =0).
18    FLG3O    FLAG3 Output Select. This bit selects the I/O direction for FLAG3 as an output (if set, =1) or an input (if cleared, =0).
19    CAFRZ    Cache Freeze. This bit freezes the instruction cache (retains contents, if set, =1) or thaws the cache (allows new input, if cleared, =0).
20    IRAE    I/O Processor Register Access Enable. This bit enables detection of I/O processor register accesses (if set, =1) or disables detection of I/O processor register accesses (if cleared, =0). If IRAE is set, the DSP flags an access by setting the IRA bit in the STKYx register. For more information, see “IRA” on page A-14.
21    U64MAE    Unaligned 64-bit Memory Access Enable. This bit enables detection of unaligned long word accesses (if set, =1) or disables detection of unaligned long word accesses (if cleared, =0). If U64MAE is set, the DSP flags an unaligned long word access by setting the U64MA bit in the STKYx register. For more information, see “U64MA” on page A-14.
31-22    Reserved

Arithmetic Status Registers (ASTATx and ASTATy)

These are non-memory mapped, universal, system registers (UREG and SREG). The reset value for these registers is 0x0000 0000. Each processing element has its own ASTAT register. ASTATx indicates status for PEx operations, and ASTATy indicates status for PEy operations (see Table A-4).

If a program loads the ASTATx register manually, there is a one cycle effect latency before the new value in ASTATx can be used in a conditional instruction.

Table A-4. Arithmetic Status Registers (ASTATx/y) Bit Definitions

Bit(s)    Name    Definition
0    AZ    ALU Zero/Floating-Point Underflow. This bit indicates whether the last ALU operation’s result was zero (if set, =1) or non-zero (if cleared, =0). The ALU updates AZ for all fixed-point and floating-point ALU operations. AZ can also indicate a floating-point underflow. During an ALU underflow (indicated by a set (=1) AUS bit in the STKYx/y register), the DSP sets AZ if the floating-point result is smaller than can be represented in the output format.
1    AV    ALU Overflow. This bit indicates whether the last ALU operation’s result overflowed (if set, =1) or did not overflow (if cleared, =0). The ALU updates AV for all fixed-point and floating-point ALU operations. For fixed-point results, the DSP sets AV and the AOS bit in the STKYx/y register when the XOR of the two most significant bits is a 1. For floating-point results, the DSP sets AV and the AVS bit in the STKYx/y register when the rounded result overflows (unbiased exponent > 127).
2    AN    ALU Negative. This bit indicates whether the last ALU operation’s result was negative (if set, =1) or positive (if cleared, =0). The ALU updates AN for all fixed-point and floating-point ALU operations.
3    AC    ALU fixed-point Carry. This bit indicates whether the last ALU operation had a carry out of the most significant bit of the result (if set, =1) or had no carry (if cleared, =0). The ALU updates AC for all fixed-point operations. The DSP clears AC during fixed-point logic operations: PASS, MIN, MAX, COMP, ABS, and CLIP. The ALU reads the AC flag for fixed-point accumulate operations: addition with carry and fixed-point subtraction with carry.
4    AS    ALU X-Input Sign (for ABS and MANT). This bit indicates whether the last ALU ABS or MANT operation’s input was negative (if set, =1) or positive (if cleared, =0). The ALU updates AS only for fixed-point and floating-point ABS and MANT operations. The ALU clears AS for all operations other than ABS and MANT.
5    AI    ALU Floating-Point Invalid Operation. This bit indicates whether the last ALU operation’s input was invalid (if set, =1) or valid (if cleared, =0). The ALU updates AI for all fixed-point and floating-point ALU operations. The DSP sets AI and the AIS bit in the STKYx/y register if the ALU operation:
    • Receives a NAN input operand
    • Adds opposite-signed Infinities
    • Subtracts like-signed Infinities
    • Overflows during a floating-point to fixed-point conversion when saturation mode is not set
    • Operates on an Infinity when saturation mode is not set
6    MN    Multiplier Negative. This bit indicates whether the last multiplier operation’s result was negative (if set, =1) or positive (if cleared, =0). The multiplier updates MN for all fixed-point and floating-point multiplier operations.
7    MV    Multiplier Overflow. This bit indicates whether the last multiplier operation’s result overflowed (if set, =1) or did not overflow (if cleared, =0). The multiplier updates MV for all fixed-point and floating-point multiplier operations. For floating-point results, the DSP sets MV and the MVS bit in the STKYx/y register if the rounded result overflows (unbiased exponent > 127). For fixed-point results, the DSP sets MV and the MOS bit in the STKYx/y register if the result of the multiplier operation is:
    • Twos-complement, fractional with the upper 17 bits of MR not all zeros or all ones
    • Twos-complement, integer with the upper 49 bits of MR not all zeros or all ones
    • Unsigned, fractional with the upper 16 bits of MR not all zeros
    • Unsigned, integer with the upper 48 bits of MR not all zeros
    If the multiplier operation directs a fixed-point result to an MR register, the DSP places the overflowed portion of the result in MR1 and MR2 for an integer result or places it in MR2 only for a fractional result.
8    MU    Multiplier Floating-Point Underflow. This bit indicates whether the last multiplier operation’s result underflowed (if set, =1) or did not underflow (if cleared, =0). The multiplier updates MU for all fixed-point and floating-point multiplier operations. For floating-point results, the DSP sets MU and the MUS bit in the STKYx/y register if the floating-point result underflows (unbiased exponent < –126). Denormal operands are treated as zeros; therefore, they never cause underflows. For fixed-point results, the DSP sets MU and the MUS bit in the STKYx/y register if the result of the multiplier operation is:
    • Twos-complement, fractional: upper 48 bits all zeros or all ones, lower 32 bits not all zeros
    • Unsigned, fractional: upper 48 bits all zeros, lower 32 bits not all zeros
    If the multiplier operation directs a fixed-point, fractional result to an MR register, the DSP places the underflowed portion of the result in MR0.
9    MI    Multiplier Floating-Point Invalid Operation. This bit indicates whether the last multiplier operation’s input was invalid (if set, =1) or valid (if cleared, =0). The multiplier updates MI for floating-point multiplier operations. The DSP sets MI and the MIS bit in the STKYx/y register if the multiplier operation:
    • Receives a NAN input operand
    • Receives an Infinity and Zero as input operands
10    AF    ALU Floating-Point Operation. This bit indicates whether the last ALU operation was floating-point (if set, =1) or fixed-point (if cleared, =0). The ALU updates AF for all fixed-point and floating-point ALU operations.
11    SV    Shifter Overflow. This bit indicates whether the last shifter operation’s result overflowed (if set, =1) or did not overflow (if cleared, =0). The shifter updates SV for all shifter operations. The DSP sets SV if the shifter operation:
    • Shifts the significant bits to the left of the 32-bit fixed-point field
    • Tests, sets, or clears a bit outside of the 32-bit fixed-point field
    • Extracts a field that is past or crosses the left edge of the 32-bit fixed-point field
    • Performs a LEFTZ or LEFTO operation that returns a result of 32
12    SZ    Shifter Zero. This bit indicates whether the last shifter operation’s result was zero (if set, =1) or non-zero (if cleared, =0). The shifter updates SZ for all shifter operations. The DSP also sets SZ if the shifter operation performs a bit test on a bit outside of the 32-bit fixed-point field.
13    SS    Shifter Input Sign. This bit indicates whether the last shifter operation’s input was negative (if set, =1) or positive (if cleared, =0). The shifter updates SS for all shifter operations.
17-14    Reserved
18    BTF    Bit Test Flag for System Registers. This bit indicates whether the last Bit Tst operation on a system register was true (if set, =1) or false (if cleared, =0). The DSP sets BTF when the bit(s) in a system register and value in the Bit Tst instruction match. The DSP also sets BTF when the bit(s) in a system register and value in the Bit Xor instruction match.
23-19    Reserved
31-24    CACC    Compare Accumulation Shift Register. Bit 31 of CACC indicates which operand was greater during the last ALU compare operation: X input (if set, =1) or Y input (if cleared, =0). The other seven bits in CACC form a right-shift register, each storing a previous compare accumulation result. With each new compare, the DSP right shifts the values of CACC, storing the newest value in bit 31 and the oldest value in bit 24.

Sticky Status Registers (STKYx and STKYy)

These are non-memory mapped, universal, system registers (UREG and SREG).
The reset value for these registers is 0x0000 0000. Each processing element has its own STKY register. STKYx indicates status for PEx operations and indicates status for some program sequencer stacks. The STKYy register only indicates status for PEy operations.Table A-5 lists bits for both STKYx and STKYy, noting with an the bits that apply only to STKYx. A-12 ADSP-21160 SHARC DSP Hardware Reference Registers not clear themselves after the condition they flag is no longerbitstrue.do They remain “sticky” until cleared by the program. STKY The DSP sets a STKY bit in response to a condition. For example, the DSP sets the AUS bit in the STKY register when an ALU underflow set AZ in the ASTAT register. The DSP clears AZ if the next ALU operation does not cause an underflow, but AUS remains set until a program clears the STKY bit. Interrupt service routines should clear their interrupt’s corresponding STKY bit so the DSP can detect a re-occurrence of the condition. For example, an interrupt service routine for the floating-point underflow exception interrupt (FLTUI) would clear the AUS bit in the STKY register near the beginning of the routine. Table A-5. Sticky Status Registers (STKYx/y) Bit Definitions Bit(s) Name Definition At right: + shows bits in both STKYx/y x shows bits in STKYx only 0 AUS ALU Floating-Point Underflow. This bit is a sticky indicator for the ALU AZ bit. For more information, see “AZ” on page A-9. 1 AVS ALU Floating-Point Overflow. This bit is a sticky indicator for the + ALU AV bit. For more information, see “AV” on page A-9. 2 AOS ALU Fixed-Point Overflow. This bit is a sticky indicator for the ALU AV bit. For more information, see “AV” on page A-9. 4-3 + + Reserved 5 AIS ALU Floating-Point Invalid Operation. This bit is a sticky indica- + tor for the ALU AI bit. For more information, see “AI” on page A-10. 6 MOS Multiplier Fixed-Point Overflow. This bit is a sticky indicator for + the multiplier MV bit. 
For more information, see “MV” on page A-10. 7 MVS Multiplier Floating-Point Overflow. This bit is a sticky indicator for the multiplier MV bit. For more information, see “MV” on page A-10. ADSP-21160 SHARC DSP Hardware Reference + A-13 Control and Status System Registers Table A-5. Sticky Status Registers (STKYx/y) Bit Definitions (Cont’d) Bit(s) Name Definition At right: + shows bits in both STKYx/y x shows bits in STKYx only 8 MUS Multiplier Floating-Point Underflow. This bit is a sticky indicator + for the multiplier MU bit. For more information, see “MU” on page A-11. 9 MIS Multiplier Floating-Point Invalid Operation. This bit is a sticky indicator for the multiplier MI bit. For more information, see “MI” on page A-11. 16-10 + Reserved 17 CB7S DAG1 Circular Buffer 7 Overflow. This bit indicates whether a x circular buffer being addressed with DAG1 register I7 has overflowed (if set, =1) or has not overflowed (if cleared, =0). A circular buffer overflow occurs when DAG circular buffering operation increments the I register past the end of buffer. 18 CB15S DAG2 Circular Buffer 15 Overflow. This bit indicates whether a x circular buffer being addressed with DAG2 register I15 has overflowed (if set, =1) or has not overflowed (if cleared, =0). A circular buffer overflow occurs when DAG circular buffering operation increments the I register past the end of buffer. 19 IRA IOP Register Access. This bit indicates whether a core, host, or x multiprocessor access to I/O processor registers has occurred (if 1) or has not occurred (if 0). 20 U64MA Unaligned 64-bit Memory Access. This bit indicates whether a x Normal word access with the LW mnemonic addressing an uneven memory address has occurred (if 1) or has not occurred (if 0). 21 PCFL PC Stack Full. This bit indicates whether the PC stack is full (if 1) x or not full (if 0)—Not a sticky bit, cleared by a Pop. 22 PCEM PC Stack Empty. 
This bit indicates whether the PC stack is empty x (if 1) or not empty (if 0)—Not sticky, cleared by a Push. 23 SSOV Status Stack Overflow. This bit indicates whether the status stack is overflowed (if 1) or not overflowed (if 0)—A sticky bit. x 24 SSEM Status Stack Empty. This bit indicates whether the status stack is empty (if 1) or not empty (if 0)—Not sticky, cleared by a Push. x A-14 ADSP-21160 SHARC DSP Hardware Reference Registers Table A-5. Sticky Status Registers (STKYx/y) Bit Definitions (Cont’d) Bit(s) Name Definition At right: + shows bits in both STKYx/y x shows bits in STKYx only 25 LSOV Loop Stack Overflow. This bit indicates whether the loop counter x stack and loop stack are overflowed (if 1) or not overflowed (if 0)—A sticky bit. 26 LSEM Loop Stack Empty. This bit indicates whether the loop counter stack and loop stack are empty (if 1) or not empty (if 0)—Not sticky, cleared by a Push. 31-27 x Reserved User-Defined Status Registers (USTATx) These are non-memory mapped, universal, system registers (UREG and SREG). The reset value for these registers is 0x0000 0000. The USTATx registers are user-defined, general-purpose status registers. Programs can use these 32-bit registers with bitwise instructions (set, clear, test, and others). Often, programs use these registers for low-overhead, general-purpose flags or for temporary 32-bit storage of data. Processing Element Registers Except for the PX register, the DSP’s processing element registers store data for each element’s ALU, multiplier, and shifter. The inputs and outputs for processing element operations go through these registers. The PX register lets programs transfer data between the data buses, but cannot be an input or output in a calculation. ADSP-21160 SHARC DSP Hardware Reference A-15 Processing Element Registers Table A-6. 
Processing Element Universal Registers (UREG) Register Name and Page Reference Initialization After Reset “Data File Data Registers (Rx, Fx, Sx)” on page A-16 Undefined “Multiplier Results Registers (MRxF, MRxB)” on page A-16 Undefined “Program Memory Bus Exchange Register (PX)” on page A-17 Undefined Data File Data Registers (Rx, Fx, Sx) These are non-memory mapped, universal, data registers (UREG and DREG). Each of the DSP’s processing elements has a data register file—a set of 40-bit data registers that transfer data between the data buses and the computation units. These registers also provides local storage for operands and results. The R, F, and S prefixes on register names do not effect the 32-bit or 40-bit data transfer; the naming convention determines how the ALU, multiplier, and shifter treat the data and determines which processing element’s data registers are being used. For more information on how to use these registers, see “Data Register File” on page 2-28. Multiplier Results Registers (MRxF, MRxB) These are non-memory mapped, universal, data registers (UREG and DREG). Each of the DSP’s multipliers has a primary or foreground (MRF) register and alternate or background (MRB) results register. Fixed-point operations place 80-bit results in the multiplier’s foreground MRF register or background MRB register, depending on which is active. For more information on selecting the result register, see “Alternate (Secondary) Data Registers” on page 2-30. For more information on result register fields, see “Data Register File” on page 2-28. A-16 ADSP-21160 SHARC DSP Hardware Reference Registers Program Memory Bus Exchange Register (PX) These are non-memory mapped, universal registers (UREG only). The PM Bus Exchange (PX) register permits data to flow between the PM and DM data buses. The PX register can work as one 64-bit register or as two 32-bit registers (PX1 and PX2). 
For more information on data alignment of PX1 and PX2 within PX and PX register usage, see “Internal Data Bus Exchange” on page 5-7.

Program Sequencer Registers
The DSP’s program sequencer registers direct the execution of instructions. These registers include support for:
• Instruction pipeline
• Program and loop stacks
• Timer
• Interrupt mask and latch

Table A-7. Program Sequencer System Registers (UREG and SREG)
Register    Initialization After Reset
“Interrupt Latch Register (IRPTL)” on page A-18    0x0000 0000 (cleared)
“Interrupt Mask Register (IMASK)” on page A-23    0x0000 0003
“Interrupt Mask Pointer Register (IMASKP)” on page A-23    0x0000 0000 (cleared)
“Link Port Interrupt Register (LIRPTL)” on page A-24    0x0000 0000 (cleared)
“Flag Value Register (FLAGS)” on page A-27    0x0000 000n (see note 1)
Note 1: FLAGS bits 0-3 are equal to the values of the FLAG0-3 input pins after reset; the flag pins are configured as inputs after reset.

Table A-8. Program Sequencer Universal Registers (UREG only)
Register    Initialization After Reset
“Program Counter Register (PC)” on page A-28    Undefined
“Program Counter Stack Register (PCSTK)” on page A-29    Undefined
“Program Counter Stack Pointer Register (PCSTKP)” on page A-29    Undefined
“Fetch Address Register (FADDR)” on page A-30    Undefined
“Decode Address Register (DADDR)” on page A-30    Undefined
“Loop Address Stack Register (LADDR)” on page A-30    Undefined
“Current Loop Counter Register (CURLCNTR)” on page A-31    Undefined
“Loop Counter Register (LCNTR)” on page A-31    Undefined
“Timer Period Register (TPERIOD)” on page A-31    Undefined
“Timer Count Register (TCOUNT)” on page A-31    Undefined

Interrupt Latch Register (IRPTL)
This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 0000. The IRPTL register indicates latch status for interrupts.

Table A-9.
Interrupt Latch Register (IRPTL) Bit Definitions
Bit(s)    Name    Definition
0    EMUI    Emulator Interrupt. This bit indicates whether an EMUI interrupt is latched and is pending (if set, =1) or no EMUI interrupt is pending (if cleared, =0). An EMUI interrupt occurs on reset and when an external device asserts the EMU pin.
1    RSTI    Reset Interrupt. This bit indicates whether an RSTI interrupt is latched and is pending (if set, =1) or no RSTI interrupt is pending (if cleared, =0). An RSTI interrupt occurs on reset, when an external device asserts the RESET pin.
2    IICDI    Illegal Input Condition Detected. This bit indicates whether an IICD interrupt is latched and is pending (if set, =1) or no IICD interrupt is pending (if cleared, =0). An IICD interrupt occurs when a TRUE results from the logical OR'ing of the Illegal I/O Processor Register Access (IRA) and Unaligned 64-bit Memory Access (U64MA) bits in the STKYx registers. For more information, see “IRA” on page A-14 and “U64MA” on page A-14.
3    SOVFI    Stack Overflow/Full. This bit indicates whether an SOVFI interrupt is latched and is pending (if set, =1) or no SOVFI interrupt is pending (if cleared, =0). An SOVFI interrupt occurs when a stack in the program sequencer overflows or is full. For more information, see “PCFL” on page A-14, “SSOV” on page A-14, and “LSOV” on page A-15.
4    TMZHI    Timer Expired High Priority. This bit indicates whether a TMZHI interrupt is latched and is pending (if set, =1) or no TMZHI interrupt is pending (if cleared, =0). A TMZHI interrupt occurs when the timer decrements to zero. Note that this event also triggers a TMZLI interrupt. The following registers and bits control timer operations:
• The TCOUNT register contains the timer counter. The timer decrements the TCOUNT register each clock cycle.
• The TPERIOD value specifies the frequency of timer interrupts.
The number of cycles between interrupts is TPERIOD + 1. The maximum value of TPERIOD is 2^32 – 1.
• The TIMEN bit in the MODE2 register starts and stops the timer.
Because the timer expired event (TCOUNT decrements to zero) generates two interrupts, TMZHI and TMZLI, programs should unmask the timer interrupt with the desired priority and leave the other one masked.
5    VIRPTI    Multiprocessor Vector Interrupt. This bit indicates whether a VIRPTI interrupt is latched and is pending (if set, =1) or no VIRPTI interrupt is pending (if cleared, =0). A VIRPTI interrupt occurs when one of the DSPs in a multiprocessor system writes an address (the vector) to the DSP’s VIRPT register.
6    IRQ2I    IRQ2 Hardware Interrupt. This bit indicates whether an IRQ2I interrupt is latched and is pending (if set, =1) or no IRQ2I interrupt is pending (if cleared, =0). An IRQ2I interrupt occurs when an external device asserts the IRQ2 pin.
7    IRQ1I    IRQ1 Hardware Interrupt. This bit indicates whether an IRQ1I interrupt is latched and is pending (if set, =1) or no IRQ1I interrupt is pending (if cleared, =0). An IRQ1I interrupt occurs when an external device asserts the IRQ1 pin.
8    IRQ0I    IRQ0 Hardware Interrupt. This bit indicates whether an IRQ0I interrupt is latched and is pending (if set, =1) or no IRQ0I interrupt is pending (if cleared, =0). An IRQ0I interrupt occurs when an external device asserts the IRQ0 pin.
9    Reserved
10    SPR0I    SPORT Receive 0. This bit indicates whether a SPR0I interrupt is latched and is pending (if set, =1) or no SPR0I interrupt is pending (if cleared, =0). A SPR0I interrupt occurs two cycles after the last bit of an input serial word is latched into RX0.
11    SPR1I    SPORT Receive 1.
This bit indicates whether a SPR1I interrupt is latched and is pending (if set, =1) or no SPR1I interrupt is pending (if cleared, =0). A SPR1I interrupt occurs two cycles after the last bit of an input serial word is latched into RX1.
12    SPT0I    SPORT Transmit 0. This bit indicates whether a SPT0I interrupt is latched and is pending (if set, =1) or no SPT0I interrupt is pending (if cleared, =0). An SPT0I interrupt occurs two cycles after the last bit of an output serial word is latched from TX0.
13    SPT1I    SPORT Transmit 1. This bit indicates whether a SPT1I interrupt is latched and is pending (if set, =1) or no SPT1I interrupt is pending (if cleared, =0). An SPT1I interrupt occurs two cycles after the last bit of an output serial word is latched from TX1.
14    LPISUMI    Link Buffer DMA Summary. This bit indicates whether an LPISUMI interrupt is latched and is pending (if set, =1) or no LPISUMI interrupt is pending (if cleared, =0). An LPISUMI interrupt occurs when a TRUE results from the logical OR'ing of unmasked link port interrupts, which are configured in the LIRPTL register.
15    EP0I    External Port Buffer 0 DMA. This bit indicates whether an EP0I interrupt is latched and is pending (if set, =1) or no EP0I interrupt is pending (if cleared, =0). An EP0I interrupt occurs when the external port buffer’s DMA is disabled (DEN=0) and either:
• The buffer is set to receive (TRAN=0), and the buffer is not empty
• The buffer is set to transmit (TRAN=1), and the buffer is not full
16    EP1I    External Port Buffer 1 DMA. This bit indicates whether an EP1I interrupt is latched and is pending (if set, =1) or no EP1I interrupt is pending (if cleared, =0). For more information, see “EP0I” on page A-21.
17    EP2I    External Port Buffer 2 DMA.
This bit indicates whether an EP2I interrupt is latched and is pending (if set, =1) or no EP2I interrupt is pending (if cleared, =0). For more information, see “EP0I” on page A-21.
18    EP3I    External Port Buffer 3 DMA. This bit indicates whether an EP3I interrupt is latched and is pending (if set, =1) or no EP3I interrupt is pending (if cleared, =0). For more information, see “EP0I” on page A-21.
19    LSRQI    Link Port Service Request. This bit indicates whether an LSRQI interrupt is latched and is pending (if set, =1) or no LSRQI interrupt is pending (if cleared, =0). An LSRQI interrupt occurs when an external source accesses an unassigned link port or accesses an assigned link port that has its link buffer disabled.
20    CB7I    DAG1 Circular Buffer 7 Overflow. This bit indicates whether a CB7I interrupt is latched and is pending (if set, =1) or no CB7I interrupt is pending (if cleared, =0). For more information, see “CB7S” on page A-14.
21    CB15I    DAG2 Circular Buffer 15 Overflow. This bit indicates whether a CB15I interrupt is latched and is pending (if set, =1) or no CB15I interrupt is pending (if cleared, =0). For more information, see “CB15S” on page A-14.
22    TMZLI    Timer Expired (Low Priority). This bit indicates whether a TMZLI interrupt is latched and is pending (if set, =1) or no TMZLI interrupt is pending (if cleared, =0). For more information, see “TMZHI” on page A-19.
23    FIXI    Fixed-Point Overflow. This bit indicates whether a FIXI interrupt is latched and is pending (if set, =1) or no FIXI interrupt is pending (if cleared, =0). For more information, see “AOS” on page A-13.
24    FLTOI    Floating-Point Overflow. This bit indicates whether a FLTOI interrupt is latched and is pending (if set, =1) or no FLTOI interrupt is pending (if cleared, =0). For more information, see “AVS” on page A-13.
25    FLTUI    Floating-Point Underflow. This bit indicates whether a FLTUI interrupt is latched and is pending (if set, =1) or no FLTUI interrupt is pending (if cleared, =0). For more information, see “AUS” on page A-13.
26    FLTII    Floating-Point Invalid Operation. This bit indicates whether a FLTII interrupt is latched and is pending (if set, =1) or no FLTII interrupt is pending (if cleared, =0). For more information, see “AIS” on page A-13.
27    SFT0I    User Software Interrupt 0. This bit indicates whether a SFT0I interrupt is latched and is pending (if set, =1) or no SFT0I interrupt is pending (if cleared, =0). An SFT0I interrupt occurs when a program sets (=1) this bit.
28    SFT1I    User Software Interrupt 1. This bit indicates whether a SFT1I interrupt is latched and is pending (if set, =1) or no SFT1I interrupt is pending (if cleared, =0). For more information, see “SFT0I” on page A-22.
29    SFT2I    User Software Interrupt 2. This bit indicates whether a SFT2I interrupt is latched and is pending (if set, =1) or no SFT2I interrupt is pending (if cleared, =0). For more information, see “SFT0I” on page A-22.
30    SFT3I    User Software Interrupt 3. This bit indicates whether a SFT3I interrupt is latched and is pending (if set, =1) or no SFT3I interrupt is pending (if cleared, =0). For more information, see “SFT0I” on page A-22.
31    Reserved

Interrupt Mask Register (IMASK)
This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 0003. Each bit in the IMASK register corresponds to a bit with the same name in the IRPTL register. The bits in IMASK unmask (enable if set, =1) or mask (disable if cleared, =0) the interrupts that are latched in the IRPTL register. Except for RESET, all interrupts are maskable. When IMASK masks an interrupt, the masking disables the DSP’s response to the interrupt.
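The split between latching (IRPTL) and responding (IMASK) can be illustrated with a rough host-side C model. This is a sketch under stated assumptions, not DSP code: irq_model_t and the irq_* functions are hypothetical names, the bit positions are taken from Table A-9, and treating RSTI as always serviceable is simply this model's reading of "except for RESET, all interrupts are maskable".

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from Table A-9. */
#define IRPTL_RSTI  (1u << 1)
#define IRPTL_TMZHI (1u << 4)
#define IRPTL_IRQ2I (1u << 6)

/* IMASK reset value as stated in the IMASK description. */
#define IMASK_RESET_VALUE 0x00000003u

/* Hypothetical host-side model of the latch and mask registers. */
typedef struct {
    uint32_t irptl;  /* latched (pending) interrupts */
    uint32_t imask;  /* 1 = unmasked, 0 = masked      */
} irq_model_t;

static inline void irq_model_reset(irq_model_t *m) {
    m->irptl = 0u;
    m->imask = IMASK_RESET_VALUE;
}

/* Latching happens whether or not the interrupt is masked. */
static inline void irq_latch(irq_model_t *m, uint32_t bit) {
    assert(bit != 0u && (bit & (bit - 1u)) == 0u);  /* exactly one bit */
    m->irptl |= bit;
}

/* Interrupts the model would respond to: latched AND unmasked.
 * ASSUMPTION: RSTI is treated as non-maskable in this model. */
static inline uint32_t irq_serviceable(const irq_model_t *m) {
    return m->irptl & (m->imask | IRPTL_RSTI);
}
```

In this model, calling irq_latch() for a masked interrupt sets its IRPTL bit while irq_serviceable() still reports nothing for it; setting the matching IMASK bit afterward makes the already latched interrupt serviceable.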
The IRPTL register still latches an interrupt even when masked, and the DSP responds to that latched interrupt if it is later unmasked.

Interrupt Mask Pointer Register (IMASKP)
This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 0000. Each bit in the IMASKP register corresponds to a bit with the same name in the IRPTL register. This register supports an interrupt nesting scheme that lets higher priority events interrupt an interrupt service routine and keeps lower priority events from interrupting.
When interrupt nesting is enabled (NESTM=1 in the MODE1 register), the bits in IMASKP mask lower priority and unmask higher priority interrupts than the interrupt that is currently being serviced. The IRPTL register still latches a lower priority interrupt even when masked, and the DSP responds to that latched interrupt if it is later unmasked.
When interrupt nesting is disabled (NESTM=0 in the MODE1 register), the bits in IMASKP mask all interrupts while an interrupt is currently being serviced. The IRPTL register still latches these interrupts even when masked, and the DSP responds to the highest priority latched interrupt after servicing the current interrupt. For more information, see “NESTM” on page A-4.

Link Port Interrupt Register (LIRPTL)
This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 0000. The LIRPTL register indicates latch status, selects masking, and displays mask pointers for link port interrupts.
Note that the LPISUM bit in the IRPTL register contains a logical OR'ing of the link port latch bits (LIRPTL5-0). For more information, see “LPISUMI” on page A-20.

Table A-10. Link Port Interrupt Latch, Mask, and Mask Pointer Register (LIRPTL) Bit Definitions
Bit    Name    Definition
0    LP0    Link Port Buffer 0 Interrupt.
This bit indicates whether an LP0 interrupt is latched and is pending (if set, =1) or no LP0 interrupt is pending (if cleared, =0). An LP0 interrupt occurs when the link port buffer’s DMA is disabled (DEN=0) and either:
• The buffer is set to receive (TRAN=0), and the buffer is not empty
• The buffer is set to transmit (TRAN=1), and the buffer is not full
Note that the LPx bit is set whether the link port is enabled in core mode or DMA mode.
1    LP1    Link Port Buffer 1 Interrupt. This bit indicates whether an LP1 interrupt is latched and is pending (if set, =1) or no LP1 interrupt is pending (if cleared, =0). For more information, see “LP0” on page A-24.
2    LP2    Link Port Buffer 2 Interrupt. This bit indicates whether an LP2 interrupt is latched and is pending (if set, =1) or no LP2 interrupt is pending (if cleared, =0). For more information, see “LP0” on page A-24.
3    LP3    Link Port Buffer 3 Interrupt. This bit indicates whether an LP3 interrupt is latched and is pending (if set, =1) or no LP3 interrupt is pending (if cleared, =0). For more information, see “LP0” on page A-24.
4    LP4    Link Port Buffer 4 Interrupt. This bit indicates whether an LP4 interrupt is latched and is pending (if set, =1) or no LP4 interrupt is pending (if cleared, =0). For more information, see “LP0” on page A-24.
5    LP5    Link Port Buffer 5 Interrupt. This bit indicates whether an LP5 interrupt is latched and is pending (if set, =1) or no LP5 interrupt is pending (if cleared, =0). For more information, see “LP0” on page A-24.
15-6    Reserved
16    LP0MSK    Link Buffer 0 DMA Interrupt Mask. This bit unmasks the LP0 interrupt (if set, =1) or masks the LP0 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
17    LP1MSK    Link Buffer 1 DMA Interrupt Mask.
This bit unmasks the LP1 interrupt (if set, =1) or masks the LP1 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
18    LP2MSK    Link Buffer 2 DMA Interrupt Mask. This bit unmasks the LP2 interrupt (if set, =1) or masks the LP2 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
19    LP3MSK    Link Buffer 3 DMA Interrupt Mask. This bit unmasks the LP3 interrupt (if set, =1) or masks the LP3 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
20    LP4MSK    Link Buffer 4 DMA Interrupt Mask. This bit unmasks the LP4 interrupt (if set, =1) or masks the LP4 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
21    LP5MSK    Link Buffer 5 DMA Interrupt Mask. This bit unmasks the LP5 interrupt (if set, =1) or masks the LP5 interrupt (if cleared, =0). For more information on how interrupt masking works, see “Interrupt Mask Register (IMASK)” on page A-23.
23-22    Reserved
24    LP0MSKP    Link Buffer 0 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP0 interrupt is masked (if set, =1) or the LP0 interrupt is unmasked (if cleared, =0). For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
25    LP1MSKP    Link Buffer 1 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP1 interrupt is masked (if set, =1) or the LP1 interrupt is unmasked (if cleared, =0).
For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
26    LP2MSKP    Link Buffer 2 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP2 interrupt is masked (if set, =1) or the LP2 interrupt is unmasked (if cleared, =0). For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
27    LP3MSKP    Link Buffer 3 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP3 interrupt is masked (if set, =1) or the LP3 interrupt is unmasked (if cleared, =0). For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
28    LP4MSKP    Link Buffer 4 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP4 interrupt is masked (if set, =1) or the LP4 interrupt is unmasked (if cleared, =0). For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
29    LP5MSKP    Link Buffer 5 DMA Interrupt Mask Pointer. When the DSP is servicing another interrupt, this bit indicates whether the LP5 interrupt is masked (if set, =1) or the LP5 interrupt is unmasked (if cleared, =0). For more information on how interrupt mask pointers work, see “Interrupt Mask Pointer Register (IMASKP)” on page A-23.
31-30    Reserved

Flag Value Register (FLAGS)
This is a non-memory mapped, universal, system register (UREG and SREG). The reset value for this register is 0x0000 000n, where bits 0-3 take the values of the FLAG0-3 input pins after reset (see Table A-7). The FLAGS register indicates the state of the FLGx pins. When a FLGx pin is an output, the DSP outputs a high when a program sets the pin’s bit in FLAGS.
The I/O direction (input or output) selection of each bit is controlled by its FLGxO bit in the MODE2 register. For more information, see “FLG0O” on page A-7.

Table A-11. Flag Value Register (FLAGS) Bit Definitions
Bit    Name    Definition
0    FLG0    FLAG0 Value. This bit indicates the state of the FLAG0 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
1    FLG1    FLAG1 Value. This bit indicates the state of the FLAG1 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
2    FLG2    FLAG2 Value. This bit indicates the state of the FLAG2 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
3    FLG3    FLAG3 Value. This bit indicates the state of the FLAG3 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
31-4    Reserved

Program Counter Register (PC)
This is a non-memory mapped, universal register (UREG only). The Program Counter register is the last stage in the fetch-decode-execute instruction pipeline and contains the 24-bit address of the instruction that the DSP executes on the next cycle. The PC couples with the Program Counter Stack, PCSTK, which stores return addresses and top-of-loop addresses. All addresses generated by the sequencer are 24-bit program memory instruction addresses.
As shown in Figure A-1, the address buses can handle 32-bit addresses, but the program sequencer only generates 24-bit addresses over the PM bus. Because the sequencer generates 24-bit addresses, sequencing is limited to the low 16 Mwords (2^24 words) of the DSP’s 4 Gword memory map.

[Figure: the PM and DM address buses and DAGs can handle 32-bit addresses; the program sequencer handles 24-bit addresses. Three fields in the address identify the type of memory being addressed: E field (bits 31-23, external memory), M field (bits 22-20, multiprocessor memory), and S field (bits 19-17, system (internal) memory).]
Figure A-1.
PM and DM Bus Addresses Versus Sequencing Addresses

Table A-12 describes the three fields that appear in Figure A-1. The content of the External (E), Multiprocessor (M), and System (S) fields in the address routes the data or instruction access to the memory space.

Table A-12. PM and DM Address Bus E, M, and S Fields
Bit Field    Description
E    External. Values in this field have the following meaning:
     non-zero: the address is in external memory; with the E bits active, the remaining bits [22-0] are a valid address.
     all zeros: the address is in the DSP’s internal memory or in the internal memory of another ADSP-21160 DSP (M and S activated).
M    Multiprocessor. Values in this field have the following meaning:
     non-zero: ID of another ADSP-21160
     111: broadcast write to internal memory of all ADSP-21160s
     000: address in the DSP’s own internal memory
S    System. Values in this field have the following meaning:
     000: address of an IOP register
     001: address in Long Word Addressing space
     01x: address in Normal Word Addressing space
     1xx: address in Short Word Addressing space

Program Counter Stack Register (PCSTK)
This is a non-memory mapped, universal register (UREG only). The Program Counter Stack register contains the address of the top of the PC stack. This register is readable and writable.

Program Counter Stack Pointer Register (PCSTKP)
This is a non-memory mapped, universal register (UREG only). The Program Counter Stack Pointer register contains the PC stack pointer value. This value is zero when the PC stack is empty, is 1 through 30 when the stack contains data, and is 31 when the stack overflows. This register is readable and writable. A write to PCSTKP takes effect after a one-cycle delay. If the PC stack is overflowed, a write to PCSTKP has no effect.

Fetch Address Register (FADDR)
This is a non-memory mapped, universal register (UREG only).
The Fetch Address register is the first stage in the fetch-decode-execute instruction pipeline and contains the 24-bit address of the instruction that the DSP fetches from memory on the next cycle.

Decode Address Register (DADDR)
This is a non-memory mapped, universal register (UREG only). The Decode Address register is the second stage in the fetch-decode-execute instruction pipeline and contains the 24-bit address of the instruction that the DSP decodes on the next cycle.

Loop Address Stack Register (LADDR)
This is a non-memory mapped, universal register (UREG only). The Loop Address Stack is six levels deep by 32 bits wide. The 32-bit word of each level consists of a 24-bit loop termination address, a 5-bit termination code, and a 2-bit loop type code:

Table A-13. Loop Address Stack Register (LADDR)
Bits    Value
0-23    loop termination address
24-28    termination code
29    reserved (always reads 0)
30-31    loop type code
         00 = arithmetic condition-based (not LCE)
         01 = counter-based, length 1
         10 = counter-based, length 2
         11 = counter-based, length > 2

Current Loop Counter Register (CURLCNTR)
This is a non-memory mapped, universal register (UREG only). The Current Loop Counter register provides access to the loop counter stack and tracks iterations for the Do/Until LCE loop being executed. For more information on how to use CURLCNTR, see “Loop Counter Stack” on page 3-28.

Loop Counter Register (LCNTR)
This is a non-memory mapped, universal register (UREG only). The Loop Counter register provides access to the loop counter stack and holds the count value before the Do/Until LCE loop is executed. For more information on how to use LCNTR, see “Loop Counter Stack” on page 3-28.

Timer Period Register (TPERIOD)
This is a non-memory mapped, universal register (UREG only). The Timer Period register contains the timer period, which sets the number of cycles between timer interrupts (TPERIOD + 1).
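The TMZHI description in Table A-9 gives the number of cycles between timer interrupts as TPERIOD + 1. The arithmetic can be sketched in C as below; this is a host-side helper for choosing a register value, not DSP code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles between timer interrupts for a given TPERIOD value,
 * using the relationship cycles = TPERIOD + 1 from Table A-9. */
static inline uint64_t timer_cycles_between_interrupts(uint32_t tperiod) {
    return (uint64_t)tperiod + 1u;
}

/* TPERIOD value to load for a desired interval in cycles.
 * Valid intervals run from 1 to 2^32 cycles, because TPERIOD is a
 * 32-bit value with a maximum of 2^32 - 1. */
static inline uint32_t timer_tperiod_for_cycles(uint64_t cycles) {
    assert(cycles >= 1u && cycles <= 0x100000000ULL);
    return (uint32_t)(cycles - 1u);
}
```

For example, an interrupt every 1000 cycles corresponds to a TPERIOD value of 999.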
For more information on how to use the timer, see “Timer and Sequencing” on page 3-48.

Timer Count Register (TCOUNT)
This is a non-memory mapped, universal register (UREG only). The Timer Count register contains the decrementing timer count value, counting down the cycles between timer interrupts. For more information on how to use the timer, see “Timer and Sequencing” on page 3-48.

Data Address Generator Registers
The DSP’s Data Address Generator (DAG) registers hold data addresses, modify values, and circular buffer configurations. Using these registers, the DAGs can automatically increment addressing for ranges of data locations (a buffer).

Table A-14. Data Address Generator Universal Registers (UREG only)
Register    Initialization After Reset
“Index Registers (Ix)” on page A-32    Undefined
“Modify Registers (Mx)” on page A-32    Undefined
“Length and Base Register (Lx, Bx)” on page A-33    Undefined

Index Registers (Ix)
These are non-memory mapped, universal registers (UREG only). The Data Address Generators store addresses in Index registers (I0-I7 for DAG1 and I8-I15 for DAG2). An index register holds an address and acts as a pointer to memory. For more information, see “Overview” in Chapter 4, Data Address Generators.

Modify Registers (Mx)
These are non-memory mapped, universal registers (UREG only). The Data Address Generators update stored addresses using Modify registers (M0-M7 for DAG1 and M8-M15 for DAG2). A modify register provides the increment or step size by which an index register is pre- or post-modified during a register move. For more information, see “Overview” in Chapter 4, Data Address Generators.

Length and Base Register (Lx, Bx)
These are non-memory mapped, universal registers (UREG only).
The Data Address Generators control circular buffering operations with Length and Base registers (L0-L7 and B0-B7 for DAG1 and L8-L15 and B8-B15 for DAG2). Length and base registers set up the range of addresses and the starting address for a circular buffer. For more information, see “Overview” in Chapter 4, Data Address Generators.

I/O Processor Registers
The I/O processor’s registers are accessible as part of the DSP’s memory map. Table A-16 on page A-36 lists the I/O processor’s memory mapped registers in address order and provides a cross reference to a description of each register. These registers occupy addresses 0x00 through 0xFF of the memory map and control I/O operations, including:
• External port DMA
• Link port DMA
• Serial port DMA
I/O processor registers have a one cycle effect latency (changes take effect on the second cycle after the change).
Because the I/O processor’s registers are part of the DSP’s memory map, buses access these registers as locations in memory. While these registers act as memory mapped locations, they are separate from the DSP’s internal memory and have different bus access. One bus can access one I/O processor register from one I/O processor register group at a time. Table A-15 lists the I/O processor register groups.
When there is contention among the buses for access to registers in the same I/O processor register group, the DSP arbitrates register access as follows:
• External Port (EP) bus accesses (highest priority)
• Data Memory (DM) bus
• Program Memory (PM) bus
• I/O processor (IO) bus (lowest priority)
The bus with highest priority gets access to the I/O processor register group, and the other buses are held off from accessing that I/O processor register group until that access has been completed. There is one exception to this access contention rule.
The IO bus and EP bus can simultaneously access the DB (DMA buffer) group of registers, allowing DMA transfers to internal memory at full speed.

Table A-15. I/O Processor Register Groups
Register Group    I/O Processor Registers In This Group
System Control (SC) Registers    SYSCON, VIRPT, WAIT, SYSTAT, MSGR0, MSGR1, MSGR2, MSGR3, MSGR4, MSGR5, MSGR6, MSGR7, BMAX, BCNT, ELAST, PC_SHDW, MODE2_SHDW
DMA Address (DA) Registers    II4, IM4, C4, CP4, GP4, DB4, DA4, II5, IM5, C5, CP5, GP5, DB5, DA5, II6, IM6, C6, CP6, GP6, EI6, EM6, EC6, II7, IM7, C7, CP7, GP7, EI7, EM7, EC7, II8, IM8, C8, CP8, GP8, EI8, EM8, EC8, II9, IM9, C9, CP9, GP9, EI9, EM9, EC9, II0, IM0, C0, CP0, GP0, DB0, DA0, II1, IM1, C1, CP1, GP1, DB1, DA1, II2, IM2, C2, CP2, GP2, DB2, DA2, II3, IM3, C3, CP3, GP3, DB3, DA3, DMASTAT
DMA Buffer (DB) Registers    EPB0, EPB1, EPB2, EPB3, DMAC6, DMAC7, DMAC8, DMAC9
Link and Serial Port (LSP) Registers    LBUF0, LBUF1, LBUF2, LBUF3, LBUF4, LBUF5, LCTL, LCOM, LAR, LSRQ, LPATH1, LPATH2, LPATH3, LPCNT, CNST1, CNST2, STCTL0, SRCTL0, TX0, RX0, TDIV0, RDIV0, MTCS0, MRCS0, MTCCS0, MRCCS0, SPATH0, KEYWD0, KEYMASK0, STCTL1, SRCTL1, TX1, RX1, TDIV1, RDIV1, MTCS1, MRCS1, MTCCS1, MRCCS1, SPATH1, KEYWD1, KEYMASK1

Because the I/O processor registers are memory-mapped, the DSP’s architecture does not allow programs to directly transfer data between these registers and other memory locations, except as part of a DMA operation. To read or write I/O processor registers, programs must use the processor core registers. The following example code shows an immediate value being loaded into the USTAT2 register, then that value being transferred to the I/O processor WAIT register.

    USTAT2= 0x108421;  /* 1st instr. to be executed after reset */
    DM(WAIT)=USTAT2;   /* Set external memory waitstates to 0 */

The register names for I/O processor registers are not part of the DSP’s assembly syntax. To ease access to these registers, programs should use the #include command to incorporate a file containing the registers’ symbolic names and addresses. An example #include file appears in the “Register and Bit #Defines File (def21160.h)” on page A-81.

Table A-16. I/O Processor Registers Memory Map
Register Address    Register Name    Initialization After Reset    Register Group    Page Cross Reference
0x00    SYSCON    0x0001 0010    SC    on page A-45
0x01    VIRPT    0x0004 0014    SC    on page A-48
0x02    WAIT    0x01ce 739c    SC    on page A-48
0x03    SYSTAT    0x000 0nn0    SC    on page A-50
0x04    EPB0    ni    DB    on page A-52
0x06    EPB1    ni    DB    on page A-52
0x08    MSGR0    ni    SC    on page A-52
0x09    MSGR1    ni    SC    on page A-52
0x0a    MSGR2    ni    SC    on page A-52
0x0b    MSGR3    ni    SC    on page A-52
0x0c    MSGR4    ni    SC    on page A-52
0x0d    MSGR5    ni    SC    on page A-52
0x0e    MSGR6    ni    SC    on page A-52
0x0f    MSGR7    ni    SC    on page A-52
0x10    PC_SHDW    ni    SC    on page A-53
0x11    MODE2_SHDW    ni    SC    on page A-53
0x12 – 0x13    Reserved
0x14    EPB2    ni    DB    on page A-52
0x16    EPB3    ni    DB    on page A-52
0x18    BMAX    0x0000 0000    SC    on page A-54
0x19    BCNT    0x0000 0000    SC    on page A-54
Notes: An “ni” in the Initialization column indicates that the register is Not Initialized. For information on Register Groups, see Table A-15 on page A-34. An * denotes that initialization depends on the booting mode. For more information, see “Bootloading Through The External Port” on page 6-76 or “Bootloading Through The Link Port” on page 6-87.

Table A-16.
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0x1a  Reserved
0x1b  ELAST  ni  SC  on page A-54
0x1c  DMAC10  ni*  DB  on page A-54
0x1d  DMAC11  0x0000 0000  DB  on page A-54
0x1e  DMAC12  0x0000 0000  DB  on page A-54
0x1f  DMAC13  0x0000 0000  DB  on page A-54
0x20 – 0x2f  Reserved
0x30  II4  ni  DA  on page A-58
0x31  IM4  ni  DA  on page A-59
0x32  C4  ni  DA  on page A-59
0x33  CP4  ni  DA  on page A-59
0x34  GP4  ni  DA  on page A-60
0x35  DB4  ni  DA  on page A-60
0x36  DA4  ni  DA  on page A-60
0x37  DMASTAT  ni  DA  on page A-60
0x38  II5  ni  DA  on page A-58
0x39  IM5  ni  DA  on page A-59
0x3a  C5  ni  DA  on page A-59
0x3b  CP5  ni  DA  on page A-59
0x3c  GP5  ni  DA  on page A-60
0x3d  DB5  ni  DA  on page A-60
0x3e  DA5  ni  DA  on page A-60
Reserved
Reserved
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0x3f  Reserved  ni  DA  Reserved
0x40  II10  ni*  DA  on page A-58
0x41  IM10  ni*  DA  on page A-59
0x42  C10  ni*  DA  on page A-59
0x43  CP10  ni*  DA  on page A-59
0x44  GP10  ni*  DA  on page A-60
0x45  EI10  ni*  DA  on page A-61
0x46  EM10  ni*  DA  on page A-61
0x47  EC10  ni*  DA  on page A-62
0x48  II11  ni  DA  on page A-58
0x49  IM11  ni  DA  on page A-59
0x4a  C11  ni  DA  on page A-59
0x4b  CP11  ni  DA  on page A-59
0x4c  GP11  ni  DA  on page A-60
0x4d  EI11  ni  DA  on page A-61
0x4e  EM11  ni  DA  on page A-61
0x4f  EC11  ni  DA  on page A-62
0x50  II12  ni  DA  on page A-58
0x51  IM12  ni  DA  on page A-59
0x52  C12  ni  DA  on page A-59
0x53  CP12  ni  DA  on page A-59
0x54  GP12  ni  DA  on page A-60
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0x55  EI12  ni  DA  on page A-61
0x56  EM12  ni  DA  on page A-61
0x57  EC12  ni  DA  on page A-62
0x58  II13  ni  DA  on page A-58
0x59  IM13  ni  DA  on page A-59
0x5a  C13  ni  DA  on page A-59
0x5b  CP13  ni  DA  on page A-59
0x5c  GP13  ni  DA  on page A-60
0x5d  EI13  ni  DA  on page A-61
0x5e  EM13  ni  DA  on page A-61
0x5f  EC13  ni  DA  on page A-62
0x60  II0  ni  DA  on page A-58
0x61  IM0  ni  DA  on page A-59
0x62  C0  ni  DA  on page A-59
0x63  CP0  ni  DA  on page A-59
0x64  GP0  ni  DA  on page A-60
0x65  DB0  ni  DA  on page A-60
0x66  DA0  ni  DA  on page A-60
0x68  II1  ni  DA  on page A-58
0x69  IM1  ni  DA  on page A-59
0x6a  C1  ni  DA  on page A-59
0x6b  CP1  ni  DA  on page A-59
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0x6c  GP1  ni  DA  on page A-60
0x6d  DB1  ni  DA  on page A-60
0x6e  DA1  ni  DA  on page A-60
0x6f  Reserved
0x70  II2  ni  DA  on page A-58
0x71  IM2  ni  DA  on page A-59
0x72  C2  ni  DA  on page A-59
0x73  CP2  ni  DA  on page A-59
0x74  GP2  ni  DA  on page A-60
0x75  DB2  ni  DA  on page A-60
0x76  DA2  ni  DA  on page A-60
0x78  II3  ni  DA  on page A-58
0x79  IM3  ni  DA  on page A-59
0x7a  C3  ni  DA  on page A-59
0x7b  CP3  ni  DA  on page A-59
0x7c  GP3  ni  DA  on page A-60
0x7d  DB3  ni  DA  on page A-60
0x7e  DA3  ni  DA  on page A-60
0x7f  Reserved
0x80  II6  ni  DA  on page A-58
0x81  IM6  ni  DA  on page A-59
0x82  C6  ni  DA  on page A-59
Reserved
Reserved
0x83  CP6  ni  DA  on page A-59
0x84  GP6  ni  DA  on page A-60
0x85  DB6  ni  DA  on page A-60
0x86  DA6  ni  DA  on page A-60
0x88  II7  ni  DA  on page A-58
0x89  IM7  ni  DA  on page A-59
0x8a  C7  ni  DA  on page A-59
0x8b  CP7  ni  DA  on page A-59
0x8c  GP7  ni  DA  on page A-60
0x8d  DB7  ni  DA  on page A-60
0x8e  DA7  ni  DA  on page A-60
0x8f  Reserved
0x90  II8  ni  DA  on page A-58
0x91  IM8  ni  DA  on page A-59
0x92  C8  ni  DA  on page A-59
0x93  CP8  ni  DA  on page A-59
0x94  GP8  ni  DA  on page A-60
0x95  DB8  ni  DA  on page A-60
0x96  DA8  ni  DA  on page A-60
0x98  II9  ni  DA  on page A-58
0x99  IM9  ni  DA  on page A-59
0x9a  C9  ni  DA  on page A-59
Reserved
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0x9b  CP9  ni  DA  on page A-59
0x9c  GP9  ni  DA  on page A-60
0x9d  DB9  ni  DA  on page A-60
0x9e  DA9  ni  DA  on page A-60
0x9f–0xbf  Reserved (emulation control registers)
0xc0  LBUF0  ni  LSP  on page A-62
0xc2  LBUF1  ni  LSP  on page A-62
0xc4  LBUF2  ni  LSP  on page A-62
0xc6  LBUF3  ni  LSP  on page A-62
0xc8  LBUF4  ni  LSP  on page A-62
0xca  LBUF5  ni  LSP  on page A-62
0xcc  LCTL0  0x0000 0000  LSP  on page A-62
0xcd  LCTL1  0x0000 0000  LSP  on page A-62
0xce  LCOM  0x0000 0000  LSP  on page A-65
0xcf  LAR  0x0002 C688  LSP  on page A-67
0xd0  LSRQ  0x0000 0000  LSP  on page A-68
0xd1  LPATH1  ni  LSP  on page A-70
0xd2  LPATH2  ni  LSP  on page A-70
0xd3  LPATH3  ni  LSP  on page A-70
0xd4  LPCNT  ni  LSP  on page A-70
0xd5  CNST1  ni  LSP  on page A-71
0xd6  CNST2  ni  LSP  on page A-71
Table A-16. I/O Processor Registers Memory Map (Cont’d)

Register Address  Register Name  Initialization After Reset  Register Group  Page Cross Reference
0xd7 – 0xdf  Reserved
0xe0  STCTL0  0x0000 0000  LSP  on page A-71
0xe1  SRCTL0  0x0000 0000  LSP  on page A-73
0xe2  TX0  ni  LSP  on page A-73
0xe3  RX0  ni  LSP  on page A-76
0xe4  TDIV0  ni  LSP  on page A-76
0xe5  TCNT0  ni  LSP  on page A-77
0xe6  RDIV0  ni  LSP  on page A-77
0xe7  RCNT0  ni  LSP  on page A-78
0xe8  MTCS0  ni  LSP  on page A-78
0xe9  MRCS0  ni  LSP  on page A-78
0xea  MTCCS0  ni  LSP  on page A-79
0xeb  MRCCS0  ni  LSP  on page A-79
0xec  KEYWD0  ni  LSP  on page A-80
0xed  KEYMASK0  ni  LSP  on page A-80
0xee  SPATH0  0x0000 0001  LSP  on page A-80
0xef  SPCNT0  0x0000 0001  LSP  on page A-80
0xf0  STCTL1  0x0000 0000  LSP  on page A-71
0xf1  SRCTL1  0x0000 0000  LSP  on page A-73
0xf2  TX1  ni  LSP  on page A-73
0xf3  RX1  ni  LSP  on page A-76
0xf4  TDIV1  ni  LSP  on page A-76
Reserved
0xf5  TCNT1  ni  LSP  on page A-77
0xf6  RDIV1  ni  LSP  on page A-77
0xf7  RCNT1  ni  LSP  on page A-78
0xf8  MTCS1  ni  LSP  on page A-78
0xf9  MRCS1  ni  LSP  on page A-78
0xfa  MTCCS1  ni  LSP  on page A-79
0xfb  MRCCS1  ni  LSP  on page A-79
0xfc  KEYWD1  ni  LSP  on page A-80
0xfd  KEYMASK1  ni  LSP  on page A-80
0xfe  SPATH1  0x0000 0001  LSP  on page A-80
0xff  SPCNT1  0x0000 0001  LSP  on page A-80
System Configuration Register (SYSCON)

This register’s address is 0x00. The reset value for this register is 0x10, configuring the HPM bits for 16-to-32/64 bit packing (see Table A-17).

Table A-17. System Configuration Register (SYSCON) Bit Definitions

Bit(s)  Name  Definition
0  SRST  Software Reset. This bit resets (when set, =1) the DSP. When a program sets (=1) SRST, the DSP responds to the non-maskable RSTI interrupt and clears (=0) SRST.
1  BSO  Boot Select Override. This bit enables (if set, =1) or disables (if cleared, =0) access to Boot Memory Space. When BSO is set, the DSP uses the BMS select line (instead of MS3-0) to perform DMA channel 10 accesses of external memory. The DSP uses 8- to 48-bit packing when reading from 8-bit boot memory space, but does no packing on writes to this space. For appropriate byte alignment on DMA writes to boot memory space, programs must use the shifter to place the ordered bytes for the transfer in bits 39-32 of each internal Long word address.
2  IIVT  Internal Interrupt Vector Table. This bit forces placement of the interrupt vector table at address 0x0004 0000 regardless of booting mode (if 1) or allows placement of the interrupt vector table as selected by the booting mode (if 0).
3  Reserved
6-4  HPM  Host Packing Mode. These bits select the external bus packing mode for host accesses as follows: 000=no packing, 001=16-to-32/64 (reset value), 010=16-to-48, 011=32-to-48, 100=32-to-32/64
7  HMSWF  Host Most Significant Word First Packing Select. This bit selects the word packing order for host accesses as most-significant-word first (if set, =1) or least-significant-word first (if cleared, =0).
Table A-17. System Configuration Register (SYSCON) Bit Definitions (Cont’d)

Bit(s)  Name  Definition
8  HPFLSH  Host Packing Status Flush. This bit flushes (when set, =1) settings for the direct write FIFO. Flushing these settings does the following:
   • Clears (=0) the HPS status bits in the SYSTAT register
   • Clears (=0) the channel’s DMA request counter
   • Clears (=0) any partially packed words
   When a program sets (=1) HPFLSH, the DSP flushes the settings and clears (=0) HPFLSH. There is a two-cycle effect latency in completing the flush operation. Programs must not set the buffer’s HPFLSH during the same write that enables the buffer. Also, programs must not set the HPFLSH bit while the DMA channel is active. Programs should determine the channel’s active status by reading the corresponding bit in the DMASTAT register.
9  IMDW0  Internal Memory Block 0 Data Width. This bit selects the Normal word data access size for internal memory Block 0 as 40-bit data (if set, =1) or 32-bit data (if cleared, =0).
10  IMDW1  Internal Memory Block 1 Data Width. This bit selects the Normal word data access size for internal memory Block 1 as 40-bit data (if set, =1) or 32-bit data (if cleared, =0).
11  ADREDY  Active Drive REDY. This bit selects the line driver type for the DSP’s REDY pin as active drive (a/d) (if set, =1) or open drain (o/d) (if cleared, =0).
15-12  MSIZE  Memory Bank Size. These bits select the size of the four external memory banks (Bank 3-0). The external memory that is not allotted to a bank is part of the Unbanked external memory region. The formula for external memory bank size is: MSIZE = log2(desired bank size in words) - 13
16  BHD  Buffer Hang Disable. This bit controls whether the processor core proceeds (hang disabled if set, =1) or is held off (hang enabled if cleared, =0) when the core tries to read from an empty EPBx, RXx, or LBUFx buffer or tries to write to a full EPBx, TXx, or LBUFx buffer.
Table A-17. System Configuration Register (SYSCON) Bit Definitions (Cont’d)

Bit(s)  Name  Definition
18-17  EBPR  External Bus Priority. These bits select the priority for the I/O processor’s EP bus when arbitrating access to the DSP’s external port as follows: 00=priority rotates between DM or PM and IO buses, 01=the winning DM or PM bus has priority over the IO bus, 10=the IO bus has priority over the winning DM or PM bus.
19  DCPR  External Port DMA Channel Priority Rotation Enable. This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among external port DMA channels (channels 10-13).
20  LDCPR  Link Port DMA Channel Priority Rotation Enable. This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among link port DMA channels (channels 4-9).
21  PRROT  Link–External Port DMA Channel Priority Rotation Enable. This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels (channels 4-9) and external port DMA channels (channels 10-13).
22  COD  CLKOUT Disable. This bit enables (if set, =1) or disables (if cleared, =0) DSP clock output on the CLKOUT pin. If enabled, the DSP outputs the clock signal on CLKOUT. If disabled, the DSP three-states the CLKOUT pin. This bit is the only way to control the CLKOUT pin.
31-23  Reserved

Vector Interrupt Address Register (VIRPT)

This register’s address is 0x01. The reset value for this register is 0x0004 0014 (see Table A-18). The sequencer uses the VIRPT register to support multiprocessor vector interrupts. The vector interrupt (VIRPTI) permits passing interprocessor commands in multiple-processor systems. This interrupt occurs when an external processor (a host or another DSP) writes an address to the VIRPT register, inserting a new vector address for VIRPTI.
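Returning briefly to the SYSCON fields in Table A-17: the MSIZE formula and the HPM field position can be checked with a few lines of host-side C. This is only a sketch under stated assumptions. The macro and function names below are hypothetical and are not the identifiers defined in def21160.h; the bit positions are the ones listed in Table A-17, and the code models a 32-bit SYSCON value in ordinary C rather than accessing the register on the DSP.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from Table A-17. The macro names are illustrative only
   (assumed for this sketch, not taken from def21160.h). */
#define SYSCON_HPM_SHIFT   4u    /* HPM occupies bits 6-4     */
#define SYSCON_HPM_MASK    0x7u
#define SYSCON_MSIZE_SHIFT 12u   /* MSIZE occupies bits 15-12 */
#define SYSCON_MSIZE_MASK  0xFu

/* MSIZE = log2(bank size in words) - 13.
   The 4-bit field can only encode power-of-two sizes from 2^13 to 2^28. */
static uint32_t msize_for_bank_words(uint32_t words)
{
    uint32_t log2_words = 0;

    assert(words >= (1u << 13) && words <= (1u << 28));
    assert((words & (words - 1u)) == 0);   /* must be a power of two */

    while ((words >> log2_words) != 1u) {
        log2_words++;
    }
    return log2_words - 13u;
}

/* Return a copy of a SYSCON value with only the MSIZE field replaced. */
static uint32_t syscon_with_msize(uint32_t syscon, uint32_t bank_words)
{
    uint32_t msize = msize_for_bank_words(bank_words);

    syscon &= ~(SYSCON_MSIZE_MASK << SYSCON_MSIZE_SHIFT);
    return syscon | (msize << SYSCON_MSIZE_SHIFT);
}

/* Extract the Host Packing Mode field (bits 6-4). */
static uint32_t syscon_hpm(uint32_t syscon)
{
    return (syscon >> SYSCON_HPM_SHIFT) & SYSCON_HPM_MASK;
}
```

For instance, a 1M-word bank (2^20 words) gives MSIZE = 20 - 13 = 7, and applying syscon_hpm() to the documented reset value 0x10 yields 001, the 16-to-32/64 packing mode. On the DSP itself the resulting value would still have to be written through a core register, as described for the WAIT register earlier in this section.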
Table A-18. Vector Interrupt Address Register (VIRPT) Bit Definitions

Bit(s)  Name  Definition
23-0  VIRPTA  Vector Interrupt Address. These bits contain the multiprocessor interrupt’s vector (address). When an external processor loads an address into this register, the DSP pushes the status stack and starts executing the routine at the vector address.
31-25  VIRPTD  Vector Interrupt (optional) Data. These bits contain optional data that the external processor may pass to the interrupt service routine.

External Memory Waitstate and Access Mode Register (WAIT)

This register’s address is 0x02. The reset value for this register is 0x01ce739c, which equates to the following DSP external memory settings: asynchronous access mode for all external memory banks, seven waitstates with a hold cycle for all accesses to external memory banks, external DRAM page size of 256 words (if installed), and disable idle cycle for DMA handshake (see Table A-19).

Table A-19. External Memory Setup Register (WAIT) Bit Definitions

Bit(s)  Name  Definition
1-0  EB0AM  External Bank 0 Access Mode. These bits select the access mode for external memory Bank 0 as follows (EBxAM=External Bank Access Mode):
   00=Asynchronous: DSP RDH/L and WRH/L strobes change before CLKOUT’s edge; accesses use the waitstate count setting from EBxWS and require external acknowledge (ACK), allowing a deasserted ACK to extend the access time.
   01=Synchronous: DSP RDH/L and WRH/L strobes change on CLKOUT’s edge; reads use the waitstate count setting from EBxWS (minimum EBxWS=001); writes are 0-wait state.
   10=Synchronous: DSP RDH/L and WRH/L strobes change on CLKOUT’s edge; reads use the waitstate count setting from EBxWS (minimum EBxWS=001); writes are 1-wait state.
   11=Reserved
4-2  EB0WS  External Bank 0 Waitstates. These bit fields select the waitstates for external memory Bank 0 as follows:
   EBxWS  # of Waitstates  Hold Time Cycle?
   000  0  no
   001  1  no
   010  2  yes
   011  3  yes
   100  4  yes
   101  5  yes
   110  6  yes
   111  7  yes
   Note that Hold Cycles applies to asynchronous mode only.
6-5  EB1AM  External Bank 1 Access Mode. (see EB0AM definition)
9-7  EB1WS  External Bank 1 Waitstates. (see EB0WS definition)
11-10  EB2AM  External Bank 2 Access Mode. (see EB0AM definition)
14-12  EB2WS  External Bank 2 Waitstates. (see EB0WS definition)
16-15  EB3AM  External Bank 3