External Memory Interface Handbook Volume 3: Reference Material
EMI_RM 2017.05.08
Last updated for Intel® Quartus® Prime Design Suite: 17.0

Contents

1 Functional Description—UniPHY
    1.1 I/O Pads
    1.2 Reset and Clock Generation
    1.3 Dedicated Clock Networks
    1.4 Address and Command Datapath
    1.5 Write Datapath
        1.5.1 Leveling Circuitry
    1.6 Read Datapath
    1.7 Sequencer
        1.7.1 Nios II-Based Sequencer
        1.7.2 RTL-based Sequencer
    1.8 Shadow Registers
        1.8.1 Shadow Registers Operation
    1.9 UniPHY Interfaces
        1.9.1 The DLL and PLL Sharing Interface
        1.9.2 The OCT Sharing Interface
    1.10 UniPHY Signals
    1.11 PHY-to-Controller Interfaces
    1.12 Using a Custom Controller
    1.13 AFI 3.0 Specification
        1.13.1 Bus Width and AFI Ratio
        1.13.2 AFI Parameters
        1.13.3 AFI Signals
    1.14 Register Maps
        1.14.1 UniPHY Register Map
        1.14.2 Controller Register Map
    1.15 Ping Pong PHY
        1.15.1 Ping Pong PHY Feature Description
        1.15.2 Ping Pong PHY Architecture
        1.15.3 Ping Pong PHY Operation
    1.16 Efficiency Monitor and Protocol Checker
        1.16.1 Efficiency Monitor
        1.16.2 Protocol Checker
        1.16.3 Read Latency Counter
        1.16.4 Using the Efficiency Monitor and Protocol Checker
        1.16.5 Avalon CSR Slave and JTAG Memory Map
    1.17 UniPHY Calibration Stages
        1.17.1 Calibration Overview
        1.17.2 Calibration Stages
        1.17.3 Memory Initialization
        1.17.4 Stage 1: Read Calibration Part One—DQS Enable Calibration and DQ/DQS Centering
        1.17.5 Stage 2: Write Calibration Part One
        1.17.6 Stage 3: Write Calibration Part Two—DQ/DQS Centering
        1.17.7 Stage 4: Read Calibration Part Two—Read Latency Minimization
        1.17.8 Calibration Signals
        1.17.9 Calibration Time
    1.18 Document Revision History

2 Functional Description—Intel Stratix® 10 EMIF IP
    2.1 Stratix 10 Supported Memory Protocols
    2.2 Stratix 10 EMIF IP Support for 3DS/TSV DDR4 Devices
    2.3 Migrating to Stratix 10 from Previous Device Families
    2.4 Stratix 10 EMIF Architecture: Introduction
        2.4.1 Stratix 10 EMIF Architecture: I/O Subsystem
        2.4.2 Stratix 10 EMIF Architecture: I/O Column
        2.4.3 Stratix 10 EMIF Architecture: I/O SSM
        2.4.4 Stratix 10 EMIF Architecture: I/O Bank
        2.4.5 Stratix 10 EMIF Architecture: I/O Lane
        2.4.6 Stratix 10 EMIF Architecture: Input DQS Clock Tree
        2.4.7 Stratix 10 EMIF Architecture: PHY Clock Tree
        2.4.8 Stratix 10 EMIF Architecture: PLL Reference Clock Networks
        2.4.9 Stratix 10 EMIF Architecture: Clock Phase Alignment
    2.5 Hardware Resource Sharing Among Multiple Stratix 10 EMIFs
        2.5.1 I/O SSM Sharing
        2.5.2 I/O Bank Sharing
        2.5.3 PLL Reference Clock Sharing
        2.5.4 Core Clock Network Sharing
    2.6 Stratix 10 EMIF IP Component
        2.6.1 Instantiating Your Stratix 10 EMIF IP in a Qsys Project
        2.6.2 File Sets
        2.6.3 Customized readme.txt File
        2.6.4 Clock Domains
        2.6.5 ECC in Stratix 10 EMIF IP
    2.7 Examples of External Memory Interface Implementations for DDR4
    2.8 Stratix 10 EMIF Sequencer
        2.8.1 Stratix 10 EMIF DQS Tracking
    2.9 Stratix 10 EMIF Calibration
        2.9.1 Stratix 10 Calibration Stages
        2.9.2 Stratix 10 Calibration Stages Descriptions
        2.9.3 Stratix 10 Calibration Algorithms
        2.9.4 Stratix 10 Calibration Flowchart
    2.10 Stratix 10 EMIF and SmartVID
    2.11 Stratix 10 Hard Memory Controller Rate Conversion Feature
    2.12 Differences Between User-Requested Reset in Stratix 10 versus Arria 10
        2.12.1 Method for Initiating a User-requested Reset
    2.13 Compiling Stratix 10 EMIF IP with the Quartus Prime Software
        2.13.1 Instantiating the Stratix 10 EMIF IP
        2.13.2 Setting I/O Assignments in Stratix 10 EMIF IP
    2.14 Debugging Stratix 10 EMIF IP
        2.14.1 External Memory Interface Debug Toolkit
        2.14.2 On-Chip Debug for Stratix 10
        2.14.3 Configuring Your EMIF IP for Use with the Debug Toolkit
        2.14.4 Stratix 10 EMIF Debugging Examples
    2.15 Stratix 10 EMIF for Hard Processor Subsystem
        2.15.1 Restrictions on I/O Bank Usage for Stratix 10 EMIF IP with HPS
    2.16 Stratix 10 EMIF Ping Pong PHY
        2.16.1 Stratix 10 Ping Pong PHY Feature Description
        2.16.2 Stratix 10 Ping Pong PHY Architecture
        2.16.3 Stratix 10 Ping Pong PHY Limitations
        2.16.4 Stratix 10 Ping Pong PHY Calibration
        2.16.5 Using the Ping Pong PHY
        2.16.6 Ping Pong PHY Simulation Example Design
    2.17 AFI 4.0 Specification
        2.17.1 Bus Width and AFI Ratio
        2.17.2 AFI Parameters
        2.17.3 AFI Signals
        2.17.4 AFI 4.0 Timing Diagrams
    2.18 Stratix 10 Resource Utilization
        2.18.1 QDR-IV Resource Utilization in Stratix 10 Devices
    2.19 Stratix 10 EMIF Latency
    2.20 Integrating a Custom Controller with the Hard PHY
    2.21 Document Revision History

3 Functional Description—Intel Arria 10 EMIF IP
    3.1 Supported Memory Protocols
    3.2 Key Differences Compared to UniPHY IP and Previous Device Families
    3.3 Migrating from Previous Device Families
    3.4 Arria 10 EMIF Architecture: Introduction
        3.4.1 Arria 10 EMIF Architecture: I/O Subsystem
        3.4.2 Arria 10 EMIF Architecture: I/O Column
        3.4.3 Arria 10 EMIF Architecture: I/O AUX
        3.4.4 Arria 10 EMIF Architecture: I/O Bank
        3.4.5 Arria 10 EMIF Architecture: I/O Lane
        3.4.6 Arria 10 EMIF Architecture: Input DQS Clock Tree
        3.4.7 Arria 10 EMIF Architecture: PHY Clock Tree
        3.4.8 Arria 10 EMIF Architecture: PLL Reference Clock Networks
        3.4.9 Arria 10 EMIF Architecture: Clock Phase Alignment
    3.5 Hardware Resource Sharing Among Multiple EMIFs
        3.5.1 I/O Aux Sharing
        3.5.2 I/O Bank Sharing
        3.5.3 PLL Reference Clock Sharing
        3.5.4 Core Clock Network Sharing
    3.6 Arria 10 EMIF IP Component
        3.6.1 Instantiating Your Arria 10 EMIF IP in a Qsys Project
        3.6.2 File Sets
        3.6.3 Customized readme.txt File
        3.6.4 Clock Domains
        3.6.5 ECC in Arria 10 EMIF IP
    3.7 Examples of External Memory Interface Implementations for DDR4
    3.8 Arria 10 EMIF Sequencer
        3.8.1 Arria 10 EMIF DQS Tracking
    3.9 Arria 10 EMIF Calibration
        3.9.1 Calibration Stages
        3.9.2 Calibration Stages Descriptions
        3.9.3 Calibration Algorithms
        3.9.4 Calibration Flowchart
        3.9.5 Periodic OCT Recalibration
    3.10 Back-to-Back User-Controlled Refresh Usage in Arria 10
    3.11 Arria 10 EMIF and SmartVID
    3.12 Hard Memory Controller Rate Conversion Feature
    3.13 Back-to-Back User-Controlled Refresh for Hard Memory Controller
    3.14 Compiling Arria 10 EMIF IP with the Quartus Prime Software
        3.14.1 Instantiating the Arria 10 EMIF IP
        3.14.2 Setting I/O Assignments in Arria 10 EMIF IP
    3.15 Debugging Arria 10 EMIF IP
        3.15.1 External Memory Interface Debug Toolkit
        3.15.2 On-Chip Debug for Arria 10
        3.15.3 Configuring Your EMIF IP for Use with the Debug Toolkit
        3.15.4 Arria 10 EMIF Debugging Examples
    3.16 Arria 10 EMIF for Hard Processor Subsystem
        3.16.1 Restrictions on I/O Bank Usage for Arria 10 EMIF IP with HPS
        3.16.2 Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces
    3.17 Arria 10 EMIF Ping Pong PHY
        3.17.1 Ping Pong PHY Feature Description
        3.17.2 Ping Pong PHY Architecture
        3.17.3 Ping Pong PHY Limitations
        3.17.4 Ping Pong PHY Calibration
        3.17.5 Using the Ping Pong PHY
        3.17.6 Ping Pong PHY Simulation Example Design
    3.18 AFI 4.0 Specification
        3.18.1 Bus Width and AFI Ratio
        3.18.2 AFI Parameters
        3.18.3 AFI Signals
        3.18.4 AFI 4.0 Timing Diagrams
    3.19 Resource Utilization
        3.19.1 QDR-IV Resource Utilization in Arria 10 Devices
    3.20 Arria 10 EMIF Latency
    3.21 Arria 10 EMIF Calibration Times
    3.22 Integrating a Custom Controller with the Hard PHY
    3.23 Memory Mapped Register (MMR) Tables
        3.23.1 ctrlcfg0: Controller Configuration
        3.23.2 ctrlcfg1: Controller Configuration
        3.23.3 ctrlcfg2: Controller Configuration
        3.23.4 ctrlcfg3: Controller Configuration
        3.23.5 ctrlcfg4: Controller Configuration
        3.23.6 ctrlcfg5: Controller Configuration
        3.23.7 ctrlcfg6: Controller Configuration
        3.23.8 ctrlcfg7: Controller Configuration
        3.23.9 ctrlcfg8: Controller Configuration
        3.23.10 ctrlcfg9: Controller Configuration
        3.23.11 dramtiming0: Timing Parameters
        3.23.12 dramodt0: On-Die Termination Parameters
        3.23.13 dramodt1: On-Die Termination Parameters
        3.23.14 sbcfg0: Sideband Configuration
        3.23.15 sbcfg1: Sideband Configuration
        3.23.16 sbcfg2: Sideband Configuration
        3.23.17 sbcfg3: Sideband Configuration
        3.23.18 sbcfg4: Sideband Configuration
        3.23.19 sbcfg5: Sideband Configuration
        3.23.20 sbcfg6: Sideband Configuration
        3.23.21 sbcfg7: Sideband Configuration
        3.23.22 sbcfg8: Sideband Configuration
        3.23.23 sbcfg9: Sideband Configuration
        3.23.24 caltiming0: Command/Address/Latency Parameters
        3.23.25 caltiming1: Command/Address/Latency Parameters
        3.23.26 caltiming2: Command/Address/Latency Parameters
        3.23.27 caltiming3: Command/Address/Latency Parameters
        3.23.28 caltiming4: Command/Address/Latency Parameters
        3.23.29 caltiming5: Command/Address/Latency Parameters
        3.23.30 caltiming6: Command/Address/Latency Parameters
        3.23.31 caltiming7: Command/Address/Latency Parameters
        3.23.32 caltiming8: Command/Address/Latency Parameters
        3.23.33 caltiming9: Command/Address/Latency Parameters
        3.23.34 caltiming10: Command/Address/Latency Parameters
        3.23.35 dramaddrw: Row/Column/Bank Address Width Configuration
        3.23.36 sideband0: Sideband
        3.23.37 sideband1: Sideband
        3.23.38 sideband2: Sideband
        3.23.39 sideband3: Sideband
        3.23.40 sideband4: Sideband
        3.23.41 sideband5: Sideband
        3.23.42 sideband6: Sideband
        3.23.43 sideband7: Sideband
        3.23.44 sideband8: Sideband
        3.23.45 sideband9: Sideband
        3.23.46 sideband10: Sideband
        3.23.47 sideband11: Sideband
        3.23.48 sideband12: Sideband
        3.23.49 sideband13: Sideband
        3.23.50 sideband14: Sideband
        3.23.51 sideband15: Sideband
        3.23.52 dramsts: Calibration Status
        3.23.53 ecc1: ECC General Configuration
        3.23.54 ecc2: Width Configuration
        3.23.55 ecc3: ECC Error and Interrupt Configuration
        3.23.56 ecc4: Status and Error Information
        3.23.57 ecc5: Address of Most Recent SBE/DBE
        3.23.58 ecc6: Address of Most Recent Correct Command Dropped
    3.24 Document Revision History

4 Functional Description—Intel MAX® 10 EMIF IP
    4.1 MAX 10 EMIF Overview
    4.2 External Memory Protocol Support
    4.3 MAX 10 Memory Controller
    4.4 MAX 10 Low Power Feature
    4.5 MAX 10 Memory PHY
        4.5.1 Supported Topologies
        4.5.2 Read Datapath
        4.5.3 Write Datapath
        4.5.4 Address and Command Datapath
        4.5.5 Sequencer
    4.6 Calibration
        4.6.1 Read Calibration
        4.6.2 Write Calibration
    4.7 Sequencer Debug Information
    4.8 Register Maps
    4.9 Document Revision History

6 Functional Description—Hard Memory Interface
    6.1 Multi-Port Front End (MPFE)
    6.2 Multi-port Scheduling
        6.2.1 Port Scheduling
        6.2.2 DRAM Burst Scheduling
        6.2.3 DRAM Power Saving Modes
    6.3 MPFE Signal Descriptions
    6.4 Hard Memory Controller
        6.4.1 Clocking
        6.4.2 Reset
        6.4.3 DRAM Interface
        6.4.4 ECC
        6.4.5 Bonding of Memory Controllers
    6.5 Hard PHY
        6.5.1 Interconnections
        6.5.2 Clock Domains
        6.5.3 Hard Sequencer
        6.5.4 MPFE Setup Guidelines
        6.5.5 Soft Memory Interface to Hard Memory Interface Migration Guidelines
        6.5.6 Bonding Interface Guidelines
    6.6 Document Revision History

7 Functional Description—HPS Memory Controller
    7.1 Features of the SDRAM Controller Subsystem
    7.2 SDRAM Controller Subsystem Block Diagram
    7.3 SDRAM Controller Memory Options
    7.4 SDRAM Controller Subsystem Interfaces
        7.4.1 MPU Subsystem Interface
        7.4.2 L3 Interconnect Interface
        7.4.3 CSR Interface
        7.4.4 FPGA-to-HPS SDRAM Interface
    7.5 Memory Controller Architecture
        7.5.1 Multi-Port Front End
        7.5.2 Single-Port Controller
    7.6 Functional Description of the SDRAM Controller Subsystem
        7.6.1 MPFE Operation Ordering
        7.6.2 MPFE Multi-Port Arbitration
        7.6.3 MPFE SDRAM Burst Scheduling
        7.6.4 Single-Port Controller Operation
    7.7 SDRAM Power Management
    7.8 DDR PHY
    7.9 Clocks
    7.10 Resets
        7.10.1 Taking the SDRAM Controller Subsystem Out of Reset
    7.11 Port Mappings
    7.12 Initialization
        7.12.1 FPGA-to-SDRAM Protocol Details
    7.13 SDRAM Controller Subsystem Programming Model
        7.13.1 HPS Memory Interface Architecture
        7.13.2 HPS Memory Interface Configuration
        7.13.3 HPS Memory Interface Simulation
        7.13.4 Generating a Preloader Image for HPS with EMIF
    7.14 Debugging HPS SDRAM in the Preloader
        7.14.1 Enabling UART or Semihosting Printout
        7.14.2 Enabling Simple Memory Test
        7.14.3 Enabling the Debug Report
        7.14.4 Writing a Predefined Data Pattern to SDRAM in the Preloader
    7.15 SDRAM Controller Address Map and Register Definitions
        7.15.1 SDRAM Controller Address Map
    7.16 Document Revision History

8 Functional Description—HPC II Controller
    8.1 HPC II Memory Interface Architecture
    8.2 HPC II Memory Controller Architecture
        8.2.1 Backpressure Support
        8.2.2 Command Generator
        8.2.3 Timing Bank Pool
        8.2.4 Arbiter
        8.2.5 Rank Timer
        8.2.6 Read Data Buffer and Write Data Buffer
        8.2.7 ECC Block
        8.2.8 AFI and CSR Interfaces
    8.3 HPC II Controller Features
        8.3.1 Data Reordering
        8.3.2 Pre-emptive Bank Management
        8.3.3 Quasi-1T and Quasi-2T
        8.3.4 User Autoprecharge Commands
        8.3.5 Address and Command Decoding Logic
        8.3.6 Low-Power Logic
        8.3.7 ODT Generation Logic
        8.3.8 Burst Merging
        8.3.9 ECC
    8.4 External Interfaces
        8.4.1 Clock and Reset Interface
        8.4.2 Avalon-ST Data Slave Interface
        8.4.3 AXI Data Slave Interface
        8.4.4 Controller-PHY Interface
        8.4.5 Memory Side-Band Signals
        8.4.6 Controller External Interfaces
    8.5 Top-Level Signals Description
        8.5.1 Clock and Reset Signals
        8.5.2 Local Interface Signals
        8.5.3 Controller Interface Signals
        8.5.4 CSR Interface Signals
        8.5.5 Soft Controller Register Map
        8.5.6 Hard Controller Register Map
    8.6 Sequence of Operations
    8.7 Document Revision History

9 Functional Description—QDR II Controller
    9.1 Block Description
        9.1.1 Avalon-MM Slave Read and Write Interfaces
        9.1.2 Command Issuing FSM
        9.1.3 AFI
    9.2 Avalon-MM and Memory Data Width
    9.3 Signal Descriptions
    9.4 Document Revision History

10 Functional Description—QDR-IV Controller
    10.1 Block Description
        10.1.1 Avalon-MM Slave Read and Write Interfaces
        10.1.2 AFI
    10.2 Avalon-MM and Memory Data Width
    10.3 Signal Descriptions
    10.4 Document Revision History

11 Functional Description—RLDRAM II Controller
    11.1 Block Description
        11.1.1 Avalon-MM Slave Interface
        11.1.2 Write Data FIFO Buffer
        11.1.3 Command Issuing FSM
        11.1.4 Refresh Timer
        11.1.5 Timer Module
        11.1.6 AFI
    11.2 User-Controlled Features
        11.2.1 Error Detection Parity
        11.2.2 User-Controlled Refresh
    11.3 Avalon-MM and Memory Data Width
    11.4 Signal Descriptions
    11.5 Document Revision History

12 Functional Description—RLDRAM 3 PHY-Only IP
    12.1 Block Description
    12.2 Features
    12.3 RLDRAM 3 AFI Protocol
    12.4 RLDRAM 3 Controller with Arria 10 EMIF Interfaces
    12.5 Document Revision History

13 Functional Description—Example Designs
    13.1 Arria 10 EMIF IP Example Designs Quick Start Guide
        13.1.1 Typical Example Design Workflow
        13.1.2 Example Designs Interface Tab
        13.1.3 Development Kit Preset Workflow
        13.1.4 Compiling and Simulating the Design
        13.1.5 Compiling and Testing the Design in Hardware
    13.2 Testing the EMIF Interface Using the Traffic Generator 2.0
        13.2.1 Configurable Traffic Generator 2.0 Configuration Options
        13.2.2 Performing Your Own Tests Using Traffic Generator 2.0
        13.2.3 Signal Splitter Component
    13.3 UniPHY-Based Example Designs
        13.3.1 Synthesis Example Design
        13.3.2 Simulation Example Design
        13.3.3 Traffic Generator and BIST Engine
        13.3.4 Creating and Connecting the UniPHY Memory Interface and the Traffic Generator in Qsys
    13.4 Document Revision History

14 Introduction to UniPHY IP
    14.1 Release Information
    14.2 Device Support Levels
    14.3 Device Family and Protocol Support
    14.4 UniPHY-Based External Memory Interface Features
    14.5 System Requirements
    14.6 Intel FPGA IP Core Verification
    14.7 Resource Utilization
        14.7.1 DDR2, DDR3, and LPDDR2 Resource Utilization in Arria V Devices
        14.7.2 DDR2 and DDR3 Resource Utilization in Arria II GZ Devices
        14.7.3 DDR2 and DDR3 Resource Utilization in Stratix III Devices
        14.7.4 DDR2 and DDR3 Resource Utilization in Stratix IV Devices
        14.7.5 DDR2 and DDR3 Resource Utilization in Arria V GZ and Stratix V Devices
        14.7.6 QDR II and QDR II+ Resource Utilization in Arria V Devices
        14.7.7 QDR II and QDR II+ Resource Utilization in Arria II GX Devices
        14.7.8 QDR II and QDR II+ Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices
        14.7.9 RLDRAM II Resource Utilization in Arria V Devices
        14.7.10 RLDRAM II Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices
    14.8 Document Revision History

15 Latency for UniPHY IP
    15.1 DDR2 SDRAM Latency
    15.2 DDR3 SDRAM Latency
    15.3 LPDDR2 SDRAM Latency
    15.4 QDR II and QDR II+ SRAM Latency
    15.5 RLDRAM II Latency
    15.6 RLDRAM 3 Latency
    15.7 Variable Controller Latency
    15.8 Document Revision History

16 Timing Diagrams for UniPHY IP
    16.1 DDR2 Timing Diagrams
    16.2 DDR3 Timing Diagrams
    16.3 QDR II and QDR II+ Timing Diagrams
    16.4 RLDRAM II Timing Diagrams
    16.5 LPDDR2 Timing Diagrams
    16.6 RLDRAM 3 Timing Diagrams
    16.7 Document Revision History

17 External Memory Interface Debug Toolkit
    17.1 User Interface
        17.1.1 Communication
        17.1.2 Calibration and Report Generation
    17.2 Setup and Use
        17.2.1 General Workflow
        17.2.2 Linking the Project to a Device
        17.2.3 Establishing Communication to Connections
        17.2.4 Selecting an Active Interface
        17.2.5 Reports
    17.3 Operational Considerations
    17.4 Troubleshooting
    17.5 Debug Report for Arria V and Cyclone V SoC Devices
        17.5.1 Enabling the Debug Report for Arria V and Cyclone V SoC Devices
        17.5.2 Determining the Failing Calibration Stage for a Cyclone V or Arria V HPS SDRAM Controller
    17.6 On-Chip Debug Port for UniPHY-based EMIF IP
        17.6.1 Access Protocol
        17.6.2 Command Codes Reference
        17.6.3 Header Files
        17.6.4 Generating IP With the Debug Port
        17.6.5 Example C Code for Accessing Debug Data
    17.7 On-Chip Debug Port for Arria 10 EMIF IP
        17.7.1 Access Protocol
        17.7.2 EMIF On-Chip Debug Port
        17.7.3 On-Die Termination Calibration
        17.7.4 Eye Diagram
    17.8 Driver Margining for Arria 10 EMIF IP
        17.8.1 Determining Margin
    17.9 Read Setting and Apply Setting Commands for Arria 10 EMIF IP
        17.9.1 Reading or Applying Calibration Settings
    17.10 Traffic Generator 2.0
        17.10.1 Configuring the Traffic Generator 2.0
        17.10.2 Running the Traffic Generator 2.0
        17.10.3 Understanding the Custom Traffic Generator User Interface
        17.10.4 Applying the Traffic Generator 2.0
    17.11 The Traffic Generator 2.0 Report
    17.12 Example Tcl Script for Running the EMIF Debug Toolkit
    17.13 Calibration Adjustment Delay Step Sizes for Arria 10 Devices
        17.13.1 Addressing
        17.13.2 Output and Strobe Enable Minimum and Maximum Phase Settings
    17.14 Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces
    17.15 Document Revision History

18 Upgrading to UniPHY-based Controllers from ALTMEMPHY-based Controllers
    18.1 Generating Equivalent Design
    18.2 Replacing the ALTMEMPHY Datapath with UniPHY Datapath
    18.3 Resolving Port Name Differences
    18.4 Creating OCT Signals
    18.5 Running Pin Assignments Script
    18.6 Removing Obsolete Files
    18.7 Simulating your Design
    18.8 Document Revision History
1 Functional Description—UniPHY

UniPHY is the physical layer of the external memory interface. The major functional units of the UniPHY layer include the following:

• Reset and clock generation
• Address and command datapath
• Write datapath
• Read datapath
• Sequencer

The following figure shows the PHY block diagram.

Figure 1. PHY Block Diagram
(The diagram shows, within the FPGA, the UniPHY sequencer and the memory controller multiplexed onto the address and command, write, and read datapaths, which connect through the I/O pads to the external memory device. Reset generation spans the PHY-AFI and PHY-memory clock domains. A sketch of the multiplexer follows the Related Links list below.)

Related Links

• UniPHY Interfaces: The major blocks of the UniPHY and how it interfaces with the external memory device and the controller.
• UniPHY Signals: Lists the UniPHY signals.
• PHY-to-Controller Interfaces: The ports through which various modules connect to UniPHY.
• Using a Custom Controller: By default, the UniPHY-based external memory interface IP cores are delivered with both the PHY and the memory controller integrated.
• AFI 4.0 Specification: The Altera PHY interface (AFI) 4.0 defines communication between the controller and the physical layer (PHY) in the external memory interface IP.
• Register Maps: The overall register mapping for the DDR2, DDR3, and LPDDR2 SDRAM controllers with UniPHY.
• Ping Pong PHY: An implementation of UniPHY that allows two memory interfaces to share address and command buses through time multiplexing.
• Efficiency Monitor: Reports read and write throughput on the controller input by counting command transfers and wait times, and makes that information available to the External Memory Interface Toolkit via an Avalon slave port.
• Calibration Stages: The calibration process begins when the PHY reset signal deasserts and the PLL and DLL lock.
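To make the multiplexer in Figure 1 concrete: during calibration the sequencer drives the datapaths, and once calibration completes, ownership transfers to the memory controller. The following minimal Verilog sketch illustrates that hand-off. The names used here (cal_done, seq_*, ctl_*, dp_*) are illustrative assumptions, not actual UniPHY port names.

module afi_mux_sketch #(
    parameter ADDR_WIDTH = 16,
    parameter DATA_WIDTH = 64
) (
    input                   cal_done,   // assumed: asserted once calibration succeeds
    // Requests from the calibration sequencer
    input  [ADDR_WIDTH-1:0] seq_addr,
    input  [DATA_WIDTH-1:0] seq_wdata,
    input                   seq_wr_en,
    // Requests from the memory controller
    input  [ADDR_WIDTH-1:0] ctl_addr,
    input  [DATA_WIDTH-1:0] ctl_wdata,
    input                   ctl_wr_en,
    // Signals forwarded to the address/command and write datapaths
    output [ADDR_WIDTH-1:0] dp_addr,
    output [DATA_WIDTH-1:0] dp_wdata,
    output                  dp_wr_en
);
    // Simple 2:1 select: the sequencer owns the datapaths until
    // calibration completes, then the controller takes over.
    assign dp_addr  = cal_done ? ctl_addr  : seq_addr;
    assign dp_wdata = cal_done ? ctl_wdata : seq_wdata;
    assign dp_wr_en = cal_done ? ctl_wr_en : seq_wr_en;
endmodule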
The PHY-AFI domain interfaces with the memory controller and can run at full, half, or quarter rate, depending on the controller in use.

The number of clock domains in a memory interface can vary depending on its configuration; for example:
• At the PHY-memory boundary, separate clocks may exist to generate the memory clock signal, the output strobe, and to output write data, as well as address and command signals. These clocks include pll_dq_write_clk, pll_write_clk, pll_mem_clk, and pll_addr_cmd_clk. These clocks are phase-shifted as required to achieve the desired timing relationships between memory clock, address and command signals, output data, and output strobe.
• For quarter-rate interfaces, additional clock domains such as pll_hr_clock are required to convert signals between half-rate and quarter-rate.
• For high-performance memory interfaces using Arria V, Cyclone V, or Stratix V devices, additional clocks may be required to handle transfers between the device core and the I/O periphery for timing closure. For core-to-periphery transfers, the latch clock is pll_c2p_write_clock; for periphery-to-core transfers, it is pll_p2c_read_clock. These clocks are automatically phase-adjusted for timing closure during IP generation, but can be further adjusted in the parameter editor. If the phases of these clocks are zero, the Fitter may remove these clocks during optimization. Also, high-performance interfaces using a Nios II-based sequencer require two additional clocks: pll_avl_clock for the Nios II processor, and pll_config_clock for clocking the I/O scan chains during calibration.

For a complete list of clocks in your memory interface, compile your design and run the Report Clocks command in the TimeQuest Timing Analyzer.

1.3 Dedicated Clock Networks
The UniPHY layer employs three types of dedicated clock networks:
• Global clock network
• Dual-regional clock network
• PHY clock network (applicable to Arria V, Cyclone V, and Stratix V devices, and later)

The PHY clock network is a dedicated high-speed, low-skew, balanced clock tree designed for high-performance external memory interfaces. For device families that support the PHY clock network, UniPHY always uses the PHY clock network for all clocks at the PHY-memory boundary.

For families that do not support the PHY clock network, UniPHY uses either dual-regional or global clock networks for clocks at the PHY-memory boundary. During generation, the system selects dual-regional or global clocks automatically, depending on whether a given interface spans more than one quadrant. UniPHY does not mix the usage of dual-regional and global clock networks for clocks at the PHY-memory boundary; this ensures that the timing characteristics of the various output paths are as similar as possible. The <variation_name>_pin_assignments.tcl script creates the appropriate clock network type assignment. The use of the PHY clock network is specified directly in the RTL code, and does not require an assignment.

The UniPHY uses an active-low, asynchronous-assert and synchronous-deassert reset scheme. The global reset signal resets the PLL in the PHY, and the rest of the system is held in reset until after the PLL is locked. A minimal sketch of this reset style follows.
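A minimal Verilog sketch of such a reset synchronizer, assuming hypothetical module and signal names (this is not the UniPHY RTL itself):

// Illustrative only: asynchronous-assert, synchronous-deassert reset,
// in the style UniPHY uses. Module and signal names are hypothetical.
module reset_sync (
    input  wire clk,      // destination clock domain
    input  wire arst_n,   // active-low asynchronous reset (e.g. global_reset_n)
    output wire rst_n     // reset, deasserted synchronously to clk
);
    reg [1:0] sync;
    always @(posedge clk or negedge arst_n) begin
        if (!arst_n)
            sync <= 2'b00;            // assert immediately, asynchronously
        else
            sync <= {sync[0], 1'b1};  // deassert after two clk edges
    end
    assign rst_n = sync[1];
endmodule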
1.4 Address and Command Datapath
The memory controller controls the read and write addresses and commands to meet the memory specifications. The PHY is indifferent to address or command—that is, it performs no decoding or other operations—and the circuitry is the same for both. In full-rate and half-rate interfaces, the address and command datapath is full rate, while in quarter-rate interfaces it is half rate.

Address and command signals are generated in the Altera PHY interface (AFI) clock domain and sent to the memory device in the address and command clock domain. The double-data-rate input/output (DDIO) stage converts the half-rate signals into full-rate signals when the AFI clock runs at half rate. For quarter-rate interfaces, additional DDIO stages exist to convert the address and command signals in the quarter-rate AFI clock domain to half rate.

The address and command clock is offset with respect to the memory clock to balance the nominal setup and hold margins at the memory device (center-alignment). In the example in the figure below, this offset is 270 degrees. The Fitter can further optimize margins based on the actual delays and clock skews.

In half-rate and quarter-rate designs, the full-rate cycle shifter blocks can perform a shift measured in full-rate cycles to implement the correct write latency; without this logic, the controller could implement only even write latencies, because it operates at half the speed. The full-rate cycle shifter is clocked by either the AFI clock or the address and command clock, depending on the PHY configuration, to maximize timing margins on the path from the AFI clock to the address and command clock.

Figure 2. Address and Command Datapath (half-rate example shown; the address and command clock is offset 270 degrees from the memory clock, center-aligned at the memory device)

1.5 Write Datapath
The write datapath passes write data from the memory controller to the I/O. The write data valid signal from the memory controller generates the output enable signal to control the output buffer. For memory protocols with a bidirectional data bus, it also generates the dynamic termination control signal, which selects between series (output mode) and parallel (input mode) termination.

The figure below illustrates a simplified write datapath of a typical half-rate interface. The full-rate DQS write clock is sent to a DDIO_OUT cell. The output of DDIO_OUT feeds an output buffer which creates a pair of pseudo-differential clocks that connect to the memory. In full-rate mode, only the SDR-DDR portion of the path is used; in half-rate mode, the HDR-SDR circuitry is also required. The use of DDIO_OUT in both the output strobe and output data generation paths ensures that their timing characteristics are as similar as possible. The <variation_name>_pin_assignments.tcl script automatically specifies the logic option that associates all data pins to the output strobe pin. The Fitter treats the pins as a DQS/DQ pin group. A behavioral sketch of a double-data-rate output stage follows.
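For orientation, here is a behavioral sketch of a double-data-rate output stage of the kind DDIO_OUT provides. The real DDIO_OUT is a device I/O primitive, not user RTL, and this model is illustrative only:

// Behavioral sketch of a DDIO output stage: two single-data-rate inputs
// are multiplexed onto one pin at double data rate. Illustrative only.
module ddio_out_sketch (
    input  wire clk,
    input  wire datain_h,   // driven onto dataout while clk is high
    input  wire datain_l,   // driven onto dataout while clk is low
    output wire dataout
);
    reg q_h, q_l;
    always @(posedge clk) q_h <= datain_h;
    always @(negedge clk) q_l <= datain_l;
    assign dataout = clk ? q_h : q_l;   // one bit per clock edge
endmodule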
Figure 3. Write Datapath (the DQ write clock is full rate, -90 degrees from the DQS write clock)

1.5.1 Leveling Circuitry
Leveling circuitry is dedicated I/O circuitry that provides calibration support for fly-by address and command networks.

For DDR3, leveling is always invoked, whether the interface targets a DIMM or a single component. For DDR3 implementations at higher frequencies, a fly-by topology is recommended for optimal performance. For DDR2, leveling circuitry is invoked automatically for frequencies above 240 MHz; no leveling is used for frequencies below 240 MHz.

For DDR2 at frequencies below 240 MHz, you should use a tree-style layout. For frequencies above 240 MHz, you can choose either a leveled topology or a balanced-T or -Y topology, as the leveled PHY calibrates to either implementation.

Regardless of protocol, for devices without a leveling block—such as Arria II GZ, Arria V, and Cyclone V—a balanced-T PCB topology for address/command/clock must be used, because fly-by topology is not supported. For details about leveling delay chains, consult the memory interfaces hardware section of the device handbook for your FPGA.

The following figure shows the write datapath for a leveling interface. The full-rate PLL output clock phy_write_clk goes to a leveling delay chain block which generates all other periphery clocks that are needed. The data signals that generate DQ and DQS signals pass to an output phase alignment block. The output phase alignment block feeds an output buffer which creates a pair of pseudo-differential clocks that connect to the memory. In full-rate designs, only the SDR-DDR portion of the path is used; in half-rate mode, the HDR-SDR circuitry is also required. The use of DDIO_OUT in both the output strobe and output data generation paths ensures that their timing characteristics are as similar as possible. The <variation_name>_pin_assignments.tcl script automatically specifies the logic option that associates all data pins to the output strobe pin. The Quartus Prime Fitter treats the pins as a DQS/DQ pin group.

Figure 4. Write Datapath for a Leveling Interface

1.6 Read Datapath
The read datapath passes read data from memory to the PHY. The following figure shows the blocks and flow in the read datapath. For all protocols, the DQS logic block delays the strobe by 90 degrees to center-align the rising strobe edge within the data window. For DDR2, DDR3, and LPDDR2 protocols, the logic block also performs strobe gating, holding the DQS enable signal high for the entire period that data is received. One DQS logic block exists for each data group.
One VFIFO buffer exists for each data group. For DDR2, DDR3, and LPDDR2 protocols, the VFIFO buffer generates the DQS enable signal, which is delayed (by an amount determined during calibration) to align with the incoming DQS signal. For QDR and RLDRAM protocols, the output of the VFIFO buffer serves as the write enable signal for the Read FIFO buffer, signaling when to begin capturing data.

DDIO_IN receives data from memory at double data rate and passes it on to the Read FIFO buffer at single data rate. The Read FIFO buffer temporarily holds data read from memory; one Read FIFO buffer exists for each data group. For half-rate interfaces, the Read FIFO buffer converts the full-rate, single-data-rate input to a half-rate, single-data-rate output, which is then passed to the PHY core logic. In the case of a quarter-rate interface, soft logic in the PHY performs an additional conversion from half-rate single data rate to quarter-rate single data rate.

One LFIFO buffer exists for each memory interface; the LFIFO buffer generates the read enable signal for all Read FIFO blocks in an interface. The read enable signal is asserted when the Read FIFO blocks have buffered sufficient data from the memory to be read. The timing of the read enable signal is determined during calibration.

Figure 5. Read Datapath (one LFIFO buffer per interface; one VFIFO buffer per group)

1.7 Sequencer
Depending on the combination of protocol and IP architecture in your external memory interface, you may have either an RTL-based sequencer or a Nios® II-based sequencer.

RTL-based sequencer implementations and Nios II-based sequencer implementations can have different pin requirements. You may not be able to migrate from an RTL-based sequencer to a Nios II-based sequencer and maintain the same pinout.

For information on sequencer support for different protocol-architecture combinations, refer to Introduction to Intel® FPGA Memory Solutions in Volume 1 of this handbook. For information on pin planning, refer to Planning Pin and FPGA Resources in Volume 2 of this handbook.

Related Links
• Protocol Support Matrix
• Planning Pin and FPGA Resources

1.7.1 Nios II-Based Sequencer
The DDR2, DDR3, and LPDDR2 controllers with UniPHY employ a Nios II-based sequencer that is parameterizable and is dynamically generated at run time. The Nios II-based sequencer is also available with the QDR II and RLDRAM II controllers.

1.7.1.1 Nios II-Based Sequencer Function
The sequencer enables high-frequency memory interface operation by calibrating the interface to compensate for variations in setup and hold requirements caused by transmission delays.

UniPHY converts the double-data-rate interface of high-speed memory devices to a full-rate or half-rate interface for use within an FPGA. To compensate for slight variations in data transmission to and from the memory device, double-data-rate data is usually center-aligned with its strobe signal; nonetheless, at high speeds, slight variations in delay can result in setup or hold time violations.
The sequencer implements a calibration algorithm to determine the combination of delay and phase settings necessary to maintain center-alignment of data and clock signals, even in the presence of significant delay variations. Programmable delay chains in the FPGA I/Os then implement the calculated delays to ensure that data remains centered. Calibration also applies settings to the FIFO buffers within the PHY to minimize latency, and ensures that the read valid signal is generated at the appropriate time. When calibration is completed, the sequencer returns control to the memory controller.

For more information about calibration, refer to UniPHY Calibration Stages, in this chapter.

Related Links
UniPHY Calibration Stages on page 58
The DDR2, DDR3, and LPDDR2 SDRAM, QDR II and QDR II+ SRAM, and RLDRAM II Controllers with UniPHY, and the RLDRAM 3 PHY-only IP, go through several stages of calibration. Calibration information is useful in debugging calibration failures.

1.7.1.2 Nios II-Based Sequencer Architecture
The sequencer is composed of a Nios II processor and a series of hardware-based component managers, connected together by an Avalon bus. The Nios II processor performs the high-level algorithmic operations of calibration, while the component managers handle the lower-level timing, memory protocol, and bit-manipulation operations.

The high-level calibration algorithms are specified in C code, which is compiled into Nios II code that resides in the FPGA RAM blocks. The debug interface provides a mechanism for interacting with the various managers and for tracking the progress of the calibration algorithm, and can be useful for debugging problems that arise within the PHY. The various managers are specified in RTL and implement operations that would be slow or inefficient if implemented in software.

Figure 6. Nios II-Based Sequencer Block Diagram (the Nios II processor connects through an Avalon-MM interface to the Tracking Manager, SCC Manager, RW Manager, PHY Manager, Data Manager, debug interface, and RAM; the SCC Manager drives the I/O scan chain, and the RW and PHY Managers drive the AFI interface)

The C code that defines the calibration routines is available for your reference in the <name>_s0_software subdirectory. Intel recommends that you do not modify this C code.

1.7.1.3 Nios II-Based Sequencer SCC Manager
The scan chain control (SCC) manager allows the sequencer to set the various delays and phases on the I/Os that make up the memory interface.

The latest Intel device families provide dynamic delay chains on input, output, and output-enable paths, which can be reconfigured at runtime. The SCC manager gives the calibration routines access to these chains, to add delay on incoming and outgoing signals. A master on the Avalon-MM interface can read the maximum allowed delay setting on input and output paths, and can set a particular delay value in this range to apply to the paths.

The SCC manager implements the Avalon-MM interface and the storage mechanism for all input, output, and phase settings. It contains circuitry that configures a DQ- or DQS-configuration block. The Nios II processor may set delay, phase, or register settings; the sequencer scans the settings serially to the appropriate DQ or DQS configuration block.
1.7.1.4 Nios II-Based Sequencer RW Manager
The read write (RW) manager encapsulates the protocol to read from and write to the memory device through the Altera PHY Interface (AFI). It provides a buffer that stores the data to be sent to and read from memory, and provides the following commands:
• Write configuration—configures the memory for use. Sets up burst lengths, read and write latencies, and other device-specific parameters.
• Refresh—initiates a refresh operation at the DRAM. This command does not exist on SRAM devices. The sequencer also provides a register that determines whether the RW manager automatically generates refresh signals.
• Enable or disable multi-purpose register (MPR)—for memory devices with a special register that contains calibration-specific patterns that you can read, this command enables or disables access to the register.
• Activate row—for memory devices that have both rows and columns, this command activates a specific row. Subsequent reads and writes operate on this specific row.
• Precharge—closes a row before you can access a new row.
• Write or read burst—writes or reads a burst length of data.
• Write guaranteed—writes with a special mode where the memory holds the address and data lines constant. Intel guarantees this type of write to work in the presence of skew, but constrains the write to the same data across the entire burst length.
• Write and read back-to-back—performs back-to-back writes or reads to adjacent banks. Most memory devices have strict timing constraints on subsequent accesses to the same bank; thus back-to-back writes and reads have to reference different banks.
• Protocol-specific initialization—a protocol-specific command required by the initialization sequence.

1.7.1.5 Nios II-Based Sequencer PHY Manager
The PHY Manager provides access to the PHY for calibration, and passes relevant calibration results to the PHY. For example, the PHY Manager sets the VFIFO and LFIFO buffer parameters resulting from calibration, signals the PHY when the memory initialization sequence finishes, and reports the pass/fail status of calibration.

1.7.1.6 Nios II-Based Sequencer Data Manager
The Data Manager stores parameterization-specific data in RAM, for the software to query.

1.7.1.7 Nios II-Based Sequencer Tracking Manager
The Tracking Manager detects the effects of voltage and temperature variations that can occur on the memory device over time, resulting in reduced margins, and adjusts the DQS enable delay as necessary to maintain adequate operating margins. The Tracking Manager briefly assumes control of the AFI interface after each memory refresh cycle, issuing a read routine to the RW Manager and then sampling the DQS signal. Ideally, the falling edge of the DQS enable signal would align to the last rising edge of the raw DQS signal from the memory device. The Tracking Manager determines whether the DQS enable signal is leading or trailing the raw DQS signal. Each time a refresh occurs, the Tracking Manager takes a sample of the raw DQS signal; any adjustment of the DQS enable signal occurs only after sufficient samples of raw DQS have been taken. When the Tracking Manager determines that the DQS enable signal is either leading or lagging the raw DQS signal, it adjusts the DQS enable appropriately. A minimal sketch of the controller side of the refresh handshake appears below, followed by a figure showing the Tracking Manager signals.
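A minimal, hypothetical Verilog sketch of the controller side of this handshake; afi_ctl_refresh_done and afi_seq_busy are the AFI signals described in the notes after the figure, while the module name and remaining signals are illustrative only:

// Illustrative controller-side handshake with the Tracking Manager.
// refresh_issued is hypothetical glue logic, pulsed when a refresh completes.
module refresh_handshake_sketch (
    input  wire afi_clk,
    input  wire afi_reset_n,
    input  wire refresh_issued,
    input  wire afi_seq_busy,          // driven high by the Tracking Manager
    output reg  afi_ctl_refresh_done
);
    reg seen_busy;
    always @(posedge afi_clk or negedge afi_reset_n) begin
        if (!afi_reset_n) begin
            afi_ctl_refresh_done <= 1'b0;
            seen_busy            <= 1'b0;
        end else if (refresh_issued) begin
            afi_ctl_refresh_done <= 1'b1;     // tell the Tracking Manager a refresh completed
            seen_busy            <= 1'b0;
        end else if (afi_ctl_refresh_done) begin
            if (afi_seq_busy)
                seen_busy <= 1'b1;            // Tracking Manager has taken the AFI bus
            else if (seen_busy)
                afi_ctl_refresh_done <= 1'b0; // busy deasserted: resume normal operation
        end
    end
endmodule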
Figure 7. Tracking Manager Signals (When the refresh completes, the controller asserts afi_ctl_refresh_done and waits for the Tracking Manager's response; the Tracking Manager responds by driving afi_seq_busy high and can then take over the AFI interface. When the Tracking Manager is done with DQS tracking, it deasserts afi_seq_busy; after afi_seq_busy goes low, the controller deasserts afi_ctl_refresh_done and continues with normal operation.)

Some notes on Tracking Manager operation:
• The time taken by the Tracking Manager is arbitrary; if the period taken exceeds the refresh period, the Tracking Manager handles memory refresh.
• afi_seq_busy should go high fewer than 10 clock cycles after afi_ctl_refresh_done or afi_ctl_long_idle is asserted.
• afi_ctl_refresh_done should deassert fewer than 10 clock cycles after afi_seq_busy deasserts.
• afi_ctl_long_idle causes the Tracking Manager to execute a different algorithm than the periodic refresh algorithm; use afi_ctl_long_idle when a long period has elapsed without a periodic refresh.
• The Tracking Manager is instantiated into the sequencer system when DQS Tracking is turned on.

Table 1. Configurations Supporting DQS Tracking
Device Family | Protocol | Memory Clock Frequency
Arria V (GX/GT/SX/ST), Cyclone V | LPDDR2 (single rank) | All frequencies.
Arria V (GX/GT/SX/ST) | DDR3 (single rank) | 450 MHz or higher for speed grade 5, or higher than 534 MHz.
Arria V GZ, Stratix V (E/GS/GT/GX) | DDR3 (single rank) | 750 MHz or higher.

If you do not want to use DQS tracking, you can disable it (at your own risk) by opening the Verilog file <variant_name>_if0_c0.v in an editor, and changing the value of the USE_DQS_TRACKING parameter from 1 to 0.

1.7.1.8 Nios II-Based Sequencer Processor
The Nios II processor manages the calibration algorithm; the Nios II processor is unavailable after calibration is completed.

The same calibration algorithm supports all device families, with some differences. The following sections describe the calibration algorithm for DDR3 SDRAM on Stratix III devices. Calibration algorithms for other protocols and families are a subset, and significant differences are pointed out when necessary. Because the algorithm is fully contained in the software of the sequencer (in the C code), enabling and disabling specific steps involves turning flags on and off.

Calibration consists of the following stages:
• Initialize memory.
• Calibrate read datapath.
• Calibrate write datapath.
• Run diagnostics.

1.7.1.9 Nios II-Based Sequencer Calibration and Diagnostics
Calibration must initialize all memory devices before they can operate properly. The sequencer performs this memory initialization stage when it takes control of the PHY at startup.

Calibrating the read datapath comprises the following steps:
• Calibrate the DQS enable cycle and phase.
• Perform read per-bit deskew to center the strobe signal within the data valid window.
• Reduce LFIFO latency.

Calibrating the write datapath involves the following steps:
• Center-align DQS with respect to DQ.
• Align DQS with mem_clk.

The sequencer estimates the read and write margins under noisy conditions by sweeping input and output DQ and DQS delays to determine the size of the data valid windows on the input and output sides.
The sequencer stores this diagnostic information in the local memory, and you can access it through the debugging interface. When the diagnostic test finishes, control of the PHY interface passes back to the controller and the sequencer issues a pass or fail signal.

Related Links
External Memory Interface Debug Toolkit on page 480
The EMIF Toolkit lets you run your own traffic patterns, diagnose and debug calibration problems, and produce margining reports for your external memory interface.

1.7.2 RTL-based Sequencer
The RTL-based sequencer is available for QDR II and RLDRAM II interfaces, on supported device families other than Arria V.

The RTL sequencer is a state machine that processes the calibration algorithm. The sequencer assumes control of the interface at reset (whether at initial startup or when the IP is reset) and maintains control throughout the calibration process. The sequencer relinquishes control to the memory controller only after successful calibration. The following table lists the major states in the RTL-based sequencer; a simplified state-machine sketch follows the table.

Table 2. Sequencer States
RESET: Remain in this state until reset is released.
LOAD_INIT: Load any initialization values for simulation purposes.
STABLE: Wait until the memory device is stable.
WRITE_ZERO: Issue write command to address 0.
WAIT_WRITE_ZERO: Write all 0xAs to address 0.
WRITE_ONE: Issue write command to address 1.
WAIT_WRITE_ONE: Write all 0x5s to address 1.

Valid Calibration States
V_READ_ZERO: Issue read command to address 0 (expected data is all 0xAs).
V_READ_NOP: This state represents the minimum number of cycles required between two back-to-back read commands. The number of NOP states depends on the burst length.
V_READ_ONE: Issue read command to address 1 (expected data is all 0x5s).
V_WAIT_READ: Wait for the read valid signal.
V_COMPARE_READ_ZERO_READ_ONE: Parameterizable number of cycles to wait before making the read data comparisons.
V_CHECK_READ_FAIL: When a read fails, the write pointer (in the AFI clock domain) of the valid FIFO buffer is incremented. The read pointer of the valid FIFO buffer is in the DQS clock domain. The gap between the read and write pointers is effectively the latency between the time when the PHY receives the read command and the time valid data is returned to the PHY.
V_ADD_FULL_RATE: Advance the read valid FIFO buffer write pointer by an extra full-rate cycle.
V_ADD_HALF_RATE: Advance the read valid FIFO buffer write pointer by an extra half-rate cycle. In full-rate designs, equivalent to V_ADD_FULL_RATE.
V_READ_FIFO_RESET: Reset the read and write pointers of the read data synchronization FIFO buffer.
V_CALIB_DONE: Valid calibration is successful.

Latency Calibration States
L_READ_ONE: Issue read command to address 1 (expected data is all 0x5s).
L_WAIT_READ: Wait for the read valid signal from the read datapath. Initial read latency is set to a predefined maximum value.
L_COMPARE_READ_ONE: Check returned read data against expected data. If data is correct, go to L_REDUCE_LATENCY; otherwise go to L_ADD_MARGIN.
L_REDUCE_LATENCY: Reduce the latency counter by 1.
L_READ_FLUSH: Read from address 0, to flush the contents of the read data resynchronization FIFO buffer.
L_WAIT_READ_FLUSH: Wait until the whole FIFO buffer is flushed, then go back to L_READ and try again.
L_ADD_MARGIN: Increment the latency counter by 3 (1 cycle to get the correct data, 2 more cycles of margin for run-time variations). If the latency counter value is smaller than the predefined ideal-condition minimum, go to CALIB_FAIL.
CALIB_DONE: Calibration is successful.
CALIB_FAIL: Calibration is not successful.
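As a rough illustration only, the following heavily simplified Verilog state machine steps through a write/read/compare flow in the spirit of Table 2. State names are abbreviated, and the real sequencer performs far more (FIFO pointer adjustment, latency reduction, flushing):

// Heavily simplified sketch of an RTL calibration sequencer.
// Not the shipped RTL; real calibration has many more states and
// protocol-specific timing.
module seq_sketch (
    input  wire clk, rst_n,
    input  wire read_valid,
    input  wire data_correct,   // comparator result for returned data
    output reg  cal_done, cal_fail
);
    localparam RESET = 3'd0, WRITE = 3'd1, READ = 3'd2,
               WAIT_READ = 3'd3, COMPARE = 3'd4, DONE = 3'd5, FAIL = 3'd6;
    reg [2:0] state;
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state <= RESET; cal_done <= 1'b0; cal_fail <= 1'b0;
        end else begin
            case (state)
                RESET:     state <= WRITE;      // memory assumed stable here
                WRITE:     state <= READ;       // write known patterns
                READ:      state <= WAIT_READ;  // issue read command
                WAIT_READ: if (read_valid) state <= COMPARE;
                COMPARE:   state <= data_correct ? DONE : FAIL;
                DONE:      cal_done <= 1'b1;    // relinquish control to controller
                FAIL:      cal_fail <= 1'b1;
                default:   state <= RESET;
            endcase
        end
    end
endmodule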
1.8 Shadow Registers
Shadow registers are a hardware feature of Arria V GZ and Stratix V devices that enables high-speed multi-rank calibration for DDR3 quarter-rate and half-rate memory interfaces, up to 800 MHz for dual-rank interfaces and 667 MHz for quad-rank interfaces.

Prior to the introduction of shadow registers, the data valid window of a multi-rank interface was calibrated to the overlapping portion of the data valid windows of the individual ranks. The resulting data valid window for the interface would be smaller than the individual data valid windows, limiting overall performance.

Figure 8. Calibration of Overlapping Data Valid Windows, without Shadow Registers

Shadow registers allow the sequencer to calibrate each rank separately and fully, and then to save the calibrated settings for each rank in its own set of shadow registers, which are part of the IP scan chains. During a rank-to-rank switch, the rank-specific set of calibration settings is restored just in time to optimize the data valid window for each rank. The following figure illustrates how the use of rank-specific calibration settings results in a data valid window appropriate for the current rank.

Figure 9. Rank-Specific Calibration Settings, with Shadow Registers

The shadow registers and their associated rank-switching circuitry are part of the device I/O periphery hardware.

1.8.1 Shadow Registers Operation
The sequencer calibrates each rank individually and stores the resulting configuration in shadow registers, which are part of the IP scan chains. UniPHY then selects the appropriate configuration for the rank in use, switching between configurations as necessary. Calibration results for deskew delay chains are stored in the shadow registers. For DQS enable/disable, delay chain configurations come directly from the FPGA core.

Signals
The afi_wrank signal indicates the rank to which the controller is writing, so that the PHY can switch to the appropriate setting. Signal timing is identical to afi_dqs_burst; that is, afi_wrank must be asserted at the same time as afi_dqs_burst, and must be of the same duration. The afi_rrank signal indicates the rank from which the controller is reading, so that the PHY can switch to the appropriate setting. This signal must be asserted at the same time as afi_rdata_en when issuing a read command, and once asserted, must remain unchanged until the controller issues a new read command to another rank.

1.9 UniPHY Interfaces
The following figure shows the major blocks of the UniPHY and how it interfaces with the external memory device and the controller.

Note: Instantiating the delay-locked loop (DLL) and the phase-locked loop (PLL) on the same level as the UniPHY eases DLL and PLL sharing.
Figure 10. UniPHY Interfaces with the Controller and the External Memory

The following interfaces are on the UniPHY top-level file:
• AFI
• Memory interface
• DLL sharing interface
• PLL sharing interface
• OCT interface

AFI
The UniPHY datapath uses the Altera PHY interface (AFI). The AFI is a simple connection between the PHY and the controller. The AFI is based on the DDR PHY interface (DFI) specification, with some calibration-related signals not used and some additional Intel-specific sideband signals added. For more information about the AFI, refer to AFI 4.0 Specification, in this chapter.

The Memory Interface
For information on the memory interface, refer to UniPHY Signals, in this chapter.

Related Links
• AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• UniPHY Signals on page 33
The following tables list the UniPHY signals.

1.9.1 The DLL and PLL Sharing Interface
You can generate the UniPHY memory interface and configure it to share its PLL, DLL, or both interfaces.

By default, a UniPHY memory interface variant contains a PLL and DLL; the PLL produces a variety of required clock signals derived from the reference clock, and the DLL produces a delay codeword. In this case the PLL sharing mode is "No sharing". A UniPHY variant can be configured as a PLL master and/or DLL master, in which case the corresponding interfaces are exported to the UniPHY top level and can be connected to an identically configured UniPHY variant PLL slave and/or DLL slave. The UniPHY slave variant is instantiated without a PLL and/or DLL, which saves device resources.

Note: For Arria II GX, Arria II GZ, Stratix III, and Stratix IV devices, the PLL and DLL must both be shared at the same time—their sharing modes must match. This restriction does not apply to Arria V, Arria V GZ, Cyclone V, or Stratix V devices.

Note: For devices with hard memory interface components onboard, you cannot share PLL or DLL resources between soft and hard interfaces.

1.9.1.1 Sharing PLLs or DLLs
To share PLLs or DLLs, follow these steps:
1. To create a PLL or DLL master, create a UniPHY memory interface IP core. To make the PLL and/or DLL interface appear at the top level in the core, on the PHY Settings tab in the parameter editor, set the PLL Sharing Mode and/or DLL Sharing Mode to Master.
2. To create a PLL or DLL slave, create a second UniPHY memory interface IP core. To make the PLL and/or DLL interface appear at the top level in the core, on the PHY Settings tab set the PLL Sharing Mode and/or DLL Sharing Mode to Slave.
3. Connect the PLL and/or DLL sharing interfaces by following the appropriate step, below:
• For cores generated with the IP Catalog: connect the PLL and/or DLL interface ports between the master and slave cores in your wrapper RTL. When using PLL sharing, connect the afi_clk, afi_half_clk, and afi_reset_export_n outputs from the UniPHY PLL master to the afi_clk, afi_half_clk, and afi_reset_in inputs on the UniPHY PLL slave. (A hedged wiring sketch follows this list.)
• For cores generated with Qsys: connect the PLL and/or DLL interface in the Qsys GUI. When using PLL sharing, connect the afi_clk, afi_half_clk, and afi_reset_export_n outputs from the UniPHY PLL master to the afi_clk, afi_half_clk, and afi_reset_in inputs on the UniPHY PLL slave. Qsys supports only one-to-one conduit connections in the patch panel. To share a PLL from a UniPHY PLL master with multiple slaves, replicate the number of PLL sharing conduit interfaces in the Qsys patch panel by choosing Number of PLL sharing interfaces in the parameter editor.
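For the IP Catalog flow, the master-to-slave wiring might look like the following sketch inside your wrapper module. The core and instance names are hypothetical and generated port lists vary, so treat this as an illustration of the connections named above only:

// Hypothetical wrapper wiring a UniPHY PLL master to a PLL slave.
// Only the PLL sharing connections are shown; memory, AFI, and status
// ports are omitted.
wire afi_clk_shared, afi_half_clk_shared, afi_reset_shared_n;

ddr3_master u_master (
    .pll_ref_clk        (pll_ref_clk),
    .global_reset_n     (global_reset_n),
    .afi_clk            (afi_clk_shared),      // outputs of the PLL master
    .afi_half_clk       (afi_half_clk_shared),
    .afi_reset_export_n (afi_reset_shared_n)
    /* ... remaining ports ... */
);

ddr3_slave u_slave (
    .afi_clk            (afi_clk_shared),      // inputs on the PLL slave
    .afi_half_clk       (afi_half_clk_shared),
    .afi_reset_in       (afi_reset_shared_n)
    /* ... remaining ports ... */
);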
Note: You may connect a slave UniPHY instance to the clocks from a user-defined PLL instead of from a UniPHY master. The general procedure for doing so is as follows:
1. Make a template, by generating your IP with PLL Sharing Mode set to No Sharing, and then compiling the example project to determine the frequency and phases of the clock outputs from the PLL.
2. Generate an external PLL using the IP Catalog flow, with the equivalent output clocks.
3. Generate your IP with PLL Sharing Mode set to Slave, and connect the external PLL to the PLL sharing interface.
You must be very careful when connecting clock signals to the slave. Connecting to clocks with a frequency or phase different from what the core expects may result in hardware failure.

Note: The signal dll_pll_locked is an internal signal from the PLL to the DLL which ensures that the DLL remains in reset mode until the PLL becomes locked. This signal is not available for use by customer logic.

1.9.1.2 About PLL Simulation
PLL frequencies may differ between the synthesis and simulation file sets. In either case, the achieved PLL frequencies and phases are calculated and reported in real time in the parameter editor.

For the simulation file set, clocks are specified in the RTL not in units of frequency but by the period in picoseconds, thus avoiding clock drift due to picosecond rounding error. For the synthesis file set, there are two mechanisms by which clock frequencies are specified in the RTL, based on the target device family:
• For Arria V, Arria V GZ, Cyclone V, and Stratix V, clock frequencies are specified in MHz.
• For Arria II GX, Arria II GZ, Stratix III, and Stratix IV, clock frequencies are specified by integer multipliers and divisors. For these families, the real simulation model—as opposed to the default abstract simulation model—also uses clock frequencies specified by integer ratios.

1.9.2 The OCT Sharing Interface
By default, the UniPHY IP generates the required OCT control block at the top-level RTL file for the PHY. If you want, you can instantiate this block elsewhere in your code and feed the required termination control signals into the IP core by turning off Master for OCT Control Block on the PHY Settings tab.

If you turn off Master for OCT Control Block, you must instantiate the OCT control block or use another UniPHY instance as a master, and ensure that the parallel and series termination control bus signals are connected to the PHY. The following figures show the PHY architecture with and without Master for OCT Control Block.

Figure 11. PHY Architecture with Master for OCT Control Block
Figure 12. PHY Architecture without Master for OCT Control Block (the series and parallel termination control buses enter the PHY from an external OCT control block)

1.9.2.1 Modifying the Pin Assignment Script for QDR II and RLDRAM II
If you generate a QDR II or RLDRAM II slave IP core, you must modify the pin assignment script to allow the Fitter to correctly resolve the OCT termination block name in the OCT master core.

To modify the pin assignment script for QDR II or RLDRAM II slaves, follow these steps:
1. In a text editor, open your system's Tcl pin assignments script file, as follows:
• For systems generated with the IP Catalog: open the <IP core name>/<slave core name>_p0_pin_assignments.tcl file.
• For systems generated with Qsys: open the <HDL Path>/<submodules>/<slave core name>_p0_pin_assignments.tcl file.
2. Search for the following line:
set ::master_corename "_MASTER_CORE_"
3. Replace _MASTER_CORE_ with the instance name of the UniPHY master to which the slave is connected. The instance name is determined from the pin assignments file name, as follows:
• For systems generated with Qsys, the instance name is the <master core name> component of the pin assignments file name: <HDL path>/<submodules>/<master core name>_p0_pin_assignments.tcl.
• For systems generated with the IP Catalog, the instance name is the <master core name> component of the pin assignments file name: <IP core name>/<master core name>_p0_pin_assignments.tcl.

1.10 UniPHY Signals
The following tables list the UniPHY signals.

Table 3. Clock and Reset Signals
pll_ref_clk (Input, width 1): PLL reference clock input.
global_reset_n (Input, width 1): Active-low global reset for the PLL and all logic in the PHY, which causes a complete reset of the whole system. The minimum recommended pulse width is 100 ns.
soft_reset_n (Input, width 1): Holding soft_reset_n low holds the PHY in a reset state. However, it does not reset the PLL, which keeps running. It also holds the afi_reset_n output low.

Table 4. DDR2 and DDR3 SDRAM Interface Signals
mem_ck, mem_ck_n (Output, MEM_CK_WIDTH): Memory clock.
mem_cke (Output, MEM_CLK_EN_WIDTH): Clock enable.
mem_cs_n (Output, MEM_CHIP_SELECT_WIDTH): Chip select.
mem_cas_n (Output, MEM_CONTROL_WIDTH): Column address strobe.
mem_ras_n (Output, MEM_CONTROL_WIDTH): Row address strobe.
mem_we_n (Output, MEM_CONTROL_WIDTH): Write enable.
mem_a (Output, MEM_ADDRESS_WIDTH): Address.
mem_ba (Output, MEM_BANK_ADDRESS_WIDTH): Bank address.
mem_dqs, mem_dqs_n (Bidirectional, MEM_DQS_WIDTH): Data strobe.
mem_dq (Bidirectional, MEM_DQ_WIDTH): Data.
mem_dm (Output, MEM_DM_WIDTH): Data mask.
mem_odt (Output, MEM_ODT_WIDTH): On-die termination.
mem_reset_n (DDR3 only) (Output, width 1): Reset.
mem_ac_parity (DDR3 only, RDIMM/LRDIMM only) (Output, MEM_CONTROL_WIDTH): Address/command parity bit. (Even parity, per the RDIMM spec, JESD82-29A.)
mem_err_out_n (DDR3 only, RDIMM/LRDIMM only) (Input, MEM_CONTROL_WIDTH): Address/command parity error.

Table 5. UniPHY Parameters
AFI_RATIO: AFI_RATIO is 1 in full-rate designs, 2 for half-rate designs, and 4 for quarter-rate designs.
MEM_IF_DQS_WIDTH: The number of DQS pins in the interface.
MEM_ADDRESS_WIDTH: The address width of the specified memory device.
MEM_BANK_WIDTH: The bank width of the specified memory device.
MEM_CHIP_SELECT_WIDTH: The chip select width of the specified memory device.
MEM_CONTROL_WIDTH: The control width of the specified memory device.
MEM_DM_WIDTH: The DM width of the specified memory device.
MEM_DQ_WIDTH: The DQ width of the specified memory device.
MEM_READ_DQS_WIDTH: The read DQS width of the specified memory device.
MEM_WRITE_DQS_WIDTH: The write DQS width of the specified memory device.
OCT_SERIES_TERM_CONTROL_WIDTH: —
OCT_PARALLEL_TERM_CONTROL_WIDTH: —
AFI_ADDRESS_WIDTH: The AFI address width, derived from the corresponding memory interface width.
AFI_BANK_WIDTH: The AFI bank width, derived from the corresponding memory interface width.
AFI_CHIP_SELECT_WIDTH: The AFI chip select width, derived from the corresponding memory interface width.
AFI_DATA_MASK_WIDTH: The AFI data mask width.
AFI_CONTROL_WIDTH: The AFI control width, derived from the corresponding memory interface width.
AFI_DATA_WIDTH: The AFI data width.
AFI_DQS_WIDTH: The AFI DQS width.
DLL_DELAY_CTRL_WIDTH: The DLL delay output control width.
NUM_SUBGROUP_PER_READ_DQS: A read datapath parameter for timing purposes.
QVLD_EXTRA_FLOP_STAGES: A read datapath parameter for timing purposes.
READ_VALID_TIMEOUT_WIDTH: A read datapath parameter; calibration fails when the timeout counter expires.
READ_VALID_FIFO_WRITE_ADDR_WIDTH: A read datapath parameter; the write address width for half-rate clocks.
READ_VALID_FIFO_READ_ADDR_WIDTH: A read datapath parameter; the read address width for full-rate clocks.
MAX_LATENCY_COUNT_WIDTH: A latency calibration parameter; the maximum latency count width.
MAX_READ_LATENCY: A latency calibration parameter; the maximum read latency.
READ_FIFO_READ_ADDR_WIDTH: —
READ_FIFO_WRITE_ADDR_WIDTH: —
MAX_WRITE_LATENCY_COUNT_WIDTH: A write datapath parameter; the maximum write latency count width.
INIT_COUNT_WIDTH: An initialization sequence count width.
MRSC_COUNT_WIDTH: A memory-specific initialization parameter.
INIT_NOP_COUNT_WIDTH: A memory-specific initialization parameter.
MRS_CONFIGURATION: A memory-specific initialization parameter.
MRS_BURST_LENGTH: A memory-specific initialization parameter.
MRS_ADDRESS_MODE: A memory-specific initialization parameter.
MRS_DLL_RESET: A memory-specific initialization parameter.
MRS_IMP_MATCHING: A memory-specific initialization parameter.
MRS_ODT_EN: A memory-specific initialization parameter.
MEM_T_WL: A memory-specific initialization parameter.
MEM_T_RL: A memory-specific initialization parameter.
SEQ_BURST_COUNT_WIDTH: The burst count width for the sequencer.
VCALIB_COUNT_WIDTH: The width of a counter that the sequencer uses.
DOUBLE_MEM_DQ_WIDTH: —
HALF_AFI_DATA_WIDTH: —
CALIB_REG_WIDTH: The width of the calibration status register.
NUM_AFI_RESET: The number of AFI resets to generate.

Note: For information about the AFI signals, refer to AFI 4.0 Specification in this chapter.
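For orientation, a hedged sketch of how the Table 3 clock and reset signals and a few Table 4 memory pins appear on a generated instance; the core and instance names are hypothetical, and the full port list varies with your configuration:

// Hypothetical top-level hookup of the Table 3 clock/reset signals.
// Only a few representative ports are shown.
ddr3_uniphy u_emif (
    .pll_ref_clk    (clk_100m),        // PLL reference clock input
    .global_reset_n (board_reset_n),   // full system reset, >= 100 ns pulse
    .soft_reset_n   (1'b1),            // hold high unless resetting PHY only
    .mem_a          (mem_a),           // MEM_ADDRESS_WIDTH bits
    .mem_ba         (mem_ba),          // MEM_BANK_WIDTH bits
    .mem_dq         (mem_dq),          // MEM_DQ_WIDTH bits, bidirectional
    .mem_dqs        (mem_dqs),
    .mem_dqs_n      (mem_dqs_n)
    /* ... AFI or Avalon ports, remaining memory pins ... */
);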
Related Links
AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.

1.11 PHY-to-Controller Interfaces
Various modules connect to UniPHY through specific ports.

The AFI standardizes and simplifies the interface between controller and PHY for all Intel memory designs, thus allowing you to easily interchange your own controller code with Intel's high-performance controllers. The AFI PHY interface includes an administration block that configures the memory for calibration and performs necessary accesses to mode registers that configure the memory as required.

For half-rate designs, the address and command signals in the UniPHY are asserted for one mem_clk cycle (1T addressing), such that there are two input bits per address and command pin in half-rate designs. If you require a more conservative 2T addressing (where signals are asserted for two mem_clk cycles), drive both input bits (of the address and command signal) identically in half-rate designs.

The following figure shows the half-rate write operation.

Figure 13. Half-Rate Write with Word-Aligned Data

The following figure shows a full-rate write.

Figure 14. Full-Rate Write

For quarter-rate designs, the address and command signals in the UniPHY are asserted for one mem_clk cycle (1T addressing), such that there are four input bits per address and command pin in quarter-rate designs. If you require a more conservative 2T addressing (where signals are asserted for two mem_clk cycles), drive either the two lower input bits or the two upper input bits (of the address and command signal) identically.

After calibration is completed, the sequencer sends the write latency in number of clock cycles to the controller.

The AFI has the following conventions:
• With the AFI, high and low signals are combined in one signal, so for a single-chip-select interface, afi_cs_n is afi_cs_n[1:0]; location 0 appears on the memory bus on one mem_clk cycle and location 1 on the next mem_clk cycle.
Note: This convention is maintained for all signals, so for an 8-bit memory interface, the write data (afi_wdata) signal is afi_wdata[31:0], where the first data on the DQ pins is afi_wdata[7:0], then afi_wdata[15:8], then afi_wdata[23:16], then afi_wdata[31:24].
• Spaced reads and writes have the following definitions:
— Spaced writes—write commands separated by a gap of one controller clock (afi_clk) cycle.
— Spaced reads—read commands separated by a gap of one controller clock (afi_clk) cycle.

The following figures show writes and reads, where the IP core writes data to and reads from the same address. In each example, afi_rdata and afi_wdata are aligned with controller clock (afi_clk) cycles. All the data in the bit vector is valid at once. These figures assume the following general points:
• The burst length is four.
• An 8-bit interface with one chip select.
• The data for one controller clock (afi_clk) cycle represents data for two memory clock (mem_clk) cycles (half-rate interface).
A hedged stimulus sketch for one such half-rate write precedes the figures below.
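The following testbench-style Verilog fragment sketches how a custom controller might issue one half-rate, burst-of-four write under these conventions. It assumes afi_clk and the calibrated afi_wlat exist in scope; the slice encodings are assumptions to be checked against Figure 15, and this is not production controller logic:

// Hypothetical stimulus for a half-rate, burst-of-four write
// (two mem_clk slices per afi_clk cycle, slice 0 first).
reg [1:0]  afi_cs_n = 2'b11;
reg [1:0]  afi_dqs_burst = 2'b00, afi_wdata_valid = 2'b00;
reg [31:0] afi_wdata;
integer i;

task half_rate_write(input [31:0] data);
  begin
    @(posedge afi_clk);
    afi_cs_n <= 2'b10;               // assumed: bit [0] = first slice, low = selected
    @(posedge afi_clk);
    afi_cs_n <= 2'b11;
    for (i = 0; i < afi_wlat; i = i + 1)
      @(posedge afi_clk);            // wait the calibrated write latency
    afi_dqs_burst <= 2'b10;          // assumed preamble slice, one mem_clk early
    @(posedge afi_clk);
    afi_dqs_burst   <= 2'b11;        // strobe active for both slices
    afi_wdata_valid <= 2'b11;
    afi_wdata       <= data;         // data[7:0] reaches the DQ pins first
    @(posedge afi_clk);
    afi_dqs_burst   <= 2'b00;
    afi_wdata_valid <= 2'b00;
  end
endtask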
Figure 15. Word-Aligned Writes

Notes to Figure:
1. To show the even alignment of afi_cs_n, expand the signal (this convention applies for all other signals).
2. The afi_dqs_burst must go high one memory clock cycle before afi_wdata_valid. Compare with the word-unaligned case.
3. The afi_wdata_valid is asserted afi_wlat + 1 controller clock (afi_clk) cycles after chip select (afi_cs_n) is asserted. The afi_wlat indicates the required write latency in the system. The value is determined during calibration and is dependent upon the relative delays in the address and command path and the write datapath in both the PHY and the external DDR SDRAM subsystem. The controller must drive afi_cs_n and then wait afi_wlat (two in this example) afi_clks before driving afi_wdata_valid.
4. Observe the ordering of write data (afi_wdata). Compare this to data on the mem_dq signal.
5. In all waveforms a command record is added that combines the memory pins ras_n, cas_n, and we_n into the current command that is issued. This command is registered by the memory when chip select (mem_cs_n) is low. The important commands in the presented waveforms are WR = write, ACT = activate.

Figure 16. Word-Aligned Reads

Notes to Figure:
1. For AFI, afi_rdata_en is required to be asserted one memory clock cycle before chip select (afi_cs_n) is asserted. In the half-rate afi_clk domain, this requirement manifests as the controller driving 11 (as opposed to the 01) on afi_rdata_en.
2. AFI requires that afi_rdata_en is driven for the duration of the read. In this example, it is driven to 11 for two half-rate afi_clks, which equates to driving to 1 for the four memory clock cycles of this four-beat burst.
3. The afi_rdata_valid returns 15 (afi_rlat) controller clock (afi_clk) cycles after afi_rdata_en is asserted. "Returned" means the cycle in which the afi_rdata_valid signal is observed at the output of a register within the controller. A controller can use the afi_rlat value to determine when to register the returned data, but this is unnecessary because afi_rdata_valid is provided for the controller to use as an enable when registering read data.
4. Observe the alignment of returned read data with respect to data on the bus.

Related Links
• AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Timing Diagrams for UniPHY IP on page 443
The following topics contain timing diagrams for UniPHY-based external memory interface IP for supported protocols.
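On the read side, a controller can simply use afi_rdata_valid as a register enable, per note 3 above. A minimal Verilog fragment (register names hypothetical):

// Minimal read-data capture: afi_rdata is valid only while
// afi_rdata_valid is asserted.
reg [31:0] rdata_q;
reg        rdata_q_valid;
always @(posedge afi_clk) begin
    rdata_q_valid <= |afi_rdata_valid;   // any slice valid
    if (|afi_rdata_valid)
        rdata_q <= afi_rdata;            // beat order: [7:0] returned first
end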
1.12 Using a Custom Controller
By default, the UniPHY-based external memory interface IP cores are delivered with both the PHY and the memory controller integrated, as depicted in the following figure. If you want to use your own custom controller with the UniPHY PHY, check the Generate PHY only box on the PHY Settings tab of the parameter editor and generate the IP. The resulting top-level IP consists of only the sequencer, UniPHY datapath, and PLL/DLL—the shaded area in the figure below.

Figure 17. Memory Controller with UniPHY (panels: Controller with UniPHY; PHY Only)

The AFI interface is exposed at the top level of the generated IP core; you can connect the AFI interface to your custom controller. When you enable Generate PHY only, the generated example designs include the memory controller appropriately instantiated to mediate read/write commands from the traffic generator to the PHY-only IP.

For information on the AFI protocol, refer to the AFI 4.0 Specification, in this chapter. For information on the example designs, refer to Chapter 9, Example Designs, in this volume.

Related Links
• AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Traffic Generator and BIST Engine on page 416
The traffic generator and built-in self test (BIST) engine for Avalon-MM memory interfaces generates Avalon-MM traffic on an Avalon-MM master interface.

1.13 AFI 3.0 Specification
The Altera PHY interface (AFI) 3.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.

The AFI is a single-data-rate interface, meaning that data is transferred on the rising edge of each clock cycle. Most memory interfaces, however, operate at double data rate, transferring data on both the rising and falling edges of the clock signal. If the AFI interface is to directly control a double-data-rate signal, two single-data-rate bits must be transmitted on each clock cycle; the PHY then sends out one bit on the rising edge of the clock and one bit on the falling edge. The AFI convention is to send the low part of the data first and the high part second, as shown in the following figure.

Figure 18. Single Versus Double Data Rate Transfer

1.13.1 Bus Width and AFI Ratio
In cases where the AFI clock frequency is one-half or one-quarter of the memory clock frequency, the AFI data must be twice or four times as wide, respectively, as the corresponding memory data. The ratio between AFI clock and memory clock frequencies is referred to as the AFI ratio. (A half-rate AFI interface has an AFI ratio of 2, while a quarter-rate interface has an AFI ratio of 4.)

In general, the width of an AFI signal depends on the following three factors:
• The size of the equivalent signal on the memory interface. For example, if a[15:0] is a DDR3 address input and the AFI clock runs at the same speed as the memory interface, the equivalent afi_addr bus will be 16 bits wide.
• The data rate of the equivalent signal on the memory interface. For example, if d[7:0] is a double-data-rate QDR II input data bus and the AFI clock runs at the same speed as the memory interface, the equivalent afi_write_data bus will be 16 bits wide.
• The AFI ratio. For example, if cs_n is a single-bit DDR3 chip select input and the AFI clock runs at half the speed of the memory interface, the equivalent afi_cs_n bus will be 2 bits wide.

The following formula summarizes the three factors described above:

AFI_width = memory_width * signal_rate * AFI_RATE_RATIO

Note: The above formula is a general rule, but not all signals obey it. For definite signal-size information, refer to the specific table.
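A hedged sketch of the formula as Verilog localparams, using example values (16-bit single-data-rate address, 8-bit double-data-rate data, half-rate AFI clock); the example values are assumptions for illustration:

// Illustrative derivation of AFI bus widths per the formula above.
localparam MEM_IF_ADDR_WIDTH = 16;
localparam MEM_IF_DQ_WIDTH   = 8;
localparam ADDR_RATE_RATIO   = 1;  // address/command: single data rate here
localparam DATA_RATE_RATIO   = 2;  // DQ: double data rate
localparam AFI_RATE_RATIO    = 2;  // half-rate AFI clock

localparam AFI_ADDR_WIDTH = MEM_IF_ADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO; // 32
localparam AFI_DQ_WIDTH   = MEM_IF_DQ_WIDTH   * AFI_RATE_RATIO * DATA_RATE_RATIO; // 32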
1.13.2 AFI Parameters
The following tables list Altera PHY interface (AFI) parameters for AFI 3.0. The parameters described in the following tables affect the width of AFI signal buses. Parameters prefixed by MEM_IF_ refer to the signal size at the interface between the PHY and the memory device.

Table 6. Ratio Parameters
AFI_RATE_RATIO: The ratio between the AFI clock frequency and the memory clock frequency. For full-rate interfaces this value is 1, for half-rate interfaces the value is 2, and for quarter-rate interfaces the value is 4.
DATA_RATE_RATIO: The number of data bits transmitted per clock cycle. For single-data-rate protocols this value is 1, and for double-data-rate protocols this value is 2.
ADDR_RATE_RATIO: The number of address bits transmitted per clock cycle. For single-data-rate address protocols this value is 1, and for double-data-rate address protocols this value is 2.

Table 7. Memory Interface Parameters
MEM_IF_ADDR_WIDTH: The width of the address bus on the memory device(s).
MEM_IF_BANKADDR_WIDTH: The width of the bank address bus on the interface to the memory device(s). Typically, log2 of the number of banks.
MEM_IF_CS_WIDTH: The number of chip selects on the interface to the memory device(s).
MEM_IF_WRITE_DQS_WIDTH: The number of DQS (or write clock) signals on the write interface; for example, the number of DQS groups.
MEM_IF_CLK_PAIR_COUNT: The number of CK/CK# pairs.
MEM_IF_DQ_WIDTH: The number of DQ signals on the interface to the memory device(s). For single-ended interfaces such as QDR II, this value is the number of D or Q signals.
MEM_IF_DM_WIDTH: The number of data mask pins on the interface to the memory device(s).
MEM_IF_READ_DQS_WIDTH: The number of DQS signals on the read interface; for example, the number of DQS groups.

Table 8. Derived AFI Parameters
AFI_ADDR_WIDTH = MEM_IF_ADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
AFI_BANKADDR_WIDTH = MEM_IF_BANKADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
AFI_CONTROL_WIDTH = AFI_RATE_RATIO * ADDR_RATE_RATIO
External Memory Interface Handbook Volume 3: Reference Material 44 1 Functional Description—UniPHY Parameter Name Derivation Equation AFI_CS_WIDTH MEM_IF_CS_WIDTH * AFI_RATE_RATIO AFI_DM_WIDTH MEM_IF_DM_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO AFI_DQ_WIDTH MEM_IF_DQ_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO AFI_WRITE_DQS_WIDTH MEM_IF_WRITE_DQS_WIDTH * AFI_RATE_RATIO AFI_LAT_WIDTH 6 AFI_RLAT_WIDTH AFI_LAT_WIDTH AFI_WLAT_WIDTH AFI_LAT_WIDTH * MEM_IF_WRITE_DQS_WIDTH AFI_CLK_PAIR_COUNT MEM_IF_CLK_PAIR_COUNT AFI_WRANK_WIDTH Number of ranks * MEM_IF_WRITE_DQS_WIDTH *AFI_RATE_RATIO AFI_RRANK_WIDTH Number of ranks * MEM_IF_READ_DQS_WIDTH *AFI_RATE_RATIO 1.13.3 AFI Signals The following tables list Altera PHY interface (AFI) signals grouped according to their functions. In each table, the Direction column denotes the direction of the signal relative to the PHY. For example, a signal defined as an output passes out of the PHY to the controller. The AFI specification does not include any bidirectional signals. Not all signals are used for all protocols. 1.13.3.1 AFI Clock and Reset Signals The AFI interface provides up to two clock signals and an asynchronous reset signal. Table 9. Clock and Reset Signals Signal Name Direction Width Description afi_clk Output 1 Clock with which all data exchanged on the AFI bus is synchronized. In general, this clock is referred to as full-rate, half-rate, or quarter-rate, depending on the ratio between the frequency of this clock and the frequency of the memory device clock. afi_half_clk Output 1 Clock signal that runs at half the speed of the afi_clk. The controller uses this signal when the half-rate bridge feature is in use. This signal is optional. afi_reset_n Output 1 Asynchronous reset output signal. You must synchronize this signal to the clock domain in which you use it. 1.13.3.2 AFI Address and Command Signals The address and command signals for AFI 3.0 encode read/write/configuration commands to send to the memory device. The address and command signals are single-data rate signals. External Memory Interface Handbook Volume 3: Reference Material 45 1 Functional Description—UniPHY Table 10. Address and Command Signals Signal Name Direction Width Description afi_ba Input AFI_BANKADDR_WIDTH Bank address. (Not applicable for LPDDR3.) afi_cke Input AFI_CLK_EN_WIDTH Clock enable. afi_cs_n Input AFI_CS_WIDTH Chip select signal. (The number of chip selects may not match the number of ranks; for example, RDIMMs and LRDIMMs require a minimum of 2 chip select signals for both single-rank and dual-rank configurations. Consult your memory device data sheet for information about chip select signal width.) (Matches the number of ranks for LPDDR3.) afi_ras_n Input AFI_CONTROL_WIDTH RAS# (for DDR2 and DDR3 memory devices.) afi_we_n Input AFI_CONTROL_WIDTH WE# (for DDR2, DDR3, and RLDRAM II memory devices.) afi_cas_n Input AFI_CONTROL_WIDTH CAS# (for DDR2 and DDR3 memory devices.) afi_ref_n Input AFI_CONTROL_WIDTH REF# (for RLDRAM II memory devices.) afi_rst_n Input AFI_CONTROL_WIDTH RESET# (for DDR3 and DDR4 memory devices.) afi_odt Input AFI_CLK_EN_WIDTH On-die termination signal for DDR2, DDR3, and LPDDR3 memory devices. (Do not confuse this memory device signal with the FPGA’s internal on-chip termination signal.) afi_mem_clk_disable Input AFI_CLK_PAIR_COUNT When this signal is asserted, mem_clk and mem_clk_n are disabled. This signal is used in lowpower mode. afi_wps_n Output AFI_CS_WIDTH WPS (for QDR II/II+ memory devices.) 
afi_rps_n Output AFI_CS_WIDTH RPS (for QDR II/II+ memory devices.) 1.13.3.3 AFI Write Data Signals Write Data Signals for AFI 3.0 control the data, data mask, and strobe signals passed to the memory device during write operations. Table 11. Write Data Signals Signal Name afi_dqs_burst Direction Input Width AFI_WRITE_DQS_WIDTH Description Controls the enable on the strobe (DQS) pins for DDR2, DDR3, and LPDDR2 memory devices. When this signal is asserted, mem_dqs and mem_dqsn are driven. continued... External Memory Interface Handbook Volume 3: Reference Material 46 1 Functional Description—UniPHY Signal Name Direction Width Description This signal must be asserted before afi_ wdata_valid to implement the write preamble, and must be driven for the correct duration to generate a correctly timed mem_dqs signal. afi_wdata_valid Input AFI_WRITE_DQS_WIDTH Write data valid signal. This signal controls the output enable on the data and data mask pins. afi_wdata Input AFI_DQ_WIDTH Write data signal to send to the memory device at double-data rate. This signal controls the PHY’s mem_dq output. afi_dm Input AFI_DM_WIDTH Data mask. This signal controls the PHY’s mem_dm signal for DDR2, DDR3, LPDDR2 and RLDRAM II memory devices.) afi_bws_n Input AFI_DM_WIDTH Data mask. This signal controls the PHY’s mem_bws_n signal for QDR II/II+ memory devices. afi_wrank Input AFI_WRANK_WIDTH Shadow register signal. Signal indicating the rank to which the controller is writing, so that the PHY can switch to the appropriate setting. Signal timing is identical to afi_dqs_burst; that is, afi_ wrank must be asserted at the same time as afi_dqs_burst, and must be of the same duration. 1.13.3.4 AFI Read Data Signals Read Data Signals for AFI 3.0 control the data sent from the memory device during read operations. Table 12. Read Data Signals Signal Name Direction Width Description afi_rdata_en Input AFI_RATE_RATIO Read data enable. Indicates that the memory controller is currently performing a read operation. This signal is held high only for cycles of relevant data (read data masking).If this signal is aligned to even clock cycles, it is possible to use 1-bit even in half-rate mode (i.e., AFI_RATE=2). afi_rdata_en_full Input AFI_RATE_RATIO Read data enable full. Indicates that the memory controller is currently performing a read operation. This signal is held high for the entire read burst.If this signal is aligned to even clock cycles, it is possible to use 1-bit even in half-rate mode (i.e., AFI_RATE=2). continued... External Memory Interface Handbook Volume 3: Reference Material 47 1 Functional Description—UniPHY Signal Name Direction Width Description afi_rdata Output AFI_DQ_WIDTH Read data from the memory device. This data is considered valid only when afi_rdata_valid is asserted by the PHY. afi_rdata_valid Output AFI_RATE_RATIO Read data valid. When asserted, this signal indicates that the afi_rdata bus is valid. If this signal is aligned to even clock cycles, it is possible to use 1-bit even in half-rate mode (i.e., AFI_RATE=2). afi_rrank Input AFI_RRANK_WIDTH Shadow register signal. Signal indicating the rank from which the controller is reading, so that the PHY can switch to the appropriate setting. Must be asserted at the same time as afi_rdata_en when issuing a read command, and once asserted, must remain unchanged until the controller issues a new read command to another rank. 
1.13.3.5 AFI Calibration Status Signals The PHY instantiates a sequencer which calibrates the memory interface with the memory device and some internal components such as read FIFOs and valid FIFOs. The sequencer reports the results of the calibration process to the controller through the Calibration Status Signals in the AFI interface. Table 13. Calibration Status Signals Signal Name Direction Width Description afi_cal_success Output 1 Asserted to indicate that calibration has completed successfully. afi_cal_fail Output 1 Asserted to indicate that calibration has failed. afi_cal_req Input 1 Effectively a synchronous reset for the sequencer. When this signal is asserted, the sequencer returns to the reset state; when this signal is released, a new calibration sequence begins. afi_wlat Output AFI_WLAT_WIDTH The required write latency in afi_clk cycles, between address/command and write data being issued at the PHY/ controller interface. The afi_wlat value can be different for different groups; each group’s write latency can range from 0 to 63. If write latency is the same for all groups, only the lowest 6 bits are required. afi_rlat Output AFI_RLAT_WIDTH The required read latency in afi_clk cycles between address/command and read data being returned to the PHY/controller interface. Values can range from 0 to 63. (1) Note to Table: 1. The afi_rlat signal is not supported for PHY-only designs. Instead, you can sample the afi_rdata_valid signal to determine when valid read data is available. External Memory Interface Handbook Volume 3: Reference Material 48 1 Functional Description—UniPHY 1.13.3.6 AFI Tracking Management Signals When tracking management is enabled, the sequencer can take control over the AFI 3.0 interface at given intervals, and issue commands to the memory device to track the internal DQS Enable signal alignment to the DQS signal returning from the memory device. The tracking management portion of the AFI 3.0 interface provides a means for the sequencer and the controller to exchange handshake signals. Table 14. Tracking Management Signals Signal Name Direction Width Description afi_ctl_refresh_done Input MEM_IF_CS_WIDTH Handshaking signal from controller to tracking manager, indicating that a refresh has occurred and waiting for a response. afi_seq_busy Output MEM_IF_CS_WIDTH Handshaking signal from sequencer to controller, indicating when DQS tracking is in progress. afi_ctl_long_idle Input MEM_IF_CS_WIDTH Handshaking signal from controller to tracking manager, indicating that it has exited low power state without a periodic refresh, and waiting for response. 1.14 Register Maps The following table lists the overall register mapping for the DDR2, DDR3, and LPDDR2 SDRAM Controllers with UniPHY. Note: Addresses shown in the table are 32-bit word addresses. If a byte-addressed master such as a Nios II processor accesses the CSR, it is necessary to multiply the addresses by four. Table 15. Register Map Address Description UniPHY Register Map 0x001 Reserved. 0x004 UniPHY status register 0. 0x005 UniPHY status register 1. 0x006 UniPHY status register 2. 0x007 UniPHY memory initialization parameters register 0. Controller Register Map 0x100 Reserved. 0x110 Controller status and configuration register. 0x120 Memory address size register 0. 0x121 Memory address size register 1. 0x122 Memory address size register 2. 0x123 Memory timing parameters register 0. continued... 
External Memory Interface Handbook Volume 3: Reference Material 49 1 Functional Description—UniPHY Address Description 0x124 Memory timing parameters register 1. 0x125 Memory timing parameters register 2. 0x126 Memory timing parameters register 3. 0x130 ECC control register. 0x131 ECC status register. 0x132 ECC error address register. 1.14.1 UniPHY Register Map The UniPHY register map allows you to control the memory components’ mode register settings. The following table lists the register map for UniPHY. Note: Addresses shown in the table are 32-bit word addresses. If a byte-addressed master such as a Nios II processor accesses the CSR, it is necessary to multiply the addresses by four. Table 16. UniPHY Register Map Address 0x001 0x002 0x004 0x005 0x006 Bit Name Default Access Description 15:0 Reserved. 0 — Reserved for future use. 31:16 Reserved. 0 — Reserved for future use. 15:0 Reserved. 0 — Reserved for future use. 31:16 Reserved. 0 — Reserved for future use. 0 SOFT_RESET — Write only 23:1 Reserved. 0 — 24 AFI_CAL_SUCCESS — Read only Reports the value of the UniPHY afi_cal_success. Writing to this bit has no effect. 25 AFI_CAL_FAIL — Read only Reports the value of the UniPHY afi_cal_fail. Writing to this bit has no effect. 26 PLL_LOCKED — Read only Reports the PLL lock status. 31:27 Reserved. 0 — Reserved for future use. 7:0 Reserved. 0 — Reserved for future use. 15:8 Reserved. 0 — Reserved for future use. 23:16 Reserved. 0 — Reserved for future use. 31:24 Reserved. 0 — Reserved for future use. 7:0 INIT_FAILING_STAGE — Read only Initiate a soft reset of the interface. This bit is automatically deasserted after reset. Reserved for future use. Initial failing error stage of calibration. Only applicable if AFI_CAL_FAIL=1. 0: None 1: Read Calibration - VFIFO 2: Write Calibration - Write Leveling continued... External Memory Interface Handbook Volume 3: Reference Material 50 1 Functional Description—UniPHY Address Bit Name Default Access Description 3: Read Calibration - LFIFO Calibration 4: Write Calibration - Write Deskew 5: Unused 6: Refresh 7: Calibration Skipped 8: Calibration Aborted 9: Read Calibration - VFIFO After Writes 15:8 INIT_FAILING_SUBSTA GE — Read only Initial failing error substage of calibration. Only applicable if AFI_CAL_FAIL=1. If INIT_FAILING_STAGE = 1 or 9: 1: Read Calibration - Guaranteed read failure 2: Read Calibration - No working DQSen phase found 3: Read Calibration - Per-bit read deskew failure If INIT_FAILING_STAGE = 2: 1: Write Calibration - No first working write leveling phase found 2: Write Calibration - No last working write leveling phase found 3: Write Calibration - Write leveling copy failure If INIT_FAILING_STAGE = other, substages do not apply. 23:16 INIT_FAILING_GROUP — Read only Initial failing error group of calibration. Only applicable if AFI_CAL_FAIL=1. Returns failing DQ pin instead of failing group, if: INIT_FAILING_STAGE=1 and INIT_FAILING_SUBSTAGE=3. Or INIT_FAILING_STAGE=4 and INIT_FAILING_SUBSTAGE=1. 31:24 Reserved. 0 — 0x007 31:0 DQS_DETECT — Read only Identifies if DQS edges have been identified for each of the groups. Each bit corresponds to one DQS group. 0x008(DD R2) 1:0 RTT_NOM — Read only Rtt (nominal) setting of the DDR2 Extended Mode Register used during memory initialization. 31:2 Reserved. 0 — 2:0 RTT_NOM — Rtt (nominal) setting of the DDR3 MR1 mode register used during memory initialization. 4:3 Reserved. 0 Reserved for future use. 
6:5 ODS — Output driver impedence control setting of the DDR3 MR1 mode register used during memory initialization. 0x008(DD R3) Reserved for future use. Reserved for future use. continued... External Memory Interface Handbook Volume 3: Reference Material 51 1 Functional Description—UniPHY Address 0x008(LP DDR2) Bit Name Default Access Description 8:7 Reserved. 0 Reserved for future use. 10:9 RTT_WR — Rtt (writes) setting of the DDR3 MR2 mode register used during memory initialization. 31:11 Reserved. 0 Reserved for future use. 3:0 DS Driver impedence control for MR3 during initialization. 31:4 Reserved. Reserved for future use. 1.14.2 Controller Register Map The controller register map allows you to control the memory controller settings. Note: Dynamic reconfiguration is not currently supported. For information on the controller register map, refer to Controller Register Map, in the Functional Description—HPC II chapter. Related Links • Soft Controller Register Map on page 371 The soft controller register map allows you to control the soft memory controller settings. • Hard Controller Register Map on page 375 The hard controller register map allows you to control the hard memory controller settings. 1.15 Ping Pong PHY Ping Pong PHY is an implementation of UniPHY that allows two memory interfaces to share address and command buses through time multiplexing. Compared to having two independent interfaces, Ping Pong PHY uses fewer pins and less logic, while maintaining equivalent throughput. The Ping Pong PHY supports only quarter-rate configurations of the DDR3 protocol on Arria V GZ and Stratix V devices. 1.15.1 Ping Pong PHY Feature Description In conventional UniPHY, the address and command buses of a DDR3 quarter-rate interface use 2T time—meaning that they are issued for two full-rate clock cycles, as illustrated below. External Memory Interface Handbook Volume 3: Reference Material 52 1 Functional Description—UniPHY Figure 19. 2T Command Timing CK CSn Addr, ba Extra Setup Time 2T Command Issued Active Period With the Ping Pong PHY, address and command signals from two independent controllers are multiplexed onto shared buses by delaying one of the controller outputs by one full-rate clock cycle. The result is 1T timing, with a new command being issued on each full-rate clock cycle. The following figure shows address and command timing for the Ping Pong PHY. Figure 20. 1T Command Timing Use by Ping Pong PHY CK CSn[0] CSn[1] Addr, ba Cmd Dev1 Cmd Dev0 External Memory Interface Handbook Volume 3: Reference Material 53 1 Functional Description—UniPHY 1.15.2 Ping Pong PHY Architecture The following figure shows a top-level block diagram of the Ping Pong PHY. Functionally, the IP looks like two independent memory interfaces. The two controller blocks are referred to as right-hand side (RHS) and left-hand side (LHS), respectively. A gasket block located between the controllers and the PHY merges the AFI signals. The PHY is double data width and supports both memory devices. The sequencer is the same as with regular UniPHY, and calibrates the entire double-width PHY. Figure 21. Ping Pong PHY Architecture Ping Pong PHY Driver 0 DQ, DQS, DM Sequencer x2n CS, ODT, CKE PHY x2n Ping Pong Gasket Driver 1 Dev 0 xn CAS, RAS, WE, ADDR, BA AFI Multiplexer LHS Controller xn CAS, RAS, WE, ADDR, BA RHS Controller CS, ODT, CKE DQ, DQS, DM Dev 1 xn xn 1.15.2.1 Ping Pong Gasket The gasket delays and remaps quarter-rate signals so that they are correctly timemultiplexed at the full-rate PHY output. 
The gasket also merges address and command buses ahead of the PHY. AFI interfaces at the input and output of the gasket provide compatibility with the PHY and with memory controllers. Figure 22. Ping Pong PHY Gasket Architecture 2x AFI AFI Ping Pong Gasket RHS Controller ADD/CMD, WDATA RDATA LHS Controller ADD/CMD, WDATA RDATA 1T Delay Reorder & Merge Reorder & Split ADD/CMD, WDATA RDATA To AFI Multiplexer & PHY From AFI Multiplexer & PHY The following table shows how the gasket processes key AFI signals. External Memory Interface Handbook Volume 3: Reference Material 54 1 Functional Description—UniPHY Table 17. Key AFI Signals Processed by Ping Pong PHY Gasket Signal Direction(Width multiplier) Description Gasket Conversions cas, ras, we, addr, ba Controller (1x) to PHY (1x) Address and command buses shared between devices. Delay RHS by 1T; merge. cs, odt, cke Controller (1x) to PHY (2x) Chip select, on-die termination, and clock enable, one per device. Delay RHS by 1T; reorder, merge. wdata, wdata_valid, dqs_burst, dm Controller (1x) to PHY (2x) Write datapath signals, one per device. Delay RHS by 1T; reorder, merge. rdata_en_rd, rdata_en_rd_full Controller (1x) to PHY (2x) Read datapath enable signals indicating controller performing a read operation, one per device. Delay RHS by 1T. rdata_rdata_valid PHY (2x) to Controller (1x) Read data, one per device. Reorder; split. cal_fail, cal_success, seq_busy, wlat, rlat PHY (1x) to Controller (1x) Calibration result, one per device. Pass through. rst_n, mem_clk_disable, ctl_refresh_done, ctl_long_idle Controller (1x) to PHY (1x) Reset and DQS tracking signals, one per PHY. AND (&) cal_req, init_req Controller (1x) to PHY (1x) Controller to sequencer requests. OR (|) wrank, rrank Controller (1x) to PHY (2x) Shadow register support. Delay RHS by 1T; reorder; merge. 1.15.2.2 Ping Pong PHY Calibration The sequencer treats the Ping Pong PHY as a regular interface of double the width. For example, in the case of two x16 devices, the sequencer calibrates both devices together as a x32 interface. The sequencer chip select signal fans out to both devices so that they are treated as a single interface. The VFIFO calibration process is unchanged. For LFIFO calibration, the LFIFO buffer is duplicated for each interface and the worst-case read datapath delay of both interfaces is used. 1.15.3 Ping Pong PHY Operation To use the Ping Pong PHY, proceed as described below. 1. Configure a single memory interface according to your requirements. 2. Select the Enable Ping Pong PHY option in the Advanced PHY Options section of the PHY Settings tab in the DDR3 parameter editor. The Quartus Prime software then replicates the interface, resulting in two memory controllers and a shared PHY, with the gasket block inserted between the controllers and PHY. The system makes the necessary modifications to top-level component connections, as well as the PHY read and write datapaths, and the AFI mux, without further input from you. 1.16 Efficiency Monitor and Protocol Checker The Efficiency Monitor and Protocol Checker allows measurement of traffic efficiency on the Avalon-MM bus between the controller and user logic, measures read latencies, and checks the legality of Avalon commands passed from the master. The Efficiency External Memory Interface Handbook Volume 3: Reference Material 55 1 Functional Description—UniPHY Monitor and Protocol Checker is available with the DDR2, DDR3, and LPDDR2 SDRAM controllers with UniPHY and the RLDRAM II Controller with UniPHY. 
The Efficiency Monitor and Protocol Checker is is not available for QDR II and QDR II+ SRAM, or for the MAX 10 device family, or for Arria V or Cyclone V designs using the Hard Memory Controller. 1.16.1 Efficiency Monitor The Efficiency Monitor reports read and write throughput on the controller input, by counting command transfers and wait times, and making that information available to the External Memory Interface Toolkit via an Avalon slave port. This information may be useful to you when experimenting with advanced controller settings, such as command look ahead depth and burst merging. 1.16.2 Protocol Checker The Protocol Checker checks the legality of commands on the controller’s input interface against the Intel Avalon interface specification, and sets a flag in a register on an Avalon slave port if an illegal command is detected. 1.16.3 Read Latency Counter The Read Latency Counter measures the minimum and maximum wait times for read commands to be serviced on the Avalon bus. Each read command is time-stamped and placed into a FIFO buffer upon arrival, and latency is determined by comparing that timestamp to the current time when the first beat of the returned read data is provided back to the master. 1.16.4 Using the Efficiency Monitor and Protocol Checker To include the Efficiency Monitor and Protocol Checker when you generate your IP core, proceed as described below. 1. On the Diagnostics tab in the parameter editor, turn on Enable the Efficiency Monitor and Protocol Checker on the Controller Avalon Interface. 2. To see the results of the data compiled by the Efficiency Monitor and Protocol Checker, use the External Memory Interface Toolkit. For information on the External Memory Interface Toolkit, refer to External Memory Interface Debug Toolkit, in section 2 of this volume. For information about the Avalon interface, refer to Avalon Interface Specifications. Related Links • Avalon Interface Specifications • External Memory Interface Debug Toolkit on page 480 The EMIF Toolkit lets you run your own traffic patterns, diagnose and debug calibration problems, and produce margining reports for your external memory interface. External Memory Interface Handbook Volume 3: Reference Material 56 1 Functional Description—UniPHY 1.16.5 Avalon CSR Slave and JTAG Memory Map The following table lists the memory map of registers inside the Efficiency Monitor and Protocol Checker. This information is only of interest if you want to communicate directly with the Efficiency Monitor and Protocol Checker without using the External Memory Interface Toolkit. This CSR map is not part of the UniPHY CSR map. Prior to reading the data in the CSR, you must issue a read command to address 0x01 to take a snapshot of the current data. Table 18. Avalon CSR Slave and JTAG Memory Map Address Bit Name Default Access Description Used internally by EMIF Toolkit to identify Efficiency Monitor type. This address must be read prior to reading the other CSR contents. 0x01 31:0 Reserved 0 Read Only 0x02 31:0 Reserved 0 — 0x08 0 Efficiency Monitor reset — Write only 7:1 Reserved — — 8 Protocol Checker reset — Write only 15:9 Reserved — — 16 Start/stop Efficiency Monitor — Read/Write 23:17 Reserved — — 31:24 Efficiency Monitor status — Read Only bit bit bit bit 15:0 Efficiency Monitor address width — Read Only Address width of the Efficiency Monitor. 31:16 Efficiency Monitor data width — Read Only Data Width of the Efficiency Monitor. 
15:0 Efficiency Monitor byte enable — Read Only Byte enable width of the Efficiency Monitor. 31:16 Efficiency Monitor burst count width — Read Only Burst count width of the Efficiency Monitor. 0x14 31:0 Cycle counter — Read Only Clock cycle counter for the Efficiency Monitor. Lists the number of clock cycles elapsed before the Efficiency Monitor stopped. 0x18 31:0 Transfer counter — Read Only Counts any read or write data transfer cycle. 0x1C 31:0 Write counter — Read Only Counts write requests, including those during bursts. 0x20 31:0 Read counter — Read Only Counts read requests. 0x24 31:0 Readtotal counter — Read Only Counts read requests (total burst requests). 0x10 0x11 Used internally by EMIF Toolkit to identify Efficiency Monitor version. Write a 0 to reset. Reserved for future use. Write a 0 to reset. Reserved for future use. Starting and stopping statistics gathering. Reserved for future use. 0: 1: 2: 3: Efficiency Monitor stopped Waiting for start of pattern Running Counter saturation continued... External Memory Interface Handbook Volume 3: Reference Material 57 1 Functional Description—UniPHY Address Bit Name Default Access Description 0x28 31:0 NTC waitrequest counter — Read Only Counts Non Transfer Cycles (NTC) due to slave wait request high. 0x2C 31:0 NTC noreaddatavalid counter — Read Only Counts Non Transfer Cycles (NTC) due to slave not having read data. 0x30 31:0 NTC master write idle counter — Read Only Counts Non Transfer Cycles (NTC) due to master not issuing command, or pause in write burst. 0x34 31:0 NTC master idle counter — Read Only Counts Non Transfer Cycles (NTC) due to master not issuing command anytime. 0x40 31:0 Read latency min — Read Only The lowest read latency value. 0x44 31:0 Read latency max — Read Only The highest read latency value. 0x48 31:0 Read latency total [31:0] — Read Only The lower 32 bits of the total read latency. 0x49 31:0 Read latency total [63:32] — Read Only The upper 32 bits of the total read latency. 0x50 7:0 Illegal command — Read Only Bits used to indicate which illegal command has occurred. Each bit represents a unique error. 31:8 Reserved — — Reserved for future use. 1.17 UniPHY Calibration Stages The DDR2, DDR3, and LPDDR2 SDRAM, QDR II and QDR II+ SRAM, and RLDRAM II Controllers with UniPHY, and the RLDRAM 3 PHY-only IP, go through several stages of calibration. Calibration information is useful in debugging calibration failures. The section includes an overview of calibration, explanation of the calibration stages, and a list of generated calibration signals. The information in this section applies only to the Nios II-based sequencer used in the DDR2, DDR3, and LPDDR2 SDRAM Controllers with UniPHY versions 10.0 and later, and, optionally, in the QDR II and QDR II+ SRAM and RLDRAM II Controllers with UniPHY version 11.0 and later, and the RLDRAM 3 PHY-only IP. The information in this section applies to the Arria II GZ, Arria V, Arria V GZ, Cyclone V, Stratix III, Stratix IV, and Stratix V device families. Note: For QDR II and QDR II+ SRAM and RLDRAM II Controllers with UniPHY version 11.0 and later, you have the option to select either the RTL-based sequencer or the Nios II-based sequencer. Generally, choose the RTL-based sequencer when area is the major consideration, and choose the Nios II-based sequencer when performance is the major consideration. Note: For RLDRAM 3, write leveling is not performed. The sequencer does not attempt to optimize margin for the tCKDK timing requirement. 
1.17.1 Calibration Overview Calibration configures the memory interface (PHY and I/Os) so that data can pass reliably to and from memory. External Memory Interface Handbook Volume 3: Reference Material 58 1 Functional Description—UniPHY The sequencer illustrated in the figure below calibrates the PHY and the I/Os. To correctly transmit data between a memory device and the FPGA at high speed, the data must be center-aligned with the data clock. Calibration also determines the delay settings needed to center-align the various data signals with respect to their clocks. I/O delay chains implement the required delays in accordance with the computed alignments. The Nios II-based sequencer performs two major tasks: FIFO buffer calibration and I/O calibration. FIFO buffer calibration adjusts FIFO lengths and I/O calibration adjusts any delay chain and phase settings to centeralign data signals with respect to clock signals for both reads and writes. When the calibration process completes, the sequencer shuts off and passes control to the memory controller. Sequencer in Memory Interface Logic Memory Device UniPHY PHY PLL Controller Sequencer User Interface (Avalon-MM) Figure 23. 1.17.2 Calibration Stages The calibration process begins when the PHY reset signal deasserts and the PLL and DLL lock. The following stages of calibration take place: 1. Read calibration part one—DQS enable calibration (only for DDR2 and DDR3 SDRAM Controllers with UniPHY) and DQ/DQS centering 2. Write calibration part one—Leveling 3. Write calibration part two—DQ/DQS centering 4. Read calibration part two—Read latency minimization Note: For multirank calibration, the sequencer transmits every read and write command to each rank in sequence. Each read and write test is successful only if all ranks pass the test. The sequencer calibrates to the intersection of all ranks. The calibration process assumes the following conditions; if either of these conditions is not true, calibration likely fails in its early stages: • The address and command paths must be functional; calibration does not tune the address and command paths. (The Quartus Prime software fully analyzes the timing for the address and command paths, and the slack report is accurate, assuming the correct board timing parameters.) • At least one bit per group must work before running per-bit-deskew calibration. (This assumption requires that DQ-to-DQS skews be within the recommended 20 ps.) External Memory Interface Handbook Volume 3: Reference Material 59 1 Functional Description—UniPHY 1.17.3 Memory Initialization The memory is powered up according to protocol initialization specifications. All ranks power up simultaneously. Once powered, the device is ready to receive mode register load commands. This part of initialization occurs separately for each rank. The sequencer issues mode register set commands on a per-chip-select basis and initializes the memory to the user-specified settings. 1.17.4 Stage 1: Read Calibration Part One—DQS Enable Calibration and DQ/DQS Centering Read calibration occurs in two parts. Part one is DQS enable calibration with DQ/DQS centering, which happens during stage 1 of the overall calibration process; part two is read latency minimization, which happens during stage 4 of the overall calibration process. 
The objectives of DQS enable calibration and DQ/DQS centering are as follows: • To calculate when the read data is received after a read command is issued to setup the Data Valid Prediction FIFO (VFIFO) cycle • To align the input data (DQ) with respect to the clock (DQS) to maximize the read margins (DDR2 and DDR3 only) DQS enable calibration and DQ/DQS centering consists of the following actions: • Guaranteed Write • DQS Enable Calibration • DQ/DQS Centering The following figure illustrates the components in the read data path that the sequencer calibrates in this stage. (The round knobs in the figure represent configurable hardware over which the sequencer has control.) External Memory Interface Handbook Volume 3: Reference Material 60 1 Functional Description—UniPHY Figure 24. Read Data Path Calibration Model DQS Capture DQ Read Data LFIFO DQS Enable DQS Read Enable Read Calibration VFIFO mem_clk afi_clk 1.17.4.1 Guaranteed Write Because initially no communication can be reliably performed with the memory device, the sequencer uses a guaranteed write mechanism to write data into the memory device. (For the QDR II protocol, guaranteed write is not necessary, a simple write mechanism is sufficient.) The guaranteed write is a write command issued with all data pins, all address and bank pins, and all command pins (except chip select) held constant. The sequencer begins toggling DQS well before the expected latch time at memory and continues to toggle DQS well after the expected latch time at memory. DQ-to-DQS relationship is not a factor at this stage because DQ is held constant. Figure 25. Guaranteed Write of Zeros Ex tended ea rlier DQ[0] Ac tual burst Ex tended later 000 0 00 0 00 0 DQ S External Memory Interface Handbook Volume 3: Reference Material 61 1 Functional Description—UniPHY The guaranteed write consists of a series of back-to-back writes to alternating columns and banks. For example, for DQ[0] for the DDR3 protocol, the guaranteed write performs the following operations: • Writes a full burst of zeros to bank 0, column 0 • Writes a full burst of zeros to bank 0, column 1 • Writes a full burst of ones to bank 3, column 0 • Writes a full burst of ones to bank 3, column 1 (Different protocols may use different combinations of banks and columns.) The guaranteed write is followed by back-to-back read operations at alternating banks, effectively producing a stream of zeros followed by a stream of ones, or vice versa. The sequencer uses the zero-to-one and one-to-zero transitions in between the two bursts to identify a correct read operation, as shown in the figure below. Although the approach described above for pin DQ[0] would work by writing the same pattern to all DQ pins, it is more effective and robust to write (and read) alternating ones and zeros to alternating DQ bits. The value of the DQ bit is still constant across the burst, and the back-to-back read mechanism works exactly as described above, except that odd DQ bits have ones instead of zeros, or vice versa. The guaranteed write does not ensure a correct DQS-to-memory clock alignment at the memory device—DQS-to-memory clock alignment is performed later, in stage 2 of the calibration process. However, the process of guaranteed write followed by read calibration is repeated several times for different DQS-to-memory clock alignments, to ensure at least one correct alignment is found. Figure 26. 
Back to Back Reads on Pin DQ[0] B an k 0 B an k 3 C o lu m n 0 000 0 00 0 0 111 1 11 1 1 C o lu m n 1 111 1 11 1 1 000 0 00 0 0 B an k 3 B an k 0 1.17.4.2 DQS Enable Calibration DQS enable calibration ensures reliable capture of the DQ signal without glitches on the DQS line. At this point LFIFO is set to its maximum value to guarantee a reliable read from read capture registers to the core. Read latency is minimized later. Note: The full DQS enable calibration is applicable only for DDR2 and DDR3 protocols; QDR II and RLDRAM protocols use only the VFIFO-based cycle-level calibration, described below. Note: Delay and phase values used in this section are examples, for illustrative purposes. Your exact values may vary depending on device and configuration. DQS enable calibration controls the timing of the enable signal using 3 independent controls: a cycle-based control (the VFIFO), a phase control, and a delay control. The VFIFO selects the cycle by shifting the controller-generated read data enable signal, External Memory Interface Handbook Volume 3: Reference Material 62 1 Functional Description—UniPHY rdata_en, by a number of full-rate clock cycles. The phase is controlled using the DLL, while the delays are adjusted using a sequence of individual delay taps. The resolution of the phase and delay controls varies with family and configuration, but is approximately 45° for the phase, and between 10 and 50 picoseconds for the delays. The sequencer finds the two edges of the DQS enable window by searching the space of cycles, phases, and delays (an exhaustive search can usually be avoided by initially assuming the window is at least one phase wide). During the search, to test the current settings, the sequencer issues back-to-back reads from column 0 of bank 0 and bank 3, and column 1 of bank 0 and bank 3, as shown in the preceding figure. Two full bursts are read and compared with the reference data for each phase and delay setting. Once the sequencer identifies the two edges of the window, it center-aligns the falling edge of the DQS enable signal within the window. At this point, per-bit deskew has not yet been performed, therefore not all bits are expected to pass the read test; however, for read calibration to succeed, at least one bit per group must pass the read test. The following figure shows the DQS and DQS enable signal relationship. The goal of DQS enable calibration is to find settings that satisfy the following conditions: • The DQS enable signal rises before the first rising edge of DQS. • The DQS enable signal is at one after the second-last falling edge of DQS. • The DQS enable signal falls before the last falling edge of DQS. The ideal position for the falling edge of the DQS enable signal is centered between the second-last and last falling edges of DQS. Figure 27. DQS and DQS Enable Signal Relationships Row 1 Row 2 DQS +90 dqs_enable (inside I/O) VFIFO Latency Row 3 dqs_enable aligned Row 4 dqs_enable (inside I/O) Row 5 dqs_enable aligned nable DQS E VFIFO Latency 1 Search for first working setting nable DQS E Search for last working setting External Memory Interface Handbook Volume 3: Reference Material 63 1 Functional Description—UniPHY The following points describe each row of the above figure: • Row 1 shows the DQS signal shifted by 90° to center-align it to the DQ data. • Row 2 shows the raw DQS enable signal from the VFIFO. • Row 3 shows the effect of sweeping DQS enable phases. 
The first two settings (shown in red) fail to properly gate the DQS signal because the enable signal turns off before the second-last falling edge of DQS. The next six settings (shown in green) gate the DQS signal successfully, with the DQS signal covering DQS from the first rising edge to the second-last falling edge. • Row 4 shows the raw DQS enable signal from the VFIFO, increased by one clock cycle relative to Row 2. • Row 5 shows the effect of sweeping DQS enable, beginning from the initial DQS enable of Row 4. The first setting (shown in green) successfully gates DQS, with the signal covering DQS from the first rising edge to the second-last falling edge. The second signal (shown in red), does not gate DQS successfully because the enable signal extends past the last falling edge of DQS. Any further adjustment would show the same failure. 1.17.4.3 Centering DQ/DQS The centering DQ/DQS stage attempts to align DQ and DQS signals on reads within a group. Each DQ signal within a DQS group might be skewed and consequently arrive at the FPGA at a different time. At this point, the sequencer sweeps each DQ signal in a DQ group to align them, by adjusting DQ input delay chains (D1). The following figure illustrates a four DQ/DQS group per-bit-deskew and centering. Figure 28. Per-bit Deskew DQ0 DQ1 DQ2 DQ3 DQ4 Original DQ relationship in a DQ group DQ signals aligned to left DQS centered with respect to by tuning D1 delay chains DQ by adjusting DQ signals To align and center DQ and DQS, the sequencer finds the right edge of DQ signals with respect to DQS by sweeping DQ signals within a DQ group to the right until a failure occurs. In the above figure, DQ0 and DQ3 fail after six taps to the right; DQ1 and DQ2 fail after 5 taps to the right. To align the DQ signals, DQ0 and DQ3 are shifted to the right by 1 tap. External Memory Interface Handbook Volume 3: Reference Material 64 1 Functional Description—UniPHY To find the center of DVW, the DQS signal is shifted to the right until a failure occurs. In the above figure, a failure occurs after 3 taps, meaning that there are 5 taps to the right edge and 3 taps to the left edge. To center-align DQ and DQS, the sequencer shifts the aligned DQ signal by 1 more tap to the right. Note: The sequencer does not adjust DQS directly; instead, the sequencer center-aligns DQS with respect to DQ by delaying the DQ signals. 1.17.5 Stage 2: Write Calibration Part One The objectives of the write calibration stage are to align DQS to the memory clock at each memory device, and to compensate for address, command, and memory clock skew at each memory device. This stage is important because the address, command, and clock signals for each memory component arrive at different times. Note: This stage applies only to DDR2, DDR3, LPDDR2, and RLDRAM II protocols; it does not apply to the QDR II and QDR II+ protocols. Memory clock signals and DQ/DM and DQS signals have specific relationships mandated by the memory device. The PHY must ensure that these relationships are met by skewing DQ/DM and DQS signals. The relationships between DQ/DM and DQS and memory clock signals must meet the tDQSS, tDSS, and tDSH timing constraints. The sequencer calibrates the write data path using a variety of random burst patterns to compensate for the jitter on the output data path. Simple write patterns are insufficient to ensure a reliable write operation because they might cause imprecise DQS-to-CK alignments, depending on the actual capture circuitry on a memory device. 
The write patterns in the write leveling stage have a burst length of 8, and are generated by a linear feedback shift register in the form of a pseudo-random binary sequence. The write data path architecture is the same for DQ, DM, and DQS pins. The following figure illustrates the write data path for a DQ signal. The phase coming out of the Output Phase Alignment block can be set to different values to center-align DQS with respect to DQ, and it is the same for data, OE, and OCT of a given output. Figure 29. Write Data Path OC T write_ck 2 DDIO T9 ? T10 ? DDIO T9 ? T10 ? DDIO T9 ? T10 ? ? Output Phase Alignment OE 2 data 2 OC T External Memory Interface Handbook Volume 3: Reference Material 65 1 Functional Description—UniPHY In write leveling, the sequencer performs write operations with different delay and phase settings, followed by a read. The sequencer can implement any phase shift between 0° and 720° (depending on device and configuration). The sequencer uses the Output Phase Alignment for coarse delays and T9 and T10 for fine delays; T9 has 15 taps of 50 ps each, and T10 has 7 taps of 50 ps each. The DQS signal phase is held at +90° with respect to DQ signal phase (Stratix IV example). Note: Coarse delays are called phases, and fine delays are called delays; phases are process, voltage, and temperature (PVT) compensated, delays are not (depending on family). For 28 nm devices: Note: • I/O delay chains are not PVT compensated. • DQS input delay chain is PVT compensated. • Leveling delay chains are PVT compensated (does not apply to Arria V or Cyclone V devices). • T11 delay chain for postamble gating has PVT and nonPVT compensated modes, but the PVT compensated mode is not used. Delay and phase values used in this section are examples, for illustrative purposes. Your exact values may vary depending on device and configuration. The sequencer writes and reads back several burst-length-8 patterns. Because the sequencer has not performed per-bit deskew on the write data path, not all bits are expected to pass the write test. However, for write calibration to succeed, at least one bit per group must pass the write test. The test begins by shifting the DQ/DQS phase until the first write operation completes successfully. The DQ/DQS signals are then delayed to the left by D5 and D6 to find the left edge for that working phase. Then DQ/DQS phase continues the shift to find the last working phase. For the last working phase, DQ/DQS is delayed in 50 ps steps to find the right edge of the last working phase. The sequencer sweeps through all possible phase and delay settings for each DQ group where the data read back is correct, to define a window within which the PHY can reliably perform write operations. The sequencer picks the closest value to the center of that window as the phase/delay setting for the write data path. 1.17.6 Stage 3: Write Calibration Part Two—DQ/DQS Centering The process of DQ/DQS centering in write calibration is similar to that performed in read calibration, except that write calibration is performed on the output path, using D5 and D6 delay chains. 1.17.7 Stage 4: Read Calibration Part Two—Read Latency Minimization At this stage of calibration the sequencer adjusts LFIFO latency to determine the minimum read latency that guarantees correct reads. External Memory Interface Handbook Volume 3: Reference Material 66 1 Functional Description—UniPHY Read Latency Tuning In general, DQ signals from different DQ groups may arrive at the FPGA in a staggered fashion. 
In a DIMM or multiple memory device system, the DQ/DQS signals from the first memory device arrive sooner, while the DQ/DQS signals from the last memory device arrive the latest at the FPGA. LFIFO transfers data from the capture registers in IOE to the core and aligns read data to the AFI clock. Up to this point in the calibration process, the read latency has been a maximum value set initially by LFIFO; now, the sequencer progressively lowers the read latency until the data can no longer be transferred reliably. The sequencer then increases the latency by one cycle to return to a working value and adds an additional cycle of margin to assure reliable reads. 1.17.8 Calibration Signals The following table lists signals produced by the calibration process. Table 19. Calibration Signals Signal Description afi_cal_fail Asserts high if calibration fails. afi_cal_success Asserts high if calibration is successful. 1.17.9 Calibration Time The time needed for calibration varies, depending on many factors including the interface width, the number of ranks, frequency, board layout, and difficulty of calibration. In general, designs using the Nios II-based sequencer will take longer to calibrate than designs using the RTL-based sequencer. The following table lists approximate typical and maximum calibration times for various protocols. Table 20. Approximate Calibration Times Protocol Typical Calibration Time Maximum Calibration Time DDR2, DDR3, LPDDR2, RLDRAM 3 50-250 ms Can take several minutes if the interface is difficult to calibrate, or if calibration initially fails and exhausts multiple retries. QDR II/II+, RLDRAM II (with Nios II-based sequencer) 50-100 ms Can take several minutes if the interface is difficult to calibrate, or if calibration initially fails and exhausts multiple retries. QDR II/II+, RLDRAM II (with RTLbased sequencer) <5 ms <5 ms 1.18 Document Revision History Date Version Changes May 2017 2017.05.08 Rebranded as Intel. October 2016 2016.10.31 • Changed Note 3 to the Word-Aligned Writes figure, in the PHY-toController Interfaces topic. continued... External Memory Interface Handbook Volume 3: Reference Material 67 1 Functional Description—UniPHY Date Version Changes May 2016 2016.05.02 • Added statement that Efficiency Monitor and Protocol checker is not available for QDR II and QDR II+ SRAM, or for the MAX 10 device family, or for Arria V or Cyclone V designs using the Hard Memory Controller, to Efficiency Monitor and Protocol Checker topic. November 2015 2015.11.02 • • Replaced the AFI 4.0 Specification with the AFI 3.0 Specification. Replaced instances of Quartus II with Quartus Prime. May 2015 2015.05.04 Maintenance release. December 2014 2014.12.15 • • • August 2014 2014.08.15 • • December 2013 2013.12.16 • • • • • • • • • November 2012 3.1 • • • • Added several parameters to the AFI Specification section: — MEM_IF_BANKGROUP_WIDTH — MEM_IF_C_WIDTH — MEM_IF_CKE_WIDTH — MEM_IF_ODT_WIDTH — AFI_BANKGROUP_WIDTH — AFI_C_WIDTH — AFI_CKE_WIDTH — AFI_ODT_WIDTH Added several signals to the AFI Specification section: — afi_addr — afi_bg — afi_c_n — afi_rw_n — afi_act_n — afi_par — afi_alert_n — afi_ainv — afi_dinv Changed the Width information for several signals in the AFI Specification section: — afi_dqs_burst — afi_wdata_valid — afi_stl_refresh_done — afi_seq_busy — afi_ctl_long_idle Added note about 32-bit word addresses to Register Maps and UniPHY Register Map tables. Changed register map information for address 0x004, bit 26, in UniPHY Register Map table. 
Removed references to HardCopy. Removed DLL Offset Control Block. Removed references to SOPC Builder. Increased minimum recommended pulse width for global_reset_n signal to 100ns. Corrected terminology inconsistency. Added information explaining PVT compensation. Added quarter-rate information to PHY-to-Controller Interfaces section. Expanded descriptions of INIT_FAILING_STAGE and INIT_FAILING_SUBSTAGE in UniPHY Register map. Added footnote about afi_rlat signal to Calibration Status Signals table. Moved Controller Register Map to Functional Description—HPC II Controller chapter. Updated Sequencer States information in Table 1–2. Enhanced Using a Custom Controller information. Enhanced Tracking Manager information. continued... External Memory Interface Handbook Volume 3: Reference Material 68 1 Functional Description—UniPHY Date Version Changes • • • • Added Added Added Added Ping Pong PHY information. RLDRAM 3 support. LRDIMM support. Arria V GZ support. Shadow Registers section. LPDDR2 support. new AFI signals. Calibration Time section. Feedback icon. June 2012 3.0 • • • • • Added Added Added Added Added November 2011 2.1 • Consolidated UniPHY information from 11.0 DDR2 and DDR3 SDRAM Controller with UniPHY User Guide, QDR II and QDR II+ SRAM Controller with UniPHY User Guide, and RLDRAM II Controller with UniPHY IP User Guide. Revised Reset and Clock Generation and Dedicated Clock Networks sections. Revised Figure 1–3 and Figure 1–5. Added Tracking Manager to Sequencer section. Revised Interfaces section for DLL, PLL, and OCT sharing interfaces. Revised Using a Custom Controller section. Added UniPHY Calibration Stages section; reordered stages 3 and 4, removed stage 5. • • • • • • External Memory Interface Handbook Volume 3: Reference Material 69 2 Functional Description—Intel Stratix® 10 EMIF IP 2 Functional Description—Intel Stratix® 10 EMIF IP Intel Stratix® 10 devices can interface with external memory devices clocking at frequencies of up to 1.3 GHz. The external memory interface IP component for Stratix 10 devices provides a single parameter editor for creating external memory interfaces, regardless of memory protocol. Unlike earlier EMIF solutions which used protocolspecific parameter editors to create memory interfaces via a complex RTL generation method, the Stratix 10 EMIF solution captures the protocol-specific hardened EMIF logic of the Stratix 10 device together with more generic soft logic. The Stratix 10 EMIF solution is designed with the following implementations in mind: Hard Memory Controller and Hard PHY This implementation provides a complete external memory interface, based on the hard memory controller and hard PHY that are part of the Stratix 10 silicon. An Avalon-MM interface is available for integration with user logic. Soft Memory Controller and Hard PHY This implementation provides a complete external memory interface, using an Intelprovided soft-logic-based memory controller and the hard PHY that is part of the Stratix 10 silicon. An Avalon-MM interface is available for integration with user logic. Custom Memory Controller and Hard PHY (PHY only) This implementation provides access to the Altera PHY interface (AFI), to allow use of a custom or third-party memory controller with the hard PHY that is part of the Stratix 10 silicon. Because only the PHY component is provided by Intel, this configuration is also known as PHY only. 2.1 Stratix 10 Supported Memory Protocols The following table lists the external memory protocols supported by Stratix 10 devices. 
Table 21. Supported Memory Protocols Protocol Hard Controller and Hard PHY Soft Controller and Hard PHY PHY Only DDR4 Yes — Yes DDR3 Yes — Yes LPDDR3 Yes — Yes RLDRAM 3 — Third party Yes QDR II/II+/II+ Xtreme — Yes — QDR-IV — Yes — Intel Corporation. All rights reserved. Intel, the Intel logo, Altera, Arria, Cyclone, Enpirion, MAX, Nios, Quartus and Stratix words and logos are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Intel warrants performance of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Intel assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. *Other names and brands may be claimed as the property of others. ISO 9001:2008 Registered 2 Functional Description—Intel Stratix® 10 EMIF IP Memory protocols not listed above are not supported by the Stratix 10 EMIF IP; however, you can implement a custom memory interface for these protocols using the PHYLite Megafunction. 2.2 Stratix 10 EMIF IP Support for 3DS/TSV DDR4 Devices The Stratix 10 EMIF IP supports high-density DDR4 memory devices with threedimensional stacked-die (3DS) memory technology in RDIMM format. Both 2H and 4H 3DS devices are supported. To configure a DDR4 RDIMM memory interface to use 3DS devices, do one of the following: • Select one of the 3DS-enabled IP presets available in the parameter editor. • 1. Specify a non-zero value for the Chip ID Width parameter on the Memory tab in the parameter editor. 2. Enter the 3DS-specific specific timing parameters (tRDD_dir, tFAW_dir, tRFC_dir) from the memory vendor's data sheet appropriately. 2.3 Migrating to Stratix 10 from Previous Device Families There is no automatic migration mechanism for external memory interface IP generated for previous device families. To migrate an existing EMIF IP from an earlier device family to Stratix 10, you must reparameterize and regenerate the IP targeting Stratix 10, using either the IP Catalog or Qsys. If you attempt to recompile an existing IP generated for a previous device family, you will encounter errors in the Quartus Prime software. UniPHY-based IP continues to be supported for previous device families. 2.4 Stratix 10 EMIF Architecture: Introduction The Stratix 10 EMIF architecture contains many new hardware features designed to meet the high-speed requirements of emerging memory protocols, while consuming the smallest amount of core logic area and power. The following are key hardware features of the Stratix 10 EMIF architecture: Hard Sequencer The sequencer employs a hard Nios II processor, and can perform memory calibration for a wide range of protocols. You can share the sequencer among multiple memory interfaces of the same or different protocols. Note: You cannot use the hard Nios II processor for any user applications after calibration is complete. Hard PHY The hard PHY in Stratix 10 devices can interface with external memories running at speeds of up to 1.3 GHz. The PHY circuitry is hardened in the silicon, which simplifies the challenges of achieving timing closure and minimal power consumption. 
External Memory Interface Handbook Volume 3: Reference Material 71 2 Functional Description—Intel Stratix® 10 EMIF IP Hard Memory Controller The hard memory controller reduces latency and minimizes core logic consumption in the external memory interface. The hard memory controller supports the DDR3, DDR4, and LPDDR3 memory protocols. PHY-Only Mode Protocols that use a hard controller (DDR4, DDR3, LPDDR3, and RLDRAM 3), provide a PHY-only option, which generates only the PHY and sequencer, but not the controller. This PHY-only mode provides a mechanism by which to integrate your own custom soft controller. High-Speed PHY Clock Tree Dedicated high speed PHY clock networks clock the I/O buffers in Stratix 10 EMIF IP. The PHY clock trees exhibit low jitter and low duty cycle distortion, maximizing the data valid window. Automatic Clock Phase Alignment Automatic clock phase alignment circuitry dynamically adjusts the clock phase of core clock networks to match the clock phase of the PHY clock networks. The clock phase alignment circuitry minimizes clock skew that can complicate timing closure in transfers between the FPGA core and the periphery. Resource Sharing The Stratix 10 architecture simplifies resource sharing between memory interfaces. Resources such as the OCT calibration block, PLL reference clock pin, and core clock can be shared. The hard Nios processor in the I/O subsystem manager (I/O SSM) must be shared across all interfaces in a column. 2.4.1 Stratix 10 EMIF Architecture: I/O Subsystem The I/O subsystem consists of three columns inside the core of Stratix 10 devices. Each column can be thought of as loosely analogous to an I/O bank. Figure 30. Stratix 10 I/O Subsystem Core Fabric I/O Column Transceivers (if applicable) External Memory Interface Handbook Volume 3: Reference Material 72 2 Functional Description—Intel Stratix® 10 EMIF IP The I/O subsystem provides the following features: • General-purpose I/O registers and I/O buffers • On-chip termination control (OCT) • I/O PLLs for external memory interfaces and user logic • Low-voltage differential signaling (LVDS) • External memory interface components, as follows: — Hard memory controller — Hard PHY — Hard Nios processor and calibration logic — DLL 2.4.2 Stratix 10 EMIF Architecture: I/O Column Stratix 10 devices have two I/O columns, which contain the hardware related to external memory interfaces. Each I/O column contains the following major parts: A hardened Nios processor with dedicated memory. This Nios block is referred to as the I/O SSM. • Up to 13 I/O banks. Each I/O bank contains the hardware necessary for an external memory interface. I/O Column LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair Individual I/O Banks Transceiver Block 2L 3H 2K 3G 2J 3F 2I 3E 2H 3D 2G 3C 2F 3B 2A 3A I/O Column I/O Column LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair LVDS I/O Buffer Pair I/O Center Transceiver Block Transceiver Block Figure 31. 
2.4.3 Stratix 10 EMIF Architecture: I/O SSM

Each column includes one I/O subsystem manager (I/O SSM), which contains a hardened Nios II processor with dedicated memory. The I/O SSM is responsible for calibration of all the EMIFs in the column.

The I/O SSM includes dedicated memory which stores both the calibration algorithm and calibration run-time data. The hardened Nios II processor and the dedicated memory can be used only by an external memory interface, and cannot be employed for any other use. The I/O SSM can interface with soft logic, such as the debug toolkit, via an Avalon-MM bus.

The I/O SSM is clocked by an on-die oscillator, and therefore does not consume a PLL.

2.4.4 Stratix 10 EMIF Architecture: I/O Bank

A single I/O bank contains all the hardware needed to build an external memory interface. Each I/O column contains up to 13 I/O banks; the exact number of banks depends on device size and pin package. You can make a wider interface by connecting multiple banks together.

Each I/O bank resides in an I/O column, and contains the following components:
• Hard memory controller
• Sequencer components
• PLL and PHY clock trees
• DLL
• Input DQS clock trees
• 48 pins, organized into four I/O lanes of 12 pins each

Figure 32. I/O Bank Architecture in Stratix 10 Devices

I/O Bank Usage

The pins in an I/O bank can serve as address and command pins, data pins, or clock and strobe pins for an external memory interface. You can implement a narrow interface, such as a DDR3 or DDR4 x8 interface, with only a single I/O bank. A wider interface, such as x72 or x144, can be implemented by configuring multiple adjacent banks in a multi-bank interface. Any pins in a bank which are not used by the external memory interface remain available for use as general purpose I/O pins (of the same voltage standard).

Every I/O bank includes a hard memory controller which you can configure for DDR3 or DDR4. In a multi-bank interface, only the controller of one bank is active; controllers in the remaining banks are turned off to conserve power.

To use a multi-bank Stratix 10 EMIF interface, you must observe the following rules:
• Designate one bank as the address and command bank.
• The address and command bank must contain all the address and command pins.
• The locations of individual address and command pins within the address and command bank must adhere to the pin map defined in the pin table—regardless of whether you use the hard memory controller or not.
• If you do use the hard memory controller, the address and command bank contains the active hard controller.

All the I/O banks in a column are capable of functioning as the address and command bank. However, for minimal latency, you should select the center-most bank of the interface as the address and command bank.

2.4.5 Stratix 10 EMIF Architecture: I/O Lane

An I/O bank contains 48 I/O pins, organized into four I/O lanes of 12 pins each. Each I/O lane can implement one x8/x9 read capture group (DQS group), with two pins functioning as the read capture clock/strobe pair (DQS/DQS#), and up to 10 pins functioning as data pins (DQ and DM pins). To implement x18 and x36 groups, you can use multiple lanes within the same bank.

It is also possible to implement a pair of x4 groups in a lane. In this case, four pins function as clock/strobe pairs, and 8 pins function as data pins. DM is not available for x4 groups. There must be an even number of x4 groups for each interface.

For x4 groups, DQS0 and DQS1 must be placed in the same I/O lane as a pair. Similarly, DQS2 and DQS3 must be paired. In general, DQS(x) and DQS(x+1) must be paired in the same I/O lane.

Table 22. Lanes Used Per Group

Group Size | Number of Lanes Used | Maximum Number of Data Pins per Group
x8 / x9    | 1                    | 10
x18        | 2                    | 22
x36        | 4                    | 46
pair of x4 | 1                    | 4 per group, 8 per lane

Figure 33. x4 Group
Figure 34. x8 Group
Figure 35. x18 Group
Figure 36. x36 Group

2.4.6 Stratix 10 EMIF Architecture: Input DQS Clock Tree

The input DQS clock tree is a balanced clock network that distributes the read capture clock and strobe from the external memory device to the read capture registers inside the I/Os. You can configure an input DQS clock tree in x4 mode, x8/x9 mode, x18 mode, or x36 mode.

Within every bank, only certain physical pins at specific locations can drive the input DQS clock trees. The pin locations that can drive the input DQS clock trees vary, depending on the size of the group.
Table 23. Pins Usable as Read Capture Clock / Strobe Pair

Group Size | Index of Lanes Spanned by Clock Tree | In-Bank Index of Pins Usable as Read Capture Clock / Strobe Pair (Positive Leg / Negative Leg)
x4         | 0A         | 4 / 5
x4         | 0B         | 8 / 9
x4         | 1A         | 16 / 17
x4         | 1B         | 20 / 21
x4         | 2A         | 28 / 29
x4         | 2B         | 32 / 33
x4         | 3A         | 40 / 41
x4         | 3B         | 44 / 45
x8 / x9    | 0          | 4 / 5
x8 / x9    | 1          | 16 / 17
x8 / x9    | 2          | 28 / 29
x8 / x9    | 3          | 40 / 41
x18        | 0, 1       | 12 / 13
x18        | 2, 3       | 36 / 37
x36        | 0, 1, 2, 3 | 20 / 21

2.4.7 Stratix 10 EMIF Architecture: PHY Clock Tree

Dedicated high-speed clock networks drive I/Os in Stratix 10 EMIF. Each PHY clock network spans only one bank. The relatively short span of the PHY clock trees results in low jitter and low duty-cycle distortion, maximizing the data valid window.

The PHY clock tree in Stratix 10 devices can run as fast as 1.3 GHz. All Stratix 10 external memory interfaces use the PHY clock trees.

2.4.8 Stratix 10 EMIF Architecture: PLL Reference Clock Networks

Each I/O bank includes a PLL that can drive the PHY clock trees of that bank, through dedicated connections. In addition to supporting EMIF-specific functions, such PLLs can also serve as general-purpose PLLs for user logic.

Stratix 10 external memory interfaces that span multiple banks use the PLL in each bank. (Some previous device families relied on a single PLL with clock signals broadcast to all I/Os by a clock network.) The Stratix 10 architecture allows for relatively short PHY clock networks, reducing jitter and duty-cycle distortion.

The following mechanisms ensure that the clock outputs of individual PLLs in a multi-bank interface remain in phase:
• A single PLL reference clock source feeds all PLLs. The reference clock signal reaches the PLLs by a balanced PLL reference clock tree. The Quartus Prime software automatically configures the PLL reference clock tree so that it spans the correct number of banks.
• The EMIF IP sets the PLL M and N values appropriately to maintain synchronization among the clock dividers across the PLLs. This requirement restricts the legal PLL reference clock frequencies for a given memory interface frequency and clock rate. The Stratix 10 EMIF IP parameter editor automatically calculates and displays the set of legal PLL reference clock frequencies. If you plan to use an on-board oscillator, you must ensure that its frequency matches the PLL reference clock frequency that you select from the displayed list. The correct M and N values of the PLLs are set automatically based on the PLL reference clock frequency that you select.

Note: The PLL reference clock pin may be placed in the address and command I/O bank or in a data I/O bank; there is no implication for timing. However, for debug flexibility, it is recommended that you place the PLL reference clock in the address and command I/O bank.

Figure 37. PLL Balanced Reference Clock Tree

2.4.9 Stratix 10 EMIF Architecture: Clock Phase Alignment

In Stratix 10 external memory interfaces, a global clock network clocks registers inside the FPGA core, and the PHY clock network clocks registers inside the FPGA periphery. Clock phase alignment circuitry employs negative feedback to dynamically adjust the phase of the core clock signal to match the phase of the PHY clock signal.
The clock phase alignment feature effectively eliminates the clock skew effect in all transfers between the core and the periphery, facilitating timing closure. All Stratix 10 external memory interfaces employ clock phase alignment circuitry.

Figure 38. Clock Phase Alignment Illustration
Figure 39. Effect of Clock Phase Alignment

2.5 Hardware Resource Sharing Among Multiple Stratix 10 EMIFs

Often, it is necessary or desirable to share certain hardware resources between interfaces.

2.5.1 I/O SSM Sharing

The I/O SSM contains a hard Nios II processor and dedicated memory storing the calibration software code and data.

When a column contains multiple memory interfaces, the hard Nios II processor calibrates each interface serially. Interfaces placed within the same I/O column always share the same I/O SSM. The Quartus Prime Fitter handles I/O SSM sharing automatically.

2.5.2 I/O Bank Sharing

Data lanes from multiple compatible interfaces can share a physical I/O bank to achieve a more compact pin placement. To share an I/O bank, interfaces must use the same memory protocol, rate, frequency, I/O standard, and PLL reference clock signal.

Rules for Sharing I/O Banks
• A bank cannot serve as the address and command bank for more than one interface. This means that lanes which implement address and command pins for different interfaces cannot be allocated to the same physical bank.
Note: An exception to the above rule exists when two interfaces are configured in a Ping-Pong PHY fashion. In such a configuration, two interfaces share the same set of address and command pins, effectively meaning that they share the same address and command tile.
• Pins within a lane cannot be shared by multiple memory interfaces.
• Pins that are not used by EMIF IP can serve as general-purpose I/Os of compatible voltage and termination settings.
• You can configure a bank as LVDS or as EMIF, but not both at the same time.
• Interfaces that share banks must reside at consecutive bank locations.

The following diagram illustrates two x16 interfaces sharing an I/O bank. The two interfaces share the same clock phase alignment block, so that one core clock signal can interact with both interfaces. Without sharing, the two interfaces would occupy a total of four physical banks instead of three.
Figure 40. I/O Bank Sharing

2.5.3 PLL Reference Clock Sharing

In Stratix 10, every I/O bank contains a PLL, meaning that it is not necessary to share PLLs in the interest of conserving resources. Nonetheless, it is often desirable to share PLLs for other reasons.

You might want to share PLLs between interfaces for the following reasons:
• To conserve pins.
• When combined with the use of the balanced PLL reference clock tree, to allow the clock signals at different interfaces to be synchronous and aligned to each other. For this reason, interfaces that share core clock signals must also share the PLL reference clock signal.

To implement PLL reference clock sharing, in your RTL code connect the PLL reference clock signal at your design's top level to the PLL reference clock port of multiple interfaces.

To share a PLL reference clock, the following requirements must be met:
• Interfaces must expect a reference clock signal of the same frequency.
• Interfaces must be placed in the same column.
• Interfaces must be placed at adjacent bank locations.

2.5.4 Core Clock Network Sharing

It is often desirable or necessary for multiple memory interfaces to be accessible using a single clock domain in the FPGA core.

You might want to share core clock networks for the following reasons:
• To minimize the area and latency penalty associated with clock domain crossing.
• To minimize consumption of core clock networks.

Multiple memory interfaces can share the same core clock signals under the following conditions:
• The memory interfaces have the same protocol, rate, frequency, and PLL reference clock source.
• The interfaces reside in the same I/O column.
• The interfaces reside in adjacent bank locations.

For multiple memory interfaces to share core clocks, you must specify one of the interfaces as master and the remaining interfaces as slaves. Use the Core clocks sharing setting in the parameter editor to specify the master and slaves. In your RTL, connect the clks_sharing_master_out signal from the master interface to the clks_sharing_slave_in signal of all the slave interfaces. Both the master and slave interfaces expose their own output clock ports in the RTL (e.g. emif_usr_clk, afi_clk), but the signals are equivalent, so it does not matter whether a clock port from a master or a slave is used. A wiring sketch follows this section.

Core clock sharing necessitates PLL reference clock sharing; therefore, only the master interface exposes an input port for the PLL reference clock. All slave interfaces use the same PLL reference clock signal.
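The following top-level wiring sketch illustrates core clock sharing between two interfaces. It is a minimal sketch under stated assumptions: the module names (emif_ddr4_master, emif_ddr4_slave), instance names, and sharing-bus width are illustrative placeholders, while the port names (pll_ref_clk, clks_sharing_master_out, clks_sharing_slave_in, emif_usr_clk) follow the description above. Consult the generated HDL and readme file of your IP for the exact port list and widths.

// Hypothetical top level for two DDR4 EMIF cores generated with the
// "Core clocks sharing" option set to Master and Slave, respectively.
// Memory, reset, and Avalon ports are omitted for brevity.
module emif_core_clock_sharing_top (
  input wire pll_ref_clk  // single reference clock; only the master exposes it
);

  wire [31:0] clks_sharing;   // sharing bus; width is IP-dependent (placeholder)
  wire        usr_clk_master; // core clock output of the master
  wire        usr_clk_slave;  // core clock output of the slave (equivalent)

  // Master interface: exposes the PLL reference clock input and drives
  // the core-clock sharing bus.
  emif_ddr4_master u_master (
    .pll_ref_clk             (pll_ref_clk),
    .clks_sharing_master_out (clks_sharing),
    .emif_usr_clk            (usr_clk_master)
  );

  // Slave interface: has no PLL reference clock port; it receives its
  // clocks through the sharing bus from the master.
  emif_ddr4_slave u_slave (
    .clks_sharing_slave_in (clks_sharing),
    .emif_usr_clk          (usr_clk_slave)
  );

  // usr_clk_master and usr_clk_slave are equivalent; user logic may use
  // either one to clock transfers to both interfaces.

endmodule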
2.6 Stratix 10 EMIF IP Component

The external memory interface IP component for Stratix 10 provides a complete solution for implementing external memory interfaces. The EMIF IP also includes a protocol-specific calibration algorithm that automatically determines the optimal delay settings for a robust external memory interface.

The external memory interface IP comprises the following parts:
• A set of synthesizable files that you can integrate into a larger design
• A stand-alone synthesizable example design that you can use for hardware validation
• A set of simulation files that you can incorporate into a larger project
• A stand-alone simulation example project that you can use to observe controller and PHY operation
• A set of timing scripts that you can use to determine the maximum operating frequency of the memory interface based on external factors such as board skew, trace delays, and memory component timing parameters
• A customized data sheet specific to your memory interface configuration

2.6.1 Instantiating Your Stratix 10 EMIF IP in a Qsys Project

The following steps describe how to instantiate your Stratix 10 EMIF IP in a Qsys project.
1. Within the Qsys interface, select Memories and Memory Controllers in the component Library tree.
2. Under Memories and Memory Controllers, select External Memory Interfaces (Stratix 10).
3. Under External Memory Interfaces (Stratix 10), select the Stratix 10 External Memory Interface component.

Figure 41. Instantiating Stratix 10 EMIF IP in Qsys

2.6.1.1 Logical Connections

The following logical connections exist in a Stratix 10 EMIF IP core.

Table 24. Logical Connections

afi_conduit_end (Conduit)
The Altera PHY Interface (AFI) connects a memory controller to the PHY. This interface is exposed only when you configure the memory interface in PHY-only mode. The interface is synchronous to the afi_clk clock and afi_reset_n reset.

afi_clk_conduit_end (Conduit)
Use this clock signal to clock the soft controller logic. The afi_clk is an output clock coming from the PHY when the memory interface is in PHY-only mode. The phase of afi_clk is adjusted dynamically by hard circuitry for the best data transfer between FPGA core logic and periphery logic with maximum timing margin. Multiple memory interface instances can share a single afi_clk using the Core Clocks Sharing option during IP generation.

afi_half_clk_conduit_end (Conduit)
This clock runs at half the frequency of afi_clk. It is exposed only when the memory interface is in PHY-only mode.

afi_reset_n_conduit_end (Conduit)
This single-bit reset provides a synchronized reset output. Use this signal to reset all registers that are clocked by either afi_clk or afi_half_clk.

cal_debug_avalon_slave (Avalon Slave/Target)
This interface is exposed when the EMIF Debug Toolkit/On-Chip Debug Port option is set to Export. This interface can be connected to a Stratix 10 external memory interface debug component to allow EMIF Debug Toolkit access, or it can be used directly by user logic to access calibration diagnostic data.

cal_debug_clk_clock_sink (Clock Input)
This clock is used for the cal_debug_avalon_slave interface. It can be connected to the emif_usr_clk_clock_source interface.
cal_debug_reset_reset_sink (Reset Input)
This reset is used for the cal_debug_avalon_slave interface. It can be connected to the emif_usr_reset_reset_source interface.

cal_debug_out_avalon_master (Avalon Master/Source)
This interface is exposed when the Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port option is enabled. Connect this interface to the cal_debug_avalon_slave interface of the next EMIF instance in the same I/O column.

cal_debug_out_clk_clock_source (Clock Output)
This interface should be connected to the cal_debug_clk_clock_sink interface of the next EMIF instance in the same I/O column (similar to cal_debug_out_avalon_master).

cal_debug_out_reset_reset_source (Reset Output)
This interface should be connected to the cal_debug_reset_reset_sink interface of the next EMIF instance in the same I/O column (similar to cal_debug_out_avalon_master).

effmon_csr_avalon_slave (Avalon Slave/Target)
This interface allows access to the Efficiency Monitor CSR. For more information, see the documentation on the UniPHY Efficiency Monitor.

global_reset_reset_sink (Reset Input)
This single-wire input port is the asynchronous reset input for the EMIF core.

pll_ref_clk_clock_sink (Clock Input)
This single-wire input port connects the external PLL reference clock to the EMIF core. Multiple EMIF cores may share a PLL reference clock source, provided the restrictions outlined in the PLL Reference Clock Network section are observed.

oct_conduit_end (Conduit)
This logical port is connected to an OCT pin and provides calibrated reference data for EMIF cores with pins that use signaling standards that require on-chip termination. Depending on the I/O standard, reference voltage, and memory protocol, multiple EMIF cores may share a single OCT pin.

mem_conduit_end (Conduit)
This logical conduit can attach an Intel memory model to an EMIF core for simulation. Memory models for various protocols are available under the Memories and Memory Controllers — External Memory Interfaces — Memory Models section of the component library in Qsys. You must ensure that all configuration parameters for the memory model match the configuration parameters of the EMIF core.

status_conduit_end (Conduit)
The status conduit exports two signals that you can sample to determine whether the calibration operation passed or failed for that core.

emif_usr_reset_reset_source (Reset Output)
You should use this single-bit, synchronized reset output to reset all components that are synchronously connected to the EMIF core. Assertion of the global reset input triggers an assertion of this output as well; therefore, you can rely on this signal as the sole reset source for all components connected to the EMIF core.

emif_usr_clk_clock_source (Clock Output)
Use this single-bit clock output to clock all logic connected to the EMIF core. The phase of this clock signal is adjusted dynamically by circuitry in the EMIF core such that data can be transferred between core logic and periphery registers with maximum timing margin. Drive all logic connected to the EMIF core with this clock signal; other clock sources generated from the same reference clock, or even the same PLL, may have unknown phase relationships. Multiple EMIF cores can share a single core clock using the Core Clocks Sharing option described in the Example Design tab of the parameter editor.
ctrl_amm_avalon_slave (Avalon Slave/Target)
Use this Avalon target port to issue read or write commands to the controller. Refer to the Avalon Interface Specifications for more information on how to design cores that comply with the Avalon bus specification. For DDR3, DDR4, and LPDDR3 protocols with the hard PHY and hard controller configuration and an Avalon slave interface exposed, ctrl_amm_avalon_slave is renamed to ctrl_amm_avalon_slave_0. For QDR II, QDR II+, and QDR II+ Xtreme interfaces with hard PHY and soft controller, separate read and write connections are used: ctrl_amm_avalon_slave_0 is the read port and ctrl_amm_avalon_slave_1 is the write port. For QDR-IV interfaces with hard PHY and soft controller operating at quarter rate, a total of eight separate Avalon interfaces (named ctrl_amm_avalon_slave_0 to ctrl_amm_avalon_slave_7) are used to maximize bus efficiency.

Related Links
Avalon Interface Specifications

2.6.2 File Sets

The Stratix 10 EMIF IP core generates four output file sets for every EMIF IP core, arranged according to the following directory structure.

Table 25. Generated File Sets

<core_name>/*
This directory contains only the files required to integrate a generated EMIF core into a larger design. This directory contains:
• Synthesizable HDL source files
• Customized TCL timing scripts specific to the core (protocol and topology)
• HEX files used by the calibration algorithm to identify the interface parameters
• A customized data sheet that describes the operation of the generated core
Note: The top-level HDL file is generated in the root folder as <core_name>.v (or <core_name>.vhd for VHDL designs). You can reopen this file in the parameter editor if you want to modify the EMIF core parameters and regenerate the design.

<core_name>_sim/*
This directory contains the simulation fileset for the generated EMIF core. These files can be integrated into a larger simulation project. For convenience, simulation scripts for compiling the core are provided in the /mentor, /cadence, /synopsys, and /riviera subdirectories. The top-level HDL file, <core_name>.v (or <core_name>.vhd), is located in this folder, and all remaining HDL files are placed in the /altera_emif_arch_nf subfolder, with the customized data sheet. The contents of this directory are not intended for synthesis.

emif_<instance_num>_example_design/*
This directory contains a set of TCL scripts, Qsys project files, and README files for the complete synthesis and simulation example designs. You can invoke these scripts to generate a stand-alone, fully synthesizable project complete with an example driver, or a stand-alone simulation design complete with an example driver and a memory model.

2.6.3 Customized readme.txt File

When you generate your Stratix 10 EMIF IP, the system produces a customized readme file, containing data indicative of the settings in your IP core.
The readme file is <variation_name>/altera_emif_arch_nf_<version_number>/<synth|sim>/<variation_name>_altera_emif_arch_nf_<version_number>_<unique ID>_readme.txt, and contains a summary of Stratix 10 EMIF information, and details specific to your IP core, including:
• Pin location guidelines
• External port names, directions, and widths
• Internal port names, directions, and widths
• Avalon interface configuration details (if applicable)
• Calibration mode
• A brief summary of all configuration settings for the generated IP

You should review the generated readme file for implementation guidelines specific to your IP core.

2.6.4 Clock Domains

The Stratix 10 EMIF IP core provides a single clock domain to drive all logic connected to the EMIF core. The frequency of the clock depends on the core-clock to memory-clock interface rate ratio. For example, a quarter-rate interface with an 800 MHz memory clock would provide a 200 MHz clock to the core (800 MHz / 4 = 200 MHz). The EMIF IP dynamically adjusts the phase of the core clock with respect to the periphery clock to maintain the optimum alignment for transferring data between the core and periphery.

Independent EMIF IP cores driven from the same reference clock have independent core clock domains. You should employ one of the following strategies if you are implementing multiple EMIF cores:
1. Treat all crossings between independent EMIF-generated clock domains as asynchronous, even though they are generated from the same reference clock.
2. Use the Core clocks sharing option to enforce that multiple EMIF cores share the same core clock. You must enable this option before IP generation. This option is applicable only for cores that reside in the same I/O column.

2.6.5 ECC in Stratix 10 EMIF IP

The ECC (error correction code) is a soft component of the Stratix 10 EMIF IP that reduces the chance of errors when reading and writing to external memory. ECC allows correction of single-bit errors and reduces the chance of system failure.

The ECC component includes an encoder, decoder, write FIFO buffer, and modification logic, to allow read-modify-write operations. The ECC code employs standard Hamming logic to correct single-bit errors and to detect double-bit errors. ECC is available in 16-, 24-, 40-, and 72-bit widths.

When writing data to memory, the encoder creates ECC bits and writes them together with the regular data. When reading from memory, the decoder checks the ECC bits and regular data, and passes the regular data unchanged if no errors are detected. If a single-bit error is detected, the ECC logic corrects the error and passes the regular data. If more than a single-bit error is detected, the ECC logic sets a flag to indicate the error.

Read-modify-write operations can occur in the following circumstances:
• A partial write in data mask mode (with or without ECC), where at least one memory burst of byte-enable is not all ones or all zeros.
• Auto-correction with ECC enabled. This is usually a dummy write issued by the auto-correction logic when a single-bit error is detected.

Read-modify-write operations can have a significant effect on memory interface performance; for best performance, you should minimize partial writes. The sketch below illustrates the partial-write condition.
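As an illustration of the partial-write rule above, the following minimal Verilog sketch flags a write burst as partial (and therefore subject to read-modify-write) when its byte-enable pattern is neither all ones nor all zeros. The module and signal names are hypothetical, and the byte-enable width is a placeholder that depends on your interface.

// Illustrative partial-write detector, in the spirit of the rules above.
module partial_write_check #(
  parameter BE_WIDTH = 8   // e.g., 64-bit data with one byte enable per byte
) (
  input  wire [BE_WIDTH-1:0] byteenable,   // one memory burst of byte enables
  output wire                partial_write // asserts when RMW is required
);
  // A burst is a full write only when every byte enable is 1, and a no-op
  // (or a dummy write issued by auto-correction) when every byte enable
  // is 0; anything in between is a partial write that costs a
  // read-modify-write cycle.
  assign partial_write = (byteenable != {BE_WIDTH{1'b1}}) &&
                         (byteenable != {BE_WIDTH{1'b0}});
endmodule

For example, with BE_WIDTH = 8, a byteenable of 8'hFF is a full write, 8'h0F is a partial write that triggers read-modify-write, and 8'h00 writes nothing.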
2.6.5.1 ECC Components

The ECC logic communicates with user logic via an Avalon-MM interface, and with the hard memory controller via an Avalon-ST interface.

ECC Encoder
The ECC encoder consists of a x64/x72 encoder IP core capable of single-bit error correction and double-bit error detection. The encoder takes a 64-bit input and converts it to a 72-bit output, where the 8 additional bits are the ECC code. The encoder supports any input data width less than 64 bits; any unused input data bits are set to zero.

ECC Decoder
The ECC decoder consists of a x72/x64 decoder IP core capable of double-bit error detection. The decoder takes a 72-bit input and converts it to a 64-bit output. The decoder also produces single-bit error and double-bit error information. The decoder controls the user read data valid signal; when read data is intended for a partial write, the user read data valid signal is deasserted, because the read data is meant for merging, not for the user.

Partial Write Data FIFO Buffer
The partial write data FIFO buffer is implemented in soft logic to store partial write data and byte enables. Data and byte enables are popped and merged with the returned read data. A partial write can occur in the following situations:
• At least one memory burst of byte enable is not all ones or all zeroes.
• In non-data-mask mode, when all memory bursts of byte enable are not all ones.
• A dummy write issued by the auto-correction logic, where all memory bursts of byte enable are all zeroes. (A dummy write is used when correcting memory content with a single-bit error.)

Merging Logic
The merging logic merges the returned partial read data with the write data, based on the byte enables popped from the FIFO buffer, and sends the result to the ECC encoder.

Pointer FIFO Buffer
The pointer FIFO buffer is implemented in soft logic to store write data pointers. The ECC logic refers to the pointers when sending write data to the data buffer control (DBC). The pointers serve to overwrite existing write data in the data buffer during a read-modify-write process.

Partial Logic
Partial logic decodes byte enable information and distinguishes between normal and partial writes.

Memory Mode Register Interface
The Memory Mode Register (MMR) interface is an Avalon-based interface through which core logic can access debug signals and sideband operation requests in the hard memory controller. The MMR logic routes ECC-related operations to an MMR register implemented in soft logic, and returns the ECC information via an Avalon-MM interface. The MMR logic tracks single-bit and double-bit error status, and provides the following information:
• Interrupt status.
• Single-bit error and double-bit error status.
• Single-bit error and double-bit error counts, to a maximum of 15. (If more than 15 errors occur, the count overflows.)
• Address of the last error.

2.6.5.2 ECC User Interface Controls

You can enable the ECC logic from the Configuration, Status, and Error Handling section of the Controller tab in the parameter editor. There are three user interface settings related to ECC:
• Enable Memory-Mapped Configuration and Status Register (MMR) Interface: Allows run-time configuration of the memory controller. You can enable this option together with ECC to retrieve error detection information from the ECC logic.
• Enable Error Detection and Correction Logic: Enables ECC logic for single-bit error correction and double-bit error detection.
• Enable Auto Error Correction: Allows the controller to automatically correct single-bit errors detected by the ECC logic.

2.7 Examples of External Memory Interface Implementations for DDR4

The following figures are examples of external memory interface implementations for different DDR4 memory widths. The figures show the locations of the address/command and data pins in relation to the locations of the memory controllers.

Figure 42. DDR4 1x8 Implementation Example (One I/O Bank)
Figure 43. DDR4 1x32 Implementation Example (Two I/O Banks)
Figure 44. DDR4 1x72 Implementation Example (Three I/O Banks)
Figure 45. DDR4 2x16 Implementation Example with Controllers in Non-Adjacent Banks (Three I/O Banks)
Figure 46. DDR4 2x16 Implementation Example with Controllers in Adjacent Banks (Three I/O Banks)
Figure 47. DDR4 1x144 Implementation Example (Six I/O Banks)

2.8 Stratix 10 EMIF Sequencer

The Stratix 10 EMIF sequencer is fully hardened in silicon, with executable code to handle protocols and topologies. Hardened RAM contains the calibration algorithm.

The Stratix 10 EMIF sequencer is responsible for the following operations:
• Initializes memory devices.
• Calibrates the external memory interface.
• Governs the hand-off of control to the memory controller.
• Handles recalibration requests and debug requests.
• Handles all supported protocols and configurations.

Figure 48. Stratix 10 EMIF Sequencer Operation

2.8.1 Stratix 10 EMIF DQS Tracking

DQS tracking is enabled for the QDR II/II+/II+ Xtreme, RLDRAM 3, and LPDDR3 protocols. DQS tracking is not available for the DDR3 and DDR4 protocols.

2.9 Stratix 10 EMIF Calibration

The calibration process compensates for skews and delays in the external memory interface.
The calibration process can compensate for the following effects:
• Timing and electrical constraints, such as setup/hold time and Vref variations.
• Circuit board and package factors, such as skew, fly-by effects, and manufacturing variations.
• Environmental uncertainties, such as variations in voltage and temperature.
• The demanding effects of small margins associated with high-speed operation.

Note: The calibration process is intended to maximize margins for robust EMIF operation; it cannot compensate for an inadequate PCB layout.

2.9.1 Stratix 10 Calibration Stages

At a high level, the calibration routine consists of address and command calibration, read calibration, and write calibration. The stages of calibration vary, depending on the protocol of the external memory interface.

Table 26. Calibration Stages by Protocol

Stage                         | DDR4 | DDR3 | LPDDR3 | RLDRAM 3 | QDR-IV | QDR II/II+
Address and command: Leveling | Yes  | Yes  | —      | —        | —      | —
Address and command: Deskew   | Yes  | —    | Yes    | —        | Yes    | —
Read: DQSen                   | Yes  | Yes  | Yes    | Yes      | Yes    | Yes
Read: Deskew                  | Yes  | Yes  | Yes    | Yes      | Yes    | Yes
Read: VREF-In                 | Yes  | —    | —      | —        | Yes    | —
Read: LFIFO                   | Yes  | Yes  | Yes    | Yes      | Yes    | Yes
Write: Leveling               | Yes  | Yes  | Yes    | Yes      | Yes    | —
Write: Deskew                 | Yes  | Yes  | Yes    | Yes      | Yes    | Yes
Write: VREF-Out               | Yes  | —    | —      | —        | —      | —

2.9.2 Stratix 10 Calibration Stages Descriptions

The various stages of calibration perform address and command calibration, read calibration, and write calibration.

Address and Command Calibration
The goal of address and command calibration is to delay address and command signals as necessary to optimize the address and command window. This stage is not available for all protocols, and cannot compensate for an inadequate board design.

Address and command calibration consists of the following parts:
• Leveling calibration— Centers the CS# signal and the entire address and command bus, relative to the CK clock. This operation is available for DDR3 and DDR4 interfaces only.
• Deskew calibration— Provides per-bit deskew for the address and command bus (except CS#), relative to the CK clock. This operation is available for DDR4 and QDR-IV interfaces only.

Read Calibration
Read calibration consists of the following parts:
• DQSen calibration— Calibrates the timing of the read capture clock gating and ungating, so that the PHY can gate and ungate the read clock at precisely the correct time—if too early or too late, data corruption can occur. The algorithm for this stage varies, depending on the memory protocol.
• Deskew calibration— Performs per-bit deskew of read data relative to the read strobe or clock.
• VREF-In calibration— Calibrates the VREF level at the FPGA.
• LFIFO calibration— Normalizes differences in read delays between groups due to fly-by, skews, and other variables and uncertainties.

Write Calibration
Write calibration consists of the following parts:
• Leveling calibration— Aligns the write strobe and clock to the memory clock, to compensate for skews, especially those associated with fly-by topology. The algorithm for this stage varies, depending on the memory protocol.
• Deskew calibration— Performs per-bit deskew of write data relative to the write strobe and clock.
• VREF-Out calibration— Calibrates the VREF level at the memory device.

2.9.3 Stratix 10 Calibration Algorithms

The calibration algorithms sometimes vary, depending on the targeted memory protocol.
Address and Command Calibration
Address and command calibration consists of the following parts:
• Leveling calibration— (DDR3 and DDR4 only) Toggles the CS# and CAS# signals to send read commands while keeping other address and command signals constant. The algorithm monitors for incoming DQS signals; if the DQS signal toggles, it indicates that the read commands have been accepted. The algorithm then repeats using different delay values, to find the optimal window.
• Deskew calibration— (DDR4, QDR-IV, and LPDDR3 only)
  — (DDR4) Uses the DDR4 address and command parity feature. The FPGA sends the address and command parity bit, and the DDR4 memory device responds with an alert signal if the parity bit is detected. The alert signal from the memory device tells the FPGA that the parity bit was received. Deskew calibration requires use of the PAR/ALERT# pins, so you should not omit these pins from your design. One limitation of deskew calibration is that it cannot deskew the ODT and CKE pins.
  — (QDR-IV) Uses the QDR-IV loopback mode. The FPGA sends address and command signals, and the memory device sends back the address and command signals which it captures, via the read data pins. The returned signals indicate to the FPGA what the memory device has captured. Deskew calibration can deskew all synchronous address and command signals.
  — (LPDDR3) Uses the LPDDR3 CA training mode. The FPGA sends signals onto the LPDDR3 CA bus, and the memory device sends back those signals that it captures, via the DQ pins. The returned signals indicate to the FPGA what the memory device has captured. Deskew calibration can deskew all signals on the CA bus. The remaining command signals (CS, CKE, and ODT) are calibrated based on the average of the deskewed CA bus.

Read Calibration
• DQSen calibration— (DDR3, DDR4, LPDDR3, RLDRAMx, and QDRx) DQSen calibration occurs before read deskew; therefore, only a single DQ bit is required to pass in order to achieve a successful read pass.
  — (DDR3, DDR4, and LPDDR3) The DQSen calibration algorithm searches for the DQS preamble using a hardware state machine. The algorithm sends many back-to-back reads with a one-clock-cycle gap between them. The hardware state machine searches for the DQS gap while sweeping DQSen delay values. The algorithm then increments the VFIFO value, and repeats the process until a pattern is found. The process is then repeated for all other read DQS groups.
  — (RLDRAMx and QDRx) The DQSen calibration algorithm does not use a hardware state machine; rather, it calibrates cycle-level delays using software, and subcycle delays using DQS tracking hardware. The algorithm requires good data in memory, and therefore relies on guaranteed writes. (Writing a burst of 0s to one location, and a burst of 1s to another; back-to-back reads from these two locations are used for read calibration.) The algorithm enables DQS tracking to calibrate the phase component of DQS enable, and then issues a guaranteed write, followed by back-to-back reads. The algorithm sweeps DQSen values cycle by cycle until the read operation succeeds. The process is then repeated for all other read groups.
• Deskew calibration— Read deskew calibration is performed before write leveling, and must be performed at least twice: once before write calibration, using simple data patterns from guaranteed writes, and again after write calibration, using complex data patterns. The deskew calibration algorithm performs a guaranteed write, and then sweeps dqs_in delay values from low to high, to find the right edge of the read window. The algorithm then sweeps dq_in delay values from low to high, to find the left edge of the read window. Updated dqs_in and dq_in delay values are then applied to center the read window. The algorithm then repeats the process for all data pins.
• Vref-In calibration— Read Vref-In calibration begins by programming Vref-In with an arbitrary value. The algorithm then sweeps the Vref-In value from the starting value to both ends, and measures the read window for each value. The algorithm selects the Vref-In value which provides the maximum read window.
• LFIFO calibration— Read LFIFO calibration normalizes read delays between groups. The PHY must present all data to the controller as a single data bus. The LFIFO latency should be large enough for the slowest read data group, and large enough to allow proper synchronization across FIFOs.

Write Calibration
• Leveling calibration— Write leveling calibration aligns the write strobe and clock to the memory clock, to compensate for skews. In general, leveling calibration tries a variety of delay values to determine the edges of the write window, and then selects an appropriate value to center the window. The details of the algorithm vary, depending on the memory protocol.
  — (DDRx, LPDDR3) Write leveling occurs before write deskew; therefore, only one successful DQ bit is required to register a pass. Write leveling staggers the DQ bus to ensure that at least one DQ bit falls within the valid write window.
  — (RLDRAMx) Optimizes for the CK versus DK relationship.
  — (QDR-IV) Optimizes for the CK versus DK relationship; this is covered by address and command deskew using the loopback mode.
  — (QDR II/II+/Xtreme) The K clock is the only clock; therefore, write leveling is not required.
• Deskew calibration— Performs per-bit deskew of write data relative to the write strobe and clock. Write deskew calibration does not change dqs_out delays; the write clock is aligned to the CK clock during write leveling.
• VREF-Out calibration— (DDR4) Calibrates the VREF level at the memory device. The VREF-Out calibration algorithm is similar to the VREF-In calibration algorithm.

2.9.4 Stratix 10 Calibration Flowchart

The following flowchart illustrates the Stratix 10 calibration flow.

Figure 49. Calibration Flowchart

2.10 Stratix 10 EMIF and SmartVID

Stratix 10 EMIF IP can be used with the SmartVID voltage management system, to achieve reduced power consumption. The SmartVID controller allows the FPGA to operate at a reduced Vcc, while maintaining performance. Because the SmartVID controller can adjust Vcc up or down in response to power requirements and temperature, it can have an impact on external memory interface performance.

When used with the SmartVID controller, the EMIF IP implements a handshake protocol to ensure that EMIF calibration does not begin until after voltage adjustment has completed.
In extended speed grade devices, voltage adjustment occurs once when the FPGA is powered up, and no further voltage adjustments occur. External memory calibration occurs after this initial voltage adjustment is completed. EMIF specifications are expected to be slightly lower in extended speed grade devices using SmartVID than in devices not using SmartVID.

In industrial speed grade devices, voltage adjustment occurs at power up, and may also occur during operation, in response to temperature changes. External memory interface calibration does not occur until after the initial voltage adjustment at power up. However, the external memory interface is not recalibrated in response to subsequent voltage adjustments that occur during operation. As a result, EMIF specifications for industrial speed grade devices using SmartVID are expected to be lower than for extended speed grade devices.

Using Stratix 10 EMIF IP with SmartVID

To employ Stratix 10 EMIF IP with SmartVID, follow these steps:
1. Ensure that the Quartus Prime project and Qsys system are configured to use VID components. This step exposes the vid_cal_done_persist interface on instantiated EMIF IP, which is required for communicating with the SmartVID controller.
2. Instantiate the SmartVID controller, using an I/O PLL IP core to drive the 125 MHz vid_clk and the 25 MHz jtag_core_clk inputs of the SmartVID controller.
Note: Do not connect the emif_usr_clk signal to either the vid_clk or jtag_core_clk inputs. Doing so would hold both the EMIF IP and the SmartVID controller in a perpetual reset condition.
3. Instantiate the Stratix 10 EMIF IP.
4. Connect the vid_cal_done_persist signal from the EMIF IP to the cal_done_persistent signal on the SmartVID controller. This connection enables handshaking between the EMIF IP and the SmartVID controller, which allows the EMIF IP to delay memory calibration until after voltage levels have stabilized.
Note: The EMIF vid_cal_done_persist interface becomes available only when a VID-enabled device is selected.

2.11 Stratix 10 Hard Memory Controller Rate Conversion Feature

The hard memory controller's rate conversion feature allows the hard memory controller and PHY to run at half rate, even though user logic is configured to run at quarter rate.

To facilitate timing closure, you may choose to clock your core user logic at quarter rate, resulting in easier timing closure at the expense of increased area and latency. To improve efficiency and help reduce overall latency, you can run the hard memory controller and PHY at half rate. The rate conversion feature converts traffic from the FPGA core to the hard memory controller from quarter rate to half rate, and traffic from the hard memory controller to the FPGA core from half rate to quarter rate. From the perspective of user logic inside the FPGA core, the effect is the same as if the hard memory controller were running at quarter rate.

The rate conversion feature is enabled automatically during IP generation whenever all of the following conditions are met:
• The hard memory controller is in use.
• User logic runs at quarter rate.
• The interface targets either an ES2 or production device.
• Running the hard memory controller at half rate does not exceed the fMax specification of the hard memory controller and hard PHY.
When the rate conversion feature is enabled, you should see the following info message displayed in the IP generation GUI: PHY and controller running at 2x the frequency of user logic for improved efficiency.

2.12 Differences Between User-Requested Reset in Stratix 10 versus Arria 10

The following table highlights differences between the user-requested reset mechanism in the Arria 10 EMIF IP and the Stratix 10 EMIF IP.

Table 27. User-Requested Reset: Arria 10 versus Stratix 10

Reset-related signals
Arria 10: global_reset_n
Stratix 10: local_reset_req, local_reset_done

When can user logic request a reset?
Arria 10: Any time after the FPGA enters user mode.
Stratix 10: local_reset_req has effect only when local_reset_done is high. After device power-on, the local_reset_done signal transitions high upon completion of the first calibration, whether the calibration is successful or not.

Is user-requested reset a requirement?
Arria 10: A user-requested reset is typically required to ensure the memory interface begins from a known state.
Stratix 10: A user-requested reset is optional. The IOSSM (which is part of the device's CNOC) automatically ensures that the memory interface begins from a known state as part of the device power-on sequence. A user-requested reset is necessary only if the user logic must explicitly reset a memory interface after the device power-on sequence.

When does a user-requested reset actually happen?
Arria 10: As soon as global_reset_n is driven low by user logic.
Stratix 10: A reset request is handled by the IOSSM. If the IOSSM receives a reset request from multiple interfaces within the same I/O column, it must serialize the reset sequence of the individual interfaces. You should avoid making assumptions on when the reset sequence will begin after a request is issued.

Timing requirement and triggering mechanism
Arria 10: global_reset_n is an asynchronous, active-low reset signal. Reset assertion and de-assertion is level-triggered.
Stratix 10: A reset request is sent by transitioning the local_reset_req signal from low to high, keeping the signal at the high state for a minimum of 2 EMIF core clock cycles, then transitioning the signal from high to low. local_reset_req is asynchronous in that there is no setup/hold timing to meet, but it must meet the minimum pulse width requirement of 2 EMIF core clock cycles.

How long can an external memory interface be kept in reset?
Arria 10: The interface is kept in reset for as long as global_reset_n is driven low.
Stratix 10: It is not possible to keep an external memory interface in reset indefinitely. Asserting local_reset_req high continuously has no effect, as a reset request is completed only by a full 0->1->0 pulse.

Delaying initial calibration
Arria 10: Initial calibration can be delayed for as long as desired by driving global_reset_n low immediately after FPGA power-up.
Stratix 10: Initial calibration cannot be skipped. The local_reset_done signal is driven high only after initial calibration has completed.

Reset scope (within an external memory interface)
Arria 10: All circuits involved in EMIF operations are reset.
Stratix 10: Only circuits that are required to restore the EMIF to its power-up state are reset. Excluded from the reset sequence are the IOSSM, the IOPLL(s), the DLL(s), and the CPA.

Reset scope (within an I/O column)
Arria 10: global_reset_n is a column-wide reset. It is not possible to reset a subset of the memory interfaces within an I/O column.
Stratix 10: local_reset_req is a per-interface reset.
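The following minimal sketch shows one way user logic might generate a pulse that meets the Stratix 10 requirement described above, assuming the requester is clocked by emif_usr_clk. The module name, the trigger signal, and the four-cycle hold time (comfortably above the two-cycle minimum) are illustrative; the local_reset_req and local_reset_done signals are as described in this section, and the full procedure is given in section 2.12.1 below.

// Hypothetical reset requester: waits until local_reset_done is high,
// then drives local_reset_req high for four core clock cycles before
// returning it low, completing the required 0->1->0 pulse.
module emif_reset_requester (
  input  wire clk,              // emif_usr_clk
  input  wire reset_trigger,    // single-cycle request from user logic
  input  wire local_reset_done, // from the EMIF IP
  output reg  local_reset_req   // to the EMIF IP
);
  reg [2:0] count;

  initial begin
    local_reset_req = 1'b0;
    count           = 3'd0;
  end

  always @(posedge clk) begin
    // Accept a new request only when the IP is ready (precondition) and
    // no pulse is already in progress.
    if (reset_trigger && local_reset_done && count == 3'd0)
      count <= 3'd4;            // hold the request high for 4 cycles
    else if (count != 3'd0)
      count <= count - 3'd1;

    // Registered output: high while the hold counter is running.
    local_reset_req <= (count != 3'd0);
  end
endmodule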
2.12.1 Method for Initiating a User-Requested Reset

Step 1 - Precondition
Before asserting local_reset_req, user logic must ensure that the local_reset_done signal is high. As part of the device power-on sequence, the local_reset_done signal automatically transitions to high upon the completion of the interface calibration sequence, regardless of whether calibration is successful or not.
Note: When targeting a group of interfaces that share the same core clocks, user logic must ensure that the local_reset_done signal of every interface is high.

Step 2 - Reset Request
After the precondition is satisfied, user logic can send a reset request by driving the local_reset_req signal from low to high and then low again (that is, by sending a pulse of 1).
• The 0-to-1 and 1-to-0 transitions need not happen in relation to any clock edges (that is, they can occur asynchronously); however, the pulse must meet a minimum pulse width of at least 2 EMIF core clock cycles. For example, if emif_usr_clk has a period of 4 ns, then the local_reset_req pulse must last at least 8 ns (that is, two emif_usr_clk periods).
• The reset request is considered complete only after the 1-to-0 transition. The EMIF IP does not initiate the reset sequence when local_reset_req is simply held high.
• Additional pulses to local_reset_req are ignored until the reset sequence is completed.

Optional - Detecting local_reset_done Deassertion and Assertion
If you want, you can monitor the status of the local_reset_done signal to explicitly detect the status of the reset sequence.
• After the EMIF IP receives a reset request, it deasserts the local_reset_done signal. After initial power-up calibration, local_reset_done is deasserted only in response to a user-requested reset. The reset sequence is imminent when local_reset_done has transitioned to low, although the exact timing depends on the current state of the I/O Subsystem Manager (IOSSM). As part of the EMIF reset sequence, the core reset signal (emif_usr_reset_n, afi_reset_n) is driven low. Do not use a register reset by the core reset signal to sample local_reset_done.
• After the reset sequence has completed, local_reset_done is driven high again. local_reset_done being driven high indicates the completion of the reset sequence and readiness to accept a new reset request; however, it does not imply that calibration was successful or that the hard memory controller is ready to accept requests. For these purposes, user logic must check signals such as afi_cal_success, afi_cal_fail, and amm_ready.

2.13 Compiling Stratix 10 EMIF IP with the Quartus Prime Software

2.13.1 Instantiating the Stratix 10 EMIF IP

Depending on your work flow, you may instantiate your IP with Qsys or with the IP Catalog.

Instantiating with Qsys
If you instantiate your IP as part of a system in Qsys, follow the Qsys documentation for information on instantiating the IP in a Quartus Prime project.

Instantiating with the IP Catalog
If you generated your IP with the IP Catalog, you must add the Quartus Prime IP file (.qip) to your Quartus Prime project. The .qip file identifies the names and locations of the files that compose the IP. After you add the .qip file to your project, you can instantiate the memory interface in the RTL.
2.13 Compiling Stratix 10 EMIF IP with the Quartus Prime Software

2.13.1 Instantiating the Stratix 10 EMIF IP

Depending on your work flow, you may instantiate your IP with Qsys or with the IP Catalog.

Instantiating with Qsys

If you instantiate your IP as part of a system in Qsys, follow the Qsys documentation for information on instantiating the IP in a Quartus Prime project.

Instantiating with the IP Catalog

If you generated your IP with the IP Catalog, you must add the Quartus Prime IP file (.qip) to your Quartus Prime project. The .qip file identifies the names and locations of the files that compose the IP. After you add the .qip file to your project, you can instantiate the memory interface in the RTL.

2.13.2 Setting I/O Assignments in Stratix 10 EMIF IP

The .qip file contains the I/O standard and I/O termination assignments that the memory interface pins require for proper operation. The assignment values are based on input that you provide during generation. Unlike earlier device families, Stratix 10 EMIF IP does not require you to run a <instance_name>_pin_assignments.tcl script to add the assignments to the Quartus Prime Settings File (.qsf). The Quartus Prime software reads and applies the assignments from the .qip file during every compilation, regardless of how you name the memory interface pins in the top-level design component. No new assignments are created in the project's .qsf file during compilation. Note that I/O assignments in the .qsf file must specify the names of your top-level pins as the target (-to), and you must not include the -entity or -library options. Consult the generated .qip file for the set of I/O assignments that are provided with the IP.

Changing I/O Assignments

Do not make changes to the generated .qip file, because any changes are overwritten and lost when you regenerate the IP. If you want to override an assignment made in the .qip file, add the desired assignment to the project's .qsf file. Assignments in the .qsf file always take precedence over assignments in the .qip file.
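For example, to override the I/O standard that the .qip file applies to a memory pin, you might add a line such as the following to the project's .qsf file. The pin name and I/O standard string are purely illustrative placeholders; use your own top-level pin name as the -to target, and note the absence of the -entity and -library options, per the rules above.

    # Illustrative .qsf override; takes precedence over the .qip assignment
    set_instance_assignment -name IO_STANDARD "SSTL-12" -to mem_dq[0]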
2.14 Debugging Stratix 10 EMIF IP

You can debug hardware failures by connecting to the EMIF Debug Toolkit or by exporting an Avalon-MM slave port, from which you can access information gathered during calibration. You can also connect to this port to mask ranks and to request recalibration. You can access the exported Avalon-MM port in two ways:

• Via the External Memory Interface Debug Toolkit
• Via On-Chip Debug (core logic on the FPGA)

2.14.1 External Memory Interface Debug Toolkit

The External Memory Interface Debug Toolkit provides access to data collected by the Nios II sequencer during memory calibration, and allows you to perform certain tasks. The External Memory Interface Debug Toolkit provides access to data including the following:

• General interface information, such as protocol and interface width
• Calibration results per group, including pass/fail status, failure stage, and delay settings

You can also perform the following tasks:

• Mask ranks from calibration (you might do this to skip specific ranks)
• Request recalibration of the interface

2.14.2 On-Chip Debug for Stratix 10

The On-Chip Debug feature allows user logic to access the same debug capabilities as the External Memory Interface Toolkit. You can use On-Chip Debug to monitor the calibration results of an external memory interface, without a connected computer. To use On-Chip Debug, you need a C header file, which is provided as part of the external memory interface IP. The C header file defines data structures that contain calibration data, and definitions of the commands that can be sent to the memory interface. The On-Chip Debug feature accesses these data structures through the Avalon-MM port that the EMIF IP exposes when you turn on the debugging features.

2.14.3 Configuring Your EMIF IP for Use with the Debug Toolkit

The Stratix 10 EMIF Debug Interface IP core contains the access point through which the EMIF Debug Toolkit reads calibration data collected by the Nios II sequencer.

Connecting an EMIF IP Core to a Stratix 10 EMIF Debug Interface

For the EMIF Debug Toolkit to access the calibration data for a Stratix 10 EMIF IP core, you must connect one of the EMIF cores in each I/O column to a Stratix 10 EMIF Debug Interface IP core. Subsequent EMIF IP cores in the same column must connect in a daisy chain to the first. There are two ways to add the Stratix 10 EMIF Debug Interface IP core to your design:

• When you generate your EMIF IP core, on the Diagnostics tab, select Add EMIF Debug Interface for the EMIF Debug Toolkit/On-Chip Debug Port; you do not have to separately instantiate a Stratix 10 EMIF Debug Interface core. This method does not export an Avalon-MM slave port. Use this method if you require only EMIF Debug Toolkit access to this I/O column; that is, if you do not require On-Chip Debug Port access or PHYLite reconfiguration access.
• When you generate your EMIF IP core, on the Diagnostics tab, select Export for the EMIF Debug Toolkit/On-Chip Debug Port. Then, separately instantiate a Stratix 10 EMIF Debug Interface core and connect its to_iossm interface to the cal_debug interface on the EMIF IP core. This method is appropriate if you also want On-Chip Debug Port access or PHYLite reconfiguration access to this I/O column.

For each of the above methods, you must assign a unique interface ID to each external memory interface in the I/O column, to identify that interface in the Debug Toolkit. You can assign an interface ID using the dropdown list that appears when you enable the Debug Toolkit/On-Chip Debug Port option.

Daisy-Chaining Additional EMIF IP Cores for Debugging

After you have connected a Stratix 10 EMIF Debug Interface to one of the EMIF IP cores in an I/O column, you must connect subsequent EMIF IP cores in that column in a daisy-chain manner. If you do not require debug capabilities for a particular EMIF IP core, you do not have to connect that core to the daisy chain. To create a daisy chain of EMIF IP cores, follow these steps:

1. On the first EMIF IP core, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port to create an Avalon-MM interface called cal_debug_out.
2. On the second EMIF IP core, select Export as the EMIF Debug Toolkit/On-Chip Debug Port mode, to export an Avalon-MM interface called cal_debug.
3. Connect the cal_debug_out interface of the first EMIF core to the cal_debug interface of the second EMIF core.
4. To connect more EMIF cores to the daisy chain, select the Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port option on the second core, and connect it to the next core using the Export option as described above. Repeat the process for subsequent EMIF cores.

If you place any PHYLite cores with dynamic reconfiguration enabled into the same I/O column as an EMIF IP core, you should instantiate and connect the PHYLite cores in a similar way. See the Altera PHYLite for Memory Megafunction User Guide for more information.

Related Links
Altera PHYLite for Parallel Interfaces IP Core User Guide

2.14.4 Stratix 10 EMIF Debugging Examples

The following sections provide examples of debugging a single external memory interface, and of adding additional EMIF instances to an I/O column.

Debugging a Single External Memory Interface

1. Under EMIF Debug Toolkit/On-Chip Debug Port, select Add EMIF Debug Interface.
(If you want to use the On-Chip Debug Port instead of the EMIF Debug Toolkit, select Export instead.)

Figure 50. EMIF With Debug Interface Added (No Additional Ports)

2. If you want to connect additional EMIF or PHYLite components in this I/O column, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port.

Figure 51. EMIF With cal_debug Avalon Master Exported

Adding Additional EMIF Instances to an I/O Column

1. Under EMIF Debug Toolkit/On-Chip Debug Port, select Export.

Figure 52. EMIF With cal_debug Avalon Slave Exported

2. Specify a unique interface ID for this EMIF instance.

3. If you want to connect additional EMIF or PHYLite components in this I/O column, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port.

Figure 53. EMIF With Both cal_debug Master and Slave Exported

4. Connect the cal_debug Avalon Master, clock, and reset interfaces of the previous component to the cal_debug Avalon Slave, clock, and reset interfaces of this component.

Figure 54. EMIF Components Connected

2.15 Stratix 10 EMIF for Hard Processor Subsystem

The Stratix 10 EMIF IP can enable the Stratix 10 Hard Processor Subsystem (HPS) to access external DRAM memory devices.
To enable connectivity between the Stratix 10 HPS and the Stratix 10 EMIF IP, you must create and configure an instance of the Stratix 10 External Memory Interface for HPS IP core, and use Qsys to connect it to the Stratix 10 Hard Processor Subsystem instance in your system.

Supported Modes

The Stratix 10 Hard Processor Subsystem supports the following external memory configurations:

• Protocol: DDR3, DDR4, LPDDR3
• Maximum memory clock frequency: DDR3: 1.067 GHz; DDR4: 1.333 GHz; LPDDR3: 800 MHz
• Configuration: Hard PHY with hard memory controller
• Clock rate of PHY and hard memory controller: Half-rate
• Data width (without ECC): 16-bit, 32-bit, 64-bit
• Data width (with ECC): 24-bit, 40-bit, 72-bit
• DQ width per group: x8
• Maximum number of I/O lanes for address/command: 3
• Memory format: Discrete, UDIMM, SODIMM, RDIMM
• Ranks / CS# width: Up to 2

2.15.1 Restrictions on I/O Bank Usage for Stratix 10 EMIF IP with HPS

You can use only certain Stratix 10 I/O banks to implement Stratix 10 EMIF IP with the Stratix 10 Hard Processor Subsystem. The restrictions on I/O bank usage result from the Stratix 10 HPS having hard-wired connections to the EMIF circuits in the I/O banks closest to the HPS. For any given EMIF configuration, the pin-out of the EMIF-to-HPS interface is fixed. The following diagram illustrates the use of I/O banks and lanes for the various EMIF-HPS data widths:

Figure 55. Stratix 10 External Memory Interfaces I/O Bank and Lanes Usage (panels: 16 bit no ECC, 16 bit with ECC, 32 bit no ECC, 32 bit with ECC, 64 bit no ECC, 64 bit with ECC)

The HPS EMIF uses the closest located external memory interface I/O banks to connect to SDRAM. These banks include:

• Bank 2N: used for data I/Os (data bits 31:0)
• Bank 2M: used for address, command, and ECC data I/Os
• Bank 2L: used for data I/Os (data bits 63:32)

If no HPS EMIF is used in a system, the entire HPS EMIF banks can be used as FPGA GPIO. If there is an HPS EMIF in a system, the unused HPS EMIF pins can be used as FPGA general I/O with the following restrictions:

• Bank 2M:
  - Lane 3 is used for SDRAM ECC data. Unused pins in lane 3 can be used as FPGA inputs only.
  - Lanes 2, 1, and 0 are used for SDRAM address and command. Unused pins in these lanes can be used as FPGA inputs or outputs.
• Bank 2N and Bank 2L:
  - Lanes 3, 2, 1, and 0 are used for data bits.
  - With 64-bit data widths, unused pins in these banks can be used as FPGA inputs only.
  - With 32-bit data widths, unused pins in Bank 2N can be used as FPGA inputs only. Unused pins in Bank 2L can be used as FPGA inputs or outputs.
  - With 16-bit data widths, Quartus Prime assigns lane 0 and lane 1 as data lanes in Bank 2N. Unused pins in lane 0 and lane 1 can be used as FPGA inputs only. The other two lanes are available for use as FPGA inputs or outputs.
By default, the Stratix 10 External Memory Interface for HPS IP core, together with the Quartus Prime Fitter, automatically implements the correct pin-out for HPS EMIF without requiring additional constraints from you. If you must modify the default pin-out for any reason, you must adhere to the following requirements, which are specific to HPS EMIF:

1. Within a single data lane (which implements a single x8 DQS group):
   • DQ pins must use pins at indices 1, 2, 3, 6, 7, 8, 9, 10. You may swap the locations between the DQ bits (that is, you may swap the locations of DQ[0] and DQ[3]) so long as the resulting pin-out uses pins at these indices only.
   • The DM/DBI pin must use the pin at index 11. There is no flexibility.
   • DQS/DQS# must use the pins at indices 4 and 5. There is no flexibility.
2. Assignment of data lanes must be as illustrated in the above figure. You may swap the locations of entire byte lanes (that is, you may swap the locations of byte 0 and byte 3) so long as the resulting pin-out uses only the lanes permitted by your HPS EMIF configuration, as shown in the above figure.
3. You must not change the placement of the address and command pins from the default.
4. You may place the alert# pin at any available pin location in either a data lane or an address and command lane.

To override the default generated pin assignments, comment out the relevant HPS_LOCATION assignments in the .qip file, and add your own location assignments (using set_location_assignment) in the .qsf file.
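For example, an override might look like the following. The package location PIN_AG14 and the pin name mem_dq[0] are purely illustrative placeholders; the actual HPS_LOCATION assignments to comment out are in your generated .qip file, and the replacement location must still satisfy the index and lane requirements listed above.

    # In the generated .qip file, comment out the corresponding HPS_LOCATION line,
    # then add a replacement location assignment to the project's .qsf file:
    set_location_assignment PIN_AG14 -to mem_dq[0]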
2.16 Stratix 10 EMIF Ping Pong PHY

Ping Pong PHY allows two memory interfaces to share the address and command bus through time multiplexing. Compared to having two independent interfaces that allocate address and command lanes separately, Ping Pong PHY achieves the same throughput with fewer resources, by sharing the address and command lanes. In Stratix 10 EMIF, Ping Pong PHY supports both half-rate and quarter-rate interfaces for DDR3, and quarter-rate interfaces for DDR4.

2.16.1 Stratix 10 Ping Pong PHY Feature Description

Conventionally, the address and command buses of a DDR3 or DDR4 half-rate or quarter-rate interface use 2T time, meaning that commands are issued for two full-rate clock cycles, as illustrated below.

Figure 56. 2T Command Timing

With the Ping Pong PHY, address and command signals from two independent controllers are multiplexed onto shared buses by delaying one of the controller outputs by one full-rate clock cycle. The result is 1T timing, with a new command being issued on each full-rate clock cycle. The following figure shows address and command timing for the Ping Pong PHY. The command signals CS, ODT, and CKE have two signals each (one for ping and one for pong); the other address and command signals are shared.

Figure 57. 1T Command Timing Use by Ping Pong PHY

2.16.2 Stratix 10 Ping Pong PHY Architecture

In Stratix 10 EMIF, the Ping Pong PHY feature can be enabled only with the hard memory controller, where two hard memory controllers are instantiated: one for the primary interface and one for the secondary interface. The hard memory controller I/O bank of the primary interface is used for address and command and is always adjacent to, and above, the hard memory controller bank of the secondary interface. All four lanes of the primary hard memory controller bank are used for address and command.

The following example shows a 2x16 Ping Pong PHY bank-lane configuration. The upper bank (I/O bank N) is the address and command bank, which serves both the primary and secondary interfaces. The primary hard memory controller is linked to the secondary interface by the Ping Pong bus. The lower bank (I/O bank N-1) is the secondary interface bank, which carries the data buses for both the primary and secondary interfaces. In the 2x16 case, a total of four I/O lanes are required for data, hence two banks in total are sufficient for the implementation. The data for the primary interface is routed down to the top two lanes of the secondary I/O bank, and the data for the secondary interface is routed to the bottom two lanes of the secondary I/O bank.

Figure 58. 2x16 Ping Pong PHY I/O Bank-Lane Configuration

A 2x32 interface can be implemented similarly, with the additional data lanes placed above and below the primary and secondary I/O banks, such that primary data lanes are placed above the primary bank and secondary data lanes are placed below the secondary bank.

Figure 59. 2x32 Ping Pong PHY I/O Bank-Lane Configuration

2.16.3 Stratix 10 Ping Pong PHY Limitations

Ping Pong PHY supports up to two ranks per memory interface. In addition, the maximum data width is x72, which is half the maximum width of x144 for a single interface. Ping Pong PHY uses all lanes of the address and command I/O bank for address and command. For information on the pin allocations of the DDR3 and DDR4 address and command I/O bank, refer to DDR3 Scheme 1 and DDR4 Scheme 3, in External Memory Interface Pin Information for Stratix 10 Devices, on www.altera.com.

An additional limitation is that I/O lanes may be left unused when you instantiate multiple pairs of Ping Pong PHY interfaces. The following diagram shows two pairs of x8 Ping Pong controllers (a total of four interfaces). Lanes highlighted in yellow are not driven by any memory interface (unused lanes and pins can still serve as general-purpose I/Os).
Even with some I/O lanes left unused, the Ping Pong PHY approach still saves resources compared to independent interfaces. Memory widths of 24 bits and 40 bits have a similar situation, while 16-bit, 32-bit, and 64-bit memory widths do not suffer this limitation.

Figure 60. Two Pairs of x8 Ping Pong PHY Controllers

2.16.4 Stratix 10 Ping Pong PHY Calibration

A Ping Pong PHY interface is calibrated as a regular interface of double width. Calibration of a Ping Pong PHY interface incorporates two sequencers: one on the primary hard memory controller I/O bank, and one on the secondary hard memory controller I/O bank. To ensure that the two sequencers issue instructions on the same memory clock cycle, the Nios II processor configures the sequencer on the primary hard memory controller to receive a token from the secondary interface, ignoring any commands from the Avalon bus. Additional delays are programmed on the secondary interface to allow for the passing of the token from the sequencer on the secondary hard memory controller tile to the sequencer on the primary hard memory controller tile. During calibration, the Nios II processor assumes that commands are always issued from the sequencer on the primary hard memory controller I/O bank. After calibration, the Nios II processor adjusts the delays for use with the primary and secondary hard memory controllers.

2.16.5 Using the Ping Pong PHY

The following steps describe how to use the Ping Pong PHY for Stratix 10 EMIF:

1. Configure a single memory interface according to your requirements.
2. Select Instantiate two controllers sharing a Ping Pong PHY on the General tab in the parameter editor. The Quartus Prime software replicates the interface, resulting in two memory controllers and a shared PHY. The system configures the I/O bank-lane structure, without further input from you.

2.16.6 Ping Pong PHY Simulation Example Design

The following figure illustrates a top-level block diagram of a generated Ping Pong PHY simulation example design, using two I/O banks. Functionally, the IP interfaces with user traffic separately, as it would with two independent memory interfaces. You can also generate synthesizable example designs, where the external memory interface IP interfaces with a traffic generator.

Figure 61. Ping Pong PHY Simulation Example Design
2.17 AFI 4.0 Specification

The Altera PHY interface (AFI) 4.0 defines communication between the controller and the physical layer (PHY) in the external memory interface IP. The AFI is a single-data-rate interface, meaning that data is transferred on the rising edge of each clock cycle. Most memory interfaces, however, operate at double data rate, transferring data on both the rising and falling edges of the clock signal. If the AFI interface is to directly control a double-data-rate signal, two single-data-rate bits must be transmitted on each clock cycle; the PHY then sends out one bit on the rising edge of the clock and one bit on the falling edge. The AFI convention is to send the low part of the data first and the high part second, as shown in the following figure.

Figure 62. Single Versus Double Data Rate Transfer

2.17.1 Bus Width and AFI Ratio

In cases where the AFI clock frequency is one-half or one-quarter of the memory clock frequency, the AFI data must be twice or four times as wide, respectively, as the corresponding memory data. The ratio between the AFI clock and memory clock frequencies is referred to as the AFI ratio. (A half-rate AFI interface has an AFI ratio of 2, while a quarter-rate interface has an AFI ratio of 4.) In general, the width of an AFI signal depends on the following three factors:

• The size of the equivalent signal on the memory interface. For example, if a[15:0] is a DDR3 address input and the AFI clock runs at the same speed as the memory interface, the equivalent afi_addr bus is 16 bits wide.
• The data rate of the equivalent signal on the memory interface. For example, if d[7:0] is a double-data-rate QDR II input data bus and the AFI clock runs at the same speed as the memory interface, the equivalent afi_write_data bus is 16 bits wide.
• The AFI ratio. For example, if cs_n is a single-bit DDR3 chip select input and the AFI clock runs at half the speed of the memory interface, the equivalent afi_cs_n bus is 2 bits wide.

The following formula summarizes the three factors described above:

AFI_width = memory_width * signal_rate * AFI_RATE_RATIO

Note: The above formula is a general rule, but not all signals obey it. For definite signal-size information, refer to the specific table.
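As an illustration of the formula, the following Verilog fragment computes two of the derived widths for a hypothetical quarter-rate DDR4 x32 interface. The parameter values are illustrative assumptions only; the derivation equations match Table 30 below.

    localparam AFI_RATE_RATIO    = 4;   // quarter-rate interface
    localparam DATA_RATE_RATIO   = 2;   // DDR data bus
    localparam ADDR_RATE_RATIO   = 1;   // SDR address/command bus
    localparam MEM_IF_ADDR_WIDTH = 17;  // example address width
    localparam MEM_IF_DQ_WIDTH   = 32;  // example data width

    // Derived AFI widths (see Table 30):
    localparam AFI_ADDR_WIDTH = MEM_IF_ADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO; // = 68
    localparam AFI_DQ_WIDTH   = MEM_IF_DQ_WIDTH   * AFI_RATE_RATIO * DATA_RATE_RATIO; // = 256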
2.17.2 AFI Parameters

The following tables list Altera PHY interface (AFI) parameters for AFI 4.0. The parameters described in the following tables affect the width of AFI signal buses. Parameters prefixed by MEM_IF_ refer to the signal size at the interface between the PHY and the memory device.

Table 28. Ratio Parameters

• AFI_RATE_RATIO: The ratio between the AFI clock frequency and the memory clock frequency. For full-rate interfaces this value is 1, for half-rate interfaces the value is 2, and for quarter-rate interfaces the value is 4.
• DATA_RATE_RATIO: The number of data bits transmitted per clock cycle. For single-data-rate protocols this value is 1, and for double-data-rate protocols this value is 2.
• ADDR_RATE_RATIO: The number of address bits transmitted per clock cycle. For single-data-rate address protocols this value is 1, and for double-data-rate address protocols this value is 2.

Table 29. Memory Interface Parameters

• MEM_IF_ADDR_WIDTH: The width of the address bus on the memory device(s).
• MEM_IF_BANKADDR_WIDTH: The width of the bank address bus on the interface to the memory device(s). Typically the log2 of the number of banks.
• MEM_IF_CS_WIDTH: The number of chip selects on the interface to the memory device(s).
• MEM_IF_WRITE_DQS_WIDTH: The number of DQS (or write clock) signals on the write interface; for example, the number of DQS groups.
• MEM_IF_CLK_PAIR_COUNT: The number of CK/CK# pairs.
• MEM_IF_DQ_WIDTH: The number of DQ signals on the interface to the memory device(s). For single-ended interfaces such as QDR II, this value is the number of D or Q signals.
• MEM_IF_DM_WIDTH: The number of data mask pins on the interface to the memory device(s).
• MEM_IF_READ_DQS_WIDTH: The number of DQS signals on the read interface; for example, the number of DQS groups.

Table 30. Derived AFI Parameters

• AFI_ADDR_WIDTH = MEM_IF_ADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
• AFI_BANKADDR_WIDTH = MEM_IF_BANKADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
• AFI_CONTROL_WIDTH = AFI_RATE_RATIO * ADDR_RATE_RATIO
• AFI_CS_WIDTH = MEM_IF_CS_WIDTH * AFI_RATE_RATIO
• AFI_DM_WIDTH = MEM_IF_DM_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO
• AFI_DQ_WIDTH = MEM_IF_DQ_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO
• AFI_WRITE_DQS_WIDTH = MEM_IF_WRITE_DQS_WIDTH * AFI_RATE_RATIO
• AFI_LAT_WIDTH = 6
• AFI_RLAT_WIDTH = AFI_LAT_WIDTH
• AFI_WLAT_WIDTH = AFI_LAT_WIDTH * MEM_IF_WRITE_DQS_WIDTH
• AFI_CLK_PAIR_COUNT = MEM_IF_CLK_PAIR_COUNT
• AFI_WRANK_WIDTH = number of ranks * MEM_IF_WRITE_DQS_WIDTH * AFI_RATE_RATIO
• AFI_RRANK_WIDTH = number of ranks * MEM_IF_READ_DQS_WIDTH * AFI_RATE_RATIO

2.17.3 AFI Signals

The following tables list Altera PHY interface (AFI) signals grouped according to their functions. In each table, the direction denotes the direction of the signal relative to the PHY. For example, a signal defined as an output passes out of the PHY to the controller. The AFI specification does not include any bidirectional signals. Not all signals are used for all protocols.

2.17.3.1 AFI Clock and Reset Signals

The AFI interface provides up to two clock signals and an asynchronous reset signal.

Table 31. Clock and Reset Signals

• afi_clk (Output, width 1): Clock with which all data exchanged on the AFI bus is synchronized. In general, this clock is referred to as full-rate, half-rate, or quarter-rate, depending on the ratio between the frequency of this clock and the frequency of the memory device clock.
• afi_half_clk (Output, width 1): Clock signal that runs at half the speed of afi_clk. The controller uses this signal when the half-rate bridge feature is in use. This signal is optional.
• afi_reset_n (Output, width 1): Asynchronous reset output signal. You must synchronize this signal to the clock domain in which you use it.

2.17.3.2 AFI Address and Command Signals

The address and command signals for AFI 4.0 encode read/write/configuration commands to send to the memory device. The address and command signals are single-data-rate signals.

Table 32. Address and Command Signals

• afi_addr (Input, AFI_ADDR_WIDTH): Address bus, or CA bus (LPDDR3 only). ADDR_RATE_RATIO is 2 for the LPDDR3 CA bus.
• afi_bg (Input, AFI_BANKGROUP_WIDTH): Bank group (DDR4 only).
• afi_ba (Input, AFI_BANKADDR_WIDTH): Bank address. (Not applicable for LPDDR3.)
• afi_cke (Input, AFI_CLK_EN_WIDTH): Clock enable.
• afi_cs_n (Input, AFI_CS_WIDTH): Chip select signal. (The number of chip selects may not match the number of ranks; for example, RDIMMs and LRDIMMs require a minimum of 2 chip select signals for both single-rank and dual-rank configurations. Consult your memory device data sheet for information about chip select signal width. Matches the number of ranks for LPDDR3.)
• afi_ras_n (Input, AFI_CONTROL_WIDTH): RAS# (for DDR2 and DDR3 memory devices).
• afi_we_n (Input, AFI_CONTROL_WIDTH): WE# (for DDR2, DDR3, and RLDRAM II memory devices).
• afi_rw_n (Input, AFI_CONTROL_WIDTH * 2): RWA/B# (QDR-IV).
• afi_cas_n (Input, AFI_CONTROL_WIDTH): CAS# (for DDR2 and DDR3 memory devices).
• afi_act_n (Input, AFI_CONTROL_WIDTH): ACT# (DDR4).
• afi_ref_n (Input, AFI_CONTROL_WIDTH): REF# (for RLDRAM II memory devices).
• afi_rst_n (Input, AFI_CONTROL_WIDTH): RESET# (for DDR3 and DDR4 memory devices).
• afi_odt (Input, AFI_CLK_EN_WIDTH): On-die termination signal for DDR2, DDR3, and LPDDR3 memory devices. (Do not confuse this memory device signal with the FPGA's internal on-chip termination signal.)
• afi_par (Input, AFI_CS_WIDTH): Address and command parity input (DDR4). Address parity input (QDR-IV).
• afi_ainv (Input, AFI_CONTROL_WIDTH): Address inversion (QDR-IV).
• afi_mem_clk_disable (Input, AFI_CLK_PAIR_COUNT): When this signal is asserted, mem_clk and mem_clk_n are disabled. This signal is used in low-power mode.
• afi_wps_n (Output, AFI_CS_WIDTH): WPS (for QDR II/II+ memory devices).
• afi_rps_n (Output, AFI_CS_WIDTH): RPS (for QDR II/II+ memory devices).

2.17.3.3 AFI Write Data Signals

Write data signals for AFI 4.0 control the data, data mask, and strobe signals passed to the memory device during write operations.

Table 33. Write Data Signals

• afi_dqs_burst (Input, AFI_RATE_RATIO): Controls the enable on the strobe (DQS) pins for DDR2, DDR3, LPDDR2, and LPDDR3 memory devices. When this signal is asserted, mem_dqs and mem_dqsn are driven. This signal must be asserted before afi_wdata_valid to implement the write preamble, and must be driven for the correct duration to generate a correctly timed mem_dqs signal.
• afi_wdata_valid (Input, AFI_RATE_RATIO): Write data valid signal. This signal controls the output enable on the data and data mask pins.
• afi_wdata (Input, AFI_DQ_WIDTH): Write data signal to send to the memory device at double data rate. This signal controls the PHY's mem_dq output.
• afi_dm (Input, AFI_DM_WIDTH): Data mask. This signal controls the PHY's mem_dm signal for DDR2, DDR3, LPDDR2, LPDDR3, and RLDRAM II memory devices. It also directly controls the PHY's mem_dbi signal for DDR4. The mem_dm and mem_dbi features share the same port on the memory device.
• afi_bws_n (Input, AFI_DM_WIDTH): Data mask. This signal controls the PHY's mem_bws_n signal for QDR II/II+ memory devices.
• afi_dinv (Input, AFI_WRITE_DQS_WIDTH * 2): Data inversion. It directly controls the PHY's mem_dinva/b signal for QDR-IV devices.

2.17.3.4 AFI Read Data Signals

Read data signals for AFI 4.0 control the data sent from the memory device during read operations.
Table 34. Read Data Signals

• afi_rdata_en_full (Input, AFI_RATE_RATIO): Read data enable full. Indicates that the memory controller is currently performing a read operation. This signal is held high for the entire read burst. If this signal is aligned to even clock cycles, it is possible to use 1 bit even in half-rate mode (that is, AFI_RATE=2).
• afi_rdata (Output, AFI_DQ_WIDTH): Read data from the memory device. This data is considered valid only when afi_rdata_valid is asserted by the PHY.
• afi_rdata_valid (Output, AFI_RATE_RATIO): Read data valid. When asserted, this signal indicates that the afi_rdata bus is valid. If this signal is aligned to even clock cycles, it is possible to use 1 bit even in half-rate mode (that is, AFI_RATE=2).

2.17.3.5 AFI Calibration Status Signals

The PHY instantiates a sequencer which calibrates the memory interface with the memory device and some internal components such as read FIFOs and valid FIFOs. The sequencer reports the results of the calibration process to the controller through the calibration status signals in the AFI interface.

Table 35. Calibration Status Signals

• afi_cal_success (Output, width 1): Asserted to indicate that calibration has completed successfully.
• afi_cal_fail (Output, width 1): Asserted to indicate that calibration has failed.
• afi_cal_req (Input, width 1): Effectively a synchronous reset for the sequencer. When this signal is asserted, the sequencer returns to the reset state; when this signal is released, a new calibration sequence begins.
• afi_wlat (Output, AFI_WLAT_WIDTH): The required write latency, in afi_clk cycles, between address/command and write data being issued at the PHY/controller interface. The afi_wlat value can be different for different groups; each group's write latency can range from 0 to 63. If the write latency is the same for all groups, only the lowest 6 bits are required.
• afi_rlat (Output, AFI_RLAT_WIDTH): The required read latency, in afi_clk cycles, between address/command and read data being returned to the PHY/controller interface. Values can range from 0 to 63. Note that the afi_rlat signal is not supported for PHY-only designs; instead, you can sample the afi_rdata_valid signal to determine when valid read data is available.

2.17.3.6 AFI Tracking Management Signals

When tracking management is enabled, the sequencer can take control of the AFI 4.0 interface at given intervals, and issue commands to the memory device to track the alignment of the internal DQS enable signal to the DQS signal returning from the memory device. The tracking management portion of the AFI 4.0 interface provides a means for the sequencer and the controller to exchange handshake signals.

Table 36. Tracking Management Signals

• afi_ctl_refresh_done (Input, width 4): Handshaking signal from the controller to the tracking manager, indicating that a refresh has occurred, and awaiting a response.
• afi_seq_busy (Output, width 4): Handshaking signal from the sequencer to the controller, indicating when DQS tracking is in progress.
• afi_ctl_long_idle (Input, width 4): Handshaking signal from the controller to the tracking manager, indicating that it has exited a low-power state without a periodic refresh, and awaiting a response.
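The table above defines the handshake signals but not a full protocol. As an illustration only, one plausible sequence, inferred from the signal descriptions above (the hold-off behavior shown is an assumption; the exact sequencing is governed by the sequencer), is:

    // 1. The controller completes a periodic refresh, then asserts
    //    afi_ctl_refresh_done[i] and waits for a response.
    // 2. The sequencer takes control of the AFI interface to run DQS
    //    tracking and asserts afi_seq_busy[i].
    // 3. The controller issues no commands while afi_seq_busy[i] is high.
    // 4. The sequencer deasserts afi_seq_busy[i]; the controller deasserts
    //    afi_ctl_refresh_done[i] and resumes normal traffic.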
2.17.3.7 AFI Shadow Register Management Signals

Shadow registers are a feature that enables high-speed multi-rank support. Shadow registers allow the sequencer to calibrate each rank separately, and to save the calibrated settings of each rank, such as deskew delay-chain configurations, in its own set of shadow registers. During a rank-to-rank switch, the correct set of calibrated settings is restored just in time to optimize the data valid window. The PHY relies on additional AFI signals to control which set of shadow registers to activate.

Table 37. Shadow Register Management Signals

• afi_wrank (Input, AFI_WRANK_WIDTH): Signal from the controller specifying which rank the write data is going to. The signal timing is identical to that of afi_dqs_burst; that is, afi_wrank must be asserted at the same time, and for the same duration, as the afi_dqs_burst signal.
• afi_rrank (Output, AFI_RRANK_WIDTH): Signal from the controller specifying which rank is being read. The signal must be asserted at the same time as the afi_rdata_en signal when issuing a read command, but unlike afi_rdata_en, afi_rrank is stateful; that is, once asserted, the signal value must remain unchanged until the controller issues a new read command to a different rank.

Both the afi_wrank and afi_rrank signals encode the rank being accessed using a one-hot scheme (for example, in a quad-rank interface, 0001, 0010, 0100, and 1000 refer to the 1st, 2nd, 3rd, and 4th rank, respectively). The ordering within the bus is the same as for other AFI signals; specifically, the bus is ordered by time slots, for example:

Half-rate: afi_w/rrank = {T1, T0}
Quarter-rate: afi_w/rrank = {T3, T2, T1, T0}

where Tx is a word of rank bits that one-hot encodes the rank being accessed at the xth full-rate cycle.

Additional Requirements for Stratix 10 Shadow Register Support

To ensure that the hardware has enough time to switch from one shadow register to another, the controller must satisfy the following minimum rank-to-rank-switch delays (tRTRS):

• Two read commands going to different ranks must be separated by a minimum of 3 full-rate cycles (in addition to the burst length delay needed to avoid collision of data bursts).
• Two write commands going to different ranks must be separated by a minimum of 4 full-rate cycles (in addition to the burst length delay needed to avoid collision of data bursts).

The Stratix 10 device family supports a maximum of 4 sets of shadow registers, each for an independent set of timings. More than 4 ranks are supported if those ranks have four or fewer sets of independent timing. For example, the rank multiplication mode of an LRDIMM allows more than one physical rank to share a set of timing data as a single logical rank. Therefore, Stratix 10 devices can support up to 4 logical ranks, even if that means more than 4 physical ranks.
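As a concrete illustration, consider a hypothetical quad-rank, quarter-rate configuration with one write DQS group, so that AFI_WRANK_WIDTH = 4 ranks * 1 group * 4 (AFI ratio) = 16 bits. Driving every time slot of one PHY clock cycle at the third rank looks like this in Verilog:

    // Each 4-bit time slot {T3, T2, T1, T0} is one-hot; 4'b0100 selects the
    // third rank. All four full-rate slots access the same rank here.
    assign afi_wrank = {4'b0100, 4'b0100, 4'b0100, 4'b0100};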
2.17.4 AFI 4.0 Timing Diagrams

2.17.4.1 AFI Address and Command Timing Diagrams

Depending on the ratio between the memory clock and the PHY clock, different numbers of bits must be provided per PHY clock on the AFI interface. The following figures illustrate the AFI address/command waveforms in full, half, and quarter rate, respectively. The waveforms show how the AFI command phase corresponds to the memory command output. AFI command 0 corresponds to the first memory command slot, AFI command 1 corresponds to the second memory command slot, and so on.

Figure 63. AFI Address and Command Full-Rate

Figure 64. AFI Address and Command Half-Rate

Figure 65. AFI Address and Command Quarter-Rate

2.17.4.2 AFI Write Sequence Timing Diagrams

The following timing diagrams illustrate the relationships between the write command and the corresponding write data and write enable signals, in full, half, and quarter rate. For half rate and quarter rate, when the write command is sent on the first memory clock in a PHY clock (for example, afi_cs_n[0] = 0), that access is called an aligned access; otherwise it is called an unaligned access. You may use either aligned or unaligned access, or both, but you must ensure that the distance between the write command and the corresponding write data is constant on the AFI interface. For example, if a command is sent on the second memory clock in a PHY clock, the write data must also start at the second memory clock in a PHY clock.

Write sequences with wlat=0

Figure 66. AFI Write Data Full-Rate, wlat=0

The following diagrams illustrate both aligned and unaligned access. The first three write commands are aligned accesses, issued on the LSB of afi_command. The fourth write command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.

Figure 67. AFI Write Data Half-Rate, wlat=0

Figure 68. AFI Write Data Quarter-Rate, wlat=0
Write sequences with wlat=non-zero

The afi_wlat signal originates from the PHY. The controller must delay the afi_dqs_burst, afi_wdata_valid, afi_wdata, and afi_dm signals by a number of PHY clock cycles equal to afi_wlat, which is a static value determined by calibration before the PHY asserts cal_success to the controller. The following figures illustrate the case where wlat=1. Note that wlat is expressed as a number of PHY clocks; wlat=1 therefore equals 1, 2, or 4 memory clocks of delay, respectively, at full, half, and quarter rate.

Figure 69. AFI Write Data Full-Rate, wlat=1

Figure 70. AFI Write Data Half-Rate, wlat=1

Figure 71. AFI Write Data Quarter-Rate, wlat=1

DQS burst

The afi_dqs_burst signal must be asserted one or two complete memory clock cycles earlier to generate the DQS preamble. The DQS preamble is equal to one-half and one-quarter AFI clock cycles at half and quarter rate, respectively. A DQS preamble of two is required in DDR4 when the write preamble is set to two clock cycles. The following diagrams illustrate how afi_dqs_burst must be asserted in full, half, and quarter-rate configurations.

Figure 72. AFI DQS Burst Full-Rate, wlat=1

Figure 73. AFI DQS Burst Half-Rate, wlat=1

Figure 74. AFI DQS Burst Quarter-Rate, wlat=1
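The wlat delay is commonly implemented as a shift register on the write-data group. The following full-rate Verilog sketch is illustrative only: the wr_* controller-side names and the module boundary are assumptions, afi_dqs_burst handling (which must additionally start early enough to form the preamble, as described above) is omitted, and a real controller would typically fold this into its write datapath.

    module afi_wlat_delay #(
        parameter DQ_WIDTH = 32,
        parameter MAX_WLAT = 63            // afi_wlat can range from 0 to 63
    ) (
        input  wire                clk,       // afi_clk
        input  wire [5:0]          afi_wlat,  // static after calibration
        input  wire                wr_valid,  // controller-side write data valid
        input  wire [DQ_WIDTH-1:0] wr_data,
        output wire                afi_wdata_valid,
        output wire [DQ_WIDTH-1:0] afi_wdata
    );
        reg [MAX_WLAT:0]   valid_sr = {(MAX_WLAT+1){1'b0}};
        reg [DQ_WIDTH-1:0] data_sr [0:MAX_WLAT];
        integer i;

        always @(posedge clk) begin
            valid_sr   <= {valid_sr[MAX_WLAT-1:0], wr_valid};
            data_sr[0] <= wr_data;
            for (i = 1; i <= MAX_WLAT; i = i + 1)
                data_sr[i] <= data_sr[i-1];
        end

        // afi_wlat = 0 passes data through undelayed; otherwise tap the
        // shift register afi_wlat cycles back.
        assign afi_wdata_valid = (afi_wlat == 6'd0) ? wr_valid : valid_sr[afi_wlat - 6'd1];
        assign afi_wdata       = (afi_wlat == 6'd0) ? wr_data  : data_sr[afi_wlat - 6'd1];
    endmodule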
Write data sequence with DBI (DDR4 and QDR-IV only)

The DDR4 write DBI feature is supported in the PHY; when it is enabled, the PHY sends and receives the DBI signal without any controller involvement. The sequence on the AFI interface is identical to non-DBI scenarios.

Write data sequence with CRC (DDR4 only)

When the CRC feature of the PHY is enabled and used, the controller must ensure at least one memory clock cycle between write commands, during which the PHY inserts the CRC data. Sending back-to-back write commands would cause a functional failure. The following figures show the legal sequences in CRC mode. Entries marked as 0 and RESERVE must be observed by the controller; no information is allowed in those entries.

Figure 75. AFI Write Data with CRC Half-Rate, wlat=2

Figure 76. AFI Write Data with CRC Quarter-Rate, wlat=2

2.17.4.3 AFI Read Sequence Timing Diagrams

The following waveforms illustrate the AFI read data waveforms in full, half, and quarter rate, respectively. The afi_rdata_en_full signal must be asserted for the entire read burst operation. The afi_rdata_en signal need only be asserted for the intended read data. Aligned and unaligned access for read commands is similar to that for write commands; however, the afi_rdata_en_full signal must be sent on the same memory clock in a PHY clock as the read command. That is, if a read command is sent on the second memory clock in a PHY clock, afi_rdata_en_full must also be asserted starting from the second memory clock in a PHY clock.
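As a worked example (a hypothetical half-rate DDR3 interface with burst length 8, read issued on the second command slot, so an unaligned access), the 8 data beats occupy 4 memory clocks, so afi_rdata_en_full spans three PHY clock cycles starting on the same slot as the command:

    // Slot 0 is bit [0] and slot 1 is bit [1] of each 2-bit AFI bus.
    // PHY cycle 0: read command on slot 1; afi_rdata_en_full = 2'b10
    // PHY cycle 1: burst continues on both slots; afi_rdata_en_full = 2'b11
    // PHY cycle 2: burst ends on slot 0; afi_rdata_en_full = 2'b01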
Figure 77. AFI Read Data Full-Rate

The following figure illustrates that the second and third reads require only the first and second half of the data, respectively. The first three read commands are aligned accesses, issued on the LSB of afi_command. The fourth read command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.

Figure 78. AFI Read Data Half-Rate

In the following figure, the first three read commands are aligned accesses, issued on the LSB of afi_command. The fourth read command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.

Figure 79. AFI Read Data Quarter-Rate

2.17.4.4 AFI Calibration Status Timing Diagram

The controller interacts with the PHY during calibration at power-up and at recalibration. At power-up, the PHY holds afi_cal_success and afi_cal_fail at 0 until calibration is done, when it asserts afi_cal_success, indicating to the controller that the PHY is ready to use and that the afi_wlat and afi_rlat signals carry valid values. At recalibration, the controller asserts afi_cal_req, which triggers the same sequence as at power-up and forces recalibration of the PHY.

Figure 80. Calibration

2.18 Stratix 10 Resource Utilization

The following tables provide resource utilization information for external memory interfaces on Stratix 10 devices.

2.18.1 QDR-IV Resource Utilization in Stratix 10 Devices

The following table shows typical resource usage of QDR-IV controllers for Stratix 10 devices.

Table 38. QDR-IV Resource Utilization in Stratix 10 Devices

Memory Width (Bits) | Combinational ALUTs | Dedicated Logic Registers | Block Memory Bits | M20Ks | Soft Controllers
18 | 2123 | 4592 | 18432 | 8 | 1
36 | 2127 | 6023 | 36864 | 16 | 1
72 | 2114 | 8826 | 73728 | 32 | 1

2.19 Stratix 10 EMIF Latency

The following latency data applies to all memory protocols supported by the Stratix 10 EMIF IP.

Table 39. Latency in Full-Rate Memory Clock Cycles

Rate(1) | Controller Address & Command | PHY Address & Command | Memory Read Latency(2) | PHY Read Data Return | Controller Read Data Return | Round Trip | Round Trip Without Memory
Half:Write | 12 | 2 | 3-23 | n/a | n/a | n/a | n/a
Half:Read | 8 | 2 | 3-23 | 6 | 8 | 27-47 | 24
Quarter:Write | 14 | 2 | 3-23 | n/a | n/a | n/a | n/a
Quarter:Read | 10 | 2 | 3-23 | 6 | 14 | 35-55 | 32
Half:Write (ECC) | 14 | 2 | 3-23 | n/a | n/a | n/a | n/a
Half:Read (ECC) | 12 | 2 | 3-23 | 6 | 8 | 31-51 | 28
Quarter:Write (ECC) | 14 | 2 | 3-23 | n/a | n/a | n/a | n/a
Quarter:Read (ECC) | 12 | 2 | 3-23 | 6 | 14 | 37-57 | 34

Notes:
1. User interface rate; the controller always operates at half rate.
2. Minimum and maximum read latency range for DDR3, DDR4, and LPDDR3.

2.20 Integrating a Custom Controller with the Hard PHY

If you want to use your own custom memory controller, you must integrate the controller with the hard PHY to achieve a complete memory solution. Observe the following general guidelines:

• When you configure your external memory interface IP, ensure that you select Configuration ➤ Hard PHY Only on the General tab in the parameter editor.
• Consult the AFI 4.0 Specification for detailed information on the AFI interface to the PHY.

Related Links
AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.

2.21 Document Revision History

Date: May 2017. Version: 2017.05.08. Changes: Initial release.

3 Functional Description—Intel Arria 10 EMIF IP

Intel Arria 10 devices can interface with external memory devices clocking at frequencies of up to 1.3 GHz. The external memory interface IP component for Arria 10 devices provides a single parameter editor for creating external memory interfaces, regardless of memory protocol. Unlike earlier EMIF solutions, which used protocol-specific parameter editors to create memory interfaces via a complex RTL generation method, the Arria 10 EMIF solution combines the protocol-specific hardened EMIF logic of the Arria 10 device with more generic soft logic. The Arria 10 EMIF solution is designed with the following implementations in mind:

Hard Memory Controller and Hard PHY

This implementation provides a complete external memory interface, based on the hard memory controller and hard PHY that are part of the Arria 10 silicon. An Avalon-MM interface is available for integration with user logic.

Soft Memory Controller and Hard PHY

This implementation provides a complete external memory interface, using an Intel-provided soft-logic-based memory controller and the hard PHY that is part of the Arria 10 silicon. An Avalon-MM interface is available for integration with user logic.

Custom Memory Controller and Hard PHY (PHY only)

This implementation provides access to the AFI interface, to allow use of a custom or third-party memory controller with the hard PHY that is part of the Arria 10 silicon. Because only the PHY component is provided by Intel, this configuration is also known as PHY only.

Related Links

• Arria 10 EMIF Architecture: Introduction on page 146
The Arria 10 EMIF architecture contains many new hardware features designed to meet the high-speed requirements of emerging memory protocols, while consuming the smallest amount of core logic area and power.
• Hardware Resource Sharing Among Multiple EMIFs on page 158
Often, it is necessary or desirable to share resources between interfaces.
Related Links
AFI 4.0 Specification on page 120
The Altera PHY interface (AFI) 4.0 defines communication between the controller and the physical layer (PHY) in the external memory interface IP.

2.21 Document Revision History
Date | Version | Changes
May 2017 | 2017.05.08 | Initial release.

3 Functional Description—Intel Arria 10 EMIF IP

Intel Arria 10 devices can interface with external memory devices clocking at frequencies of up to 1.3 GHz. The external memory interface IP component for Arria 10 devices provides a single parameter editor for creating external memory interfaces, regardless of memory protocol. Unlike earlier EMIF solutions, which used protocol-specific parameter editors to create memory interfaces via a complex RTL generation method, the Arria 10 EMIF solution combines the protocol-specific hardened EMIF logic of the Arria 10 device with more generic soft logic. The Arria 10 EMIF solution is designed with the following implementations in mind:

Hard Memory Controller and Hard PHY
This implementation provides a complete external memory interface, based on the hard memory controller and hard PHY that are part of the Arria 10 silicon. An Avalon-MM interface is available for integration with user logic.

Soft Memory Controller and Hard PHY
This implementation provides a complete external memory interface, using an Intel-provided soft-logic-based memory controller and the hard PHY that is part of the Arria 10 silicon. An Avalon-MM interface is available for integration with user logic.

Custom Memory Controller and Hard PHY (PHY only)
This implementation provides access to the AFI interface, to allow use of a custom or third-party memory controller with the hard PHY that is part of the Arria 10 silicon. Because only the PHY component is provided by Intel, this configuration is also known as PHY only.

Related Links
• Arria 10 EMIF Architecture: Introduction on page 146
The Arria 10 EMIF architecture contains many new hardware features designed to meet the high-speed requirements of emerging memory protocols, while consuming the smallest amount of core logic area and power.
• Hardware Resource Sharing Among Multiple EMIFs on page 158
Often, it is necessary or desirable to share resources between interfaces.
• Arria 10 EMIF IP Component on page 162
The external memory interface IP component for Arria 10 provides a complete solution for implementing DDR3, DDR4, and QDR-IV external memory interfaces.
• Compiling Arria 10 EMIF IP with the Quartus Prime Software on page 185
• Debugging Arria 10 EMIF IP on page 186
You can debug hardware failures by connecting to the EMIF Debug Toolkit or by exporting an Avalon-MM slave port, from which you can access information gathered during calibration.
• Intel Arria 10 Core Fabric and General Purpose I/Os Handbook

3.1 Supported Memory Protocols

The following table lists the external memory protocols supported by Arria 10 devices.

Table 40. Supported Memory Protocols
Protocol | Hard Controller and Hard PHY | Soft Controller and Hard PHY | PHY Only
DDR4 | Yes | — | Yes
DDR3 | Yes | — | Yes
LPDDR3 | Yes | — | Yes
RLDRAM 3 | — | Third party | Yes
QDR II/II+/II+ Xtreme | — | Yes | —
QDR-IV | — | Yes | —

Memory protocols not listed above are not supported by the Arria 10 EMIF IP; however, you can implement a custom memory interface for these protocols using the Altera PHYLite Megafunction.

LPDDR3 is supported for simulation, compilation, and timing. Hardware support for LPDDR3 will be provided in a future release.

Note: To achieve maximum top-line spec performance for a DDR4 interface, both read data bus inversion and periodic OCT calibration must be enabled. For maximum top-line spec performance for a QDR-IV interface, read data bus inversion must be enabled.

Related Links
Altera PHYLite for Memory Megafunction User Guide

3.2 Key Differences Compared to UniPHY IP and Previous Device Families

The Arria 10 EMIF IP has a new design which bears several notable differences compared to UniPHY-based IP. If you are familiar with the UniPHY-based IP, you should review the following differences, as they affect the way you generate, instantiate, and use the Arria 10 EMIF IP.

• Unlike the UniPHY-based IP, which presents a protocol-specific parameter editor for each supported memory protocol, the Arria 10 EMIF IP uses one parameter editor for all memory protocols.
• With UniPHY-based IP, you must run the <variation_name>_pin_assignments.tcl script following synthesis, to apply I/O assignments to the project's .qsf file. In Arria 10 EMIF IP, the <variation_name>_pin_assignments.tcl script is no longer necessary. All the I/O assignments are included in the generated .qip file, which the Quartus Prime software processes during compilation. Assignments that you make in the .qsf file override those in the .qip file.
• The Arria 10 EMIF IP includes a <variation_name>readme.txt file, located in the /altera_emif_arch_nf_<version> directory. This file contains important information about the implementation of the IP, including pin location guidelines, information on resource sharing, and signal descriptions.
• To generate the synthesis example design or the simulation example design, you must run additional scripts after generation.

3.3 Migrating from Previous Device Families

There is no automatic migration mechanism for external memory interface IP generated for previous device families. To migrate an existing EMIF IP from an earlier device family to Arria 10, you must reparameterize and regenerate the IP targeting Arria 10, using either the IP Catalog or Qsys. If you attempt to recompile an existing IP generated for a previous device family, you will encounter errors in the Quartus Prime software.

UniPHY-based IP continues to be supported for previous device families.

3.4 Arria 10 EMIF Architecture: Introduction

The Arria 10 EMIF architecture contains many new hardware features designed to meet the high-speed requirements of emerging memory protocols, while consuming the smallest amount of core logic area and power. The following are key hardware features of the Arria 10 EMIF architecture:

Hard Sequencer
The sequencer employs a hard Nios II processor, and can perform memory calibration for a wide range of protocols. You can share the sequencer among multiple memory interfaces of the same or different protocols.

Hard PHY
The hard PHY in Arria 10 devices can interface with external memories running at speeds of up to 1.3 GHz. The PHY circuitry is hardened in the silicon, which simplifies the challenges of achieving timing closure and minimal power consumption.

Hard Memory Controller
The hard memory controller reduces latency and minimizes core logic consumption in the external memory interface. The hard memory controller supports the DDR3, DDR4, and LPDDR3 memory protocols.

PHY-Only Mode
Protocols that use a hard controller (DDR4, DDR3, and LPDDR3), as well as RLDRAM 3, provide a "PHY-only" option. When selected, this option generates only the PHY and sequencer, but not the controller. PHY-only mode provides a mechanism by which to integrate your own custom soft controller.

High-Speed PHY Clock Tree
Dedicated high-speed PHY clock networks clock the I/O buffers in Arria 10 EMIF IP. The PHY clock trees exhibit low jitter and low duty-cycle distortion, maximizing the data valid window.

Automatic Clock Phase Alignment
Automatic clock phase alignment circuitry dynamically adjusts the clock phase of core clock networks to match the clock phase of the PHY clock networks. The clock phase alignment circuitry minimizes the clock skew that can complicate timing closure in transfers between the FPGA core and the periphery.

Resource Sharing
The Arria 10 architecture simplifies resource sharing between memory interfaces. Resources such as the OCT calibration block, PLL reference clock pin, and core clock can be shared. The hard Nios processor in the I/O AUX must be shared across all interfaces in a column.

Related Links
Intel Arria 10 Core Fabric and General Purpose I/Os Handbook

3.4.1 Arria 10 EMIF Architecture: I/O Subsystem

The I/O subsystem consists of two columns inside the core of Arria 10 devices. Each column can be thought of as loosely analogous to an I/O bank.
Figure 81. Arria 10 I/O Subsystem
[Block diagram: core fabric with an I/O column on either side, plus transceivers (if applicable)]

The I/O subsystem provides the following features:
• General-purpose I/O registers and I/O buffers
• On-chip termination control (OCT)
• I/O PLLs for external memory interfaces and user logic
• Low-voltage differential signaling (LVDS)
• External memory interface components, as follows:
— Hard memory controller
— Hard PHY
— Hard Nios processor and calibration logic
— DLL

Related Links
Arria 10 Core Fabric and General Purpose I/Os Handbook

3.4.2 Arria 10 EMIF Architecture: I/O Column

Arria 10 devices have two I/O columns, which contain the hardware related to external memory interfaces. Each I/O column contains the following major parts:
• A hardened Nios processor with dedicated memory. This Nios block is referred to as the I/O AUX.
• Up to 13 I/O banks. Each I/O bank contains the hardware necessary for an external memory interface.

Figure 82. I/O Column
[Block diagram: an I/O column with its individual I/O banks (for example, 2A through 2L and 3A through 3H), the I/O center, and adjacent transceiver blocks; a detail view of one bank shows the I/O PLL, I/O DLL, I/O CLK, OCT, VR, hard memory controller and PHY, sequencer, bank control, and four I/O lanes of LVDS I/O buffer pairs with SERDES and DPA circuitry]

3.4.3 Arria 10 EMIF Architecture: I/O AUX

Each column includes one I/O AUX, which contains a hardened Nios II processor with dedicated memory. The I/O AUX is responsible for calibration of all the EMIFs in the column.

The I/O AUX includes dedicated memory which stores both the calibration algorithm and calibration run-time data. The hardened Nios II processor and the dedicated memory can be used only by an external memory interface, and cannot be employed for any other use. The I/O AUX can interface with soft logic, such as the debug toolkit, via an Avalon-MM bus.

The I/O AUX is clocked by an on-die oscillator, and therefore does not consume a PLL.

3.4.4 Arria 10 EMIF Architecture: I/O Bank

A single I/O bank contains all the hardware needed to build an external memory interface. Each I/O column contains up to 13 I/O banks; the exact number of banks depends on device size and pin package. You can make a wider interface by connecting multiple banks together.

Each I/O bank resides in an I/O column, and contains the following components:
• Hard memory controller
• Sequencer components
• PLL and PHY clock trees
• DLL
• Input DQS clock trees
• 48 pins, organized into four I/O lanes of 12 pins each

Figure 83. I/O Bank Architecture in Arria 10 Devices
[Block diagram: an I/O bank containing the memory controller, sequencer, PLL, and clock phase alignment block, plus four I/O lanes (each with output and input paths), connecting to the FPGA core and to the banks above and below]

I/O Bank Usage

The pins in an I/O bank can serve as address and command pins, data pins, or clock and strobe pins for an external memory interface. You can implement a narrow interface, such as a DDR3 or DDR4 x8 interface, with only a single I/O bank. A wider interface, such as x72 or x144, can be implemented by configuring multiple adjacent banks in a multi-bank interface. Any pins in a bank which are not used by the external memory interface remain available for use as general-purpose I/O pins (of the same voltage standard).

Every I/O bank includes a hard memory controller which you can configure for DDR3 or DDR4. In a multi-bank interface, only the controller of one bank is active; controllers in the remaining banks are turned off to conserve power.

To use a multi-bank Arria 10 EMIF interface, you must observe the following rules:
• Designate one bank as the address and command bank.
• The address and command bank must contain all the address and command pins.
• The locations of individual address and command pins within the address and command bank must adhere to the pin map defined in the pin table—regardless of whether you use the hard memory controller or not.
• If you do use the hard memory controller, the address and command bank contains the active hard controller.

All the I/O banks in a column are capable of functioning as the address and command bank. However, for minimal latency, you should select the center-most bank of the interface as the address and command bank.

3.4.4.1 Implementing a x8 Interface with Hard Memory Controller

The following diagram illustrates the use of a single I/O bank to implement a DDR3 or DDR4 x8 interface using the hard memory controller.

Figure 84. Single Bank x8 Interface With Hard Controller
[Block diagram: one bank with its memory controller, sequencer, PLL, and clock phase alignment block in use; I/O lanes 3, 2, and 1 carry address/command, and I/O lane 0 carries DQ group 0]

In the above diagram, shaded cells indicate resources that are in use.

Note: For information on the I/O lanes and pins in use, consult the pin table for your device or the <variation_name>/altera_emif_arch_nf_140/<synth|sim>/<variation_name>_altera_emif_arch_nf_140_<unique ID>_readme.txt file generated with your IP.

Related Links
Intel Arria 10 Core Fabric and General Purpose I/Os Handbook

3.4.4.2 Implementing a x72 Interface with Hard Memory Controller

The following diagram illustrates one possible implementation of a DDR3 or DDR4 x72 interface using the hard memory controller. Note that only the hard memory controller in the address and command bank is used. Similarly, only the clock phase alignment block of the address and command bank is used to generate clock signals for the FPGA core.
Figure 85. Multi-Bank x72 Interface With Hard Controller
[Block diagram: three banks; the center (address and command) bank uses its memory controller and clock phase alignment block, with I/O lanes 1-3 carrying address/command and lane 0 carrying DQ group 4; the outer banks carry DQ groups 0-3 and 5-8 on their I/O lanes]

In the above diagram, shaded cells indicate resources that are in use.

Note: For information on the I/O lanes and pins in use, consult the pin table for your device or the <variation_name>/altera_emif_arch_nf_140/<synth|sim>/<variation_name>_altera_emif_arch_nf_140_<unique ID>_readme.txt file generated with your IP.

Related Links
Arria 10 Core Fabric and General Purpose I/Os Handbook

3.4.5 Arria 10 EMIF Architecture: I/O Lane

An I/O bank contains 48 I/O pins, organized into four I/O lanes of 12 pins each. Each I/O lane can implement one x8/x9 read capture group (DQS group), with two pins functioning as the read capture clock/strobe pair (DQS/DQS#), and up to 10 pins functioning as data pins (DQ and DM pins). To implement x18 and x36 groups, you can use multiple lanes within the same bank.

It is also possible to implement a pair of x4 groups in a lane. In this case, four pins function as clock/strobe pairs, and 8 pins function as data pins. DM is not available for x4 groups. There must be an even number of x4 groups for each interface.

For x4 groups, DQS0 and DQS1 must be placed in the same I/O lane as a pair. Similarly, DQS2 and DQS3 must be paired. In general, DQS(x) and DQS(x+1) must be paired in the same I/O lane.

Table 41. Lanes Used Per Group
Group Size | Number of Lanes Used | Maximum Number of Data Pins per Group
x8 / x9 | 1 | 10
x18 | 2 | 22
x36 | 4 | 46
pair of x4 | 1 | 4 per group, 8 per lane

Figure 86. x4 Groups
[Block diagram: one bank; each of the four I/O lanes carries a pair of x4 groups (groups 0 and 1 in lane 0, up to groups 6 and 7 in lane 3)]

Figure 87. x8 Group
[Block diagram: one bank; each of the four I/O lanes carries one x8 group (groups 0 through 3)]

Figure 88. x18 Group
[Block diagram: one bank; x18 group 0 spans I/O lanes 2 and 3, and x18 group 1 spans I/O lanes 0 and 1]
Figure 89. x36 Group
[Block diagram: one bank; x36 group 0 spans all four I/O lanes]

3.4.6 Arria 10 EMIF Architecture: Input DQS Clock Tree

The input DQS clock tree is a balanced clock network that distributes the read capture clock and strobe from the external memory device to the read capture registers inside the I/Os. You can configure an input DQS clock tree in x4 mode, x8/x9 mode, x18 mode, or x36 mode.

Within every bank, only certain physical pins at specific locations can drive the input DQS clock trees. The pin locations that can drive the input DQS clock trees vary, depending on the size of the group.

Table 42. Pins Usable as Read Capture Clock / Strobe Pair (in-bank pin indices)
Group Size | Index of Lanes Spanned by Clock Tree | Positive Leg | Negative Leg
x4 | 0A | 4 | 5
x4 | 0B | 8 | 9
x4 | 1A | 16 | 17
x4 | 1B | 20 | 21
x4 | 2A | 28 | 29
x4 | 2B | 32 | 33
x4 | 3A | 40 | 41
x4 | 3B | 44 | 45
x8 / x9 | 0 | 4 | 5
x8 / x9 | 1 | 16 | 17
x8 / x9 | 2 | 28 | 29
x8 / x9 | 3 | 40 | 41
x18 | 0, 1 | 12 | 13
x18 | 2, 3 | 36 | 37
x36 | 0, 1, 2, 3 | 20 | 21

3.4.7 Arria 10 EMIF Architecture: PHY Clock Tree

Dedicated high-speed clock networks drive the I/Os in Arria 10 EMIF. Each PHY clock network spans only one bank. The relatively short span of the PHY clock trees results in low jitter and low duty-cycle distortion, maximizing the data valid window.

The PHY clock tree in Arria 10 devices can run as fast as 1.3 GHz. All Arria 10 external memory interfaces use the PHY clock trees.

3.4.8 Arria 10 EMIF Architecture: PLL Reference Clock Networks

Each I/O bank includes a PLL that can drive the PHY clock trees of that bank, through dedicated connections. In addition to supporting EMIF-specific functions, such PLLs can also serve as general-purpose PLLs for user logic.

Arria 10 external memory interfaces that span multiple banks use the PLL in each bank. (Previous device families relied on a single PLL with clock signals broadcast to all I/Os via a clock network.) The Arria 10 architecture allows for relatively short PHY clock networks, reducing jitter and duty-cycle distortion.

In a multi-bank interface, the clock outputs of individual PLLs must remain in phase; this is achieved by the following mechanisms:
• A single PLL reference clock source feeds all PLLs. The reference clock signal reaches the PLLs via a balanced PLL reference clock tree. The Quartus Prime software automatically configures the PLL reference clock tree so that it spans the correct number of banks.
• The IP sets the PLL M and N values appropriately to maintain synchronization among the clock dividers across the PLLs. This requirement restricts the legal PLL reference clock frequencies for a given memory interface frequency and clock rate. The Arria 10 EMIF IP parameter editor automatically calculates and displays the set of legal PLL reference clock frequencies. If you plan to use an on-board oscillator, you must ensure that its frequency matches the PLL reference clock frequency that you select from the displayed list.
The correct M and N values of the PLLs are set automatically, based on the PLL reference clock frequency that you select.

Note: The PLL reference clock pin may be placed in the address and command I/O bank or in a data I/O bank; the placement has no implication on timing.

Figure 90. PLL Balanced Reference Clock Tree
[Block diagram: ref_clk enters a balanced reference clock network that spans the I/O column; the PLL in each I/O bank drives that bank's PHY clock tree]

3.4.9 Arria 10 EMIF Architecture: Clock Phase Alignment

In Arria 10 external memory interfaces, a global clock network clocks registers inside the FPGA core, and the PHY clock network clocks registers inside the FPGA periphery. Clock phase alignment circuitry employs negative feedback to dynamically adjust the phase of the core clock signal to match the phase of the PHY clock signal. The clock phase alignment feature effectively eliminates the clock skew effect in all transfers between the core and the periphery, facilitating timing closure. All Arria 10 external memory interfaces employ clock phase alignment circuitry.

Figure 91. Clock Phase Alignment Illustration
[Block diagram: the PLL drives the PHY clock network in the FPGA periphery and the core clock network in the FPGA core; the clock phase alignment block adjusts the core clock phase]

Figure 92. Effect of Clock Phase Alignment
[Waveforms: without clock phase alignment, skew exists between the core and PHY clock networks; with clock phase alignment, the core and PHY clocks are aligned dynamically]

3.5 Hardware Resource Sharing Among Multiple EMIFs

Often, it is necessary or desirable to share resources between interfaces. The following topics explain which hardware resources can be shared, and provide guidance for doing so.

3.5.1 I/O Aux Sharing

The I/O Aux contains a hard Nios II processor and dedicated memory storing the calibration software code and data.

When a column contains multiple memory interfaces, the hard Nios II processor calibrates each interface serially. Interfaces placed within the same I/O column always share the same I/O Aux. The Quartus Prime Fitter handles I/O Aux sharing automatically.

3.5.2 I/O Bank Sharing

Data lanes from multiple compatible interfaces can share a physical I/O bank to achieve a more compact pin placement. To share an I/O bank, interfaces must use the same memory protocol, rate, frequency, I/O standard, and PLL reference clock signal.

Rules for Sharing I/O Banks
• A bank cannot serve as the address and command bank for more than one interface. This means that lanes which implement address and command pins for different interfaces cannot be allocated to the same physical bank.
Note: An exception to the above rule exists when two interfaces are configured in a Ping-Pong PHY fashion. In such a configuration, two interfaces share the same set of address and command pins, effectively meaning that they share the same address and command tile.
• Pins within a lane cannot be shared by multiple memory interfaces.
• Pins that are not used by EMIF IP can serve as general-purpose I/Os of compatible voltage and termination settings.
• You can configure a bank as LVDS or as EMIF, but not both at the same time.
• Interfaces that share banks must reside at consecutive bank locations.

The following diagram illustrates two x16 interfaces sharing an I/O bank.
The two interfaces share the same clock phase alignment block, so that one core clock signal can interact with both interfaces. Without sharing, the two interfaces would occupy a total of four physical banks instead of three.

Figure 93. I/O Bank Sharing
[Block diagram: three banks; the top bank holds interface 1's address/command lanes and DQ group 0, the bottom bank holds interface 2's address/command lanes and DQ group 0, and the shared middle bank carries DQ group 1 of each interface along with the shared clock phase alignment block]

3.5.3 PLL Reference Clock Sharing

In Arria 10, every I/O bank contains a PLL, meaning that it is not necessary to share PLLs in the interest of conserving resources. Nonetheless, it is often desirable to share PLLs for other reasons.

You might want to share PLLs between interfaces for the following reasons:
• To conserve pins.
• When combined with the use of the balanced PLL reference clock tree, to allow the clock signals at different interfaces to be synchronous and aligned to each other. For this reason, interfaces that share core clock signals must also share the PLL reference clock signal.

To implement PLL reference clock sharing, open your RTL and connect the PLL reference clock signal at your design's top level to the PLL reference clock port of multiple interfaces, as shown in the sketch after this section.

To share a PLL reference clock, the following requirements must be met:
• Interfaces must expect a reference clock signal of the same frequency.
• Interfaces must be placed in the same column.
• Interfaces must be placed at adjacent bank locations.

3.5.4 Core Clock Network Sharing

It is often desirable or necessary for multiple memory interfaces to be accessible using a single clock domain in the FPGA core.

You might want to share core clock networks for the following reasons:
• To minimize the area and latency penalty associated with clock domain crossing.
• To minimize consumption of core clock networks.

Multiple memory interfaces can share the same core clock signals under the following conditions:
• The memory interfaces have the same protocol, rate, frequency, and PLL reference clock source.
• The interfaces reside in the same I/O column.
• The interfaces reside in adjacent bank locations.

For multiple memory interfaces to share core clocks, you must specify one of the interfaces as master and the remaining interfaces as slaves. Use the Core clocks sharing setting in the parameter editor to specify the master and slaves.

In your RTL, connect the clks_sharing_master_out signal from the master interface to the clks_sharing_slave_in signal of all the slave interfaces. Both the master and slave interfaces expose their own output clock ports in the RTL (for example, emif_usr_clk and afi_clk), but the signals are equivalent, so it does not matter whether a clock port from a master or a slave is used.

Core clock sharing necessitates PLL reference clock sharing; therefore, only the master interface exposes an input port for the PLL reference clock. All slave interfaces use the same PLL reference clock signal.
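The following sketch shows the top-level connections described above for two interfaces that share a PLL reference clock and core clocks. The clks_sharing_master_out, clks_sharing_slave_in, pll_ref_clk, and emif_usr_clk port names follow the text above; the emif_ddr4 module name, the instance names, and the clock-sharing bus width are hypothetical.

```verilog
// Minimal sketch of PLL reference clock sharing plus core clock sharing.
module emif_sharing_top (
    input wire pll_ref_clk   // single reference clock source for both cores
    /* memory and user-logic ports omitted for brevity */
);
    wire [31:0] clks_sharing;  // placeholder width; the generated core defines it
    wire        emif_usr_clk;  // an equivalent clock exists on both cores

    // Master: exposes the PLL reference clock input and drives the
    // clock-sharing bus.
    emif_ddr4 u_emif_master (
        .pll_ref_clk             (pll_ref_clk),
        .clks_sharing_master_out (clks_sharing),
        .emif_usr_clk            (emif_usr_clk)
        /* remaining ports omitted */
    );

    // Slave: no PLL reference clock input; it receives its clocks from the
    // master through clks_sharing_slave_in.
    emif_ddr4 u_emif_slave (
        .clks_sharing_slave_in   (clks_sharing)
        /* remaining ports omitted */
    );
endmodule
```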
3.6 Arria 10 EMIF IP Component

The external memory interface IP component for Arria 10 provides a complete solution for implementing DDR3, DDR4, and QDR-IV external memory interfaces. The EMIF IP also includes a protocol-specific calibration algorithm that automatically determines the optimal delay settings for a robust external memory interface.

The external memory interface IP comprises the following parts:
• A set of synthesizable files that you can integrate into a larger design
• A stand-alone synthesizable example design that you can use for hardware validation
• A set of simulation files that you can incorporate into a larger project
• A stand-alone simulation example project that you can use to observe controller and PHY operation
• A set of timing scripts that you can use to determine the maximum operating frequency of the memory interface, based on external factors such as board skew, trace delays, and memory component timing parameters
• A customized data sheet specific to your memory interface configuration

3.6.1 Instantiating Your Arria 10 EMIF IP in a Qsys Project

The following steps describe how to instantiate your Arria 10 EMIF IP in a Qsys project.
1. Within the Qsys interface, select Memories and Memory Controllers in the component Library tree.
2. Under Memories and Memory Controllers, select External Memory Interfaces (Arria 10).
3. Under External Memory Interfaces (Arria 10), select the Arria 10 External Memory Interface component.

Figure 94. Instantiating Arria 10 EMIF IP in Qsys
[Screen capture of the Qsys component library]
3.6.1.1 Logical Connections

The following logical connections exist in an Arria 10 EMIF IP core.

Table 43. Logical Connections

afi_conduit_end (Conduit): The Altera PHY Interface (AFI) connects a memory controller to the PHY. This interface is exposed only when you configure the memory interface in PHY-Only mode. The interface is synchronous to the afi_clk clock and the afi_reset_n reset.

afi_clk_conduit_end (Conduit): Use this clock signal to clock the soft controller logic. The afi_clk is an output clock coming from the PHY when the memory interface is in PHY-Only mode. The phase of afi_clk is adjusted dynamically by hard circuitry for the best data transfer between FPGA core logic and periphery logic, with maximum timing margin. Multiple memory interface instances can share a single afi_clk using the Core Clocks Sharing option during IP generation.

afi_half_clk_conduit_end (Conduit): This clock runs at half the frequency of afi_clk. It is exposed only when the memory interface is in PHY-Only mode.

afi_reset_n_conduit_end (Conduit): This single-bit reset provides a synchronized reset output. Use this signal to reset all registers that are clocked by either afi_clk or afi_half_clk.

cal_debug_avalon_slave (Avalon Slave/Target): This interface is exposed when the EMIF Debug Toolkit/On-Chip Debug Port option is set to Export. It can be connected to an Arria 10 External Memory Interface Debug Component to allow EMIF Debug Toolkit access, or it can be used directly by user logic to access calibration diagnostic data.

cal_debug_clk_clock_sink (Clock Input): This clock is used for the cal_debug_avalon_slave interface. It can be connected to the emif_usr_clk_clock_source interface.

cal_debug_reset_reset_sink (Reset Input): This reset is used for the cal_debug_avalon_slave interface. It can be connected to the emif_usr_reset_reset_source interface.

cal_debug_out_avalon_master (Avalon Master/Source): This interface is exposed when the Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port option is enabled. Connect this interface to the cal_debug_avalon_slave interface of the next EMIF instance in the same I/O column.

cal_debug_out_clk_clock_source (Clock Output): This interface should be connected to the cal_debug_clk_clock_sink interface of the next EMIF instance in the same I/O column (similar to cal_debug_out_avalon_master).

cal_debug_out_reset_reset_source (Reset Output): This interface should be connected to the cal_debug_reset_reset_sink interface of the next EMIF instance in the same I/O column (similar to cal_debug_out_avalon_master).

effmon_csr_avalon_slave (Avalon Slave/Target): This interface allows access to the Efficiency Monitor CSR. For more information, see the documentation on the UniPHY Efficiency Monitor.

global_reset_reset_sink (Reset Input): This single-wire input port is the asynchronous reset input for the EMIF core.

pll_ref_clk_clock_sink (Clock Input): This single-wire input port connects the external PLL reference clock to the EMIF core. Multiple EMIF cores may share a PLL reference clock source, provided the restrictions outlined in the PLL and PLL Reference Clock Network section are observed.

oct_conduit_end (Conduit): This logical port is connected to an OCT pin and provides calibrated reference data for EMIF cores with pins that use signaling standards that require on-chip termination. Depending on the I/O standard, reference voltage, and memory protocol, multiple EMIF cores may share a single OCT pin.

mem_conduit_end (Conduit): This logical conduit can attach an Altera Memory Model to an EMIF core for simulation. Memory models for various protocols are available under the Memories and Memory Controllers — External Memory Interfaces — Memory Models section of the component library in Qsys. You must ensure that all configuration parameters for the memory model match the configuration parameters of the EMIF core.

status_conduit_end (Conduit): The status conduit exports two signals that can be sampled to determine whether the calibration operation passed or failed for that core.

emif_usr_reset_reset_source (Reset Output): This single-bit reset output provides a synchronized reset that you should use to reset all components that are synchronously connected to the EMIF core. Assertion of the global reset input triggers assertion of this output as well; therefore, you should rely on this signal alone as the reset source for all components connected to the EMIF core.

emif_usr_clk_clock_source (Clock Output): Use this single-bit clock output to clock all logic connected to the EMIF core. The phase of this clock signal is adjusted dynamically by circuitry in the EMIF core such that data can be transferred between core logic and periphery registers with maximum timing margin. Drive all logic connected to the EMIF core with this clock signal; other clock sources generated from the same reference clock, or even the same PLL, may have unknown phase relationships. Multiple EMIF cores can share a single core clock using the Core Clocks Sharing option described in the Example Design tab of the parameter editor.

ctrl_amm_avalon_slave (Avalon Slave/Target): Use this Avalon target port to initiate read or write commands to the controller. Refer to the Avalon Interface Specification for more information on how to design cores that comply with the Avalon Bus Specification. For DDR3, DDR4, and LPDDR3 protocols with the hard PHY and hard controller configuration and an Avalon slave interface exposed, ctrl_amm_avalon_slave is renamed to ctrl_amm_avalon_slave_0. For QDR II, QDR II+, and QDR II+ Xtreme interfaces with hard PHY and soft controller, separate read and write connections are used: ctrl_amm_avalon_slave_0 is the read port and ctrl_amm_avalon_slave_1 is the write port. For QDR-IV interfaces with hard PHY and soft controller operating at quarter rate, a total of eight separate Avalon interfaces (named ctrl_amm_avalon_slave_0 through ctrl_amm_avalon_slave_7) are used to maximize bus efficiency.
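As a usage illustration for the ctrl_amm_avalon_slave_0 port, the following user-logic sketch issues one write followed by one read. The amm_* signal names follow general Avalon-MM conventions (waitrequest, address, writedata, readdatavalid); the exact port names and widths of a generated core may differ, and this state machine is an assumption for illustration, not generated code.

```verilog
// Minimal Avalon-MM master: one write, then one read, on emif_usr_clk.
module simple_amm_master (
    input  wire         clk,               // emif_usr_clk
    input  wire         rst_n,             // emif_usr_reset_n
    output reg          amm_write,
    output reg          amm_read,
    output reg  [26:0]  amm_address,       // example width only
    output reg  [511:0] amm_writedata,     // example width only
    input  wire         amm_waitrequest,
    input  wire [511:0] amm_readdata,
    input  wire         amm_readdatavalid
);
    localparam IDLE = 2'd0, WR = 2'd1, RD = 2'd2, DONE = 2'd3;
    reg [1:0] state;
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            state         <= IDLE;
            amm_write     <= 1'b0;
            amm_read      <= 1'b0;
            amm_address   <= 27'd0;
            amm_writedata <= {512{1'b0}};
        end else case (state)
            IDLE: begin                       // launch a single write
                amm_writedata <= {16{32'hDEADBEEF}};
                amm_write     <= 1'b1;
                state         <= WR;
            end
            WR: if (!amm_waitrequest) begin   // write command accepted
                amm_write <= 1'b0;
                amm_read  <= 1'b1;            // read back the same address
                state     <= RD;
            end
            RD: if (!amm_waitrequest) begin   // read command accepted
                amm_read <= 1'b0;
                state    <= DONE;
            end
            DONE: ;  // amm_readdatavalid later qualifies amm_readdata
        endcase
    end
endmodule
```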
3.6.2 File Sets

The Arria 10 EMIF IP core generates four output file sets for every EMIF IP core, arranged according to the following directory structure.

Table 44. Generated File Sets

<core_name>/*: This directory contains only the files required to integrate a generated EMIF core into a larger design. This directory contains:
• Synthesizable HDL source files
• Customized Tcl timing scripts specific to the core (protocol and topology)
• HEX files used by the calibration algorithm to identify the interface parameters
• A customized data sheet that describes the operation of the generated core
Note: The top-level HDL file is generated in the root folder as <core_name>.v (or <core_name>.vhd for VHDL designs). You can reopen this file in the parameter editor if you want to modify the EMIF core parameters and regenerate the design.

<core_name>_sim/*: This directory contains the simulation file set for the generated EMIF core. These files can be integrated into a larger simulation project. For convenience, simulation scripts for compiling the core are provided in the /mentor, /cadence, /synopsys, and /riviera subdirectories. The top-level HDL file, <core_name>.v (or <core_name>.vhd), is located in this folder, and all remaining HDL files are placed in the /altera_emif_arch_nf subfolder, together with the customized data sheet. The contents of this directory are not intended for synthesis.

emif_<instance_num>_example_design/*: This directory contains a set of Tcl scripts, Qsys project files, and README files for the complete synthesis and simulation example designs. You can invoke these scripts to generate a stand-alone, fully-synthesizable project complete with an example driver, or a stand-alone simulation design complete with an example driver and a memory model.

3.6.3 Customized readme.txt File

When you generate your Arria 10 EMIF IP, the system produces a customized readme file, containing data indicative of the settings in your IP core.

The readme file is <variation_name>/altera_emif_arch_nf_<version_number>/<synth|sim>/<variation_name>_altera_emif_arch_nf_<version_number>_<unique ID>_readme.txt, and contains a summary of Arria 10 EMIF information, together with details specific to your IP core, including:
• Pin location guidelines
• External port names, directions, and widths
• Internal port names, directions, and widths
• Avalon interface configuration details (if applicable)
• Calibration mode
• A brief summary of all configuration settings for the generated IP

You should review the generated readme file for implementation guidelines specific to your IP core.

3.6.4 Clock Domains

The Arria 10 EMIF IP core provides a single clock domain to drive all logic connected to the EMIF core. The frequency of the clock depends on the core-clock to memory-clock interface rate ratio. For example, a quarter-rate interface with an 800 MHz memory clock would provide a 200 MHz clock to the core (800 MHz / 4 = 200 MHz).

The EMIF IP dynamically adjusts the phase of the core clock with respect to the periphery clock, to maintain the optimum alignment for transferring data between the core and the periphery.

Independent EMIF IP cores driven from the same reference clock have independent core clock domains. You should employ one of the following strategies if you are implementing multiple EMIF cores (a sketch of the first strategy follows this list):
1. Treat all crossings between independent EMIF-generated clock domains as asynchronous, even though the domains are generated from the same reference clock.
2. Use the Core clocks sharing option to enforce that multiple EMIF cores share the same core clock. You must enable this option during IP generation. This option is applicable only for cores that reside in the same I/O column.
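For strategy 1 above, a single-bit control signal crossing between two independent EMIF core clock domains can be re-timed with a conventional two-stage synchronizer, sketched below; multi-bit data would instead use a dual-clock FIFO. The module and signal names are illustrative.

```verilog
// Two-stage synchronizer for one control bit crossing into the clock
// domain of another EMIF core (treated as asynchronous per strategy 1).
module bit_sync (
    input  wire clk_dst,    // e.g. emif_usr_clk of the destination core
    input  wire rst_dst_n,
    input  wire d_src,      // bit launched in the other core's clock domain
    output wire d_dst
);
    reg [1:0] sync;
    always @(posedge clk_dst or negedge rst_dst_n) begin
        if (!rst_dst_n) sync <= 2'b00;
        else            sync <= {sync[0], d_src};
    end
    assign d_dst = sync[1];
endmodule
```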
3.6.5 ECC in Arria 10 EMIF IP

ECC (error correction code) logic is a soft component of the Arria 10 EMIF IP that reduces the chance of errors when reading from and writing to external memory. ECC allows correction of single-bit errors and reduces the chance of system failure.

The ECC component includes an encoder, a decoder, a write FIFO buffer, and modification logic, to allow read-modify-write operations. The ECC code employs standard Hamming logic to correct single-bit errors and to detect double-bit errors. ECC is available in 16, 24, 40, and 72 bit widths.

When writing data to memory, the encoder creates ECC bits and writes them together with the regular data. When reading from memory, the decoder checks the ECC bits and regular data, and passes the regular data unchanged if no errors are detected. If a single-bit error is detected, the ECC logic corrects the error and passes the regular data. If more than a single-bit error is detected, the ECC logic sets a flag to indicate the error.

Read-modify-write operations can occur in the following circumstances (a sketch of the partial-write test follows this list):
• A partial write in data mask mode with ECC enabled, where at least one memory burst of byte-enable is not all ones or all zeros.
• Auto-correction with ECC enabled. This is usually a dummy write issued by the auto-correction logic to correct the memory content when a single-bit error is detected. The read-modify-write reads back the data, corrects the single-bit error, and writes the data back.

The additional overhead associated with read-modify-write operations can severely reduce memory interface efficiency. For best efficiency, you should design traffic patterns to avoid read-modify-write operations wherever possible, such as by minimizing the number of partial writes in ECC mode.
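The partial-write condition in the first bullet reduces to a simple test on the byte-enable pattern of a memory burst, sketched below. The module, parameter, and signal names are illustrative only.

```verilog
// A burst is a partial write (in data mask mode) when its byte enables are
// neither all ones (a full write) nor all zeros (e.g. the dummy write used
// by auto-correction).
module partial_write_detect #(
    parameter BE_WIDTH = 8              // byte enables per memory burst
) (
    input  wire [BE_WIDTH-1:0] byteenable,
    output wire                is_partial
);
    assign is_partial = ~(&byteenable) & (|byteenable);
endmodule
```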
3.6.5.1 ECC Components

The ECC logic communicates with user logic via an Avalon-MM interface, and with the hard memory controller via an Avalon-ST interface.

ECC Encoder
The ECC encoder consists of a x64/x72 encoder IP core capable of single-bit error correction and double-bit error detection. The encoder takes a 64-bit input and converts it to a 72-bit output, where the 8 additional bits are the ECC code. The encoder supports any input data width less than 64 bits; any unused input data bits are set to zero.

ECC Decoder
The ECC decoder consists of a x72/x64 decoder IP core capable of double-bit error detection. The decoder takes a 72-bit input and converts it to a 64-bit output. The decoder also produces single-bit error and double-bit error information. The decoder controls the user read data valid signal; when read data is intended for a partial write, the user read data valid signal is deasserted, because the read data is meant for merging, not for the user.

Partial Write Data FIFO Buffer
The partial write data FIFO buffer is implemented in soft logic to store partial write data and byte enables. Data and byte enables are popped and merged with the returned read data. A partial write can occur in the following situations:
• At least one memory burst of byte enable is not all ones or all zeros.
• Non-data-masked mode, where all memory bursts of byte enable are not all ones.
• A dummy write with auto-correction logic, where all memory bursts of byte enable are all zeros. (You might use a dummy write when correcting memory content with a single-bit error.)

Merging Logic
The merging logic merges the returned partial read data with the write data, based on the byte enables popped from the FIFO buffer, and sends the result to the ECC encoder.

Pointer FIFO Buffer
The pointer FIFO buffer is implemented in soft logic to store write data pointers. The ECC logic refers to the pointers when sending write data to the DBC. The pointers serve to overwrite existing write data in the data buffer during a read-modify-write process.

Partial Logic
The partial logic decodes byte-enable information and distinguishes between normal and partial writes.

Memory Mode Register Interface
The Memory Mode Register (MMR) interface is an Avalon-based interface through which core logic can access debug signals and sideband operation requests in the hard memory controller. The MMR logic routes ECC-related operations to an MMR register implemented in soft logic, and returns the ECC information via an Avalon-MM interface. The MMR logic tracks single-bit and double-bit error status, and provides the following information:
• Interrupt status.
• Single-bit error and double-bit error status.
• Single-bit error and double-bit error counts (to a maximum of 15; if more than 15 errors occur, the count overflows).
• Address of the last error.
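To make the "standard Hamming logic" above concrete, the sketch below builds a (72,64) single-error-correcting, double-error-detecting codeword using a generic extended-Hamming construction: check bits sit at power-of-two positions, and an overall parity bit enables double-bit detection. This is an assumption for illustration only; the actual bit ordering and code used by the ECC soft logic are defined by the IP, not by this sketch.

```verilog
// Generic extended-Hamming (72,64) SECDED encoder sketch (illustrative;
// not the Arria 10 ECC IP's actual code).
module secded72_enc_sketch (
    input  wire [63:0] d,
    output wire [71:0] cw
);
    reg [71:0] v;
    reg [6:0]  p;      // Hamming check bits for position masks 1,2,4,...,64
    integer    i, k;
    always @* begin
        // Scatter data into positions 1..71, skipping powers of two.
        v = 72'd0;
        k = 0;
        for (i = 1; i < 72; i = i + 1)
            if ((i & (i - 1)) != 0) begin
                v[i] = d[k];
                k = k + 1;
            end
        // Each check bit covers the positions whose index has that bit set.
        for (i = 0; i < 7; i = i + 1) begin
            p[i] = 1'b0;
            for (k = 1; k < 72; k = k + 1)
                if (((k >> i) & 1) != 0)
                    p[i] = p[i] ^ v[k];
        end
        for (i = 0; i < 7; i = i + 1)
            v[1 << i] = p[i];
        v[0] = ^v[71:1];  // overall parity, for double-bit error detection
    end
    assign cw = v;
endmodule
```

A matching decoder recomputes the check bits to form a syndrome: a nonzero syndrome with an overall parity mismatch locates a single-bit error, while a nonzero syndrome with matching overall parity flags an uncorrectable double-bit error.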
3.6.5.2 ECC User Interface Controls

You can enable the ECC logic from the Configuration, Status, and Error Handling section of the Controller tab in the parameter editor. There are three user interface settings related to ECC:
• Enable Memory-Mapped Configuration and Status Register (MMR) Interface: Allows run-time configuration of the memory controller. You can enable this option together with ECC to retrieve error detection information from the ECC logic.
• Enable Error Detection and Correction Logic: Enables ECC logic for single-bit error correction and double-bit error detection.
• Enable Auto Error Correction: Allows the controller to automatically correct single-bit errors detected by the ECC logic.

3.7 Examples of External Memory Interface Implementations for DDR4

The following figures are examples of external memory interface implementations for different DDR4 memory widths. The figures show the locations of the address/command and data pins in relation to the locations of the memory controllers.

Figure 95. DDR4 1x8 Implementation Example (One I/O Bank)
[One bank: the controller with three address/command lanes, plus one data lane]

Figure 96. DDR4 1x32 Implementation Example (Two I/O Banks)
[Two banks: one bank holds the controller and three address/command lanes; four data lanes occupy the other bank]

Figure 97. DDR4 1x72 Implementation Example (Three I/O Banks)
[Three banks: the center bank holds the controller and three address/command lanes; nine data lanes are spread across the three banks]

Figure 98. DDR4 2x16 Implementation Example with Controllers in Non-Adjacent Banks (Three I/O Banks)
[Three banks: controller 1 and its address/command lanes in the top bank, controller 2 and its address/command lanes in the bottom bank, with the data lanes of both interfaces meeting in the middle bank]

Figure 99. DDR4 2x16 Implementation Example with Controllers in Adjacent Banks (Three I/O Banks)
[Three banks: the two controllers and their address/command lanes occupy adjacent banks, with each interface's data lanes in the remaining lanes and bank]

Figure 100. DDR4 1x144 Implementation Example (Six I/O Banks)
[Six banks: eighteen data lanes, with the controller and three address/command lanes in one bank near the middle of the interface]

3.8 Arria 10 EMIF Sequencer

The Arria 10 EMIF sequencer is fully hardened in silicon, with executable code to handle protocols and topologies. Hardened RAM contains the calibration algorithm.

The Arria 10 EMIF sequencer is responsible for the following operations:
• Initializing memory devices.
• Calibrating the external memory interface.
• Governing the hand-off of control to the memory controller.
• Handling recalibration requests and debug requests.
• Handling all supported protocols and configurations.

Figure 101. Arria 10 EMIF Sequencer Operation
[Flowchart: the sequencer software discovers the EMIFs in the column; for each interface it initializes the external memory, calibrates the interface, and hands off; once all interfaces are processed, it performs house-keeping tasks]

3.8.1 Arria 10 EMIF DQS Tracking

DQS tracking is enabled for the QDR II/II+/II+ Xtreme, RLDRAM 3, and LPDDR3 protocols. DQS tracking is not available for the DDR3 and DDR4 protocols.
3.9 Arria 10 EMIF Calibration

The calibration process compensates for skews and delays in the external memory interface. The calibration process enables the system to compensate for the effects of factors such as the following:
• Timing and electrical constraints, such as setup/hold times and Vref variations.
• Circuit board and package factors, such as skew, fly-by effects, and manufacturing variations.
• Environmental uncertainties, such as variations in voltage and temperature.
• The demanding effects of small margins associated with high-speed operation.

3.9.1 Calibration Stages

At a high level, the calibration routine consists of address and command calibration, read calibration, and write calibration. The stages of calibration vary, depending on the protocol of the external memory interface.

Table 45. Calibration Stages by Protocol
Stage | DDR4 | DDR3 | LPDDR3 | RLDRAM II/3 | QDR-IV | QDR II/II+
Address and command: Leveling | Yes | Yes | — | — | — | —
Address and command: Deskew | Yes | — | Yes | — | Yes | —
Read: DQSen | Yes | Yes | Yes | Yes | Yes | Yes
Read: Deskew | Yes | Yes | Yes | Yes | Yes | Yes
Read: VREF-In | Yes | — | — | — | Yes | —
Read: LFIFO | Yes | Yes | Yes | Yes | Yes | Yes
Write: Leveling | Yes | Yes | Yes | Yes | Yes | —
Write: Deskew | Yes | Yes | Yes | Yes | Yes | Yes
Write: VREF-Out | Yes | — | — | — | — | —

3.9.2 Calibration Stages Descriptions

The various stages of calibration perform address and command calibration, read calibration, and write calibration.

Address and Command Calibration
The goal of address and command calibration is to delay address and command signals as necessary to optimize the address and command window. This stage is not available for all protocols, and cannot compensate for an inefficient board design.

Address and command calibration consists of the following parts:
• Leveling calibration— Centers the CS# signal and the entire address and command bus, relative to the CK clock. This operation is available only for DDR3 and DDR4 interfaces.
• Deskew calibration— Provides per-bit deskew for the address and command bus (except CS#), relative to the CK clock. This operation is available for DDR4 and QDR-IV interfaces only.

Read Calibration
Read calibration consists of the following parts:
• DQSen calibration— Calibrates the timing of the read capture clock gating and ungating, so that the PHY can gate and ungate the read clock at precisely the correct time—if too early or too late, data corruption can occur. The algorithm for this stage varies, depending on the memory protocol.
• Deskew calibration— Performs per-bit deskew of read data, relative to the read strobe or clock.
• VREF-In calibration— Calibrates the Vref level at the FPGA.
• LFIFO calibration— Normalizes differences in read delays between groups, due to fly-by, skews, and other variables and uncertainties.

Write Calibration
Write calibration consists of the following parts:
• Leveling calibration— Aligns the write strobe and clock to the memory clock, to compensate for skews, especially those associated with fly-by topology. The algorithm for this stage varies, depending on the memory protocol.
• Deskew calibration— Performs per-bit deskew of write data, relative to the write strobe and clock.
• VREF-Out calibration— Calibrates the Vref level at the memory device.

3.9.3 Calibration Algorithms

The calibration algorithms sometimes vary, depending on the targeted memory protocol.
Address and Command Calibration
Address and command calibration consists of the following parts:
• Leveling calibration— (DDR3 and DDR4 only) Toggles the CS# and CAS# signals to send read commands while keeping other address and command signals constant. The algorithm monitors for incoming DQS signals; if the DQS signal toggles, the read commands have been accepted. The algorithm then repeats with different delay values, to find the optimal window.
• Deskew calibration— (DDR4, QDR-IV, and LPDDR3 only)
— (DDR4) Uses the DDR4 address and command parity feature. The FPGA sends the address and command parity bit, and the DDR4 memory device responds with an alert signal if the parity bit is detected. The alert signal from the memory device tells the FPGA that the parity bit was received. Deskew calibration requires use of the PAR/ALERT# pins, so you should not omit these pins from your design. One limitation of deskew calibration is that it cannot deskew the ODT and CKE pins.
— (QDR-IV) Uses the QDR-IV loopback mode. The FPGA sends address and command signals, and the memory device sends back the address and command signals which it captures, via the read data pins. The returned signals indicate to the FPGA what the memory device has captured. Deskew calibration can deskew all synchronous address and command signals.
— (LPDDR3) Uses the LPDDR3 CA training mode. The FPGA sends signals onto the LPDDR3 CA bus, and the memory device sends back those signals that it captures, via the DQ pins. The returned signals indicate to the FPGA what the memory device has captured. Deskew calibration can deskew all signals on the CA bus. The remaining command signals (CS, CKE, and ODT) are calibrated based on the average of the deskewed CA bus.

Read Calibration
• DQSen calibration— (DDR3, DDR4, LPDDR3, RLDRAMx, and QDRx) DQSen calibration occurs before read deskew; therefore, only a single DQ bit is required to pass in order to achieve a successful read pass.
— (DDR3, DDR4, and LPDDR3) The DQSen calibration algorithm searches for the DQS preamble using a hardware state machine. The algorithm sends many back-to-back reads with a one clock cycle gap between them. The hardware state machine searches for the DQS gap while sweeping DQSen delay values. The algorithm then increments the VFIFO value, and repeats the process until a pattern is found. The process is then repeated for all other read DQS groups.
— (RLDRAMx and QDRx) The DQSen calibration algorithm does not use a hardware state machine; rather, it calibrates cycle-level delays using software, and subcycle delays using DQS tracking hardware. The algorithm requires good data in memory, and therefore relies on guaranteed writes. (Writing a burst of 0s to one location and a burst of 1s to another; back-to-back reads from these two locations are used for read calibration.) The algorithm enables DQS tracking to calibrate the phase component of DQS enable. It then issues a guaranteed write, followed by back-to-back reads. The algorithm sweeps DQSen values cycle by cycle until the read operation succeeds. The process is then repeated for all other read groups.
• Deskew calibration— Read deskew calibration is performed before write leveling, and must be performed at least twice: once before write calibration, using simple data patterns from guaranteed writes, and again after write calibration, using complex data patterns. The deskew calibration algorithm performs a guaranteed write, and then sweeps dqs_in delay values from low to high, to find the right edge of the read window. The algorithm then sweeps dq_in delay values from low to high, to find the left edge of the read window. Updated dqs_in and dq_in delay values are then applied to center the read window. The algorithm then repeats the process for all data pins.
• Vref-In calibration— Read Vref-In calibration begins by programming Vref-In with an arbitrary value. The algorithm then sweeps the Vref-In value from the starting value to both ends, and measures the read window for each value. The algorithm selects the Vref-In value which provides the maximum read window.
• LFIFO calibration— Read LFIFO calibration normalizes read delays between groups. The PHY must present all data to the controller as a single data bus. The LFIFO latency must be large enough for the slowest read data group, and large enough to allow proper synchronization across FIFOs.

Write Calibration
• Leveling calibration— Write leveling calibration aligns the write strobe and clock to the memory clock, to compensate for skews. In general, leveling calibration tries a variety of delay values to determine the edges of the write window, and then selects an appropriate value to center the window. The details of the algorithm vary, depending on the memory protocol.
— (DDRx, LPDDR3) Write leveling occurs before write deskew; therefore, only one successful DQ bit is required to register a pass. Write leveling staggers the DQ bus to ensure that at least one DQ bit falls within the valid write window.
— (RLDRAMx) Optimizes for the CK-versus-DK relationship.
— (QDR-IV) Optimizes for the CK-versus-DK relationship. Write leveling is covered by address and command deskew using the loopback mode.
— (QDR II/II+/Xtreme) The K clock is the only clock; therefore, write leveling is not required.
• Deskew calibration— Performs per-bit deskew of write data, relative to the write strobe and clock. Write deskew calibration does not change dqs_out delays; the write clock is aligned to the CK clock during write leveling.
• VREF-Out calibration— (DDR4) Calibrates the Vref level at the memory device. The VREF-Out calibration algorithm is similar to the VREF-In calibration algorithm.

3.9.4 Calibration Flowchart

The following flowchart illustrates the calibration flow.

Figure 102. Calibration Flowchart

3.9.5 Periodic OCT Recalibration

Periodic OCT recalibration improves the accuracy of the on-chip termination values used by DDR4 pseudo-open-drain (POD) I/Os. This feature periodically invokes the user-mode OCT calibration engine and updates the I/O buffer termination settings, to compensate for variations in calibrated OCT settings caused by large changes in device operating temperature. This feature is automatically enabled for DDR4 memory interfaces, unless the IP does not meet the technical requirements or you explicitly disable the feature in the parameter editor.
3.9.4 Calibration Flowchart

The following flowchart illustrates the calibration flow.

Figure 102. Calibration Flowchart

3.9.5 Periodic OCT Recalibration

Periodic OCT recalibration improves the accuracy of the on-chip termination values used by DDR4 Pseudo-Open Drain (POD) I/Os. This feature periodically invokes the user-mode OCT calibration engine and updates the I/O buffer termination settings, to compensate for variations in calibrated OCT settings caused by large changes in device operating temperature. This feature is automatically enabled for DDR4 memory interfaces unless the IP does not meet the technical requirements, or you explicitly disable the feature in the parameter editor.

3.9.5.1 Operation

The periodic OCT recalibration engine refreshes the calibrated OCT settings for DDR4 I/O buffers every 500 ms. To ensure data integrity, there is a momentary pause in user traffic while the OCT settings are refreshed; however, the process of OCT calibration is decoupled from the actual update to the I/O buffers, to minimize disruption of user traffic.

The calibration process uses the external RZQ reference resistor to determine the optimal settings for the I/O buffer, to meet the specified calibrated I/O standards on the FPGA. OCT calibration affects only the I/O pin that is connected to the RZQ resistor; therefore, memory traffic is not interrupted during the calibration phase.

Upon completion of the calibration process, the updated calibration settings are applied to the I/O buffers. The memory traffic is halted momentarily by placing the memory into self-refresh mode; this ensures that the data bus is idle and no glitches are created by the I/O buffers during the buffer update. The buffer is updated as soon as the memory enters self-refresh mode. The controller remains in self-refresh mode until a new read or write request is detected on the Avalon bus; OCT calibration continues to occur even while the memory is in self-refresh mode. Upon detection of a new command, the controller issues a self-refresh exit command to the memory, followed by a memory-side ZQ calibration short duration (ZQCS) command. Memory traffic resumes when the memory DLL has relocked.

If you disable the periodic OCT recalibration engine, the calibration process occurs only once, during device configuration. In this operating mode, the calibrated OCT settings can vary across temperature, as specified by the calibration accuracy ranges listed in the Arria 10 Device Handbook. The DDR external timing report automatically factors in the effect of enabling or disabling the periodic OCT recalibration engine when calculating the total amount of external I/O transfer margin.

3.9.5.2 Technical Restrictions

Certain criteria must be met in order to use periodic OCT recalibration. The periodic OCT recalibration engine is enabled only when all of the following criteria are met:

• The memory interface is configured to use the Altera Hard Memory Controller for DDR4.
• The memory interface is configured for either DDR4 UDIMM or component topologies. RDIMM and LRDIMM topologies are not supported.
• The memory interface is not used with the hardened processor subsystem.
• The memory interface does not use Ping Pong PHY.
• The memory interface does not use calibrated I/O standards for address, command, or clock signals.
• The memory interface uses calibrated I/O standards for the data bus.
• The memory interface does not use the memory-mapped register (MMR) interface of the HMC, including ECC modes.
• You have not explicitly disabled periodic OCT recalibration in the parameter editor.
• The specified device is a production-level device (that is, not an ES/ES2/ES3 class device).

Periodic OCT recalibration requires that each EMIF instance in the design employ a dedicated RZQ resistor. Because this restriction cannot be detected at IP generation time, you must explicitly disable the periodic OCT recalibration engine for a given interface if it shares an RZQ resistor with another interface. Ensure that you observe this restriction when automatically upgrading EMIF IP from older versions of the Quartus Prime software.
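The recalibration cycle described in the Operation section above can be summarized as follows. This C sketch is a minimal illustration, assuming hypothetical hooks into the calibration CPU and the hard memory controller; the real engine is implemented inside the EMIF IP and is not user-visible at this level.

    /* One periodic OCT recalibration cycle (invoked every 500 ms).
     * All functions are hypothetical stand-ins. */
    #include <stdbool.h>

    extern void run_user_mode_oct_cal(void);          /* uses the RZQ pin only  */
    extern void trigger_self_refresh_entry(void);     /* HMC flushes and enters */
    extern bool memory_clocks_stopped(void);          /* HMC status poll        */
    extern void apply_oct_codewords(void);            /* update I/O buffers     */
    extern void enable_auto_self_refresh_exit(void);  /* exit on next command   */

    static void periodic_oct_recal_cycle(void)
    {
        /* Calibration touches only the RZQ pin, so user traffic
         * continues while new codewords are computed. */
        run_user_mode_oct_cal();

        /* Idle the data bus before updating the buffers. */
        trigger_self_refresh_entry();
        while (!memory_clocks_stopped())
            ;                                /* wait for self-refresh */

        /* Glitch-free update of the termination settings. */
        apply_oct_codewords();

        /* The controller exits self-refresh on the next read or write,
         * issuing self-refresh exit plus a memory-side ZQCS command. */
        enable_auto_self_refresh_exit();
    }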
3.9.5.3 Efficiency Impact

The periodic OCT recalibration engine must interrupt user traffic for a short period of time in order to update the I/O buffer termination settings. The exact flow of operations executed by the recalibration engine that affects memory traffic is described below:

1. Enter Self-Refresh Mode. The EMIF calibration CPU triggers self-refresh entry on the hard memory controller. The controller flushes all pending operations, precharges all banks, and issues the self-refresh command. This operation introduces a delay of approximately 25 memory clock cycles (precharge all and self-refresh entry commands).
2. Confirm Self-Refresh Mode. The EMIF calibration CPU polls the hard memory controller to confirm that the clocks have stopped. This operation introduces no delay.
3. Issue Codeword Update. The EMIF calibration CPU triggers the user-mode OCT logic to update code words. This operation introduces a delay of 50-100 ns, depending on the device speed grade.
4. Allow Exit from Self-Refresh Mode. The EMIF calibration CPU enables the automatic self-refresh exit logic. This operation introduces a delay of 50-100 ns, depending on the device speed grade.
5. Wait for Memory Traffic. The hard memory controller waits for an incoming read or write command on the Avalon bus. The delay introduced by this operation varies, depending on the user application.
6. Exit Self-Refresh Mode. The hard memory controller issues the self-refresh exit command and a simultaneous memory-side ZQ calibration (ZQCS) command. The delay introduced by this operation varies according to the device speed bin (up to approximately 1000 memory clock cycles for the fastest memory devices).

The efficiency impact on throughput-sensitive workloads is less than one percent, even under worst-case scenarios with all banks active. However, be aware that the first command issued after the hard memory controller exits self-refresh mode incurs the latency overhead of waiting for the memory DLL to relock after the self-refresh exit command is issued by the hard memory controller. Contact Intel FPGA Technical Services for information on how to manually trigger or inhibit periodic OCT updates for applications that are sensitive to latency.

3.10 Back-to-Back User-Controlled Refresh Usage in Arria 10

The following diagram illustrates the back-to-back refresh model for optimized hard memory controller (HMC) performance in Arria 10 devices. For optimal performance, ensure that you deassert the Refresh request after receiving the acknowledgement pulse. You can implement a timer to track tRFC before asserting the next Refresh request. Failure to deassert the Refresh request can delay memory access to the rank not in refresh.

3.11 Arria 10 EMIF and SmartVID

Arria 10 EMIF IP can be used with the SmartVID voltage management system, to achieve reduced power consumption.

Note: Arria 10 HPS EMIF IP does not currently support SmartVID.

The SmartVID controller allows the FPGA to operate at a reduced Vcc, while maintaining performance. Because the SmartVID controller can adjust Vcc up or down in response to power requirements and temperature, it can have an impact on external memory interface performance.
When used with the SmartVID controller, the EMIF IP implements a handshake protocol to ensure that EMIF calibration does not begin until after voltage adjustment has completed.

In extended speed grade devices, voltage adjustment occurs once when the FPGA is powered up, and no further voltage adjustments occur. External memory calibration occurs after this initial voltage adjustment has completed. EMIF specifications are expected to be slightly lower in extended speed grade devices using SmartVID than in devices not using SmartVID.

In industrial speed grade devices, voltage adjustment occurs at power-up, and may also occur during operation, in response to temperature changes. External memory interface calibration does not occur until after the initial voltage adjustment at power-up. However, the external memory interface is not recalibrated in response to subsequent voltage adjustments that occur during operation. As a result, EMIF specifications for industrial speed grade devices using SmartVID are expected to be lower than for extended speed grade devices.

Using Arria 10 EMIF IP with SmartVID

To employ Arria 10 EMIF IP with SmartVID, follow these steps:

1. Ensure that the Quartus Prime project and Qsys system are configured to use VID components. This step exposes the vid_cal_done_persist interface on instantiated EMIF IP, which is required for communicating with the SmartVID controller.
2. Instantiate the SmartVID controller, using an I/O PLL IP core to drive the 125 MHz vid_clk and the 25 MHz jtag_core_clk inputs of the SmartVID controller. Note: Do not connect the emif_usr_clk signal to either the vid_clk or jtag_core_clk inputs. Doing so would hold both the EMIF IP and the SmartVID controller in a perpetual reset condition.
3. Instantiate the Arria 10 EMIF IP.
4. Connect the vid_cal_done_persist signal from the EMIF IP to the cal_done_persistent signal on the SmartVID controller. This connection enables handshaking between the EMIF IP and the SmartVID controller, which allows the EMIF IP to delay memory calibration until after voltage levels have stabilized. Note: The EMIF vid_cal_done_persist interface becomes available only when a VID-enabled device is selected.

Related Links
SmartVID Controller IP Core User Guide

3.12 Hard Memory Controller Rate Conversion Feature

The hard memory controller's rate conversion feature allows the hard memory controller and PHY to run at half-rate, even though user logic is configured to run at quarter-rate.

To facilitate timing closure, you may choose to clock your core user logic at quarter-rate, resulting in easier timing closure at the expense of increased area and latency. To improve efficiency and help reduce overall latency, you can run the hard memory controller and PHY at half-rate. The rate conversion feature converts traffic from the FPGA core to the hard memory controller from quarter-rate to half-rate, and traffic from the hard memory controller to the FPGA core from half-rate to quarter-rate. From the perspective of user logic inside the FPGA core, the effect is the same as if the hard memory controller were running at quarter-rate.

The rate conversion feature is enabled automatically during IP generation whenever all of the following conditions are met:

• The hard memory controller is in use.
• User logic runs at quarter-rate.
• The interface targets either an ES2 or production device.
• Running the hard memory controller at half-rate does not exceed the fMax specification of the hard memory controller and hard PHY.

When the rate conversion feature is enabled, the following informational message is displayed in the IP generation GUI: "PHY and controller running at 2x the frequency of user logic for improved efficiency."

Related Links
Arria 10 Core Fabric and General Purpose I/Os Handbook

3.13 Back-to-Back User-Controlled Refresh for Hard Memory Controller

The following waveform illustrates the recommended Arria 10 model for back-to-back user-controlled refreshes, for optimized hard memory controller performance.

Figure 103.

You should deassert the refresh request after the refresh acknowledgement pulse is received. You can implement a timer to keep track of the tRFC status before asserting a refresh request. Failure to deassert the Refresh request can delay access to the rank not in refresh.

3.14 Compiling Arria 10 EMIF IP with the Quartus Prime Software

3.14.1 Instantiating the Arria 10 EMIF IP

Depending on your workflow, you may instantiate your IP with Qsys or with the IP Catalog.

Instantiating with Qsys

If you instantiate your IP as part of a system in Qsys, follow the Qsys documentation for information on instantiating the IP in a Quartus Prime project.

Instantiating with the IP Catalog

If you generated your IP with the IP Catalog, you must add the Quartus Prime IP file (.qip) to your Quartus Prime project. The .qip file identifies the names and locations of the files that compose the IP. After you add the .qip file to your project, you can instantiate the memory interface in the RTL.

3.14.2 Setting I/O Assignments in Arria 10 EMIF IP

The .qip file contains the I/O standard and I/O termination assignments required by the memory interface pins for proper operation. The assignment values are based on input that you provide during generation. Unlike earlier device families, for Arria 10 EMIF IP you do not need to run a <instance_name>_pin_assignments.tcl script to add the assignments into the Quartus Prime Settings File (.qsf). The system reads and applies the assignments from the .qip file during every compilation, regardless of how you name the memory interface pins in the top-level design component. No new assignments are created in the project's .qsf file during compilation.

Note that I/O assignments in the .qsf file must specify the names of your top-level pins as target (-to), and you must not include the -entity or -library options. Consult the generated .qip file for the set of I/O assignments that are provided with the IP.

Changing I/O Assignments

You should not make changes to the generated .qip file, because any changes are overwritten and lost when you regenerate the IP. If you want to override an assignment made in the .qip file, add the desired assignment to the project's .qsf file. Assignments in the .qsf file always take precedence over assignments in the .qip file.

3.15 Debugging Arria 10 EMIF IP

You can debug hardware failures by connecting to the EMIF Debug Toolkit or by exporting an Avalon-MM slave port, from which you can access information gathered during calibration. You can also connect to this port to mask ranks and to request recalibration.
You can access the exported Avalon-MM port in two ways:

• Via the External Memory Interface Debug Toolkit
• Via On-Chip Debug (core logic on the FPGA)

3.15.1 External Memory Interface Debug Toolkit

The External Memory Interface Debug Toolkit provides access to data collected by the Nios II sequencer during memory calibration, and allows you to perform certain tasks.

The External Memory Interface Debug Toolkit provides access to data including the following:

• General interface information, such as protocol and interface width
• Calibration results per group, including pass/fail status, failure stage, and delay settings

You can also perform the following tasks:

• Mask ranks from calibration (you might do this to skip specific ranks)
• Request recalibration of the interface

3.15.2 On-Chip Debug for Arria 10

The On-Chip Debug feature allows user logic to access the same debug capabilities as the External Memory Interface Toolkit. You can use On-Chip Debug to monitor the calibration results of an external memory interface, without a connected computer.

To use On-Chip Debug, you need a C header file, which is provided as part of the external memory interface IP. The C header file defines data structures that contain calibration data, and definitions of the commands that can be sent to the memory interface. The On-Chip Debug feature accesses the data structures through the Avalon-MM port that is exposed by the EMIF IP when you turn on the debugging features.

3.15.3 Configuring Your EMIF IP for Use with the Debug Toolkit

The Arria 10 EMIF Debug Interface IP core contains the access point through which the EMIF Debug Toolkit reads calibration data collected by the Nios II sequencer.

Connecting an EMIF IP Core to an Arria 10 EMIF Debug Interface

For the EMIF Debug Toolkit to access the calibration data for an Arria 10 EMIF IP core, you must connect one of the EMIF cores in each I/O column to an Arria 10 EMIF Debug Interface IP core. Subsequent EMIF IP cores in the same column must connect in a daisy chain to the first. There are two ways to add the Arria 10 EMIF Debug Interface IP core to your design:

• When you generate your EMIF IP core, on the Diagnostics tab, select Add EMIF Debug Interface for the EMIF Debug Toolkit/On-Chip Debug Port; you do not have to separately instantiate an Arria 10 EMIF Debug Interface core. This method does not export an Avalon-MM slave port. You can use this method if you require only EMIF Debug Toolkit access to this I/O column; that is, if you do not require On-Chip Debug Port access or PHYLite reconfiguration access.
• When you generate your EMIF IP core, on the Diagnostics tab, select Export for the EMIF Debug Toolkit/On-Chip Debug Port. Then, separately instantiate an Arria 10 EMIF Debug Interface core and connect its to_ioaux interface to the cal_debug interface on the EMIF IP core. This method is appropriate if you also want On-Chip Debug Port access to this I/O column, or PHYLite reconfiguration access.

For each of the above methods, you must assign a unique interface ID to each external memory interface in the I/O column, to identify that interface in the Debug Toolkit. You can assign an interface ID using the dropdown list that appears when you enable the Debug Toolkit/On-Chip Debug Port option.
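When the cal_debug port is exported, On-Chip Debug logic (for example, a Nios II program) can read the calibration data over that Avalon-MM interface, as described in section 3.15.2 above. The sketch below is hypothetical: the base address comes from your Qsys address map, and the register offset and status bit shown here are placeholders; use the definitions from the C header file supplied with your generated EMIF IP.

    /* Minimal sketch of an On-Chip Debug read over the exported
     * cal_debug Avalon-MM port. Offsets and bit masks are hypothetical;
     * take the real ones from the C header shipped with the EMIF IP. */
    #include <stdint.h>

    #define CAL_DEBUG_BASE  0x00010000u   /* from your Qsys address map */
    #define CAL_STATUS_OFF  0x0u          /* hypothetical offset        */
    #define CAL_STATUS_PASS 0x1u          /* hypothetical "pass" bit    */

    static inline uint32_t cal_debug_rd(uint32_t off)
    {
        return *(volatile uint32_t *)(CAL_DEBUG_BASE + off);
    }

    int emif_calibration_passed(void)
    {
        return (cal_debug_rd(CAL_STATUS_OFF) & CAL_STATUS_PASS) != 0;
    }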
Daisy-Chaining Additional EMIF IP Cores for Debugging

After you have connected an Arria 10 EMIF Debug Interface to one of the EMIF IP cores in an I/O column, you must connect subsequent EMIF IP cores in that column in a daisy-chain manner. If you do not require debug capabilities for a particular EMIF IP core, you do not have to connect that core to the daisy chain.

To create a daisy chain of EMIF IP cores, follow these steps:

1. On the first EMIF IP core, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port to create an Avalon-MM interface called cal_debug_out.
2. On the second EMIF IP core, select Export as the EMIF Debug Toolkit/On-Chip Debug Port mode, to export an Avalon-MM interface called cal_debug.
3. Connect the cal_debug_out interface of the first EMIF core to the cal_debug interface of the second EMIF core.
4. To connect more EMIF cores to the daisy chain, select the Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port option on the second core, and connect it to the next core using the Export option, as described above. Repeat the process for subsequent EMIF cores.

If you place any PHYLite cores with dynamic reconfiguration enabled into the same I/O column as an EMIF IP core, you should instantiate and connect the PHYLite cores in a similar way. See the Altera PHYLite for Memory Megafunction User Guide for more information.

3.15.4 Arria 10 EMIF Debugging Examples

This topic provides examples of debugging a single external memory interface, and of adding additional EMIF instances to an I/O column.

Debugging a Single External Memory Interface

1. Under EMIF Debug Toolkit/On-Chip Debug Port, select Add EMIF Debug Interface. (If you want to use the On-Chip Debug Port instead of the EMIF Debug Toolkit, select Export instead.)

Figure 104. EMIF With Debug Interface Added (No Additional Ports)

2. If you want to connect additional EMIF or PHYLite components in this I/O column, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port.

Figure 105. EMIF With cal_debug Avalon Master Exported

Adding Additional EMIF Instances to an I/O Column

1. Under EMIF Debug Toolkit/On-Chip Debug Port, select Export.

Figure 106. EMIF With cal_debug Avalon Slave Exported

2. Specify a unique interface ID for this EMIF instance.
3. If you want to connect additional EMIF or PHYLite components in this I/O column, select Enable Daisy-Chaining for EMIF Debug Toolkit/On-Chip Debug Port.
Figure 107. EMIF With Both cal_debug Master and Slave Exported

4. Connect the cal_debug Avalon master, clock, and reset interfaces of the previous component to the cal_debug Avalon slave, clock, and reset interfaces of this component.

Figure 108. EMIF Components Connected

3.16 Arria 10 EMIF for Hard Processor Subsystem

The Arria 10 EMIF IP can enable the Arria 10 Hard Processor Subsystem (HPS) to access external DRAM memory devices. To enable connectivity between the Arria 10 HPS and the Arria 10 EMIF IP, you must create and configure an instance of the Arria 10 External Memory Interface for HPS IP core, and use Qsys to connect it to the Arria 10 Hard Processor Subsystem instance in your system.

Supported Modes

The Arria 10 Hard Processor Subsystem is compatible with the following external memory configurations:

Protocol: DDR3, DDR4
Maximum memory clock frequency: DDR3: 1.067 GHz; DDR4: 1.333 GHz
Configuration: Hard PHY with hard memory controller
Clock rate of PHY and hard memory controller: Half-rate
Data width (without ECC): 16-bit, 32-bit, 64-bit (see note below)
Data width (with ECC): 24-bit, 40-bit, 72-bit (see note below)
DQ width per group: x8
Maximum number of I/O lanes for address/command: 3
Memory format: Discrete, UDIMM, SODIMM, RDIMM
Ranks / CS# width: Up to 2

Note: Only Arria 10 devices with a special ordering code support 64-bit and 72-bit data widths; all other devices support only up to 32-bit data widths.

Note: Arria 10 HPS EMIF IP does not currently support SmartVID.

3.16.1 Restrictions on I/O Bank Usage for Arria 10 EMIF IP with HPS

Only certain Arria 10 I/O banks can be used to implement Arria 10 EMIF IP with the Arria 10 Hard Processor Subsystem. If both Arria 10 HPS EMIF IP and non-HPS Arria 10 EMIF IP are implemented, you must place the non-HPS EMIF IP in a different I/O column than the HPS EMIF IP.

The restrictions on I/O bank usage result from the Arria 10 HPS having hard-wired connections to the EMIF circuits in the I/O banks closest to the HPS. For any given EMIF configuration, the pin-out of the EMIF-to-HPS interface is fixed.
The following diagram illustrates the use of I/O banks and lanes for various EMIF-HPS data widths:

Figure 109. I/O Banks and Lanes Usage (data widths of 16, 32, and 64 bits, with and without ECC, mapped across the lanes of I/O banks 2I, 2J, and 2K; I/O bank 2L is not used by HPS EMIF)

You should refer to the pinout file for your device and package for detailed information. Banks and pins used for HPS access to a DDR interface are labeled HPS_DDR in the HPS Function column of the pinout file.

By default, the Arria 10 External Memory Interface for HPS IP core, together with the Quartus Prime Fitter, automatically implements the correct pin-out for HPS EMIF, without your having to implement additional constraints. If, for any reason, you must modify the default pin-out, you must adhere to the following requirements, which are specific to HPS EMIF (a code sketch that mechanically checks the per-lane rules appears later in this section):

1. Within a single data lane (which implements a single x8 DQS group):
a. DQ pins must use the pins at indices 1, 2, 3, 6, 7, 8, 9, and 10. You may swap the locations of DQ bits (that is, you may swap the locations of DQ[0] and DQ[3]) so long as the resulting pin-out uses pins at these indices only.
b. The DM/DBI pin must use the pin at index 11. There is no flexibility.
c. DQS/DQS# must use the pins at indices 4 and 5. There is no flexibility.
2. Assignment of data lanes must be as illustrated in the above figure. You may swap the locations of entire byte lanes (that is, you may swap the locations of byte 0 and byte 3) so long as the resulting pin-out uses only the lanes permitted by your HPS EMIF configuration, as shown in the above figure.
3. You must not change the placement of the address and command pins from the default.
4. You may place the alert# pin at any available pin location, in either a data lane or an address and command lane.

To override the default generated pin assignments, comment out the relevant HPS_LOCATION assignments in the .qip file, and add your own location assignments (using set_location_assignment) in the .qsf file.

3.16.2 Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces

The External Memory Interface Debug Toolkit is not directly compatible with Arria 10 HPS interfaces. To debug your Arria 10 HPS interface using the EMIF Debug Toolkit, you should create an identically parameterized, non-HPS version of your interface, and apply the EMIF Debug Toolkit to that interface. When you finish debugging this non-HPS interface, you can apply any needed changes to your HPS interface and continue your design development.
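The per-lane placement rules in section 3.16.1 above lend themselves to a mechanical check before you hand-edit pin locations. The following C sketch encodes rule 1 only; the lane descriptor type is hypothetical and exists purely for illustration (it does not, for example, detect duplicate indices or validate lane assignment against the figure).

    /* Check HPS EMIF rule 1: within a x8 data lane, DQ bits may occupy
     * only indices 1, 2, 3, 6, 7, 8, 9, 10; DM/DBI must sit at index 11;
     * DQS/DQS# must sit at indices 4 and 5. */
    #include <stdbool.h>

    struct lane_pinout {          /* hypothetical descriptor */
        int dq_idx[8];            /* pin index of DQ[0..7] in the lane */
        int dm_idx;               /* pin index of DM/DBI */
        int dqs_p_idx, dqs_n_idx; /* pin indices of DQS and DQS# */
    };

    static bool dq_index_legal(int idx)
    {
        static const int legal[8] = { 1, 2, 3, 6, 7, 8, 9, 10 };
        for (int i = 0; i < 8; i++)
            if (legal[i] == idx)
                return true;
        return false;
    }

    bool lane_pinout_legal(const struct lane_pinout *p)
    {
        /* DQ bits may be swapped, but only among the eight legal indices. */
        for (int b = 0; b < 8; b++)
            if (!dq_index_legal(p->dq_idx[b]))
                return false;
        return p->dm_idx == 11 &&
               p->dqs_p_idx == 4 && p->dqs_n_idx == 5;
    }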
3.17 Arria 10 EMIF Ping Pong PHY

Ping Pong PHY allows two memory interfaces to share the address and command bus through time multiplexing. Compared to having two independent interfaces that allocate address and command lanes separately, Ping Pong PHY achieves the same throughput with fewer resources, by sharing the address and command lanes. In Arria 10 EMIF, Ping Pong PHY supports both half-rate and quarter-rate interfaces for DDR3, and quarter-rate for DDR4.

3.17.1 Ping Pong PHY Feature Description

Conventionally, the address and command buses of a DDR3 or DDR4 half-rate or quarter-rate interface use 2T time—meaning that commands are issued for two full-rate clock cycles, as illustrated below.

Figure 110. 2T Command Timing

With the Ping Pong PHY, address and command signals from two independent controllers are multiplexed onto shared buses by delaying one of the controller outputs by one full-rate clock cycle. The result is 1T timing, with a new command being issued on each full-rate clock cycle. The following figure shows address and command timing for the Ping Pong PHY.

Figure 111. 1T Command Timing Use by Ping Pong PHY

3.17.2 Ping Pong PHY Architecture

In Arria 10 EMIF, the Ping Pong PHY feature can be enabled only with the hard memory controller, where two hard memory controllers are instantiated—one for the primary interface and one for the secondary interface. The hard memory controller I/O bank of the primary interface is used for address and command, and is always adjacent to and above the hard memory controller bank of the secondary interface. All four lanes of the primary hard memory controller bank are used for address and command. The I/O bank containing the secondary hard memory controller must have at least one lane from the secondary interface.

The following example shows a 2x16 Ping Pong PHY bank-lane configuration. The upper bank (I/O bank N) is the address and command bank, which serves both the primary and secondary interfaces. The primary hard memory controller is linked to the secondary interface by the Ping Pong bus. The lower bank (I/O bank N-1) is the secondary interface bank, which carries the data buses for both the primary and secondary interfaces. In the 2x16 case, a total of four I/O lanes are required for data, hence two banks in total are sufficient for the implementation. The data for the primary interface is routed down to the top two lanes of the secondary I/O bank, and the data for the secondary interface is routed to the bottom two lanes of the secondary I/O bank.

Figure 112. 2x16 Ping Pong PHY I/O Bank-Lane Configuration

A 2x32 interface can be implemented using three tiles, so long as the tile containing the secondary hard memory controller has at least one secondary data lane. The order of the lanes does not matter.

Figure 113. 2x32 Ping Pong PHY I/O Bank-Lane Configuration
3.17.3 Ping Pong PHY Limitations

Ping Pong PHY supports up to two ranks per memory interface. In addition, the maximum data width is x72, which is half the maximum width of x144 for a single interface.

Ping Pong PHY uses all lanes of the address and command I/O bank for address and command. For information on the pin allocations of the DDR3 and DDR4 address and command I/O bank, refer to DDR3 Scheme 1 and DDR4 Scheme 3, in External Memory Interface Pin Information for Arria 10 Devices, on www.altera.com.

An additional limitation is that I/O lanes may be left unused when you instantiate multiple pairs of Ping Pong PHY interfaces. The following diagram shows two pairs of x8 Ping Pong controllers (a total of four interfaces). Lanes highlighted in yellow are not driven by any memory interface (unused lanes and pins can still serve as general-purpose I/Os). Even with some I/O lanes left unused, the Ping Pong PHY approach is still beneficial in terms of resource usage, compared to independent interfaces. Memory widths of 24 bits and 40 bits have a similar situation, while 16-bit, 32-bit, and 64-bit memory widths do not suffer this limitation.

Figure 114. Two Pairs of x8 Ping Pong PHY Controllers

Related Links
External Memory Interface Pin Information for Arria 10 Devices

3.17.4 Ping Pong PHY Calibration

A Ping Pong PHY interface is calibrated as a regular interface of double width. Calibration of a Ping Pong PHY interface incorporates two sequencers: one on the primary hard memory controller I/O bank, and one on the secondary hard memory controller I/O bank. To ensure that the two sequencers issue instructions on the same memory clock cycle, the Nios II processor configures the sequencer on the primary hard memory controller to receive a token from the secondary interface, ignoring any commands from the Avalon bus. Additional delays are programmed on the secondary interface to allow for the passing of the token from the sequencer on the secondary hard memory controller tile to the sequencer on the primary hard memory controller tile. During calibration, the Nios II processor assumes that commands are always issued from the sequencer on the primary hard memory controller I/O bank.
After calibration, the Nios II processor adjusts the delays for use with the primary and secondary hard memory controllers.

3.17.5 Using the Ping Pong PHY

The following steps describe how to use the Ping Pong PHY for Arria 10 EMIF:

1. Configure a single memory interface according to your requirements.
2. Select Instantiate two controllers sharing a Ping Pong PHY on the General tab in the parameter editor. The Quartus Prime software replicates the interface, resulting in two memory controllers and a shared PHY. The system configures the I/O bank-lane structure, without further input from you.

3.17.6 Ping Pong PHY Simulation Example Design

The following figure illustrates a top-level block diagram of a generated Ping Pong PHY simulation example design, using two I/O banks. Functionally, the IP interfaces with user traffic separately, as it would with two independent memory interfaces. You can also generate synthesizable example designs, where the external memory interface IP interfaces with a traffic generator.

Figure 115. Ping Pong PHY Simulation Example Design

3.18 AFI 4.0 Specification

The Altera PHY interface (AFI) 4.0 defines communication between the controller and the physical layer (PHY) in the external memory interface IP.

The AFI is a single-data-rate interface, meaning that data is transferred on the rising edge of each clock cycle. Most memory interfaces, however, operate at double data rate, transferring data on both the rising and falling edges of the clock signal. If the AFI interface is to directly control a double-data-rate signal, two single-data-rate bits must be transmitted on each clock cycle; the PHY then sends out one bit on the rising edge of the clock and one bit on the falling edge. The AFI convention is to send the low part of the data first and the high part second, as shown in the following figure.

Figure 116. Single Versus Double Data Rate Transfer

3.18.1 Bus Width and AFI Ratio

In cases where the AFI clock frequency is one-half or one-quarter of the memory clock frequency, the AFI data must be twice or four times as wide, respectively, as the corresponding memory data. The ratio between AFI clock and memory clock frequencies is referred to as the AFI ratio. (A half-rate AFI interface has an AFI ratio of 2, while a quarter-rate interface has an AFI ratio of 4.)

In general, the width of an AFI signal depends on the following three factors:

• The size of the equivalent signal on the memory interface. For example, if a[15:0] is a DDR3 address input and the AFI clock runs at the same speed as the memory interface, the equivalent afi_addr bus is 16 bits wide.
• The data rate of the equivalent signal on the memory interface. For example, if d[7:0] is a double-data-rate QDR II input data bus and the AFI clock runs at the same speed as the memory interface, the equivalent afi_write_data bus is 16 bits wide.
• The AFI ratio. For example, if cs_n is a single-bit DDR3 chip select input and the AFI clock runs at half the speed of the memory interface, the equivalent afi_cs_n bus is 2 bits wide.

The following formula summarizes the three factors described above:

AFI_width = memory_width * signal_rate * AFI_RATE_RATIO

Note: The above formula is a general rule, but not all signals obey it. For definite signal-size information, refer to the specific table.
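As a worked illustration of the formula, consider a hypothetical quarter-rate interface (AFI_RATE_RATIO = 4) with a 32-bit double-data-rate DQ bus and a single-bit, single-data-rate chip select. The numbers below simply apply the general rule; as noted above, some signals deviate from it.

    /* Worked example of AFI_width = memory_width * signal_rate *
     * AFI_RATE_RATIO for a hypothetical quarter-rate interface. */
    #include <stdio.h>

    int main(void)
    {
        const int AFI_RATE_RATIO = 4;               /* quarter rate       */

        int afi_dq_width = 32 * 2 * AFI_RATE_RATIO; /* DDR data: 256      */
        int afi_cs_width = 1 * 1 * AFI_RATE_RATIO;  /* SDR chip select: 4 */

        printf("afi_wdata = %d bits, afi_cs_n = %d bits\n",
               afi_dq_width, afi_cs_width);
        return 0;
    }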
3.18.2 AFI Parameters

The following tables list Altera PHY interface (AFI) parameters for AFI 4.0. Not all parameters are used for all protocols. The parameters described in the following tables affect the width of AFI signal buses. Parameters prefixed by MEM_IF_ refer to the signal size at the interface between the PHY and the memory device.

Table 46. Ratio Parameters

AFI_RATE_RATIO: The ratio between the AFI clock frequency and the memory clock frequency. For full-rate interfaces this value is 1, for half-rate interfaces the value is 2, and for quarter-rate interfaces the value is 4.
DATA_RATE_RATIO: The number of data bits transmitted per clock cycle. For single-data-rate protocols this value is 1, and for double-data-rate protocols this value is 2.
ADDR_RATE_RATIO: The number of address bits transmitted per clock cycle. For single-data-rate address protocols this value is 1, and for double-data-rate address protocols this value is 2.

Table 47. Memory Interface Parameters

MEM_IF_ADDR_WIDTH: The width of the address bus on the memory device(s). For LPDDR3, the width of the CA bus, which encodes commands and addresses together.
MEM_IF_BANKGROUP_WIDTH: The width of the bank group bus on the interface to the memory device(s). Applicable to DDR4 only.
MEM_IF_BANKADDR_WIDTH: The width of the bank address bus on the interface to the memory device(s). Typically, the log2 of the number of banks. Not applicable to DDR4.
MEM_IF_CS_WIDTH: The number of chip selects on the interface to the memory device(s).
MEM_IF_CKE_WIDTH: The number of CKE signals on the interface to the memory device(s). This usually equals MEM_IF_CS_WIDTH, except for certain DIMM configurations.
MEM_IF_ODT_WIDTH: The number of ODT signals on the interface to the memory device(s). This usually equals MEM_IF_CS_WIDTH, except for certain DIMM configurations.
MEM_IF_WRITE_DQS_WIDTH: The number of DQS (or write clock) signals on the write interface; for example, the number of DQS groups.
MEM_IF_CLK_PAIR_COUNT: The number of CK/CK# pairs.
MEM_IF_DQ_WIDTH: The number of DQ signals on the interface to the memory device(s). For single-ended interfaces such as QDR II, this value is the number of D or Q signals.
MEM_IF_DM_WIDTH: The number of data mask pins on the interface to the memory device(s).

Table 48. Derived AFI Parameters

AFI_ADDR_WIDTH = MEM_IF_ADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
AFI_BANKGROUP_WIDTH = MEM_IF_BANKGROUP_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO (applicable to DDR4 only)
AFI_BANKADDR_WIDTH = MEM_IF_BANKADDR_WIDTH * AFI_RATE_RATIO * ADDR_RATE_RATIO
AFI_CONTROL_WIDTH = AFI_RATE_RATIO * ADDR_RATE_RATIO
AFI_CS_WIDTH = MEM_IF_CS_WIDTH * AFI_RATE_RATIO
AFI_CKE_WIDTH = MEM_IF_CKE_WIDTH * AFI_RATE_RATIO
AFI_ODT_WIDTH = MEM_IF_ODT_WIDTH * AFI_RATE_RATIO
AFI_DM_WIDTH = MEM_IF_DM_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO
AFI_DQ_WIDTH = MEM_IF_DQ_WIDTH * AFI_RATE_RATIO * DATA_RATE_RATIO
AFI_WRITE_DQS_WIDTH = MEM_IF_WRITE_DQS_WIDTH * AFI_RATE_RATIO
AFI_LAT_WIDTH = 6
AFI_RLAT_WIDTH = AFI_LAT_WIDTH
AFI_WLAT_WIDTH = AFI_LAT_WIDTH * MEM_IF_WRITE_DQS_WIDTH
AFI_CLK_PAIR_COUNT = MEM_IF_CLK_PAIR_COUNT
AFI_WRANK_WIDTH = number of ranks * MEM_IF_WRITE_DQS_WIDTH * AFI_RATE_RATIO
AFI_RRANK_WIDTH = number of ranks * MEM_IF_READ_DQS_WIDTH * AFI_RATE_RATIO

3.18.3 AFI Signals

The following tables list Altera PHY interface (AFI) signals, grouped according to their functions. In each table, the Direction column denotes the direction of the signal relative to the PHY. For example, a signal defined as an output passes out of the PHY to the controller. The AFI specification does not include any bidirectional signals. Not all signals are used for all protocols.

3.18.3.1 AFI Clock and Reset Signals

The AFI interface provides up to two clock signals and an asynchronous reset signal.

Table 49. Clock and Reset Signals

afi_clk (Output, width 1): Clock with which all data exchanged on the AFI bus is synchronized. In general, this clock is referred to as full-rate, half-rate, or quarter-rate, depending on the ratio between the frequency of this clock and the frequency of the memory device clock.
afi_half_clk (Output, width 1): Clock signal that runs at half the speed of afi_clk. The controller uses this signal when the half-rate bridge feature is in use. This signal is optional.
afi_reset_n (Output, width 1): Asynchronous reset output signal. You must synchronize this signal to the clock domain in which you use it.

3.18.3.2 AFI Address and Command Signals

The address and command signals for AFI 4.0 encode read/write/configuration commands to send to the memory device. The address and command signals are single-data-rate signals.

Table 50. Address and Command Signals

afi_addr (Input, width AFI_ADDR_WIDTH): Address or CA bus (LPDDR3 only). ADDR_RATE_RATIO is 2 for the LPDDR3 CA bus.
afi_bg (Input, width AFI_BANKGROUP_WIDTH): Bank group (DDR4 only).
afi_ba (Input, width AFI_BANKADDR_WIDTH): Bank address. (Not applicable to LPDDR3.)
afi_cke (Input, width AFI_CLK_EN_WIDTH): Clock enable.
afi_cs_n (Input, width AFI_CS_WIDTH): Chip select signal. (The number of chip selects may not match the number of ranks; for example, RDIMMs and LRDIMMs require a minimum of 2 chip select signals for both single-rank and dual-rank configurations. Consult your memory device data sheet for information about chip select signal width. Matches the number of ranks for LPDDR3.)
afi_ras_n (Input, width AFI_CONTROL_WIDTH): RAS# (for DDR2 and DDR3 memory devices).
afi_we_n (Input, width AFI_CONTROL_WIDTH): WE# (for DDR2, DDR3, and RLDRAM II memory devices).
afi_rw_n (Input, width AFI_CONTROL_WIDTH * 2): RWA/B# (QDR-IV).
afi_cas_n (Input, width AFI_CONTROL_WIDTH): CAS# (for DDR2 and DDR3 memory devices).
afi_act_n (Input, width AFI_CONTROL_WIDTH): ACT# (DDR4).
afi_ref_n (Input, width AFI_CONTROL_WIDTH): REF# (for RLDRAM II memory devices).
afi_rst_n (Input, width AFI_CONTROL_WIDTH): RESET# (for DDR3 and DDR4 memory devices).
afi_odt (Input, width AFI_CLK_EN_WIDTH): On-die termination signal for DDR2, DDR3, and LPDDR3 memory devices. (Do not confuse this memory device signal with the FPGA's internal on-chip termination signal.)
afi_par (Input, width AFI_CS_WIDTH): Address and command parity input (DDR4). Address parity input (QDR-IV).
afi_ainv (Input, width AFI_CONTROL_WIDTH): Address inversion (QDR-IV).
afi_mem_clk_disable (Input, width AFI_CLK_PAIR_COUNT): When this signal is asserted, mem_clk and mem_clk_n are disabled. This signal is used in low-power mode.
afi_wps_n (Output, width AFI_CS_WIDTH): WPS (for QDR II/II+ memory devices).
afi_rps_n (Output, width AFI_CS_WIDTH): RPS (for QDR II/II+ memory devices).

3.18.3.3 AFI Write Data Signals

Write data signals for AFI 4.0 control the data, data mask, and strobe signals passed to the memory device during write operations.

Table 51. Write Data Signals

afi_dqs_burst (Input, width AFI_RATE_RATIO): Controls the enable on the strobe (DQS) pins for DDR2, DDR3, LPDDR2, and LPDDR3 memory devices. When this signal is asserted, mem_dqs and mem_dqsn are driven. This signal must be asserted before afi_wdata_valid to implement the write preamble, and must be driven for the correct duration to generate a correctly timed mem_dqs signal.
afi_wdata_valid (Input, width AFI_RATE_RATIO): Write data valid signal. This signal controls the output enable on the data and data mask pins.
afi_wdata (Input, width AFI_DQ_WIDTH): Write data signal to send to the memory device at double data rate. This signal controls the PHY's mem_dq output.
afi_dm (Input, width AFI_DM_WIDTH): Data mask. This signal controls the PHY's mem_dm signal for DDR2, DDR3, LPDDR2, LPDDR3, and RLDRAM II memory devices. It also directly controls the PHY's mem_dbi signal for DDR4. The mem_dm and mem_dbi features share the same port on the memory device.
afi_bws_n (Input, width AFI_DM_WIDTH): Data mask. This signal controls the PHY's mem_bws_n signal for QDR II/II+ memory devices.
afi_dinv (Input, width AFI_WRITE_DQS_WIDTH * 2): Data inversion. It directly controls the PHY's mem_dinva/b signal for QDR-IV devices.

3.18.3.4 AFI Read Data Signals

Read data signals for AFI 4.0 control the data sent from the memory device during read operations.

Table 52. Read Data Signals

afi_rdata_en_full (Input, width AFI_RATE_RATIO): Read data enable full. Indicates that the memory controller is currently performing a read operation. This signal is held high for the entire read burst. If this signal is aligned to even clock cycles, it is possible to use 1 bit even in half-rate mode (that is, AFI_RATE=2).
afi_rdata (Output, width AFI_DQ_WIDTH): Read data from the memory device. This data is considered valid only when afi_rdata_valid is asserted by the PHY.
afi_rdata_valid (Output, width AFI_RATE_RATIO): Read data valid. When asserted, this signal indicates that the afi_rdata bus is valid. If this signal is aligned to even clock cycles, it is possible to use 1 bit even in half-rate mode (that is, AFI_RATE=2).

3.18.3.5 AFI Calibration Status Signals

The PHY instantiates a sequencer, which calibrates the memory interface with the memory device and some internal components such as read FIFOs and valid FIFOs.
The sequencer reports the results of the calibration process to the controller through the calibration status signals in the AFI interface.

Table 53. Calibration Status Signals

afi_cal_success (Output, width 1): Asserted to indicate that calibration has completed successfully.
afi_cal_fail (Output, width 1): Asserted to indicate that calibration has failed.
afi_cal_req (Input, width 1): Effectively a synchronous reset for the sequencer. When this signal is asserted, the sequencer returns to the reset state; when this signal is released, a new calibration sequence begins.
afi_wlat (Output, width AFI_WLAT_WIDTH): The required write latency, in afi_clk cycles, between address/command and write data being issued at the PHY/controller interface. The afi_wlat value can be different for different groups; each group's write latency can range from 0 to 63. If write latency is the same for all groups, only the lowest 6 bits are required.
afi_rlat (Output, width AFI_RLAT_WIDTH): The required read latency, in afi_clk cycles, between address/command and read data being returned to the PHY/controller interface. Values can range from 0 to 63. Note: The afi_rlat signal is not supported for PHY-only designs. Instead, you can sample the afi_rdata_valid signal to determine when valid read data is available.

3.18.3.6 AFI Tracking Management Signals

When tracking management is enabled, the sequencer can take control of the AFI 4.0 interface at given intervals, and issue commands to the memory device to track the alignment of the internal DQS enable signal to the DQS signal returning from the memory device. The tracking management portion of the AFI 4.0 interface provides a means for the sequencer and the controller to exchange handshake signals.

Table 54. Tracking Management Signals

afi_ctl_refresh_done (Input, width 4): Handshaking signal from the controller to the tracking manager, indicating that a refresh has occurred, and waiting for a response.
afi_seq_busy (Output, width 4): Handshaking signal from the sequencer to the controller, indicating when DQS tracking is in progress.
afi_ctl_long_idle (Input, width 4): Handshaking signal from the controller to the tracking manager, indicating that it has exited a low-power state without a periodic refresh, and waiting for a response.

3.18.3.7 AFI Shadow Register Management Signals

Shadow registers are a feature that enables high-speed multi-rank support. Shadow registers allow the sequencer to calibrate each rank separately, and save the calibrated settings—such as deskew delay-chain configurations—of each rank in its own set of shadow registers. During a rank-to-rank switch, the correct set of calibrated settings is restored just in time to optimize the data valid window. The PHY relies on additional AFI signals to control which set of shadow registers to activate.

Table 55. Shadow Register Management Signals

afi_wrank (Input, width AFI_WRANK_WIDTH): Signal from the controller specifying which rank the write data is going to. The signal timing is identical to that of afi_dqs_burst; that is, afi_wrank must be asserted at the same time, and must last the same duration, as the afi_dqs_burst signal.
afi_rrank (Output, width AFI_RRANK_WIDTH): Signal from the controller specifying which rank is being read. The signal must be asserted at the same time as the afi_rdata_en signal when issuing a read command; but unlike afi_rdata_en, afi_rrank is stateful.
That is, once asserted, the signal value must remain unchanged until the controller issues a new read command to a different rank.

Both the afi_wrank and afi_rrank signals encode the rank being accessed using a one-hot scheme (for example, in a quad-rank interface, 0001, 0010, 0100, and 1000 refer to the first, second, third, and fourth rank, respectively). The ordering within the bus is the same as for other AFI signals. Specifically, the bus is ordered by time slots, for example:

Half-rate: afi_w/rrank = {T1, T0}
Quarter-rate: afi_w/rrank = {T3, T2, T1, T0}

where Tx is a rank-bit word that one-hot encodes the rank being accessed in the xth full-rate cycle.

Additional Requirements for Arria 10 Shadow Register Support

To ensure that the hardware has enough time to switch from one shadow register to another, the controller must satisfy the following minimum rank-to-rank-switch delays (tRTRS):

• Two read commands going to different ranks must be separated by a minimum of 3 full-rate cycles (in addition to the burst length delay needed to avoid collision of data bursts).
• Two write commands going to different ranks must be separated by a minimum of 4 full-rate cycles (in addition to the burst length delay needed to avoid collision of data bursts).

The Arria 10 device family supports a maximum of 4 sets of shadow registers, each for an independent set of timings. More than 4 ranks are supported if those ranks have four or fewer sets of independent timing. For example, the rank multiplication mode of an LRDIMM allows more than one physical rank to share a set of timing data as a single logical rank. Therefore, Arria 10 devices can support up to 4 logical ranks, even if that means more than 4 physical ranks.

3.18.4 AFI 4.0 Timing Diagrams

3.18.4.1 AFI Address and Command Timing Diagrams

Depending on the ratio between the memory clock and the PHY clock, different numbers of bits must be provided per PHY clock on the AFI interface. The following figures illustrate the AFI address/command waveforms in full, half, and quarter rate, respectively. The waveforms show how the AFI command phase corresponds to the memory command output. AFI command 0 corresponds to the first memory command slot, AFI command 1 corresponds to the second memory command slot, and so on.

Figure 117. AFI Address and Command Full-Rate

Figure 118. AFI Address and Command Half-Rate
Figure 119. AFI Address and Command Quarter-Rate

3.18.4.2 AFI Write Sequence Timing Diagrams

The following timing diagrams illustrate the relationships between the write command and the corresponding write data and write enable signals, in full, half, and quarter rate. For half rate and quarter rate, when the write command is sent on the first memory clock in a PHY clock (for example, afi_cs_n[0] = 0), the access is called an aligned access; otherwise it is called an unaligned access. You may use either aligned or unaligned access, or both, but you must ensure that the distance between the write command and the corresponding write data is constant on the AFI interface. For example, if a command is sent on the second memory clock in a PHY clock, the write data must also start at the second memory clock in a PHY clock.

Write sequences with wlat=0

Figure 120. AFI Write Data Full-Rate, wlat=0

The following diagrams illustrate both aligned and unaligned access. The first three write commands are aligned accesses, issued on the LSB of afi_command. The fourth write command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.

Figure 121. AFI Write Data Half-Rate, wlat=0

Figure 122. AFI Write Data Quarter-Rate, wlat=0

Write sequences with wlat=non-zero

afi_wlat is a signal from the PHY. The controller must delay the afi_dqs_burst, afi_wdata_valid, afi_wdata, and afi_dm signals by a number of PHY clock cycles equal to afi_wlat, which is a static value determined by calibration before the PHY asserts cal_success to the controller. The following figures illustrate the cases when wlat=1. Note that wlat is expressed as a number of PHY clocks; therefore wlat=1 equals a delay of 1, 2, and 4 memory clocks, respectively, in full, half, and quarter rate.

Figure 123. AFI Write Data Full-Rate, wlat=1
Figure 123. AFI Write Data Full-Rate, wlat=1
(Waveform as in the wlat=0 full-rate case, with afi_wdata_valid, afi_wdata, and afi_dm delayed by one PHY clock after each WR command.)

Figure 124. AFI Write Data Half-Rate, wlat=1
(Waveform as in the wlat=0 half-rate case, with afi_wdata_valid[1:0], afi_wdata[1:0], and afi_dm[1:0] delayed by one PHY clock, that is, two memory clocks.)

Figure 125. AFI Write Data Quarter-Rate, wlat=1
(Waveform as in the wlat=0 quarter-rate case, with afi_wdata_valid[3:0], afi_wdata[3:0], and afi_dm[3:0] delayed by one PHY clock, that is, four memory clocks.)

DQS burst

The afi_dqs_burst signal must be asserted one or two complete memory clock cycles early to generate the DQS preamble. The DQS preamble is equal to one-half and one-quarter of an AFI clock cycle in half and quarter rate, respectively. A DQS preamble of two is required in DDR4 when the write preamble is set to two clock cycles. The following diagrams illustrate how afi_dqs_burst must be asserted in full, half, and quarter-rate configurations.

Figure 126. AFI DQS Burst Full-Rate, wlat=1
(Waveform showing afi_dqs_burst asserting ahead of afi_wdata_valid for each write burst.)

Figure 127. AFI DQS Burst Half-Rate, wlat=1
(Waveform showing afi_dqs_burst[1:0] asserting one memory clock before the corresponding afi_wdata_valid[1:0] for each write burst.)

Figure 128. AFI DQS Burst Quarter-Rate, wlat=1
(Waveform showing afi_dqs_burst[3:0] asserting one memory clock before the corresponding afi_wdata_valid[3:0] for each write burst.)
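Because the preamble is a fixed number of memory clocks, the controller can compute where afi_dqs_burst must first assert relative to the write-data slot. The following C fragment is a sketch under the assumption that the preamble does not exceed one AFI clock's worth of memory cycles (true for half and quarter rate with 1- or 2-clock preambles); all names are illustrative, not AFI-defined.

```c
/* Where must afi_dqs_burst first assert, relative to the AFI clock that
 * carries the first write-data slot?  rate = memory clocks per AFI clock,
 * slot = data start slot (0..rate-1), preamble = preamble in memory clocks.
 * Assumes preamble <= rate. */
struct dqs_start { int clk_offset; unsigned slot; };

static struct dqs_start dqs_burst_start(unsigned rate, unsigned slot,
                                        unsigned preamble)
{
    int abs = (int)slot - (int)preamble;  /* memory-clock index, may be < 0 */
    struct dqs_start s;
    s.clk_offset = (abs >= 0) ? 0 : -1;   /* may wrap into the prior AFI clk */
    s.slot = (unsigned)((abs % (int)rate + (int)rate) % (int)rate);
    return s;
}
/* Example: quarter rate (rate=4), aligned write (slot=0), preamble=1
 * -> clk_offset=-1, slot=3: assert afi_dqs_burst[3] one AFI clock early. */
```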
Write data sequence with DBI (DDR4 and QDR-IV only)

The DDR4 write DBI feature is supported in the PHY; when it is enabled, the PHY sends and receives the DBI signal without any controller involvement. The sequence on the AFI interface is identical to non-DBI scenarios.

Write data sequence with CRC (DDR4 only)

When the CRC feature of the PHY is enabled and used, the controller must ensure at least one memory clock cycle between write commands, during which the PHY inserts the CRC data. Sending back-to-back write commands would cause a functional failure. The following figures show the legal sequences in CRC mode. Entries marked 0 and RESERVE must be observed by the controller; no information may be driven on those entries.

Figure 129. AFI Write Data with CRC Half-Rate, wlat=2
(Waveform showing write commands separated by a gap cycle, with the afi_wdata and afi_dm entries that coincide with CRC insertion marked RESERVE.)

Figure 130. AFI Write Data with CRC Quarter-Rate, wlat=2
(Waveform showing the same CRC sequence at quarter rate, with a RESERVE entry rotating through afi_wdata[3:0] and afi_dm[3:0] for each burst.)

3.18.4.3 AFI Read Sequence Timing Diagrams

The following waveforms illustrate the AFI read data waveform in full, half, and quarter rate, respectively. The afi_rdata_en_full signal must be asserted for the entire read burst operation. The afi_rdata_en signal need only be asserted for the intended read data. Aligned and unaligned access for read commands is similar to write commands; however, the afi_rdata_en_full signal must be sent on the same memory clock in a PHY clock as the read command. That is, if a read command is sent on the second memory clock in a PHY clock, afi_rdata_en_full must also be asserted starting from the second memory clock in a PHY clock.

Figure 131. AFI Read Data Full-Rate
(Waveform showing afi_clk, three RD commands on afi_command, afi_rdata_en_full, read data A–F on afi_rdata, and afi_rdata_valid.)

The following figure illustrates that the second and third reads require only the first and second half of the data, respectively. The first three read commands are aligned accesses, issued on the LSB of afi_command. The fourth read command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.

Figure 132. AFI Read Data Half-Rate
(Waveform showing RD commands on afi_command[0] and the fourth on afi_command[1], with the corresponding afi_rdata_en_full[1:0], afi_rdata[1:0], and afi_rdata_valid[1:0].)

In the following figure, the first three read commands are aligned accesses, issued on the LSB of afi_command. The fourth read command is an unaligned access, issued on a different command slot. AFI signals must be shifted accordingly, based on the command slot.
Figure 133. AFI Read Data Quarter-Rate
(Waveform showing the first three RD commands on afi_command[0] and the fourth on afi_command[3], with correspondingly shifted afi_rdata_en_full[3:0], afi_rdata[3:0], and afi_rdata_valid[3:0].)

3.18.4.4 AFI Calibration Status Timing Diagram

The controller interacts with the PHY during calibration at power-up and at recalibration. At power-up, the PHY holds afi_cal_success and afi_cal_fail at 0 until calibration is done; it then asserts afi_cal_success, indicating to the controller that the PHY is ready to use and that the afi_wlat and afi_rlat signals have valid values. At recalibration, the controller asserts afi_cal_req, which triggers the same sequence as at power-up and forces recalibration of the PHY.

Figure 134. Calibration
(Waveform showing the PHY status progressing through Calibrating, Controller Working, Re-Calibrating, and Controller Working, with afi_cal_success, afi_cal_fail, and afi_cal_req on the AFI interface, and afi_wlat and afi_rlat holding the value 9 through both working phases.)

3.19 Resource Utilization

The following tables provide resource utilization information for external memory interfaces on Arria 10 devices.

3.19.1 QDR-IV Resource Utilization in Arria 10 Devices

The following table shows typical resource usage of QDR-IV interfaces with soft controller for Arria 10 devices.

Table 56. QDR-IV Resource Utilization in Arria 10 Devices

Memory Width (Bits)  Combinational ALUTs  Dedicated Logic Registers  Block Memory Bits  M20Ks  Soft Controller
18                   2123                 4592                       18432              8      1
36                   2127                 6023                       36864              16     1
72                   2114                 8826                       73728              32     1

3.20 Arria 10 EMIF Latency

The following latency data applies to all memory protocols supported by the Arria 10 EMIF IP.

Table 57. Latency in Full-Rate Memory Clock Cycles

Rate (1)             Controller  PHY Addr  Memory Read  PHY Read     Controller Read  Round  Round Trip
                     Addr & Cmd  & Cmd     Latency (2)  Data Return  Data Return      Trip   Without Memory
Half:Write           12          2         3-23         —            —                —      —
Half:Read            8           2         3-23         6            8                27-47  24
Quarter:Write        14          2         3-23         —            —                —      —
Quarter:Read         10          2         3-23         6            14               35-55  32
Half:Write (ECC)     14          2         3-23         —            —                —      —
Half:Read (ECC)      12          2         3-23         6            8                31-51  28
Quarter:Write (ECC)  14          2         3-23         —            —                —      —
Quarter:Read (ECC)   12          2         3-23         6            14               37-57  34

(1) User interface rate; the controller always operates in half rate.
(2) Minimum and maximum read latency range for DDR3, DDR4, and LPDDR3.
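As a quick sanity check of Table 57, the round-trip figures are simply the sum of the stage latencies; the range comes only from the 3-23 cycle memory read latency. The short C fragment below reproduces the Half:Read row.

```c
#include <stdio.h>

int main(void)
{
    /* Half:Read row of Table 57, in full-rate memory clock cycles. */
    int ctrl_ac = 8, phy_ac = 2, phy_rd = 6, ctrl_rd = 8;
    int mem_min = 3, mem_max = 23;

    printf("round trip: %d-%d\n",              /* prints 27-47 */
           ctrl_ac + phy_ac + mem_min + phy_rd + ctrl_rd,
           ctrl_ac + phy_ac + mem_max + phy_rd + ctrl_rd);
    printf("without memory: %d\n",             /* prints 24 */
           ctrl_ac + phy_ac + phy_rd + ctrl_rd);
    return 0;
}
```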
3.21 Arria 10 EMIF Calibration Times

The time needed for calibration varies, depending on many factors including the interface width, the number of ranks, frequency, board layout, and difficulty of calibration. The following table lists approximate typical calibration times for various protocols and configurations.

Table 58. Arria 10 EMIF IP Approximate Calibration Times

Protocol                            Rank and Frequency      Typical Calibration Time
DDR3, x64 UDIMM, DQS x8, DM on      1 rank, 933 MHz         102 ms
                                    1 rank, 800 MHz         106 ms
                                    2 rank, 933 MHz         198 ms
                                    2 rank, 800 MHz         206 ms
DDR4, x64 UDIMM, DQS x8, DBI on     1 rank, 1067 MHz        314 ms
                                    1 rank, 800 MHz         353 ms
                                    2 rank, 1067 MHz        625 ms
                                    2 rank, 800 MHz         727 ms
RLDRAM 3, x36                       1200 MHz                2808 ms
                                    1067 MHz                2825 ms
                                    1200 MHz, with DM       2818 ms
                                    1067 MHz, with DM       2833 ms
QDR II, x36, BWS on                 333 MHz                 616 ms
QDR-IV, x36, BWS on                 633 MHz                 833 ms
                                    1067 MHz                1563 ms
                                    1067 MHz, with DBI      1556 ms

3.22 Integrating a Custom Controller with the Hard PHY

If you want to use your own custom memory controller, you must integrate the controller with the hard PHY to achieve a complete memory solution. Observe the following general guidelines:

• When you configure your external memory interface IP, ensure that you select Configuration ➤ Hard PHY Only on the General tab in the parameter editor.
• Consult the AFI 4.0 Specification for detailed information on the AFI interface to the PHY.

3.23 Memory Mapped Register (MMR) Tables

The address buses used to read from and write to the MMR registers are 10 bits wide, while the read and write data buses are 32 bits wide. Reads and writes are always performed using the full 32-bit bus; the Bits Used column in the following summary indicates how many of those 32 bits carry valid data for each register.

Register Summary

Register        Address  Bits Used
sbcfg8          7        16
sbcfg9          8        16
reserve2        9        16
ctrlcfg0        10       32
ctrlcfg1        11       32
ctrlcfg2        12       32
ctrlcfg3        13       32
ctrlcfg4        14       32
ctrlcfg5        15       16
ctrlcfg6        16       16
ctrlcfg7        17       16
ctrlcfg8        18       8
ctrlcfg9        19       8
dramtiming0     20       24
dramodt0        21       32
dramodt1        22       24
sbcfg0          23       32
sbcfg1          24       32
sbcfg2          25       8
sbcfg3          26       24
sbcfg4          27       24
sbcfg5          28       8
sbcfg6          29       32
sbcfg7          30       8
caltiming0      31       32
caltiming1      32       32
caltiming2      33       32
caltiming3      34       32
caltiming4      35       32
caltiming5      36       24
caltiming6      37       32
caltiming7      38       32
caltiming8      39       32
caltiming9      40       8
caltiming10     41       8
dramaddrw       42       24
sideband0       43       8
sideband1       44       8
sideband2       45       8
sideband3       46       8
sideband4       47       8
sideband5       48       8
sideband6       49       8
sideband7       50       8
sideband8       51       8
sideband9       52       8
sideband10      53       8
sideband11      54       8
sideband12      55       8
sideband13      56       32
sideband14      57       16
sideband15      58       8
dramsts         59       8
dbgdone         60       8
dbgsignals      61       32
dbgreset        62       8
dbgmatch        63       32
counter0mask    64       32
counter1mask    65       32
counter0match   66       32
counter1match   67       32
niosreserve0    68       16
niosreserve1    69       16
niosreserve2    70       16
ecc1            128      10
ecc2            129      22
ecc3            130      9
ecc4            144      16

Note: Addresses are in decimal format.
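The MMR map above is just 32-bit registers behind a 10-bit address. As a hedged illustration (the accessor names and transport are assumptions, not part of the IP), the following C sketch reads ctrlcfg0 at decimal address 10 and extracts one of its fields.

```c
#include <stdint.h>

/* Hypothetical accessors for the 10-bit-address, 32-bit-data MMR port.
 * How the port is reached (Avalon-MM window, JTAG master, etc.) is
 * system-specific and not shown here. */
extern uint32_t mmr_read(unsigned addr);           /* addr: 0..1023, decimal */
extern void     mmr_write(unsigned addr, uint32_t data);

#define CTRLCFG0_ADDR 10u  /* from the register summary above */

/* Extract bits [hi:lo] of a 32-bit register value. */
static uint32_t field(uint32_t reg, unsigned hi, unsigned lo)
{
    return (reg >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

static unsigned read_mem_type(void)
{
    uint32_t v = mmr_read(CTRLCFG0_ADDR);
    return field(v, 3, 0); /* cfg_mem_type: 0=DDR3, 1=DDR4, 2=LPDDR3, 3=RLDRAM 3 */
}
```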
3.23.1 ctrlcfg0: Controller Configuration

address=10 (32 bit)

cfg_mem_type [3:0] (Read/Write): Selects the memory type. Program this field with one of the following binary values: "0000" for DDR3 SDRAM, "0001" for DDR4 SDRAM, "0010" for LPDDR3 SDRAM, or "0011" for RLDRAM 3.

cfg_dimm_type [6:4] (Read/Write): Selects the DIMM type. Program this field with one of the following binary values: 3'b000 for DIMM_TYPE_COMPONENT, 3'b001 for DIMM_TYPE_UDIMM, 3'b010 for DIMM_TYPE_RDIMM, 3'b011 for DIMM_TYPE_LRDIMM, 3'b100 for DIMM_TYPE_SODIMM, or 3'b101 for DIMM_TYPE_3DS.

cfg_ac_pos [8:7] (Read/Write): Specifies the C/A (command/address) pin position.

cfg_ctrl_burst_length [13:9] (Read/Write): Configures the burst length for the control path. Legal values are the JEDEC-allowed values for the DRAM selected in cfg_mem_type. For DDR3, DDR4, and LPDDR3, program this field with 8 (binary "01000"); for RLDRAM 3, it can be programmed with 2, 4, or 8.

cfg_dbc0_burst_length [18:14], cfg_dbc1_burst_length [23:19], cfg_dbc2_burst_length [28:24] (Read/Write): Configure the burst length for DBC0, DBC1, and DBC2, respectively, with the same legal values as cfg_ctrl_burst_length.
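Programming a field such as cfg_ctrl_burst_length is a read-modify-write of the containing register. A minimal sketch, reusing the hypothetical mmr_read/mmr_write accessors from the previous example:

```c
/* Set bits [hi:lo] of the register at addr to val (read-modify-write). */
static void mmr_set_field(unsigned addr, unsigned hi, unsigned lo, uint32_t val)
{
    uint32_t mask = ((1u << (hi - lo + 1)) - 1u) << lo;
    uint32_t v = mmr_read(addr);
    mmr_write(addr, (v & ~mask) | ((val << lo) & mask));
}

/* Example: burst length 8 for the control path (cfg_ctrl_burst_length,
 * ctrlcfg0 bits [13:9], binary "01000" per the table above). */
void set_ctrl_burst_length_8(void)
{
    mmr_set_field(CTRLCFG0_ADDR, 13, 9, 8u);
}
```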
3.23.2 ctrlcfg1: Controller Configuration

address=11 (32 bit)

cfg_dbc3_burst_length [4:0] (Read/Write): Configures the burst length for DBC3, with the same legal values as cfg_ctrl_burst_length in ctrlcfg0.

cfg_addr_order [6:5] (Read/Write): Selects the order for address interleaving. Programming this field with different values gives different mappings between the AXI or Avalon-MM address and the SDRAM address. Program this field with one of the following binary values: "00" for chip, row, bank (BG, BA), column; "01" for chip, bank (BG, BA), row, column; or "10" for row, chip, bank (BG, BA), column.

cfg_ctrl_enable_ecc [7] (Read/Write): Enables the generation and checking of ECC on the control path.

cfg_dbc0_enable_ecc [8], cfg_dbc1_enable_ecc [9], cfg_dbc2_enable_ecc [10], cfg_dbc3_enable_ecc [11] (Read/Write): Enable the generation and checking of ECC for DBC0 through DBC3, respectively.

cfg_reorder_data [12] (Read/Write): Controls whether the controller can re-order operations to optimize SDRAM bandwidth. This bit should generally be set to one.

cfg_ctrl_reorder_rdata [13] (Read/Write): Controls whether the controller needs to re-order the read return data on the control path.

cfg_dbc0_reorder_rdata [14], cfg_dbc1_reorder_rdata [15], cfg_dbc2_reorder_rdata [16], cfg_dbc3_reorder_rdata [17] (Read/Write): Control whether the controller needs to re-order the read return data for DBC0 through DBC3, respectively.

cfg_reorder_read [18] (Read/Write): Controls whether the controller can re-order read commands.

cfg_starve_limit [24:19] (Read/Write): Specifies the number of DRAM burst transactions an individual transaction allows to be reordered ahead of it before its priority is raised in the memory controller.

cfg_dqstrk_en [25] (Read/Write): Enables DQS tracking in the PHY.

cfg_ctrl_enable_dm [26] (Read/Write): Set to one to enable DRAM operation if DM pins are connected.

cfg_dbc0_enable_dm [27], cfg_dbc1_enable_dm [28], cfg_dbc2_enable_dm [29], cfg_dbc3_enable_dm [30] (Read/Write): As cfg_ctrl_enable_dm, for DBC0 through DBC3, respectively.

3.23.3 ctrlcfg2: Controller Configuration

address=12 (32 bit)

cfg_ctrl_output_regd [0] (Read/Write): Set to one to register the HMC command output; set to zero to disable the output register.

cfg_dbc0_output_regd [1], cfg_dbc1_output_regd [2], cfg_dbc2_output_regd [3], cfg_dbc3_output_regd [4] (Read/Write): As cfg_ctrl_output_regd, for DBC0 through DBC3, respectively.

cfg_ctrl2dbc_switch0 [6:5] (Read/Write): Select for the ctrl2dbc_switch0 multiplexer.

cfg_ctrl2dbc_switch1 [8:7] (Read/Write): Select for the ctrl2dbc_switch1 multiplexer.

cfg_dbc0_ctrl_sel [9], cfg_dbc1_ctrl_sel [10], cfg_dbc2_ctrl_sel [11], cfg_dbc3_ctrl_sel [12] (Read/Write): Control path select for DBC0 through DBC3, respectively.

cfg_dbc2ctrl_sel [14:13] (Read/Write): Specifies which DBC is driven by the local control path.

cfg_dbc0_pipe_lat [17:15], cfg_dbc1_pipe_lat [20:18], cfg_dbc2_pipe_lat [23:21], cfg_dbc3_pipe_lat [26:24] (Read/Write): Specify, in controller clock cycles, the pipeline latency of the signals from the control path to DBC0 through DBC3, respectively.
3.23.4 ctrlcfg3: Controller Configuration

address=13 (32 bit)

cfg_ctrl_cmd_rate [2:0] (Read/Write): Command rate for the control path.

cfg_dbc0_cmd_rate [5:3], cfg_dbc1_cmd_rate [8:6], cfg_dbc2_cmd_rate [11:9], cfg_dbc3_cmd_rate [14:12] (Read/Write): Command rate for DBC0 through DBC3, respectively.

cfg_ctrl_in_protocol [15] (Read/Write): Input protocol select for the control path.

cfg_dbc0_in_protocol [16], cfg_dbc1_in_protocol [17], cfg_dbc2_in_protocol [18], cfg_dbc3_in_protocol [19] (Read/Write): Input protocol select for DBC0 through DBC3, respectively.

cfg_ctrl_dualport_en [20] (Read/Write): Enables the second command port, for RLDRAM 3 only (BL=2 or 4).

cfg_dbc0_dualport_en [21], cfg_dbc1_dualport_en [22], cfg_dbc2_dualport_en [23], cfg_dbc3_dualport_en [24] (Read/Write): Enable the second data port, for RLDRAM 3 only (BL=2 or 4), for DBC0 through DBC3, respectively.

cfg_arbiter_type [25] (Read/Write): Indicates the controller arbiter operating mode.

cfg_open_page_en [26] (Read/Write): Set to 1 to enable the open page policy when command reordering is disabled (cfg_cmd_reorder = 0). This bit has no effect when cfg_cmd_reorder is 1.

cfg_rld3_multibank_mode [30:28] (Read/Write): Multibank setting, specific to RLDRAM 3.

Note: DDR4 gear down mode is not supported.

3.23.5 ctrlcfg4: Controller Configuration

address=14 (32 bit)

cfg_tile_id [4:0] (Read/Write): Tile ID.

cfg_pingpong_mode [6:5] (Read/Write): Ping Pong mode select.

cfg_ctrl_slot_rotate_en [9:7] (Read/Write): Command slot rotate enable; bit[0] controls write.

cfg_dbc0_slot_rotate_en [12:10], cfg_dbc1_slot_rotate_en [15:13], cfg_dbc2_slot_rotate_en [18:16], cfg_dbc3_slot_rotate_en [21:19] (Read/Write): Slot rotate enable for DBC0 through DBC3, respectively; bit[0] controls write.

cfg_ctrl_slot_offset [23:22] (Read/Write): Enables AFI information to be offset by a number of full-rate cycles. The affected AFI signals are afi_rdata_en, afi_rdata_en_full, afi_wdata_valid, afi_dqs_burst, afi_mrnk_write, and afi_mrnk_read.

cfg_dbc0_slot_offset [25:24], cfg_dbc1_slot_offset [27:26], cfg_dbc2_slot_offset [29:28], cfg_dbc3_slot_offset [31:30] (Read/Write): As cfg_ctrl_slot_offset, for DBC0 through DBC3, respectively.
3.23.6 ctrlcfg5: Controller Configuration

address=15 (32 bit)

cfg_col_cmd_slot [3:0] (Read/Write): Specifies the column command slot, one-hot encoded.

cfg_row_cmd_slot [7:4] (Read/Write): Specifies the row command slot, one-hot encoded.

cfg_ctrl_rc_en [8] (Read/Write): Set to 1 to enable rate conversion, which converts the quarter-rate input from the core to half rate inside the HMC.

cfg_dbc0_rc_en [9], cfg_dbc1_rc_en [10], cfg_dbc2_rc_en [11], cfg_dbc3_rc_en [12] (Read/Write): Rate conversion enable for DBC0 through DBC3, respectively.

3.23.7 ctrlcfg6: Controller Configuration

address=16 (32 bit)

cfg_cs_chip [15:0] (Read/Write): Chip select mapping scheme. The mapping is separated into 4 sections, [CS3][CS2][CS1][CS0]; each section consists of 4 bits indicating which CS_n signals are driven active when a command goes to the corresponding chip select.

3.23.8 ctrlcfg7: Controller Configuration

address=17 (32 bit)

cfg_clkgating_en [0] (Read/Write): Set to 1 to enable clock gating; the clock is shut off for the whole HMC.

cfg_rb_reserved_entry [7:1] (Read/Write): Specifies how many entries are reserved in the read buffer before almost-full is asserted.

cfg_wb_reserved_entry [14:8] (Read/Write): Specifies how many entries are reserved in the write buffer before almost-full is asserted.

3.23.9 ctrlcfg8: Controller Configuration

address=18 (32 bit)

cfg_3ds_en [0] (Read/Write): Set to 1 to enable 3DS support for DDR4.

cfg_ck_inv [1] (Read/Write): Used to program CK polarity.

cfg_addr_mplx_en [2] (Read/Write): Set to 1 to enable RLDRAM 3 address multiplex mode.

3.23.10 ctrlcfg9: Controller Configuration

address=19 (32 bit)

cfg_dfx_bypass_en [0] (Read/Write): Used for DFT and timing characterization only.

3.23.11 dramtiming0: Timing Parameters

address=20 (32 bit)

cfg_tcl [6:0] (Read/Write): Memory read latency.

cfg_power_saving_exit_cycles [12:7] (Read/Write): The minimum number of cycles to stay in a low power state. This applies to both power down and self-refresh, and should be set to the greater of tPD and tCKESR.

cfg_mem_clk_disable_entry_cycles [18:13] (Read/Write): The number of clocks after a self-refresh entry at which to stop the memory clock. This register is generally set based on PHY design latency and should generally not be changed.

3.23.12 dramodt0: On-Die Termination Parameters

address=21 (32 bit)

cfg_write_odt_chip [15:0] (Read/Write): ODT scheme setting for write commands. The setting is separated into 4 sections, [CS3][CS2][CS1][CS0]; each section consists of 4 bits indicating which chips' ODT is asserted when a write occurs on the corresponding chip select.

cfg_read_odt_chip [31:16] (Read/Write): ODT scheme setting for read commands, with the same 4-section layout as cfg_write_odt_chip, applied when a read occurs on the corresponding chip select.
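The cfg_cs_chip and dramodt0 fields share the same 4x4 layout: four 4-bit sections, [CS3][CS2][CS1][CS0], each giving the mask of chips affected when a command targets that chip select. A hedged sketch of packing such a map (helper names are illustrative):

```c
#include <stdint.h>

/* Pack a [CS3][CS2][CS1][CS0] map: mask[cs] is the 4-bit mask of chips
 * whose CS_n (or ODT) should assert when a command targets chip select cs. */
static uint16_t pack_cs_map(const uint8_t mask[4])
{
    uint16_t map = 0;
    for (int cs = 0; cs < 4; cs++)
        map |= (uint16_t)(mask[cs] & 0xF) << (4 * cs);
    return map;
}

/* Example: the identity mapping for cfg_cs_chip (each chip select activates
 * only its own chip) is {1000,0100,0010,0001} = 0x8421. */
```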
3.23.13 dramodt1: On-Die Termination Parameters

address=22 (32 bit)

cfg_wr_odt_on [5:0] (Read/Write): The number of memory clock cycles between a write command and the ODT signal rising edge.

cfg_rd_odt_on [11:6] (Read/Write): The number of memory clock cycles between a read command and the ODT signal rising edge.

cfg_wr_odt_period [17:12] (Read/Write): The number of memory clock cycles the write ODT signal stays asserted after its rising edge.

cfg_rd_odt_period [23:18] (Read/Write): The number of memory clock cycles the read ODT signal stays asserted after its rising edge.

3.23.14 sbcfg0: Sideband Configuration

address=23 (32 bit)

cfg_rld3_refresh_seq0 [15:0] (Read/Write): Banks to refresh for RLDRAM 3 in sequence 0. Must not be more than 4 banks.

cfg_rld3_refresh_seq1 [31:16] (Read/Write): Banks to refresh for RLDRAM 3 in sequence 1. Must not be more than 4 banks.

3.23.15 sbcfg1: Sideband Configuration

address=24 (32 bit)

cfg_rld3_refresh_seq2 [15:0] (Read/Write): Banks to refresh for RLDRAM 3 in sequence 2. Must not be more than 4 banks.

cfg_rld3_refresh_seq3 [31:16] (Read/Write): Banks to refresh for RLDRAM 3 in sequence 3. Must not be more than 4 banks.

3.23.16 sbcfg2: Sideband Configuration

address=25 (32 bit)

cfg_srf_zqcal_disable [0] (Read/Write): Set to 1 to disable ZQ calibration after self refresh.

cfg_mps_zqcal_disable [1] (Read/Write): Set to 1 to disable ZQ calibration after Maximum Power Saving exit.

cfg_mps_dqstrk_disable [2] (Read/Write): Set to 1 to disable DQS tracking after Maximum Power Saving exit.

cfg_sb_cg_disable [3] (Read/Write): Set to 1 to disable mem_ck gating during self refresh and deep power down. Clock gating is not supported when the Ping Pong PHY feature is enabled; do not enable clock gating for Ping Pong PHY interfaces.

cfg_user_rfsh_en [4] (Read/Write): Set to 1 to enable user refresh.

cfg_srf_autoexit_en [5] (Read/Write): Set to 1 to enable the controller to exit self refresh when a new command is detected.

cfg_srf_entry_exit_block [7:6] (Read/Write): Blocks the arbiter from issuing commands for the 4 cases.

3.23.17 sbcfg3: Sideband Configuration

address=26 (32 bit)

cfg_sb_ddr4_mr3 [19:0] (Read/Write): Stores the DDR4 MR3 content.

3.23.18 sbcfg4: Sideband Configuration

address=27 (32 bit)

cfg_sb_ddr4_mr4 [19:0] (Read/Write): Stores the DDR4 MR4 content.

3.23.19 sbcfg5: Sideband Configuration

address=28 (32 bit)

cfg_short_dqstrk_ctrl_en [0] (Read/Write): Set to 1 to enable controller-controlled DQS short tracking; set to 0 to enable sequencer-controlled DQS short tracking.

cfg_period_dqstrk_ctrl_en [1] (Read/Write): Set to 1 to enable the controller to issue periodic DQS tracking.
3.23.20 sbcfg6: Sideband Configuration

address=29 (32 bit)

cfg_period_dqstrk_interval [15:0] (Read/Write): Interval between two controller-controlled periodic DQS tracking operations.

cfg_t_param_dqstrk_to_valid_last [23:16] (Read/Write): DQS tracking read-to-valid timing for the last rank.

cfg_t_param_dqstrk_to_valid [31:24] (Read/Write): DQS tracking read-to-valid timing for ranks other than the last.

3.23.21 sbcfg7: Sideband Configuration

address=30 (32 bit)

cfg_rfsh_warn_threshold [6:0] (Read/Write): Threshold for warning that a refresh is needed within the number of controller clock cycles specified by the threshold.

3.23.22 sbcfg8: Sideband Configuration

address=7 (32 bit)

cfg_reserve [15:1] (Read/Write): General purpose reserved register.

cfg_ddr4_mps_addrmirror [0] (Read/Write): When asserted, indicates that DDR4 address mirroring is enabled for MPS.

3.23.23 sbcfg9: Sideband Configuration

address=8 (32 bit)

cfg_sb_ddr4_mr5 [15:0] (Read/Write): DDR4 Mode Register 5.

3.23.24 caltiming0: Command/Address/Latency Parameters

address=31 (32 bit)

cfg_t_param_act_to_rdwr [5:0] (Read/Write): Activate to read/write command timing.

cfg_t_param_act_to_pch [11:6] (Read/Write): Activate to precharge timing.

cfg_t_param_act_to_act [17:12] (Read/Write): Activate to activate timing on the same bank.

cfg_t_param_act_to_act_diff_bank [23:18] (Read/Write): Activate to activate timing on different banks; for DDR4, the same bank group.

cfg_t_param_act_to_act_diff_bg [29:24] (Read/Write): Activate to activate timing on different bank groups, DDR4 only.

3.23.25 caltiming1: Command/Address/Latency Parameters

address=32 (32 bit)

cfg_t_param_rd_to_rd [5:0] (Read/Write): Read to read command timing on the same bank.

cfg_t_param_rd_to_rd_diff_chip [11:6] (Read/Write): Read to read command timing on different chips.

cfg_t_param_rd_to_rd_diff_bg [17:12] (Read/Write): Read to read command timing on different bank groups.

cfg_t_param_rd_to_wr [23:18] (Read/Write): Read to write command timing on the same bank.

cfg_t_param_rd_to_wr_diff_chip [29:24] (Read/Write): Read to write command timing on different chips.

3.23.26 caltiming2: Command/Address/Latency Parameters

address=33 (32 bit)

cfg_t_param_rd_to_wr_diff_bg [5:0] (Read/Write): Read to write command timing on different bank groups.

cfg_t_param_rd_to_pch [11:6] (Read/Write): Read to precharge command timing.

cfg_t_param_rd_ap_to_valid [17:12] (Read/Write): Read with autoprecharge to data valid timing.

cfg_t_param_wr_to_wr [23:18] (Read/Write): Write to write command timing on the same bank.

cfg_t_param_wr_to_wr_diff_chip [29:24] (Read/Write): Write to write command timing on different chips.

3.23.27 caltiming3: Command/Address/Latency Parameters

address=34 (32 bit)

cfg_t_param_wr_to_wr_diff_bg [5:0] (Read/Write): Write to write command timing on different bank groups.

cfg_t_param_wr_to_rd [11:6] (Read/Write): Write to read command timing.

cfg_t_param_wr_to_rd_diff_chip [17:12] (Read/Write): Write to read command timing on different chips.

cfg_t_param_wr_to_rd_diff_bg [23:18] (Read/Write): Write to read command timing on different bank groups.

cfg_t_param_wr_to_pch [29:24] (Read/Write): Write to precharge command timing.
3.23.28 caltiming4: Command/Address/Latency Parameters

address=35 (32 bit)

cfg_t_param_wr_ap_to_valid [5:0] (Read/Write): Write with autoprecharge to valid command timing.

cfg_t_param_pch_to_valid [11:6] (Read/Write): Precharge to valid command timing.

cfg_t_param_pch_all_to_valid [17:12] (Read/Write): Precharge-all to banks being ready for a bank activation command.

cfg_t_param_arf_to_valid [25:18] (Read/Write): Auto refresh to valid DRAM command window.

cfg_t_param_pdn_to_valid [31:26] (Read/Write): Power down to valid bank command window.

3.23.29 caltiming5: Command/Address/Latency Parameters

address=36 (32 bit)

cfg_t_param_srf_to_valid [9:0] (Read/Write): Self-refresh to valid bank command window.

cfg_t_param_srf_to_zq_cal [19:10] (Read/Write): Self-refresh to ZQ calibration window.

3.23.30 caltiming6: Command/Address/Latency Parameters

address=37 (32 bit)

cfg_t_param_arf_period [12:0] (Read/Write): Auto-refresh period.

cfg_t_param_pdn_period [28:13] (Read/Write): Clock power down recovery period.

3.23.31 caltiming7: Command/Address/Latency Parameters

address=38 (32 bit)

cfg_t_param_zqcl_to_valid [8:0] (Read/Write): Long ZQ calibration to valid.

cfg_t_param_zqcs_to_valid [15:9] (Read/Write): Short ZQ calibration to valid.

cfg_t_param_mrs_to_valid [19:16] (Read/Write): Mode register set to valid.

cfg_t_param_mps_to_valid [29:20] (Read/Write): Timing parameter for Maximum Power Saving to any valid command (tXMP).

3.23.32 caltiming8: Command/Address/Latency Parameters

address=39 (32 bit)

cfg_t_param_mrr_to_valid [3:0] (Read/Write): Timing parameter for mode register read to any valid command.

cfg_t_param_mpr_to_valid [8:4] (Read/Write): Timing parameter for multi-purpose register read to any valid command.

cfg_t_param_mps_exit_cs_to_cke [12:9] (Read/Write): Timing parameter for Maximum Power Saving exit; timing requirement for CS assertion versus CKE de-assertion (tMPX_S).

cfg_t_param_mps_exit_cke_to_cs [16:13] (Read/Write): Timing parameter for Maximum Power Saving exit; timing requirement for CKE de-assertion versus CS de-assertion (tMPX_LH).

cfg_t_param_rld3_multibank_ref_delay [19:17] (Read/Write): RLDRAM 3 refresh-to-refresh delay for all sequences.

cfg_t_param_mmr_cmd_to_valid [27:20] (Read/Write): MMR command to valid delay.

3.23.33 caltiming9: Command/Address/Latency Parameters

address=40 (32 bit)

cfg_t_param_4_act_to_act [7:0] (Read/Write): The four-activate window timing parameter.

3.23.34 caltiming10: Command/Address/Latency Parameters

address=41 (32 bit)

cfg_t_param_16_act_to_act [7:0] (Read/Write): The 16-activate window timing parameter (RLDRAM 3).

3.23.35 dramaddrw: Row/Column/Bank Address Width Configuration

address=42 (32 bit)

cfg_col_addr_width [4:0] (Read/Write): The number of column address bits for the memory devices in your memory interface.

cfg_row_addr_width [9:5] (Read/Write): The number of row address bits for the memory devices in your memory interface.

cfg_bank_addr_width [13:10] (Read/Write): The number of bank address bits for the memory devices in your memory interface.

cfg_bank_group_addr_width [15:14] (Read/Write): The number of bank group address bits for the memory devices in your memory interface.

cfg_cs_addr_width [18:16] (Read/Write): The number of chip select address bits for the memory devices in your memory interface.
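The dramaddrw widths together define the interface's logical address space: the total number of addressable memory words is 2 raised to the sum of the field widths. A small sketch (the example widths are hypothetical, not a recommendation):

```c
#include <stdint.h>

/* Number of addressable memory words implied by the dramaddrw widths. */
static uint64_t addressable_words(unsigned col, unsigned row, unsigned bank,
                                  unsigned bank_group, unsigned cs)
{
    unsigned bits = col + row + bank + bank_group + cs;
    return 1ull << bits;
}

/* Example (hypothetical single-rank DDR4-style widths):
 * col=10, row=16, bank=2, bank_group=2, cs=0
 * -> 30 address bits -> 2^30 = 1,073,741,824 words. */
```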
3.23.36 sideband0: Sideband

address=43 (32 bit)

mr_cmd_trigger [0] (Read/Write): Write 1 to trigger execution of the mode register command.

3.23.37 sideband1: Sideband

address=44 (32 bit)

mmr_refresh_req [3:0] (Read/Write): When asserted, indicates a refresh request to the specific rank. Each bit corresponds to one rank.

3.23.38 sideband2: Sideband

address=45 (32 bit)

mmr_zqcal_long_req [0] (Read/Write): When asserted, indicates a long ZQ calibration request. This bit is write-clear.

3.23.39 sideband3: Sideband

address=46 (32 bit)

mmr_zqcal_short_req [0] (Read/Write): When asserted, indicates a short ZQ calibration request. This bit is write-clear.

3.23.40 sideband4: Sideband

address=47 (32 bit)

mmr_self_rfsh_req [3:0] (Read/Write): When asserted, indicates a self refresh request to the specific rank. Each bit corresponds to one rank. These bits are write-clear.

3.23.41 sideband5: Sideband

address=48 (32 bit)

mmr_dpd_mps_req [0] (Read/Write): When asserted, indicates a deep power down or maximum power saving request. This bit is write-clear.

3.23.42 sideband6: Sideband

address=49 (32 bit)

mr_cmd_ack [0] (Read): Acknowledge for the mode register command.

3.23.43 sideband7: Sideband

address=50 (32 bit)

mmr_refresh_ack [0] (Read): Acknowledge indicating refresh is in progress.

3.23.44 sideband8: Sideband

address=51 (32 bit)

mmr_zqcal_ack [0] (Read): Acknowledge indicating ZQ calibration is in progress.

3.23.45 sideband9: Sideband

address=52 (32 bit)

mmr_self_rfsh_ack [0] (Read): Acknowledge indicating self refresh is in progress.

3.23.46 sideband10: Sideband

address=53 (32 bit)

mmr_dpd_mps_ack [0] (Read): Acknowledge indicating deep power down/maximum power saving is in progress.

3.23.47 sideband11: Sideband

address=54 (32 bit)

mmr_auto_pd_ack [0] (Read): Acknowledge indicating auto power down is in progress.

3.23.48 sideband12: Sideband

address=55 (32 bit)

mr_cmd_type [2:0] (Read/Write): Indicates the type of mode register command.

mr_cmd_rank [6:3] (Read/Write): Indicates which rank the mode register command is intended for.
3.23.49 sideband13: Sideband

address=56 (32 bit)

mr_cmd_opcode [31:0] (Read/Write): Register command opcode information, to be used for the register command. Bits [31:27] are reserved. The remaining bits are interpreted per protocol as follows:

LPDDR3: [26:20] Reserved; [19:10] falling-edge CA; [9:0] rising-edge CA.
DDR4: [26:24] C2:C0; [23] ACT; [22:21] BG1:BG0; [20] Reserved; [19:18] BA1:BA0; [17] A17; [16] RAS; [15] CAS; [14] WE; [13:0] A13:A0.
DDR3: [26:21] Reserved; [20:18] BA2:BA0; [17] A17; [16] RAS; [15] CAS; [14] WE; [13] Reserved; [12:0] A12:A0.
RLDRAM 3: [26] Reserved; [25:22] BA3:BA0; [21] REF; [20] WE; [19:0] A19:A0.
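Referring to the DDR4 layout of mr_cmd_opcode above, the following hedged C sketch packs a DDR4 mode-register-set command into the 32-bit opcode word. The function name is illustrative, and the sketch assumes the bits carry the pin levels (ACT_n high and RAS_n/CAS_n/WE_n low select an MRS command); the table does not state the polarity, so verify against your configuration.

```c
#include <stdint.h>

/* Pack the DDR4 view of sideband13.mr_cmd_opcode for a mode-register-set
 * command.  Bits [31:27] stay 0 (reserved); C2:C0 are left at 0. */
static uint32_t ddr4_mrs_opcode(unsigned bg, unsigned ba, uint16_t a13_0)
{
    uint32_t op = 0;
    op |= 1u << 23;                    /* [23] ACT: assumed ACT_n pin high  */
    op |= (uint32_t)(bg & 0x3) << 21;  /* [22:21] BG1:BG0                   */
    op |= (uint32_t)(ba & 0x3) << 18;  /* [19:18] BA1:BA0                   */
    /* [17] A17 = 0; [16] RAS, [15] CAS, [14] WE: assumed low for MRS       */
    op |= a13_0 & 0x3FFF;              /* [13:0] A13:A0 (MR opcode bits)    */
    return op;
}
```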
3.23.50 sideband14: Sideband

address=57 (32 bit)

mmr_refresh_bank [15:0] (Read/Write): User refresh bank information; a binary representation of the bank address. Enables refresh to that bank address when requested.

3.23.51 sideband15: Sideband

address=58 (32 bit)

mmr_stall_rank [3:0] (Read/Write): Set a bit to 1 to stall the corresponding rank.

3.23.52 dramsts: Calibration Status

address=59 (32 bit)

phy_cal_success [0] (Read): This bit is set to 1 if the PHY calibrated successfully.

phy_cal_fail [1] (Read): This bit is set to 1 if the PHY was unable to calibrate.

3.23.53 ecc1: ECC General Configuration

address=128 (32 bit)

cfg_enable_ecc [0] (Read/Write): A value of 1 enables the ECC encoder/decoder.

cfg_enable_dm [1] (Read/Write): A value of 1 indicates that this is a design with DM.

cfg_enable_rmw [2] (Read/Write): A value of 1 enables the RMW feature, including partial/dummy write support.

cfg_data_rate [6:3] (Read): Holds 2, 4, or 8 for full, half, or quarter rate designs, respectively.

cfg_ecc_in_protocol [7] (Read): 1 for an Avalon-MM or 0 for an Avalon-ST input interface. Read-only register.

cfg_enable_auto_corr [8] (Read/Write): A value of 1 enables the auto correction feature, injecting a dummy write command after a single-bit error (SBE) is detected. This feature must be enabled together with RMW and ECC.

cfg_enable_ecc_code_overwrite [9] (Read/Write): A value of 1 enables the ECC code overwrite feature, which reuses the original read-back ECC code during RMW if a double-bit error (DBE) is detected.

Reserved [31:10].

3.23.54 ecc2: Width Configuration

address=129 (32 bit)

cfg_dram_data_width [7:0] (Read): The DRAM data width, without taking rate conversion into consideration. For example, for a 64+8=72 bit ECC design, this register holds 'd72.

cfg_local_data_width [15:8] (Read): The local data width, without taking rate conversion into consideration. For example, for a 64+8=72 DQ bit ECC design, this register holds 'd64.

cfg_addr_width [21:16] (Read): The local address width, which corresponds to the address field width of the cmd field. For example, for a local address width of 24 bits, this register holds 'd24.

Reserved [31:22].

3.23.55 ecc3: ECC Error and Interrupt Configuration

address=130 (32 bit)

cfg_gen_sbe [0] (Read/Write): A value of 1 enables the generate-SBE feature, which generates a single-bit error during the write process.

cfg_gen_dbe [1] (Read/Write): A value of 1 enables the generate-DBE feature, which generates a double-bit error during the write process.

cfg_enable_intr [2] (Read/Write): A value of 1 enables the interrupt feature. The interrupt signal notifies if an error condition occurs; the condition is configurable.

cfg_mask_sbe_intr [3] (Read/Write): A value of 1 masks the interrupt signal when an SBE occurs.

cfg_mask_dbe_intr [4] (Read/Write): A value of 1 masks the interrupt signal when a DBE occurs.

cfg_mask_corr_dropped_intr [5] (Read/Write): A value of 1 masks the interrupt signal when an auto correction command cannot be scheduled, due to backpressure (FIFO full).

cfg_mask_hmi_intr [6] (Read/Write): A value of 1 masks the interrupt signal when the hard memory interface asserts an interrupt via the hmi_interrupt port.

cfg_clr_intr [7] (Read/Write): Writing a value of 1 to this self-clearing bit clears the interrupt signal, error status, and address.

Reserved [31:9].

3.23.56 ecc4: Status and Error Information

address=144 (32 bit)

sts_ecc_intr [0] (Read): Indicates the interrupt status; a value of 1 indicates an interrupt occurred.

sts_sbe_error [1] (Read): Indicates the SBE status; a value of 1 indicates an SBE occurred.

sts_dbe_error [2] (Read): Indicates the DBE status; a value of 1 indicates a DBE occurred.

sts_corr_dropped [3] (Read): Indicates the status of dropped correction commands; a value of 1 indicates a correction command was dropped.

sts_sbe_count [7:4] (Read): Indicates the number of times an SBE has occurred. The counter can overflow.

sts_dbe_count [11:8] (Read): Indicates the number of times a DBE has occurred. The counter can overflow.

sts_corr_dropped_count [15:12] (Read): Indicates the number of times a correction command has been dropped. The counter can overflow.

Reserved [31:16].

3.23.57 ecc5: Address of Most Recent SBE/DBE

address=145 (32 bit)

sts_err_addr [31:0] (Read): Address of the most recent single-bit or double-bit error.

3.23.58 ecc6: Address of Most Recent Correction Command Dropped

address=146 (32 bit)

sts_corr_dropped_addr [31:0] (Read): Address of the most recent dropped correction command.
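The ecc4 status fields can be polled and decoded with straightforward shifts and masks. A minimal sketch, reusing the hypothetical mmr_read accessor introduced earlier (register address 144 from the summary); note that the 4-bit counters can overflow, per the table above.

```c
#include <stdio.h>
#include <stdint.h>

extern uint32_t mmr_read(unsigned addr);  /* hypothetical accessor */

#define ECC4_ADDR 144u

void report_ecc_status(void)
{
    uint32_t v = mmr_read(ECC4_ADDR);

    if (v & (1u << 1))  /* sts_sbe_error */
        printf("single-bit errors seen: %u\n", (v >> 4) & 0xF);  /* sts_sbe_count */
    if (v & (1u << 2))  /* sts_dbe_error */
        printf("double-bit errors seen: %u\n", (v >> 8) & 0xF);  /* sts_dbe_count */
    if (v & (1u << 3))  /* sts_corr_dropped */
        printf("corrections dropped: %u\n", (v >> 12) & 0xF);    /* sts_corr_dropped_count */
}
```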
3.24 Document Revision History

May 2017, version 2017.05.08:
• Added Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces topic.
• Rebranded as Intel.

October 2016, version 2016.10.31:
• Added Back-to-Back User-Controlled Refresh for Hard Memory Controller topic.
• Modified first bullet point in the ECC in Arria 10 EMIF IP topic.
• Added additional content about read-modify-write operations to the ECC in Arria 10 EMIF IP topic.

May 2016, version 2016.05.02:
• Removed afi_alert_n from the AFI Address and Command Signals table in the AFI Address and Command Signals topic.
• Added ecc5: Address of Most Recent SBE/DBE and ecc6: Address of Most Recent Correction Command Dropped to the Memory Mapped Register (MMR) Tables section.
• Modified text of Hard Memory Controller and Hard PHY and Soft Memory Controller and Hard PHY.
• Modified content of the Ping Pong PHY Architecture topic.
• Added a section on read-modify-write operations to the ECC in Arria 10 EMIF IP topic.

November 2015, version 2015.11.02:
• Added AFI 4.0 Timing Diagrams section.
• Added Arria 10 EMIF Latency section.
• Added Arria 10 EMIF Calibration Times section.
• Added Integrating a Custom Controller with the Hard PHY section.
• Added LPDDR3 support to AFI Parameters, AFI Address and Command Signals, and AFI Write Data Signals.
• Revised rendering of address information in the Memory Mapped Register (MMR) Tables.
• Added ecc1: ECC General Configuration, ecc2: Width Configuration, ecc3: ECC Error and Interrupt Configuration, and ecc4: Status and Error Information in the Memory Mapped Register (MMR) Tables.
• Removed RZQ Pin Sharing section.
• Added bank numbers to figure in Restrictions on I/O Bank Usage for Arria 10 EMIF IP with HPS topic.
• Added Arria 10 EMIF and SmartVID topic.
• Changed instances of Quartus II to Quartus Prime.

May 2015, version 2015.05.04: Maintenance release.

December 2014, version 2014.12.15:
• Added debug-related connections to the Logical Connections table in the Logical Connections topic.
• Added Arria 10 EMIF Ping Pong PHY section.
• Added Arria 10 EMIF Debugging Examples section.
• Added x4 mode support.
• Added AFI 4.0 Specification section.
• Added MMR Register Tables section.

August 2014, version 2014.08.15:
• Added PHY-only support for DDR3, DDR4, and RLDRAM 3, and soft controller and hard PHY support for RLDRAM 3 and QDR II/II+/II+ Xtreme to the Supported Memory Protocols table.
• Added afi_conduit_end, afi_clk_conduit_end, afi_half_clk_conduit_end, and afi_reset_n_conduit_end to the Logical Connections table.
• Expanded description of ctrl_amm_avalon_slave in the Logical Connections table.
• Added ECC in Arria 10 EMIF IP.
• Added Configuring Your EMIF IP for Use with the Debug Toolkit.
• Added Arria 10 EMIF for Hard Processor Subsystem.

December 2013, version 2013.12.16: Initial release.

4 Functional Description—Intel MAX® 10 EMIF IP

Intel MAX® 10 FPGAs provide advanced processing capabilities in a low-cost, instant-on, small-form-factor device featuring capabilities such as digital signal processing, analog functionality, Nios II embedded processor support, and memory controllers.

Figure 135. MAX 10 EMIF Block Diagram
(Block diagram: the external memory interface IP, comprising the PLL, calibration sequencer, memory controller, and the PHY read, write, and address/command paths through the I/O structure, connected to the external memory device.)

4.1 MAX 10 EMIF Overview

MAX 10 FPGAs provide advanced processing capabilities in a low-cost, instant-on, small-form-factor device featuring capabilities such as digital signal processing, analog functionality, Nios II embedded processor support, and memory controllers.
Figure 136. MAX 10 EMIF Block Diagram
(Block diagram: the external memory interface IP, comprising the PLL, calibration sequencer, memory controller, and the PHY read, write, and address/command paths through the I/O structure, connected to the external memory device.)

4.2 External Memory Protocol Support

MAX 10 FPGAs offer external memory interface support for DDR2, DDR3, DDR3L, and LPDDR2 protocols.

Table 59. Supported External Memory Configurations

External Memory Protocol  Maximum Frequency  Configuration
DDR3, DDR3L               303 MHz            x16 + ECC + Configuration and Status Register (CSR)
DDR2                      200 MHz            x16 + ECC + Configuration and Status Register (CSR)
LPDDR2 (1)                200 MHz (2)        x16

(1) MAX 10 devices support only single-die LPDDR2.
(2) To achieve the specified performance, constrain the memory device I/O and core power supply variation to within ±3%. By default, the frequency is 167 MHz.

4.3 MAX 10 Memory Controller

MAX 10 FPGAs use the HPC II external memory controller.

4.4 MAX 10 Low Power Feature

The MAX 10 low power feature is automatically activated when the self refresh or low power down modes are activated. The low power feature sends the afi_mem_clk_disable signal to stop the clock used by the controller.

To conserve power, the MAX 10 UniPHY IP core performs the following functions:

• Tri-states the address and command signals, except the CKE and RESET_N signals
• Disables the input buffer of the DDR input

Note: The MAX 10 low power feature is available from version 15.0 of the Quartus Prime software. To enable this feature, regenerate your MAX 10 UniPHY IP core using the Quartus Prime software version 15.0 or later.

4.5 MAX 10 Memory PHY

MAX 10 devices employ UniPHY, but without using DLLs for calibration. The physical layer implementation for MAX 10 external memory interfaces provides a calibrated read path and a static write path.

4.5.1 Supported Topologies

The memory PHY supports the DDR2 and DDR3 protocols with up to two discrete memory devices, and the LPDDR2 protocol with one discrete memory device.

4.5.2 Read Datapath

One PLL output is used to capture data from the memory during read operations. The clock phase is calibrated by the sequencer before the interface is ready for use.

Figure 137. Read Datapath
(Diagram showing, per memory group, DQ capture and DDR-to-SDR conversion into half-rate registers, followed by LFIFO and VFIFO buffers in the FPGA, with the PLL supplying the capture clock and the address/command/clock outputs to devices 0 and 1.)

For DDR3 interfaces, two PLL outputs capture data from the memory devices during a read. In a 24-bit interface—whether the top 8 bits are used by ECC or not—the supported topology is two discrete DDR3 devices of 16-bit and 8-bit DQ each. Each discrete device has a dedicated capture clock output from the PLL.

For LPDDR2 interfaces, the supported configuration is a single memory device with a memory width of 16-bit DQ. The other PLL output is used for DQS tracking purposes, because tDQSCK drift might otherwise cause data capture to fail. The tracking clock is a shifted capture clock used to sample the DQS signal. By capturing the DQS signal, the system can compensate for DQS signal drift.
Figure 138. Read Datapath Timing Diagram
(Waveform showing the AFI-side afi_rdata_en and afi_rdata_valid, the VFIFO pipe generating the LFIFO write and read enables, the captured data after the rdata FIFO, the half-rate register outputs clocked by the divided capture clock, and the DDIO capture of mem_dq by the capture clock.)

4.5.3 Write Datapath

The write datapath is a static path, and is timing-analyzed to meet timing requirements.

Figure 139. Write Datapath
(Diagram showing the half-rate to full-rate conversion of write data and address/command/clock in the FPGA, driving double data rate outputs to memory groups 0 through 2 on devices 0 and 1.)

In the PHY write datapath, for even write latency, the write data valid, write data, and DQS enable pass through one stage of fr_cycle_shifter in a flow-through path. For odd memory write latency, the output is shifted by a full-rate clock cycle. The full-rate cycle-shifted output feeds into a simple DDIO.

Figure 140. Write Datapath Timing Diagram
(Waveform showing afi_wdata and phy_ddio_dq after the fr_cycle_shifter, the write data valid and DQS enable signals (afi_wdata_valid, phy_ddio_wrdata_en, afi_dqs_burst, phy_ddio_dqs_en), the multiplexer selects, and the resulting mem_dq, mem_dqs, and mem_dqs_n outputs.)

4.5.4 Address and Command Datapath

Implementation of the address and command datapath differs between DDR2/DDR3 and LPDDR2, because of the double data-rate command in LPDDR2.

LPDDR2

For LPDDR2, CA is four bits wide on the AFI interface. The least-significant bit of CA is captured by a negative-edge-triggered flop, and the most-significant bit of CA is captured by a positive-edge-triggered flop, before multiplexing, to provide maximum setup and hold margins for the AFI clock-to-MEM clock data transfer.

DDR2/DDR3

For DDR2/DDR3, the full-rate register and MUX architecture is similar to LPDDR2, but both phases are driven with the same signal. The DDR3 address and command signal is not clocked by the write/ADC clock as with LPDDR2, but by the inverted MEM clock, for better address and command margin.

Chip Select

Because the memory controller can drive both phases of cs_n in half-rate, the signal is fully exposed to the AFI side.

4.5.5 Sequencer

The sequencer employs an RTL-based state machine which assumes control of the interface at reset (whether at initial startup or when the IP is reset) and maintains control throughout the calibration process. The sequencer relinquishes control to the memory controller only after successful calibration.

The sequencer consists of a calibration state machine, together with a read-write (RW) manager, PHY Manager, and PLL Manager.

Figure 141. Sequencer Architecture
(Block diagram showing the AFI multiplexer selecting between the memory controller and the sequencer, whose calibration state machine drives the read/write manager, PHY Manager, and PLL Manager in front of the datapath and PLL.)

Figure 142. Calibration State Machine Stages
(Flowchart: protocol-specific initialization (DDR2, DDR3, or LPDDR2); precharge and activate; guaranteed write; then read and check data while sweeping the VFIFO and incrementing the PLL phase until a working range is established; the capture clock phase is then centered within the margin, followed by precharge and FIFO reset, the protocol's user mode register load, switching the AFI multiplexer and asserting calibration success, and finally VT tracking.)

RW Manager

The read-write (RW) manager encapsulates the protocol to read and write to the memory device through the Altera PHY Interface (AFI). It provides a buffer that stores the data to be sent to and read from memory, and provides the following commands:

• Write configuration—configures the memory for use. Sets up burst lengths, read and write latencies, and other device-specific parameters.
• Refresh—initiates a refresh operation at the DRAM. The sequencer also provides a register that determines whether the RW manager automatically generates refresh signals.
• Enable or disable multi-purpose register (MPR)—for memory devices with a special register that contains calibration-specific patterns that you can read, this command enables or disables access to the register.
• Activate row—for memory devices that have both rows and columns, this command activates a specific row. Subsequent reads and writes operate on this specific row.
• Precharge—closes a row before you can access a new row.
• Write or read burst—writes or reads a burst length of data.
• Write guaranteed—writes with a special mode where the memory holds address and data lines constant. Intel guarantees this type of write to work in the presence of skew, but constrains it to write the same data across the entire burst length.
• Write and read back-to-back—performs back-to-back writes or reads to adjacent banks. Most memory devices have strict timing constraints on subsequent accesses to the same bank; thus, back-to-back writes and reads have to reference different banks.
• Protocol-specific initialization—a protocol-specific command required by the initialization sequence.

PHY Manager

The PHY Manager provides access to the PHY for calibration, and passes relevant calibration results to the PHY. For example, the PHY Manager sets the VFIFO and LFIFO buffer parameters resulting from calibration, signals the PHY when the memory initialization sequence finishes, and reports the pass/fail status of calibration.

PLL Manager

The PLL Manager controls the phase of capture clocks during calibration. The output phases of individual PLL outputs can be dynamically adjusted relative to each other and to the reference clock, without having to load the scan chain of the PLL. The phase is shifted by 1/8th of the period of the voltage-controlled oscillator (VCO) at a time. The output clocks remain active during this dynamic phase-shift process.

A PLL counter increments with every phase increase and decrements with every phase reduction. The PLL Manager records the amount by which the PLL counter has shifted since the last reset, enabling the sequencer and tracking manager to determine whether the phase boundary has been reached.

4.6 Calibration

The purpose of calibration is to exercise the external memory interface to find an optimal window for the capture clock. There is no calibration for writing data or for address and command output; these paths are analyzed by TimeQuest to meet timing requirements. The calibration algorithm assumes that address and command signals can reliably be sent to the external memory device, and that write data is registered with DQS.

4.6.1 Read Calibration

Read calibration consists of two primary parts: capture clock and VFIFO buffer calibration, and read latency tuning.

A VFIFO buffer is a FIFO buffer that is calibrated to reflect the read latency of the interface. The calibration process selects the correct read pointer of the FIFO. The VFIFO function delays the controller's afi_rdata_valid signal to align with data captured internally in the PHY.

LFIFO buffers are FIFO buffers which ensure that data from different DQS groups arrives at the user side at the same time. Calibration ensures that data arrives at the user side with minimal latency.

Capture Clock and VFIFO Calibration

Capture clock and VFIFO calibration performs the following steps:

• Executes a guaranteed write routine in the read-write (RW) manager.
• Sweeps VFIFO buffer values, beginning at 0, in the PHY Manager.
  — For each adjustment of the VFIFO buffer, sweeps the capture clock phase in the PLL Manager to find the first phase that works. This is accomplished by issuing a read to the RW Manager and performing a bit check. Data is compared for all data bits.
• Increments the capture clock phase in the PLL Manager until capture clock phase values are exhausted or until the system stops working. If there are no more capture clock phase values to try, calibration increments the VFIFO buffer value in the PHY Manager and sweeps the phase again until the system stops working. Completion of this step establishes a working range of values.
• As a final step, calibration centers the capture clock phase within the working range.

Read Latency Tuning

Read latency tuning performs the following steps to achieve the optimal latency value:

• Assigns one LFIFO buffer for each DQ group.
• Aligns read data to the AFI clock.
• Gradually reduces LFIFO latency until reads fail, then increases latency to find the minimum value that yields reliable operation.

4.6.2 Write Calibration

There is no calibration routine for writing data.

4.7 Sequencer Debug Information

Following calibration, the sequencer loads a set of debug information onto an output port. You can use the SignalTap II Logic Analyzer to access the debug information in the presynthesized design. You could also bring this port to a register, to latch the value and make it accessible to a host processor. The output port where the debug information is available is not normally connected to anything, so it could be removed during synthesis.

Signal name: phy_cal_debug_info
Module: <corename>_s0.v

The signal is 32 bits wide, and is defined in the following table.

Table 60.
The sequencer consists of a calibration state machine, together with a read-write (RW) manager, a PHY Manager, and a PLL Manager.

Figure 141. Sequencer Architecture

Figure 142. Calibration State Machine Stages

RW Manager

The read-write (RW) manager encapsulates the protocol for reading from and writing to the memory device through the Altera PHY Interface (AFI). It provides a buffer that stores the data to be sent to and read from memory, and provides the following commands:

• Write configuration—configures the memory for use. Sets up burst lengths, read and write latencies, and other device-specific parameters.
• Refresh—initiates a refresh operation at the DRAM. The sequencer also provides a register that determines whether the RW manager automatically generates refresh signals.
• Enable or disable multi-purpose register (MPR)—for memory devices with a special register that contains calibration-specific patterns that you can read, this command enables or disables access to the register.
• Activate row—for memory devices that have both rows and columns, this command activates a specific row. Subsequent reads and writes operate on this specific row.
• Precharge—closes a row before you can access a new row.
• Write or read burst—writes or reads a burst length of data.
• Write guaranteed—writes with a special mode where the memory holds the address and data lines constant. Intel guarantees this type of write to work in the presence of skew, but constrains it to write the same data across the entire burst length.
• Write and read back-to-back—performs back-to-back writes or reads to adjacent banks. Most memory devices have strict timing constraints on subsequent accesses to the same bank; thus back-to-back writes and reads must reference different banks.
• Protocol-specific initialization—a protocol-specific command required by the initialization sequence.

PHY Manager

The PHY Manager provides access to the PHY for calibration, and passes relevant calibration results to the PHY. For example, the PHY Manager sets the VFIFO and LFIFO buffer parameters resulting from calibration, signals the PHY when the memory initialization sequence finishes, and reports the pass/fail status of calibration.

PLL Manager

The PLL Manager controls the phase of the capture clocks during calibration. The output phases of individual PLL outputs can be adjusted dynamically, relative to each other and to the reference clock, without having to load the scan chain of the PLL. The phase is shifted by 1/8th of the period of the voltage-controlled oscillator (VCO) at a time. The output clocks remain active during this dynamic phase-shift process. A PLL counter increments with every phase increase and decrements with every phase reduction. The PLL Manager records the amount by which the PLL counter has shifted since the last reset, enabling the sequencer and tracking manager to determine whether the phase boundary has been reached.
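A minimal model of that bookkeeping follows. It is a sketch under stated assumptions: the hardware phase-shift access itself is omitted, and max_steps is a hypothetical stand-in for the device-specific phase range.

typedef struct {
    int steps;   /* net phase steps applied since the last reset */
} pll_phase_tracker;

/* One dynamic phase step: +1 advances the capture-clock phase by 1/8 of
 * a VCO period, -1 retards it. Only the counter behavior the text
 * describes is modeled here. */
static void pll_phase_step(pll_phase_tracker *t, int direction)
{
    t->steps += direction;
}

/* The sequencer and tracking manager consult the recorded step count to
 * decide whether a phase boundary has been reached, i.e. whether further
 * shifts in a given direction are possible. */
static int pll_phase_boundary_reached(const pll_phase_tracker *t, int max_steps)
{
    return t->steps <= 0 || t->steps >= max_steps;
}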
4.6 Calibration

The purpose of calibration is to exercise the external memory interface to find an optimal window for the capture clock. There is no calibration for write data or for address and command output; those paths are analyzed by TimeQuest to meet timing requirements. The calibration algorithm assumes that address and command signals can reliably be sent to the external memory device, and that write data is registered with DQS.

4.6.1 Read Calibration

Read calibration consists of two primary parts: capture clock and VFIFO buffer calibration, and read latency tuning.

A VFIFO buffer is a FIFO buffer that is calibrated to reflect the read latency of the interface. The calibration process selects the correct read pointer of the FIFO. The VFIFO function delays the controller's afi_rdata_valid signal to align with data captured internally to the PHY. LFIFO buffers are FIFO buffers which ensure that data from different DQS groups arrives at the user side at the same time. Calibration ensures that data arrives at the user side with minimal latency.

Capture Clock and VFIFO Calibration

Capture clock and VFIFO calibration performs the following steps (a simplified sketch appears at the end of this section):

• Executes a guaranteed write routine in the read-write (RW) manager.
• Sweeps VFIFO buffer values, beginning at 0, in the PHY Manager. For each adjustment of the VFIFO buffer, sweeps the capture clock phase in the PLL Manager to find the first phase that works. This is accomplished by issuing a read through the RW Manager and performing a bit check; data is compared for all data bits.
• Increments the capture clock phase in the PLL Manager until the capture clock phase values are exhausted or until the system stops working. If there are no more capture clock phase values to try, calibration increments the VFIFO buffer value in the PHY Manager and sweeps the phase again until the system stops working. Completion of this step establishes a working range of values.
• As a final step, calibration centers the capture clock phase within the working range.

Read Latency Tuning

Read latency tuning performs the following steps to achieve the optimal latency value:

• Assigns one LFIFO buffer for each DQ group.
• Aligns read data to the AFI clock.
• Gradually reduces LFIFO latency until reads fail, then increases latency to find the minimum value that yields reliable operation.

4.6.2 Write Calibration

There is no calibration routine for write data.
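The following C sketch outlines the sweep under simplifying assumptions: read_back_matches stands in for the guaranteed-write-plus-read-back bit check performed through the RW Manager, the phase and VFIFO ranges are illustrative, and a working window is assumed not to straddle a VFIFO boundary (the real algorithm continues the sweep across VFIFO increments).

#include <stdbool.h>

extern bool read_back_matches(int vfifo, int phase); /* assumed pass/fail oracle */

#define NUM_PHASES 32   /* illustrative; device-specific */
#define MAX_VFIFO  16   /* illustrative; device-specific */

static bool calibrate_capture(int *vfifo_out, int *phase_out)
{
    for (int vfifo = 0; vfifo < MAX_VFIFO; vfifo++) {
        int first = -1, last = -1;
        for (int phase = 0; phase < NUM_PHASES; phase++) {
            if (read_back_matches(vfifo, phase)) {
                if (first < 0) first = phase;   /* window opens  */
                last = phase;                   /* window grows  */
            } else if (first >= 0) {
                break;                          /* window closed */
            }
        }
        if (first >= 0) {
            *vfifo_out = vfifo;
            *phase_out = (first + last) / 2;    /* center the capture phase */
            return true;
        }
    }
    return false;                               /* calibration failed */
}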
4.7 Sequencer Debug Information

Following calibration, the sequencer loads a set of debug information onto an output port. You can use the SignalTap II Logic Analyzer to access the debug information in the presynthesized design. You can also bring this port to a register, to latch the value and make it accessible to a host processor. The output port where the debug information is available is not normally connected to anything, so it could be removed during synthesis.

Signal name: phy_cal_debug_info
Module: <corename>_s0.v

The signal is 32 bits wide, and is defined in the following table.

Table 60. Sequencer Debug Data
best_comp_result [23:16] | Best data comparison result on a per-pin basis from the Read-Write Manager, for pins 16–23. A mismatch on a pin's data comparison produces a high bit. If calibration fails, the result is non-zero; when calibration passes, the result is zero.
best_comp_result [15:8] | Best data comparison result on a per-pin basis from the Read-Write Manager, for pins 8–15. A mismatch on a pin's data comparison produces a high bit. If calibration fails, the result is non-zero; when calibration passes, the result is zero.
best_comp_result [7:0] | Best data comparison result on a per-pin basis from the Read-Write Manager, for pins 0–7. A mismatch on a pin's data comparison produces a high bit. If calibration fails, the result is non-zero; when calibration passes, the result is zero.
margin [7:0] | Margin found by the sequencer if calibration passes. The number represents the number of consecutive PLL phases at which valid data is found. If calibration fails, this number is zero; when calibration passes, it is non-zero.

Debug Example and Interpretation

The following table illustrates possible debug signal values and their interpretations.

best_comp_result [23:16] | best_comp_result [15:8] | best_comp_result [7:0] | margin [7:0]
Passing result: 0000 0000 | 0000 0000 | 0000 0000 | 0001 0000
Interpretation: No failing pins. | No failing pins. | No failing pins. | 16 phases of margin for the valid window—the ideal case for a 300 MHz interface.
Failing result: 0010 0000 | 1111 1111 | 0000 0000 | 0000 0000
Interpretation: Pin 21 failing. | All pins failing (pins 8–15). | No failing pins. | No valid window.
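If the debug word is latched into a register for a host processor, a small helper can report the Table 60 fields. The packing of the two fields into the 32-bit word is an assumption here (comparison result in the upper 24 bits, margin in the low byte); check your generated RTL before relying on it.

#include <stdint.h>
#include <stdio.h>

static void decode_cal_debug(uint32_t info)
{
    uint32_t comp   = (info >> 8) & 0xFFFFFFu; /* best_comp_result[23:0], assumed position */
    uint32_t margin = info & 0xFFu;            /* margin[7:0], assumed position            */

    if (comp == 0 && margin != 0) {
        printf("calibration passed, %u phases of margin\n", (unsigned)margin);
        return;
    }
    for (int pin = 0; pin < 24; pin++)         /* one fail bit per DQ pin */
        if (comp & (1u << pin))
            printf("pin %d failing\n", pin);
    if (margin == 0)
        printf("no valid window found\n");
}

Feeding the failing example above (comp = 0x20FF00, margin = 0) reports pin 21, pins 8 through 15, and no valid window, matching the interpretation row.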
4.8 Register Maps

This topic provides register information for the MAX 10 EMIF.

Table 61. Controller Register Map
Address | Description
0x100 - 0x126 | Reserved
0x130 | ECC control register
0x131 | ECC status register
0x132 | ECC error address register

UniPHY does not have a configuration and status register. The HPC II controller does have a configuration and status register when configured for ECC.

Related Links
Soft Controller Register Map on page 371
The soft controller register map allows you to control the soft memory controller settings.

4.9 Document Revision History

Table 62. Document Revision History
Date | Version | Changes
May 2017 | 2017.05.08 | Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Maintenance release.
November 2015 | 2015.11.02 | Changed instances of Quartus II to Quartus Prime.
May 2015 | 2015.05.04 | Updated the external memory protocol support. Added a topic about the low power feature.
December 2014 | 2014.12.15 | Initial release.

6 Functional Description—Hard Memory Interface

The hard (on-chip) memory interface components are available in the Arria V and Cyclone V device families.

The Arria V device family includes hard memory interface components supporting DDR2 and DDR3 SDRAM memory protocols at speeds of up to 533 MHz. For the Quartus II software version 12.0 and later, the Cyclone V device family offers both hard and soft interface support. The Arria V device family supports both hard and soft interfaces for DDR3 and DDR2, and soft interfaces for LPDDR2 SDRAM, QDR II SRAM, and RLDRAM II memory protocols. The Cyclone V device family supports both hard and soft interfaces for DDR3, DDR2, and LPDDR2 SDRAM memory protocols.

The hard memory interface consists of three main parts, as follows:

• The multi-port front end (MPFE), which allows multiple independent accesses to the hard memory controller.
• The hard memory controller, which initializes, refreshes, manages, and communicates with the external memory device.
• The hard PHY, which provides the physical layer interface to the external memory device.

Figure 143. Hard Memory Interface Architecture

6.1 Multi-Port Front End (MPFE)

The multi-port front end and its associated fabric interface provide up to six command ports, four read-data ports, and four write-data ports, through which user logic can access the hard memory controller. Each port can be configured as read-only or write-only, or read and write ports may be combined to form bidirectional data ports. Ports can be 32, 64, 128, or 256 data bits wide, depending on the number of ports used and the type (unidirectional or bidirectional) of the port.

Fabric Interface

The fabric interface provides communication between the Avalon-ST-like internal protocol of the hard memory interface and the external Avalon-MM protocol. The fabric interface supports frequencies in the range of 10 MHz to one-half of the memory interface frequency. For example, for an interface running at 533 MHz, the maximum user logic frequency is 267 MHz. The MPFE handles the clock crossing between user logic and the hard memory interface.

The multi-port front end read and write FIFO depths are 8, and the command FIFO depth is 4. The FIFO depths are not configurable.

Operation Ordering

Requests arriving at a given port are executed in the order in which they are received.
Requests arriving at different ports have no guaranteed order of service, except when a first transaction has completed before the second arrives.

6.2 Multi-port Scheduling

Multi-port scheduling is governed by two considerations: the absolute priority of a request and the weighting of a port. User-configurable priority and weight settings determine the absolute and relative scheduling policy for each port.

6.2.1 Port Scheduling

The evaluation of absolute priority ensures that ports carrying higher-priority traffic are served ahead of ports carrying lower-priority traffic. The scheduler recognizes eight priority levels, with higher values representing higher priorities. Priority is absolute; for example, any transaction with priority seven is always scheduled before transactions of priority six or lower.

When ports carry traffic of the same absolute priority, their relative order of service is determined by a weighted round-robin (WRR) algorithm based on the five-bit port weight.

The scheduler can raise the priority of a transaction if its latency target is exceeded. The scheduler tracks latency on a per-port basis, and counts the cycles for which a transaction is pending. Each port has a priority escalation register and a pending counter engagement register. If the number of cycles in the pending counter engagement register elapses without a pending transaction being served, that transaction's priority is escalated.

To ensure that high-priority traffic is served quickly and that long and short bursts are effectively interleaved across ports, bus transactions longer than a single DRAM burst are scheduled as a series of DRAM bursts, with each burst arbitrated separately.

The scheduler uses a form of deficit round-robin (DRR) scheduling algorithm which corrects for past over-servicing or under-servicing of a port. Each port has an associated running weight which is updated every cycle, with a user-configured weight added to it and the amount of traffic served subtracted from it. The port with the highest running weight is considered the most eligible.

To ensure that lower-priority ports do not build up large running weights while higher-priority ports monopolize bandwidth, the hard memory controller's DRR weights are updated only when a port matches the scheduled priority. Hence, if three ports have traffic, two at priority 7 and one at priority 4, the weights of both priority-7 ports are updated while the weight of the priority-4 port remains unchanged.

6.2.2 DRAM Burst Scheduling

DRAM burst scheduling recognizes addresses that access the same column/row combination—also known as open page accesses. Such operations are always served in the order in which they are received by the single-port controller.

Selection of DRAM operations is a two-stage process: first, each pending transaction must wait for its timers to become eligible for execution; then the transaction arbitrates against the other transactions that are also eligible for execution. The following rules govern transaction arbitration (expressed as a comparator in the sketch below):

• High-priority operations take precedence over lower-priority operations
• If multiple operations are in arbitration, read operations have precedence over write operations
• If multiple operations still remain, the oldest is served first
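Translated directly into code, the three rules form a strict comparator. The struct fields are illustrative; serve_first(a, b) returns true when burst a should be served before burst b.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    unsigned priority;   /* higher value = higher priority      */
    bool     is_read;    /* reads beat writes at equal priority */
    uint64_t age;        /* lower value = received earlier      */
} dram_burst;

static bool serve_first(const dram_burst *a, const dram_burst *b)
{
    if (a->priority != b->priority) return a->priority > b->priority;
    if (a->is_read  != b->is_read ) return a->is_read;  /* read precedence */
    return a->age < b->age;                             /* oldest first    */
}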
A high-priority transaction in the DRAM burst scheduler wins arbitration for a bank immediately if the bank is idle and the high-priority transaction's chip select/row/column address does not match an address already in the single-port controller. If the bank is not idle, other operations to that bank yield until the high-priority operation is finished. If the address matches another chip select/row/column, the high-priority transaction yields until the earlier transaction is completed.

You can force the DRAM burst scheduler to serve transactions in the order in which they are received, by setting a bit in the register set.

6.2.3 DRAM Power Saving Modes

The hard memory controller supports two DRAM power-saving modes: self refresh, and fast/slow all-bank precharge powerdown exit. Engagement of a DRAM power saving mode can occur due to inactivity, or in response to a user command. The user command to enter power-down mode forces the DRAM burst-scheduling bank-management logic to close all banks and issue the power-down command. You can program the controller to power down when the DRAM burst-scheduling queue has been empty for a specified number of cycles; the DRAM is reactivated when an active DRAM command is received.

6.3 MPFE Signal Descriptions

The following table describes the signals for the multi-port front end.

Table 63. MPFE Signals
Signal | Direction | Description
avl_<signal_name>_# (1) | — | Local interface signals.
mp_cmd_clk_#_clk (1) | Input | Clock for the command FIFO buffer. (3) Follows the Avalon-MM master frequency. Maximum frequency is one-half of the interface frequency, subject to timing closure.
mp_cmd_reset_n_#_reset_n (1) | Input | Asynchronous reset signal for the command FIFO buffer.
mp_rfifo_clk_#_clk (2) | Input | Clock for the read data FIFO buffer. Follows the Avalon-MM master frequency. Maximum frequency is one-half of the interface frequency, subject to timing closure.
mp_rfifo_reset_n_#_reset_n (2) | Input | Asynchronous reset signal for the read data FIFO buffer.
mp_wfifo_clk_#_clk (2) | Input | Clock for the write data FIFO buffer. Follows the Avalon-MM master frequency. Maximum frequency is one-half of the interface frequency, subject to timing closure.
mp_wfifo_reset_n_#_reset_n (2) | Input | Asynchronous reset signal for the write data FIFO buffer.
bonding_in_1/2/3 | Input | Bonding interface input port. Connect the second controller's bonding output port to this port according to the port sequence.
bonding_out_1/2/3 | Output | Bonding interface output port. Connect this port to the second controller's bonding input port according to the port sequence.

Notes to Table:
1. # represents the number of the slave port. Values are 0–5.
2. # represents the number of the slave port. Values are 0–3.
3. The command FIFO buffers have two stages. The first stage is 4 bus transactions deep per port. After port scheduling, commands are placed in the second-stage FIFO buffer, which is 8 DRAM transactions deep. The second-stage FIFO buffer is used to optimize memory operations where bank look-ahead and data reordering occur. The write data buffer is 32 deep and the read buffer is 64 deep.
Every input interface (command, read data, and write data) has its own clock domain. Each command port can be connected to a different clock, but the read data and write data ports associated with a command port must connect to the same clock as that command port. Each input interface uses the same reset signal as its clock.

By default, the IP generates all clock signals regardless of the MPFE settings, but all unused ports and FIFO buffers are connected to ground. All six command ports can be used only in unidirectional configurations—4 write and 2 read, 3 write and 3 read, or 2 write and 4 read. For bidirectional ports, the number of clocks is reduced from 6 to a maximum of 4.

For the scenario depicted in the following figure:

• command port 0 is associated with read and write data FIFOs 0 and 1
• command port 1 is associated with read data FIFO 2
• command port 2 is associated with write data FIFO 2

Figure 144. Sample MPFE Configuration

Therefore, if port 0 (avl_0) is clocked by a 100 MHz clock signal, then mp_cmd_clk_0, mp_rfifo_clk_0, mp_rfifo_clk_1, mp_wfifo_clk_0, and mp_wfifo_clk_1 must all be connected to the same 100 MHz clock, as illustrated below.

Figure 145. Sample Connection Mapping

6.4 Hard Memory Controller

The hard memory controller initializes, refreshes, manages, and communicates with the external memory device.

Note: The hard memory controller is functionally similar to the High Performance Controller II (HPC II). For information on signals, refer to the Functional Description—HPC II Controller chapter.

Related Links
Functional Description—HPC II Controller on page 343
The High Performance Controller II works with the UniPHY-based DDR2, DDR3, and LPDDR2 interfaces.

6.4.1 Clocking

The ports on the MPFE can be clocked at different frequencies, and synchronization is maintained by cross-domain clocking logic in the MPFE. Command ports can connect to different clocks, but the data ports associated with a given command port must be attached to the same clock as that command port. For example, a bidirectional command port that performs a 64-bit read/write function has its read port and write port connected to the same clock as the command port. Note that these clocks are separate from the clocks generated by the EMIF core.

6.4.2 Reset

The ports of the MPFE must connect to the same reset signal. When the reset signal is asserted, it resets the command and data FIFO buffers in the MPFE without resetting the hard memory controller.

Note: The global_reset_n and soft_reset_n signals are asynchronous.

For easiest management of reset signals, Intel recommends the following sequence at power-up (sketched in code below):

1. Initially, global_reset_n, soft_reset_n, and the MPFE reset signals are all asserted.
2. global_reset_n is deasserted.
3. Wait for pll_locked to transition high.
4. soft_reset_n is deasserted.
5. (Optional) If you encounter difficulties, wait for the controller signal local_cal_success to go high, indicating that the external memory interface has successfully completed calibration, before deasserting the MPFE FIFO reset signals. This ensures that read/write activity cannot occur until the interface is successfully calibrated.
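The same sequence, expressed as pseudo-driver steps. The assert_reset/deassert_reset/poll_until_high helpers are hypothetical stand-ins for however your system drives these nets (for example, from a test harness or CSR bits); they are not IP signals or an Intel API.

extern void assert_reset(const char *net);
extern void deassert_reset(const char *net);
extern void poll_until_high(const char *net);

static void hard_emif_bringup(void)
{
    /* 1. Start with every reset asserted. */
    assert_reset("global_reset_n");
    assert_reset("soft_reset_n");
    assert_reset("mpfe_fifo_resets");

    /* 2-3. Release the global reset, then wait for the PLL to lock. */
    deassert_reset("global_reset_n");
    poll_until_high("pll_locked");

    /* 4. Release the soft reset. */
    deassert_reset("soft_reset_n");

    /* 5. Optionally hold the MPFE FIFO resets until calibration passes,
     *    so no read/write traffic can reach an uncalibrated interface. */
    poll_until_high("local_cal_success");
    deassert_reset("mpfe_fifo_resets");
}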
6.4.3 DRAM Interface

The DRAM interface is 40 bits wide, and can accommodate 8-bit, 16-bit, 16-bit plus ECC, 32-bit, or 32-bit plus ECC configurations. Any unused I/Os in the DRAM interface can be reused as user I/Os. The DRAM interface supports the DDR2 and DDR3 memory protocols, and LPDDR2 for Cyclone V only. Fast and medium speed grade devices are supported, up to 533 MHz for Arria V and 400 MHz for Cyclone V.

6.4.4 ECC

The hard controller supports error-correcting code (ECC) calculated either by the controller or by the user. Controller ECC employs standard Hamming logic which can detect and correct single-bit errors and detect double-bit errors. Controller ECC is available for 16-bit and 32-bit widths, each requiring an additional 8 bits of memory, resulting in actual memory widths of 24 bits and 40 bits, respectively.

In user ECC mode, all bits are treated as data bits, and are written to and read from memory. User ECC can implement nonstandard memory widths such as 24-bit or 40-bit, where ECC is not required.

Controller ECC

Controller ECC provides the following features:

Byte Writes—The memory controller performs a read/modify/write operation to keep ECC valid when a subset of the bits of a word is being written. If an entire word is being written (but less than a full burst) and the DM pins are connected, no read is necessary and only that word is updated. If controller ECC is disabled, byte writes have no performance impact.

ECC Write Backs—When a read operation detects a correctable error, the memory location is scheduled for a read/modify/write operation to correct the single-bit error.

User ECC—User ECC is 24 bits or 40 bits wide; with user ECC, the controller performs no ECC checking. The controller employs memory word addressing with byte enables, and can handle arbitrary memory widths. User ECC does not disable byte writes; hence, you must ensure that any byte writes do not result in corrupted ECC.
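The controller's actual code matrix is not published here; the following self-contained C sketch shows the standard Hamming SECDED construction the text describes, for the 32-bit case (32 data bits plus 7 check bits in a 39-bit codeword). __builtin_parityll is a GCC/Clang builtin.

#include <stdbool.h>
#include <stdint.h>

/* XOR-reduce (parity) of a 64-bit word. */
static int parity64(uint64_t v) { return __builtin_parityll(v); }

/* Codeword positions 1..38 form the Hamming code (power-of-two positions
 * hold check bits); bit 0 holds the overall parity that upgrades
 * single-error correction to SECDED. */
static uint64_t secded39_encode(uint32_t data)
{
    uint64_t cw = 0;
    int pos = 3;
    for (int i = 0; i < 32; i++) {                 /* scatter data bits  */
        while ((pos & (pos - 1)) == 0) pos++;      /* skip 4, 8, 16, 32  */
        if ((data >> i) & 1) cw |= 1ull << pos;
        pos++;
    }
    for (int p = 1; p <= 32; p <<= 1) {            /* Hamming check bits */
        uint64_t mask = 0;
        for (int j = 1; j <= 38; j++)
            if (j & p) mask |= 1ull << j;
        if (parity64(cw & mask)) cw |= 1ull << p;
    }
    if (parity64(cw)) cw |= 1;                     /* overall parity     */
    return cw;
}

/* Corrects any single-bit error and flags double-bit errors, mirroring
 * the detect/correct behavior described above. */
static uint32_t secded39_decode(uint64_t cw, bool *double_err)
{
    int syndrome = 0;
    for (int p = 1; p <= 32; p <<= 1) {
        uint64_t mask = 0;
        for (int j = 1; j <= 38; j++)
            if (j & p) mask |= 1ull << j;
        if (parity64(cw & mask)) syndrome |= p;
    }
    int overall = parity64(cw);                    /* 0 for a clean word  */
    *double_err = (syndrome != 0 && overall == 0);
    if (syndrome != 0 && overall == 1)
        cw ^= 1ull << syndrome;                    /* fix single-bit flip */

    uint32_t data = 0;
    int pos = 3;
    for (int i = 0; i < 32; i++) {                 /* gather data bits    */
        while ((pos & (pos - 1)) == 0) pos++;
        if ((cw >> pos) & 1) data |= 1u << i;
        pos++;
    }
    return data;
}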
6.4.5 Bonding of Memory Controllers

Bonding is a feature that allows data to be split between two memory controllers, providing the ability to service bandwidth streams similar to a single 64-bit controller. Bonding works by dividing the data buses in proportion to the memory widths, and always sending a transaction to both controllers. When signals are returned, bonding ensures that both sets of signals are returned identically. Bonding can be applied to asymmetric controllers, and allows the controllers to have different memory clocks. Bonding does not attempt to synchronize the controllers.

Bonding supports only one port. The Avalon port width can be varied from 64 bits to 256 bits; a 32-bit port width is not supported.

The following signals require bonding circuitry:

Read data return—This bonding allows read data from the two controllers to return with effectively one ready signal to the bus master that initiated the bus transaction.

Write ready—For Avalon-MM, this is effectively bonding on the waitrequest signal.

Write acknowledge—Synchronization on returning the write-completed signal.

For each of the above implementations, data is returned in order; hence the circuitry must match up for each valid cycle. Bonded FIFO buffers must have identical FIFO numbers; that is, read FIFO 1 on controller 1 must be paired with read FIFO 1 on controller 2.

Data Return Bonding

Long loop times can lead to communication problems when using bonded controllers. The following effects are possible when using bonded controllers:

• If one memory controller completes its transaction and receives new data before the other controller, the second controller can send data as soon as that data arrives, before the first controller acknowledges that the second controller has data.
• If the first controller has a single word in its FIFO buffer and the second controller receives single-word transactions, the second controller must determine whether the second word is a valid signal or not.

To accommodate these effects, the hard controller maintains two counters for each bonded pair of FIFO buffers and implements logic that monitors those counters to ensure that the bonded controllers receive the same data on the same cycle, and that they send the data out on the same cycle.

FIFO Ready

FIFO ready bonding is used for the write command and write data buses. The implementation is similar to the data return bonding.

Bonding Latency Impact

Bonding has no latency impact on ports that are not bonded.

Bonding Controller Usage

Arria V and Cyclone V devices employ three shared bonding controllers to manage the read data return bonding, write acknowledge bonding, and command/write data ready bonding. The three bonding controllers require three pairs of bonding I/Os, each based on a six-port count; this means that a bonded hard memory controller requires 21 input signals and 21 output signals for its connection to the fabric, and another 21 input signals and 21 output signals to the paired hard memory controller.

Note: The hard processor system (HPS) hard memory controller cannot be bonded with another hard memory controller on the FPGA portion of the device.

Bonding Configurations and Parameter Requirements

Intel has verified hard memory controller bonding between two interfaces with the following configuration:

• Same clock source
• Same memory clock frequency
• Same memory parameters and timings (except interface width)
• Same controller settings
• Same port width in the MPFE settings

Bonding supports only one port. The Avalon port width can be varied from 64 bits to 256 bits; a 32-bit port width is not supported.

6.5 Hard PHY

A physical layer interface (PHY) is embedded in the periphery of the Arria V device, and can run at the same high speed as the hard controller and hard sequencer. This hard PHY is located next to the hard controller. Differing device configurations have different numbers and sizes of hard controller and hard PHY pairs.

The hard PHY implements logic that connects the hard controller to the I/O ports. Because the hard controller and AFI interface support high frequencies, a portion of the sequencer is implemented as hard logic. The Nios II processor, the instruction/data RAM, and the Avalon fabric of the sequencer are implemented as core soft logic. The read/write manager and PHY manager components of the sequencer, which must operate at full rate, are implemented as hard logic in the hard PHY.

6.5.1 Interconnections

The hard PHY resides on the device between the hard controller and the I/O register blocks. The hard PHY is instantiated or bypassed entirely, depending on the parameterization that you specify.
The hard PHY connects to the hard memory controller and the core, enabling the use of either the hard memory controller or a soft controller. (You can pair the hard controller with the hard PHY, or a soft controller with a soft PHY; however, the combination of a soft controller with the hard PHY is not supported.) The hard PHY also connects to the I/O register blocks and the DQS logic. The path between the hard PHY and the I/O register blocks can be bypassed, but not reconfigured—in other words, if you use the hard PHY datapath, the pins to which it connects are predefined and specified by the device pin table.

6.5.2 Clock Domains

The hard PHY contains circuitry that uses the following clock domains:

AFI clock domain (pll_afi_clk)—The main full-rate clock signal that synchronizes most of the circuit logic.

Avalon clock domain (pll_avl_clk)—Synchronizes data on the internal Avalon bus, namely the Read/Write Manager, PHY Manager, and Data Manager data. The data is then transferred to the AFI clock domain. To ensure reliable data transfer between clock domains, the Avalon clock period must be an integer multiple of the AFI clock period, and the phases of the two clocks must be aligned.

Address and Command clock domain (pll_addr_cmd_clk)—Synchronizes the global asynchronous reset signal, used by the I/Os in this clock domain.

6.5.3 Hard Sequencer

The sequencer initializes the memory device and calibrates the I/Os, with the objective of maximizing timing margins and achieving the highest possible performance.

When the hard memory controller is in use, a portion of the sequencer must run at full rate; for this reason, the Read/Write Manager, PHY Manager, and Data Manager are implemented as hard components within the hard PHY. The hard sequencer communicates with the soft-logic sequencer components (including the Nios II processor) via an Avalon bus.

6.5.4 MPFE Setup Guidelines

The following instructions provide information on configuring the multi-port front end of the hard memory interface.

1. To enable the hard memory interface, turn on Enable Hard External Memory Interface on the Interface Type tab in the parameter editor.
2. To export the bonding interface ports to the top level, turn on Export bonding interface in the Multiple Port Front End pulldown on the Controller Settings tab in the parameter editor.
   Note: The system exports three bonding-in ports and three bonding-out ports. You must generate two controllers and connect the bonding ports manually.
3. To expand the interface data width from a maximum of 32 bits to a maximum of 40 bits, turn on Enable Avalon-MM data for ECC in the Multiple Port Front End pulldown on the Controller Settings tab in the parameter editor.
   Note: The controller does not perform ECC checking when this option is turned on.
4. Select the required Number of ports for the multi-port front end in the Multiple Port Front End pulldown on the Controller Settings tab in the parameter editor.
   Note: The maximum number of ports is 6, depending on the port type and width. The maximum port width is 256 bits, which is the maximum data width of the read data FIFO and write data FIFO buffers.
5. The table in the Multiple Port Front End pulldown on the Controller Settings tab in the parameter editor lists the ports that are created. The columns in the table describe each port, as follows:
   • Port: Indicates the port number.
   • Type: Indicates whether the port is read-only, write-only, or bidirectional.
   • Width: To achieve optimum MPFE throughput, Intel recommends setting the MPFE data port width according to the following calculation (see the sketch after this list):
     2 × (frequency ratio of HMC to user logic) × (interface data width)
     For example, if the frequency of your user logic is one-half the frequency of the hard memory controller, you should set the port width to be 4× the interface data width. If the frequency ratio of the hard memory controller to user logic is a fractional value, you should use the next larger value; for example, if the ratio is 1.5, you can use 2.
   • Priority: The priority setting specifies the priority of the slave port, with higher values representing higher priority. The slave port with the highest priority is served first.
   • Weight: The weight setting has a range of 0–31, and specifies the relative priority of a slave port, with higher weight representing higher priority. The weight value can determine relative bandwidth allocations for slave ports with the same priority value. For example, if two ports have the same priority value, and weight values of 4 and 6, respectively, the port with a weight of 4 will receive 40% of the bus bandwidth, while the port with a weight of 6 will receive 60% of the bus bandwidth—assuming 100% total available bus bandwidth.
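As arithmetic, the Width guideline reduces to a one-line helper. The function name and MHz parameters are illustrative; rounding the frequency ratio up to the next integer covers the fractional-ratio advice in the text.

static unsigned mpfe_port_width(unsigned f_hmc_mhz, unsigned f_user_mhz,
                                unsigned if_width_bits)
{
    /* ceil(f_hmc / f_user): a 1.5 ratio rounds up to 2, per the text */
    unsigned ratio = (f_hmc_mhz + f_user_mhz - 1) / f_user_mhz;
    return 2 * ratio * if_width_bits;
}

/* Example: a 533 MHz controller with user logic at 267 MHz (ratio 2) and
 * a 32-bit interface suggests a 2 * 2 * 32 = 128-bit MPFE port. */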
6.5.5 Soft Memory Interface to Hard Memory Interface Migration Guidelines

The following instructions provide information on mapping your soft memory interface to a hard memory interface.

Pin Connections

1. The hard and soft memory controllers have compatible pinouts. Assign interface pins to the hard memory interface according to the pin table.
2. Ensure that your soft memory interface pins can fit into the hard memory interface. The hard memory interface can support a maximum 40-bit interface with user ECC, or a maximum of 80 bits with same-side bonding. The soft memory interface does not support bonding.
3. Follow the recommended board layout guidelines for the hard memory interface.

Software Interface Preparation

Observe the following points in preparing your soft memory interface for migration to a hard memory interface:

• You cannot use the hard PHY without also using the hard memory controller.
• The hard memory interface supports only full-rate controller mode.
• Ensure that the MPFE data port width is set according to the soft memory interface's half-rate mode Avalon data width.
• The hard memory interface uses a different Avalon port signal naming convention than the soft memory interface. Ensure that you change the avl_<signal_name> signals in the soft memory interface to avl_<signal_name>_0 signals in the hard memory interface.
• The hard memory controller MPFE includes an additional three clock and three reset ports (for the CMD, RFIFO, and WFIFO ports) that do not exist with the soft memory controller. You should connect the user logic clock signal to the MPFE clock ports, and the user logic reset signal to the MPFE reset ports.
• In the soft memory interface, the half-rate afi_clk is a user logic clock. In the hard memory interface, afi_clk is a full-rate clock, because the core fabric might not be able to achieve full-rate speed. When you migrate your soft memory interface to a hard memory interface, you need to supply an additional slower-rate clock. The maximum clock rate supported by core logic is one-half of the maximum interface frequency.

Latency

Overall, you should expect slightly more latency when using the hard memory controller and multi-port front end than when using the soft memory controller. The hard memory controller typically exhibits lower latency than the soft memory controller; however, the multi-port front end introduces additional latency cycles due to the FIFO buffer stages used for synchronization. The MPFE cannot be bypassed, even if only one port is needed.

6.5.6 Bonding Interface Guidelines

Bonding allows a single data stream to be split between two memory controllers, providing the ability to expand the interface data width similar to a single 64-bit controller. This section provides guidelines for setting up the bonding interface.

1. Bonding interface ports are exported to the top level in your design. Connect each bonding_in_* port on one hard memory controller to the corresponding bonding_out_* port on the other hard memory controller, and vice versa.
2. Modify the Avalon signal connections to drive the bonding interface from a single user logic master, as follows:
   a. AND the avl_ready signals from both hard memory controllers before they enter the user logic.
   b. AND the avl_rdata_valid signals from both hard memory controllers before they enter the user logic. (The avl_rdata_valid signals should be identical for both hard memory controllers.)
   c. Branch the following signals from the user logic to both hard memory controllers:
      • avl_burstbegin
      • avl_addr
      • avl_read_req
      • avl_write_req
      • avl_size
   d. Split the following signals according to each multi-port front end data port width:
      • avl_rdata
      • avl_wdata
      • avl_be

6.6 Document Revision History

Date | Version | Changes
May 2017 | 2017.05.08 | Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Maintenance release.
November 2015 | 2015.11.02 | Maintenance release.
May 2015 | 2015.05.04 | Maintenance release.
December 2014 | 2014.12.15 | Maintenance release.
August 2014 | 2014.08.15 | Updated descriptions of mp_cmd_reset_n_#_reset_n, mp_rfifo_reset_n_#_reset_n, and mp_wfifo_reset_n_#_reset_n in the MPFE Signals table. Added Reset section to Hard Memory Controller section.
December 2013 | 2013.12.16 | Added footnote about command FIFOs to MPFE Signals table. Added information about FIFO depth for the MPFE. Added information about hard memory controller bonding. Reworded protocol-support information for Arria V and Cyclone V devices.
November 2012 | 2.1 | Added Hard Memory Interface Implementation Guidelines. Moved content of EMI-Related HPS Features in SoC Devices section to chapter 4, Functional Description—HPS Memory Controller.
June 2012 | 2.0 | Added EMI-Related HPS Features in SoC Devices. Added LPDDR2 support. Added Feedback icon.
November 2011 | 1.0 | Initial release.

7 Functional Description—HPS Memory Controller

The hard processor system (HPS) SDRAM controller subsystem provides efficient access to external SDRAM for the ARM* Cortex*-A9 microprocessor unit (MPU) subsystem, the level 3 (L3) interconnect, and the FPGA fabric.

Note: This chapter applies to the HPS architecture of the Arria V and Cyclone V memory controllers only.

The SDRAM controller provides an interface between the FPGA fabric and the HPS. The interface accepts Advanced Microcontroller Bus Architecture (AMBA®) Advanced eXtensible Interface (AXI™) and Avalon® Memory-Mapped (Avalon-MM) transactions, converts those commands to the correct commands for the SDRAM, and manages the details of the SDRAM access.
External Memory Interface Handbook Volume 3: Reference Material 272 7 Functional Description—HPS Memory Controller 7 Functional Description—HPS Memory Controller The hard processor system (HPS) SDRAM controller subsystem provides efficient access to external SDRAM for the ARM* Cortex*-A9 microprocessor unit (MPU) subsystem, the level 3 (L3) interconnect, and the FPGA fabric. Note: This chapter applies to the HPS architecture of Arria V and Cyclone V memory controllers only. The SDRAM controller provides an interface between the FPGA fabric and HPS. The interface accepts Advanced Microcontroller Bus Architecture (AMBA®) Advanced eXtensible Interface (AXI™) and Avalon® Memory-Mapped (Avalon-MM) transactions, converts those commands to the correct commands for the SDRAM, and manages the details of the SDRAM access. 7.1 Features of the SDRAM Controller Subsystem The SDRAM controller subsystem offers programming flexibility, port and bus configurability, error correction, and power management for external memories up to 4 GB. • Support for double data rate 2 (DDR2), DDR3, and low-power DDR2 (LPDDR2) SDRAM • Flexible row and column addressing with the ability to support up to 4 GB of memory in various interface configurations • Optional 8-bit integrated error correction code (ECC) for 16- and 32-bit data widths3 • User-configurable memory width of 8, 16, 16+ECC, 32, 32+ECC • User-configurable timing parameters • Two chip selects (DDR2 and DDR3) • Command reordering (look-ahead bank management) • Data reordering (out of order transactions) • User-controllable bank policy on a per port basis for either closed page or conditional open page accesses • User-configurable priority support with both absolute and weighted round-robin scheduling • Flexible FPGA fabric interface with up to 6 ports that can be combined for a data width up to 256 bits using Avalon-MM and AXI interfaces • Power management supporting self refresh, partial array self refresh (PASR), power down, and LPDDR2 deep power down 3 The level of ECC support is package dependent. Intel Corporation. All rights reserved. Intel, the Intel logo, Altera, Arria, Cyclone, Enpirion, MAX, Nios, Quartus and Stratix words and logos are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Intel warrants performance of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Intel assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. *Other names and brands may be claimed as the property of others. ISO 9001:2008 Registered 7 Functional Description—HPS Memory Controller 7.2 SDRAM Controller Subsystem Block Diagram The SDRAM controller subsystem connects to the MPU subsystem, the L3 interconnect, and the FPGA fabric. The memory interface consists of the SDRAM controller, the physical layer (PHY), control and status registers (CSRs), and their associated interfaces. Figure 146. 
Figure 146. SDRAM Controller Subsystem High-Level Block Diagram

SDRAM Controller

The SDRAM controller provides high-performance data access and run-time programmability. The controller reorders data to reduce row conflicts and bus turnaround time by grouping read and write transactions together, allowing for efficient traffic patterns and reduced latency.

The SDRAM controller consists of a multi-port front end (MPFE) and a single-port controller. The MPFE provides multiple independent interfaces to the single-port controller. The single-port controller communicates with and manages each external memory device.

The MPFE FPGA-to-HPS SDRAM interface port has an asynchronous FIFO buffer followed by a synchronous FIFO buffer. Both the asynchronous and synchronous FIFO buffers have read and write data FIFO depths of 8, and a command FIFO depth of 4. The MPU subsystem 64-bit AXI port and the L3 interconnect 32-bit AXI port have asynchronous FIFO buffers with read and write data FIFO depths of 8, and a command FIFO depth of 4.

DDR PHY

The DDR PHY provides a physical layer interface for read and write memory operations between the memory controller and memory devices. The DDR PHY has dataflow components, control components, and calibration logic that handle the calibration for the SDRAM interface timing.

Related Links
Memory Controller Architecture on page 277
The SDRAM controller consists of an MPFE, a single-port controller, and an interface to the CSRs.

7.3 SDRAM Controller Memory Options

Bank select, row, and column address lines can be configured to work with SDRAMs of various technology and density combinations.

Table 64. SDRAM Controller Interface Memory Options (4)
Memory Type | Mbits | Column Address Bit Width | Bank Select Bit Width | Row Address Bit Width | Page Size (Bytes) | MBytes
DDR2 | 256 | 10 | 2 | 13 | 1024 | 32
DDR2 | 512 | 10 | 2 | 14 | 1024 | 64
DDR2 | 1024 (1 Gb) | 10 | 3 | 14 | 1024 | 128
DDR2 | 2048 (2 Gb) | 10 | 3 | 15 | 1024 | 256
DDR2 | 4096 (4 Gb) | 10 | 3 | 16 | 1024 | 512
DDR3 | 512 | 10 | 3 | 13 | 1024 | 64
DDR3 | 1024 (1 Gb) | 10 | 3 | 14 | 1024 | 128
DDR3 | 2048 (2 Gb) | 10 | 3 | 15 | 1024 | 256
DDR3 | 4096 (4 Gb) | 10 | 3 | 16 | 1024 | 512
LPDDR2 | 64 | 9 | 2 | 12 | 512 | 8
LPDDR2 | 128 | 10 | 2 | 12 | 1024 | 16
LPDDR2 | 256 | 10 | 2 | 13 | 1024 | 32
LPDDR2 | 512 | 11 | 2 | 13 | 2048 | 64
LPDDR2 | 1024 (1 Gb) -S2 (5) | 11 | 2 | 14 | 2048 | 128
LPDDR2 | 1024 (1 Gb) -S4 (6) | 11 | 3 | 13 | 2048 | 128
LPDDR2 | 2048 (2 Gb) -S2 | 11 | 2 | 15 | 2048 | 256
LPDDR2 | 2048 (2 Gb) -S4 | 11 | 3 | 14 | 2048 | 256
LPDDR2 | 4096 (4 Gb) | 12 | 3 | 14 | 4096 | 512

(4) For all memory types shown in this table, the DQ width is 8.
(5) S2 signifies a 2n prefetch size.
(6) S4 signifies a 4n prefetch size.
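Because every device in the table is x8, each row can be cross-checked from the address widths alone: the page size is 2^columns bytes, and the capacity is 2^(rows + banks + columns) bytes. The helper below is illustrative, not part of any API.

#include <stdio.h>

static void sdram_geometry(unsigned col, unsigned bank, unsigned row)
{
    unsigned long page_bytes = 1ul << col;                 /* x8 device */
    unsigned long mbytes     = 1ul << (row + bank + col - 20);
    printf("page %lu bytes, %lu MB\n", page_bytes, mbytes);
}

/* sdram_geometry(10, 3, 14) -> "page 1024 bytes, 128 MB",
 * matching the 1 Gb DDR2/DDR3 rows above. */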
7.4.3 CSR Interface The CSR interface connects to the level 4 (L4) bus and operates on the l4_sp_clk clock domain. The MPU subsystem uses the CSR interface to configure the controller and PHY, for example setting the memory timing parameter values or placing the memory in a low power state. The CSR interface also provides access to the status registers in the controller and PHY. 7.4.4 FPGA-to-HPS SDRAM Interface The FPGA-to-HPS SDRAM interface provides masters implemented in the FPGA fabric access to the SDRAM controller subsystem in the HPS. The interface has three port types that are used to construct the following AXI or Avalon-MM interfaces: • Command ports—issue read and/or write commands, and for receive write acknowledge responses • 64-bit read data ports—receive data returned from a memory read • 64-bit write data ports—transmit write data The FPGA-to-HPS SDRAM interface supports six command ports, allowing up to six Avalon-MM interfaces or three AXI interfaces. Each command port can be used to implement either a read or write command port for AXI, or be used as part of an Avalon-MM interface. The AXI and Avalon-MM interfaces can be configured to support 32-, 64-, 128-, and 256-bit data. 4 For all memory types shown in this table, the DQ width is 8. External Memory Interface Handbook Volume 3: Reference Material 276 7 Functional Description—HPS Memory Controller Table 65. FPGA-to-HPS SDRAM Controller Port Types Port Type Available Number of Ports Command 6 64-bit read data 4 64-bit write data 4 The FPGA-to-HPS SDRAM controller interface can be configured with the following characteristics: • Avalon-MM interfaces and AXI interfaces can be mixed and matched as required by the fabric logic, within the bounds of the number of ports provided to the fabric. • Because the AXI protocol allows simultaneous read and write commands to be issued, two SDRAM control ports are required to form an AXI interface. • Because the data ports are natively 64-bit, they must be combined if wider data paths are required for the interface. • Each Avalon-MM or AXI interface of the FPGA-to-HPS SDRAM interface operates on an independent clock domain. • The FPGA-to-HPS SDRAM interfaces are configured during FPGA configuration. The following table shows the number of ports needed to configure different bus protocols, based on type and data width. Table 66. FPGA-to-HPS SDRAM Port Utilization Bus Protocol Command Ports Read Data Ports Write Data Ports 32- or 64-bit AXI 2 1 1 128-bit AXI 2 2 2 256-bit AXI 2 4 4 32- or 64-bit Avalon-MM 1 1 1 128-bit Avalon-MM 1 2 2 256-bit Avalon-MM 1 4 4 32- or 64-bit Avalon-MM write-only 1 0 1 128-bit Avalon-MM write-only 1 0 2 256-bit Avalon-MM write-only 1 0 4 32- or 64-bit Avalon-MM read-only 1 1 0 128-bit Avalon-MM read-only 1 2 0 256-bit Avalon-MM read-only 1 4 0 7.5 Memory Controller Architecture The SDRAM controller consists of an MPFE, a single-port controller, and an interface to the CSRs. External Memory Interface Handbook Volume 3: Reference Material 277 7 Functional Description—HPS Memory Controller Figure 147. 
Figure 147. SDRAM Controller Block Diagram

7.5.1 Multi-Port Front End

The multi-port front end (MPFE) is responsible for scheduling pending transactions from the configured interfaces and sending the scheduled memory transactions to the single-port controller. The MPFE handles all functions related to individual ports. The MPFE consists of three primary sub-blocks.

Command Block

The command block accepts read and write transactions from the FPGA fabric and the HPS. When the command FIFO buffer is full, the command block applies backpressure by deasserting the ready signal. For each pending transaction, the command block calculates the next SDRAM burst needed to progress on that transaction. The command block schedules pending SDRAM burst commands based on the user-supplied configuration, available write data, and unallocated read data space.

Write Data Block

The write data block transmits data to the single-port controller. The write data block maintains the write data FIFO buffers and clock boundary crossing for the write data. The write data block informs the command block of the amount of pending write data for each transaction, so that the command block can calculate eligibility for the next SDRAM write burst.

Read Data Block

The read data block receives data from the single-port controller. Depending on the port state, the read data block either buffers the data in its internal buffer or passes the data straight to the clock boundary crossing FIFO buffer. The read data block reorders out-of-order data for Avalon-MM ports (a toy model of this appears below). To prevent the read FIFO buffer from overflowing, the read data block informs the command block of the available buffer space, so that the command block can pace read transaction dispatch.
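A toy model of per-port in-order release: returned beats carry the index they were issued with, and beats that arrive early wait in the buffer until every earlier index has been released. The depth, types, and names are illustrative assumptions, and the model presumes no more than RBUF_DEPTH reads are outstanding.

#include <stdbool.h>
#include <stdint.h>

#define RBUF_DEPTH 64   /* illustrative */

typedef struct {
    uint64_t data[RBUF_DEPTH];
    bool     valid[RBUF_DEPTH];
    unsigned next_release;          /* index the port must see next */
} reorder_buf;

/* Store a beat that returned (possibly out of order) from memory. */
static void rbuf_store(reorder_buf *rb, unsigned issue_idx, uint64_t beat)
{
    rb->data[issue_idx % RBUF_DEPTH]  = beat;
    rb->valid[issue_idx % RBUF_DEPTH] = true;
}

/* Release beats to the port strictly in issue order. */
static bool rbuf_pop_in_order(reorder_buf *rb, uint64_t *beat)
{
    unsigned slot = rb->next_release % RBUF_DEPTH;
    if (!rb->valid[slot]) return false;  /* still waiting on earlier data */
    *beat = rb->data[slot];
    rb->valid[slot] = false;
    rb->next_release++;
    return true;
}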
7.5.2 Single-Port Controller

The single-port logic is responsible for the following actions:

• Queuing the pending SDRAM bursts
• Choosing the most efficient burst to send next
• Keeping the SDRAM pipeline full
• Ensuring that all SDRAM timing parameters are met

Transactions passed to the single-port logic for a single page in SDRAM are guaranteed to be executed in order, but transactions can be reordered between pages. Each SDRAM burst read or write is converted to the appropriate Altera PHY interface (AFI) command to open a bank on the correct row for the transaction (if required), execute the read or write command, and precharge the bank (if required).

The single-port logic implements command reordering (looking ahead at the command sequence to see which banks can be put into the correct state to allow a read or write command to be executed) and data reordering (allowing data transactions to be dispatched even if the data transactions are executed in an order different from the order in which they were received from the multi-port logic).

The single-port controller consists of eight sub-modules.

7.5.2.1 Command Generator

The command generator accepts commands from the MPFE and from the internal ECC logic, and provides those commands to the timer bank pool.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.2 Timer Bank Pool

The timer bank pool is a parallel queue that operates with the arbiter to enable data reordering. The timer bank pool tracks incoming requests, ensures that all timing requirements are met, and, on receiving write-data-ready notifications from the write data buffer, passes the requests to the arbiter.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.3 Arbiter

The arbiter determines the order in which requests are passed to the memory device. When the arbiter receives a single request, that request is passed immediately. When multiple requests are received, the arbiter uses arbitration rules to determine the order in which to pass requests to the memory device.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.4 Rank Timer

The rank timer performs the following functions:

• Maintains rank-specific timing information
• Ensures that only four activates occur within a specified timing window
• Manages the read-to-write and write-to-read bus turnaround time
• Manages the time-to-activate delay between different banks

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.5 Write Data Buffer

The write data buffer receives write data from the MPFE and passes the data to the PHY, on approval of the write request.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.6 ECC Block

The ECC block consists of an encoder and a decoder-corrector, which can detect and correct single-bit errors and detect double-bit errors resulting from noise or other impairments during data transmission.

Note: The level of ECC support is package dependent.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.7 AFI Interface

The AFI interface provides communication between the controller and the PHY.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.5.2.8 CSR Interface

The CSR interface is accessible from the L4 bus. The interface allows code in the HPS MPU or soft IP cores in the FPGA fabric to configure and monitor the SDRAM controller.

Related Links
Memory Controller Architecture on page 277
For more information, refer to the SDRAM Controller Block diagram.

7.6 Functional Description of the SDRAM Controller Subsystem

7.6.1 MPFE Operation Ordering

Operation ordering is defined and enforced within a port, but not between ports. All transactions received on a single port for overlapping addresses execute in order. Requests arriving at different ports have no guaranteed order of service, except when a first transaction has completed before the second arrives.

Avalon-MM does not support write acknowledgement.
When a port is configured to support Avalon-MM, you should read from a location that was previously written to ensure that the write operation has completed. When a port is configured to support AXI, the master accessing the port can safely issue a read operation to the same address as a write operation as soon as the write has been acknowledged. To keep write latency low, writes are acknowledged as soon as the transaction order is guaranteed: that is, as soon as any operations received on any port to the same address as the write operation are guaranteed to execute after the write operation.

To reduce read latency, the single-port logic can return read data out of order to the multi-port logic. The returned data is rearranged to its initial order, on a per-port basis, by the multi-port logic; no traffic reordering occurs between individual ports.

Read Data Handling

The MPFE contains a read buffer shared by all ports. If a port is capable of receiving returned data, the read buffer is bypassed. If the size of a read transaction is smaller than twice the memory interface width, the buffer RAM cannot be bypassed. The lowest memory latency occurs when the port is ready to receive data and the full width of the interface is utilized.

7.6.2 MPFE Multi-Port Arbitration

The HPS SDRAM controller multi-port front end (MPFE) contains a programmable arbiter. The arbiter decides which MPFE port gains access to the single-port memory controller.

The SDRAM transaction size that is arbitrated is a burst of two beats. This burst size ensures that the arbiter does not favor one port over another when an incoming transaction is a large burst.

The arbiter makes decisions based on two criteria: priority and weight. The priority is an absolute arbitration priority, where higher-priority ports always win arbitration over lower-priority ports. Because multiple ports can be set to the same priority, the weight value refines the port choice by implementing a round-robin arbitration among ports set to the same priority. This programmable weight allows you to assign a higher arbitration value to one port relative to others, such that the highest-weighted port receives more transaction bandwidth than lower-weighted ports of the same priority.

Before arbitration is performed, the MPFE buffers are checked for any incoming transactions. The priority of each port that has buffered transactions is compared, and the highest priority wins. If multiple ports share the same highest priority value, the port weight is applied to determine which port wins. Because the arbiter only allows SDRAM-sized bursts into the single-port memory controller, large transactions may need to be serviced multiple times before the read or write command is fully accepted by the single-port memory controller.

The MPFE supports dynamic tuning of the priority and weight settings for each port, with changes committed into the SDRAM controller at fixed intervals of time. Arbitration settings are applied to each port of the MPFE.

The memory controller supports a mix of Avalon-MM and AXI protocols. As defined in the "Port Mappings" section, the Avalon-MM ports consume a single command port, while the AXI ports consume a pair of command ports to support simultaneous read and write transactions. In total, there are ten command ports for the MPFE to arbitrate.
The following table shows the command port mapping within the HPS, as well as the ports exposed to the FPGA fabric.

Table 67. HPS SDRAM MPFE Command Port Mapping

Command Port | Allowed Functions | Data Size
0, 2, 4 | FPGA fabric AXI read command ports, or FPGA fabric Avalon-MM read or write command ports | 32-bit to 256-bit data
1, 3, 5 | FPGA fabric AXI write command ports, or FPGA fabric Avalon-MM read or write command ports | 32-bit to 256-bit data
6 | L3 AXI read command port | 32-bit data
7 | MPU AXI read command port | 64-bit data
8 | L3 AXI write command port | 32-bit data
9 | MPU AXI write command port | 64-bit data

When the FPGA ports are configured for AXI, the command ports are always assigned in groups of two, starting with an even-numbered port (0, 2, or 4) assigned to the read command channel. For example, if you configure the first FPGA-to-SDRAM port as AXI and the second port as Avalon-MM, you can expect the following mapping:
• Command port 0 = AXI read
• Command port 1 = AXI write
• Command port 2 = Avalon-MM read and write

Setting the MPFE Priority

The priority of each of the ten command ports is configured through the userpriority field of the mppriority register. This 30-bit field uses 3 bits per port to configure the priority. The lowest priority is 0x0 and the highest priority is 0x7. The bits are mapped in ascending order, with bits [2:0] assigned to command port 0 and bits [29:27] assigned to command port 9.

Setting the MPFE Static Weights

The static weight settings used in the round-robin command port priority scheme are programmed in a 128-bit field distributed among four 32-bit registers:
• mpweight_0_4
• mpweight_1_4
• mpweight_2_4
• mpweight_3_4

Each port is assigned a 5-bit value within the 128-bit field: port 0 is assigned to bits [4:0] of the mpweight_0_4 register, port 1 to bits [9:5] of the mpweight_0_4 register, and so on up to port 9, which is assigned to bits [49:45], contained in the mpweight_1_4 register. The valid weight range for each port is 0x0 to 0x1F, with a larger static weight representing a larger arbitration share.

Bits [113:50], spanning the mpweight_1_4, mpweight_2_4, and mpweight_3_4 registers, hold the sum of weights for each priority level. This 64-bit field is divided into eight 8-bit fields, each holding the sum of the static weights of all ports at one priority level. The fields are mapped in ascending order, with bits [57:50] holding the sum of static weights for all ports with priority 0x0, up through bits [113:106] holding the sum for all ports with priority 0x7.
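As an illustration of the bit packing just described, the following C sketch computes the mppriority value and the four mpweight_*_4 words from per-port priority and weight arrays. The register and field layouts come from the text above; the helper function and the example values (taken from Table 68 in the example that follows) are otherwise illustrative, and writing the results to hardware is not shown.

    #include <stdint.h>
    #include <stdio.h>

    /* Set 'width' bits at absolute bit position 'pos' within the 128-bit
     * field spanning regs[0..3] (mpweight_0_4 .. mpweight_3_4). */
    static void set_field(uint32_t regs[4], unsigned pos, unsigned width,
                          uint32_t value)
    {
        for (unsigned i = 0; i < width; i++) {
            unsigned bit = pos + i;
            if ((value >> i) & 1)
                regs[bit / 32] |= 1u << (bit % 32);
            else
                regs[bit / 32] &= ~(1u << (bit % 32));
        }
    }

    int main(void)
    {
        /* Example settings: ports 0-2 at priority 1 with weights 10/10/5,
         * ports 6-9 at priority 0 with weights 1/4/1/4. */
        const unsigned prio[10]   = { 1, 1, 1, 0, 0, 0, 0, 0, 0, 0 };
        const unsigned weight[10] = { 10, 10, 5, 0, 0, 0, 1, 4, 1, 4 };

        /* mppriority: 3 bits per port, port 0 in bits [2:0] ... port 9 in [29:27]. */
        uint32_t mppriority = 0;
        for (int p = 0; p < 10; p++)
            mppriority |= (prio[p] & 0x7u) << (3 * p);

        /* mpweight_0_4 .. mpweight_3_4 treated as one 128-bit field. */
        uint32_t mpweight[4] = { 0, 0, 0, 0 };

        /* Static weights: 5 bits per port, port 0 at bits [4:0] ... port 9 at [49:45]. */
        unsigned sum[8] = { 0 };
        for (int p = 0; p < 10; p++) {
            set_field(mpweight, 5 * p, 5, weight[p] & 0x1Fu);
            sum[prio[p]] += weight[p];
        }

        /* Sum-of-weights: 8 bits per priority level, priority 0x0 at bits [57:50]. */
        for (int lvl = 0; lvl < 8; lvl++)
            set_field(mpweight, 50 + 8 * lvl, 8, sum[lvl] & 0xFFu);

        printf("mppriority   = 0x%08X\n", (unsigned)mppriority);
        for (int i = 0; i < 4; i++)
            printf("mpweight_%d_4 = 0x%08X\n", i, (unsigned)mpweight[i]);
        return 0;
    }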
Example Using MPFE Priority and Weights

In this example, the following settings apply:
• FPGA MPFE port 0 is assigned to AXI read commands and port 1 is assigned to AXI write commands.
• FPGA MPFE port 2 is assigned to Avalon-MM read and write commands.
• The L3 master ports (ports 6 and 8) and the MPU ports (ports 7 and 9) are given the lowest priority, but the MPU ports are configured with more arbitration static weight than the L3 master ports.
• The FPGA MPFE command ports are given the highest priority; however, AXI ports 0 and 1 are given a larger static weight because they carry the highest-priority traffic in the entire system. Assigning a high priority and a larger static weight ensures that ports 0 and 1 receive the highest quality-of-service (QoS).

The table below details the port weights and sums of weights.

Table 68. SDRAM MPFE Port Priority, Weights, and Sum of Weights

Priority | Port 0 | Port 1 | Port 2 | Port 3 | Port 4 | Port 5 | Port 6 | Port 7 | Port 8 | Port 9 | Sum of Weights
1 | 10 | 10 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 4 | 1 | 4 | 10

If the FPGA-to-SDRAM ports are configured according to the table and both ports are accessed concurrently, you can expect the AXI port to receive 80% of the total service. This value is determined by dividing the sum of the port 0 and port 1 weights by the total weight of all ports at priority 1, that is, (10 + 10) / 25. The remaining 20% of bandwidth is allocated to the Avalon-MM port.

With these port settings, any FPGA transaction buffered by the MPFE for either slave port blocks the MPU and L3 masters from having their buffered transactions serviced. To avoid transaction starvation, assign ports the same priority level and adjust the bandwidth allocated to each port by adjusting the static weights.

MPFE Weight Calculation

The MPFE applies a deficit round-robin arbitration scheme to determine which port is serviced; a minimal sketch of the scheme appears after the considerations list below. The larger the port weight, the more often the port is serviced. Ports are serviced only when they have buffered transactions and are set to the highest priority of all the ports that also have buffered transactions.

The arbiter determines which port to service by examining the running weight of all ports at the same priority level; the port with the largest running weight is serviced. Each time a port is drained of all transactions, its running weight is set to 0x80. Each time a port is serviced, its static weight is added to, and the sum of weights is subtracted from, the running weight for that port. Each time a port is not serviced (it has the same priority as another port but a lower running weight), the static weight for the port is added to its running weight. The running weight additions and subtractions are applied to only one priority level, so while ports of a different priority level are being serviced, the running weight of a lower-priority port is not modified.

MPFE Multi-Port Arbitration Considerations for Use

When using MPFE multi-port arbitration, the following considerations apply:
• To ensure that the running weight value does not roll over when a port is serviced, the following inequality should always hold: (sum of weights - static weight) < 128. If the running weight remains less than 128, arbitration for that port remains functional.
• The memory controller commits the priority and weight registers into the MPFE arbiter once every 10 SDRAM clock cycles. As a result, when the mppriority register and mpweight_*_4 registers are configured at run time, the update interval can occur while the registers are still being written, and ports can briefly have different priorities or weights than intended. Because the mppriority register can be updated in a single 32-bit transaction, while the static weights are divided among four 32-bit registers and cannot be updated atomically, Intel recommends updating mppriority first, to ensure that transactions that need to be serviced have the appropriate priority after the next update.
• In addition to the mppriority register and mpweight_*_4 registers, the remappriority register adds another level of priority to the port scheduling. By programming bit N in the priorityremap field of the remappriority register, any port with an absolute priority of N is sent to the front of the single-port command queue and is serviced before any other transaction. Refer to the remappriority register for more details.
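The following C sketch models the running-weight bookkeeping described in "MPFE Weight Calculation" above. It is a behavioral illustration only, not controller RTL; the structure and function names are invented for the example.

    #include <stdio.h>

    #define NPORTS 10

    struct port {
        unsigned priority;   /* 0 (lowest) .. 7 (highest) */
        unsigned weight;     /* static weight, 0x0 .. 0x1F */
        int      pending;    /* buffered transactions */
        int      running;    /* running weight (deficit counter) */
    };

    /* Pick the next port to service: highest priority with pending work,
     * ties broken by largest running weight. Returns -1 if all are idle. */
    static int arbitrate(struct port p[NPORTS], const unsigned sum_of_weights[8])
    {
        int best = -1;
        for (int i = 0; i < NPORTS; i++) {
            if (!p[i].pending)
                continue;
            if (best < 0 || p[i].priority > p[best].priority ||
                (p[i].priority == p[best].priority && p[i].running > p[best].running))
                best = i;
        }
        if (best < 0)
            return -1;

        /* Update running weights only at the winning priority level. */
        for (int i = 0; i < NPORTS; i++) {
            if (p[i].priority != p[best].priority || !p[i].pending)
                continue;
            p[i].running += (int)p[i].weight;      /* every contender gains its static weight */
            if (i == best)                         /* the winner also pays the sum of weights */
                p[i].running -= (int)sum_of_weights[p[best].priority];
        }
        if (--p[best].pending == 0)
            p[best].running = 0x80;                /* drained ports reset to 0x80 */
        return best;
    }

    int main(void)
    {
        /* Table 68 settings for the FPGA ports: priority 1, weights 10/10/5. */
        struct port p[NPORTS] = { 0 };
        unsigned sums[8] = { 0 };
        unsigned w[3] = { 10, 10, 5 };
        int served[3] = { 0, 0, 0 };
        for (int i = 0; i < 3; i++) {
            p[i] = (struct port){ .priority = 1, .weight = w[i],
                                  .pending = 1000, .running = 0x80 };
            sums[1] += w[i];
        }
        for (int n = 0; n < 1000; n++)
            served[arbitrate(p, sums)]++;
        /* Expect roughly 400/400/200: service in proportion to static weight. */
        printf("service counts: %d %d %d\n", served[0], served[1], served[2]);
        return 0;
    }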
The scheduler is work-conserving: write operations can be scheduled only when enough data for the SDRAM burst has been received, and read operations can be scheduled only when sufficient internal memory is free and the port is not occupying too much of the read buffer.

7.6.3 MPFE SDRAM Burst Scheduling

SDRAM burst scheduling recognizes addresses that access the same row/bank combination, known as open page accesses. Operations to a page are served in the order in which they are received by the single-port controller.

Selection of SDRAM operations is a two-stage process. First, each pending transaction must wait for its timers to become eligible for execution. Next, the transaction arbitrates against other transactions that are also eligible for execution. The following rules govern transaction arbitration (sketched in code below):
• High-priority operations take precedence over lower-priority operations.
• If multiple operations are in arbitration, read operations have precedence over write operations.
• If multiple operations still remain, the oldest is served first.

A high-priority transaction in the SDRAM burst scheduler wins arbitration for a bank immediately if the bank is idle and the high-priority transaction's chip select, row, or column address fields do not match an address already in the single-port controller. If the bank is not idle, other operations to that bank yield until the high-priority operation is finished. If the chip select, row, and column fields match an earlier transaction, the high-priority transaction yields until the earlier transaction is completed.
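The three arbitration rules above amount to an ordered comparison. The C sketch below expresses the selection as a comparator over eligible transactions; the structure and field names are illustrative, not the controller's actual implementation.

    #include <stdbool.h>
    #include <stdint.h>

    /* An SDRAM burst that has passed its timing checks and is eligible. */
    struct xact {
        bool     high_priority;  /* elevated priority operation */
        bool     is_read;
        uint64_t age;            /* arrival order: smaller = older */
    };

    /* Returns true if transaction a should be serviced before transaction b,
     * following the documented rules: priority first, then reads over
     * writes, then oldest first. */
    static bool wins_arbitration(const struct xact *a, const struct xact *b)
    {
        if (a->high_priority != b->high_priority)
            return a->high_priority;     /* rule 1: priority */
        if (a->is_read != b->is_read)
            return a->is_read;           /* rule 2: reads beat writes */
        return a->age < b->age;          /* rule 3: oldest first */
    }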
Clocking

The FPGA fabric ports of the MPFE can be clocked at different frequencies. Synchronization is maintained by clock-domain crossing logic in the MPFE. Command ports can operate on different clock domains, but the data ports associated with a given command port must be attached to the same clock as that command port.

Note: A command port paired with a read and a write port to form an Avalon-MM interface must operate at the same clock frequency as the data ports associated with it.

7.6.4 Single-Port Controller Operation

The single-port controller increases the performance of memory transactions through command and data reordering, bank policy enforcement, write combining, and burst transfers. Correction of single-bit errors and detection of double-bit errors is handled in the ECC module of the single-port controller.

SDRAM Interface

The SDRAM interface is up to 40 bits wide and can accommodate 8-bit, 16-bit, 16-bit plus ECC, 32-bit, or 32-bit plus ECC configurations, depending on the device package. The SDRAM interface supports the LPDDR2, DDR2, and DDR3 memory protocols.

7.6.4.1 Command and Data Reordering

The heart of the SDRAM controller is a command and data reordering engine. Command reordering allows banks for future transactions to be opened before the current transaction finishes. Data reordering allows transactions to be serviced in a different order than the order in which they were received, when that new order improves utilization of the SDRAM bandwidth. Operations to the same bank and row are performed in order, ensuring that operations which affect the same address preserve data integrity.

The following figure shows the relative timing for a write/read/write/read command sequence performed in order, and the same command sequence performed with data reordering. Data reordering allows the write and read operations to occur in bursts, without bus turnaround timing delay or bank reassignment.

Figure 148. Data Reordering Effect. (With data reordering off, the command/address sequence is WR B0R0, RD B1R0, WR B0R0, RD B1R0; with data reordering on, the sequence is WR B0R0, WR B0R0, RD B1R0, RD B1R0.)

The SDRAM controller schedules among all pending row and column commands every clock cycle.

7.6.4.2 Bank Policy

The bank policy of the SDRAM controller allows users to request that a transaction's bank remain open after an operation has finished, so that future accesses to the same bank and row combination do not incur the delay of activating it again. The controller supports at most eight simultaneously open banks, so an open bank might be closed if the bank resource is needed for other operations. Open bank resources are allocated dynamically as SDRAM burst transactions are scheduled. Bank allocation is requested automatically by the controller when an incoming transaction spans multiple SDRAM bursts, or through the extended command interface. When a bank must be reallocated, the least-recently-used open bank is chosen as the replacement.

If the controller determines that the next pending command would cause the bank request to not be honored, the bank might be held open or closed depending on the pending operation. A request to close a bank that has a pending operation in the timer bank pool to the same row address causes the bank to remain open. A request to leave a bank open when there is a pending command to the same bank but a different row address causes a precharge operation to occur.

7.6.4.3 Write Combining

The SDRAM controller combines write operations from successive bursts on a port when the starting address of the second burst is one greater than the ending address of the first burst and the resulting burst length does not overflow the 11-bit burst-length counters. Write combining does not occur if the previous bus command has finished execution before the new command is received. (A sketch of this merge condition follows the "Width Matching" paragraph below.)

7.6.4.4 Burst Length Support

The controller supports burst lengths of 2, 4, 8, and 16. Data widths of 8, 16, and 32 bits are supported for non-ECC operation, and data widths of 24 and 40 bits are supported for operation with ECC enabled. The following table shows the supported SDRAM types for each burst length.

Table 69. SDRAM Burst Lengths

Burst Length | SDRAM
4 | LPDDR2, DDR2
8 | DDR2, DDR3, LPDDR2
16 | LPDDR2

Width Matching

The SDRAM controller automatically performs data width conversion.
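The following C sketch expresses the write-combining condition described in section 7.6.4.3. The 11-bit burst-length limit comes from the text; the types and the helper itself are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_BURST_LEN ((1u << 11) - 1)  /* 11-bit burst-length counter */

    struct write_burst {
        uint32_t start;   /* starting beat address */
        uint32_t len;     /* burst length in beats */
        bool     done;    /* previous command already finished execution */
    };

    /* Two successive write bursts on the same port can be combined when the
     * second starts exactly where the first ended (ending address + 1), the
     * first has not already completed, and the merged length still fits in
     * the 11-bit counter. */
    static bool can_combine(const struct write_burst *first,
                            const struct write_burst *second)
    {
        if (first->done)
            return false;
        if (second->start != first->start + first->len)
            return false;
        return first->len + second->len <= MAX_BURST_LEN;
    }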
7.6.4.5 ECC

The single-port controller supports memory ECC calculated by the controller. The controller ECC employs standard Hamming logic to detect and correct single-bit errors and to detect double-bit errors. The controller ECC is available for 16-bit and 32-bit widths; each requires an additional 8 bits of ECC memory, resulting in actual memory widths of 24 bits and 40 bits, respectively.

Note: The level of ECC support is package dependent.

The controller ECC provides the following functions:
• Byte writes
• ECC write backs
• Notification of ECC errors

7.6.4.5.1 Byte Writes

The memory controller performs a read-modify-write operation to ensure that the ECC data remains valid when a subset of the bits of a word is being written. Byte writes with ECC enabled are executed as a read-modify-write.

Typical operations use only a single entry in the timer bank pool; sub-word writes with controller ECC enabled use two entries. The first operation is a read and the second operation is a write. These two operations are transferred to the timer bank pool with an address dependency, so that the write cannot be performed until the read data has returned. This approach ensures that any subsequent operations to the same address (from the same port) are executed after the write operation, because they are ordered on the row list after the write operation.

If an entire word is being written (but less than a full burst) and the DM pins are connected, no read is necessary and only that word is updated. If controller ECC is disabled, byte writes have no performance impact.

7.6.4.5.2 ECC Write Backs

If controller ECC is enabled and a read operation results in a correctable ECC error, the controller corrects the location in memory, provided write backs are enabled. The correction schedules a new read-modify-write: a new read is performed at the location, to ensure that data from a write operation that has modified the location in the meantime is not overwritten, and the actual ECC correction is then performed as the read-modify-write operation. ECC write backs are enabled and disabled through the cfg_enable_ecc_code_overwrites field in the ctrlcfg register.

Note: Double-bit errors do not generate read-modify-write commands. Instead, the double-bit error address and count are reported through the erraddr and dbecount registers, respectively. In addition, a double-bit error interrupt can be enabled through the dramintr register.

7.6.4.5.3 User Notification of ECC Errors

When an ECC error occurs, an interrupt signal notifies the MPU subsystem, and the ECC error information is stored in the status registers. The memory controller provides interrupts for single-bit and double-bit errors. The status of interrupts and errors is recorded in status registers, as follows:
• The dramsts register records interrupt status.
• The dramintr register records interrupt masks.
• The sbecount register records the single-bit error count.
• The dbecount register records the double-bit error count.
• The erraddr register records the address of the most recent error.

For a 32-bit interface, ECC is calculated across a span of 8 bytes (4 bytes × a burst length of 2), so the error address is reported in units of 8-byte words. To find the byte address of the word that contains the error, multiply the value in the erraddr register by 8, as in the sketch below.
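A minimal C illustration of the erraddr conversion just described; the register-read helper is hypothetical and stands in for however your code accesses the CSR.

    #include <stdint.h>

    /* Hypothetical CSR accessor: returns the current value of erraddr. */
    extern uint32_t read_erraddr(void);

    /* erraddr counts 8-byte ECC words (4-byte interface width x burst of 2),
     * so the failing byte address is the register value multiplied by 8. */
    static inline uint64_t ecc_error_byte_address(void)
    {
        return (uint64_t)read_erraddr() * 8u;
    }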
7.6.4.6 Interleaving Options

The controller supports the following address-interleaving options:
• Non-interleaved
• Bank interleave without chip select interleave
• Bank interleave with chip select interleave

The following interleaving examples use 512 megabit (Mb) x 16 DDR3 chips and are documented as byte addresses. For RAMs with smaller address fields, the order of the fields stays the same but the widths may change. In the field layouts below, S = chip select, B = bank, R = row, and C = column.

Non-Interleaved

With the non-interleaved option, RAM mapping is non-interleaved.

Figure 149. Non-Interleaved Address Decoding (512 Mb x 16 DDR3 DRAM). (From most significant to least significant address bits: S | B(2:0) | R(15:0) | C(9:0).)

Bank Interleave Without Chip Select Interleave

Bank interleave without chip select interleave swaps the row and bank fields of the non-interleaved address mapping. This interleaving allows smaller data structures to spread across all banks in a chip.

Figure 150. Bank Interleave Without Chip Select Interleave Address Decoding. (From most significant to least significant address bits: S | R(15:0) | B(2:0) | C(9:0).)

Bank Interleave With Chip Select Interleave

Bank interleave with chip select interleave moves the row address to the top, followed by chip select, then bank, and finally the column address. This interleaving allows smaller data structures to spread across multiple banks and chips (giving access to 16 total banks for multithreaded access to blocks of memory). Memory timing is degraded when switching between chips.

Figure 151. Bank Interleave With Chip Select Interleave Address Decoding. (From most significant to least significant address bits: R(15:0) | S | B(2:0) | C(9:0).)
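The following C sketch decodes an address into chip select, bank, row, and column fields for the three mappings above. The field widths (S = 1, B = 3, R = 16, C = 10) follow the 512 Mb x 16 DDR3 example; treating the low 10 address bits directly as the column is a simplifying assumption for illustration.

    #include <stdint.h>
    #include <stdio.h>

    enum { C_BITS = 10, B_BITS = 3, R_BITS = 16 };

    struct decoded { unsigned cs, bank, row, col; };

    enum mapping {
        NON_INTERLEAVED,        /* S | B | R | C */
        BANK_INTERLEAVED,       /* S | R | B | C */
        BANK_CS_INTERLEAVED     /* R | S | B | C */
    };

    static struct decoded decode(uint64_t addr, enum mapping m)
    {
        struct decoded d;
        d.col = addr & ((1u << C_BITS) - 1);
        addr >>= C_BITS;
        switch (m) {
        case NON_INTERLEAVED:
            d.row  = addr & ((1u << R_BITS) - 1); addr >>= R_BITS;
            d.bank = addr & ((1u << B_BITS) - 1); addr >>= B_BITS;
            d.cs   = addr & 1u;
            break;
        case BANK_INTERLEAVED:
            d.bank = addr & ((1u << B_BITS) - 1); addr >>= B_BITS;
            d.row  = addr & ((1u << R_BITS) - 1); addr >>= R_BITS;
            d.cs   = addr & 1u;
            break;
        case BANK_CS_INTERLEAVED:
            d.bank = addr & ((1u << B_BITS) - 1); addr >>= B_BITS;
            d.cs   = addr & 1u;                   addr >>= 1;
            d.row  = addr & ((1u << R_BITS) - 1);
            break;
        }
        return d;
    }

    int main(void)
    {
        /* Consecutive blocks land in different banks under bank interleaving,
         * but in the same bank (different rows) when non-interleaved. */
        struct decoded a = decode(0x0400, BANK_INTERLEAVED);
        struct decoded b = decode(0x0400, NON_INTERLEAVED);
        printf("bank-interleaved: cs=%u bank=%u row=%u col=%u\n",
               a.cs, a.bank, a.row, a.col);
        printf("non-interleaved:  cs=%u bank=%u row=%u col=%u\n",
               b.cs, b.bank, b.row, b.col);
        return 0;
    }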
7.6.4.7 AXI-Exclusive Support

The single-port controller supports AXI-exclusive operations. The controller implements a table, shared across all masters, which can store up to 16 pending exclusive accesses. Table entries are allocated on an exclusive read, and table entries are deallocated on a successful write to the same address by any master. Any exclusive write operation whose address is not present in the table returns an exclusive fail as the acknowledgement of the operation. If the table is full when an exclusive read is performed, the table replaces a random entry.

Note: When using AXI-exclusive operations, accessing the same location from Avalon-MM interfaces can produce unpredictable results.

7.6.4.8 Memory Protection

The single-port controller has address protection that allows software to configure basic protection of memory from all masters in the system. If the system has been designed exclusively with AMBA masters, TrustZone® is supported. Ports that use Avalon-MM can be configured for port-level protection.

Memory protection is based on physical addresses in memory. The single-port controller can be configured with up to 20 rules that allow or prevent masters from accessing a range of memory, based on their AxIDs, their level of security, and the memory region being accessed. If an access matches no rules, the default settings take effect.

The rules are stored in an internal protection table and are accessed through indirect addressing offsets in the protruledwr register in the CSR. To read a specific rule, set the readrule bit and write the appropriate offset in the ruleoffset field of the protruledwr register. To write a new rule, three registers in the CSR must be configured:

1. The protportdefault register is programmed to control the default behavior of memory accesses when no rules match. When a bit is clear, default accesses from that port pass; when a bit is set, default accesses from that port fail. The bits are assigned as shown in the following table.

Table 70. protportdefault Register

Bits | Description
31:10 | Reserved.
9 | When set to 1, deny CPU writes during a default transaction; when clear, allow CPU writes during a default transaction.
8 | When set to 1, deny L3 writes during a default transaction; when clear, allow L3 writes during a default transaction.
7 | When set to 1, deny CPU reads during a default transaction; when clear, allow CPU reads during a default transaction.
6 | When set to 1, deny L3 reads during a default transaction; when clear, allow L3 reads during a default transaction.
5:0 | When a bit is set to 1, deny accesses from the corresponding FPGA-to-SDRAM port (ports 0 through 5) during a default transaction; when clear, allow accesses from that port during a default transaction.

2. The protruleid register gives the bounds of the AxID value that allows an access.

3. The protruledata register configures the specific security characteristics for a rule.

Once the registers are configured, they can be committed to the internal protection table by programming the ruleoffset field and setting the writerule bit in the protruledwr register.

Secure and non-secure regions are specified by rules containing a starting address and an ending address, with 1 MB granularity for both addresses. You can override the port defaults and allow or disallow all transactions. The following table lists the fields that you can specify for each rule; a sketch of the rule-matching logic appears after the table.

Table 71. Fields for Rules in Memory Protection Table

Field | Width | Description
Valid | 1 | Set to 1 to activate the rule; set to 0 to deactivate the rule.
Port Mask | 10 | Specifies the set of ports to which the rule applies, with one bit representing each port: bits 0 to 5 correspond to FPGA fabric ports 0 to 5, bit 6 is the L3 interconnect read, bit 7 is the CPU read, bit 8 is the L3 interconnect write, and bit 9 is the CPU write. (7)
AxID_low | 12 | Low transfer AxID to which this rule applies. Incoming transactions match if their AxID is greater than or equal to this value. Ports with smaller AxIDs have the AxID shifted to the lower bits and zero padded at the top. (7)
AxID_high | 12 | High transfer AxID to which this rule applies. Incoming transactions match if their AxID is less than or equal to this value. (7)
Address_low | 12 | Points to a 1 MB block and is the lower address bound. Incoming addresses match if they are greater than or equal to this value.
Address_high | 12 | Upper address bound. Incoming addresses match if they are less than or equal to this value.
Protection | 2 | A value of 0x0 indicates that the rule applies to secure transactions; a value of 0x1 indicates that the rule applies to non-secure transactions. Values 0x2 and 0x3 set the region to shared, meaning both secure and non-secure accesses are valid.
Fail/allow | 1 | Set this value to 1 to force the operation to fail, or to 0 to force it to succeed.

(7) Although AxID and Port Mask could be redundant, including both in the table allows possible compression of rules. If masters connected to a port do not have contiguous AxIDs, a port-based rule might be more efficient than an AxID-based rule in terms of the number of rules needed.

Each port has a default access status of either allow or fail. Rules with the opposite allow/fail value can override the default. The system evaluates each transaction against every rule in the memory protection table. If a transaction arrives at a port that defaults to access allowed, it fails only if a rule with the fail bit matches the transaction. Conversely, if a transaction arrives at a port that has the default set to access denied, it is allowed only if a matching rule forces access allowed.
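The following C sketch restates the matching condition implied by Table 71. It illustrates the table semantics, not the hardware implementation; the struct and function names are invented.

    #include <stdbool.h>
    #include <stdint.h>

    struct prot_rule {
        bool     valid;
        uint16_t port_mask;              /* 10 bits, one per command port */
        uint16_t axid_low, axid_high;    /* 12-bit AxID bounds */
        uint16_t addr_low, addr_high;    /* address bounds, in 1 MB units */
        uint8_t  protection;             /* 0x0 secure, 0x1 non-secure, 0x2/0x3 shared */
        bool     fail;                   /* 1 = force fail, 0 = force allow */
    };

    /* A transaction matches a rule when the rule is valid, the port bit is
     * set, the AxID lies in [axid_low, axid_high], the 1 MB-aligned address
     * lies in [addr_low, addr_high], and the security level is compatible. */
    static bool rule_matches(const struct prot_rule *r, unsigned port,
                             uint16_t axid, uint64_t byte_addr, bool secure)
    {
        uint64_t mb = byte_addr >> 20;   /* address in 1 MB units */
        if (!r->valid || !(r->port_mask & (1u << port)))
            return false;
        if (axid < r->axid_low || axid > r->axid_high)
            return false;
        if (mb < r->addr_low || mb > r->addr_high)
            return false;
        if (r->protection >= 0x2)        /* shared: both security levels match */
            return true;
        return secure ? (r->protection == 0x0) : (r->protection == 0x1);
    }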
Transactions that fail the protection rules return a slave error (SLVERR).

The recommended sequence for writing a rule is:
1. Write the protruledwr register fields as follows:
   • ruleoffset = user-selected offset that points to the indirect offset in the internal protection table
   • writerule = 0
   • readrule = 0
2. Write the protruleaddr, protruleid, and protruledata registers to configure the rule you would like to enforce.
3. Write the protruledwr register fields as follows:
   • ruleoffset = offset of the rule to be written
   • writerule = 1
   • readrule = 0

Similarly, the recommended sequence for reading a rule is:
1. Write the protruledwr register fields as follows:
   • ruleoffset = offset of the rule to be read
   • writerule = 0
   • readrule = 0
2. Write the protruledwr register fields as follows:
   • ruleoffset = offset of the rule to be read
   • writerule = 0
   • readrule = 1
3. Read the values of the protruleaddr, protruleid, and protruledata registers to determine the rule parameters.
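A C sketch of the rule-write sequence above, assuming memory-mapped CSR access. The base address and register offsets come from the register map in section 7.15.1; the bit positions of the ruleoffset, writerule, and readrule fields within the register are placeholders and must be taken from the register reference.

    #include <stdint.h>

    #define SDR_BASE        0xFFC20000u
    /* Offsets from the register map in section 7.15.1. */
    #define PROTRULEADDR    0x5090u
    #define PROTRULEID      0x5094u
    #define PROTRULEDATA    0x5098u
    #define PROTRULEDWR     0x509Cu

    /* Hypothetical field encodings for the protruledwr register. */
    #define RULEOFFSET(n)   ((uint32_t)(n) & 0x1Fu)
    #define WRITERULE       (1u << 5)
    #define READRULE        (1u << 6)

    static volatile uint32_t *reg(uint32_t offset)
    {
        return (volatile uint32_t *)(uintptr_t)(SDR_BASE + offset);
    }

    /* Follows the documented three-step write sequence. */
    static void write_protection_rule(unsigned offset, uint32_t addr_field,
                                      uint32_t id_field, uint32_t data_field)
    {
        *reg(PROTRULEDWR)  = RULEOFFSET(offset);             /* step 1: select rule, no op */
        *reg(PROTRULEADDR) = addr_field;                     /* step 2: describe the rule */
        *reg(PROTRULEID)   = id_field;
        *reg(PROTRULEDATA) = data_field;
        *reg(PROTRULEDWR)  = RULEOFFSET(offset) | WRITERULE; /* step 3: commit */
    }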
The following figure represents an overview of how the protection rules are applied. There is no priority among the 20 rules; all rules are always evaluated in parallel.

Figure 152. SDRAM Protection Access Flow Diagram. (All rules are evaluated in parallel and non-matching rules are ignored. If the port's default access is allowed, the access is rejected only when a matching "fail" rule exists; if the port's default access is not allowed, the access is allowed only when a matching "allow" rule exists.)

Exclusive transactions are security checked on the read operation only. A write operation can occur only if a valid read is marked in the internal exclusive table. Consequently, a master performing an exclusive read followed by a write can write to memory only if the exclusive read was successful.

Related Links
ARM TrustZone®
For more information about TrustZone®, refer to the ARM web page.

7.6.4.9 Example of Configuration for TrustZone

For a TrustZone® configuration, memory is divided into a range of memory accessible by secure masters and a range of memory accessible by non-secure masters. The two address ranges may overlap. This example implements the following memory configuration:
• 2 GB total RAM size
• 0—512 MB dedicated secure area
• 513—576 MB shared area
• 577—2048 MB dedicated non-secure area

Figure 153. Example Memory Configuration. (The 2048 MB address space, from bottom to top: 0 to 512 MB secure, 512 to 576 MB shared, 576 to 2048 MB non-secure.)

In this example, each port is configured by default to disallow all accesses. The following table shows the two rules programmed into the memory protection table.

Table 72. Rules in Memory Protection Table for Example Configuration

Rule # | Port Mask | AxID Low | AxID High | Address Low | Address High | protruledata.security | Fail/Allow
1 | 0x3FF (1023) | 0x000 | 0xFFF (4095) | 0 | 576 | 0x1 | Allow
2 | 0x3FF (1023) | 0x000 | 0xFFF (4095) | 512 | 2047 | 0x0 | Allow

The port mask value, AxID Low, and AxID High apply to all ports and all transfers within those ports. Each access request is evaluated against the memory protection table and fails unless there is a rule match allowing the transaction to complete successfully.

Table 73. Result for a Sample Set of Transactions

Operation | Source | Address | Access Security | Result | Comments
Read | CPU | 4096 | secure | Allow | Matches rule 1.
Write | CPU | 536,870,912 | secure | Allow | Matches rule 1.
Write | L3 attached masters | 605,028,350 | secure | Fail | Does not match rule 1 (out of range of the address field); does not match rule 2 (protection bit incorrect).
Read | L3 attached masters | 4096 | non-secure | Fail | Does not match rule 1 (AxPROT signal value wrong); does not match rule 2 (not in address range).
Write | CPU | 536,870,912 | non-secure | Allow | Matches rule 2.
Write | L3 attached masters | 605,028,350 | non-secure | Allow | Matches rule 2.

Note: If a master is using the Accelerator Coherency Port (ACP) to maintain cache coherency with the Cortex-A9 MPCore processor, the address ranges in the rules of the memory protection table should be made mutually exclusive, such that the secure and non-secure regions do not overlap and any shared area is part of the non-secure region. This configuration prevents coherency issues from occurring.

7.7 SDRAM Power Management

The SDRAM controller subsystem supports the following power-saving features in the SDRAM:
• Partial array self-refresh (PASR)
• Power down
• Deep power down for LPDDR2

To enable self-refresh for the memories of one or both chip selects, program the selfrshreq bit and the selfrfshmask field in the lowpwreq register.

Power-saving mode is initiated either by a user command or by inactivity. The number of idle clock cycles after which a memory can be put into power-down mode is programmed through the autopdycycles field of the lowpwrtiming register. Power-down mode forces the SDRAM burst-scheduling bank-management logic to close all banks and issue the power-down command. The SDRAM reactivates automatically when an active SDRAM command is received.

To enable a deep power down request for the LPDDR2 memories of one or both chip selects, program the deeppwrdnreq bit and the deeppwrdnmask field of the lowpwreq register.

Other power-down modes are performed only under user control.

7.8 DDR PHY

The DDR PHY connects the memory controller and external memory devices in the speed-critical command path. The DDR PHY implements the following functions:
• Calibration: the DDR PHY supports the JEDEC-specified steps to synchronize the memory timing between the controller and the SDRAM chips. The calibration algorithm is implemented in software.
• Memory device initialization: the DDR PHY performs the mode register write operations to initialize the devices, and handles re-initialization after a deep power down.
• Single-data-rate to double-data-rate conversion.

7.9 Clocks

All clocks are assumed to be asynchronous with respect to the ddr_dqs_clk memory clock.
All transactions are synchronized to the memory clock domain.

Table 74. SDRAM Controller Subsystem Clock Domains

Clock Name | Description
ddr_dq_clk | Clock for the PHY
ddr_dqs_clk | Clock for the MPFE, single-port controller, CSR access, and PHY
ddr_2x_dqs_clk | Clock for the PHY that provides up to 2 times the ddr_dq_clk frequency
l4_sp_clk | Clock for the CSR interface
mpu_l2_ram_clk | Clock for the MPU interface
l3_main_clk | Clock for the L3 interface
f2h_sdram_clk[5:0] | Six separate clocks used for the FPGA-to-HPS SDRAM ports to the FPGA fabric

In terms of clock relationships, the FPGA fabric connects the appropriate clocks to the write data, read data, and command ports for the constructed ports.

7.10 Resets

The SDRAM controller subsystem supports a full reset (cold reset) and a warm reset. The SDRAM controller can be configured to preserve memory contents during a warm reset. To preserve memory contents, the reset manager can request that the single-port controller place the SDRAM in self-refresh mode prior to issuing the warm reset. If self-refresh mode is entered before the warm reset to preserve memory contents, the PHY and the memory timing logic are not reset, but the rest of the controller is reset.

7.10.1 Taking the SDRAM Controller Subsystem Out of Reset

When a cold or warm reset is issued in the HPS, the reset manager resets this module and holds it in reset until software releases it. After the MPU boots up, it can deassert the reset signal by clearing the appropriate bits in the reset manager's corresponding reset register.

7.11 Port Mappings

The memory interface controller has a set of command, read data, and write data ports that support AXI3, AXI4, and Avalon-MM. The following tables identify the port assignments and functions.

Table 75. Command Port Assignments

Command Port | Allowed Functions
0, 2, 4 | FPGA fabric AXI read command ports, or FPGA fabric Avalon-MM read or write command ports
1, 3, 5 | FPGA fabric AXI write command ports, or FPGA fabric Avalon-MM read or write command ports
6 | L3 AXI read command port
7 | MPU AXI read command port
8 | L3 AXI write command port
9 | MPU AXI write command port

Table 76. Read Port Assignments

Read Port | Allowed Functions
0, 1, 2, 3 | 64-bit read data from the FPGA fabric. When 128-bit read data ports are created, read data ports 0 and 1 are paired, as are ports 2 and 3.
4 | 32-bit L3 read data port
5 | 64-bit MPU read data port

Table 77. Write Port Assignments

Write Port | Allowed Functions
0, 1, 2, 3 | 64-bit write data from the FPGA fabric. When 128-bit write data ports are created, write data ports 0 and 1 are paired, as are ports 2 and 3.
4 | 32-bit L3 write data port
5 | 64-bit MPU write data port

7.12 Initialization

The SDRAM controller subsystem has control and status registers (CSRs) that control the operation of the controller, including the DRAM type, DRAM timing parameters, and relative port priorities. It also has a small set of bits, implemented in the FPGA fabric, that configure the ports between the memory controller and the FPGA fabric; these bits are set for you when you configure your implementation using the HPS GUI in Qsys.

The CSRs are configured using a dedicated slave interface, which provides access to the registers. This region controls all SDRAM operation, MPFE scheduler configuration, and PHY calibration.
The FPGA fabric interface configuration is programmed into the FPGA fabric, and the values of these register bits can be read by software. The ports can therefore be configured without software developers needing to know how the FPGA-to-HPS SDRAM interface has been configured.

7.12.1 FPGA-to-SDRAM Protocol Details

The following topics summarize the signals for the Avalon-MM bidirectional port, Avalon-MM write-only port, Avalon-MM read port, and AXI port.

Note: If your device has multiple FPGA hardware images, the same FPGA-to-SDRAM port configuration should be used across all designs.

7.12.1.1 Avalon-MM Bidirectional Port

The Avalon-MM bidirectional ports are standard Avalon-MM ports used to dispatch read and write operations. Each configured Avalon-MM bidirectional port consists of the signals listed in the following table.

Table 78. Avalon-MM Bidirectional Port Signals

Name | Bit Width | Direction | Function
clk | 1 | In | Clock for the Avalon-MM interface
read (8) | 1 | In | Indicates a read transaction
write (8) | 1 | In | Indicates a write transaction
address | 32 | In | Address of the transaction
readdata | 32, 64, 128, or 256 | Out | Read data return
readdatavalid | 1 | Out | Indicates that the readdata signal contains valid data in response to a previous read request
writedata | 32, 64, 128, or 256 | In | Write data for a transaction
byteenable | 4, 8, 16, or 32 | In | Byte enables for each write byte lane
waitrequest | 1 | Out | Indicates the need for additional cycles to complete a transaction
burstcount | 11 | In | Transaction burst length; the value of the maximum burstcount parameter must be a power of 2

(8) The Avalon-MM protocol does not allow read and write transactions to be posted concurrently.

The read and write interfaces are configured to the same size. The byte-enable size scales with the data bus size.

Related Links
Avalon Interface Specifications
Information about the Avalon-MM protocol.

7.12.1.2 Avalon-MM Write-Only Port

The Avalon-MM write-only ports are standard Avalon-MM ports used to dispatch write operations. Each configured Avalon-MM write port consists of the signals listed in the following table.

Table 79. Avalon-MM Write-Only Port Signals

Name | Bits | Direction | Function
reset | 1 | In | Reset
clk | 1 | In | Clock
write | 1 | In | Indicates a write transaction
address | 32 | In | Address of the transaction
writedata | 32, 64, 128, or 256 | In | Write data for a transaction
byteenable | 4, 8, 16, or 32 | In | Byte enables for each write byte lane
waitrequest | 1 | Out | Indicates the need for additional cycles to complete a transaction
burstcount | 11 | In | Transaction burst length

Related Links
Avalon Interface Specifications
Information about the Avalon-MM protocol.

7.12.1.3 Avalon-MM Read Port

The Avalon-MM read ports are standard Avalon-MM ports used only to dispatch read operations. Each configured Avalon-MM read port consists of the signals listed in the following table.

Table 80. Avalon-MM Read Port Signals

Name | Bits | Direction | Function
reset | 1 | In | Reset
clk | 1 | In | Clock
read | 1 | In | Indicates a read transaction
address | 32 | In | Address of the transaction
readdata | 32, 64, 128, or 256 | Out | Read data return
readdatavalid | 1 | Out | Flags valid cycles for read data return
waitrequest | 1 | Out | Indicates the need for additional cycles to complete a transaction; needed for read operations when a delay is required to accept the read command
burstcount | 11 | In | Transaction burst length
Related Links
Avalon Interface Specifications
Information about the Avalon-MM protocol.

7.12.1.4 AXI Port

The AXI port uses an AXI-3 interface. Each configured AXI port consists of the signals listed in the following table. Because the AXI protocol allows simultaneous read and write commands to be issued, two SDRAM control ports are required to form an AXI interface.

Table 81. AXI Port Signals

Name | Bits | Direction | Channel | Function
ARESETn | 1 | In | n/a | Reset
ACLK | 1 | In | n/a | Clock
AWID | 4 | In | Write address | Write identification tag
AWADDR | 32 | In | Write address | Write address
AWLEN | 4 | In | Write address | Write burst length
AWSIZE | 3 | In | Write address | Width of the transfer size
AWBURST | 2 | In | Write address | Burst type
AWLOCK | 2 | In | Write address | Lock-type signal indicating whether the access is exclusive; valid values are 0x0 (normal access) and 0x1 (exclusive access)
AWCACHE | 4 | In | Write address | Cache policy type
AWPROT | 3 | In | Write address | Protection-type signal used to indicate whether a transaction is secure or non-secure
AWREADY | 1 | Out | Write address | Indicates ready for a write command
AWVALID | 1 | In | Write address | Indicates a valid write command
WID | 4 | In | Write data | Write data transfer ID
WDATA | 32, 64, 128, or 256 | In | Write data | Write data
WSTRB | 4, 8, 16, or 32 | In | Write data | Byte-based write data strobe; each bit corresponds to one 8-bit lane of the 32-bit to 256-bit data bus
WLAST | 1 | In | Write data | Last transfer in a burst
WVALID | 1 | In | Write data | Indicates write data and strobes are valid
WREADY | 1 | Out | Write data | Indicates ready for write data and strobes
BID | 4 | Out | Write response | Write response transfer ID
BRESP | 2 | Out | Write response | Write response status
BVALID | 1 | Out | Write response | Write response valid signal
BREADY | 1 | In | Write response | Write response ready signal
ARID | 4 | In | Read address | Read identification tag
ARADDR | 32 | In | Read address | Read address
ARLEN | 4 | In | Read address | Read burst length
ARSIZE | 3 | In | Read address | Width of the transfer size
ARBURST | 2 | In | Read address | Burst type
ARLOCK | 2 | In | Read address | Lock-type signal indicating whether the access is exclusive; valid values are 0x0 (normal access) and 0x1 (exclusive access)
ARCACHE | 4 | In | Read address | Cache policy type
ARPROT | 3 | In | Read address | Protection-type signal used to indicate whether a transaction is secure or non-secure
ARREADY | 1 | Out | Read address | Indicates ready for a read command
ARVALID | 1 | In | Read address | Indicates a valid read command
RID | 4 | Out | Read data | Read data transfer ID
RDATA | 32, 64, 128, or 256 | Out | Read data | Read data
RRESP | 2 | Out | Read data | Read response status
RLAST | 1 | Out | Read data | Last transfer in a burst
RVALID | 1 | Out | Read data | Indicates read data is valid
RREADY | 1 | In | Read data | Read data channel ready signal

Related Links
ARM AMBA Open Specification
AMBA Open Specifications, including information about the AXI-3 interface.
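As a worked illustration of the AWLEN/AWSIZE (and ARLEN/ARSIZE) encodings in Table 81, using standard AXI-3 semantics (AxLEN is the number of beats minus one, AxSIZE is log2 of the bytes per beat), the following C helper computes the signal values for a burst. The helper itself is illustrative.

    #include <assert.h>
    #include <stdint.h>

    struct axi3_burst {
        uint8_t axlen;   /* beats - 1; AXI-3 allows 1..16 beats (4 bits) */
        uint8_t axsize;  /* log2(bytes per beat), 3 bits */
    };

    /* Encode an AXI-3 burst of 'beats' transfers, each 'bytes_per_beat' wide.
     * For a 64-bit FPGA-to-SDRAM data port, bytes_per_beat would be 8. */
    static struct axi3_burst encode_burst(unsigned beats, unsigned bytes_per_beat)
    {
        struct axi3_burst b = { 0, 0 };
        assert(beats >= 1 && beats <= 16);
        b.axlen = (uint8_t)(beats - 1);
        while ((1u << b.axsize) < bytes_per_beat)
            b.axsize++;
        assert((1u << b.axsize) == bytes_per_beat);  /* must be a power of two */
        return b;
    }
    /* Example: 16 beats of 8 bytes gives AWLEN = 0xF, AWSIZE = 0x3. */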
7.13 SDRAM Controller Subsystem Programming Model

SDRAM controller configuration occurs through software programming of the configuration registers, using the CSR interface.

7.13.1 HPS Memory Interface Architecture

Configuration and initialization of the memory interface by the ARM processor is a significant difference from the FPGA memory interfaces, and results in several key differences in the way the HPS memory interface is defined and configured.

Boot-up configuration of the HPS memory interface is handled by the initial software boot code, not by the FPGA programmer, as is the case for the FPGA memory interfaces. The Quartus Prime software is still involved in defining the configuration of the I/O ports used by the boot-up code, as well as in timing analysis of the memory interface. Therefore, the memory interface must be configured with the correct PHY-level timing information. Although configuration of the memory interface in Qsys is still necessary, it is limited to PHY- and board-level settings.

7.13.2 HPS Memory Interface Configuration

To configure the external memory interface components of the HPS, open the HPS interface by selecting the Arria V/Cyclone V Hard Processor System component in Qsys. Within the HPS interface, select the EMIF tab to open the EMIF parameter editor.

The EMIF parameter editor contains four additional tabs: PHY Settings, Memory Parameters, Memory Timing, and Board Settings. The parameters available on these tabs are similar to those available in the parameter editors for non-SoC device families. The significant differences between the EMIF parameter editor for the Hard Processor System and the parameter editors for non-SoC devices are as follows:
• Because the HPS memory controller is not configurable through the Quartus Prime software, the Controller and Diagnostic tabs, which exist for non-SoC devices, are not present in the EMIF parameter editor for the Hard Processor System.
• Unlike the protocol-specific parameter editors for non-SoC devices, the EMIF parameter editor for the Hard Processor System supports multiple protocols; therefore there is an SDRAM Protocol parameter, where you can specify your external memory interface protocol. By default, the EMIF parameter editor assumes the DDR3 protocol, and other parameters are automatically populated with DDR3-appropriate values. If you select a protocol other than DDR3, change the other associated parameter values appropriately.
• Unlike the memory interface clocks in the FPGA, the memory interface clocks for the HPS are initialized by the boot-up code, using values provided by the configuration process. You may accept the values provided by UniPHY, or you may use your own PLL settings. If you choose to specify your own PLL settings, you must indicate that the clock frequency that UniPHY should use is the requested clock frequency, not the achieved clock frequency calculated by UniPHY.

Note: The HPS does not support EMIF synthesis generation, compilation, or timing analysis. The HPS hard memory controller cannot be bonded with another hard memory controller on the FPGA portion of the device.

7.13.3 HPS Memory Interface Simulation

Qsys provides a complete simulation model of the HPS memory interface controller and PHY, providing cycle-level accuracy comparable to the simulation models for the FPGA memory interfaces.

The simulation model supports only the skip-cal simulation mode; quick-cal and full-cal are not supported.

An example design is not provided.
However, you can create a test design by adding the traffic generator component to your design using Qsys.

Also, the HPS simulation model does not use external memory pins to connect to the DDR memory model; instead, the memory model is incorporated directly into the HPS SDRAM interface simulation modules. The memory instance incorporated into the HPS model is at the following location in the simulation model hierarchy: hps_0/fpga_interfaces/f2sdram/hps_sdram_inst/mem/

Simulation of the FPGA-to-SDRAM interfaces requires that you first bring the interfaces out of reset; otherwise, transactions cannot occur. Connect the H2F reset to the F2S port resets, and add a stage to your testbench to assert and deassert the H2F reset in the HPS. Appropriate Verilog code is shown below:

    initial begin
        // Assert reset
        <base name>.hps.fpga_interfaces.h2f_reset_inst.reset_assert();
        // Delay
        #1;
        // Deassert reset
        <base name>.hps.fpga_interfaces.h2f_reset_inst.reset_deassert();
    end

7.13.4 Generating a Preloader Image for HPS with EMIF

To generate a preloader image for an HPS-based external memory interface, you must complete the following tasks:
• Create a Qsys project.
• Create a top-level file and add constraints.
• Create a preloader BSP file.
• Create a preloader image.

7.13.4.1 Creating a Qsys Project

Before you can generate a preloader image, you must create a Qsys project, as follows:
1. On the Tools menu in the Quartus Prime software, click Qsys.
2. Under Component Library, expand Embedded Processor System, select Hard Processor System, and click Add.
3. Specify parameters for the FPGA Interfaces, Peripheral Pin Multiplexing, and HPS Clocks, based on your design requirements.
4. On the SDRAM tab, select the SDRAM protocol for your interface.
5. Populate the necessary parameter fields on the PHY Settings, Memory Parameters, Memory Timing, and Board Settings tabs.
6. Add the other Qsys components in your Qsys design and make the appropriate bus connections.
7. Save the Qsys project.
8. Click Generate on the Generation tab to generate the Qsys design.

7.13.4.2 Creating a Top-Level File and Adding Constraints

This topic describes adding your Qsys system to your top-level design and adding constraints to your design.
1. Add your Qsys system to your top-level design.
2. Add the Quartus Prime IP files (.qip) generated by Qsys to your Quartus Prime project.
3. Perform analysis and synthesis on your design.
4. Constrain your EMIF design by running the <variation_name>_p0_pin_assignments.tcl pin constraints script.
5. Add other necessary constraints, such as timing constraints, location assignments, and pin I/O standard assignments, for your design.
6. Compile your design to generate an SRAM object file (.sof) and the hardware handoff files necessary for creating a preloader image.

Note: You must regenerate the hardware handoff files whenever the HPS configuration changes, for example, due to changes in Peripheral Pin Multiplexing or the I/O standard for HPS pins.

Related Links
Intel SoC FPGA Embedded Design Suite User's Guide
For more information on how to create a preloader BSP file and image.

7.14 Debugging HPS SDRAM in the Preloader

To assist in debugging your design, tools are available at the preloader stage.
• UART or semihosting printout
• Simple memory test
• Debug report
• Predefined data patterns

The following topics provide procedures for implementing each of the above tools.

7.14.1 Enabling UART or Semihosting Printout

UART printout is enabled by default. If UART is not available on your system, you can use semihosting together with the debugger tool. To enable semihosting in the preloader, follow these steps:
1. When you create the .bsp file in the BSP Editor, select SEMIHOSTING in the spl.debug window.
2. Enable semihosting in the debugger by typing set semihosting enabled true at the command line in the debugger.

7.14.2 Enabling Simple Memory Test

After the SDRAM is successfully calibrated, a simple memory test may be performed using the debugger.
1. When you create the .bsp file in the BSP Editor, select HARDWARE_DIAGNOSTIC in the spl.debug window.
2. The simple memory test assumes SDRAM with a memory size of 1 GB. If your board contains a different SDRAM memory size, open the file <design folder>\spl_bsp\uboot-socfpga\include\configs\socfpga_cyclone5.h (for Cyclone V) or <design folder>\spl_bsp\uboot-socfpga\include\configs\socfpga_arria5.h (for Arria V) in a text editor, and change the PHYS_SDRAM_1_SIZE parameter at line 292 to specify your actual memory size in bytes.

7.14.3 Enabling the Debug Report

You can enable the SDRAM calibration sequencer to produce a debug report on the UART printout or semihosting output. To enable the debug report, follow these steps:
1. After you have enabled UART or semihosting, open the file <project directory>\hps_isw_handoff\sequencer_defines.h in a text editor.
2. Locate the line #define RUNTIME_CAL_REPORT 0 and change it to #define RUNTIME_CAL_REPORT 1.

Figure 154. Semihosting Printout With Debug Support Enabled

7.14.3.1 Analysis of Debug Report

The following analysis will help you interpret the debug report.
• The Read Deskew and Write Deskew results shown in the debug report are from before calibration. (Before-calibration results are actually from the window seen during calibration, and are the most useful for debugging.)
• For each DQ group, the Write Deskew, Read Deskew, DM Deskew, and Read after Write results map to the before-calibration margins reported in the EMIF Debug Toolkit.
Note: The Write Deskew, Read Deskew, DM Deskew, and Read after Write results are reported in delay steps (nominally 25 ps in Arria V and Cyclone V devices), not in picoseconds.
• DQS Enable calibration is reported as a VFIFO setting (in one-clock-period steps), a phase tap (in one-eighth-clock-period steps), and a delay chain step (in 25 ps steps). For example:

    SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; Start  VFIFO 5 ; Phase 6 ; Delay 4
    SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; End    VFIFO 6 ; Phase 5 ; Delay 9
    SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; Center VFIFO 6 ; Phase 2 ; Delay 1

Analysis of DQS Enable results: A VFIFO tap is one clock period, a phase is 1/8 clock period (45 degrees), and a delay tap is nominally 25 ps. The DQSen window is the difference between the start and the end. For the example above, assuming a frequency of 400 MHz (a 2500 ps period), the start is 5*2500 + 6*(2500/8) + 4*25 = 14475 ps. By the same calculation, the end is 16788 ps. Consequently, the DQSen window is 2313 ps.
• The size of a read window or write window is equal to (left edge + right edge) * delay chain step size. Both the left edge and the right edge can be negative or positive. For example:

    SEQ.C: Read Deskew  ; DQ 0 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 8 ; DQS delay 0
    SEQ.C: Write Deskew ; DQ 0 ; Rank 0 ; Left edge 30 ; Right edge 17 ; DQ delay 4 ; DQS delay 6

For the read example above, the window is (18 + 27) * 25 ps = 1125 ps. Analysis of DQ and DQS delay results: The DQ and DQS output delay (write) is the D5 delay chain. The DQ input delay (read) is the D1 delay chain; the DQS input delay (read) is the D4 delay chain.
• Consider the following example of latency results:

    SEQ.C: LFIFO Calibration ; Latency 10

Analysis of latency results: This is the calibrated PHY read latency, reported in clock cycles. The EMIF Debug Toolkit does not report this figure.
• Consider the following example of FOM results:

    SEQ.C: FOM IN = 83
    SEQ.C: FOM OUT = 91

Analysis of FOM results: The FOM IN value is a measure of the health of the read interface; it is calculated as the sum, over all groups, of the minimum margin on DQ plus the margin on DQS, divided by 2. The FOM OUT value is the corresponding measure of the health of the write interface, calculated the same way. You may refer to these values as indicators of improvement when experimenting with various termination schemes, assuming there are no individual misbehaving DQ pins.
• The debug report does not provide delay chain step size values. The delay chain step size varies with device speed grade. Refer to your device data sheet for exact incremental delay values for delay chains.

Related Links
Functional Description—UniPHY
For more information about calibration, refer to the Calibration Stages section in the Functional Description—UniPHY chapter of the External Memory Interface Handbook.

7.14.4 Writing a Predefined Data Pattern to SDRAM in the Preloader

You can include your own code to write a predefined data pattern to the SDRAM in the preloader for debugging purposes.
1. Include your code in the file <project_folder>\software\spl_bsp\uboot-socfpga\arch\arm\cpu\armv7\socfpga\spl.c. Adding the following code to the spl.c file causes the controller to write walking 1s and walking 0s, repeated five times, to the SDRAM.
Related Links
Functional Description—UniPHY
For more information about calibration, refer to the Calibration Stages section in the Functional Description—UniPHY chapter of the External Memory Interface Handbook.

7.14.4 Writing a Predefined Data Pattern to SDRAM in the Preloader

You can include your own code to write a predefined data pattern to the SDRAM in the preloader for debugging purposes.
1. Include your code in the file: <project_folder>\software\spl_bsp\uboot-socfpga\arch\arm\cpu\armv7\socfpga\spl.c.

Adding the following code to the spl.c file causes the controller to write walking 1s and walking 0s, repeated five times, to the SDRAM.

/* added for demo, place after the last #define statement in spl.c */
#define ROTATE_RIGHT(X) ( (X>>1) | (X&1?0X80000000:0) )

/* added for demo, place after the calibration code */
test_data_walk0((long *)0x100000, PHYS_SDRAM_1_SIZE);

int test_data_walk0(long *base, long maxsize)
{
    volatile long *addr;
    long cnt;
    ulong data_temp[3];
    ulong expected_data[3];
    ulong read_data;
    int i = 0;              /* counter to loop over the different data patterns */
    int num_address = 50;

    data_temp[0] = 0xFFFFFFFE;     /* initial data for walking 0 pattern */
    data_temp[1] = 0x00000001;     /* initial data for walking 1 pattern */
    data_temp[2] = 0xAAAAAAAA;     /* initial data for A->5 switching */
    expected_data[0] = 0xFFFFFFFE; /* initial data for walking 0 pattern */
    expected_data[1] = 0x00000001; /* initial data for walking 1 pattern */
    expected_data[2] = 0xAAAAAAAA; /* initial data for A->5 switching */

    for (i = 0; i < 3; i++) {
        printf("\nSTARTED %08X DATA PATTERN !!!!\n", data_temp[i]);
        /* write */
        for (cnt = i*num_address; cnt < (i+1)*num_address; cnt++) {
            addr = base + cnt; /* pointer arith! */
            sync();
            *addr = data_temp[i];
            data_temp[i] = ROTATE_RIGHT(data_temp[i]);
        }
        /* read back and verify */
        for (cnt = i*num_address; cnt < (i+1)*num_address; cnt++) {
            addr = base + cnt; /* pointer arith! */
            sync();
            read_data = *addr;
            printf("Address:%X Expected: %08X Read:%08X \n",
                   (unsigned int)addr, expected_data[i], read_data);
            if (expected_data[i] != read_data) {
                puts("!!!!!!FAILED!!!!!!\n\n");
                hang();
            }
            expected_data[i] = ROTATE_RIGHT(expected_data[i]);
        }
    }
    return 0;
}
====//End Of Code//=====

Figure 155. Memory Contents After Executing Example Code

7.15 SDRAM Controller Address Map and Register Definitions

This section lists the SDRAM register address map and describes the registers.
7.15.1 SDRAM Controller Address Map

Address map for the SDRAM interface registers. Base address: 0xFFC20000.

SDRAM Controller Module

Register                    Offset  Width  Access  Reset  Description
ctrlcfg (page 315)          0x5000  32     RW      0x0    Controller Configuration Register
dramtiming1 (page 317)      0x5004  32     RW      0x0    DRAM Timings 1 Register
dramtiming2 (page 317)      0x5008  32     RW      0x0    DRAM Timings 2 Register
dramtiming3 (page 318)      0x500C  32     RW      0x0    DRAM Timings 3 Register
dramtiming4 (page 319)      0x5010  32     RW      0x0    DRAM Timings 4 Register
lowpwrtiming (page 320)     0x5014  32     RW      0x0    Low Power Timing Register
dramodt (page 320)          0x5018  32     RW      0x0    ODT Control Register
dramaddrw (page 321)        0x502C  32     RW      0x0    DRAM Address Widths Register
dramifwidth (page 322)      0x5030  32     RW      0x0    DRAM Interface Data Width Register
dramsts (page 323)          0x5038  32     RW      0x0    DRAM Status Register
dramintr (page 323)         0x503C  32     RW      0x0    ECC Interrupt Register
sbecount (page 324)         0x5040  32     RW      0x0    ECC Single Bit Error Count Register
dbecount (page 325)         0x5044  32     RW      0x0    ECC Double Bit Error Count Register
erraddr (page 325)          0x5048  32     RW      0x0    ECC Error Address Register
dropcount (page 326)        0x504C  32     RW      0x0    ECC Auto-correction Dropped Count Register
dropaddr (page 326)         0x5050  32     RW      0x0    ECC Auto-correction Dropped Address Register
lowpwreq (page 327)         0x5054  32     RW      0x0    Low Power Control Register
lowpwrack (page 328)        0x5058  32     RW      0x0    Low Power Acknowledge Register
staticcfg (page 329)        0x505C  32     RW      0x0    Static Configuration Register
ctrlwidth (page 329)        0x5060  32     RW      0x0    Memory Controller Width Register
portcfg (page 330)          0x507C  32     RW      0x0    Port Configuration Register
fpgaportrst (page 332)      0x5080  32     RW      0x0    FPGA Ports Reset Control Register
protportdefault (page 333)  0x508C  32     RW      0x0    Memory Protection Port Default Register
protruleaddr (page 334)     0x5090  32     RW      0x0    Memory Protection Address Register
protruleid (page 334)       0x5094  32     RW      0x0    Memory Protection ID Register
protruledata (page 335)     0x5098  32     RW      0x0    Memory Protection Rule Data Register
protrulerdwr (page 336)     0x509C  32     RW      0x0    Memory Protection Rule Read-Write Register
mppriority (page 337)       0x50AC  32     RW      0x0    Scheduler Priority Register
remappriority (page 338)    0x50E0  32     RW      0x0    Controller Command Pool Priority Remap Register

Port Sum of Weight Register

Register                    Offset  Width  Access  Reset  Description
mpweight_0_4 (page 339)     0x50B0  32     RW      0x0    Port Sum of Weight Register [1/4]
mpweight_1_4 (page 339)     0x50B4  32     RW      0x0    Port Sum of Weight Register [2/4]
mpweight_2_4 (page 340)     0x50B8  32     RW      0x0    Port Sum of Weight Register [3/4]
mpweight_3_4 (page 340)     0x50BC  32     RW      0x0    Port Sum of Weight Register [4/4]
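For software that accesses these registers, the byte address of each register is the controller base address plus the offset in the table above. The following is a minimal sketch of memory-mapped access; the sdr_read32 and sdr_write32 helper names are illustrative, not part of the preloader sources, and later sketches in this section build on them.

#include <stdint.h>

#define SDR_BASE    0xFFC20000u  /* SDRAM controller module base address */
#define SDR_CTRLCFG 0x5000u      /* offsets from the address map above */
#define SDR_DRAMSTS 0x5038u

/* Illustrative helpers: read or write one 32-bit controller register. */
static inline uint32_t sdr_read32(uint32_t offset)
{
    return *(volatile uint32_t *)(SDR_BASE + offset);
}

static inline void sdr_write32(uint32_t offset, uint32_t value)
{
    *(volatile uint32_t *)(SDR_BASE + offset) = value;
}

For example, sdr_read32(SDR_DRAMSTS) reads address 0xFFC25038, matching the register address column in the table.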
7.15.1.1 SDRAM Controller Module Register Descriptions

Address map for the SDRAM controller and multi-port front-end. All registers in this group reset to zero. Offset: 0x5000.

Important: To prevent indeterminate system behavior, reserved areas of memory must not be accessed by software or hardware. Any area of the memory map that is not explicitly defined as a register space or accessible memory is considered reserved. This caution applies to every register described in this section.

ctrlcfg (page 315): The Controller Configuration Register determines the behavior of the controller.
dramtiming1 (page 317): This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.
dramtiming2 (page 317): This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.
dramtiming3 (page 318): This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.
dramtiming4 (page 319): This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.
lowpwrtiming (page 320): This register controls the behavior of the low power logic in the controller.
dramodt (page 320): This register controls which ODT pin asserts with chip select 0 (CS0) assertion and which ODT pin asserts with chip select 1 (CS1) assertion.
dramaddrw (page 321): This register configures the width of the various address fields of the DRAM. The values specified in this register must match the memory devices being used.
dramifwidth (page 322): This register controls the interface width of the SDRAM controller.
dramsts (page 323): This register provides the status of the calibration and ECC logic.
dramintr (page 323): This register can enable, disable, and clear the SDRAM error interrupts.
sbecount (page 324): This register tracks the single-bit error count.
dbecount (page 325): This register tracks the double-bit error count.
erraddr (page 325): This register holds the address of the most recent ECC error.
dropcount (page 326): This register holds the count of dropped ECC auto-correction write-backs.
dropaddr (page 326): This register holds the last dropped address.
lowpwreq (page 327): This register instructs the controller to put the DRAM into a power down state. Note that some commands are only valid for certain memory types.
lowpwrack (page 328): This register gives the status of the power down commands requested by the Low Power Control register.
staticcfg (page 329): This register controls configuration values which cannot be updated during active transfers. First configure the membl and eccen fields, and then re-write these fields while setting the applycfg bit. The applycfg bit is write only.
ctrlwidth (page 329): This register controls the width of the physical DRAM interface.
portcfg (page 330): Each bit of the autopchen field maps to one of the control ports. If a port executes mostly sequential memory accesses, the corresponding autopchen bit should be 0. If the port has highly random accesses, its autopchen bit should be set to 1.
fpgaportrst (page 332): This register implements functionality to allow the CPU to control when the MPFE enables the ports to the FPGA fabric.
protportdefault (page 333): This register controls the default protection assignment for a port. Ports that have explicit rules defining regions that are illegal to access should set their bits to pass by default. Ports that have explicit rules defining legal areas should set their bits to force all transactions to fail by default. Leave this register all zeros for systems that do not require any protection from the memory controller.
protruleaddr (page 334): This register is used to control the memory protection for port 0 transactions. Address ranges can be used either to allow or to disallow access to memory regions. If TrustZone is being used, access can be enabled for protected transactions or disabled for unprotected transactions. The default state of this register is to allow all access. Address values used for protection are physical addresses only.
protruleid (page 334): This register configures the AxID for a given protection rule.
protruledata (page 335): This register configures the protection memory characteristics of each protection rule.
protrulerdwr (page 336): This register is used to perform read and write operations to the internal protection table.
mppriority (page 337): This register is used to configure the DRAM burst operation scheduling.
remappriority (page 338): This register applies another level of port priority after a transaction is placed in the single port queue.
Port Sum of Weight Register Register Descriptions (page 339): These registers are used to configure the DRAM burst operation scheduling.

7.15.1.1.1 ctrlcfg

The Controller Configuration Register determines the behavior of the controller.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25000. Offset: 0x5000. Access: RW.

ctrlcfg fields (all fields RW, reset 0x0):

Bit 25, bursttermen: Set to one to enable the controller to issue burst terminate commands. This must only be set when the DRAM memory type is LPDDR2.
Bit 24, burstintren: Set to one to enable the controller to issue burst interrupt commands. This must only be set when the DRAM memory type is LPDDR2.
Bit 23, nodmpins: Set to one to enable DRAM operation if no DM pins are connected.
Bit 22, dqstrken: Enables DQS tracking in the PHY.
Bits 21:16, starvelimit: Specifies the number of DRAM burst transactions an individual transaction allows to reorder ahead of it before its priority is raised in the memory controller.
Bit 15, reorderen: Controls whether the controller can re-order operations to optimize SDRAM bandwidth. It should generally be set to one.
Bit 14, gendbe: Enables the deliberate insertion of double bit errors in data written to memory. This should only be used for testing purposes.
Bit 13, gensbe: Enables the deliberate insertion of single bit errors in data written to memory. This should only be used for testing purposes.
Bit 12, cfg_enable_ecc_code_overwrites: Set to one to enable ECC overwrites. ECC overwrites occur when a correctable ECC error is seen, and cause a new read/modify/write to be scheduled for that location to clear the ECC error.
Bit 11, ecccorren: Enables auto-correction of the read data returned when a single bit error is detected.
Bit 10, eccen: Enables the generation and checking of ECC. This bit must only be set if the memory connected to the SDRAM interface is 24 or 40 bits wide. If you set this bit, you must clear the useeccasdata field in the staticcfg register.
Bits 9:8, addrorder: Selects the order for address interleaving. Programming this field with different values gives different mappings between the AXI or Avalon-MM address and the SDRAM address. Program this field with the following binary values to select the ordering:
  0x0: chip, row, bank, column (bank interleaved with no rank (chip select) interleaving)
  0x1: chip, bank, row, column (no interleaving)
  0x2: row, chip, bank, column (bank interleaved with rank (chip select) interleaving)
  0x3: reserved
Intel recommends programming addrorder to 0x0 or 0x2.
Bits 7:3, membl: Configures the burst length as a static decimal value. Legal values are the JEDEC-allowed DRAM values for the DRAM selected in cfg_type. For DDR3, this field should be programmed with 8 (binary "01000"); for DDR2 it can be either 4 or 8, depending on the exact DRAM chip. LPDDR2 can be programmed with 4, 8, or 16, and LPDDR can be programmed with 2, 4, or 8. You must also program the membl field in the staticcfg register.
Bits 2:0, memtype: Selects the memory type. This field can be programmed with the following binary values:
  0x0: reserved
  0x1: memory type is DDR2 SDRAM
  0x2: memory type is DDR3 SDRAM
  0x3: reserved
  0x4: memory type is LPDDR2 SDRAM
  0x5 to 0x7: reserved
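As an illustration of how these fields pack into the 32-bit register, the following sketch composes a ctrlcfg value for a hypothetical DDR3 configuration (memtype = 0x2, burst length 8, chip/row/bank/column ordering, reordering enabled). The shift counts come from the bit positions above; the configuration values are assumptions, and the sdr_write32 helper is the illustrative one shown earlier.

/* Sketch: ctrlcfg for a hypothetical DDR3 interface, burst length 8,
 * addrorder 0x0 (chip, row, bank, column), reordering enabled.
 */
uint32_t ctrlcfg = (0x2u << 0)  /* memtype: DDR3 SDRAM */
                 | (8u << 3)    /* membl: burst length 8 */
                 | (0x0u << 8)  /* addrorder: chip, row, bank, column */
                 | (1u << 15);  /* reorderen: allow reordering */
sdr_write32(SDR_CTRLCFG, ctrlcfg);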
7.15.1.1.2 dramtiming1

This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25004. Offset: 0x5004. Access: RW.

dramtiming1 fields (all fields RW, reset 0x0):

Bits 31:24, trfc: The refresh cycle timing parameter.
Bits 23:18, tfaw: The four-activate window timing parameter.
Bits 17:14, trrd: The activate to activate, different banks timing parameter.
Bits 13:9, tcl: Memory read latency.
Bits 8:4, tal: Memory additive latency.
Bits 3:0, tcwl: Memory write latency.

7.15.1.1.3 dramtiming2

This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25008. Offset: 0x5008. Access: RW.

dramtiming2 fields (all fields RW, reset 0x0):

Bits 28:25, twtr: The write to read timing parameter.
Bits 24:21, twr: The write recovery timing parameter.
Bits 20:17, trp: The precharge to activate timing parameter.
Bits 16:13, trcd: The activate to read/write timing parameter.
Bits 12:0, trefi: The refresh interval timing parameter.
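A sketch of packing dramtiming1 from the bit positions above; the timing values here are placeholders, to be replaced with the vendor-specified values in clock cycles for your device.

/* Sketch: pack dramtiming1. All values are placeholders, in clock cycles. */
uint32_t dramtiming1 = (34u << 24) /* trfc, bits 31:24 */
                     | (5u << 18)  /* tfaw, bits 23:18 */
                     | (2u << 14)  /* trrd, bits 17:14 */
                     | (7u << 9)   /* tcl,  bits 13:9  */
                     | (0u << 4)   /* tal,  bits 8:4   */
                     | (6u << 0);  /* tcwl, bits 3:0   */
sdr_write32(0x5004u, dramtiming1); /* dramtiming1 offset */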
7.15.1.1.4 dramtiming3

This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2500C. Offset: 0x500C. Access: RW.

dramtiming3 fields (all fields RW, reset 0x0):

Bits 22:19, tccd: The CAS to CAS delay time.
Bits 18:15, tmrd: The mode register timing parameter.
Bits 14:9, trc: The activate to activate timing parameter.
Bits 8:4, tras: The activate to precharge timing parameter.
Bits 3:0, trtp: The read to precharge timing parameter.

7.15.1.1.5 dramtiming4

This register implements JEDEC standardized timing parameters. It should be programmed in clock cycles, with the values specified by the memory vendor.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25010. Offset: 0x5010. Access: RW.

dramtiming4 fields (all fields RW, reset 0x0):

Bits 23:20, minpwrsavecycles: The minimum number of cycles to stay in a low power state. This applies to both power down and self refresh, and should be set to the greater of tPD and tCKESR.
Bits 19:10, pwrdownexit: The power down exit cycles, tXPDLL.
Bits 9:0, selfrfshexit: The self refresh exit cycles, tXS.

7.15.1.1.6 lowpwrtiming

This register controls the behavior of the low power logic in the controller.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25014. Offset: 0x5014. Access: RW.

lowpwrtiming fields (all fields RW, reset 0x0):

Bits 19:16, clkdisablecycles: The number of clocks after the execution of a self-refresh command before the clock is stopped. This value is generally set based on PHY design latency and should generally not be changed.
Bits 15:0, autopdcycles: The number of idle clock cycles after which the controller should place the memory into power-down mode.

7.15.1.1.7 dramodt

This register controls which ODT pin asserts with chip select 0 (CS0) assertion and which ODT pin asserts with chip select 1 (CS1) assertion.
Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25018. Offset: 0x5018. Access: RW.

dramodt fields (all fields RW, reset 0x0):

Bits 7:4, cfg_read_odt_chip: Controls which ODT pin is asserted during reads. Bits [5:4] select the ODT pin that asserts with CS0, and bits [7:6] select the ODT pin that asserts with CS1. For example, a value of 0x9 asserts ODT[0] for accesses with CS0 and ODT[1] for accesses with CS1. This field can be set to 0x1 if there is only one chip select available.
Bits 3:0, cfg_write_odt_chip: Controls which ODT pin is asserted during writes. Bits [1:0] select the ODT pin that asserts with CS0, and bits [3:2] select the ODT pin that asserts with CS1. For example, a value of 0x9 asserts ODT[0] for accesses with CS0 and ODT[1] for accesses with CS1. This field can be set to 0x1 if there is only one chip select available.

7.15.1.1.8 dramaddrw

This register configures the width of the various address fields of the DRAM. The values specified in this register must match the memory devices being used.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2502C. Offset: 0x502C. Access: RW.

dramaddrw fields (all fields RW, reset 0x0):

Bits 15:13, csbits: The number of chip select address bits. Set this field to 0x0 for a single chip select and to 0x1 for two chip selects. When this field is set to 0x1, you may use rank interleaved mode by programming the ctrlcfg.addrorder field to 0x2. If you are using a single rank memory interface (csbits = 0x0), you may not enable rank interleaved mode (ctrlcfg.addrorder must be set to less than 0x2). When this field is set to 0x1 to enable dual ranks, the chip select (cs) bit of the incoming address determines which chip select is active: when the chip select bit of the incoming address is 0, chip select 0 is active; when it is 1, chip select 1 is active.
Bits 12:10, bankbits: The number of bank address bits for the memory devices in your memory interface.
Bits 9:5, rowbits: The number of row address bits for the memory devices in your memory interface.
Bits 4:0, colbits: The number of column address bits for the memory devices in your memory interface.
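A sketch of composing dramaddrw for a hypothetical device geometry (10 column bits, 15 row bits, 3 bank bits, single rank); the geometry is an assumption, and the field positions come from the table above.

/* Sketch: dramaddrw for a hypothetical single-rank device with
 * 10 column bits, 15 row bits, and 3 bank bits.
 */
uint32_t dramaddrw = (0u << 13)  /* csbits: single chip select */
                   | (3u << 10)  /* bankbits */
                   | (15u << 5)  /* rowbits */
                   | (10u << 0); /* colbits */
sdr_write32(0x502Cu, dramaddrw); /* dramaddrw offset */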
7.15.1.1.9 dramifwidth

This register controls the interface width of the SDRAM controller.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25030. Offset: 0x5030. Access: RW.

dramifwidth fields (all fields RW, reset 0x0):

Bits 7:0, ifwidth: Controls the width of the SDRAM interface, including any bits used for ECC. For example, for a 32-bit interface with ECC, program this register to 0x28 (decimal 40). The ctrlwidth register must also be programmed.

7.15.1.1.10 dramsts

This register provides the status of the calibration and ECC logic.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25038. Offset: 0x5038. Access: RW.

dramsts fields (all fields RW, reset 0x0):

Bit 4, corrdrop: Set to 1 when any auto-corrections have been dropped.
Bit 3, dbeerr: Set to 1 when any ECC double bit errors are detected.
Bit 2, sbeerr: Set to 1 when any ECC single bit errors are detected.
Bit 1, calfail: Set to 1 when the PHY is unable to calibrate.
Bit 0, calsuccess: Set to 1 if the PHY was able to calibrate successfully.

7.15.1.1.11 dramintr

This register can enable, disable, and clear the SDRAM error interrupts.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2503C. Offset: 0x503C. Access: RW.

dramintr fields (all fields RW, reset 0x0):

Bit 4, intrclr: Writing this self-clearing bit clears the interrupt signal. Writing this bit also clears the error count and error address registers: sbecount, dbecount, dropcount, erraddr, and dropaddr.
Bit 3, corrdropmask: Set to one to mask the interrupt for an ECC correction write-back needing to be dropped. This condition indicates a burst of memory errors in a short period of time.
Bit 2, dbemask: Masks the double bit error interrupt.
Bit 1, sbemask: Masks the single bit error interrupt.
Bit 0, intren: Enables the interrupt output.
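Putting dramsts, sbecount, and dramintr together, a sketch of a polling-style ECC status check follows. The handler body is illustrative only; the bit positions and offsets come from the descriptions above, and sdr_read32/sdr_write32 are the illustrative helpers shown earlier.

/* Sketch: report single-bit ECC errors, then clear the counters and the
 * interrupt via dramintr.intrclr.
 */
uint32_t sts = sdr_read32(SDR_DRAMSTS);
if (sts & (1u << 2)) {                          /* dramsts.sbeerr */
    uint32_t sbe = sdr_read32(0x5040u) & 0xFFu; /* sbecount.count */
    printf("ECC: %u single-bit error(s) since last clear\n", sbe);
    sdr_write32(0x503Cu, 1u << 4);              /* dramintr.intrclr */
}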
7.15.1.1.12 sbecount

This register tracks the single-bit error count.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25040. Offset: 0x5040. Access: RW.

sbecount fields (all fields RW, reset 0x0):

Bits 7:0, count: Reports the number of single bit errors that have occurred since the status register counters were last cleared.

7.15.1.1.13 dbecount

This register tracks the double-bit error count.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25044. Offset: 0x5044. Access: RW.

dbecount fields (all fields RW, reset 0x0):

Bits 7:0, count: Reports the number of double bit errors that have occurred since the status register counters were last cleared.

7.15.1.1.14 erraddr

This register holds the address of the most recent ECC error.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25048. Offset: 0x5048. Access: RW.

erraddr fields (all fields RW, reset 0x0):

Bits 31:0, addr: The address of the most recent ECC error.

Note: For a 32-bit interface, ECC is calculated across a span of 8 bytes (4 bytes * burst length 2), so the error address is a multiple of 8 bytes. To find the byte address of the word that contains the error, multiply the value in the erraddr register by 8.
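The note above reduces to a one-line conversion. A minimal sketch for a 32-bit interface, using the illustrative sdr_read32 helper:

/* Sketch: for a 32-bit interface, ECC covers 8-byte spans, so the byte
 * address of the failing span is the erraddr value scaled by 8.
 */
uint32_t err_word = sdr_read32(0x5048u); /* erraddr.addr */
uint32_t byte_addr = err_word * 8u;      /* byte address of the 8-byte span */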
7.15.1.1.15 dropcount

This register holds the count of dropped ECC auto-correction write-backs.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2504C. Offset: 0x504C. Access: RW.

dropcount fields (all fields RW, reset 0x0):

Bits 7:0, corrdropcount: The number of ECC write-back transactions dropped due to the internal FIFO overflowing.

7.15.1.1.16 dropaddr

This register holds the last dropped address.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25050. Offset: 0x5050. Access: RW.

dropaddr fields (all fields RW, reset 0x0):

Bits 31:0, corrdropaddr: The last address that was dropped.

7.15.1.1.17 lowpwreq

This register instructs the controller to put the DRAM into a power down state. Note that some commands are only valid for certain memory types.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25054. Offset: 0x5054. Access: RW.

lowpwreq fields (all fields RW, reset 0x0):

Bits 5:4, selfrfshmask: Write a one to each bit of this field to have a self refresh request apply to both chips.
Bit 3, selfrshreq: Write a one to this bit to request that the RAM be put into a self refresh state. This bit is treated as a static value, so the RAM remains in self refresh as long as this register bit is set to one. This power down mode can be selected for all DRAMs supported by the controller.
Bits 2:1, deeppwrdnmask: Write ones to this field to select which DRAM chip selects are powered down. Typical usage is to set both of these bits when deeppwrdnreq is set, but the controller does support putting a single chip into deep power down while keeping the other chip running.
Bit 0, deeppwrdnreq: Write a one to this bit to request a deep power down. This bit should only be written with LPDDR2 DRAMs; DDR3 DRAMs do not support deep power down.

7.15.1.1.18 lowpwrack

This register gives the status of the power down commands requested by the Low Power Control register.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25058. Offset: 0x5058. Access: RW.

lowpwrack fields (all fields RW, reset 0x0):

Bit 1, selfrfshack: This bit is one to indicate that the controller is in a self refresh state.
Bit 0, deeppwrdnack: This bit is set to one after a deep power down has been executed.
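A sketch of the request/acknowledge flow for self refresh, built from the lowpwreq and lowpwrack descriptions above; the choice to apply the request to both chip selects is an assumption for the example.

/* Sketch: request self refresh on both chip selects, then poll the ack. */
sdr_write32(0x5054u, (0x3u << 4)   /* lowpwreq.selfrfshmask: both chips */
                   | (1u << 3));   /* lowpwreq.selfrshreq */
while (!(sdr_read32(0x5058u) & (1u << 1)))
    ; /* poll lowpwrack.selfrfshack until the controller is in self refresh */
/* Clearing selfrshreq later brings the DRAM back out of self refresh. */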
7.15.1.1.19 staticcfg

This register controls configuration values which cannot be updated during active transfers. First configure the membl and eccen fields, and then re-write these fields while setting the applycfg bit. The applycfg bit is write only.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2505C. Offset: 0x505C. Access: RW.

staticcfg fields (all fields RW, reset 0x0):

Bit 3, applycfg: Write with this bit set to apply all the settings loaded in SDR registers to the memory interface. This bit is write-only and always returns 0 if read.
Bit 2, useeccasdata: Allows the FPGA ports to directly access the extra data bits that are normally used to hold the ECC code. The interface width must be set to 24 or 40 in the dramifwidth register. If you set this bit, you must clear the eccen field in the ctrlcfg register.
Bits 1:0, membl: Specifies the DRAM burst length, with the following encodings:
  0x0: burst length of 2 clocks
  0x1: burst length of 4 clocks
  0x2: burst length of 8 clocks
  0x3: burst length of 16 clocks
If you program this field, you must also set the membl field in the ctrlcfg register.
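The two-step applycfg sequence described above might look like the following sketch; the burst length of 8 is a placeholder value.

/* Sketch: program membl, then re-write the same value with applycfg set.
 * applycfg is write-only and reads back as zero.
 */
uint32_t cfg = 0x2u << 0;              /* membl: burst length of 8 clocks */
sdr_write32(0x505Cu, cfg);             /* first write: load the fields */
sdr_write32(0x505Cu, cfg | (1u << 3)); /* second write: set applycfg */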
7.15.1.1.20 ctrlwidth

This register controls the width of the physical DRAM interface.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25060. Offset: 0x5060. Access: RW.

ctrlwidth fields (all fields RW, reset 0x0):

Bits 1:0, ctrlwidth: Specifies the SDRAM controller interface width:
  0x0: 8-bit interface width
  0x1: 16-bit (no ECC) or 24-bit (ECC enabled) interface width
  0x2: 32-bit (no ECC) or 40-bit (ECC enabled) interface width
Additionally, you must program the dramifwidth register.

7.15.1.1.21 portcfg

Each bit of the autopchen field maps to one of the control ports. If a port executes mostly sequential memory accesses, the corresponding autopchen bit should be 0. If the port has highly random accesses, its autopchen bit should be set to 1.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2507C. Offset: 0x507C. Access: RW.

portcfg fields:

Bits 19:10, autopchen (RW, reset 0x0): Auto-Precharge Enable. One bit is assigned to each control port, with the following encodings:
  0x0: the controller attempts to keep a row open (suited to mostly sequential accesses)
  0x1: the controller requests an automatic precharge following a bus command completion, closing the row automatically (suited to random-dominated accesses)
All active ports with random-dominated operations should set their autopchen bit to 1. The bits in this field correspond to the control ports as follows:
  Bit 9: CPU write
  Bit 8: L3 write
  Bit 7: CPU read
  Bit 6: L3 read
  Bit 5: FPGA-to-SDRAM port 5
  Bit 4: FPGA-to-SDRAM port 4
  Bit 3: FPGA-to-SDRAM port 3
  Bit 2: FPGA-to-SDRAM port 2
  Bit 1: FPGA-to-SDRAM port 1
  Bit 0: FPGA-to-SDRAM port 0
Bits 5:0, portprotocol (RO, reset 0x0): Port Protocol. You can read this field to determine the protocol configuration of each of the FPGA-to-SDRAM ports. Bit 5 corresponds to FPGA-to-SDRAM port 5, down through bit 0 for FPGA-to-SDRAM port 0. When you read a port's bit after the FPGA has been configured, it has one of the following values:
  0x1: AXI (reset value for all ports)
  0x0: Avalon-MM
Note: The value in this field is only valid after the fabric has been configured.
Note: The AXI protocol requires both a read and a write port. Therefore, you must ensure that AXI ports are allocated in the pairs (port 0, port 1), (port 2, port 3), and (port 4, port 5). Aside from this requirement, AXI and Avalon-MM ports can be mixed freely.

7.15.1.1.22 fpgaportrst

This register implements functionality to allow the CPU to control when the MPFE enables the ports to the FPGA fabric.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25080. Offset: 0x5080. Access: RW.

fpgaportrst fields (all fields RW, reset 0x0):

Bits 13:0, portrstn: Write a 1 to a bit of this register to allow the selected FPGA port to exit reset. Writing a bit to zero stretches the port reset until the register is written again. Read data ports are connected to bits 3:0, with read data port 0 at bit 0 through read data port 3 at bit 3. Write data ports 0 to 3 are mapped to bits 4 to 7, with write data port 0 at bit 4 through write data port 3 at bit 7. Command ports are connected to bits 8 to 13, with command port 0 at bit 8 through command port 5 at bit 13. Expected usage is to set all the bits at the same time, but setting some bits to zero and others to one is supported.
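A sketch of the common case described above, releasing every FPGA-to-SDRAM port from reset at once after calibration:

/* Sketch: release all read data (bits 3:0), write data (bits 7:4), and
 * command (bits 13:8) ports from reset by writing ones to portrstn.
 */
sdr_write32(0x5080u, 0x3FFFu); /* all 14 portrstn bits set */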
7.15.1.1.23 protportdefault

This register controls the default protection assignment for a port. Ports that have explicit rules defining regions that are illegal to access should set their bits to pass by default. Ports that have explicit rules defining legal areas should set their bits to force all transactions to fail. Leaving this register all zeros is appropriate for systems that do not require any protection from the memory controller.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2508C. Offset: 0x508C. Access: RW.

protportdefault fields (all fields RW, reset 0x0):

Bits 9:0, portdefault: Determines the default action for specified transactions. When a bit is zero, the specified access is allowed by default. When a bit is one, the specified access is denied by default. The bits in this field correspond to the control ports as follows:
  Bit 9: CPU write
  Bit 8: L3 write
  Bit 7: CPU read
  Bit 6: L3 read
  Bit 5: access to FPGA-to-SDRAM port 5
  Bit 4: access to FPGA-to-SDRAM port 4
  Bit 3: access to FPGA-to-SDRAM port 3
  Bit 2: access to FPGA-to-SDRAM port 2
  Bit 1: access to FPGA-to-SDRAM port 1
  Bit 0: access to FPGA-to-SDRAM port 0

7.15.1.1.24 protruleaddr

This register is used to control the memory protection for port 0 transactions. Address ranges can be used either to allow access to memory regions or to disallow access to memory regions. If TrustZone is being used, access can be enabled for protected transactions or disabled for unprotected transactions. The default state of this register is to allow all access. Address values used for protection are physical addresses only.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25090. Offset: 0x5090. Access: RW.

protruleaddr fields (all fields RW, reset 0x0):

Bits 23:12, highaddr: Upper 12 bits of the address for a check. This address is compared to be greater than or equal to the address of a transaction. Note that because AXI transactions cannot cross a 4 KB boundary, the transaction start and end addresses must also fall within the same 1 MB block pointed to by this address pointer.
Bits 11:0, lowaddr: Lower 12 bits of the address for a check. This address is compared to be less than or equal to the address of a transaction. Note that because AXI transactions cannot cross a 4 KB boundary, the transaction start and end addresses must also fall within the same 1 MB block pointed to by this address pointer.

7.15.1.1.25 protruleid

This register configures the AxID for a given protection rule.
Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25094. Offset: 0x5094. Access: RW.

protruleid fields (all fields RW, reset 0x0):

Bits 23:12, highid: Upper AxID bound for the protection rule. The incoming AxID must be less than or equal to this value. To match all AxIDs from a port, program highid with all ones.
Bits 11:0, lowid: Lower AxID bound for the protection rule. The incoming AxID must be greater than or equal to this value. To match all AxIDs from a port, program lowid with all zeros.

7.15.1.1.26 protruledata

This register configures the protection memory characteristics of each protection rule.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC25098. Offset: 0x5098. Access: RW.

protruledata fields (all fields RW, reset 0x0):

Bit 13, ruleresult: Set this bit to one to force a protection failure; set it to zero to allow the access to succeed.
Bits 12:3, portmask: The bits in this field determine which ports the rule applies to. If a port's bit is set, the rule applies to that port; if the bit is clear, the rule does not apply. The field's bits, numbered 9 down to 0 within the field, correspond to the control ports as in the portdefault register: bit 9 CPU write, bit 8 L3 write, bit 7 CPU read, bit 6 L3 read, and bits 5 through 0 FPGA-to-SDRAM ports 5 through 0.
Bit 2, validrule: Set this bit to one to make the rule valid; set it to zero to invalidate the rule.
Bits 1:0, security: Valid security field encodings are:
  0x0: rule applies to secure transactions
  0x1: rule applies to non-secure transactions
  0x2 or 0x3: rule applies to both secure and non-secure transactions

7.15.1.1.27 protrulerdwr

This register is used to perform read and write operations to the internal protection table.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC2509C. Offset: 0x509C. Access: RW.

protrulerdwr fields (all fields RW, reset 0x0):

Bit 6, readrule: Write to this bit to have the memory_prot_data register loaded with the value from the internal protection table at the offset given by ruleoffset. The table value is loaded before a ready is returned, so read data from the register is correct for any follow-on reads to the memory_prot_data register.
Bit 5, writerule: Write to this bit to have the contents of the memory_prot_data register written to the table at the offset specified by ruleoffset. The bit automatically clears after a single cycle, at which point the write operation is complete.
Bits 4:0, ruleoffset: Defines which of the 20 rules in the protection table you want to read or write.
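Putting the four protection registers together, the following sketch writes one rule into the internal protection table. The address range, rule number, and policy are hypothetical, and the field positions are taken from the descriptions above; this is an illustration of the flow, not the documented recommended sequence.

/* Sketch: program protection rule 0 to deny non-secure accesses from all
 * ten ports to the 1 MB region 0x00100000 through 0x001FFFFF. Addresses
 * compare in 1 MB units via the 12-bit high/low block numbers.
 */
sdr_write32(0x5090u, (0x001u << 12) | 0x001u); /* protruleaddr: high/low block */
sdr_write32(0x5094u, (0xFFFu << 12) | 0x000u); /* protruleid: match all AxIDs */
sdr_write32(0x5098u, (1u << 13)      /* ruleresult: force failure */
                   | (0x3FFu << 3)   /* portmask: all ten ports */
                   | (1u << 2)       /* validrule: rule is valid */
                   | (0x1u << 0));   /* security: non-secure transactions */
sdr_write32(0x509Cu, (1u << 5) | 0u); /* protrulerdwr: writerule, ruleoffset 0 */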
7.15.1.1.28 mppriority

This register is used to configure the DRAM burst operation scheduling.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250AC. Offset: 0x50AC. Access: RW.

mppriority fields (all fields RW, reset 0x0):

Bits 29:0, userpriority: User Priority. This field sets the absolute user priority of each port, represented as a 3-bit value per port. 0x0 is the lowest priority and 0x7 is the highest. Port 0 is configured by programming userpriority[2:0], port 1 by userpriority[5:3], port 2 by userpriority[8:6], and so on.

7.15.1.1.29 remappriority

This register applies another level of port priority after a transaction is placed in the single port queue.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250E0. Offset: 0x50E0. Access: RW.

remappriority fields (all fields RW, reset 0x0):

Bits 7:0, priorityremap: Each bit of this field represents a priority level. If bit N in the priorityremap field is set, any port transaction with an absolute user priority of N jumps to the front of the single port queue and is serviced ahead of any transactions already in the queue. For example, if bit 5 is set in the priorityremap field, any port transaction with a userpriority value of 0x5 in the mppriority register is serviced ahead of any other transaction already in the single port queue.

7.15.1.1.30 Port Sum of Weight Register Register Descriptions

These registers are used to configure the DRAM burst operation scheduling. Offset: 0xB0.

mpweight_0_4

This register is used to configure the DRAM burst operation scheduling.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250B0. Offset: 0x50B0. Access: RW.

mpweight_0_4 fields (all fields RW, reset 0x0):

Bits 31:0, staticweight_31_0: Sets the static weight of each port. Each port is programmed with a 5-bit value: port 0 occupies bits 4:0, port 1 bits 9:5, and so on up to port 9 at bits 49:45. The upper weight bits continue into mpweight_1_4.

mpweight_1_4

This register is used to configure the DRAM burst operation scheduling.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250B4. Offset: 0x50B4. Access: RW.

mpweight_1_4 fields (all fields RW, reset 0x0):

Bits 31:18, sumofweights_13_0: Sets the sum of static weights for a particular user priority. This value is used as part of the deficit round robin implementation and should be set to the sum of the weights for the ports.
Bits 17:0, staticweight_49_32: The upper bits (49:32) of the per-port static weights begun in mpweight_0_4.

mpweight_2_4

This register is used to configure the DRAM burst operation scheduling.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250B8. Offset: 0x50B8. Access: RW.

mpweight_2_4 fields (all fields RW, reset 0x0):

Bits 31:0, sumofweights_45_14: Bits 45:14 of the sum of static weights for a particular user priority, used as part of the deficit round robin implementation.

mpweight_3_4

This register is used to configure the DRAM burst operation scheduling.

Module instance: sdr. Base address: 0xFFC20000. Register address: 0xFFC250BC. Offset: 0x50BC. Access: RW.

mpweight_3_4 fields (all fields RW, reset 0x0):

Bits 17:0, sumofweights_63_46: Bits 63:46 of the sum of static weights for a particular user priority, used as part of the deficit round robin implementation.
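A sketch of the per-port encodings described above; the port numbers and the priority and weight values are hypothetical.

/* Sketch: give port 2 the highest priority (userpriority is 3 bits per
 * port, so port 2 occupies bits 8:6), and give port 0 a static weight of 8
 * (staticweight is 5 bits per port, so port 0 occupies bits 4:0).
 */
sdr_write32(0x50ACu, 0x7u << (2 * 3)); /* mppriority: port 2 -> priority 0x7 */
sdr_write32(0x50B0u, 8u << (0 * 5));   /* mpweight_0_4: port 0 -> weight 8 */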
7.16 Document Revision History

May 2017 (version 2017.05.08):
• Added note to first topic.
• Rebranded as Intel.

October 2016 (version 2016.10.28): Maintenance release.

May 2016 (version 2016.05.03): Maintenance release.

November 2015 (version 2015.11.02):
• Added information regarding calculation of the ECC error byte address location from the erraddr register in the "User Notification of ECC Errors" section.
• Added information regarding the bus response to a memory protection transaction failure in the "Memory Protection" section.
• Clarified the "Protection" row in the "Fields for Rules in Memory Protection" table in the "Memory Protection" section.
• Clarified the protruledata.security column in the "Rules in Memory Protection Table for Example Configuration" table in the "Example of Configuration for TrustZone" section.
• Added a note about double-bit error functionality in the "ECC Write Backs" subsection of the "ECC" section.
• Added the "DDR Calibration" subsection under the "DDR PHY" section.

May 2015 (version 2015.05.04):
• Added the recommended sequence for writing or reading a rule in the "Memory Protection" section.

December 2014 (version 2014.12.15):
• Added the SDRAM Protection Access Flow Diagram to the "Memory Protection" subsection in the "Single-Port Controller Operation" section.
• Changed the "SDRAM Multi-Port Scheduling" section to "SDRAM Multi-Port Arbitration" and added detailed information on how to use and program the priority and weighted arbitration scheme.

June 2014 (version 2014.6.30):
• Added the Port Mappings section.
• Added the SDRAM Controller Memory Options section.
• Enhanced the Example of Configuration for TrustZone section.
• Added the SDRAM Controller address map and registers.

December 2013 (version 2013.12.30):
• Added the Generating a Preloader Image for HPS with EMIF section.
• Added the Debugging HPS SDRAM in the Preloader section.
• Enhanced the Simulation section.

November 2012 (version 1.1): Added the address map and register definitions section.

January 2012 (version 1.0): Initial release.

8 Functional Description—HPC II Controller

The High Performance Controller II (HPC II) works with the UniPHY-based DDR2, DDR3, and LPDDR2 interfaces. The controller provides high memory bandwidth, high clock rate performance, and run-time programmability. The controller can reorder data to reduce row conflicts and bus turn-around time by grouping reads and writes together, allowing for efficient traffic patterns and reduced latency.

Note: The controller described here is the High Performance Controller II (HPC II) with advanced features for designs generated in the Quartus II software version 11.0 and later, and the Quartus Prime software. Designs created in earlier versions and regenerated in version 11.0 and later do not inherit the new advanced features. For information on HPC II without the version 11.0 and later advanced features, refer to the External Memory Interface Handbook for Quartus II version 10.1, available in the External Memory Interfaces literature section on www.altera.com.

Related Links
External Memory Interface Handbook, v10.1

8.1 HPC II Memory Interface Architecture

The memory interface consists of the memory controller logic block, the physical logic layer (PHY), and their associated interfaces. The following figure shows a high-level block diagram of the overall external memory interface architecture.
High-Level Diagram of Memory Interface Architecture

(Figure: blocks shown are the data master, CSR master, Avalon-MM or AXI converter, Avalon-ST interface, CSR interface, memory controller, AFI interface, PHY, and external memory, all within the memory interface IP.)

8.2 HPC II Memory Controller Architecture

The memory controller logic block uses an Avalon Streaming (Avalon-ST) interface as its native interface, and communicates with the PHY layer through the Altera PHY Interface (AFI). The following figure shows a block diagram of the memory controller architecture.

Figure 157. Memory Controller Architecture Block Diagram

(Figure: blocks shown are the Avalon-ST input interface, command generator, timing bank pool, arbiter, rank timer, ECC, write data buffer, read data buffer, CSR interface, and the AFI interface to the PHY.)

Avalon-ST Input Interface

The Avalon-ST interface serves as the entry point to the memory controller, and provides communication with the requesting data masters. For information about the Avalon interface, refer to Avalon Interface Specifications.

AXI to Avalon-ST Converter

The HPC II memory controller includes an AXI to Avalon-ST converter for communication with the AXI protocol. The AXI to Avalon-ST converter provides write address, write data, write response, read address, and read data channels on the AXI interface side, and command, write data, and read data channels on the Avalon-ST interface side.

Handshaking

The AXI protocol employs a handshaking process similar to the Avalon-ST protocol, based on ready and valid signals.

Command Channel Implementation

The AXI interface includes separate read and write channels, while the Avalon-ST interface has only one command channel. Arbitration of the read and write channels is based on these policies:

• Round robin
• Write priority—write channel has priority over read channel
• Read priority—read channel has priority over write channel

You can choose an arbitration policy by setting the COMMAND_ARB_TYPE parameter to one of ROUND_ROBIN, WRITE_PRIORITY, or READ_PRIORITY in the alt_mem_ddrx_axi_st_converter.v file.

Data Ordering

The AXI specification requires that write data IDs arrive in the same order as their write address IDs are received. Similarly, read data must be returned in the same order as its associated read address is received. Consequently, the AXI to Avalon-ST converter does not support interleaving of write data; all data must arrive in the same order as its associated write address IDs. On the read side, the controller returns read data based on the read addresses received.

Burst Types

The AXI to Avalon-ST converter supports the following burst types:

• Incrementing burst—the address for each transfer is an increment of the previous transfer address; the increment value depends on the size of the transfer.
• Wrapping burst—similar to the incrementing burst, but the address wraps to the lower address when the burst boundary is reached. The starting address must be aligned to the size of the transfer. Burst length must be 2, 4, 8, or 16. The burst wrap boundary = burst size * burst length.
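As a worked illustration of the wrap boundary rule (this sketch is not from the handbook), the following C function computes the byte address of each beat of a WRAP burst. For a 4-beat burst of 4-byte transfers starting at 0x1C, the wrap boundary is 16 bytes, so the beats fall at 0x1C, 0x10, 0x14, 0x18.

    #include <stdint.h>
    #include <stdio.h>

    /* Address of beat n (0-based) of a WRAP burst.
     * bytes_per_beat must be a power of two; burst_len must be 2, 4, 8, or 16.
     * The wrap boundary is burst_len * bytes_per_beat, and the start address
     * must be aligned to bytes_per_beat. */
    static uint32_t wrap_beat_addr(uint32_t start, uint32_t bytes_per_beat,
                                   uint32_t burst_len, uint32_t n)
    {
        uint32_t boundary = burst_len * bytes_per_beat;   /* wrap container   */
        uint32_t base     = start & ~(boundary - 1u);     /* aligned low edge */
        return base + ((start - base + n * bytes_per_beat) % boundary);
    }

    int main(void)
    {
        for (uint32_t n = 0; n < 4; n++)   /* 4-beat burst, 4 bytes per beat */
            printf("beat %u -> 0x%02X\n", (unsigned)n,
                   (unsigned)wrap_beat_addr(0x1C, 4, 4, n));
        return 0;
    }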
Related Links: Avalon Interface Specifications

8.2.1 Backpressure Support

The write response and read data channels do not support data transfer with backpressure; consequently, you must tie the ready signal for the write response and read data channels to 1 to ensure acceptance of data at any time.

Figure 158. Data Transfer Without Backpressure

(Figure: waveform of clk, rready, rvalid, rid, rresp, and rdata, with rready held high while read data beats D0 through D3 are returned.)

For information about data transfer with and without backpressure, refer to the Avalon Interface Specifications.

Related Links: Avalon Interface Specifications

8.2.2 Command Generator

The command generator accepts commands from the front-end Avalon-ST interface and from local ECC internal logic, and provides those commands to the timing bank pool.

8.2.3 Timing Bank Pool

The timing bank pool is a parallel queue that works with the arbiter to enable data reordering. The timing bank pool tracks incoming requests, ensures that all timing requirements are met and, upon receiving write-data-ready notification from the write data buffer, passes the requests to the arbiter in an ordered and efficient manner.

8.2.4 Arbiter

The arbiter determines the order in which requests are passed to the memory device. When the arbiter receives a single request, that request is passed immediately; however, when multiple requests are received, the arbiter uses arbitration rules to determine the order in which to pass requests to the memory device.

Arbitration Rules

The arbiter uses the following arbitration rules:

• If only one master is issuing a request, grant that request immediately.
• If there are outstanding requests from two or more masters, the arbiter applies the following tests, in order:
  1. Is there a read request? If so, the arbiter grants the read request ahead of any write requests.
  2. Otherwise, the arbiter grants the oldest request first.

8.2.5 Rank Timer

The rank timer maintains rank-specific timing information, and performs the following functions:

• Ensures that only four activates occur within a specified timing window.
• Manages the read-to-write and write-to-read bus turnaround time.
• Manages the time-to-activate delay between different banks.

8.2.6 Read Data Buffer and Write Data Buffer

The read data buffer receives data from the PHY and passes that data through the input interface to the master. The write data buffer receives write data from the input interface and passes that data to the PHY, upon approval of the write request.

8.2.7 ECC Block

The error-correcting code (ECC) block comprises an encoder and a decoder-corrector, which can detect and correct single-bit errors, and detect double-bit errors. The ECC block can remedy errors resulting from noise or other impairments during data transmission.

8.2.8 AFI and CSR Interfaces

The AFI interface provides communication between the controller and the physical layer logic (PHY). The CSR interface provides communication with your system's internal control and status registers. For more information about AFI signals, refer to AFI 4.0 Specification in the Functional Description—UniPHY chapter.

Note: Unaligned reads and writes on the AFI interface are not supported.

Related Links:
• Avalon Interface Specifications
• Functional Description—UniPHY on page 13

8.3 HPC II Controller Features

The HPC II memory controller offers a variety of features.

8.3.1 Data Reordering

The controller implements data reordering to maximize efficiency for read and write commands. The controller can reorder read and write commands as necessary to mitigate bus turn-around time and reduce row conflicts.
Inter-bank data reordering reorders commands going to different bank addresses. Commands going to the same bank address are not reordered. This reordering method implements simple hazard detection at the bank address level.

The controller implements logic to limit the length of time that a command can go unserved. This logic is known as starvation control. In starvation control, a counter is incremented for every command served. You can set a starvation limit to ensure that a waiting command is served immediately when the starvation counter reaches the specified limit.

8.3.2 Pre-emptive Bank Management

Data reordering allows the controller to issue bank-management commands pre-emptively, based on the patterns of incoming commands. The desired page in memory can already be open when a command reaches the AFI interface.

8.3.3 Quasi-1T and Quasi-2T

One controller clock cycle equals two memory clock cycles in a half-rate interface, and four memory clock cycles in a quarter-rate interface. To fully utilize the command bandwidth, the controller can operate in Quasi-1T half-rate and Quasi-2T quarter-rate modes.

In Quasi-1T and Quasi-2T modes, the controller issues two commands on every controller clock cycle. The controller is constrained to issue a row command on the first clock phase and a column command on the second clock phase, or vice versa. Row commands include activate and precharge commands; column commands include read and write commands.

The controller operates in Quasi-1T in half-rate mode, and in Quasi-2T in quarter-rate mode; this operation is transparent and has no user settings.

8.3.4 User Autoprecharge Commands

The autoprecharge read and autoprecharge write commands allow you to indicate to the memory device that this read or write command is the last access to the currently open row. The memory device automatically closes (autoprecharges) the page it is currently accessing, so that the next access to the same bank is quicker. This command is useful for applications that require fast random accesses.

Because the HPC II controller can reorder transactions for best efficiency, when you assert the local_autopch_req signal, the controller evaluates the current command and buffered commands to determine the best autoprecharge operation.

8.3.5 Address and Command Decoding Logic

When the main state machine issues a command to the memory, it asserts a set of internal signals. The address and command decoding logic turns these signals into AFI-specific commands and addresses. The following signals are generated:

• Clock enable and reset signals: afi_cke, afi_rst_n
• Command and address signals: afi_cs_n, afi_ba, afi_addr, afi_ras_n, afi_cas_n, afi_we_n

8.3.6 Low-Power Logic

There are two types of low-power logic: the user-controlled self-refresh logic, and the automatic power-down with programmable time-out logic.

User-Controlled Self-Refresh

When you assert the local_self_rfsh_req signal, the controller completes any currently executing reads and writes, then interrupts the command queue and immediately places the memory into self-refresh mode. When the controller places the memory into self-refresh mode, it responds by asserting an acknowledge signal, local_self_rfsh_ack. You can leave the memory in self-refresh mode for as long as you choose.
To bring the memory out of self-refresh mode, you must deassert the request signal; the controller responds by deasserting the acknowledge signal when the memory is no longer in self-refresh mode.

Note: If a user-controlled refresh request and a system-generated refresh request occur at the same time, the user-controlled refresh takes priority; the system-generated refresh is processed only after the user-controlled refresh request is completed.

Automatic Power-Down with Programmable Time-Out

The controller automatically places the memory in power-down mode to save power if the specified number of idle controller clock cycles is observed in the controller. The Auto Power Down Cycles parameter on the Controller Settings tab allows you to specify a range from 1 to 65,535 idle controller clock cycles. The counter for the programmable time-out starts when there are no user read or write requests in the command queue. Once the controller places the memory in power-down mode, it responds by asserting the acknowledge signal, local_power_down_ack.

8.3.7 ODT Generation Logic

The on-die termination (ODT) generation logic generates the necessary ODT signals for the controller, based on the scheme that Intel recommends.

DDR2 SDRAM

Note: There is no ODT for reads.

Table 82. ODT—DDR2 SDRAM Single Slot Single Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0]

Table 83. ODT—DDR2 SDRAM Single Slot Dual Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0]
Write on mem_cs[1]: ODT enabled on mem_odt[1]

Table 84. ODT—DDR2 SDRAM Dual Slot Single Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[1]
Write on mem_cs[1]: ODT enabled on mem_odt[0]

Table 85. ODT—DDR2 SDRAM Dual Slot Dual Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[2]
Write on mem_cs[1]: ODT enabled on mem_odt[3]
Write on mem_cs[2]: ODT enabled on mem_odt[0]
Write on mem_cs[3]: ODT enabled on mem_odt[1]

DDR3 SDRAM

Note: There is no ODT for reads.

Table 86. ODT—DDR3 SDRAM Single Slot Single Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0]

Table 87. ODT—DDR3 SDRAM Single Slot Dual Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0]
Write on mem_cs[1]: ODT enabled on mem_odt[1]

Table 88. ODT—DDR3 SDRAM Dual Slot Single Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0] and mem_odt[1]
Write on mem_cs[1]: ODT enabled on mem_odt[0] and mem_odt[1]

Table 89. ODT—DDR3 SDRAM Dual Slot Single Chip-select Per DIMM (Read)

Read on mem_cs[0]: ODT enabled on mem_odt[1]
Read on mem_cs[1]: ODT enabled on mem_odt[0]

Table 90. ODT—DDR3 SDRAM Dual Slot Dual Chip-select Per DIMM (Write)

Write on mem_cs[0]: ODT enabled on mem_odt[0] and mem_odt[2]
Write on mem_cs[1]: ODT enabled on mem_odt[1] and mem_odt[3]
Write on mem_cs[2]: ODT enabled on mem_odt[0] and mem_odt[2]
Write on mem_cs[3]: ODT enabled on mem_odt[1] and mem_odt[3]

Table 91. ODT—DDR3 SDRAM Dual Slot Dual Rank Per DIMM (Read)

Read on mem_cs[0]: ODT enabled on mem_odt[2]
Read on mem_cs[1]: ODT enabled on mem_odt[3]
Read on mem_cs[2]: ODT enabled on mem_odt[0]
Read on mem_cs[3]: ODT enabled on mem_odt[1]

8.3.8 Burst Merging

The burst merging feature improves controller efficiency by merging two burst chop commands of sequential addresses into one burst-length command.
Burst merging is opportunistic; it happens when the controller receives commands faster than it can process them (for example, Avalon commands of multiple burst length), or when the controller temporarily stops processing commands due to a refresh.

The burst merging feature is turned off by default when you generate a controller. If your traffic exercises patterns that can be merged, you should turn on burst merging, as follows:

1. In a text editor, open the <variation_name>_c0.v top-level file for your design.
2. Search for the ENABLE_BURST_MERGE parameter in the .v file.
3. Change the ENABLE_BURST_MERGE value from 0 to 1.

8.3.9 ECC

The ECC logic comprises an encoder and a decoder-corrector, which can detect and correct single-bit errors, and detect double-bit errors. The ECC logic is available in multiples of 16, 24, 40, and 72 bits.

Note: For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, ECC logic is limited to widths of 24 and 40.

The ECC logic has the following features:

• Hamming code ECC logic that encodes every 64, 32, 16, or 8 bits of data into 72, 40, 24, or 16 bits of codeword.
• A latency increase of one clock for both writes and reads.
• For a 128-bit interface, ECC is generated as one 64-bit data path with 8 bits of ECC path, plus a second 64-bit data path with 8 bits of ECC path.
• Detects and corrects all single-bit errors.
• Detects all double-bit errors.
• Counts the number of single-bit and double-bit errors.
• Accepts partial writes, which trigger a read-modify-write cycle, for memory devices with DM pins.
• Can inject single-bit and double-bit errors to trigger ECC correction for testing and debugging purposes.
• Generates an interrupt signal when an error occurs.

Note: When using ECC, you must initialize your entire memory content to zero before beginning to write to the memory. If you do not initialize the content to zero, and you read from uninitialized memory locations without having first written to them, you will read junk data, which will trigger an ECC interrupt.

When a single-bit or double-bit error occurs, the ECC logic triggers the ecc_interrupt signal to inform you that an ECC error has occurred. When a single-bit error occurs, the ECC logic reads the error address and writes back the corrected data. When a double-bit error occurs, the ECC logic does not perform any error correction, but asserts the avl_rdata_error signal to indicate that the data is incorrect. The avl_rdata_error signal follows the same timing as the avl_rdata_valid signal.

Enabling autocorrection allows the ECC logic to delay all pending controller activities until the correction completes. You can disable autocorrection and schedule the correction manually when the controller is idle, to ensure better system efficiency. To manually correct ECC errors, follow these steps:

1. When an interrupt occurs, read out the SBE_ERROR register. When a single-bit error occurs, the SBE_ERROR register is equal to one.
2. Read out the ERR_ADDR register.
3. Correct the single-bit error by issuing a dummy write to the memory address stored in the ERR_ADDR register. A dummy write is a write request with the local_be signal zero, which triggers a partial write that is effectively a read-modify-write event. The partial write corrects the data at that address and writes it back.
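A minimal C sketch of the manual-correction sequence above, under stated assumptions: the controller's CSR slave is visible to a processor at CSR_BASE, and the SBE_ERROR and ERR_ADDR register offsets shown are hypothetical (check your generated register map for the real ones). The dummy write must come from a master capable of driving all byte enables low; issue_dummy_write() is a placeholder for however your system performs that.

    #include <stdint.h>

    #define CSR_BASE      0x00080000u  /* hypothetical CSR slave base      */
    #define CSR_SBE_ERROR 0x0134u      /* hypothetical offset of SBE_ERROR */
    #define CSR_ERR_ADDR  0x0138u      /* hypothetical offset of ERR_ADDR  */

    static volatile uint32_t *csr(uint32_t off)
    {
        return (volatile uint32_t *)(CSR_BASE + off);
    }

    /* Placeholder: issue a write with every local_be bit low to addr.
     * This triggers the read-modify-write that rewrites corrected data. */
    extern void issue_dummy_write(uint32_t addr);

    /* Call from the ecc_interrupt handler when autocorrection is disabled. */
    void ecc_scrub_on_interrupt(void)
    {
        if (*csr(CSR_SBE_ERROR) == 1) {          /* step 1: single-bit error? */
            uint32_t addr = *csr(CSR_ERR_ADDR);  /* step 2: failing address   */
            issue_dummy_write(addr);             /* step 3: dummy-write scrub */
        }
    }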
8.3.9.1 Partial Writes

The ECC logic supports partial writes. Along with the address, data, and burst signals, the Avalon-MM interface also supports a byte-enable signal vector, local_be. Every bit of this signal vector represents a byte on the data bus; a logic low on any of these bits instructs the controller not to write to that particular byte, resulting in a partial write.

The ECC code is calculated on all bytes of the data bus. If any bytes are changed, the IP core must recalculate the ECC code and write the new code back to the memory. For partial writes, the ECC logic performs the following steps:

1. The ECC logic sends a read command to the partial write address.
2. Upon receiving return data from the memory for the particular address, the ECC logic decodes the data, checks for errors, and then merges the corrected or correct dataword with the incoming information.
3. The ECC logic issues a write to write back the updated data and the new ECC code.

The following corner cases can occur:

• A single-bit error during the read phase of the read-modify-write process. In this case, the IP core corrects the single-bit error first, increments the single-bit error counter, and then performs a partial write to this corrected decoded data word.
• A double-bit error during the read phase of the read-modify-write process. In this case, the IP core increments the double-bit error counter and issues an interrupt. The IP core writes a new write word to the location of the error. The ECC status register keeps track of the error information.

The following figures show partial write operations for the controller, for full-rate and half-rate configurations, respectively.

Figure 159. Partial Write for the Controller—Full Rate

(Figure: waveform of avl_address, avl_size, avl_be, avl_wdata, mem_dm, and mem_dq for two partial writes; byte enables X1 and XF mask the write data 01234567 and 89ABCDEF onto the memory bus.)

Figure 160. Partial Write for the Controller—Half Rate

(Figure: waveform of avl_address, avl_size, avl_be, avl_wdata, mem_dm, and mem_dq for a partial write with byte enable X1 and write data 01234567.)

8.3.9.2 Partial Bursts

DIMMs that do not have DM pins do not support partial bursts. You must write a minimum of (or multiples of) memory-burst-length-equivalent words to the memory at the same time. The following figure shows a partial burst operation for the controller.

Figure 161. Partial Burst for Controller

(Figure: waveform of avl_address, avl_size, avl_be, avl_wdata, mem_dm, and mem_dq for a partial burst with byte enable X1 and write data 01234567.)

8.4 External Interfaces

This section discusses the interfaces between the controller and other external memory interface components.

8.4.1 Clock and Reset Interface

The clock and reset interface is part of the AFI interface. The controller can have up to two clock domains, which are synchronous to each other. The controller operates with a single clock domain when there is no integrated half-rate bridge, and with two clock domains when there is an integrated half-rate bridge. The clocks are provided by UniPHY. The main controller clock is afi_clk, and the optional half-rate controller clock is afi_half_clk. The main and half-rate clocks must be synchronous and have a 2:1 frequency ratio. The optional quarter-rate controller clock is afi_quarter_clk, which must also be synchronous with afi_clk and have a 4:1 frequency ratio.
8.4.2 Avalon-ST Data Slave Interface

The Avalon-ST data slave interface consists of the following Avalon-ST channels, which together form a single data slave:

• The command channel, which serves as command and address for both read and write operations.
• The write data channel, which carries write data.
• The read data channel, which carries read data.

Related Links: Avalon Interface Specifications

8.4.3 AXI Data Slave Interface

The AXI data interface consists of the following channels, which communicate with the Avalon-ST interface through the AXI to Avalon-ST converter:

• The write address channel, which carries address information for write operations.
• The write data channel, which carries write data.
• The write response channel, which carries write response data.
• The read address channel, which carries address information for read operations.
• The read data channel, which carries read data.

8.4.3.1 Enabling the AXI Interface

This section provides guidance for enabling the AXI interface.

1. To enable the AXI interface, first open in an editor the file appropriate for the required flow, as indicated below:
   • For synthesis flow: <working_dir>/<variation_name>/<variation_name>_c0.v
   • For simulation flow: <working_dir>/<variation_name>_sim/<variation_name>/<variation_name>_c0.v
   • Example design fileset for synthesis: <working_dir>/<variation_name>_example_design/example_project/<variation_name>_example/submodules/<variation_name>_example_if0_c0.v
   • Example design fileset for simulation: <working_dir>/<variation_name>_example_design/simulation/verilog/submodules/<variation_name>_example_sim_e0_if0_c0.v

2. Locate and remove the alt_mem_ddrx_mm_st_converter instantiation from the .v file opened in the preceding step.

3. Instantiate the alt_mem_ddrx_axi_st_converter module into the open .v file. Refer to the following code fragment as a guide:

module ? # (
    parameter
    // AXI parameters
    AXI_ID_WIDTH    = <replace parameter value>,
    AXI_ADDR_WIDTH  = <replace parameter value>,
    AXI_LEN_WIDTH   = <replace parameter value>,
    AXI_SIZE_WIDTH  = <replace parameter value>,
    AXI_BURST_WIDTH = <replace parameter value>,
    AXI_LOCK_WIDTH  = <replace parameter value>,
    AXI_CACHE_WIDTH = <replace parameter value>,
    AXI_PROT_WIDTH  = <replace parameter value>,
    AXI_DATA_WIDTH  = <replace parameter value>,
    AXI_RESP_WIDTH  = <replace parameter value>
) (
    // Existing ports
    ...
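    // (Annotation, not part of the generated template.) The AXI_*_WIDTH
    // values above must track the Avalon-side widths per the AXI Interface
    // Parameters table later in this section, for example:
    //   AXI_ADDR_WIDTH = LOCAL_ADDR_WIDTH + log2(LOCAL_DATA_WIDTH / 8)
    //   AXI_DATA_WIDTH = LOCAL_DATA_WIDTH
    //   AXI_LEN_WIDTH  = LOCAL_SIZE_WIDTH - 1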
    // AXI Interface ports
    // Write address channel
    input  wire [AXI_ID_WIDTH - 1 : 0]       awid,
    input  wire [AXI_ADDR_WIDTH - 1 : 0]     awaddr,
    input  wire [AXI_LEN_WIDTH - 1 : 0]      awlen,
    input  wire [AXI_SIZE_WIDTH - 1 : 0]     awsize,
    input  wire [AXI_BURST_WIDTH - 1 : 0]    awburst,
    input  wire [AXI_LOCK_WIDTH - 1 : 0]     awlock,
    input  wire [AXI_CACHE_WIDTH - 1 : 0]    awcache,
    input  wire [AXI_PROT_WIDTH - 1 : 0]     awprot,
    input  wire                              awvalid,
    output wire                              awready,
    // Write data channel
    input  wire [AXI_ID_WIDTH - 1 : 0]       wid,
    input  wire [AXI_DATA_WIDTH - 1 : 0]     wdata,
    input  wire [AXI_DATA_WIDTH / 8 - 1 : 0] wstrb,
    input  wire                              wlast,
    input  wire                              wvalid,
    output wire                              wready,
    // Write response channel
    output wire [AXI_ID_WIDTH - 1 : 0]       bid,
    output wire [AXI_RESP_WIDTH - 1 : 0]     bresp,
    output wire                              bvalid,
    input  wire                              bready,
    // Read address channel
    input  wire [AXI_ID_WIDTH - 1 : 0]       arid,
    input  wire [AXI_ADDR_WIDTH - 1 : 0]     araddr,
    input  wire [AXI_LEN_WIDTH - 1 : 0]      arlen,
    input  wire [AXI_SIZE_WIDTH - 1 : 0]     arsize,
    input  wire [AXI_BURST_WIDTH - 1 : 0]    arburst,
    input  wire [AXI_LOCK_WIDTH - 1 : 0]     arlock,
    input  wire [AXI_CACHE_WIDTH - 1 : 0]    arcache,
    input  wire [AXI_PROT_WIDTH - 1 : 0]     arprot,
    input  wire                              arvalid,
    output wire                              arready,
    // Read data channel
    output wire [AXI_ID_WIDTH - 1 : 0]       rid,
    output wire [AXI_DATA_WIDTH - 1 : 0]     rdata,
    output wire [AXI_RESP_WIDTH - 1 : 0]     rresp,
    output wire                              rlast,
    output wire                              rvalid,
    input  wire                              rready
);

// Existing wire, register declaration and instantiation
...

// AXI interface instantiation
alt_mem_ddrx_axi_st_converter # (
    .AXI_ID_WIDTH     (AXI_ID_WIDTH),
    .AXI_ADDR_WIDTH   (AXI_ADDR_WIDTH),
    .AXI_LEN_WIDTH    (AXI_LEN_WIDTH),
    .AXI_SIZE_WIDTH   (AXI_SIZE_WIDTH),
    .AXI_BURST_WIDTH  (AXI_BURST_WIDTH),
    .AXI_LOCK_WIDTH   (AXI_LOCK_WIDTH),
    .AXI_CACHE_WIDTH  (AXI_CACHE_WIDTH),
    .AXI_PROT_WIDTH   (AXI_PROT_WIDTH),
    .AXI_DATA_WIDTH   (AXI_DATA_WIDTH),
    .AXI_RESP_WIDTH   (AXI_RESP_WIDTH),
    .ST_ADDR_WIDTH    (ST_ADDR_WIDTH),
    .ST_SIZE_WIDTH    (ST_SIZE_WIDTH),
    .ST_ID_WIDTH      (ST_ID_WIDTH),
    .ST_DATA_WIDTH    (ST_DATA_WIDTH),
    .COMMAND_ARB_TYPE (COMMAND_ARB_TYPE)
) a0 (
    .ctl_clk     (afi_clk),
    .ctl_reset_n (afi_reset_n),
    .awid        (awid),
    .awaddr      (awaddr),
    .awlen       (awlen),
    .awsize      (awsize),
    .awburst     (awburst),
    .awlock      (awlock),
    .awcache     (awcache),
    .awprot      (awprot),
    .awvalid     (awvalid),
    .awready     (awready),
    .wid         (wid),
    .wdata       (wdata),
    .wstrb       (wstrb),
    .wlast       (wlast),
    .wvalid      (wvalid),
    .wready      (wready),
    .bid         (bid),
    .bresp       (bresp),
    .bvalid      (bvalid),
    .bready      (bready),
    .arid        (arid),
    .araddr      (araddr),
    .arlen       (arlen),
    .arsize      (arsize),
    .arburst     (arburst),
    .arlock      (arlock),
    .arcache     (arcache),
    .arprot      (arprot),
    .arvalid     (arvalid),
    .arready     (arready),
    .rid         (rid),
    .rdata       (rdata),
    .rresp       (rresp),
    .rlast       (rlast),
    .rvalid      (rvalid),
    .rready      (rready),
    .itf_cmd_ready         (ng0_native_st_itf_cmd_ready),
    .itf_cmd_valid         (a0_native_st_itf_cmd_valid),
    .itf_cmd               (a0_native_st_itf_cmd),
    .itf_cmd_address       (a0_native_st_itf_cmd_address),
    .itf_cmd_burstlen      (a0_native_st_itf_cmd_burstlen),
    .itf_cmd_id            (a0_native_st_itf_cmd_id),
    .itf_cmd_priority      (a0_native_st_itf_cmd_priority),
    .itf_cmd_autoprecharge (a0_native_st_itf_cmd_autopercharge),
    .itf_cmd_multicast     (a0_native_st_itf_cmd_multicast),
    .itf_wr_data_ready     (ng0_native_st_itf_wr_data_ready),
    .itf_wr_data_valid     (a0_native_st_itf_wr_data_valid),
    .itf_wr_data           (a0_native_st_itf_wr_data),
    .itf_wr_data_byte_en   (a0_native_st_itf_wr_data_byte_en),
    .itf_wr_data_begin     (a0_native_st_itf_wr_data_begin),
    .itf_wr_data_last      (a0_native_st_itf_wr_data_last),
    .itf_wr_data_id        (a0_native_st_itf_wr_data_id),
    .itf_rd_data_ready     (a0_native_st_itf_rd_data_ready),
    .itf_rd_data_valid     (ng0_native_st_itf_rd_data_valid),
    .itf_rd_data           (ng0_native_st_itf_rd_data),
    .itf_rd_data_error     (ng0_native_st_itf_rd_data_error),
    .itf_rd_data_begin     (ng0_native_st_itf_rd_data_begin),
    .itf_rd_data_last      (ng0_native_st_itf_rd_data_last),
    .itf_rd_data_id        (ng0_native_st_itf_rd_data_id)
);

4. Set the required parameters for the AXI interface. The following table lists the available parameters.

5. Export the AXI interface to the top-level wrapper, making it accessible to the AXI master.

6. To add the AXI interface to the Quartus Prime project:
   • On the Assignments > Settings menu in the Quartus Prime software, open the File tab.
   • Add the alt_mem_ddrx_axi_st_converter.v file to the project.

8.4.3.2 AXI Interface Parameters

Table 92. AXI Interface Parameters

AXI_ID_WIDTH: Width of the AXI ID bus. Default value is 4.

AXI_ADDR_WIDTH: Width of the AXI address bus. Must be set according to the Avalon interface address and data bus widths: AXI_ADDR_WIDTH = LOCAL_ADDR_WIDTH + log2(LOCAL_DATA_WIDTH/8), where LOCAL_ADDR_WIDTH is the memory controller Avalon interface address width and LOCAL_DATA_WIDTH is the memory controller Avalon interface data width.

AXI_LEN_WIDTH: Width of the AXI length bus. Default value is 8. Should be set to LOCAL_SIZE_WIDTH - 1, where LOCAL_SIZE_WIDTH is the memory controller Avalon interface burst size width.

AXI_SIZE_WIDTH: Width of the AXI size bus. Default value is 3.

AXI_BURST_WIDTH: Width of the AXI burst bus. Default value is 2.

AXI_LOCK_WIDTH: Width of the AXI lock bus. Default value is 2.

AXI_CACHE_WIDTH: Width of the AXI cache bus. Default value is 4.

AXI_PROT_WIDTH: Width of the AXI protection bus. Default value is 3.

AXI_DATA_WIDTH: Width of the AXI data bus. Should be set to match the Avalon interface data bus width: AXI_DATA_WIDTH = LOCAL_DATA_WIDTH, where LOCAL_DATA_WIDTH is the memory controller Avalon interface input data width.

AXI_RESP_WIDTH: Width of the AXI response bus. Default value is 2.

ST_ADDR_WIDTH: Width of the Avalon interface address. Must be set to match the Avalon interface address bus width: ST_ADDR_WIDTH = LOCAL_ADDR_WIDTH, where LOCAL_ADDR_WIDTH is the memory controller Avalon interface address width.

ST_SIZE_WIDTH: Width of the Avalon interface burst size. ST_SIZE_WIDTH = AXI_LEN_WIDTH + 1.

ST_ID_WIDTH: Width of the Avalon interface ID. Default value is 4. ST_ID_WIDTH = AXI_ID_WIDTH.

ST_DATA_WIDTH: Width of the Avalon interface data. ST_DATA_WIDTH = AXI_DATA_WIDTH.

COMMAND_ARB_TYPE: Specifies the AXI command arbitration type:
• ROUND_ROBIN: arbitrates between the read and write address channels in round robin fashion. Default option.
• WRITE_PRIORITY: the write address channel has priority if both channels send requests simultaneously.
• READ_PRIORITY: the read address channel has priority if both channels send requests simultaneously.

REGISTERED: Setting this parameter to 1 adds an extra register stage in the AXI interface and incurs one extra clock cycle of latency. Default value is 1.
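To make the width relationships in Table 92 concrete, here is a small C sketch that derives the AXI-side parameters from example Avalon-side widths. It is an illustration only (not Intel tooling), and the example width values are assumptions.

    #include <stdio.h>

    /* Exact log2 for power-of-two widths. */
    static unsigned log2u(unsigned x)
    {
        unsigned n = 0;
        while ((1u << n) < x)
            n++;
        return n;
    }

    int main(void)
    {
        /* Example Avalon-side widths (assumed values). */
        unsigned LOCAL_ADDR_WIDTH = 25;  /* Avalon interface address width */
        unsigned LOCAL_DATA_WIDTH = 64;  /* Avalon interface data width    */
        unsigned LOCAL_SIZE_WIDTH = 4;   /* Avalon burst size width        */

        unsigned AXI_ADDR_WIDTH = LOCAL_ADDR_WIDTH + log2u(LOCAL_DATA_WIDTH / 8);
        unsigned AXI_DATA_WIDTH = LOCAL_DATA_WIDTH;
        unsigned AXI_LEN_WIDTH  = LOCAL_SIZE_WIDTH - 1;
        unsigned ST_SIZE_WIDTH  = AXI_LEN_WIDTH + 1;  /* back on the Avalon side */

        /* Prints: AXI_ADDR_WIDTH=28 AXI_DATA_WIDTH=64 AXI_LEN_WIDTH=3 ST_SIZE_WIDTH=4 */
        printf("AXI_ADDR_WIDTH=%u AXI_DATA_WIDTH=%u AXI_LEN_WIDTH=%u ST_SIZE_WIDTH=%u\n",
               AXI_ADDR_WIDTH, AXI_DATA_WIDTH, AXI_LEN_WIDTH, ST_SIZE_WIDTH);
        return 0;
    }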
8.4.3.3 AXI Interface Ports

Table 93. AXI Interface Ports

awid (Input): AXI write address channel ID bus.

awaddr (Input): AXI write address channel address bus.

awlen (Input): AXI write address channel length bus.

awsize (Input): AXI write address channel size bus.

awburst (Input): AXI write address channel burst bus. (The interface supports only INCR and WRAP burst types.)

awlock (Input): AXI write address channel lock bus. (The interface does not support this feature.)

awcache (Input): AXI write address channel cache bus. (The interface does not support this feature.)

awprot (Input): AXI write address channel protection bus. (The interface does not support this feature.)

awvalid (Input): AXI write address channel valid signal.

awready (Output): AXI write address channel ready signal.

wid (Input): AXI write data channel ID bus.

wdata (Input): AXI write data channel data bus.

wstrb (Input): AXI write data channel strobe bus.

wlast (Input): AXI write data channel last burst signal.

wvalid (Input): AXI write data channel valid signal.

wready (Output): AXI write data channel ready signal.

bid (Output): AXI write response channel ID bus.

bresp (Output): AXI write response channel response bus. Response encoding: 'b00 = OKAY; 'b01, 'b10, and 'b11 are reserved.

bvalid (Output): AXI write response channel valid signal.

bready (Input): AXI write response channel ready signal. Must be set to 1; the interface does not support backpressure on the write response channel.

arid (Input): AXI read address channel ID bus.

araddr (Input): AXI read address channel address bus.

arlen (Input): AXI read address channel length bus.

arsize (Input): AXI read address channel size bus.

arburst (Input): AXI read address channel burst bus. (The interface supports only INCR and WRAP burst types.)

arlock (Input): AXI read address channel lock bus. (The interface does not support this feature.)

arcache (Input): AXI read address channel cache bus. (The interface does not support this feature.)

arprot (Input): AXI read address channel protection bus. (The interface does not support this feature.)

arvalid (Input): AXI read address channel valid signal.

arready (Output): AXI read address channel ready signal.

rid (Output): AXI read data channel ID bus.

rdata (Output): AXI read data channel data bus.

rresp (Output): AXI read data channel response bus. Response encoding: 'b00 = OKAY; 'b01 = reserved; 'b10 = data error; 'b11 = reserved.

rlast (Output): AXI read data channel last burst signal.

rvalid (Output): AXI read data channel valid signal.

rready (Input): AXI read data channel ready signal. Must be set to 1; the interface does not support backpressure on the read data channel.

For information about the AXI specification, refer to the ARM website.

Related Links: www.arm.com

8.4.4 Controller-PHY Interface

The interface between the controller and the PHY is part of the AFI interface. The controller assumes that the PHY performs all necessary calibration processes without any interaction with the controller. For more information about AFI signals, refer to AFI 4.0 Specification.

8.4.5 Memory Side-Band Signals

The HPC II controller supports several optional side-band signals.

Self-Refresh (Low Power) Interface

The optional low power self-refresh interface consists of a request signal and an acknowledgement signal, which you can use to instruct the controller to place the memory device into self-refresh mode. This interface is clocked by afi_clk.
When you assert the request signal, the controller places the memory device into self-refresh mode and asserts the acknowledge signal. To bring the memory device out of self-refresh mode, you deassert the request signal; the controller then deasserts the acknowledge signal when the memory device is no longer in self-refresh mode.

Note: For multi-rank designs using the HPC II memory controller, a self-refresh and a user-refresh cannot be made to the same memory chip simultaneously. Also, the self-refresh acknowledge signal indicates that at least one device has entered self-refresh, but does not necessarily mean that all devices have entered self-refresh.

User-Controlled Refresh Interface

The optional user-controlled refresh interface consists of a request signal, a chip select signal, and an acknowledgement signal. This interface provides increased control over worst-case read latency and enables you to issue refresh bursts during idle periods. This interface is clocked by afi_clk.

When you assert a refresh request signal to instruct the controller to perform a refresh operation, that request takes priority over any outstanding read or write requests that might be in the command queue. In addition to the request signal, you must also choose the chip to be refreshed by asserting the refresh chip select signal along with the request signal. If you do not assert the chip select signal with the request signal, unexpected behavior may result.

The controller attempts to perform a refresh as long as the refresh request signal is asserted; if you require only one refresh, deassert the refresh request signal after the acknowledgement signal is received. Holding the request signal high after the acknowledgement is received indicates that further refreshes are required; deassert the request signal after the required number of acknowledgements has been received from the controller. You can issue up to a maximum of nine consecutive refresh commands.

Note: For multi-rank designs using the HPC II memory controller, a self-refresh and a user-refresh cannot be made to the same memory chip simultaneously.

Configuration and Status Register (CSR) Interface

The controller has a configuration and status register (CSR) interface that allows you to configure timing parameters, address widths, and the behavior of the controller. The CSR interface is a 32-bit Avalon-MM slave of fixed address width; if you do not need this feature, you can disable it to save area. This interface is clocked by csr_clk, which is the same as afi_clk, and is always synchronous relative to the main data slave interface.

8.4.6 Controller External Interfaces

The following table lists the controller's external interfaces.

Table 94. Summary of Controller External Interfaces

Clock and Reset Interface (Type: AFI (1)): Clock and reset generated by UniPHY to the controller.

Avalon-ST Data Slave Interface, Command Channel (Type: Avalon-ST (2)): Address and command channel for read and write, single command single data (SCSD).

Avalon-ST Data Slave Interface, Write Data Channel (Type: Avalon-ST (2)): Write data channel, single command multiple data (SCMD).

Avalon-ST Data Slave Interface, Read Data Channel (Type: Avalon-ST (2)): Read data channel, SCMD with read data error response.
Controller-PHY Interface, AFI 4.0 AFI Interface (Type: AFI (1)): Interface between the controller and the PHY.

Memory Side-Band Signals, Self Refresh (Low Power) Interface (Type: Avalon Control & Status Interface (2)): SDRAM-specific signals to place memory into low-power mode.

Memory Side-Band Signals, User-Controlled Refresh Interface (Type: Avalon Control & Status Interface (2)): SDRAM-specific signals to request memory refresh.

Configuration and Status Register (CSR) Interface (Type: Avalon-MM (2)): Enables on-the-fly configuration of memory timing parameters, address widths, and controller behavior.

Notes:
1. For information about AFI signals, refer to AFI 4.0 Specification in the Functional Description—UniPHY chapter.
2. For information about Avalon signals, refer to Avalon Interface Specifications.

Related Links:
• Avalon Interface Specifications
• Functional Description—UniPHY on page 13

8.5 Top-Level Signals Description

The top-level signals include clock and reset signals, local interface signals, controller interface signals, and CSR interface signals.

8.5.1 Clock and Reset Signals

The following table lists the clock and reset signals.

Note: The suffix _n denotes active low signals.

Table 95. Clock and Reset Signals

global_reset_n (Input): The asynchronous reset input to the controller. The IP core derives all other reset signals from resynchronized versions of this signal. This signal holds the PHY, including the PLL, in reset while low.

pll_ref_clk (Input): The reference clock input to the PLL.

phy_clk (Output): The system clock that the PHY provides to the user. All user inputs to and outputs from the controller must be synchronous to this clock.

reset_phy_clk_n (Output): The reset signal that the PHY provides to the user. The IP core asserts reset_phy_clk_n asynchronously and deasserts it synchronously to the phy_clk clock domain.

aux_full_rate_clk (Output): An alternative clock that the PHY provides to the user. This clock always runs at the same frequency as the external memory interface. In half-rate designs, this clock is twice the frequency of phy_clk and you can use it whenever you require a 2x clock. In full-rate designs, the same PLL output as the phy_clk signal drives this clock.

aux_half_rate_clk (Output): An alternative clock that the PHY provides to the user. This clock always runs at half the frequency of the external memory interface. In full-rate designs, this clock is half the frequency of phy_clk and you can use it, for example, to clock the user side of a half-rate bridge. In half-rate designs, or if the Enable Half Rate Bridge option is turned on, the same PLL output that drives the phy_clk signal drives this clock.

dll_reference_clk (Output): Reference clock to feed to an externally instantiated DLL.

reset_request_n (Output): Reset request output that indicates when the PLL outputs are not locked. Use this signal as a reset request input to any system-level reset controller you may have. This signal is always low while the PLL is trying to lock, so Intel advises that any reset logic using this signal detect a reset request on a falling edge rather than by level detection.

soft_reset_n (Input): Edge detect reset input for control by other system reset logic. Assert to cause a complete reset to the PHY, but not to the PLL that the PHY uses.
seriesterminationcontrol (Input for OCT slave; Output for OCT master): Required signal for the PHY to provide the series termination calibration value. Must be connected to a user-instantiated OCT control block (alt_oct) or another UniPHY instance that is set to OCT master mode. Otherwise an unconnected PHY signal, available for sharing with another PHY.

parallelterminationcontrol (Input for OCT slave; Output for OCT master): Required signal for the PHY to provide the parallel termination calibration value. Must be connected to a user-instantiated OCT control block (alt_oct) or another UniPHY instance that is set to OCT master mode. Otherwise an unconnected PHY signal, available for sharing with another PHY.

oct_rdn (Input, for OCT master): Must connect to a calibration resistor tied to GND on the appropriate RDN pin on the device. (Refer to the appropriate device handbook.)

oct_rup (Input, for OCT master): Must connect to a calibration resistor tied to VCCIO on the appropriate RUP pin on the device. (Refer to the appropriate device handbook.)

dqs_delay_ctrl_import (Input): Allows the use of a DLL in another PHY instance in this PHY instance. Connect the export port on the PHY instance with a DLL to the import port on the other PHY instance.

csr_clk (1) (Output): Clock for the configuration and status register (CSR) interface, which is the same as afi_clk and is always synchronous relative to the main data slave interface.

Note:
1. Applies only to the hard memory controller with multiport front end available in Arria V and Cyclone V devices.

8.5.2 Local Interface Signals

The following table lists the controller local interface signals.

Table 96. Local Interface Signals

avl_addr[] (1) (Input): Memory address at which the burst should start. By default, the IP core maps the local address to the bank interleaving scheme. You can change the ordering via the Local-to-Memory Address Mapping option in the Controller Settings page. This signal must remain stable only during the first transaction of a burst. The constantBurstBehavior property is always false for UniPHY controllers. The IP core sizes the width of this bus according to the following equations (a packing sketch for the full-rate case appears after this table):

• Full-rate controllers:
  For one chip select: width = row bits + bank bits + column bits - 1.
  For multiple chip selects: width = chip bits* + row bits + bank bits + column bits - 1.
  If the bank address is 2 bits wide, the row is 13 bits wide, and the column is 10 bits wide, the local address is 24 bits wide. To map local_address to the bank, row, and column address:
  avl_addr is 24 bits wide
  avl_addr[23:11] = row address[12:0]
  avl_addr[10:9] = bank address[1:0]
  avl_addr[8:0] = column address[9:1]
  The IP core ignores the least significant bit (LSB) of the column address (multiples of two) on the memory side, because the local data width is twice that of the memory data bus width.

• Half-rate controllers:
  For one chip select: width = row bits + bank bits + column bits - 2.
  For multiple chip selects: width = chip bits* + row bits + bank bits + column bits - 2.
  If the bank address is 2 bits wide, the row is 13 bits wide, and the column is 10 bits wide, the local address is 23 bits wide.
  To map local_address to the bank, row, and column address:
  avl_addr is 23 bits wide
  avl_addr[22:10] = row address[12:0]
  avl_addr[9:8] = bank address[1:0]
  avl_addr[7:0] = column address[9:2]
  The IP core ignores the two LSBs of the column address (multiples of four) on the memory side, because the local data width is four times that of the memory data bus width.

• Quarter-rate controllers:
  For one chip select: width = row bits + bank bits + column bits - 3.
  For multiple chip selects: width = chip bits* + row bits + bank bits + column bits - 3.
  If the bank address is 2 bits wide, the row is 13 bits wide, and the column is 10 bits wide, the local address is 22 bits wide.

(* chip bits is a derived value indicating the number of address bits necessary to uniquely address every memory rank in the system; this value is not user configurable.)

• Full-rate hard memory controllers (Arria V and Cyclone V):
  For one chip select: width = row bits + bank bits + column bits - log2(local Avalon data width / memory DQ width).
  For multiple chip selects: width = chip bits* + row bits + bank bits + column bits - log2(local Avalon data width / memory data width).
  If the local Avalon data width is 32, the memory DQ width is 8, the bank address is 3 bits wide, the row is 12 bits wide, and the column is 8 bits wide, the local address is 21 bits wide. To map local_address to the bank, row, and column address:
  avl_addr is 21 bits wide
  avl_addr[20:9] = row address[11:0]
  avl_addr[8:6] = bank address[2:0]
  avl_addr[5:0] = column address[7:2]
  The IP core ignores the two least significant bits of the column address on the memory side because the local data width is four times that of the memory data bus width (multi-port front end).

avl_be[] (2) (Input): Byte enable signal, which you use to mask off individual bytes during writes. avl_be is active high; mem_dm is active low. To map avl_wdata and avl_be to mem_dq and mem_dm, consider a full-rate design with 32-bit avl_wdata and 16-bit mem_dq:
  avl_wdata = <22334455> <667788AA> <BBCCDDEE>
  avl_be = <1100> <0110> <1010>
  These values map to:
  mem_dq = <4455> <2233> <88AA> <6677> <DDEE> <BBCC>
  mem_dm = <1 1> <0 0> <0 1> <1 0> <0 1> <0 1>

avl_burstbegin (3) (Input): The Avalon burst begin strobe, which indicates the beginning of an Avalon burst. Unlike all other Avalon-MM signals, the burst begin signal is not dependent on avl_ready. For write transactions, assert this signal at the beginning of each burst transfer and keep this signal high for one cycle per burst transfer, even if the slave deasserts avl_ready. The IP core samples this signal at the rising edge of phy_clk when avl_write_req is asserted.
If Enable User Auto-Refresh Controls option is turned on, local_refresh_req becomes available and you are responsible for issuing sufficient refresh requests to meet the memory requirements. This option allows complete control over when refreshes are issued to the memory including grouping together multiple refresh commands. Refresh requests take priority over read and write requests, unless the IP core is already processing the requests. Input Controls which chip to issue the user refresh to. The IP core uses this active high signal with local_refresh_req. This signal is as wide as the memory chip select. This signal asserts a high value to each bit that represents the refresh for the corresponding memory chip. (4) avl_read_req local_refresh_req local_refresh_chip For example: If local_refresh_chip signal is assigned with a value of 4’b0101, the controller refreshes the memory chips 0 and 2, and memory chips 1 and 3 are not refreshed. avl_size[] avl_wdata[] Input Controls the number of beats in the requested read or write access to memory, encoded as a binary number. In UniPHY, the IP core supports Avalon burst lengths from 1 to 1024. The IP core derives the width of this signal based on the burst count that you specify in the Maximum Avalon-MM burst length option. With the derived width, you specify a value ranging from 1 to the local maximum burst count specified. This signal must remain stable only during the first transaction of a burst. The constantBurstBehavior property is always false for UniPHY controllers. Input Write data bus. The width of avl_wdata is twice that of the memory data bus for a full-rate controller, four times the memory data bus for a half-rate controller, and eight times the memory data bus for a quarterrate controller. If Generate power-of-2 data bus widths for Qsys and SOPC Builder is enabled, the width is rounded down to the nearest power of 2. Input Write request signal. You cannot assert read request and write request signal at the same time. The controller must deassert reset_phy_clk_n before you can assert avl_write_req. Input User control of autoprecharge. If you turn on Enable Auto- Precharge Control, the local_autopch_req signal becomes available and you can request the controller to issue an autoprecharge write or autoprecharge read command. These commands cause the memory to issue a precharge command to the current bank at the appropriate time without an explicit precharge command from the controller. This feature is particularly useful if you know the current read or write is the last one you intend to issue to the currently open row. The next time you need to use that bank, the access could be quicker as the controller does not need to precharge the bank before activating the row you wish to access. (5) (6) avl_write_req (7) local_autopch_req (8) continued... External Memory Interface Handbook Volume 3: Reference Material 367 8 Functional Description—HPC II Controller Signal Name Direction Description Upon receipt of the local_autopch_req signal, the controller evaluates the pending commands in the command buffer and determines the most efficient autoprecharge operation to perform, reordering commands if necessary. The controller must deassert reset_phy_clk_n before you can assert local_autopch_req. Input local_self_rfsh_chip Controls which chip to issue the user refresh to. The IP core uses this active high signal with local_self_rfsh_req. This signal is as wide as the memory chip select. 
This signal asserts a high value to each bit that represents the refresh for the corresponding memory chip. For example: If local_self_rfsh_chip signal is assigned with a value of 4’b0101, the controller refreshes the memory chips 0 and 2, and memory chips 1 and 3 are not refreshed. Input User control of the self-refresh feature. If you turn on Enable SelfRefresh Controls, you can request that the controller place the memory devices into a self-refresh state by asserting this signal. The controller places the memory in the self-refresh state as soon as it can without violating the relevant timing parameters and responds by asserting local_self_rfsh_ack. You can hold the memory in the self-refresh state by keeping this signal asserted. You can release the memory from the self-refresh state at any time by deasserting local_self_rfsh_req and the controller responds by deasserting local__self_rfsh_ack when it has successfully brought the memory out of the self-refresh state. Output When the memory initialization, training, and calibration are complete, the PHY sequencer asserts ctrl_usr_mode_rdy to the memory controller, which then asserts this signal to indicate that the memory interface is ready for use. Output When the memory initialization, training, and calibration completes successfully, the controller asserts this signal coincident with local_init_done to indicate the memory interface is ready for use. Output When the memory initialization, training, or calibration fails, the controller asserts this signal to indicate that calibration failed. The local_init_done signal will not assert when local_cal_fail asserts. Output Read data bus. The width of avl_rdata is twice that of the memory data bus for a full rate controller; four times the memory data bus for a half rate controller. If Generate power-of-2 data bus widths for Qsys and SOPC Builder is enabled, the width is rounded down to the nearest power of 2. Output Asserted if the current read data has an error. This signal is only available if you turn on Enable Error Detection and Correction Logic. The controller asserts this signal with the avl_rdata_valid signal. If the controller encounters double-bit errors, no correction is made and the controller asserts this signal. Output Read data valid signal. The avl_rdata_valid signal indicates that valid data is present on the read data bus. Output The avl_ready signal indicates that the controller is ready to accept request signals. If controller asserts the avl_ready signal in the clock cycle that it asserts a read or write request, the controller accepts that request. The controller deasserts the avl_ready signal to indicate that it cannot accept any more requests. The controller can buffer eight read or write requests, after which the avl_ready signal goes low. local_self_rfsh_req local_init_done local_cal_success local_cal_fail avl_rdata[] (9) avl_rdata_error (10) avl_rdata_valid (11) avl_ready (12) continued... External Memory Interface Handbook Volume 3: Reference Material 368 8 Functional Description—HPC II Controller Signal Name Direction Description The avl_ready signal is deasserted when any of the following are true: • • • • write data FIFO register is full. controller is waiting for write data when in ECC mode. Refresh request acknowledge, which the controller asserts for one clock cycle every time it issues a refresh. 
Even if you do not turn on Enable User Auto-Refresh Controls, local_refresh_ack still indicates to the local interface that the controller has just issued a refresh command. Output Self refresh request acknowledge signal. The controller asserts and deasserts this signal in response to the local_self_rfsh_req signal. Output Auto power-down acknowledge signal. The controller asserts this signal for one clock cycle every time auto power-down is issued. Output Interrupt signal from the ECC logic. The controller asserts this signal when the ECC feature is turned on, and the controller detects an error. local_self_rfsh_ack local_power_down_ack (13) Timing bank Pool is full. FIFO register that stores read data from the memory device is Output local_refresh_ack ecc_interrupt The The full. The The Notes to Table: 1. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_addr becomes a per port value, avl_addr_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 2. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_be becomes a per port value, avl_be_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 3. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_burstbegin becomes a per port value, avl_burstbegin_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 4. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_read_req becomes a per port value, avl_read_req_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 5. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_size becomes a per port value, avl_size_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 6. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_wdata becomes a per port value, avl_wdata_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 7. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_write_req becomes a per port value, avl_write_req_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 8. This signal is not applicable to the hard memory controller. 9. For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_rdata becomes a per port value, avl_rdata_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 10.This signal is not applicable to the hard memory controller. 11.For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_rdata_valid becomes a per port value, avl_rdata_valid_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 12.For the hard memory controller with multiport front end available in Arria V and Cyclone V devices, avl_ready becomes a per port value, avl_ready_#, where # is a numeral from 0–5, based on the number of ports selected in the Controller tab. 13.This signal is not applicable to the hard memory controller. 
8.5.3 Controller Interface Signals

The following table lists the controller interface signals.

Table 97. Interface Signals
• mem_dq[] (Bidirectional): Memory data bus. This bus is half the width of the local read and write data buses.
• mem_dqs[] (Bidirectional): Memory data strobe signal, which writes data into the memory device and captures read data into the Intel device.
• mem_dqs_n[] (Bidirectional): Inverted memory data strobe signal, which together with the mem_dqs signal improves signal integrity.
• mem_ck (Output): Clock for the memory device.
• mem_ck_n (Output): Inverted clock for the memory device.
• mem_addr[] (Output): Memory address bus.
• mem_ac_parity (Output) (1): Address or command parity signal, generated by the PHY and sent to the DIMM. DDR3 SDRAM only.
• mem_ba[] (Output): Memory bank address bus.
• mem_cas_n (Output): Memory column address strobe signal.
• mem_cke[] (Output): Memory clock enable signals.
• mem_cs_n[] (Output): Memory chip select signals.
• mem_dm[] (Output): Memory data mask signal, which masks individual bytes during writes.
• mem_odt (Output): Memory on-die termination control signal.
• mem_ras_n (Output): Memory row address strobe signal.
• mem_we_n (Output): Memory write enable signal.
• parity_error_n (Output) (1): This signal is not used and should be ignored. The PHY does not have capture circuitry to trigger on the falling edge of mem_err_out_n. See the mem_err_out_n entry below for a complete description and recommendations.
• mem_err_out_n (Input) (1): This is an output of registered DIMMs. When an address-and-command parity error is detected, DDR3 registered DIMMs assert the mem_err_out_n signal in accordance with the memory buffer configuration. Unlike ECC on the data bus, the controller does not automatically correct errors on the address-and-command bus. You should connect this pin to your own falling-edge detection circuitry in order to capture when a parity error occurs. Upon error detection, you can take action such as raising a system interrupt or triggering another appropriate event.

Note:
1. This signal is for registered DIMMs only.

8.5.4 CSR Interface Signals

The following table lists the CSR interface signals.

Table 98. CSR Interface Signals
• csr_addr[] (Input): Register map address. The width of csr_addr is 16 bits.
• csr_be[] (Input): Byte-enable signal, which you use to mask off individual bytes during writes. csr_be is active high.
• csr_clk (Output) (1): Clock for the configuration and status register (CSR) interface, which is the same as afi_clk and is always synchronous relative to the main data slave interface.
• csr_wdata[] (Input): Write data bus. The width of csr_wdata is 32 bits.
• csr_write_req (Input): Write request signal. You cannot assert the csr_write_req and csr_read_req signals at the same time.
• csr_read_req (Input): Read request signal. You cannot assert the csr_read_req and csr_write_req signals at the same time.
• csr_rdata[] (Output): Read data bus. The width of csr_rdata is 32 bits.
• csr_rdata_valid (Output): Read data valid signal. The csr_rdata_valid signal indicates that valid data is present on the read data bus.
• csr_waitrequest (Output): The csr_waitrequest signal indicates that the HPC II is busy and not ready to accept request signals. If the csr_waitrequest signal goes high in the clock cycle when a read or write request is asserted, that request is not accepted. If the csr_waitrequest signal goes low, the HPC II is then ready to accept more requests.

Note to Table:
1. Applies only to the hard memory controller with multiport front end available in Arria V and Cyclone V devices.
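To make the CSR access rules concrete, here is a minimal C sketch of host-side helpers for the interface in Table 98. It assumes the CSR slave is memory-mapped by the system interconnect at a placeholder CSR_BASE, with each register-map address presented at a 32-bit-aligned byte offset; the base address and the address scaling are assumptions, and csr_waitrequest back-pressure is assumed to be absorbed transparently by the Avalon-MM fabric.

#include <stdint.h>

#define CSR_BASE 0xFF800000u /* placeholder base address, design-specific */

/* csr_addr is 16 bits wide; csr_wdata and csr_rdata are 32 bits wide. */
void csr_write(uint16_t addr, uint32_t data)
{
    volatile uint32_t *reg =
        (volatile uint32_t *)(uintptr_t)(CSR_BASE + ((uint32_t)addr << 2));
    *reg = data; /* one write request; write and read requests never overlap */
}

uint32_t csr_read(uint16_t addr)
{
    volatile uint32_t *reg =
        (volatile uint32_t *)(uintptr_t)(CSR_BASE + ((uint32_t)addr << 2));
    return *reg; /* data is valid when the controller asserts csr_rdata_valid */
}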
8.5.5 Soft Controller Register Map

The soft controller register map allows you to control the soft memory controller settings.

Note: Dynamic reconfiguration is not currently supported.

The following table lists the register map for the controller.

Table 99. Soft Controller Register Map

Address 0x100, bits 0, 1, 2, 7:3, 13:8, and 30:14: Reserved (default 0). Reserved for future use.

Address 0x110, bits 15:0: AUTO_PD_CYCLES (Read/Write, default 0x0). The number of idle clock cycles after which the controller should place the memory into power-down mode. The controller is considered to be idle if there are no commands in the command queue. Setting this register to 0 disables the auto power-down mode. The default value of this register depends on the values set during the generation of the design.
Address 0x110, bits 16, 17, 18, and 19: Reserved (default 0). Reserved for future use.
Address 0x110, bits 21:20: ADDR_ORDER (Read/Write, default 00). 00: chip, row, bank, column. 01: chip, bank, row, column. 10 and 11: reserved for future use.
Address 0x110, bits 22, 24:23, and 30:25: Reserved (default 0). Reserved for future use.

Address 0x120, bits 7:0: Column address width (Read/Write). The number of column address bits for the memory devices in your memory interface. The range of legal values is 7–12.
Address 0x120, bits 15:8: Row address width (Read/Write). The number of row address bits for the memory devices in your memory interface. The range of legal values is 12–16.
Address 0x120, bits 19:16: Bank address width (Read/Write). The number of bank address bits for the memory devices in your memory interface. The range of legal values is 2–3.
Address 0x120, bits 23:20: Chip select address width (Read/Write). The number of chip select address bits for the memory devices in your memory interface. The range of legal values is 0–2. If there is only a single chip select in the memory interface, set this field to 0.
Address 0x120, bits 31:24: Reserved. Reserved for future use.

Address 0x121, bits 31:0: Data width representation (word) (Read only). The number of DQS bits in the memory interface. This value can be used to derive the width of the memory interface by multiplying it by the number of DQ pins per DQS pin (typically 8).

Address 0x122, bits 7:0: Chip select representation (Read only). The number of chip selects in binary representation. For example, a design with two chip selects has the value 00000011.
Address 0x122, bits 31:8: Reserved. Reserved for future use.

Address 0x123, bits 3:0: tRCD (Read/Write). The activate-to-read-or-write timing parameter. The range of legal values is 2–11 cycles.
Address 0x123, bits 7:4: tRRD (Read/Write). The activate-to-activate timing parameter. The range of legal values is 2–8 cycles.
Address 0x123, bits 11:8: tRP (Read/Write). The precharge-to-activate timing parameter. The range of legal values is 2–11 cycles.
Address 0x123, bits 15:12: tMRD (Read/Write). The mode register load timing parameter. This value is not used by the controller; the controller derives the correct value from the memory type setting.
Address 0x123, bits 23:16: tRAS (Read/Write). The activate-to-precharge timing parameter. The range of legal values is 4–29 cycles.
Address 0x123, bits 31:24: tRC (Read/Write). The activate-to-activate timing parameter. The range of legal values is 8–40 cycles.

Address 0x124, bits 3:0: tWTR (Read/Write). The write-to-read timing parameter. The range of legal values is 1–10 cycles.
Address 0x124, bits 7:4: tRTP (Read/Write). The read-to-precharge timing parameter. The range of legal values is 2–8 cycles.
Address 0x124, bits 15:8: tFAW (Read/Write). The four-activate window timing parameter. The range of legal values is 6–32 cycles.
Address 0x124, bits 31:16: Reserved. Reserved for future use.

Address 0x125, bits 15:0: tREFI (Read/Write). The refresh interval timing parameter. The range of legal values is 780–6240 cycles.
Address 0x125, bits 23:16: tRFC (Read/Write). The refresh cycle timing parameter. The range of legal values is 12–255 cycles.
Address 0x125, bits 31:24: Reserved. Reserved for future use.

Address 0x126, bits 3:0, 7:4, 11:8, and 15:12: Reserved (default 0). Reserved for future use.
Address 0x126, bits 23:16: Burst Length (Read/Write). Value must match the memory burst length.
Address 0x126, bits 31:24: Reserved. Reserved for future use.

Address 0x130, bit 0: ENABLE_ECC (Read/Write, default 1). When this bit equals 1, it enables the generation and checking of ECC. This bit is active only if ECC was enabled during IP parameterization.
Address 0x130, bit 1: ENABLE_AUTO_CORR (Read/Write). When this bit equals 1, it enables auto-correction when a single-bit error is detected.
Address 0x130, bit 2: GEN_SBE (Read/Write, default 0). When this bit equals 1, it enables the deliberate insertion of single-bit errors, bit 0, in the data written to memory. This bit is used only for testing purposes.
Address 0x130, bit 3: GEN_DBE (Read/Write, default 0). When this bit equals 1, it enables the deliberate insertion of double-bit errors, bits 0 and 1, in the data written to memory. This bit is used only for testing purposes.
Address 0x130, bit 4: ENABLE_INTR (Read/Write, default 1). When this bit equals 1, it enables the interrupt output.
Address 0x130, bit 5: MASK_SBE_INTR (Read/Write, default 0). When this bit equals 1, it masks the single-bit error interrupt.
Address 0x130, bit 6: MASK_DBE_INTR (Read/Write, default 0). When this bit equals 1, it masks the double-bit error interrupt.
Address 0x130, bit 7: CLEAR (Read/Write, default 0). Writing 1 to this self-clearing bit clears the interrupt signal and the error status and error address registers.
Address 0x130, bit 8: MASK_CORDROP_INTR (Read/Write, default 0). When this bit equals 1, it masks the dropped-autocorrection error interrupt.
Address 0x130, bit 9: Reserved. Reserved for future use.

Address 0x131, bit 0: SBE_ERROR (Read only, default 0). Set to 1 when any single-bit error occurs.
Address 0x131, bit 1: DBE_ERROR (Read only, default 0). Set to 1 when any double-bit error occurs.
Address 0x131, bit 2: CORDROP_ERROR (Read only, default 0). Set to 1 when any controller-scheduled autocorrection is dropped.
Address 0x131, bits 7:3: Reserved. Reserved for future use.
Address 0x131, bits 15:8: SBE_COUNT (Read only, default 0). Reports the number of single-bit errors that have occurred since the status register counters were last cleared.
Address 0x131, bits 23:16: DBE_COUNT (Read only, default 0). Reports the number of double-bit errors that have occurred since the status register counters were last cleared.
Address 0x131, bits 31:24: CORDROP_COUNT (Read only, default 0). Reports the number of controller-scheduled autocorrections dropped since the status register counters were last cleared.

Address 0x132, bits 31:0: ERR_ADDR (Read only, default 0). The address of the most recent ECC error. This address is a memory-burst-aligned local address.
Address 0x133, bits 31:0: CORDROP_ADDR (Read only, default 0). The address of the most recent autocorrection that was dropped. This is a memory-burst-aligned local address.

Address 0x134, bit 0: REORDER_DATA (Read/Write). Enables data reordering.
Address 0x134, bits 15:1: Reserved. Reserved for future use.
Address 0x134, bits 23:16: STARVE_LIMIT (Read/Write, default 0). The number of commands that can be served before a starved command.
Address 0x134, bits 31:24: Reserved. Reserved for future use.
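As an illustration of how the ECC fields above fit together, the following C sketch polls the status register at 0x131, fetches the failing address from 0x132, and then writes the self-clearing CLEAR bit (bit 7 of 0x130). It reuses the hedged csr_read()/csr_write() helpers sketched earlier; the field offsets come from Table 99, and everything else is illustrative.

#include <stdint.h>

extern uint32_t csr_read(uint16_t addr);              /* sketched earlier */
extern void     csr_write(uint16_t addr, uint32_t data);

#define ECC_CTRL     0x130u  /* ENABLE_ECC, ENABLE_AUTO_CORR, CLEAR, ... */
#define ECC_STATUS   0x131u  /* SBE_ERROR, DBE_ERROR, SBE_COUNT, DBE_COUNT */
#define ECC_ERR_ADDR 0x132u  /* address of the most recent ECC error */

void check_ecc_status(void)
{
    uint32_t status    = csr_read(ECC_STATUS);
    unsigned sbe       = (status >> 0) & 1u;      /* bit 0: single-bit error seen */
    unsigned dbe       = (status >> 1) & 1u;      /* bit 1: double-bit error seen */
    unsigned sbe_count = (status >> 8) & 0xffu;   /* bits 15:8 */
    unsigned dbe_count = (status >> 16) & 0xffu;  /* bits 23:16 */
    (void)sbe_count; (void)dbe_count;

    if (sbe || dbe) {
        uint32_t err_addr = csr_read(ECC_ERR_ADDR); /* burst-aligned local address */
        (void)err_addr;
        /* Writing 1 to the self-clearing CLEAR bit clears the interrupt and
         * the error status and error address registers. */
        csr_write(ECC_CTRL, csr_read(ECC_CTRL) | (1u << 7));
    }
}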
8.5.6 Hard Controller Register Map

The hard controller register map allows you to control the hard memory controller settings.

Note: Dynamic reconfiguration is not currently supported.

The following table lists the register map for the hard controller. All registers in this map have a default value of 0.

Table 100. Hard Controller Register Map

0x000, bits 3:0: CFG_CAS_WR_LAT (Read/Write). Memory write latency.
0x001, bits 4:0: CFG_ADD_LAT (Read/Write). Memory additive latency.
0x002, bits 4:0: CFG_TCL (Read/Write). Memory read latency.
0x003, bits 3:0: CFG_TRRD (Read/Write). The activate-to-activate, different banks timing parameter.
0x004, bits 5:0: CFG_TFAW (Read/Write). The four-activate window timing parameter.
0x005, bits 7:0: CFG_TRFC (Read/Write). The refresh cycle timing parameter.
0x006, bits 12:0: CFG_TREFI (Read/Write). The refresh interval timing parameter.
0x008, bits 3:0: CFG_TRCD (Read/Write). The activate-to-read/write timing parameter.
0x009, bits 3:0: CFG_TRP (Read/Write). The precharge-to-activate timing parameter.
0x00A, bits 3:0: CFG_TWR (Read/Write). The write recovery timing parameter.
0x00B, bits 3:0: CFG_TWTR (Read/Write). The write-to-read timing parameter.
0x00C, bits 3:0: CFG_TRTP (Read/Write). The read-to-precharge timing parameter.
0x00D, bits 4:0: CFG_TRAS (Read/Write). The activate-to-precharge timing parameter.
0x00E, bits 5:0: CFG_TRC (Read/Write). The activate-to-activate timing parameter.
0x00F, bits 15:0: CFG_AUTO_PD_CYCLES (Read/Write). The number of idle clock cycles after which the controller should place the memory into power-down mode.
0x011, bits 9:0: CFG_SELF_RFSH_EXIT_CYCLES (Read/Write). The self-refresh exit cycles.
0x013, bits 9:0: CFG_PDN_EXIT_CYCLES (Read/Write). The power-down exit cycles.
0x015, bits 3:0: CFG_TMRD (Read/Write). The mode register timing parameter.
0x016, bits 4:0: CFG_COL_ADDR_WIDTH (Read/Write). The number of column address bits for the memory devices in your memory interface.
0x017, bits 4:0: CFG_ROW_ADDR_WIDTH (Read/Write). The number of row address bits for the memory devices in your memory interface.
0x018, bits 2:0: CFG_BANK_ADDR_WIDTH (Read/Write). The number of bank address bits for the memory devices in your memory interface.
0x019, bits 2:0: CFG_CS_ADDR_WIDTH (Read/Write). The number of chip select address bits for the memory devices in your memory interface.
0x035, bits 3:0: CFG_TCCD (Read/Write). CAS#-to-CAS# command delay.
0x036, bits 3:0: CFG_WRITE_ODT_CHIP (Read/Write). Write ODT control.
0x037, bits 3:0: CFG_READ_ODT_CHIP (Read/Write). Read ODT control.
0x040, bits 2:0: CFG_TYPE (Read/Write). Selects the memory type.
0x040, bits 7:3: CFG_BURST_LENGTH (Read/Write). Configures the burst length as a static decimal value.
0x040, bits 9:8: CFG_ADDR_ORDER (Read/Write). Address order selection.
0x040, bit 10: CFG_ENABLE_ECC (Read/Write). Enables the generation and checking of ECC.
0x040, bit 11: CFG_ENABLE_AUTO_CORR (Read/Write). Enables auto-correction when a single-bit error is detected.
0x040, bit 12: CFG_GEN_SBE (Read/Write). When this bit equals 1, it enables the deliberate insertion of single-bit errors, bit 0, in the data written to memory. This bit is used only for testing purposes.
0x040, bit 13: CFG_GEN_DBE (Read/Write). When this bit equals 1, it enables the deliberate insertion of double-bit errors, bits 0 and 1, in the data written to memory. This bit is used only for testing purposes.
0x040, bit 14: CFG_REORDER_DATA (Read/Write). Enables data reordering.
0x040, bit 15: CFG_USER_RFSH (Read/Write). Enables user refresh.
0x040, bit 16: CFG_REGDIMM_ENABLE (Read/Write). Registered DIMM configuration.
0x040, bit 17: CFG_ENABLE_DQS_TRACKING (Read/Write). Enables DQS tracking.
0x040, bit 18: CFG_OUTPUT_REGD (Read/Write). Enables registered output.
0x040, bit 19: CFG_ENABLE_NO_DM (Read/Write). No data mask configuration.
0x040, bit 20: CFG_ENABLE_ECC_CODE_OVERWRITES (Read/Write). Enables ECC code overwrite on double-bit error detection.
0x043, bits 7:0: CFG_INTERFACE_WIDTH (Read/Write). Memory interface width.
0x044, bits 3:0: CFG_DEVICE_WIDTH (Read/Write). Memory device width.
0x045, bit 0: CFG_CAL_REQ (Read/Write). Requests recalibration.
0x045, bits 6:1: CFG_CLOCK_OFF (Read/Write). Disables the memory clock.
0x047, bit 0: STS_CAL_SUCCESS (Read Only). Calibration success.
0x047, bit 1: STS_CAL_FAIL (Read Only). Calibration fail.
0x047, bit 2: STS_SBE_ERROR (Read Only). Single-bit error detected.
0x047, bit 3: STS_DBE_ERROR (Read Only). Double-bit error detected.
0x047, bit 4: STS_CORR_DROPPED (Read Only). Auto-correction dropped.
0x048, bit 0: CFG_ENABLE_INTR (Read/Write). Enables the interrupt.
0x048, bit 1: CFG_MASK_SBE_INTR (Read/Write). Masks the single-bit error interrupt.
0x048, bit 2: CFG_MASK_DBE_INTR (Read/Write). Masks the double-bit error interrupt.
0x048, bit 3: CFG_MASK_DBE_INTR (Write Clear). Clears the interrupt.
0x049, bits 7:0: STS_SBE_COUNT (Read Only). Reports the number of single-bit errors that have occurred since the status register counters were last cleared.
0x04A, bits 7:0: STS_DBE_COUNT (Read Only). Reports the number of double-bit errors that have occurred since the status register counters were last cleared.
0x04B, bits 31:0: STS_ERR_ADDR (Read Only). The address of the most recent ECC error.
0x04F, bit 0: CFG_MASK_CORR_DROPPED_INTR (Read/Write). Masks the auto-correction-dropped interrupt.
0x050, bits 7:0: CFG_MASK_CORR_DROPPED_INTR (Read Only). Auto-correction dropped count.
0x051, bits 31:0: STS_CORR_DROPPED_ADDR (Read Only). Auto-correction dropped address.
0x055, bits 5:0: CFG_STARVE_LIMIT (Read/Write). Starvation limit.
0x056, bits 1:0: CFG_MEM_BL (Read/Write). Burst length.
0x056, bit 2: CFG_MEM_BL (Read/Write). ECC enable.
0x057, bits 1:0: CFG_MEM_BL (Read/Write). Specifies the controller interface width.
0x058, bits 11:0: CMD_PORT_WIDTH (Read/Write). Specifies the per-command-port data width.
0x05A, bits 11:0: CMD_FIFO_MAP (Read/Write). Specifies the command port to write FIFO association.
0x05C, bits 11:0: CFG_CPORT_RFIFO_MAP (Read/Write). Specifies the command port to read FIFO association.
0x05C, bits 23:12: CFG_RFIFO_CPORT_MAP (Read/Write). Port assignment (0–5) associated with each of the N read FIFOs.
0x05C, bits 31:24: CFG_WFIFO_CPORT_MAP (Read/Write). Port assignment (0–5) associated with each of the N write FIFOs.
0x05D, bits 3:0: CFG_WFIFO_CPORT_MAP, continued (Read/Write). Port assignment (0–5) associated with each of the N write FIFOs.
0x05D, bits 14:4: CFG_CPORT_TYPE (Read/Write). Command port type.
0x062, bits 2:0: CFG_CLOSE_TO_FULL (Read/Write). Indicates when the FIFO has this many empty entries left.
0x062, bits 5:3: CFG_CLOSE_TO_EMPTY (Read/Write). Indicates when the FIFO has this many valid entries left.
0x062, bits 11:6: CFG_CLOSE_TO_EMPTY (Read/Write). Port works in synchronous mode.
0x062, bit 12: CFG_INC_SYNC (Read/Write). Sets the number of flip-flops used as the clock synchronizer.
0x067, bits 5:0: CFG_ENABLE_BONDING (Read/Write). Enables bonding for each of the control ports.
0x067, bits 7:6: CFG_DELAY_BONDING (Read/Write). Sets the bonding-input to bonding-output delay.
0x069, bits 5:0: CFG_AUTO_PCH_ENABLE (Read/Write). Controls auto-precharge options.
0x06A, bits 17:0: MP_SCHEDULER_PRIORITY (Read/Write). Sets the absolute user priority of the port.
0x06D, bits 29:0: RCFG_ST_WT (Read/Write). Sets the static weight of the port.
0x06D, bits 31:30: RCFG_SUM_PRI_WT (Read/Write). Sets the sum of static weights for a particular user priority.
0x06E, bits 31:0: RCFG_SUM_PRI_WT (Read/Write). Sets the sum of static weights for a particular user priority.
0x06F, bits 29:0: RCFG_SUM_PRI_WT (Read/Write). Sets the sum of static weights for a particular user priority.
0x0B9, bit 0: CFG_DISABLE_MERGING (Read/Write). Set to 1 to disable command merging.

8.6 Sequence of Operations

Various blocks pass information in specific ways in response to write, read, and read-modify-write commands.

Write Command

When a requesting master issues a write command together with write data, the following events occur:
• The input interface accepts the write command and the write data.
• The input interface passes the write command to the command generator and the write data to the write data buffer.
• The command generator processes the command and sends it to the timing bank pool.
• Once all timing requirements are met and a write-data-ready notification has been received from the write data buffer, the timing bank pool sends the command to the arbiter.
• When rank timing requirements are met, the arbiter grants the command request from the timing bank pool and passes the write command to the AFI interface.
• The AFI interface receives the write command from the arbiter and requests the corresponding write data from the write data buffer.
• The PHY receives the write command and the write data through the AFI interface.

Read Command

When a requesting master issues a read command, the following events occur:
• The input interface accepts the read command.
• The input interface passes the read command to the command generator.
• The command generator processes the command and sends it to the timing bank pool.
• Once all timing requirements are met, the timing bank pool sends the command to the arbiter.
• When rank timing requirements are met, the arbiter grants the command request from the timing bank pool and passes the read command to the AFI interface.
• The AFI interface receives the read command from the arbiter and passes the command to the PHY.
• The PHY receives the read command through the AFI interface, and returns read data through the AFI interface.
• The AFI interface passes the read data from the PHY to the read data buffer.
• The read data buffer sends the read data to the master through the input interface.

Read-Modify-Write Command

A read-modify-write command can occur when ECC is enabled, for partial writes and for ECC correction commands. When a read-modify-write command is issued, the following events occur:
• The command generator issues a read command to the timing bank pool.
• The timing bank pool and arbiter pass the read command to the PHY through the AFI interface.
• The PHY receives the read command, reads data from the memory device, and returns the read data through the AFI interface.
• The read data received from the PHY passes to the ECC block.
• The read data is processed by the write data buffer.
• When the write data buffer issues a read-modify-write data-ready notification to the command generator, the command generator issues a write command to the timing bank pool. The arbiter can then issue the write request to the PHY through the AFI interface.
• When the PHY receives the write request, it passes the data to the memory device.
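The merge step at the heart of the read-modify-write flow can be illustrated in plain C. This is a simplified behavioral sketch, not controller RTL: compute_ecc(), mem_read(), and mem_write() are stand-ins, and an 8-byte data word with a one-byte ECC code is assumed purely for illustration. The point is that the ECC code covers the whole word, so even a one-byte partial write requires reading, merging, re-encoding, and writing back.

#include <stdint.h>

extern uint8_t  compute_ecc(uint64_t data);            /* stand-in encoder */
extern uint64_t mem_read(uint32_t addr, uint8_t *ecc); /* stand-in read    */
extern void     mem_write(uint32_t addr, uint64_t data, uint8_t ecc);

void partial_write(uint32_t addr, uint64_t wdata, uint8_t byte_enable)
{
    uint8_t  old_ecc;
    uint64_t word = mem_read(addr, &old_ecc);   /* read step */

    for (int b = 0; b < 8; b++) {               /* merge step: only enabled bytes */
        if (byte_enable & (1u << b)) {
            uint64_t mask = 0xffull << (8 * b);
            word = (word & ~mask) | (wdata & mask);
        }
    }

    mem_write(addr, word, compute_ecc(word));   /* write-back with fresh ECC */
}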
8.7 Document Revision History

May 2017  2017.05.08  Rebranded as Intel.
October 2016  2016.10.31  Maintenance release.
May 2016  2016.05.02  Maintenance release.
November 2015  2015.11.02
• Changed instances of Quartus II to Quartus Prime.
• Added CFG_GEN_SBE and CFG_GEN_DBE to Hard Controller Register Map table.
May 2015  2015.05.04  Maintenance release.
December 2014  2014.12.15
• Renamed Controller Register Map to Soft Controller Register Map.
• Added Hard Controller Register Map.
August 2014  2014.08.15
• Added "asynchronous" to descriptions of mp_cmd_reset_n_#_reset_n, mp_rfifo_reset_n_#_reset_n, and mp_wfifo_reset_n_#_reset_n signals in the MPFE Signals table.
• Added Reset description to Hard Memory Controller section.
• Added full-rate hard memory controller information for Arria V and Cyclone V to description of avl_addr[] in the Local Interface Signals table.
• Reworded avl_burstbegin description in the Local Interface Signals table.
December 2013  2013.12.16
• Removed references to ALTMEMPHY.
• Removed references to SOPC Builder.
• Removed Half-Rate Bridge information.
• Modified Burst Merging description.
• Expanded description of avl_ready in Local Interface Signals table.
• Added descriptions of local_cal_success and local_cal_fail to Local Interface Signals table.
• Modified description of avl_size in Local Interface Signals table.
• Added guidance to initialize memory before use.
November 2012  2.1
• Added Controller Register Map information.
• Added Burst Merging information.
• Updated User-Controlled Refresh Interface information.
• Changed chapter number from 4 to 5.
June 2012  2.0
• Added LPDDR2 support.
• Added Feedback icon.
November 2011  1.1
• Revised Figure 5–1.
• Added AXI to Avalon-ST Converter information.
• Added AXI Data Slave Interface information.
• Added Half-Rate Bridge information.

9 Functional Description—QDR II Controller

The QDR II, QDR II+, and QDR II+ Xtreme controller translates memory requests from the Avalon Memory-Mapped (Avalon-MM) interface to AFI, while satisfying timing requirements imposed by the memory configuration. QDR II, QDR II+, and QDR II+ Xtreme SRAM has unidirectional data buses; therefore, read and write operations are highly independent of each other, and each has its own interface and state machine.

9.1 Block Description

The following figure shows a block diagram of the QDR II, QDR II+, and QDR II+ Xtreme SRAM controller architecture.
Figure 162. QDR II, QDR II+, and QDR II+ Xtreme SRAM Controller Architecture Block Diagram (the controller comprises Avalon-MM slave read and write interfaces, a write data FIFO, and a command issuing FSM, connected to the PHY through the AFI)

9.1.1 Avalon-MM Slave Read and Write Interfaces

The read and write blocks accept read and write requests, respectively, from the Avalon-MM interface. Each block has a simple state machine that represents the state of the command and address registers, which stores the command and address when a request arrives.

The read data passes through without the controller registering it, as the PHY takes care of read latency. The write data goes through a pipeline stage to delay for a fixed number of cycles as specified by the write latency. In the full-rate burst-length-of-four controller, the write data is also multiplexed into a burst of 2, which is then multiplexed again in the PHY to become a burst of 4 in DDR.

The user interface to the controller has separate read and write Avalon-MM interfaces because reads and writes are independent of each other in the memory device. The separate channels give efficient use of the available bandwidth.

9.1.2 Command Issuing FSM

The command-issuing finite-state machine (FSM) has two states: INIT and INIT_COMPLETE. In the INIT_COMPLETE state, commands are issued immediately as requests arrive using combinational logic and do not require state transitions.

9.1.3 AFI

The QDR II, QDR II+, and QDR II+ Xtreme controller communicates with the PHY using the AFI interface.

In the full-rate burst-length-of-two configuration, the controller can issue both read and write commands in the same clock cycle. In the memory device, both commands are clocked on the positive edge, but the read address is clocked on the positive edge, while the write address is clocked on the negative edge. Care must be taken with how these signals are ordered in the AFI.

For the half-rate burst-length-of-four configuration, the controller also issues both read and write commands, but the AFI width is doubled to fill two memory clocks per controller clock. Because the controller issues only one write command and one read command per controller clock, the AFI read and write signals corresponding to the other memory cycle are tied to no operation (NOP).
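The half-rate slot arrangement can be pictured with a small C sketch. Each controller clock carries two memory-clock slots of AFI command signals; the sketch below packs the (at most) one read and one write into the first slot and leaves the other slot as NOP. Which slot actually carries the commands is an implementation detail of the IP, so the use of slot 0 here is an assumption, and the types are illustrative.

#include <stdbool.h>

typedef struct { bool read; bool write; } afi_slot;  /* one memory clock      */
typedef struct { afi_slot slot[2]; } afi_ctrl_cycle; /* one controller clock  */

/* Pack one controller clock's worth of half-rate, burst-length-4 commands:
 * at most one read and one write per controller clock, with the AFI signals
 * for the other memory cycle tied to no operation (NOP). */
afi_ctrl_cycle pack_half_rate(bool have_read, bool have_write)
{
    afi_ctrl_cycle c = { { { false, false }, { false, false } } };
    c.slot[0].read  = have_read;   /* assumption: commands occupy slot 0 */
    c.slot[0].write = have_write;
    /* slot[1] stays all-false, that is, NOP */
    return c;
}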
For the full-rate burst-length-of-four configuration, the controller alternates between issuing read and write commands every clock cycle. The memory device requires two clock cycles to complete the burst-length-of-four operation, and requires an interleaving of read and write commands.

For information on the AFI, refer to AFI 4.0 Specification in chapter 1, Functional Description—UniPHY.

Related Links
• AFI 4.0 Specification on page 120
  The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Functional Description—UniPHY on page 13

9.2 Avalon-MM and Memory Data Width

The following table lists the data width ratio between the memory interface and the Avalon-MM interface. The half-rate controller does not support burst-of-2 devices because it under-uses the available memory bandwidth. Regardless of the full- or half-rate decision and the device burst length, the Avalon-MM interface must supply all the data for the entire memory burst in a single clock cycle. Therefore, the Avalon-MM data width of the full-rate controller with burst-of-4 devices is four times the memory data width. For width-expanded configurations, the data width is further multiplied by the expansion factor (not shown in the table below).

Table 101. Data Width Ratio

Memory                                Burst Length    Half-Rate Designs   Full-Rate Designs
QDR II                                2-word burst    No Support          2:1
QDR II, QDR II+, and QDR II+ Xtreme   4-word burst    4:1                 4:1

9.3 Signal Descriptions

The following tables list the signals of the controller's Avalon-MM slave interface.

Table 102. Avalon-MM Slave Read Signals
• avl_r_ready (Arria 10: amm_ready_0): width 1; Out; Avalon-MM signal type waitrequest_n.
• avl_r_read_req (Arria 10: amm_read_0): width 1; In; read.
• avl_r_addr (Arria 10: amm_address_0): width 15–25 (UniPHY), 17–23 (Arria 10); In; address.
• avl_r_rdata_valid (Arria 10: amm_readdatavalid_0): width 1; Out; readdatavalid.
• avl_r_rdata (Arria 10: amm_readdata_0): width 9, 18, or 36 (or 8, 16, or 32 if power-of-2 bus is enabled in the controller); Out; readdata.
• avl_r_size (Arria 10: amm_burstcount_0): width log2(MAX_BURST_SIZE) + 1 (UniPHY), ceil(log2(CTRL_QDR2_AVL_MAX_BURST_COUNT + 1)) (Arria 10); In; burstcount.

Note: To obtain the actual signal width, you must multiply the widths in the above table by the data width ratio and the width expansion ratio.

Table 103. Avalon-MM Slave Write Signals
• avl_w_ready (Arria 10: amm_ready_1): width 1; Out; waitrequest_n.
• avl_w_write_req (Arria 10: amm_write_1): width 1; In; write.
• avl_w_addr (Arria 10: amm_address_1): width 15–25 (UniPHY), 17–23 (Arria 10); In; address.
• avl_w_wdata (Arria 10: amm_writedata_1): width 9, 18, or 36 (or 8, 16, or 32 if power-of-2 bus is enabled in the controller); In; writedata.
• avl_w_be (Arria 10: amm_byteenable_1): width 1, 2, or 4; In; byteenable.
• avl_w_size (Arria 10: amm_burstcount_1): width log2(MAX_BURST_SIZE) + 1 (UniPHY), ceil(log2(CTRL_QDR2_AVL_MAX_BURST_COUNT + 1)) (Arria 10); In; burstcount.

Note: To obtain the actual signal width, you must multiply the widths in the above table by the data width ratio and the width expansion ratio.
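The width formulas in the two tables above are easy to misread, so here is a small self-contained C helper that evaluates them; it is a checking aid for parameterization, not generated IP code. The multiplication rule from the table notes is included as well.

#include <stdint.h>

/* ceil(log2(n)) for n >= 1. */
static unsigned ceil_log2(uint32_t n)
{
    unsigned bits = 0;
    for (uint32_t v = n - 1; v != 0; v >>= 1)
        bits++;
    return bits;
}

/* Arria 10 burstcount width: ceil(log2(CTRL_QDR2_AVL_MAX_BURST_COUNT + 1)).
 * (The UniPHY formula is log2(MAX_BURST_SIZE) + 1.) */
unsigned burstcount_width_arria10(uint32_t max_burst_count)
{
    return ceil_log2(max_burst_count + 1);
}

/* Per the table notes, the actual width of a data signal is the table width
 * multiplied by the data width ratio and the width expansion ratio. */
unsigned actual_signal_width(unsigned table_width, unsigned ratio, unsigned expansion)
{
    return table_width * ratio * expansion;
}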
Related Links
Avalon Interface Specifications

9.4 Document Revision History

May 2017  2017.05.08  Rebranded as Intel.
October 2016  2016.10.31  Maintenance release.
May 2016  2016.05.02  Maintenance release.
November 2015  2015.11.02  Maintenance release.
May 2015  2015.05.04  Maintenance release.
December 2014  2014.12.15  Maintenance release.
August 2014  2014.08.15
• Added QDR II+ Xtreme throughout the chapter.
• Added full-rate, burst-length-of-four information to AFI section.
• Revised Avalon-MM Slave Read Signals table to include Arria 10 information.
• Revised Avalon-MM Slave Write Signals table to include Arria 10 information.
December 2013  2013.12.16  Removed references to SOPC Builder.
November 2012  3.3  Changed chapter number from 5 to 6.
June 2012  3.2  Added Feedback icon.
November 2011  3.1  Harvested Controller chapter from 11.0 QDR II and QDR II+ SRAM Controller with UniPHY User Guide.

10 Functional Description—QDR-IV Controller

The QDR-IV controller translates memory requests from the Avalon Memory-Mapped (Avalon-MM) interface to AFI, while satisfying timing requirements imposed by the memory configuration. QDR-IV has two independent bidirectional data ports, each of which can perform read and write operations on each memory clock cycle, subject to bank address and bus turnaround restrictions. To maximize data bus utilization, the QDR-IV controller provides eight separate Avalon interfaces, one for each QDR-IV data port and time slot (at quarter rate).

10.1 Block Description

The following figure shows a block diagram of the QDR-IV controller architecture.

Figure 163. QDR-IV Controller Architecture Block Diagram (the controller comprises Avalon-MM slaves 0 through 7, an Avalon FSM, a queue, a scheduler, address parity and address/write data bus inversion logic, read data bus inversion logic, and an AFI-to-Avalon conversion block)

Note: To achieve maximum performance for a QDR-IV interface on an Arria 10 device, read data bus inversion must be enabled.

10.1.1 Avalon-MM Slave Read and Write Interfaces

To maximize available bandwidth utilization, the QDR-IV external memory controller provides eight separate bidirectional Avalon interfaces—one channel for each of the two QDR-IV data ports in each of the four memory time slots. At the memory device, the PHY ensures that port A commands are issued on the rising edge of the clock, and port B commands are issued on the falling edge of the clock. The Avalon finite state machine (FSM) implements the standard Avalon-MM interface for each channel. New commands arrive in a queue before being sent to the PHY by the scheduler. The scheduler ensures that bank policy and bus turnaround restrictions are met. The controller handles address/data bus inversion, if you have enabled those features.
The controller schedules commands in a simple round-robin fashion, always maintaining the time slot relationship shown in the following table. At each AFI cycle, the scheduler attempts to issue up to eight commands, one from each channel. If a particular command cannot be issued due to bank policy or bus turnaround violations, only the commands from the preceding channels are issued. The remaining commands are issued in the following one or more cycles, depending on whether further violations occur. Read data passes through without the controller registering it, after AFI-to-Avalon bus conversion. The PHY implements read latency requirements.

Table 104. Avalon-MM to Memory Time Slot and Port Mapping

Avalon-MM Slave Interface   Memory Time Slot   Memory Port
0                           0                  A
1                           0                  B
2                           1                  A
3                           1                  B
4                           2                  A
5                           2                  B
6                           3                  A
7                           3                  B

10.1.2 AFI

The QDR-IV controller communicates with the PHY using the AFI interface. The controller supports a quarter-rate burst-length-of-two configuration, and can issue up to eight read and/or write commands per controller clock cycle. The controller may have to reduce the actual number of commands issued to avoid banking and bus turnaround violations. If you use QDR-IV devices with banked operation, you cannot access the same bank in the same memory clock cycle. Write latency is much shorter than read latency; therefore, you must be careful not to place write data on the bus at the same time that read data is driven on the same port. In addition, when switching between read and write operations, a delay may be required to avoid signal reflection and to allow enough time for dynamic on-chip termination (OCT) to work. If necessary, the scheduler in the controller can delay the issuing of commands to avoid these issues.

For information on the AFI, refer to AFI 4.0 Specification in Functional Description—Intel Arria 10 EMIF IP.

Related Links
• AFI 4.0 Specification on page 120
  The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Functional Description—Intel Arria 10 EMIF IP on page 144
  Intel Arria 10 devices can interface with external memory devices clocking at frequencies of up to 1.3 GHz. The external memory interface IP component for Arria 10 devices provides a single parameter editor for creating external memory interfaces, regardless of memory protocol.

10.2 Avalon-MM and Memory Data Width

The following table lists the data width ratio between the memory interface and the Avalon-MM interface. QDR-IV memory devices use a burst length of 2. Because the Avalon-MM interface must supply all of the data for the entire memory burst in a single controller clock cycle, the Avalon-MM data width of the controller is double the memory data width. For width-expanded configurations, the data width is further multiplied by the expansion factor.

Table 105. Data Width Ratio

Memory Burst Length   Controller
2                     2:1
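The fixed channel-to-port relationship in Table 104 and the 2:1 width rule in Table 105 can be captured in a few lines of C; the types and function names below are purely illustrative.

typedef struct { unsigned time_slot; char port; } qdr4_port;

/* Avalon-MM slave interfaces 0..7 map to memory time slots 0..3,
 * alternating between ports A and B, exactly as in Table 104. */
qdr4_port slave_to_port(unsigned slave_index)
{
    qdr4_port p;
    p.time_slot = slave_index / 2;
    p.port      = (slave_index % 2 == 0) ? 'A' : 'B';
    return p;
}

/* QDR-IV burst length is 2, so the Avalon-MM data width is double the
 * memory width, further multiplied by any width-expansion factor. */
unsigned qdr4_avalon_width(unsigned mem_width, unsigned expansion)
{
    return mem_width * 2 * expansion;
}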
10.3 Signal Descriptions

The following table lists the signals of the controller's Avalon-MM slave interface.

Table 106. Avalon-MM Slave Interface Signals
• amm_ready_i: width 1; Out; Avalon-MM signal type waitrequest_n.
• amm_read_i: width 1; In; read.
• amm_write_i: width 1; In; write.
• amm_address_i: width 21–25; In; address.
• amm_readdatavalid_i: width 1; Out; readdatavalid.
• amm_readdata_i: width 18, 36, or 72; Out; readdata.
• amm_writedata_i: width 18, 36, or 72; In; writedata.
• amm_burstcount_i: width ceil(log2(CTRL_QDR4_AVL_MAX_BURST_COUNT + 1)); In; burstcount.
• emif_usr_clk: width 1; Out; clk.
• emif_usr_reset_n: width 1; Out; reset_n.
• global_reset_n: width 1; In; reset_n.

Note:
1. In the above table, the _i suffix represents the Avalon-MM interface index, which has the range 0–7.
2. To obtain the actual signal width for the data ports (readdata, writedata), you must multiply the widths in the above table by the data width ratio and the width expansion ratio.
3. Clock and reset signals:
• emif_usr_clk: User clock domain.
• emif_usr_reset_n: Reset for the clock domain. Asynchronous assertion and synchronous deassertion (to the emif_usr_clk clock signal).
• global_reset_n: Asynchronous reset; causes the memory interface to be reset and recalibrated.

Related Links
Avalon Interface Specifications

10.4 Document Revision History

May 2017  2017.05.08  Rebranded as Intel.
October 2016  2016.10.31  Maintenance release.
May 2016  2016.05.02  Maintenance release.
November 2015  2015.11.02  Maintenance release.
May 2015  2015.05.04  Maintenance release.
December 2014  2014.12.15  Initial release.

11 Functional Description—RLDRAM II Controller

The RLDRAM II controller translates memory requests from the Avalon Memory-Mapped (Avalon-MM) interface to AFI, while satisfying timing requirements imposed by the memory configurations.

11.1 Block Description

The following figure shows a block diagram of the RLDRAM II controller architecture.

Figure 164. RLDRAM II Controller Architecture Block Diagram (the controller comprises Avalon-MM slave read and write interfaces, a write data FIFO, and a command issuing FSM, connected to the PHY through the AFI)

11.1.1 Avalon-MM Slave Interface

The Avalon-MM slave interface accepts read and write requests. A simple state machine represents the state of the command and address registers, which stores the command and address when a request arrives.

The Avalon-MM slave interface decomposes the Avalon-MM address into the memory bank, column, and row addresses. The IP automatically maps the bank address to the LSB of the Avalon address vector.
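As a rough illustration of that decomposition, the following C sketch splits an Avalon-MM address with the bank in the least significant bits, as stated above. RLDRAM II has eight banks (three bank bits, matching the eight bank timers described later); the column and row widths, and the choice to place the column immediately above the bank bits, are illustrative assumptions rather than the IP's documented layout.

#include <stdint.h>

#define BANK_BITS 3   /* RLDRAM II: 8 banks */
#define COL_BITS  7   /* illustrative assumption */
#define ROW_BITS  11  /* illustrative assumption */

typedef struct { uint32_t bank, col, row; } rld2_addr;

rld2_addr decompose(uint32_t avl_addr)
{
    rld2_addr a;
    a.bank = avl_addr & ((1u << BANK_BITS) - 1);               /* bank in the LSBs */
    a.col  = (avl_addr >> BANK_BITS) & ((1u << COL_BITS) - 1);
    a.row  = (avl_addr >> (BANK_BITS + COL_BITS)) & ((1u << ROW_BITS) - 1);
    return a;
}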
The Avalon-MM slave interface includes a burst adaptor, which has two parts:
• The first part is a read and write request combiner that groups requests to sequential addresses into the native memory burst. Provided that the second request arrives within the read and write latency window of the first request, the controller can combine and satisfy both requests with a single memory transaction.
• The second part is the burst divider in the front end of the Avalon-MM interface, which breaks long Avalon bursts into individual requests of sequential addresses, which then pass to the controller state machine.

11.1.2 Write Data FIFO Buffer

The write data FIFO buffer accepts write data from the Avalon-MM interface. The AFI controls the subsequent consumption of the FIFO buffer write data.

11.1.3 Command Issuing FSM

The command issuing finite-state machine (FSM) has three states. The controller is in the INIT state when the PHY initializes the memory. Upon receiving the afi_cal_success signal, the state transitions to INIT_COMPLETE. If the calibration fails, afi_cal_fail is asserted and the state transitions to INIT_FAIL. The PHY receives commands only in the INIT_COMPLETE state.

When a refresh request arrives at the state machine at the same time as a read or write request, the refresh request takes precedence. The read or write request waits until there are no more refresh requests, and is issued immediately if timing requirements are met.

11.1.4 Refresh Timer

With automatic refresh, the refresh timer periodically issues refresh requests to the command issuing FSM. The refresh interval can be set at generation.

11.1.5 Timer Module

The timer module contains one DQ timer and eight bank timers (one per bank). The DQ timer tracks how often read and write requests can be issued, to avoid bus contention. The bank timers track the cycle time (tRC). The 8-bit wide output bus of the bank timers indicates to the command issuing FSM whether each bank can be issued a read, write, or refresh command.

11.1.6 AFI

The RLDRAM II controller communicates with the PHY using the AFI interface. For information on the AFI, refer to AFI 4.0 Specification in Functional Description—UniPHY.

Related Links
• AFI 4.0 Specification on page 120
  The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Functional Description—UniPHY on page 13

11.2 User-Controlled Features

The General Settings tab of the parameter editor contains several features which are disabled by default. You can enable these features as required for your external memory interface.

11.2.1 Error Detection Parity

The error detection feature asserts an error signal if it detects any corrupted data during the read process. The error detection parity protection feature creates a simple parity encoder block which processes all read and write data. For every 8 bits of write data, a parity bit is generated and concatenated to the data before it is written to the memory. During the subsequent read operation, the parity bit is checked against the data bits to ensure data integrity.

When you enable the error detection parity protection feature, the local data width is reduced by one. For example, a nine-bit memory interface presents eight bits of data to the controller interface.
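The parity scheme just described, one parity bit per 8 bits of data, is straightforward to model in C. The sketch below assumes even parity (the description does not state the polarity the IP uses) and shows both the encode step applied on writes and the check applied on reads for a nine-bit memory interface.

#include <stdint.h>
#include <stdbool.h>

/* XOR-reduce a byte to a single parity bit. */
static unsigned parity8(uint8_t byte)
{
    byte ^= byte >> 4;
    byte ^= byte >> 2;
    byte ^= byte >> 1;
    return byte & 1u;
}

/* Encode: concatenate a parity bit onto an 8-bit datum, giving the 9-bit
 * value stored through a nine-bit memory interface. */
uint16_t parity_encode(uint8_t data)
{
    return (uint16_t)((parity8(data) << 8) | data);
}

/* Check: returns true if the stored parity bit matches the stored data. */
bool parity_check(uint16_t stored)
{
    return parity8((uint8_t)stored) == ((stored >> 8) & 1u);
}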
You can enable error detection parity protection in the Controller Settings section of the General Settings tab of the parameter editor.

11.2.2 User-Controlled Refresh

The user-controlled refresh feature allows you to take control of the refresh process that the controller normally performs automatically. You can control when refresh requests occur, and, if there are multiple memory devices, you control which bank receives the refresh signal. When you enable this feature, you disable auto-refresh and assume responsibility for maintaining the necessary average periodic refresh rate. You can enable user-controlled refresh in the Controller Settings section of the General Settings tab of the parameter editor.

11.3 Avalon-MM and Memory Data Width

The following table lists the data width ratio between the memory interface and the Avalon-MM interface. The half-rate controller does not support burst-of-2 devices because it under-uses the available memory bandwidth.

Table 107. Data Width Ratio

Memory Burst Length   Half-Rate Designs   Full-Rate Designs
2-word                No Support          2:1
4-word                4:1                 4:1
8-word                8:1                 8:1

11.4 Signal Descriptions

The following table lists the signals of the controller's Avalon-MM slave interface. For information on the AFI signals, refer to AFI 4.0 Specification in Functional Description—UniPHY.

Table 108. Avalon-MM Slave Signals
• avl_size: width 1 to 11; In; Avalon-MM signal type burstcount.
• avl_ready: width 1; Out; waitrequest_n.
• avl_read_req: width 1; In; read.
• avl_write_req: width 1; In; write.
• avl_addr: width 25; In; address.
• avl_rdata_valid: width 1; Out; readdatavalid.
• avl_rdata: width 18, 36, 72, or 144; Out; readdata.
• avl_wdata: width 18, 36, 72, or 144; In; writedata.

Note: If you are using Qsys, the data width of the Avalon-MM interface is restricted to powers of two. Non-power-of-two data widths are available with the IP Catalog.

Note: The RLDRAM II controller does not support the byteenable signal. If the RLDRAM II controller is used with the Avalon-MM Efficiency Monitor and Protocol Checker, data corruption can occur if the byteenable signal on the efficiency monitor is used. For example, this can occur if using the JTAG Avalon-MM Master component to drive the efficiency monitor.

Related Links
• AFI 4.0 Specification on page 120
  The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Functional Description—UniPHY on page 13

11.5 Document Revision History

May 2017  2017.05.08  Rebranded as Intel.
October 2016  2016.10.31  Maintenance release.
May 2016  2016.05.02  Maintenance release.
November 2015  2015.11.02  Maintenance release.
May 2015  2015.05.04  Maintenance release.
December 2014  2014.12.15  Maintenance release.
August 2014  2014.08.15  Removed occurrence of MegaWizard Plug-In Manager.
December 2013  2013.12.16  Removed references to SOPC Builder.
November 2012  3.3  Changed chapter number from 6 to 7.
June 2012  3.2  Added Feedback icon.
November 2011  3.1  Harvested Controller chapter from 11.0 RLDRAM II Controller with UniPHY IP User Guide.

12 Functional Description—RLDRAM 3 PHY-Only IP

The RLDRAM 3 PHY-only IP works with a customer-supplied memory controller to translate memory requests from user logic to RLDRAM 3 memory devices, while satisfying timing requirements imposed by the memory configurations.

12.1 Block Description

The RLDRAM 3 UniPHY-based IP is a PHY-only offering which you can use with a third-party controller or a controller that you develop yourself. The following figure shows a block diagram of the RLDRAM 3 system architecture.

Figure 165. RLDRAM 3 System Architecture (user logic in the FPGA connects over Avalon-MM to a custom or third-party controller, which connects over the AFI to the RLDRAM 3 UniPHY PHY and on to the RLDRAM 3 memory device)

12.2 Features

The RLDRAM 3 UniPHY-based IP supports features available from major RLDRAM 3 device vendors at speeds of up to 800 MHz. The following list summarizes key features of the RLDRAM 3 UniPHY-based IP:
• support for Arria V GZ and Stratix V devices
• standard AFI interface between the PHY and the memory controller
• quarter-rate and half-rate AFI interface
• maximum frequency of 533 MHz for half-rate operation and 800 MHz for quarter-rate operation
• burst length of 2, 4, or 8
• x18 and x36 memory organization
• common I/O device support
• nonmultiplexed addressing
• multibank write and refresh protocol (programmable through mode register)
• optional use of data mask pins

12.3 RLDRAM 3 AFI Protocol

The RLDRAM 3 UniPHY-based IP communicates with the memory controller using an AFI interface that follows the AFI 4.0 specification. To maximize bus utilization efficiency, the RLDRAM 3 UniPHY-based IP can issue multiple memory read/write operations within a single AFI cycle. The following figure illustrates AFI bus activity when a quarter-rate controller issues four consecutive burst-length 2 read requests.

Figure 166. AFI Bus Activity for Quarter-Rate Controller Issuing Four Burst-Length 2 Read Requests (waveform: afi_cs_n[3:0] pulses 4'b0000 for one AFI cycle while afi_ref_n[3:0] and afi_we_n[3:0] remain 4'b1111; afi_rdata_en_full[3:0] asserts 4'b1111, and after the read latency afi_rdata_valid[3:0] asserts 4'b1111 as data returns on afi_rdata)

The controller does not have to begin a read or write command using channel 0 of the AFI bus. The flexibility afforded by being able to begin a command on any bus channel can facilitate command scheduling in the memory controller.
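The per-channel, active-low chip-select encoding is compact enough to express as a small C helper. It reproduces the values in the waveforms: issuing on all four channels gives afi_cs_n[3:0] = 4'b0000 (Figure 166), while issuing only on channel 1, as in the next figure, gives 4'b1101. The helper and its mask argument are illustrative.

#include <stdint.h>

/* channel_mask has bit i set if a command is issued on AFI channel i
 * in this AFI cycle (4 channels for a quarter-rate interface). */
uint8_t afi_cs_n(uint8_t channel_mask)
{
    return (uint8_t)(~channel_mask & 0x0Fu); /* active low, one bit per channel */
}

/* Examples: afi_cs_n(0x0F) == 0x0 (4'b0000); afi_cs_n(0x02) == 0xD (4'b1101). */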
The following figure illustrates the AFI bus activity when a quarter-rate controller issues a single burst-length 4 read command to the memory on channel 1 of the AFI bus.

Figure 167. AFI Bus Activity for Quarter-Rate Controller Issuing One Burst-Length 4 Read Request (waveform: afi_cs_n[3:0] pulses 4'b1101; afi_rdata_en_full[3:0] asserts 4'b0110, and after the read latency afi_rdata_valid[3:0] asserts 4'b0110 as data returns on afi_rdata)

Note: For information on the AFI, refer to AFI 4.0 Specification in Functional Description—UniPHY.

Related Links
• AFI 4.0 Specification on page 120
  The Altera PHY interface (AFI) 4.0 defines communication between the controller and physical layer (PHY) in the external memory interface IP.
• Functional Description—UniPHY on page 13

12.4 RLDRAM 3 Controller with Arria 10 EMIF Interfaces

The following table lists the RLDRAM 3 signals available for each interface when using Arria 10 EMIF IP.

• pll_ref_clk interface (pll_ref_clock): Clock input. Clock input to the PLL inside EMIF.
• afi_clk interface (afi_clock): Clock output. AFI clock output from the PLL inside EMIF.
• afi_half_clk interface (afi_half_clock): Clock output. AFI clock output from the PLL inside EMIF, running at half speed.
• Memory interface (mem_a, mem_ba, mem_ck, mem_ck_n, mem_cs_n, mem_dk, mem_dk_n, mem_dm, mem_dq, mem_qk, mem_qk_n, mem_ref_n, mem_we_n, mem_reset_n): Conduit. Interface signals between the PHY and the memory device.
• Status interface (local_init_done, local_cal_success, local_cal_fail, local_cal_request): Conduit. Memory interface status signals.
• oct interface (oct_rzqin): Conduit. OCT reference resistor pin for RZQ.
• afi_interface (afi_addr, afi_ba, afi_cs_n, afi_we_n, afi_ref_n, afi_wdata_valid, afi_wdata, afi_dm, afi_rdata, afi_rdata_en_full, afi_rdata_valid, afi_rst_n, afi_cal_success, afi_cal_fail, afi_wlat, afi_rlat): Avalon-MM slave. Altera PHY interface (AFI) signals between the PHY and the memory controller.

12.5 Document Revision History

May 2017  2017.05.08  Rebranded as Intel.
October 2016  2016.10.31  Maintenance release.
May 2016  2016.05.02  Maintenance release.
November 2015  2015.11.02  Maintenance release.
May 2015  2015.05.04  Maintenance release.
December 2014  2014.12.15  Maintenance release.
August 2014  2014.08.15  Added RLDRAM 3 Controller with Arria 10 EMIF Interfaces.
December 2013  2013.12.16  Maintenance release.
November 2012  1.0  Initial release.

13 Functional Description—Example Designs

Generation of your external memory interface IP creates two independent example designs. These example designs illustrate how to instantiate and connect the memory interface for both synthesis and simulation flows.

13.1 Arria 10 EMIF IP Example Designs Quick Start Guide

A new interface and more automated design example flow is available for Arria 10 external memory interfaces. The Example Designs tab is available in the parameter editor when you specify an Arria 10 target device. This tab allows you to select from a list of presets for Intel FPGA development kits and target interface protocols. All tabs are automatically parameterized with appropriate values, based on the preset that you select.
You can specify that the system create directories for simulation and synthesis file sets, and generate the file sets automatically. You can generate an example design specifically for an Intel FPGA development kit, or for any EMIF IP that you generate.

Figure 168. Using the Example Design (flow: design example generation, then either compilation in the simulator followed by functional simulation, or compilation in Quartus Prime followed by timing analysis and hardware testing)

13.1.1 Typical Example Design Workflow

When you generate your EMIF IP, the system creates an example design that includes a traffic generator and external memory interface controller.

Figure 169. Example Design for Synthesis (a traffic generator drives the IP's controller over Avalon-MM; the controller drives the PHY over the AFI; the PHY connects to the memory)

Figure 170. Example Design for Simulation (an abstract instance of the synthesis example design, comprising the traffic generator, controller, and PHY, together with a memory model and a status checker)

Workflow Overview

Figure 171. Generating a Simulation or Synthesis File Set (flow: start the parameter editor, select family and device, specify IP parameters, check Simulation or Synthesis under Example Design Files, generate the example design)

If you select Simulation or Synthesis under Example Design Files on the Example Designs tab, the system creates a complete simulation file set or a complete synthesis file set, in accordance with your selection.

13.1.2 Example Designs Interface Tab

If you have selected an Arria 10 device, the parameter editor includes an Example Designs tab which allows you to parameterize and generate your example designs.

Figure 172. Example Designs Tab in the External Memory Interfaces Parameter Editor

Available Example Designs Section

The Select design pulldown allows you to select the desired example design. At present, EMIF Example Design is the only available choice, and is selected by default.

Example Design Files Section

The example design supports generation, simulation, and Quartus Prime compilation flows for any selected device. The Simulation and Synthesis checkboxes in this section allow you to specify whether to generate a simulation file set, a synthesis file set, or both. Check Simulation to use the example design for simulation, or check Synthesis to use the example design for compilation and hardware. The system creates the specified file sets when you click the Generate Example Design button.
Note: If you do not select the Simulation or Synthesis checkbox, the destination directory contains Qsys design files, which are not compilable by Quartus Prime directly, but which you can view or edit in Qsys.
• To create a compilable project, run the quartus_sh -t make_qii_design.tcl script in the destination directory.
• To create a simulation project, run the quartus_sh -t make_sim_design.tcl script in the destination directory.

Generated HDL Format Section

The Simulation HDL format pulldown allows you to specify the target format for the generated design. At present, only Verilog file sets are supported.

Target Development Kit Section

The Select board pulldown in this section applies the appropriate development kit pin assignments to the example design.
• This setting is available only when you turn on the Synthesis checkbox in the Example Design Files section.
• This setting must match the applied development kit preset, or else an error message appears.

If the value None appears in the Select board pulldown, it indicates that the current parameter selections do not match any development kit configurations. You may apply development kit-specific IP and related parameter settings by selecting one of the presets from the preset library. When you apply a preset, the current IP and other parameter settings are set to match the selected preset. If you want to save your current settings, you should do so before you select a preset. If you do select a preset without saving your prior settings, you can still save the new preset settings under a different name.

If you want to generate the example design for use on your own board, set Select board to None, generate the example design, and then add pin location constraints.

Note: In the 15.1 release, programming file (.sof/.pof) generation is supported for Arria 10 engineering samples only. If you would like to learn more about support for Arria 10 production devices in 15.1, contact your Intel FPGA representative or use the support link on www.altera.com.

13.1.3 Development Kit Preset Workflow

Arria 10 users can take advantage of development kit presets to create an example design that is automatically configured and ready for hardware testing or simulation. The use of presets greatly simplifies the workflow and reduces the steps necessary to have an EMIF example design working on an Arria 10 development kit.

Figure 173. Generating an Example Design for Use With an Arria 10 Development Kit
(Flow: Start the Parameter Editor, Select Arria 10 Device, Apply a Development Kit Preset, Select Traffic Generator 2.0 on the Diagnostics tab (if desired), Set Parameters on the Example Designs tab, Generate Example Design, Compile the Design.)

Follow these steps to generate the example design:
1. Under Memory Interfaces and Controllers (Tools > Qsys > IP Catalog > Library > Memory Interfaces and Controllers), select Arria 10 External Memory Interfaces. The parameter editor appears.
2. Specify a top-level name and the folder for your custom IP variation, and specify an Arria 10 device. Click OK.
3. Select the appropriate preset for your development kit from the Presets Library. Click Apply.
4. By default, the example design is generated with the traditional Traffic Generator.
If you would rather use the new EMIF Configurable Traffic Generator 2.0, select Use configurable Avalon traffic generator 2.0 on the Diagnostics tab of the parameter editor.
5. On the Example Designs tab in the parameter editor, make the following settings:
a. Under Available Example Designs, only EMIF Example Design is available, and is selected by default.
b. Under Example Design Files, select Simulation or Synthesis to have the system create a simulation file set or synthesis file set.
c. Under Generated HDL Format, only Verilog is available, and is selected by default.
d. Under Target Development Kit, select the desired Arria 10 development kit. The development kit that you select here must match the development kit preset that you selected in step 3, or else an error message appears.
Note: The Target Development Kit section is available only if you have specified Synthesis under Example Design Files.
6. Click the Generate Example Design button at the top of the parameter editor. The software generates all files necessary to run simulations and hardware tests on the selected development kit.

Related Links
Support Center Web Page on www.altera.com

13.1.4 Compiling and Simulating the Design

This section explains how to simulate your EMIF example design.

Figure 174. Procedure for Simulating the Design
(Flow: Change to Simulation Directory, Run Simulation Script, Analyze Results.)

1. Change to the simulation directory.
2. Run the simulation script for the simulator of your choice. Refer to the table below.
3. Analyze the results.

Table 109. Steps to Run Simulation

Riviera-PRO (working directory: <variation_name>_example_design\sim\aldec)
To simulate the example design using the Riviera-PRO simulator, follow these steps:
a. At the Linux* or Windows* shell command prompt, change directory to <variation_name>_example_design\sim\aldec.
b. Execute the rivierapro_setup.tcl script by typing the following command at the Linux or Windows command prompt: vsim -do rivierapro_setup.tcl.
c. To compile and elaborate the design after the script loads, type: ld_debug.
d. Type run -all to run the simulation.
e. A successful simulation ends with the message "--- SIMULATION PASSED ---".

NCSim (working directory: <variation_name>_example_design\sim\cadence)
To simulate the example design using the NCSim simulator, follow these steps:
a. At the Linux shell command prompt, change directory to <variation_name>_example_design\sim\cadence.
b. Run the simulation by typing the following at the command prompt: sh ncsim_setup.sh.
c. A successful simulation ends with the message "--- SIMULATION PASSED ---".

ModelSim* (working directory: <variation_name>_example_design\sim\mentor)
To simulate the example design using the ModelSim simulator, follow these steps:
a. At the Linux or Windows shell command prompt, change directory to <variation_name>_example_design\sim\mentor.
b. Perform one of the following:
• Execute the msim_setup.tcl script by typing the following command at the Linux or Windows command prompt: vsim -do msim_setup.tcl.
• Type the following command at the ModelSim command prompt: do msim_setup.tcl.
c. A successful simulation ends with the message "--- SIMULATION PASSED ---".

VCS (working directory: <variation_name>_example_design\sim\synopsys\vcsmx)
To simulate the example design using the VCS simulator, follow these steps:
a. At the Linux shell command prompt, change directory to <variation_name>_example_design\sim\synopsys\vcsmx.
b. Run the simulation by typing the following at the command prompt: sh vcsmx_setup.sh.
c. A successful simulation ends with the message "--- SIMULATION PASSED ---".

13.1.5 Compiling and Testing the Design in Hardware

This section explains how to compile and test your EMIF example design in hardware.

Figure 175. Procedure for Testing in Hardware
(Flow: Arrange to Control and Monitor Pins, Compile the Design, Configure Board with the SOF File, Issue Reset and Monitor for Success.)

Follow these steps to compile and test the design in hardware:
1. Before compiling a design, select a method for controlling and monitoring the following pins:
• global_reset_reset_n
• <variation_name>_status_local_cal_fail
• <variation_name>_status_local_cal_success
• <variation_name>_tg_0_traffic_gen_fail
• <variation_name>_tg_0_traffic_gen_pass
• <variation_name>_tg_0_traffic_gen_timeout
• Additional pins that may exist if the example design contains more than one traffic generator.
2. Assign pins by one of the following means, as appropriate for your situation:
• If you have selected a development kit preset, the above pins are automatically assigned to switches and LEDs on the board. See the documentation accompanying your specific development kit for details.
• If you are using a non-development kit board, assign the above pins to appropriate locations on your board.
• Alternatively, the above pins can be controlled and monitored using Intel in-system sources and probes. To instantiate these in your design, add the following line to your Quartus Prime settings (.qsf) file: set_global_assignment -name VERILOG_MACRO "ALTERA_EMIF_ENABLE_ISSP=1"
3. Compile the design by clicking Processing ➤ Start Compilation in the Quartus Prime software.
4. Configure the FPGA on the board using the generated (.sof) file.
5. Issue a reset on the global_reset_reset_n port. If the traffic generator test completes successfully, the <variation_name>_status_local_cal_success and <variation_name>_tg_0_traffic_gen_pass signals go high, indicating success.
6. If you are using the Traffic Generator 2.0, you can configure and monitor the traffic generator using the EMIF Debug Toolkit. Refer to Configuring the Traffic Generator 2.0.

13.2 Testing the EMIF Interface Using the Traffic Generator 2.0

The EMIF Configurable Traffic Generator 2.0 can assist in debugging and stress-testing your external memory interface. The Traffic Generator 2.0 supports Arria 10 and later device families.

Key Features

The Traffic Generator 2.0 has the following key features:
• Is a standalone, soft-logic component that resides in the FPGA core.
• Is independent of the FPGA architecture and the external memory protocol in use.
• Offers configuration options for the generation of reads and writes, addressing, data, and data mask.

For information on using the Traffic Generator 2.0 from within the EMIF Debug Toolkit, refer to Traffic Generator 2.0 in External Memory Interface Debug Toolkit.

13.2.1 Configurable Traffic Generator 2.0 Configuration Options

The EMIF Configurable Traffic Generator 2.0 offers a range of configuration options for fast debugging of your external memory interface.
You configure the traffic generator by modifying address-mapped registers through the simulation test bench file or by creating a configuration test stage.

Configuration Syntax

The test bench includes an example test procedure. The syntax for writing to a configuration register is as follows:

tg_send_cfg_write_<index>(<Register Name>, <Value to be written>);

The index value represents the index of the memory interface (which is usually 0, but can be 0 or 1 in Ping Pong PHY mode). The register name values are listed in the tables of configuration options, and can also be found in the altera_emif_avl_tg_defs.sv file, in the same directory as the test bench. The final configuration command must be a write of any value to TG_START, which starts the traffic generator for the specified interface.

Configuration Options

Configuration options are divided into read/write, address, data, and data mask generation categories. A short usage example follows Table 110 below.

Table 110. Configuration Options for Read/Write Generation
• TG_START: Perform a write to this register to initiate the traffic generation test.
• TG_LOOP_COUNT (Toolkit GUI: Loops): Specifies the number of read/write loops to perform before completion of the test stage. A loop is a single iteration of writes and reads. Upon completion, the base address is either incremented (SEQ or RAND_SEQ) or replaced by a newly generated random address (RAND or RAND_SEQ).
• TG_WRITE_COUNT (Toolkit GUI: Writes per block): Specifies the number of writes to be performed in each loop of the test.
• TG_READ_COUNT (Toolkit GUI: Reads per block): Specifies the number of reads to be performed in each loop of the test. This register must have the same value as TG_WRITE_COUNT.
• TG_WRITE_REPEAT_COUNT: Specifies the number of times each write transaction is repeated.
• TG_READ_REPEAT_COUNT: Specifies the number of times each read transaction is repeated.
• TG_BURST_LENGTH: Configures the length of each write/read burst to memory.
• TG_CLEAR_FIRST_FAIL: Clears the record of the first failure occurrence. New failure information, such as expected data, read data, and fail address, is written to the corresponding registers following the next failure.
• TG_TEST_BYTEEN: Toggles the byte-enable (data mask enable) register within the traffic generator, which allows the current test to use byte-enable signals.
• TG_DATA_MODE: Specifies the source of data used for data signal generation during the test. Set to 0 to use pseudo-random data. Set to 1 to use user-specified values stored in the static data generators.
• TG_BYTEEN_MODE: Specifies the source of data used for byte-enable signal generation during the test. Set to 0 to use pseudo-random data. Set to 1 to use user-specified values stored in the static data generators.
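As a minimal illustration of the configuration syntax, the following test-bench sketch sets up a short read/write test using only registers from Table 110. It assumes the tg_send_cfg_write_0 task and the register-name constants from altera_emif_avl_tg_defs.sv, as provided by the generated test bench; the specific values are arbitrary.

// Hypothetical test-bench fragment: configure a short read/write test
// on memory interface 0.
tg_send_cfg_write_0(TG_LOOP_COUNT,   32'd16); // 16 write/read loops
tg_send_cfg_write_0(TG_WRITE_COUNT,  32'd8);  // 8 writes per loop...
tg_send_cfg_write_0(TG_READ_COUNT,   32'd8);  // ...matched by 8 reads per loop
tg_send_cfg_write_0(TG_BURST_LENGTH, 32'd1);  // single-beat bursts
tg_send_cfg_write_0(TG_START,        32'd1);  // any value; must be the final write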
Table 111. Configuration Options for Address Generation
• TG_SEQ_START_ADDR_WR_L (Toolkit GUI: Address | Start Address): Specifies the sequential start write address (lower 32 bits).
• TG_SEQ_START_ADDR_WR_H (Toolkit GUI: Address | Start Address): Specifies the sequential start write address (upper 32 bits).
• TG_ADDR_MODE_WR (Toolkit GUI: Address | Address Mode): Specifies how write addresses are generated by writing the value in parentheses to the register address. Values are: randomize the address for every write (0), increment sequentially from a specified address (1), increment sequentially from a random address (2), or perform one-hot addressing (3).
• TG_RAND_SEQ_ADDRS_WR (Toolkit GUI: Number of sequential addresses): Specifies the number of times to increment sequentially on the random base address before generating a new random write address.
• TG_RETURN_TO_START_ADDR (Toolkit GUI: Return to start address): Returns to the start address in deterministic sequential address mode (if 1 is written to TG_ADDR_MODE_WR) after every loop of transactions.
• TG_RANK_MASK_EN (Toolkit GUI: Rank | Mask Mode): Specifies the rank masking mode by writing the value in parentheses to the register address. Values are: disable rank masking (0), maintain a static rank mask (1), cycle through rank masks incrementally (2).
• TG_BANK_MASK_EN (Toolkit GUI: Bank Address | Mask Mode): Specifies the bank masking mode by writing the value in parentheses to the register address. Values are: disable bank masking (0), maintain a static bank mask (1), cycle through bank masks incrementally (2), cycle through only three consecutive bank masks (3).
• TG_ROW_MASK_EN (Toolkit GUI: Row | Mask Mode): Specifies the mode for row masking by writing the value in parentheses to the register address. Values are: disable row masking (0), maintain a static row mask (1), cycle through row masks incrementally (2).
• TG_BG_MASK_EN (Toolkit GUI: Bank Group | Mask Mode): Specifies the mode for bank group masking by writing the value in parentheses to the register address. Values are: disable bank group masking (0), maintain a static bank group mask (1), cycle through bank group masks incrementally (2).
• TG_RANK_MASK (Toolkit GUI: Rank | Mask Value): Specifies the initial rank to be masked into the generated traffic generator address.
• TG_BANK_MASK (Toolkit GUI: Bank Address | Mask Value): Specifies the initial bank to be masked into the generated traffic generator address.
• TG_ROW_MASK (Toolkit GUI: Row | Mask Value): Specifies the initial row to be masked into the generated traffic generator address.
• TG_BG_MASK (Toolkit GUI: Bank Group | Mask Value): Specifies the initial bank group to be masked into the generated traffic generator address.
• TG_SEQ_ADDR_INCR (Toolkit GUI: Sequential Address Increment): Specifies the increment to use when sequentially incrementing the address. This value is used by both deterministic sequential addressing and random sequential addressing. (Refer to TG_ADDR_MODE_WR.)
• TG_SEQ_START_ADDR_RD_L (Toolkit GUI: Start Address): Specifies the sequential start read address (lower 32 bits).
• TG_SEQ_START_ADDR_RD_H (Toolkit GUI: Start Address): Specifies the sequential start read address (upper 32 bits).
• TG_ADDR_MODE_RD (Toolkit GUI: Address Mode): Similar to TG_ADDR_MODE_WR, but for reads.
• TG_RAND_SEQ_ADDRS_RD (Toolkit GUI: Number of sequential addresses): Specifies the number of times to increment the random sequential read address.
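The following sketch, again hypothetical and using the tg_send_cfg_write_0 task from the generated test bench, sets up deterministic sequential addressing per Table 111. Reads and writes must use matching start addresses and address modes, and the increment must be a multiple of the minimum address increment (see the Error Report Register Bits table later in this section).

// Hypothetical fragment: deterministic sequential addressing for interface 0.
tg_send_cfg_write_0(TG_SEQ_START_ADDR_WR_L, 32'h0); // write start address, lower 32 bits
tg_send_cfg_write_0(TG_SEQ_START_ADDR_WR_H, 32'h0); // write start address, upper 32 bits
tg_send_cfg_write_0(TG_ADDR_MODE_WR,        32'd1); // 1 = sequential from specified address
tg_send_cfg_write_0(TG_SEQ_START_ADDR_RD_L, 32'h0); // read start must match write start
tg_send_cfg_write_0(TG_SEQ_START_ADDR_RD_H, 32'h0);
tg_send_cfg_write_0(TG_ADDR_MODE_RD,        32'd1); // read mode must match write mode
tg_send_cfg_write_0(TG_SEQ_ADDR_INCR,       32'd2); // multiple of the minimum increment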
Table 112. Configuration Options for Data Generation
• TG_DATA_SEED (Toolkit GUI: Data | Seed/Fixed Pattern): Specifies an initial value for the data generator corresponding to the index value.
• TG_DATA_MODE (Toolkit GUI: Data | PRBS and Fixed Pattern radio buttons): Specifies whether to treat the initial value of the data generator of corresponding index as a seed for generating pseudo-random data (value of 0) or to keep the initial value static (value of 1).

Table 113. Configuration Options for Data Mask Generation
• TG_BYTEEN_SEED (Toolkit GUI: Data | Seed/Fixed Pattern): Specifies an initial value for the byte-enable generator corresponding to the index value.
• TG_BYTEEN_MODE (Toolkit GUI: Data | PRBS and Fixed Pattern radio buttons): Specifies whether to treat the initial value of the byte-enable generator of corresponding index as a seed for generating pseudo-random data (value of 0) or to keep the initial value static (value of 1).

13.2.1.1 Test Information

In the test bench file, register reads are encoded in a syntax similar to register writes. The following example illustrates the syntax of a register read (a short usage example follows Table 114 below):

integer <Variable Name>;
tg_send_cfg_read_<index>(<Register Name>, <Variable Name>);

In hardware, you can probe the registers storing the test information (such as the persistent PNF per bit, first-fail read address, first-fail read data, and first-fail expected data).

Table 114. Test Information Read-Accessible Through Register Addresses
• TG_PASS: Returns a high value if the traffic generator passes at the end of all the test stages.
• TG_FAIL: Returns a high value if the traffic generator fails at the end of all the test stages.
• TG_FAIL_COUNT_L: Reports the failure count (lower 32 bits).
• TG_FAIL_COUNT_H: Reports the failure count (upper 32 bits).
• TG_FIRST_FAIL_ADDR_L: Reports the address of the first failure (lower 32 bits).
• TG_FIRST_FAIL_ADDR_H: Reports the address of the first failure (upper 32 bits).
• TG_FIRST_FAIL_IS_READ: Indicates that the first failure is a read failure.
• TG_FIRST_FAIL_IS_WRITE: Indicates that the first failure is a write failure.
• TG_VERSION: Reports the traffic generator version number.
• TG_NUM_DATA_GEN: Reports the number of data generators.
• TG_NUM_BYTEEN_GEN: Reports the number of byte-enable generators.
• TG_RANK_ADDR_WIDTH: Reports the rank address width.
• TG_BANK_ADDR_WIDTH: Reports the bank address width.
• TG_ROW_ADDR_WIDTH: Reports the row address width.
• TG_BANK_GROUP_WIDTH: Reports the bank group width.
• TG_RDATA_WIDTH: Reports the width of all data and PNF signals within the traffic generator.
• TG_DATA_PATTERN_LENGTH: Reports the length of the static pattern to be loaded into static per-pin data generators.
• TG_BYTEEN_PATTERN_LENGTH: Reports the length of the static pattern to be loaded into static per-pin byte-enable generators.
• TG_MIN_ADDR_INCR: Reports the minimum address increment permitted for sequential and random-sequential address generation.
• TG_ERROR_REPORT: Reports error bits. Refer to Error Report Register Bits for details.
• TG_PNF: Reads the persistent PNF per bit as an array of 32-bit entries.
• TG_FAIL_EXPECTED_DATA: Reports the first-failure expected data, read as an array of 32-bit entries.
• TG_FAIL_READ_DATA: Reports the first-failure read data, read as an array of 32-bit entries.
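As a brief, hypothetical illustration of the read syntax, the following test-bench fragment checks the overall result and retrieves the first-failure address using registers from Table 114; the variable names are arbitrary.

// Hypothetical fragment: read test status registers for interface 0.
integer tg_pass, fail_addr_lo, fail_addr_hi;
tg_send_cfg_read_0(TG_PASS, tg_pass);                    // 1 if all test stages passed
if (!tg_pass) begin
  tg_send_cfg_read_0(TG_FIRST_FAIL_ADDR_L, fail_addr_lo); // first failing address, lower 32 bits
  tg_send_cfg_read_0(TG_FIRST_FAIL_ADDR_H, fail_addr_hi); // first failing address, upper 32 bits
end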
The configuration error report register contains information on common misconfigurations of the traffic generator. The bit corresponding to a given configuration error is set high when that configuration error is detected. Ensure that all bits in this register are low, to avoid test failures or unexpected behavior due to improper configuration.

Table 115. Error Report Register Bits (LSB = 0x0)
• Bit 0, ERR_MORE_READS_THAN_WRITES: You have requested more read transactions per loop than write transactions per loop.
• Bit 1, ERR_BURSTLENGTH_GT_SEQ_ADDR_INCR: The configured burst length is greater than the configured sequential address increment when address generation is in sequential or random sequential mode.
• Bit 2, ERR_ADDR_DIVISIBLE_BY_GT_SEQ_ADDR_INCR: The configured sequential address increment is less than the minimum required address increment when address generation is in sequential or random sequential mode.
• Bit 3, ERR_SEQ_ADDR_INCR_NOT_DIVISIBLE: The configured sequential address increment is not a multiple of the minimum required address increment when address generation is in sequential or random sequential mode.
• Bit 4, ERR_READ_AND_WRITE_START_ADDRS_DIFFER: The configured start addresses for reads and writes are different when address generation is in sequential mode.
• Bit 5, ERR_ADDR_MODES_DIFFERENT: The configured address modes for reads and writes are different.
• Bit 6, ERR_NUMBER_OF_RAND_SEQ_ADDRS_DIFFERENT: The configured numbers of random sequential addresses for reads and writes are different when address generation is in random sequential mode.
• Bit 7, ERR_REPEATS_SET_TO_ZERO: The number of read and/or write repeats is set to 0.
• Bit 8, ERR_BOTH_BURST_AND_REPEAT_MODE_ACTIVE: The burst length is set greater than 1 and read/write repeats are set greater than 1.
• Bits 9-31: Reserved.

13.2.2 Performing Your Own Tests Using Traffic Generator 2.0

If you want, you can create your own configuration test stage for the EMIF Configurable Traffic Generator. The general flow of a configuration test stage, including the default test stages, is as follows (a worked sketch combining these steps appears after this list):
• Configure the number of loops to be completed by the traffic generator (TG_LOOP_COUNT).
• Configure the number of writes and reads to be completed per loop (TG_WRITE_COUNT and TG_READ_COUNT, respectively).
• Choose the burst length of each write and read (TG_BURST_LENGTH).
• Select the starting write address by writing to the lower and upper bits of the address register (TG_SEQ_START_ADDR_WR_L and TG_SEQ_START_ADDR_WR_H, respectively).
• Select the write address generation mode (TG_ADDR_MODE_WR).
• Select the starting read address by writing to the lower and upper bits of the address register (TG_SEQ_START_ADDR_RD_L and TG_SEQ_START_ADDR_RD_H, respectively).
• Select the read address generation mode (TG_ADDR_MODE_RD).
• If applicable, select the sequential address increment (TG_SEQ_ADDR_INCR).
• Write initial values/seeds to the data and byte-enable generators (TG_DATA_SEED and TG_BYTEEN_SEED).
• Select the generation mode of the data and byte-enable generators (TG_DATA_MODE and TG_BYTEEN_MODE).
• Initiate the test (TG_START).
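The following sketch strings the steps above into one hypothetical configuration stage for interface 0, reusing the fragments shown earlier; all values are illustrative only, and the register names come from altera_emif_avl_tg_defs.sv.

// Hypothetical complete configuration stage for interface 0.
tg_send_cfg_write_0(TG_LOOP_COUNT,          32'd4);       // 4 loops
tg_send_cfg_write_0(TG_WRITE_COUNT,         32'd16);      // 16 writes per loop
tg_send_cfg_write_0(TG_READ_COUNT,          32'd16);      // must equal TG_WRITE_COUNT
tg_send_cfg_write_0(TG_BURST_LENGTH,        32'd1);       // single-beat bursts
tg_send_cfg_write_0(TG_SEQ_START_ADDR_WR_L, 32'h100);     // write start address
tg_send_cfg_write_0(TG_SEQ_START_ADDR_WR_H, 32'h0);
tg_send_cfg_write_0(TG_ADDR_MODE_WR,        32'd1);       // deterministic sequential
tg_send_cfg_write_0(TG_SEQ_START_ADDR_RD_L, 32'h100);     // must match write start address
tg_send_cfg_write_0(TG_SEQ_START_ADDR_RD_H, 32'h0);
tg_send_cfg_write_0(TG_ADDR_MODE_RD,        32'd1);       // must match write address mode
tg_send_cfg_write_0(TG_SEQ_ADDR_INCR,       32'd2);       // sequential increment
tg_send_cfg_write_0(TG_DATA_SEED,           32'hA5A5A5A5); // seed for data generator 0
tg_send_cfg_write_0(TG_BYTEEN_SEED,         32'hFFFFFFFF); // seed for byte-enable generator 0
tg_send_cfg_write_0(TG_DATA_MODE,           32'd0);       // 0 = pseudo-random data
tg_send_cfg_write_0(TG_BYTEEN_MODE,         32'd0);       // 0 = pseudo-random byte enables
tg_send_cfg_write_0(TG_START,               32'd1);       // final command: start the test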
Simulation

For a comprehensive example of how to write your own configuration test for simulation, refer to the test bench file, located at <example_design_directory>/sim/ed_sim/altera_emif_tg_avl_2_<>/sim/altera_emif_avl_tg_2_tb.sv.

To iterate over the data generators or byte-enable generators, you must read the number of data generators and the number of byte-enable generators. These values are mapped to the read-accessible registers TG_NUM_DATA_GEN and TG_NUM_BYTEEN_GEN, respectively. The following example illustrates how you would configure the data generators to continuously output the pattern 0x5A, using the simulation test bench:

integer num_data_generators;
...
tg_send_cfg_read_0(TG_NUM_DATA_GEN, num_data_generators);
tg_send_cfg_write_0(TG_DATA_MODE, 32'h1);
for (i = 0; i < num_data_generators; i = i + 1) begin
    tg_send_cfg_write_0(TG_DATA_SEED + i, 32'h5A);
end

Hardware

Configuration test stages in hardware must be inserted into the RTL, and resemble the single read/write, byte-enable, and block read/write stages in the default test pattern. In most cases, you can modify one of the existing stages to create the desired custom test stage. The stages are linear, finite state machines that write predetermined values to the configuration address-mapped registers. As always, the last state in configuration is a write to address 0x0, or TG_START. The state machine then waits for the traffic generator to return a signal signifying its completion of the test stage. Refer to the aforementioned default test stages as examples of hardware test stages. The default test stages are contained within the following files:
• <example_design_directory>/qii/ed_synth/altera_emif_tg_avl_2_<>/synth/altera_emif_avl_tg_2_rw_stage.sv
• <example_design_directory>/qii/ed_synth/altera_emif_tg_avl_2_<>/synth/altera_emif_avl_tg_2_byteenable_test_stage.sv

You can also configure the configurable traffic generator in real time using the EMIF Debug Toolkit. The configuration settings available in the Toolkit interface are detailed in the Configurable Traffic Generator 2.0 Configuration Options topic.

13.2.3 Signal Splitter Component

The signal splitter (altera_emif_sig_splitter) is an internal IP component which receives a single signal as its input and passes that signal directly to n outputs, where n is an integer value equal to or greater than 1. The signal splitter is useful because Qsys does not directly allow one-to-many connections for conduit interfaces. The signal splitter contains no logic or memory elements. When you configure the signal splitter to have exactly one output port, it is functionally identical to a single wire, and can be replaced by one with no loss of performance.

The rzq_splitter is an instantiation of the signal splitter component specifically for the RZQ signal. The signal splitter facilitates the sharing of one RZQ signal among multiple memory interfaces in an EMIF example design.

13.3 UniPHY-Based Example Designs

For UniPHY-based interfaces, two independent example designs are created, each containing independent RTL files and other project files. You should compile or simulate these designs separately, and the files should not be mixed. Nonetheless, the designs are related, as the simulation example design builds upon the design of the synthesis example design.

13.3.1 Synthesis Example Design

The synthesis example design contains the major blocks shown in the figure below.
• A traffic generator, which is a synthesizable Avalon-MM example driver that implements a pseudo-random pattern of reads and writes to a parameterized number of addresses. The traffic generator also monitors the data read from the memory to ensure it matches the written data, and asserts a failure otherwise.
• An instance of the UniPHY memory interface, which includes a memory controller that moderates between the Avalon-MM interface and the AFI interface, and the UniPHY, which serves as an interface between the memory controller and external memory devices to perform read and write operations.

Figure 176. Synthesis Example Design
(Blocks: Traffic Generator connected over Avalon-MM to the Controller, which connects over AFI to the PHY and the external Memory, within the synthesis example design IP.)

If you are using the Ping Pong PHY feature, the synthesis example design includes two traffic generators issuing commands to two independent memory devices through two independent controllers and a common PHY, as shown in the following figure.

Figure 177. Synthesis Example Design for Ping Pong PHY
(Blocks: Traffic Generator 0 and Traffic Generator 1, each connected over Avalon-MM to Controller 0 and Controller 1 respectively, both connected over AFI to a common PHY driving the memory.)

If you are using RLDRAM 3, the traffic generator in the synthesis example design communicates directly with the PHY using AFI, as shown in the following figure.

Figure 178. Synthesis Example Design for RLDRAM 3 Interfaces
(Blocks: Traffic Generator connected over AFI to the PHY and the external Memory.)

You can obtain the synthesis example design by generating your IP core. The files related to the synthesis example design reside at <variation_name>_example_design/example_project. The synthesis example design includes a Quartus Prime project file (<variation_name>_example_design/example_project/<variation_name>_example.qpf). The Quartus Prime project file can be compiled in the Quartus Prime software, and can be run on hardware.

Note: If one or more of the PLL Sharing Mode, DLL Sharing Mode, or OCT Sharing Mode parameters are set to any value other than No Sharing, the synthesis example design contains two traffic generator/memory interface instances. The two traffic generator/memory interface instances are related only by shared PLL/DLL/OCT connections as defined by the parameter settings. The traffic generator/memory interface instances demonstrate how you can make such connections in your own designs.

13.3.2 Simulation Example Design

The simulation example design contains the major blocks shown in the following figure.
• An instance of the synthesis example design. As described in the previous section, the synthesis example design contains a traffic generator and an instance of the UniPHY memory interface. These blocks default to abstract simulation models where appropriate for rapid simulation.
• A memory model, which acts as a generic model that adheres to the memory protocol specifications. Frequently, memory vendors provide simulation models for specific memory components that you can download from their websites.
• A status checker, which monitors the status signals from the UniPHY IP and the traffic generator, to signal an overall pass or fail condition.

Figure 179. Simulation Example Design
(Blocks: an abstract instance of the synthesis example design, with the Traffic Generator connected over Avalon-MM to the Controller and over AFI to the PHY, driving a Memory Model, monitored by a Status Checker.)

If you are using the Ping Pong PHY feature, the simulation example design includes two traffic generators issuing commands to two independent memory devices through two independent controllers and a common PHY, as shown in the following figure.
Figure 180. Simulation Example Design for Ping Pong PHY
(Blocks: an abstract instance of the synthesis example design, with Traffic Generator 0 and Traffic Generator 1 connected over Avalon-MM to Controller 0 and Controller 1 respectively, sharing a common PHY over AFI, driving Memory Model 0 and Memory Model 1, monitored by a Status Checker.)

If you are using RLDRAM 3, the traffic generator in the simulation example design communicates directly with the PHY using AFI, as shown in the following figure.

Figure 181. Simulation Example Design for RLDRAM 3 Interfaces
(Blocks: an abstract instance of the synthesis example design, with the Traffic Generator connected over AFI to the PHY, driving a Memory Model.)

You can obtain the simulation example design by generating your IP core. The files related to the simulation example design reside at <variation_name>_example_design/simulation. After obtaining the generated files, you must still generate the simulation example design RTL for your desired HDL language. The file <variation_name>_example_design/simulation/README.txt contains details about how to generate the IP and run the simulation in ModelSim or ModelSim Intel FPGA Edition.

13.3.3 Traffic Generator and BIST Engine

The traffic generator and built-in self test (BIST) engine for Avalon-MM memory interfaces generates Avalon-MM traffic on an Avalon-MM master interface. The traffic generator creates read and write traffic, stores the expected read responses internally, and compares the expected responses to the read responses as they arrive. If all reads report their expected response, the pass signal is asserted; if any read responds with unexpected data, a fail signal occurs.

Each operation generated by the traffic generator is a single write or block of writes, followed by a single read or block of reads to the same addresses, which allows the driver to precisely determine the data that should be expected when the read data is returned by the memory interface. The traffic generator comprises a traffic generation block, the Avalon-MM interface, and a read comparison block. The traffic generation block generates addresses and write data, which are then sent out over the Avalon-MM interface. The read comparison block compares the read data received from the Avalon-MM interface to the write data from the traffic generator. If at any time the data received is not the expected data, the read comparison block records the failure, finishes reading all the data, and then signals that there is a failure, and the traffic generator enters a fail state. If all patterns have been generated and compared successfully, the traffic generator enters a pass state.
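The per-bit compare that drives the pass/fail reporting can be pictured with a small sketch. This is not the shipped RTL, only a hypothetical illustration of the pnf_per_bit and pnf_per_bit_persist semantics described in the signal table later in this section (a value of all 1s means no error).

// Hypothetical sketch of the read-compare idea: a bit of pnf_per_bit is 1
// when the corresponding read-data bit matched the expected data, and
// pnf_per_bit_persist accumulates any mismatch seen since reset.
module tg_read_compare_sketch #(parameter DATA_W = 64) (
  input  logic              clk,
  input  logic              reset_n,
  input  logic              rdata_valid,         // a read beat is being compared
  input  logic [DATA_W-1:0] rdata,               // data returned by the interface
  input  logic [DATA_W-1:0] expected_data,       // data the driver wrote earlier
  output logic [DATA_W-1:0] pnf_per_bit,         // last compare: 1 = pass per bit
  output logic [DATA_W-1:0] pnf_per_bit_persist  // cumulative pass/fail per bit
);
  always_ff @(posedge clk or negedge reset_n) begin
    if (!reset_n) begin
      pnf_per_bit         <= '1;
      pnf_per_bit_persist <= '1;
    end else if (rdata_valid) begin
      pnf_per_bit         <= ~(rdata ^ expected_data);
      pnf_per_bit_persist <= pnf_per_bit_persist & ~(rdata ^ expected_data);
    end
  end
endmodule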
Figure 182. Example Driver Operations
(Flow: Initialize, then individual reads/writes and block reads/writes over sequential, random, and sequential/random addresses, then optional user-defined stages with optional user-defined addresses, ending in Pass or Fail.)

Within the traffic generator, there are the following main states:
• Generation of individual reads and writes
• Generation of block reads and writes
• The pass state
• The fail state

Within each of the generation states there are the following substates:
• Sequential address generation
• Random address generation
• Mixed sequential and random address generation

For each of the states and substates, the order and number of operations generated is parameterizable: you can decide how many of each address pattern to generate, or can disable certain patterns entirely if you want. The sequential and random interleave substate takes, in addition to the number of operations to generate, a parameter that specifies the ratio of sequential to random addresses to generate randomly.

13.3.3.1 Read and Write Generation

The traffic generator block can generate individual or block reads and writes.

Individual Read and Write Generation

During the traffic generator's individual read and write generation state, the traffic generation block generates individual write followed by individual read Avalon-MM transactions, where the address for the transactions is chosen according to the specific substate. The width of the Avalon-MM interface is a global parameter for the driver, but each substate can have a parameterizable range of burst lengths for each operation.

Block Read and Write Generation

During the traffic generator's block read and write generation state, the traffic generator block generates a parameterizable number of write operations followed by the same number of read operations. The specific addresses generated for the blocks are chosen by the specific substates. The burst length of each block operation can be parameterized by a range of acceptable burst lengths.

13.3.3.2 Address and Burst Length Generation

The traffic generator block can perform sequential or random addressing.

Sequential Addressing

The sequential addressing substate defines a traffic pattern where addresses are chosen in sequential order starting from a user-definable address. The number of operations in this substate is parameterizable.

Random Addressing

The random addressing substate defines a traffic pattern where addresses are chosen randomly over a parameterizable range. The number of operations in this substate is parameterizable.

Sequential and Random Interleaved Addressing

The sequential and random interleaved addressing substate defines a traffic pattern where addresses are chosen to be either sequential or random based on a parameterizable ratio. The acceptable address range is parameterizable, as is the number of operations to perform in this substate.

13.3.3.3 Traffic Generator Signals

The following table lists the signals used by the traffic generator.

Table 116. Traffic Generator Signals
• clk — Input clock to the traffic generator. For 28 nm devices, use the AFI clock. For 20 nm devices, use emif_usr_clk.
• reset_n — Active-low reset input. For 28 nm devices, typically connected to afi_reset_n. For 20 nm devices, typically connected to emif_usr_reset_n.
• avl_ready, avl_write_req, avl_read_req, avl_addr, avl_size, avl_wdata, avl_rdata, avl_rdata_valid — Refer to the Avalon Interface Specifications.
• pnf_per_bit — Output. Bitwise pass/fail for the last traffic generator read compare of the AFI interface. No errors produces a value of all 1s.
• pnf_per_bit_persist — Output. Cumulative bitwise pass/fail for all previous traffic generator read compares of the AFI interface. No errors produces a value of all 1s.
• pass — Active-high output when the traffic generator tests complete successfully.
• fail — Active-high output when the traffic generator test does not complete successfully.
• test_complete — Active-high output when the traffic generator test completes.

For information about the Avalon signals and the Avalon interface, refer to Avalon Interface Specifications.

Related Links
Avalon Interface Specifications

13.3.3.4 Traffic Generator Add-Ons

Some optional components that can be useful for verifying aspects of the controller and PHY operation are generated in conjunction with certain user-specified options. These add-on components are self-contained, and are not part of the controller or PHY, nor the traffic generator.

User Refresh Generator

The user refresh generator sends refresh requests to the memory controller when user refresh is enabled. The memory controller returns an acknowledgement signal and then issues the refresh command to the memory device. The user refresh generator is created when you turn on Enable User Refresh on the Controller Settings tab of the parameter editor.

13.3.3.5 Traffic Generator Timeout Counter

The traffic generator timeout counter uses the Avalon interface clock. When a test fails due to driver failure or timeout, the fail signal is asserted. When a test has failed, the traffic generator must be reset with the reset_n signal.

13.3.4 Creating and Connecting the UniPHY Memory Interface and the Traffic Generator in Qsys

The traffic generator can be used in Qsys as a stand-alone component for use within a larger system. To create the system in Qsys, perform the following steps:
1. Start Qsys.
2. On the Project Settings tab, select the required device from the Device Family list.
3. In the Component Library, choose a UniPHY memory interface to instantiate. For example, under Library > Memories and Memory Controllers > External Memory Interfaces, select DDR3 SDRAM Controller with UniPHY.
4. Configure the parameters for your instantiation of the memory interface.
5. In the Component Library, find the example driver and instantiate it in the system. For example, under Library > Memories and Memory Controllers > Pattern Generators, select Avalon-MM Traffic Generator and BIST Engine.
6. Configure the parameters for your instantiation of the example driver.
Note: The Avalon specification stipulates that Avalon-MM master interfaces issue byte addresses, while Avalon-MM slave interfaces accept word addresses. The default for the Avalon-MM Traffic Generator and BIST Engine is to issue word addresses.
When using Qsys, you must enable the Generate per byte address setting in the traffic generator.
7. Connect the interfaces as illustrated in the following figure. At this point, you can generate synthesis RTL, Verilog or VHDL simulation RTL, or a simulation testbench system.

13.3.4.1 Notes on Configuring UniPHY IP in Qsys

This section includes notes and tips on configuring the UniPHY IP in Qsys.
• The address ranges shown for the Avalon-MM slave interface on the UniPHY component should be interpreted as byte addresses that an Avalon-MM master would address, despite the fact that this range is modified by configuring the word address width of the Avalon-MM slave interface on the UniPHY controller.
• The afi_clk clock source is the associated clock of the Avalon-MM slave interface on the memory controller. This is the ideal clock source to use for all IP components connected on the same Avalon network. Using another clock would cause Qsys to automatically instantiate clock-crossing logic, potentially degrading performance.
• The afi_clk clock rate is determined by the Rate on Avalon-MM interface setting on the UniPHY PHY Settings tab. The afi_half_clk clock interface has a rate which is further halved. For example, if Rate on Avalon-MM interface is set to Half, the afi_clk rate is half of the memory clock frequency, and the afi_half_clk rate is one quarter of the memory clock frequency.
• The global_reset input interface can be used to reset the UniPHY memory interface and the PLL contained therein. The soft_reset input interface can be used to reset the UniPHY memory interface but allow the PLL to remain locked. You can use the soft_reset input to reset the memory but maintain the AFI clock output to other components in the system.
• Do not connect a reset request from a system component (such as a Nios II processor) to the UniPHY global_reset_n port. Doing so would reset the UniPHY PLL, which would propagate as a reset condition on afi_reset back to the requester; the resulting reset loop could freeze the system.
• Qsys generates an interconnect fabric for each Avalon network. The interconnect fabric is capable of burst and width adaptation. If your UniPHY memory controller is configured with an Avalon interface data width which is wider than an Avalon-MM master interface connected to it, you must enable the byte-enable signal on the Avalon-MM slave interface, by checking the Enable Avalon-MM byte-enable signal checkbox on the Controller Settings tab in the parameter editor.
• If you have a point-to-point connection from an Avalon-MM master to the Avalon-MM slave interface on the memory controller, and if the Avalon data width and burst length settings match, then the Avalon interface data widths may be multiples of either a power of two or nine. Otherwise, you must enable Generate power-of-2 data bus widths for Qsys or SOPC Builder on the Controller Settings tab of the parameter editor.

13.4 Document Revision History

Date | Version | Changes
May 2017 | 2017.05.08 | Retitled EMIF Configurable Traffic Generator 2.0 Reference to Testing the EMIF Interface Using the Traffic Generator 2.0. Removed Configurable Traffic Generator Parameters section. Consolidated Traffic Generator 2.0 usage information in the External Memory Interface Debug Toolkit chapter. Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Modified step 6 of Compiling and Testing the Design in Hardware. Added Configuring the Traffic Generator 2.0. Added The Traffic Generator 2.0 Report. Changed heading from EMIF Configurable Traffic Generator 2.0 for Arria 10 EMIF IP to EMIF Configurable Traffic Generator 2.0 Reference. Added Bypass the traffic generator repeated-writes/repeated-reads test pattern, Bypass the traffic generator stress pattern, and Export Traffic Generator 2.0 configuration interface to the Configurable Traffic Generator Parameters table. Removed Number of Traffic Generator 2.0 configuration interfaces. Added TG_WRITE_REPEAT_COUNT, TG_READ_REPEAT_COUNT, TG_DATA_MODE, and TG_BYTEEN_MODE to the Configuration Options for Read/Write Generation table. Added the Error Report Register Bits table to the Test Information section. Corrected paths in the Simulation and Hardware sections of the Performing Your Own Tests Using Traffic Generator 2.0 topic.
November 2015 | 2015.11.02 | Added EMIF Configurable Traffic Generator 2.0 for Arria 10 EMIF IP. Added Arria 10 EMIF IP Example Design Quick Start Guide. Changed order of sections in chapter. Changed instances of Quartus II to Quartus Prime.
May 2015 | 2015.05.04 | Maintenance release.
December 2014 | 2014.12.15 | Maintenance release.
August 2014 | 2014.08.15 | Removed occurrence of MegaWizard Plug-In Manager.
December 2013 | 2013.12.16 | Removed references to SOPC Builder.
November 2012 | 1.3 | Added block diagrams of simulation and synthesis example designs for RLDRAM 3 and Ping Pong PHY. Changed chapter number from 7 to 9.
June 2012 | 1.2 | Added Feedback icon.
November 2011 | 1.1 | Added Synthesis Example Design and Simulation Example Design sections. Added Creating and Connecting the UniPHY Memory Interface and the Traffic Generator in Qsys. Revised Example Driver section as Traffic Generator and BIST Engine.

14 Introduction to UniPHY IP

The UniPHY IP is an interface between a memory controller and memory devices, and performs read and write operations to the memory. The UniPHY IP creates the datapath between the memory device and the memory controller and user logic in various Intel devices.

The Intel FPGA DDR2, DDR3, and LPDDR2 SDRAM controllers with UniPHY, QDR II and QDR II+ SRAM controllers with UniPHY, RLDRAM II controller with UniPHY, and RLDRAM 3 PHY-only IP provide low-latency, high-performance, feature-rich interfaces to industry-standard memory devices. The DDR2, QDR II and QDR II+, and RLDRAM II controllers with UniPHY offer full-rate and half-rate interfaces, while the DDR3 controller with UniPHY and the RLDRAM 3 PHY-only IP offer half-rate and quarter-rate interfaces, and the LPDDR2 controller with UniPHY offers a half-rate interface.

When you generate your external memory interface IP core, the system creates an example top-level project, consisting of an example driver and your controller custom variation. The controller instantiates an instance of the UniPHY datapath. The example top-level project is a fully functional design that you can simulate, synthesize, and use in hardware. The example driver is a self-test module that issues read and write commands to the controller and checks the read data to produce the pass, fail, and test-complete signals.
If the UniPHY datapath does not match your requirements, you can create your own memory interface datapath using the ALTDLL, ALTDQ_DQS, ALTDQ_DQS2, ALTDQ, or ALTDQS IP cores, available in the Quartus Prime software, but you are then responsible for all aspects of the design, including timing analysis and design constraints.

14.1 Release Information

The following table provides information about this release of the DDR2 and DDR3 SDRAM, QDR II and QDR II+ SRAM, and RLDRAM II controllers with UniPHY, and the RLDRAM 3 PHY-only IP.

Table 117. Release Information
Item | DDR2, DDR3, LPDDR2 | QDR II | RLDRAM II | RLDRAM 3
Version | 13.1 | 13.1 | 13.1 | 13.1
Release Date | November 2013 | November 2013 | November 2013 | November 2013
Ordering Code | IP-DDR2/UNI, IP-DDR3/UNI, IP-SDRAM/LPDDR2 | IP-QDRII/UNI | IP-RLDII/UNI | —

Intel verifies that the current version of the Quartus Prime software compiles the previous version of each IP core. The Intel FPGA IP Library Release Notes and Errata report any exceptions to this verification. Intel does not verify compilation with IP core versions older than one release.

Related Links
Intel FPGA IP Library Release Notes

14.2 Device Support Levels

The following terms define the device support levels for Intel FPGA IP cores.

Intel FPGA IP Core Device Support Levels
• Preliminary support—Intel verifies the IP core with preliminary timing models for this device family. The IP core meets all functional requirements, but might still be undergoing timing analysis for the device family. You can use it in production designs with caution.
• Final support—Intel verifies the IP core with final timing models for this device family. The IP core meets all functional and timing requirements for the device family and can be used in production designs.

14.3 Device Family and Protocol Support

The following table shows the level of support offered by each of the UniPHY-based external memory interface protocols for Intel device families.

Table 118. Device Family Support
(Column order: DDR2 | DDR3 | LPDDR2 | QDR II | RLDRAM II | RLDRAM 3)
• Arria® II GX — No support | No support | No support | Final | No support | No support
• Arria II GZ — Final | Final | No support | Final | Final | No support
• Arria V — Refer to the What's New in Intel FPGA IP page on www.altera.com.
• Arria V GZ — Refer to the What's New in Intel FPGA IP page on www.altera.com.
• Cyclone V — Refer to the What's New in Intel FPGA IP page on www.altera.com.
• Stratix® III — Final | Final (only VCC = 1.1 V supported) | No support | Final | Final (only VCC = 1.1 V supported) | No support
• Stratix IV — Final | Final | No support | Final | Final | No support
• Stratix V — DDR2, DDR3: refer to the What's New in Intel FPGA IP page on www.altera.com; LPDDR2: No support; QDR II, RLDRAM II, RLDRAM 3: refer to the What's New in Intel FPGA IP page on www.altera.com.
• Other device families — No support for any protocol.
For information about features and supported clock rates for external memory interfaces, refer to the External Memory Specification Estimator.

Related Links
• What's New in Intel FPGA IP
• External Memory Interface Spec Estimator

14.4 UniPHY-Based External Memory Interface Features

The following table summarizes key feature support for Intel's UniPHY-based external memory interfaces.

Table 119. Feature Support
(Column order: DDR2 | DDR3 | LPDDR2 | QDR II | RLDRAM II | RLDRAM 3)
• High-performance controller II (HPC II): Yes | Yes | Yes | — | — | —
• Half-rate core logic and user interface: Yes | Yes | Yes | Yes | Yes | Yes
• Full-rate core logic and user interface: Yes | — | — | Yes | Yes | —
• Quarter-rate core logic and user interface: — | Yes (1) | — | — | — | Yes
• Dynamically generated Nios II-based sequencer: Yes (2) (3) | Yes | Yes | Yes | Yes | Yes
• Choice of RTL-based or dynamically generated Nios® II-based sequencer: — | — | — | Yes (12) | Yes (12) | —
• Efficiency Monitor and Protocol Checker: Available (14)
• DDR3L support: — | Yes (13) | — | — | — | —
• UDIMM and RDIMM in any form factor: Yes | Yes (4) (5) | — | — | — | —
• Multiple components in a single-rank UDIMM or RDIMM layout: Yes | Yes | — | — | — | —
• LRDIMM: — | Yes | — | — | — | —
• Burst length (half-rate): 8 | — | 8 or 16 | 4 | 4 or 8 | 2, 4, or 8
• Burst length (full-rate): 4 | — | — | 2 or 4 | 2, 4, or 8 | —
• Burst length (quarter-rate): — | 8 | — | — | — | 2, 4, or 8
• Burst length of 8 and burst chop of 4 (on the fly): — | Yes | — | — | — | —
• With leveling (DDR3): 240 MHz and above (9) (10)
• Without leveling (DDR3): below 240 MHz
• Maximum data width: 144 bits (6) | 144 bits (6) | 32 bits | 72 bits | 72 bits | 72 bits
• Reduced controller latency: Yes (2) (7) | Yes (2) (7) | — | — | — | —
• Read latency: — | — | — | 1.5 (QDR II), 2 or 2.5 (QDR II+) | — | —
• ODT (in memory device): — | Yes | — | QDR II+ only | Yes | Yes
• x36 emulation mode: — | — | — | Yes (8) (11) | — | —

Notes:
1. For Arria V, Arria V GZ, and Stratix V devices only.
2. Not available in Arria II GX devices.
3. Nios II-based sequencer not available for full-rate interfaces.
4. For DDR3, the DIMM form is not supported in Arria II GX, Arria II GZ, Arria V, or Cyclone V devices.
5. Arria II GZ uses leveling logic for discrete devices in DDR3 interfaces to achieve high speeds, but that leveling cannot be used to implement the DIMM form in DDR3 interfaces.
6. For any interface with data width above 72 bits, you must use software timing analysis of your complete design to determine the maximum clock rate.
7. The maximum achievable clock rate when reduced controller latency is selected must be attained through Quartus Prime software timing analysis of your complete design.
8. Emulation mode allows emulation of a larger memory-width interface using multiple smaller memory-width interfaces. For example, an x36 QDR II or QDR II+ interface can be emulated using two x18 interfaces.
9. The leveling delay on the board between first and last DDR3 SDRAM component laid out as a DIMM must be less than 0.69 tCK.
10. Leveling is not available for Arria V or Cyclone V devices.
11. x36 emulation mode is not supported in Arria V, Arria V GZ, Cyclone V, or Stratix V devices.
12. The RTL-based sequencer is not available for QDR II or RLDRAM II interfaces on Arria V devices.
13. For Arria V, Arria V GZ, Cyclone V, and Stratix V.
14. The Efficiency Monitor and Protocol Checker is not available for QDR II and QDR II+ SRAM, or for the MAX 10 device family, or for Arria V or Cyclone V designs using the Hard Memory Controller.

14.5 System Requirements

For system requirements and installation instructions, refer to the Intel FPGA Software Installation and Licensing manual.

The DDR2, DDR3, and LPDDR2 SDRAM controllers with UniPHY, QDR II and QDR II+ SRAM controllers with UniPHY, RLDRAM II controller with UniPHY, and RLDRAM 3 PHY-only IP are part of the Intel FPGA IP Library, which Intel distributes with the Quartus Prime software.

Related Links
Intel FPGA Software Installation and Licensing Manual

14.6 Intel FPGA IP Core Verification

Intel has carried out extensive random, directed tests with functional test coverage using industry-standard models to ensure the functionality of the external memory controllers with UniPHY. Intel's functional verification of the external memory controllers with UniPHY uses modified Denali models, with certain assertions disabled.

14.7 Resource Utilization

The following topics provide resource utilization data for the external memory controllers with UniPHY for supported device families.

14.7.1 DDR2, DDR3, and LPDDR2 Resource Utilization in Arria V Devices

The following table shows typical resource usage of the DDR2, DDR3, and LPDDR2 SDRAM controllers with UniPHY in the current version of Quartus Prime software for Arria V devices.

Table 120. Resource Utilization in Arria V Devices
Protocol | Width (bits) | Comb. ALUTs | Logic registers | M10K blocks | Memory (bits) | Hard memory controller

Controller:
DDR2 (half rate) | 8 | 2,286 | 1,404 | 4 | 6,560 | 0
DDR2 (half rate) | 64 | 2,304 | 1,379 | 17 | 51,360 | 0
DDR2 (full rate) | 32 | 0 | 0 | 0 | 0 | 1
DDR3 (half rate) | 8 | 2,355 | 1,412 | 4 | 6,560 | 0
DDR3 (half rate) | 64 | 2,372 | 1,440 | 17 | 51,360 | 0
DDR3 (full rate) | 32 | 0 | 0 | 0 | 0 | 1
LPDDR2 (half rate) | 8 | 2,230 | 1,617 | 4 | 6,560 | 0
LPDDR2 (half rate) | 32 | 2,239 | 1,600 | 10 | 25,760 | 0

PHY:
DDR2 (half rate) | 8 | 1,652 | 2,015 | 34 | 141,312 | 0
DDR2 (half rate) | 64 | 1,819 | 2,089 | 34 | 174,080 | 0
DDR2 (full rate) | 32 | 1,222 | 1,415 | 34 | 157,696 | 1
DDR3 (half rate) | 8 | 1,653 | 1,977 | 34 | 141,312 | 0
DDR3 (half rate) | 64 | 1,822 | 2,090 | 34 | 174,080 | 0
DDR3 (full rate) | 32 | 1,220 | 1,428 | 34 | 157,696 | 0
LPDDR2 (half rate) | 8 | 2,998 | 3,187 | 35 | 150,016 | 0
LPDDR2 (half rate) | 32 | 3,289 | 3,306 | 35 | 174,592 | 0

Total:
DDR2 (half rate) | 8 | 4,555 | 3,959 | 39 | 148,384 | 0
DDR2 (half rate) | 64 | 4,991 | 4,002 | 52 | 225,952 | 0
DDR2 (full rate) | 32 | 1,776 | 1,890 | 35 | 158,208 | 1
DDR3 (half rate) | 8 | 4,640 | 3,934 | 39 | 148,384 | 0
DDR3 (half rate) | 64 | 5,078 | 4,072 | 52 | 225,952 | 0
DDR3 (full rate) | 32 | 1,774 | 1,917 | 35 | 158,208 | 1
LPDDR2 (half rate) | 8 | 5,228 | 4,804 | 39 | 156,576 | 0
LPDDR2 (half rate) | 32 | 5,528 | 4,906 | 45 | 200,352 | 0

14.7.2 DDR2 and DDR3 Resource Utilization in Arria II GZ Devices

The following table shows typical resource usage of the DDR2 and DDR3 SDRAM controllers with UniPHY in the current version of Quartus Prime software for Arria II GZ devices.

Table 121. Resource Utilization in Arria II GZ Devices
14.7.2 DDR2 and DDR3 Resource Utilization in Arria II GZ Devices

The following table shows typical resource usage of the DDR2 and DDR3 SDRAM controllers with UniPHY in the current version of the Quartus Prime software for Arria II GZ devices.

Table 121. Resource Utilization in Arria II GZ Devices

Protocol | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Mem ALUTs | M9K Blocks | M144K Blocks | Memory (Bits)
Controller
DDR2 (Half rate) | 8 | 1,781 | 1,092 | 10 | 2 | 0 | 4,352
DDR2 (Half rate) | 16 | 1,784 | 1,092 | 10 | 4 | 0 | 8,704
DDR2 (Half rate) | 64 | 1,818 | 1,108 | 10 | 15 | 0 | 34,560
DDR2 (Half rate) | 72 | 1,872 | 1,092 | 10 | 17 | 0 | 39,168
DDR2 (Full rate) | 8 | 1,851 | 1,124 | 10 | 2 | 0 | 2,176
DDR2 (Full rate) | 16 | 1,847 | 1,124 | 10 | 2 | 0 | 4,352
DDR2 (Full rate) | 64 | 1,848 | 1,124 | 10 | 8 | 0 | 17,408
DDR2 (Full rate) | 72 | 1,852 | 1,124 | 10 | 9 | 0 | 19,574
DDR3 (Half rate) | 8 | 1,869 | 1,115 | 10 | 2 | 0 | 4,352
DDR3 (Half rate) | 16 | 1,868 | 1,115 | 10 | 4 | 0 | 8,704
DDR3 (Half rate) | 64 | 1,882 | 1,131 | 10 | 15 | 0 | 34,560
DDR3 (Half rate) | 72 | 1,888 | 1,115 | 10 | 17 | 0 | 39,168
PHY
DDR2 (Half rate) | 8 | 2,560 | 2,042 | 183 | 22 | 0 | 157,696
DDR2 (Half rate) | 16 | 2,730 | 2,262 | 183 | 22 | 0 | 157,696
DDR2 (Half rate) | 64 | 3,606 | 3,581 | 183 | 22 | 0 | 157,696
DDR2 (Half rate) | 72 | 3,743 | 3,796 | 183 | 22 | 0 | 157,696
DDR2 (Full rate) | 8 | 2,494 | 1,934 | 169 | 22 | 0 | 157,696
DDR2 (Full rate) | 16 | 2,652 | 2,149 | 169 | 22 | 0 | 157,696
DDR2 (Full rate) | 64 | 3,519 | 3,428 | 169 | 22 | 0 | 157,696
DDR2 (Full rate) | 72 | 3,646 | 3,642 | 169 | 22 | 0 | 157,696
DDR3 (Half rate) | 8 | 2,555 | 2,032 | 187 | 22 | 0 | 157,696
DDR3 (Half rate) | 16 | 3,731 | 2,251 | 187 | 22 | 0 | 157,696
DDR3 (Half rate) | 64 | 3,607 | 3,572 | 187 | 22 | 0 | 157,696
DDR3 (Half rate) | 72 | 3,749 | 3,788 | 187 | 22 | 0 | 157,696
Total
DDR2 (Half rate) | 8 | 4,341 | 3,134 | 193 | 24 | 0 | 4,374
DDR2 (Half rate) | 16 | 4,514 | 3,354 | 193 | 26 | 0 | 166,400
DDR2 (Half rate) | 64 | 5,424 | 4,689 | 193 | 37 | 0 | 192,256
DDR2 (Half rate) | 72 | 5,615 | 4,888 | 193 | 39 | 0 | 196,864
DDR2 (Full rate) | 8 | 4,345 | 3,058 | 179 | 24 | 0 | 159,872
DDR2 (Full rate) | 16 | 4,499 | 3,273 | 179 | 24 | 0 | 162,048
DDR2 (Full rate) | 64 | 5,367 | 4,552 | 179 | 30 | 0 | 175,104
DDR2 (Full rate) | 72 | 5,498 | 4,766 | 179 | 31 | 0 | 177,280
DDR3 (Half rate) | 8 | 4,424 | 3,147 | 197 | 24 | 0 | 162,048
DDR3 (Half rate) | 16 | 5,599 | 3,366 | 197 | 26 | 0 | 166,400
DDR3 (Half rate) | 64 | 5,489 | 4,703 | 197 | 37 | 0 | 192,256
DDR3 (Half rate) | 72 | 5,637 | 4,903 | 197 | 39 | 0 | 196,864

14.7.3 DDR2 and DDR3 Resource Utilization in Stratix III Devices

The following table shows typical resource usage of the DDR2 and DDR3 SDRAM controllers with UniPHY in the current version of the Quartus Prime software for Stratix III devices.
Table 122. Resource Utilization in Stratix III Devices

Protocol | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Mem ALUTs | M9K Blocks | M144K Blocks | Memory (Bits)
Controller
DDR2 (Half rate) | 8 | 1,807 | 1,058 | 0 | 4 | 0 | 4,464
DDR2 (Half rate) | 16 | 1,809 | 1,058 | 0 | 6 | 0 | 8,816
DDR2 (Half rate) | 64 | 1,810 | 1,272 | 10 | 14 | 0 | 32,256
DDR2 (Half rate) | 72 | 1,842 | 1,090 | 10 | 17 | 0 | 39,168
DDR2 (Full rate) | 8 | 1,856 | 1,093 | 0 | 4 | 0 | 2,288
DDR2 (Full rate) | 16 | 1,855 | 1,092 | 0 | 4 | 0 | 4,464
DDR2 (Full rate) | 64 | 1,841 | 1,092 | 0 | 10 | 0 | 17,520
DDR2 (Full rate) | 72 | 1,834 | 1,092 | 0 | 11 | 0 | 19,696
DDR3 (Half rate) | 8 | 1,861 | 1,083 | 0 | 4 | 0 | 4,464
DDR3 (Half rate) | 16 | 1,863 | 1,083 | 0 | 6 | 0 | 8,816
DDR3 (Half rate) | 64 | 1,878 | 1,295 | 10 | 14 | 0 | 32,256
DDR3 (Half rate) | 72 | 1,895 | 1,115 | 10 | 17 | 0 | 39,168
PHY
DDR2 (Half rate) | 8 | 2,591 | 2,100 | 218 | 6 | 1 | 157,696
DDR2 (Half rate) | 16 | 2,762 | 2,320 | 218 | 6 | 1 | 157,696
DDR2 (Half rate) | 64 | 3,672 | 3,658 | 242 | 6 | 1 | 157,696
DDR2 (Half rate) | 72 | 3,814 | 3,877 | 242 | 6 | 1 | 157,696
DDR2 (Full rate) | 8 | 2,510 | 1,986 | 200 | 6 | 1 | 157,696
DDR2 (Full rate) | 16 | 2,666 | 2,200 | 200 | 6 | 1 | 157,696
DDR2 (Full rate) | 64 | 3,571 | 3,504 | 224 | 6 | 1 | 157,696
DDR2 (Full rate) | 72 | 3,731 | 3,715 | 224 | 6 | 1 | 157,696
DDR3 (Half rate) | 8 | 2,591 | 2,094 | 224 | 6 | 1 | 157,696
DDR3 (Half rate) | 16 | 2,765 | 2,314 | 224 | 6 | 1 | 157,696
DDR3 (Half rate) | 64 | 3,680 | 3,653 | 248 | 6 | 1 | 157,696
DDR3 (Half rate) | 72 | 3,819 | 3,871 | 248 | 6 | 1 | 157,696
Total
DDR2 (Half rate) | 8 | 4,398 | 3,158 | 218 | 10 | 1 | 162,160
DDR2 (Half rate) | 16 | 4,571 | 3,378 | 218 | 12 | 1 | 166,512
DDR2 (Half rate) | 64 | 5,482 | 4,930 | 252 | 20 | 1 | 189,952
DDR2 (Half rate) | 72 | 5,656 | 4,967 | 252 | 23 | 1 | 196,864
DDR2 (Full rate) | 8 | 4,366 | 3,079 | 200 | 10 | 1 | 159,984
DDR2 (Full rate) | 16 | 4,521 | 3,292 | 200 | 10 | 1 | 162,160
DDR2 (Full rate) | 64 | 5,412 | 4,596 | 224 | 16 | 1 | 175,216
DDR2 (Full rate) | 72 | 5,565 | 4,807 | 224 | 17 | 1 | 177,392
DDR3 (Half rate) | 8 | 4,452 | 3,177 | 224 | 10 | 1 | 162,160
DDR3 (Half rate) | 16 | 4,628 | 3,397 | 224 | 12 | 1 | 166,512
DDR3 (Half rate) | 64 | 5,558 | 4,948 | 258 | 20 | 1 | 189,952
DDR3 (Half rate) | 72 | 5,714 | 4,986 | 258 | 23 | 1 | 196,864

14.7.4 DDR2 and DDR3 Resource Utilization in Stratix IV Devices

The following table shows typical resource usage of the DDR2 and DDR3 SDRAM controllers with UniPHY in the current version of the Quartus Prime software for Stratix IV devices.

Table 123. Resource Utilization in Stratix IV Devices

Protocol | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Mem ALUTs | M9K Blocks | M144K Blocks | Memory (Bits)
Controller
DDR2 (Half rate) | 8 | 1,785 | 1,090 | 10 | 2 | 0 | 4,352
DDR2 (Half rate) | 16 | 1,785 | 1,090 | 10 | 4 | 0 | 8,704
DDR2 (Half rate) | 64 | 1,796 | 1,106 | 10 | 15 | 0 | 34,560
DDR2 (Half rate) | 72 | 1,798 | 1,090 | 10 | 17 | 0 | 39,168
DDR2 (Full rate) | 8 | 1,843 | 1,124 | 10 | 2 | 0 | 2,176
DDR2 (Full rate) | 16 | 1,845 | 1,124 | 10 | 2 | 0 | 4,352
DDR2 (Full rate) | 64 | 1,832 | 1,124 | 10 | 8 | 0 | 17,408
DDR2 (Full rate) | 72 | 1,834 | 1,124 | 10 | 9 | 0 | 19,584
DDR3 (Half rate) | 8 | 1,862 | 1,115 | 10 | 2 | 0 | 4,352
DDR3 (Half rate) | 16 | 1,874 | 1,115 | 10 | 4 | 0 | 8,704
DDR3 (Half rate) | 64 | 1,880 | 1,131 | 10 | 15 | 0 | 34,560
DDR3 (Half rate) | 72 | 1,886 | 1,115 | 10 | 17 | 0 | 39,168
PHY
DDR2 (Half rate) | 8 | 2,558 | 2,041 | 183 | 6 | 1 | 157,696
DDR2 (Half rate) | 16 | 2,728 | 2,262 | 183 | 6 | 1 | 157,696
DDR2 (Half rate) | 64 | 3,606 | 3,581 | 183 | 6 | 1 | 157,696
DDR2 (Half rate) | 72 | 3,748 | 3,800 | 183 | 6 | 1 | 157,696
DDR2 (Full rate) | 8 | 2,492 | 1,934 | 169 | 6 | 1 | 157,696
DDR2 (Full rate) | 16 | 2,652 | 2,148 | 169 | 6 | 1 | 157,696
DDR2 (Full rate) | 64 | 3,522 | 3,428 | 169 | 6 | 1 | 157,696
DDR2 (Full rate) | 72 | 3,646 | 3,641 | 169 | 6 | 1 | 157,696
DDR3 (Half rate) | 8 | 2,575 | 2,031 | 187 | 6 | 1 | 157,696
DDR3 (Half rate) | 16 | 2,732 | 2,251 | 187 | 6 | 1 | 157,696
DDR3 (Half rate) | 64 | 3,602 | 3,568 | 187 | 6 | 1 | 157,696
DDR3 (Half rate) | 72 | 3,750 | 3,791 | 187 | 6 | 1 | 157,696
Total
DDR2 (Half rate) | 8 | 4,343 | 3,131 | 193 | 8 | 1 | 162,048
DDR2 (Half rate) | 16 | 4,513 | 3,352 | 193 | 10 | 1 | 166,400
DDR2 (Half rate) | 64 | 5,402 | 4,687 | 193 | 21 | 1 | 192,256
DDR2 (Half rate) | 72 | 5,546 | 4,890 | 193 | 23 | 1 | 196,864
DDR2 (Full rate) | 8 | 4,335 | 3,058 | 179 | 8 | 1 | 159,872
DDR2 (Full rate) | 16 | 4,497 | 3,272 | 179 | 8 | 1 | 162,048
DDR2 (Full rate) | 64 | 5,354 | 4,552 | 179 | 14 | 1 | 175,104
DDR2 (Full rate) | 72 | 5,480 | 4,765 | 179 | 15 | 1 | 177,280
DDR3 (Half rate) | 8 | 4,437 | 3,146 | 197 | 8 | 1 | 162,048
DDR3 (Half rate) | 16 | 4,606 | 3,366 | 197 | 10 | 1 | 166,400
DDR3 (Half rate) | 64 | 5,482 | 4,699 | 197 | 21 | 1 | 192,256
DDR3 (Half rate) | 72 | 5,636 | 4,906 | 197 | 23 | 1 | 196,864

14.7.5 DDR2 and DDR3 Resource Utilization in Arria V GZ and Stratix V Devices

The following table shows typical resource usage of the DDR2 and DDR3 SDRAM controllers with UniPHY in the current version of the Quartus Prime software for Arria V GZ and Stratix V devices.
Table 124. Resource Utilization in Arria V GZ and Stratix V Devices

Protocol | Memory Width (Bits) | Combinational LCs | Logic Registers | M20K Blocks | Memory (Bits)
Controller
DDR2 (Half rate) | 8 | 1,787 | 1,064 | 2 | 4,352
DDR2 (Half rate) | 16 | 1,794 | 1,064 | 4 | 8,704
DDR2 (Half rate) | 64 | 1,830 | 1,070 | 14 | 34,304
DDR2 (Half rate) | 72 | 1,828 | 1,076 | 15 | 38,400
DDR2 (Full rate) | 8 | 2,099 | 1,290 | 2 | 2,176
DDR2 (Full rate) | 16 | 2,099 | 1,290 | 2 | 4,352
DDR2 (Full rate) | 64 | 2,126 | 1,296 | 7 | 16,896
DDR2 (Full rate) | 72 | 2,117 | 1,296 | 8 | 19,456
DDR3 (Quarter rate) | 8 | 2,101 | 1,370 | 4 | 8,704
DDR3 (Quarter rate) | 16 | 2,123 | 1,440 | 7 | 16,896
DDR3 (Quarter rate) | 64 | 2,236 | 1,885 | 28 | 69,632
DDR3 (Quarter rate) | 72 | 2,102 | 1,870 | 30 | 74,880
DDR3 (Half rate) | 8 | 1,849 | 1,104 | 2 | 4,352
DDR3 (Half rate) | 16 | 1,851 | 1,104 | 4 | 8,704
DDR3 (Half rate) | 64 | 1,853 | 1,112 | 14 | 34,304
DDR3 (Half rate) | 72 | 1,889 | 1,116 | 15 | 38,400
PHY
DDR2 (Half rate) | 8 | 2,567 | 1,757 | 13 | 157,696
DDR2 (Half rate) | 16 | 2,688 | 1,809 | 13 | 157,696
DDR2 (Half rate) | 64 | 3,273 | 2,115 | 13 | 157,696
DDR2 (Half rate) | 72 | 3,377 | 2,166 | 13 | 157,696
DDR2 (Full rate) | 8 | 2,491 | 1,695 | 13 | 157,696
DDR2 (Full rate) | 16 | 2,578 | 1,759 | 13 | 157,696
DDR2 (Full rate) | 64 | 3,062 | 2,137 | 13 | 157,696
DDR2 (Full rate) | 72 | 3,114 | 2,200 | 13 | 157,696
DDR3 (Quarter rate) | 8 | 2,209 | 2,918 | 18 | 149,504
DDR3 (Quarter rate) | 16 | 2,355 | 3,327 | 18 | 157,696
DDR3 (Quarter rate) | 64 | 3,358 | 5,228 | 18 | 182,272
DDR3 (Quarter rate) | 72 | 4,016 | 6,318 | 18 | 198,656
DDR3 (Half rate) | 8 | 2,573 | 1,791 | 13 | 157,696
DDR3 (Half rate) | 16 | 2,691 | 1,843 | 13 | 157,696
DDR3 (Half rate) | 64 | 3,284 | 2,149 | 13 | 157,696
DDR3 (Half rate) | 72 | 3,378 | 2,200 | 13 | 157,696
Total
DDR2 (Half rate) | 8 | 4,354 | 2,821 | 15 | 162,048
DDR2 (Half rate) | 16 | 4,482 | 2,873 | 17 | 166,400
DDR2 (Half rate) | 64 | 5,103 | 3,185 | 27 | 192,000
DDR2 (Half rate) | 72 | 5,205 | 3,242 | 28 | 196,096
DDR2 (Full rate) | 8 | 4,590 | 2,985 | 15 | 159,872
DDR2 (Full rate) | 16 | 4,677 | 3,049 | 15 | 162,048
DDR2 (Full rate) | 64 | 5,188 | 3,433 | 20 | 174,592
DDR2 (Full rate) | 72 | 5,231 | 3,496 | 21 | 177,152
DDR3 (Quarter rate) | 8 | 4,897 | 4,844 | 23 | 158,720
DDR3 (Quarter rate) | 16 | 5,065 | 5,318 | 26 | 175,104
DDR3 (Quarter rate) | 64 | 6,183 | 7,669 | 47 | 252,416
DDR3 (Quarter rate) | 72 | 6,705 | 8,744 | 49 | 274,048
DDR3 (Half rate) | 8 | 4,422 | 2,895 | 15 | 162,048
DDR3 (Half rate) | 16 | 4,542 | 2,947 | 17 | 166,400
DDR3 (Half rate) | 64 | 5,137 | 3,261 | 27 | 192,000
DDR3 (Half rate) | 72 | 5,267 | 3,316 | 28 | 196,096

14.7.6 QDR II and QDR II+ Resource Utilization in Arria V Devices

The following table shows typical resource usage of the QDR II and QDR II+ SRAM controllers with UniPHY in the current version of the Quartus Prime software for Arria V devices.

Table 125. Resource Utilization in Arria V Devices

PHY Rate | Memory Width (Bits) | Combinational ALUTs | Logic Registers | M10K Blocks | Memory (Bits) | Hard Memory Controller
Controller
Half | 9 | 98 | 120 | 0 | 0 | 0
Half | 18 | 96 | 156 | 0 | 0 | 0
Half | 36 | 94 | 224 | 0 | 0 | 0
PHY
Half | 9 | 234 | 257 | 0 | 0 | 0
Half | 18 | 328 | 370 | 0 | 0 | 0
Half | 36 | 522 | 579 | 0 | 0 | 0
Total
Half | 9 | 416 | 377 | 0 | 0 | 0
Half | 18 | 542 | 526 | 0 | 0 | 0
Half | 36 | 804 | 803 | 0 | 0 | 0

14.7.7 QDR II and QDR II+ Resource Utilization in Arria II GX Devices

The following table shows typical resource usage of the QDR II and QDR II+ SRAM controllers with UniPHY in the current version of the Quartus Prime software for Arria II GX devices.

Table 126. Resource Utilization in Arria II GX Devices

PHY Rate | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Memory (Bits) | M9K Blocks
Half | 9 | 620 | 701 | 0 | 0
Half | 18 | 921 | 1122 | 0 | 0
Half | 36 | 1534 | 1964 | 0 | 0
Full | 9 | 584 | 708 | 0 | 0
Full | 18 | 850 | 1126 | 0 | 0
Full | 36 | 1387 | 1962 | 0 | 0

14.7.8 QDR II and QDR II+ Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices

The following table shows typical resource usage of the QDR II and QDR II+ SRAM controllers with UniPHY in the current version of the Quartus Prime software for Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V devices.
Table 127. Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices

PHY Rate | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Memory (Bits) | M9K Blocks
Half | 9 | 602 | 641 | 0 | 0
Half | 18 | 883 | 1002 | 0 | 0
Half | 36 | 1457 | 1724 | 0 | 0
Full | 9 | 586 | 708 | 0 | 0
Full | 18 | 851 | 1126 | 0 | 0
Full | 36 | 1392 | 1962 | 0 | 0

14.7.9 RLDRAM II Resource Utilization in Arria V Devices

The following table shows typical resource usage of the RLDRAM II controller with UniPHY in the current version of the Quartus Prime software for Arria V devices.

Table 128. Resource Utilization in Arria V Devices (Part 1 of 2)

PHY Rate | Memory Width (Bits) | Combinational ALUTs | Logic Registers | M10K Blocks | Memory (Bits) | Hard Memory Controller
Controller
Half | 9 | 353 | 303 | 1 | 288 | 0
Half | 18 | 350 | 324 | 2 | 576 | 0
Half | 36 | 350 | 402 | 4 | 1152 | 0
PHY
Half | 9 | 295 | 474 | 0 | 0 | 0
Half | 18 | 428 | 719 | 0 | 0 | 0
Half | 36 | 681 | 1229 | 0 | 0 | 0
Total
Half | 9 | 705 | 777 | 1 | 288 | 0
Half | 18 | 871 | 1043 | 2 | 576 | 0
Half | 36 | 1198 | 1631 | 4 | 1152 | 0

14.7.10 RLDRAM II Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices

The following table shows typical resource usage of the RLDRAM II controller with UniPHY in the current version of the Quartus Prime software for Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V devices.

Table 129. Resource Utilization in Arria II GZ, Arria V GZ, Stratix III, Stratix IV, and Stratix V Devices (1)

PHY Rate | Memory Width (Bits) | Combinational ALUTs | Logic Registers | Memory (Bits) | M9K Blocks
Half | 9 | 829 | 763 | 288 | 1
Half | 18 | 1145 | 1147 | 576 | 2
Half | 36 | 1713 | 1861 | 1152 | 4
Full | 9 | 892 | 839 | 288 | 1
Full | 18 | 1182 | 1197 | 576 | 1
Full | 36 | 1678 | 1874 | 1152 | 2

Note to Table:
1. Half-rate designs use the same amount of memory as full-rate designs, but the data is organized in a different way (half the width, double the depth), and the design may need more M9K resources.

14.8 Document Revision History

Date | Version | Changes
May 2017 | 2017.05.08 | Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Maintenance release.
November 2015 | 2015.11.02 | Changed instances of Quartus II to Quartus Prime.
May 2015 | 2015.05.04 | Maintenance release.
December 2014 | 2014.12.15 | Maintenance release.
August 2014 | 2014.08.15 | Removed occurrences of MegaWizard Plug-In Manager.
December 2013 | 2013.12.16 | Removed references to ALTMEMPHY. Removed references to HardCopy.
November 2012 | 2.1 | Added RLDRAM 3 support. Added LRDIMM support. Added Arria V GZ support. Changed chapter number from 8 to 10.
June 2012 | 2.0 | Added LPDDR2 support. Moved Protocol Support Matrix to Volume 1. Added Feedback icon.
November 2011 | 1.1 | Combined Release Information, Device Family Support, Features list, and Unsupported Features list for DDR2, DDR3, QDR II, and RLDRAM II. Added Protocol Support Matrix. Combined Resource Utilization information for DDR2, DDR3, QDR II, and RLDRAM II. Updated data for 11.1.

15 Latency for UniPHY IP

Intel defines read and write latencies in terms of memory device clock cycles, which are always full-rate. Two types of latency matter when designing with memory controllers: read latency and write latency, defined as follows:

• Read latency: the amount of time it takes for the read data to appear at the local interface after you initiate the read request.
• Write latency: the amount of time it takes for the write data to appear at the memory interface after you initiate the write request.

Latency of the memory interface depends on its configuration and traffic patterns; therefore, you should simulate your system to determine precise latency values. The numbers presented in this chapter are typical values meant only as guidelines. Latency found in simulation may differ from latency found on the board, because functional simulation does not consider board trace delays or differences in process, voltage, and temperature. For a given design on a given board, the latency found may differ by one clock cycle (for full-rate designs) or two clock cycles (for half-rate or quarter-rate designs) upon resetting the board. The same design can yield different latencies on different boards.

Note: For a half-rate controller, the local side frequency is half of the memory interface frequency. For a full-rate controller, the local side frequency is equal to the memory interface frequency.

15.1 DDR2 SDRAM Latency

The following table shows the DDR2 SDRAM latency in full-rate memory clock cycles.

Table 130. DDR2 SDRAM Controller Latency (In Full-Rate Memory Clock Cycles) (1) (2)

Rate | Controller Address & Command | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Half | 10 | EWL: 3; OWL: 4 | 3–7 | 6 | 4 | EWL: 26–30; OWL: 27–31 | EWL: 23; OWL: 24
Full | 5 | 0 | 3–7 | 4 | 10 | 22–26 | 19

Notes to Table:
1. EWL = even write latency.
2. OWL = odd write latency.

15.2 DDR3 SDRAM Latency

The following table shows the DDR3 SDRAM latency in full-rate memory clock cycles.

Table 131. DDR3 SDRAM Controller Latency (In Full-Rate Memory Clock Cycles) (1) (2) (3) (4)

Rate | Controller Address & Command | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Quarter | 20 | EWER: 8; EWOR: 8; OWER: 11; OWOR: 11 | 5–11 | EWER: 16; EWOR: 17; OWER: 17; OWOR: 14 | 8 | EWER: 57–63; EWOR: 58–64; OWER: 61–67; OWOR: 58–64 | EWER: 52; EWOR: 53; OWER: 56; OWOR: 53
Half | 10 | EWER: 3; EWOR: 3; OWER: 4; OWOR: 4 | 5–11 | EWER: 7; EWOR: 6; OWER: 6; OWOR: 7 | 4 | EWER: 29–35; EWOR: 28–34; OWER: 29–35; OWOR: 30–36 | EWER: 24; EWOR: 23; OWER: 24; OWOR: 25
Full | 5 | 0 | 5–11 | 4 | 10 | 24–30 | 19

Notes to Table:
1. EWER = even write latency and even read latency.
2. EWOR = even write latency and odd read latency.
3. OWER = odd write latency and even read latency.
4. OWOR = odd write latency and odd read latency.
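The round-trip figures in these tables are simply the sum of the per-stage latencies in the same row. A small tclsh check, using the half-rate, even-write-latency row of Table 130 (values copied from the table), makes the arithmetic explicit:

  # Half-rate DDR2, even write latency (Table 130).
  set ctrl_ac    10  ;# controller address & command
  set phy_ac      3  ;# PHY address & command (EWL)
  set mem_rd_min  3  ;# memory maximum read, lower bound
  set mem_rd_max  7  ;# memory maximum read, upper bound
  set phy_ret     6  ;# PHY read return
  set ctrl_ret    4  ;# controller read return
  set no_mem [expr {$ctrl_ac + $phy_ac + $phy_ret + $ctrl_ret}]
  puts "round trip without memory: $no_mem"   ;# 23
  puts "round trip: [expr {$no_mem + $mem_rd_min}]-[expr {$no_mem + $mem_rd_max}]"  ;# 26-30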
15.3 LPDDR2 SDRAM Latency

The following table shows the LPDDR2 SDRAM latency in full-rate memory clock cycles.

Table 132. LPDDR2 SDRAM Controller Latency (In Full-Rate Memory Clock Cycles) (1) (2) (3) (4)

Rate | Controller Address & Command | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Half | 10 | EWER: 3; EWOR: 3; OWER: 4; OWOR: 4 | 5–11 | EWER: 7; EWOR: 6; OWER: 6; OWOR: 7 | 4 | EWER: 29–35; EWOR: 28–34; OWER: 29–35; OWOR: 30–36 | EWER: 24; EWOR: 23; OWER: 24; OWOR: 25
Full | 5 | 0 | 5–11 | 4 | 10 | 24–30 | 19

Notes to Table:
1. EWER = even write latency and even read latency.
2. EWOR = even write latency and odd read latency.
3. OWER = odd write latency and even read latency.
4. OWOR = odd write latency and odd read latency.

15.4 QDR II and QDR II+ SRAM Latency

The following table shows the latency in full-rate memory clock cycles.

Table 133. QDR II Latency (In Full-Rate Memory Clock Cycles) (1)

Rate | Controller Address & Command | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Half, RL = 1.5 | 2 | 5.5 | 1.5 | 7.0 | 0 | 16 | 14.5
Half, RL = 2.0 | 2 | 5.5 | 2.0 | 6.5 | 0 | 16 | 14.0
Half, RL = 2.5 | 2 | 5.5 | 2.5 | 6.0 | 0 | 16 | 13.5
Full, RL = 1.5 | 2 | 1.5 | 1.5 | 4.0 | 1 | 10 | 8.5
Full, RL = 2.0 | 2 | 1.5 | 2.0 | 4.5 | 1 | 11 | 9.0
Full, RL = 2.5 | 2 | 1.5 | 2.5 | 4.0 | 1 | 11 | 8.5

Note to Table:
1. RL = read latency.

15.5 RLDRAM II Latency

The following table shows the latency in full-rate memory clock cycles.

Table 134. RLDRAM II Latency (In Full-Rate Memory Clock Cycles) (1) (2)

Rate | Controller Address & Command | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Half | 4 | EWL: 1; OWL: 2 | 3–8 | 4 | 0 | EWL: 12–17; OWL: 13–18 | EWL: 9; OWL: 10
Full | 2 | 1 | 3–8 | 4 | 0 | 10–15 | 7

Notes to Table:
1. EWL = even write latency.
2. OWL = odd write latency.

15.6 RLDRAM 3 Latency

The following table shows the latency in full-rate memory clock cycles.

Table 135. RLDRAM 3 Latency (In Full-Rate Memory Clock Cycles)

Rate | PHY Address & Command | Memory Maximum Read | PHY Read Return | Controller Read Return | Round Trip | Round Trip Without Memory
Quarter | 7 | 3–16 | 18 | 0 | 28–41 | 25
Half | 4 | 3–16 | 6 | 0 | 13–26 | 10

15.7 Variable Controller Latency

The variable controller latency feature allows you to take advantage of lower latency for variations designed to run at lower frequencies. When deciding whether to vary the controller latency from the default value of 1, be aware of the following considerations:

• Reduced latency can help achieve a reduction in resource usage and clock cycles in the controller, but might result in lower fMAX.
• Increased latency can help achieve greater fMAX, but might consume more clock cycles in the controller and result in increased resource usage.

If you select a latency value that is inappropriate for the target frequency, the system displays a warning message in the text area at the bottom of the parameter editor.

You can change the controller latency by altering the value of the Controller Latency setting in the Controller Settings section of the General Settings tab of the QDR II and QDR II+ SRAM controller with UniPHY parameter editor.
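Because the tables in this chapter count latency in full-rate memory clock cycles, converting a figure to time, or to local-side clock cycles, is a matter of scaling by the memory clock period and the rate divisor. A tclsh sketch; the 400 MHz clock and the 26-cycle figure are arbitrary example values:

  # Convert a latency in full-rate memory clock cycles to nanoseconds
  # and to local-side (user logic) clock cycles.
  proc latency_ns {cycles mem_clk_mhz} {
      return [expr {$cycles * 1000.0 / $mem_clk_mhz}]
  }
  proc local_cycles {cycles rate} {
      # full rate: local clock = memory clock; half: /2; quarter: /4
      array set div {full 1 half 2 quarter 4}
      return [expr {double($cycles) / $div($rate)}]
  }
  puts [latency_ns 26 400]     ;# 65.0 ns
  puts [local_cycles 26 half]  ;# 13.0 local clock cycles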
15.8 Document Revision History

Date | Version | Changes
May 2017 | 2017.05.08 | Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Maintenance release.
November 2015 | 2015.11.02 | Maintenance release.
May 2015 | 2015.05.04 | Maintenance release.
December 2014 | 2014.12.15 | Maintenance release.
August 2014 | 2014.08.15 | Maintenance release.
December 2013 | 2013.12.16 | Updated latency data for QDR II/II+.
November 2012 | 2.1 | Added latency information for RLDRAM 3. Changed chapter number from 9 to 11.
June 2012 | 2.0 | Added latency information for LPDDR2. Added Feedback icon.
November 2011 | 1.0 | Consolidated latency information from the 11.0 DDR2 and DDR3 SDRAM Controller with UniPHY User Guide, QDR II and QDR II+ SRAM Controller with UniPHY User Guide, and RLDRAM II Controller with UniPHY IP User Guide. Updated data for 11.1.

16 Timing Diagrams for UniPHY IP

The following topics contain timing diagrams for UniPHY-based external memory interface IP for supported protocols.

16.1 DDR2 Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for DDR2 protocols.

The following figures present timing diagrams based on a Stratix III device.

Figure 183. Full-Rate DDR2 SDRAM Read
[Timing diagram: Avalon-MM (avl_*), AFI (afi_*), and memory (mem_*) waveforms for a full-rate DDR2 read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.
Figure 184. Full-Rate DDR2 SDRAM Write
[Timing diagram: Avalon-MM, AFI, and memory waveforms for a full-rate DDR2 write, including afi_dqs_burst, afi_wdata_valid, and afi_wlat; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.

Figure 185. Half-Rate DDR2 SDRAM Read
[Timing diagram: half-rate read, with two-bit-wide AFI command signals and a 32-bit avl_rdata bus; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.

Figure 186. Half-Rate DDR2 SDRAM Write
[Timing diagram: half-rate write, including afi_dqs_burst, afi_wdata_valid, and afi_wlat; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.
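To reproduce waveforms like these for your own design, you can group the corresponding signals in the wave window from the ModelSim/Questa Tcl prompt. A minimal sketch; the instance path is a placeholder for your testbench hierarchy:

  # ModelSim/Questa .do fragment; replace the path with your hierarchy.
  set dut sim:/tb/dut   ;# placeholder instance path
  add wave -divider "Avalon-MM user interface"
  add wave $dut/avl_*
  add wave -divider "AFI (PHY-to-controller) interface"
  add wave $dut/afi_*
  add wave -divider "Memory device pins"
  add wave $dut/mem_*
  run 10 us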
16.2 DDR3 Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for DDR3 protocols.

The following figures present timing diagrams based on a Stratix III device.

Figure 187. Half-Rate DDR3 SDRAM Read
[Timing diagram: half-rate DDR3 read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.

Figure 188. Half-Rate DDR3 SDRAM Writes
[Timing diagram: half-rate DDR3 write, including afi_dqs_burst, afi_wdata_valid, and afi_wlat; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.

Figure 189. Quarter-Rate DDR3 SDRAM Reads
[Timing diagram: quarter-rate DDR3 read, with four-bit-wide AFI command signals and a 64-bit avl_rdata bus; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.
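The signal widths in these diagrams follow directly from the interface rate: each local-side word carries two memory beats per memory clock (DDR), multiplied by the number of memory clocks batched per local clock. So an x8 interface presents a 16-bit avl_rdata in full rate, 32 bits in half rate, and 64 bits in quarter rate. A tclsh sketch of this rule of thumb; it matches the widths shown in Figures 183, 185, and 189, though protocol-specific exceptions should be checked against the handbook:

  proc local_data_width {mem_dq_width rate} {
      # DDR transfers two beats per memory clock; half/quarter rate
      # additionally batch 2 or 4 memory clocks per local clock.
      array set clocks_per_local {full 1 half 2 quarter 4}
      return [expr {$mem_dq_width * 2 * $clocks_per_local($rate)}]
  }
  puts [local_data_width 8 full]     ;# 16  (Figure 183: avl_rdata[15:0])
  puts [local_data_width 8 half]     ;# 32  (Figure 185: avl_rdata[31:0])
  puts [local_data_width 8 quarter]  ;# 64  (Figure 189: avl_rdata[63:0])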
Figure 190. Quarter-Rate DDR3 SDRAM Writes
[Timing diagram: quarter-rate DDR3 write; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.

16.3 QDR II and QDR II+ Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for QDR II and QDR II+ protocols.

The following figures present timing diagrams based on a Stratix III device.

Figure 191. Half-Rate QDR II and QDR II+ SRAM Read
[Timing diagram: half-rate read on the separate read port (avl_r_*), with afi_rps_n, afi_wps_n, mem_rps_n, mem_wps_n, mem_cq, and mem_q; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues two read commands to PHY.
3. PHY issues two read commands to memory.
4. PHY receives read data from memory.
5. Controller receives read data from PHY.
6. User logic receives read data from controller.

Figure 192. Half-Rate QDR II and QDR II+ SRAM Write
[Timing diagram: half-rate write on the separate write port (avl_w_*), with afi_bws_n, mem_d, and mem_bws_n; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues two write commands to PHY.
4. Controller sends write data to PHY.
5. PHY issues two write commands to memory.
6. PHY sends write data to memory.

Figure 193. Full-Rate QDR II and QDR II+ SRAM Read
[Timing diagram: full-rate read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues two read commands to PHY.
3. PHY issues two read commands to memory.
4. PHY receives read data from memory.
5. Controller receives read data from PHY.
6. User logic receives read data from controller.
Figure 194. Full-Rate QDR II and QDR II+ SRAM Write
[Timing diagram: full-rate write; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues two write commands to PHY.
4. Controller sends write data to PHY.
5. PHY issues two write commands to memory.
6. PHY sends write data to memory.

16.4 RLDRAM II Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for RLDRAM II protocols.

The following figures present timing diagrams based on a Stratix III device.

Figure 195. Half-Rate RLDRAM II Read
[Timing diagram: half-rate read, with afi_ref_n, mem_ref_n, mem_qk, and mem_dq; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues read command to PHY.
3. PHY issues read command to memory.
4. PHY receives read data from memory.
5. Controller receives read data from PHY.
6. User logic receives read data from controller.

Figure 196. Half-Rate RLDRAM II Write
[Timing diagram: half-rate write, with afi_wdata_valid, afi_dm, mem_dk, and mem_dm; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues write command to PHY.
4. PHY issues write command to memory.
5. Controller sends write data to PHY.
6. PHY sends write data to memory.
Figure 197. Full-Rate RLDRAM II Read
[Timing diagram: full-rate read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues read command to PHY.
3. PHY issues read command to memory.
4. PHY receives read data from memory.
5. Controller receives read data from PHY.
6. User logic receives read data from controller.

Figure 198. Full-Rate RLDRAM II Write
[Timing diagram: full-rate write; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues write command to PHY.
4. PHY issues write command to memory.
5. Controller sends write data to PHY.
6. PHY sends write data to memory.

16.5 LPDDR2 Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for LPDDR2 protocols.

Figure 199. Half-Rate LPDDR2 Read
[Timing diagram: half-rate read, with the multiplexed command/address bus mem_ca; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.

Figure 200. Half-Rate LPDDR2 Write
[Timing diagram: half-rate write, including afi_dqs_burst, afi_wdata_valid, and afi_wlat; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.

Figure 201. Full-Rate LPDDR2 Read
[Timing diagram: full-rate read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives read command.
2. Controller issues activate command to PHY.
3. PHY issues activate command to memory.
4. Controller issues read command to PHY.
5. PHY issues read command to memory.
6. PHY receives read data from memory.
7. Controller receives read data from PHY.
8. User logic receives read data from controller.
Figure 202. Full-Rate LPDDR2 Write
[Timing diagram: full-rate write; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller receives write command.
2. Controller receives write data.
3. Controller issues activate command to PHY.
4. PHY issues activate command to memory.
5. Controller issues write command to PHY.
6. PHY issues write command to memory.
7. Controller sends write data to PHY.
8. PHY sends write data to memory.

16.6 RLDRAM 3 Timing Diagrams

This topic contains timing diagrams for UniPHY-based external memory interface IP for RLDRAM 3 protocols.

Figure 203. Quarter-Rate RLDRAM 3 Read
[Timing diagram: quarter-rate read, with afi_rlat, afi_rst_n, global_reset, mem_qk/mem_qk_n, and mem_dk/mem_dk_n; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller issues read command to PHY.
2. PHY issues read command to memory.
3. PHY receives data from memory.
4. Controller receives read data from PHY.

Figure 204. Quarter-Rate RLDRAM 3 Write
[Timing diagram: quarter-rate write; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller issues write command to PHY.
2. Write data is ready from the controller for the PHY.
3. PHY issues write command to memory.
4. PHY sends write data to memory.
Figure 205. Half-Rate RLDRAM 3 Read
[Timing diagram: half-rate read; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller issues read command to PHY.
2. PHY issues read command to memory.
3. PHY receives data from memory.
4. Controller receives read data from PHY.

Figure 206. Half-Rate RLDRAM 3 Write
[Timing diagram: half-rate write, including local_cal_success; the numbered markers correspond to the notes below.]

Notes for the above Figure:
1. Controller issues write command to PHY.
2. Write data is ready from the controller for the PHY.
3. PHY issues write command to memory.
4. PHY sends write data to memory.

16.7 Document Revision History

Date | Version | Changes
May 2017 | 2017.05.08 | Rebranded as Intel.
October 2016 | 2016.10.31 | Maintenance release.
May 2016 | 2016.05.02 | Maintenance release.
November 2015 | 2015.11.02 | Maintenance release.
May 2015 | 2015.05.04 | Maintenance release.
December 2014 | 2014.12.15 | Maintenance release.
August 2014 | 2014.08.15 | Maintenance release.
December 2013 | 2013.12.16 | Updated timing diagrams for LPDDR2.
November 2012 | 2.1 | Added timing diagrams for RLDRAM 3. Changed chapter number from 10 to 12.
June 2012 | 2.0 | Added timing diagrams for LPDDR2. Added Feedback icon.
November 2011 | 1.1 | Consolidated timing diagrams from the 11.0 DDR2 and DDR3 SDRAM Controller with UniPHY User Guide, QDR II and QDR II+ SRAM Controller with UniPHY User Guide, and RLDRAM II Controller with UniPHY IP User Guide. Added Read and Write diagrams for DDR3 quarter-rate.

17 External Memory Interface Debug Toolkit

The EMIF Toolkit lets you run your own traffic patterns, diagnose and debug calibration problems, and produce margining reports for your external memory interface.

The toolkit is compatible with UniPHY-based external memory interfaces that use the Nios II-based sequencer, with toolkit communication enabled, and with Arria 10 EMIF IP. Toolkit communication is on by default in versions 10.1 and 11.0 of UniPHY IP; for version 11.1 and later, toolkit communication is on whenever debugging is enabled on the Diagnostics tab of the IP core interface.

The EMIF Toolkit can communicate with several different memory interfaces on the same device, but can communicate with only one memory device at a time.

Note: The EMIF Debug Toolkit does not support MAX 10 devices.
17.1 User Interface

The EMIF toolkit provides a graphical user interface for communication with connections.

All functions provided in the toolkit are also available directly from the quartus_sh Tcl shell, through the external_memif_toolkit Tcl package. The availability of Tcl support allows you to create scripts that run automatically from Tcl. You can find information about specific Tcl commands by running help -pkg external_memif_toolkit from the quartus_sh Tcl shell.

If you want, you can begin interacting with the toolkit through the GUI, and later automate your workflow by creating Tcl scripts (see the script sketch at the end of this section). The toolkit GUI records a history of the commands that you run. You can see the command history on the History tab in the toolkit GUI.

17.1.1 Communication

Communication between the EMIF Toolkit and external memory interface connections varies, depending on the connection type and version. In versions 10.1 and 11.0 of the EMIF IP, communication is achieved using direct communication to the Nios II-based sequencer. In version 11.1 and later, communication is achieved using a JTAG Avalon-MM master attached to the sequencer bus.

The following figure shows the structure of UniPHY-based IP version 11.1 and later, with the JTAG Avalon-MM master attached to the sequencer bus masters.

Figure 207. UniPHY IP Version 11.1 and Later, with JTAG Avalon-MM Master
[Block diagram: a JTAG Avalon master (new) and EMIF Toolkit bridge on the debug bus connect through Avalon-MM to the sequencer bus, alongside the Nios II sequencer, the combined ROM/RAM (variable size), and the SCC, Tracking, and Register File sequencer managers, with the PHY's AFI connection.]

17.1.2 Calibration and Report Generation

The EMIF Toolkit uses calibration differently, depending on the version of the external memory interface in use. For version 10.1 and 11.0 interfaces, the EMIF Toolkit causes the memory interface to calibrate several times, to produce the data from which the toolkit generates its reports. In version 11.1 and later, report data is generated during calibration, with no need to repeat calibration. For version 11.1 and later, generated reports reflect the result of the previous calibration; there is no need to recalibrate unless you choose to do so.
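A scripted session has the general shape sketched below. The help invocation is the one documented above; the numbered steps are deliberately left as comments because the exact command names and arguments vary by Quartus Prime version, so list them with the help command before scripting:

  # From a quartus_sh Tcl shell:
  load_package external_memif_toolkit   ;# load the toolkit package named in this chapter
  help -pkg external_memif_toolkit      ;# list the real commands and their arguments
  # A typical scripted session then mirrors the GUI tasks:
  #   1. initialize the list of known connections
  #   2. link the project to the programmed device (.sof, plus .jdi before 12.1)
  #   3. establish a connection to the memory interface
  #   4. run report-generation tasks (summary, calibration, margining)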
17.2 Setup and Use

Before using the EMIF Toolkit, you should compile your design and program the target device with the resulting SRAM Object File (.sof). For designs compiled in the Quartus II software version 12.0 or earlier, debugging information is contained in the JTAG Debugging Information file (.jdi); for designs compiled in the Quartus II software version 12.1 or later, all debugging information resides in the .sof file.

You can run the toolkit using all your project files, or using only the Quartus Prime Project File (.qpf), Quartus Prime Settings File (.qsf), and .sof file; the .jdi file is also required for designs compiled prior to version 12.1. To ensure that all debugging information is correctly synchronized for designs compiled prior to version 12.1, ensure that the .sof and .jdi files that you use were generated during the same run of the Quartus Prime Assembler.

After you have programmed the target device, you can run the EMIF Toolkit and open your project. You can then use the toolkit to create connections to the external memory interface.

17.2.1 General Workflow

To use the EMIF Toolkit, you must link your compiled project to a device, and create a communication channel to the connection that you want to examine.

17.2.2 Linking the Project to a Device

1. To launch the toolkit, select External Memory Interface Toolkit from the Tools menu in the Quartus Prime software.
2. After you have launched the toolkit, open your project and click the Initialize connections task in the Tasks window, to initialize a list of all known connections.
3. To link your project to a specific device on specific hardware, perform the following steps:
   a. Click the Link Project to Device task in the Tasks window.
   b. Select the desired hardware from the Hardware dropdown menu in the Link Project to Device dialog box.
   c. Select the desired device on the hardware from the Device dropdown menu in the Link Project to Device dialog box.
   d. Select the correct Link file type, depending on the version of software in which your design was compiled:
      • If your design was compiled in the Quartus II software version 12.0 or earlier, select JDI as the Link file type, verify that the .jdi file is correct for your .sof file, and click OK.
      • If your design was compiled in the Quartus II software version 12.1 or later, or the Quartus Prime software, select SOF as the Link file type, verify that the .sof file is correct for your programmed device, and click OK.

Figure 208. Link Project to Device Dialog Box
[Screen capture of the Link Project to Device dialog box.]

When you link your project to the device, the toolkit verifies all connections on the device against the information in the JDI or SOF file, as appropriate. If the toolkit detects any mismatch between the JDI file and the device connections, an error message is displayed. For designs compiled using the Quartus II software version 12.1 or later, or the Quartus Prime software, the SOF file contains a design hash to ensure that the SOF file used to program the device matches the SOF file specified for linking to a project. If the hash does not match, an error message appears.

If the toolkit successfully verifies all connections, it then attempts to determine the connection type for each connection. Connections of a known type are listed in the Linked Connections report, and are available for the toolkit to use.
17.2.3 Establishing Communication to Connections

After you have completed linking the project, you can establish communication to the connections.

1. In the Tasks window:
   • Click Establish Memory Interface Connection to create a connection to the external memory interface.
   • Click Establish Efficiency Monitor Connection to create a connection to the efficiency monitor.
   • Click Establish Traffic Generator Connection to create a connection to the Traffic Generator 2.0 (if enabled, Arria 10 only).
2. To create a communication channel to a connection, select the desired connection from the displayed pulldown menu of connections, and click OK. The toolkit establishes a communication channel to the connection, creates a report folder for the connection, and creates a folder of tasks for the connection.
   Note: By default, the connection and the reports and tasks folders are named according to the hierarchy path of the connection. If you want, you can specify a different name for the connection and its folders.
3. You can run any of the tasks in the folder for the connection; any resulting reports appear in the reports folder for the connection.

17.2.4 Selecting an Active Interface

If you have more than one external memory interface in an Arria 10 I/O column, you can select one instance as the active interface for debugging.

1. To select one of multiple EMIF instances in an Arria 10 I/O column, use the Set Active Interface dialog box.
2. If you want to generate reports for the new active interface, you must first recalibrate the interface.

17.2.5 Reports

The toolkit can generate a variety of reports, including summary, calibration, and margining reports for external memory interface connections. To generate a supported type of report for a connection, you run the associated task in the tasks folder for that connection.

Summary Report

The Summary Report provides an overview of the memory interface; it consists of the following tables:

• Summary table. Provides a high-level summary of calibration results. This table lists details about the connection, IP version, IP protocol, and basic calibration results, including calibration failures. This table also lists the estimated average read and write data valid windows, and the calibrated read and write latencies.
• Interface Details table. Provides details about the parameterization of the memory IP. This table allows you to verify that the parameters in use match the actual memory device in use.
• Groups Masked from Calibration table. Lists any groups that were masked from calibration when calibration occurred. Masked groups are ignored during calibration.
  Note: This table applies only to UniPHY-based interfaces; it is not applicable to Arria 10 EMIF.
• Ranks Masked from Calibration tables (DDR2 and DDR3 only). Lists any ranks that were masked from calibration when calibration occurred. Masked ranks are ignored during calibration.

Calibration Report (UniPHY)

The Calibration Report provides detailed information about the margins observed before and after calibration, and the settings applied to the memory interface during calibration; it consists of the following tables:

• Per DQS Group Calibration table: Lists calibration results for each group. If a group fails calibration, this table also lists the reason for the failure.
  Note: If a group fails calibration, the calibration routine skips all remaining groups. You can deactivate this behavior by running the Enable Calibration for All Groups On Failure command in the toolkit.
• DQ Pin Margins Observed Before Calibration table: Lists the DQ pin margins observed before calibration occurs. You can refer to this table to see the per-bit skews resulting from the specific silicon and board that you are using.
• DQS Group Margins Observed During Calibration table: Lists the DQS group margins observed during calibration.
• DQ Pin Settings After Calibration and DQS Group Settings After Calibration tables: List the settings made to all dynamically controllable parts of the memory interface as a result of calibration. You can refer to these tables to see the modifications made by the calibration algorithm.
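The margin tables express, for each pin, how far the sampling point can move in each direction (in delay-chain steps) before failure; the corresponding data-valid window is the sum of the two margins multiplied by the delay-chain step size. A small tclsh illustration of that arithmetic only; the step size and margin values below are invented for the example and are not taken from any real device:

  set step_ps 25.0   ;# delay-chain step size, device-specific (invented here)
  set left    11     ;# working steps when moving the sampling point earlier
  set right   14     ;# working steps when moving the sampling point later
  puts "data-valid window = [expr {($left + $right) * $step_ps}] ps"   ;# 625.0 ps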
• DQS Group Margins Observed During Calibration table: Lists the DQS group margins observed during calibration.
• DQ Pin Settings After Calibration and DQS Group Settings After Calibration table: Lists the settings made to all dynamically controllable parts of the memory interface as a result of calibration. You can refer to this table to see the modifications made by the calibration algorithm.

Calibration Report (Arria 10 EMIF)

The Calibration Report provides detailed information about the margins observed during calibration, and the settings applied to the memory interface during calibration; it consists of the following tables:

• Calibration Status Per Group table: Lists the pass/fail status per group.
• DQ Pin Margins Observed During Calibration table: Lists the DQ read/write margins and calibrated delay settings. These are the expected margins after calibration, based on calibration data patterns. This table also contains DM/DBI margins, if applicable.
• DQS Pin Margins Observed During Calibration table: Lists the DQS margins observed during calibration.
• FIFO Settings table: Lists the VFIFO and LFIFO settings made during calibration.
• Latency Observed During Calibration table: Lists the calibrated read/write latency.
• Address/Command Margins Observed During Calibration table: Lists the margins on calibrated A/C pins, for protocols that support Address/Command calibration.

Margin Report

The Margin Report lists the post-calibration margins for each DQ and data mask pin, keeping all other pin settings constant; it consists of the following tables:

• DQ Pin Post Calibration Margins table. Lists the margin data in tabular format.
• Read Data Valid Windows report. Shows read data valid windows in graphical format.
• Write Data Valid Windows report. Shows write data valid windows in graphical format.

Note: The Margin Report applies only to UniPHY-based interfaces; it is not applicable to Arria 10 EMIF. For Arria 10 EMIF, the Calibration Report provides equivalent information.

17.3 Operational Considerations

Some features and considerations are of interest in particular situations.

Specifying a Particular JDI File

Correct operation of the EMIF Toolkit depends on the correct JDI file being used when linking the project to the device. The JDI file is produced by the Quartus Prime Assembler, and contains a list of all system-level debug nodes and their hierarchy path names. If the default .jdi file name is incorrect for your project, you must specify the correct .jdi file. The .jdi file is supplied during the link-project-to-device step, where the revision_name.jdi file in the project directory is used by default. To supply an alternative .jdi file, click the ellipsis button and then select the correct .jdi file.

PLL Status

When connecting to DDR-based external memory interface connections, the PLL status appears in the Establish Connection dialog box when the IP is generated to use the CSR controller port, allowing you to immediately see whether the PLL status is locked. If the PLL is not locked, no communication can occur until the PLL becomes locked and the memory interface reset is deasserted. When you are linking your project to a device, an error message occurs if the toolkit detects that a JTAG Avalon-MM master has no clock running.
You can run the Reindex Connections task to have the toolkit rescan for connections and update the status and type of found connections in the Linked Connections report.

Margining Reports

The EMIF Toolkit can display margining information showing the post-calibration data-valid windows for reads and writes. Margining information is determined by individually modifying the input and output delay chains for each data and strobe/clock pin to determine the working region. The toolkit can display margining data in both tabular and hierarchical formats.

Group Masks

To aid in debugging your external memory interface, the EMIF Toolkit allows you to mask individual groups and ranks from calibration. Masked groups and ranks are skipped during the calibration process, meaning that only unmasked groups and ranks are included in calibration. Subsequent mask operations overwrite any previous masks.

Using with Arria 10 Devices in a PHY-Only Configuration

If you want to use the Debug Toolkit with an Arria 10 EMIF IP in a PHY-only configuration, you must connect a debug clock and a reset signal to the Arria 10 external memory interface debug component. Intel recommends using the AFI clock and AFI reset for this purpose.

Note: For information about calibration stages in UniPHY-based interfaces, refer to UniPHY Calibration Stages in the Functional Description - UniPHY chapter. For information about calibration stages in Arria 10 EMIF IP, refer to Arria 10 EMIF Calibration, in the Functional Description - Arria 10 EMIF chapter.

Related Links
• Calibration Stages on page 59
  The calibration process begins when the PHY reset signal deasserts and the PLL and DLL lock.
• Functional Description—UniPHY on page 13
• Functional Description—Intel Arria 10 EMIF IP on page 144
  Intel Arria 10 devices can interface with external memory devices clocking at frequencies of up to 1.3 GHz. The external memory interface IP component for Arria 10 devices provides a single parameter editor for creating external memory interfaces, regardless of memory protocol.

17.4 Troubleshooting

In the event of calibration failure, refer to the following debugging tips to assist in troubleshooting your design. Calibration results and failing stages are available through the external memory interface toolkit.

Figure 209. Debugging Tips (the flowchart is summarized below by failing calibration stage of the group)

• Stage 1: Read Calibration (VFIFO)
  — Guaranteed read failure (address/command skew issue): check the address/command clock phase; check address/command board skew and delays; ensure that board parameters are correct and include the complete path to the memory device.
  — No working DQSen phase found (DQ/DQS centering issue): check the DQS enable calibration margin; ensure that DQ and DQS are matched on the board.
  — VFIFO center (per-bit read deskew) failure: add or subtract delay on DQS to center DQS within the DQ window; ensure that board parameters are correct and include the complete path to the memory device.
• Stage 2: Write Leveling
  — No first working write leveling phase found: verify the write leveling window to ensure that all groups are within the maximum write leveling range; increase the D6 delay on CLK.
  — No last working write leveling phase found, or write leveling copy failure: verify the write leveling window to ensure that all groups are within the maximum write leveling range; increase the D6 delay on DQ and DQS for the groups that cannot write level.
• Stage 3: Read Calibration (LFIFO)
  — LFIFO tuning failure is unexpected here.
    If a failure occurs at this point, it is probably Stage 2 (write leveling) that is failing. Contact Altera.
• Stage 4: Write Calibration
  — Write center (per-bit write deskew) failure: mask the failing DQ group and rerun calibration; check margins to see whether the failure affects all bits in the group or only some bits; verify board traces and solder joints; check the DQ-to-DQS phase, which should be edge aligned.

17.5 Debug Report for Arria V and Cyclone V SoC Devices

The External Memory Interface Debug Toolkit and EMIF On-Chip Debug Port do not work with Arria V and Cyclone V SoC devices. Debugging information for Arria V and Cyclone V SoC devices is available by enabling a debug output report, which contains similar information.

17.5.1 Enabling the Debug Report for Arria V and Cyclone V SoC Devices

To enable a debug report for Arria V or Cyclone V SoC devices, perform the following steps:
1. Open the <design_name>/hps_isw_handoff/sequencer_defines.h file in a text editor.
2. In the sequencer_defines.h file, locate the following line:
   #define RUNTIME_CAL_REPORT 0
3. Change #define RUNTIME_CAL_REPORT 0 to #define RUNTIME_CAL_REPORT 1, and save the file.
4. Generate the board support package (BSP) with semihosting enabled, or with UART output.
The system now generates the debugging report as part of the calibration process.

17.5.2 Determining the Failing Calibration Stage for a Cyclone V or Arria V HPS SDRAM Controller

To determine the failing calibration stage, you must turn on the debug output report by setting the RUNTIME_CAL_REPORT option to 1 in the sequencer_defines.h file, located in the hps_isw_handoff directory.

If calibration fails, the following statements are printed in the debug output report:

SEQ.C: Calibration Failed
SEQ.C: Error Stage   : <Num>
SEQ.C: Error Substage: <Num>
SEQ.C: Error Group   : <Num>

To determine the stage and sub-stage, open the sequencer.h file in the hps_isw_handoff directory and look for the calibration defines (a small decoding sketch appears at the end of this section):

/* calibration stages */
#define CAL_STAGE_NIL 0
#define CAL_STAGE_VFIFO 1
#define CAL_STAGE_WLEVEL 2
#define CAL_STAGE_LFIFO 3
#define CAL_STAGE_WRITES 4
#define CAL_STAGE_FULLTEST 5
#define CAL_STAGE_REFRESH 6
#define CAL_STAGE_CAL_SKIPPED 7
#define CAL_STAGE_CAL_ABORTED 8
#define CAL_STAGE_VFIFO_AFTER_WRITES 9

/* calibration substages */
#define CAL_SUBSTAGE_NIL 0
#define CAL_SUBSTAGE_GUARANTEED_READ 1
#define CAL_SUBSTAGE_DQS_EN_PHASE 2
#define CAL_SUBSTAGE_VFIFO_CENTER 3
#define CAL_SUBSTAGE_WORKING_DELAY 1
#define CAL_SUBSTAGE_LAST_WORKING_DELAY 2
#define CAL_SUBSTAGE_WLEVEL_COPY 3
#define CAL_SUBSTAGE_WRITES_CENTER 1
#define CAL_SUBSTAGE_READ_LATENCY 1
#define CAL_SUBSTAGE_REFRESH 1

For details about the stages of calibration, refer to Calibration Stages in Functional Description - UniPHY.

Related Links
Calibration Stages on page 59
  The calibration process begins when the PHY reset signal deasserts and the PLL and DLL lock.
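Because the substage numbers are reused across stages, the report values must be interpreted together. The following minimal C sketch maps the printed Error Stage number to a readable name, based only on the CAL_STAGE_* defines shown above; the helper name cal_stage_name is illustrative and is not part of the generated headers.

#include <stdio.h>

/* Map the "Error Stage : <Num>" value from the debug report to a name,
 * following the CAL_STAGE_* defines in sequencer.h. Substage numbers are
 * reused across stages: substage 1 means GUARANTEED_READ during the VFIFO
 * stage, but WORKING_DELAY during write leveling. */
static const char *cal_stage_name(int stage)
{
    switch (stage) {
    case 0: return "NIL";
    case 1: return "VFIFO (read calibration)";
    case 2: return "WLEVEL (write leveling)";
    case 3: return "LFIFO (read latency)";
    case 4: return "WRITES (write calibration)";
    case 5: return "FULLTEST";
    case 6: return "REFRESH";
    case 7: return "CAL_SKIPPED";
    case 8: return "CAL_ABORTED";
    case 9: return "VFIFO_AFTER_WRITES";
    default: return "unknown";
    }
}

int main(void)
{
    /* Example: "Error Stage : 2" with "Error Substage: 3" indicates a
     * write-leveling copy failure (CAL_STAGE_WLEVEL, CAL_SUBSTAGE_WLEVEL_COPY). */
    printf("Stage 2 = %s\n", cal_stage_name(2));
    return 0;
}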
17.6 On-Chip Debug Port for UniPHY-based EMIF IP

The EMIF On-Chip Debug Port allows user logic to access the same calibration data used by the EMIF Toolkit, and allows user logic to send commands to the sequencer. You can use the EMIF On-Chip Debug Port to access calibration data for your design and to send commands to the sequencer just as the EMIF Toolkit would.

The following information is available:
• Pass/fail status for each DQS group
• Read and write data valid windows for each group

In addition, user logic can request the following commands from the sequencer:
• Destructive recalibration of all groups
• Masking of groups and ranks
• Generation of per-DQ pin margining data as part of calibration

The user logic communicates through an Avalon-MM slave interface as shown below.

Figure 210. User Logic Access (user logic connects through an Avalon slave port to the Altera memory interface)

17.6.1 Access Protocol

The On-Chip Debug Port provides access to calibration data through an Avalon-MM slave interface. To send a command to the sequencer, user logic sends a command code to the command space in sequencer memory. The sequencer polls the command space for new commands after each group completes calibration, and continuously after overall calibration has completed.

The communication protocol to send commands from user logic to the sequencer uses a multistep handshake with a data structure as shown below, and an algorithm as shown in the figure which follows.

typedef struct debug_data_struct {
    ...
    // Command interaction
    alt_u32 requested_command;
    alt_u32 command_status;
    alt_u32 command_parameters[COMMAND_PARAM_WORDS];
    ...
}

To send a command to the sequencer, user logic must first poll the command_status word for a value of TCLDBG_TX_STATUS_CMD_READY, which indicates that the sequencer is ready to accept commands. When the sequencer is ready to accept commands, user logic must write the command parameters into command_parameters, and then write the command code into requested_command.

The sequencer detects the command code and replaces command_status with TCLDBG_TX_STATUS_CMD_EXE, to indicate that it is processing the command. When the sequencer has finished running the command, it sets command_status to TCLDBG_TX_STATUS_RESPONSE_READY to indicate that the result of the command is available to be read. (If the sequencer rejects the requested command as illegal, it sets command_status to TCLDBG_TX_STATUS_ILLEGAL_CMD.)

User logic acknowledges completion of the command by writing TCLDBG_CMD_RESPONSE_ACK to requested_command. The sequencer responds by setting command_status back to STATUS_CMD_READY. (If an illegal command is received, it must be cleared using CMD_RESPONSE_ACK.)

Figure 211. Debugging Algorithm Flowchart (summarized as the following sequence)

1. Read command_status until it equals CMD_READY.
2. Write the command payload.
3. Write the command code.
4. Read command_status until it equals RESPONSE_READY.
5. Write the RESPONSE_ACK code.

17.6.2 Command Codes Reference

The following table lists the supported command codes for the On-Chip Debug Port.

Table 136. Supported Command Codes

Command | Parameters | Description
TCLDBG_RUN_MEM_CALIBRATE | None | Runs the calibration routine.
TCLDBG_MARK_ALL_DQS_GROUPS_AS_VALID | None | Marks all groups as valid for calibration.
TCLDBG_MARK_GROUP_AS_SKIP | Group to skip | Marks the specified group to be skipped by calibration.
TCLDBG_MARK_ALL_RANKS_AS_VALID | None | Marks all ranks as valid for calibration.
TCLDBG_MARK_RANK_AS_SKIP | Rank to skip | Marks the specified rank to be skipped by calibration.
TCLDBG_ENABLE_MARGIN_REPORT | None | Enables generation of the margin report.

17.6.3 Header Files

The external memory interface IP generates header files which identify the debug data structures and memory locations used with the EMIF On-Chip Debug Port. You should refer to these header files for information required for use with your core user logic. It is highly recommended to use a software component (such as a Nios II processor) to access the calibration debug data.

The header files are unique to your IP parameterization and version; therefore, you must ensure that you are referring to the correct version of the header for your design. The names of the header files are core_debug.h and core_debug_defines.h. The header files reside in <design_name>/<design_name>_s0_software.

17.6.4 Generating IP With the Debug Port

The following steps summarize the procedure for implementing your IP with the EMIF On-Chip Debug Port enabled.

1. Start the Quartus Prime software and generate a new external memory interface. For QDR II and RLDRAM II protocols, ensure that sequencer optimization is set to Performance (for Nios II-based sequencer).
2. On the Diagnostics tab of the parameter editor, turn on Enable EMIF On-Chip Debug Port.
3. Ensure that the EMIF On-Chip Debug Port interface type is set to Avalon-MM Slave.
4. Click Finish to generate your IP.
5. Find the Avalon interface in the top-level generated file. Connect this interface to your debug component.

input  wire [19:0] seq_debug_addr,         // seq_debug.address
input  wire        seq_debug_read_req,     //          .read
output wire [31:0] seq_debug_rdata,        //          .readdata
input  wire        seq_debug_write_req,    //          .write
input  wire [31:0] seq_debug_wdata,        //          .writedata
output wire        seq_debug_waitrequest,  //          .waitrequest
input  wire [3:0]  seq_debug_be,           //          .byteenable
output wire        seq_debug_rdata_valid   //          .readdatavalid

If you are using UniPHY-based IP with the hard memory controller, also connect the seq_debug_clk and seq_debug_reset_in signals to clock and asynchronous reset signals that control your debug logic.
6. Find the core_debug.h and core_debug_defines.h header files in <design_name>/<design_name>_s0_software and include these files in your debug component code.
7. Write your debug component using the supported command codes, to read and write to the Avalon-MM interface. The debug data structure resides at the memory address SEQ_CORE_DEBUG_BASE, which is defined in the core_debug_defines.h header file.

17.6.5 Example C Code for Accessing Debug Data

A typical use of the EMIF On-Chip Debug Port might be to recalibrate the external memory interface, and then access the reports directly using the summary_report_ptr, cal_report_ptr, and margin_report_ptr pointers, which are part of the debug data structure.
The following code sample illustrates this usage:

/*
 * DDR3 UniPHY sequencer core access example
 */
#include <stdio.h>
#include <unistd.h>
#include <io.h>
#include "core_debug_defines.h"

int send_command(volatile debug_data_t* debug_data_ptr, int command, int args[], int num_args)
{
    volatile int i, response;

    // Wait until command_status is ready
    do {
        response = IORD_32DIRECT(&(debug_data_ptr->command_status), 0);
    } while(response != TCLDBG_TX_STATUS_CMD_READY);

    // Load arguments
    if(num_args > COMMAND_PARAM_WORDS)
    {
        // Too many arguments
        return 0;
    }
    for(i = 0; i < num_args; i++)
    {
        IOWR_32DIRECT(&(debug_data_ptr->command_parameters[i]), 0, args[i]);
    }

    // Send command code
    IOWR_32DIRECT(&(debug_data_ptr->requested_command), 0, command);

    // Wait for acknowledgment
    do {
        response = IORD_32DIRECT(&(debug_data_ptr->command_status), 0);
    } while(response != TCLDBG_TX_STATUS_RESPONSE_READY && response != TCLDBG_TX_STATUS_ILLEGAL_CMD);

    // Acknowledge response
    IOWR_32DIRECT(&(debug_data_ptr->requested_command), 0, TCLDBG_CMD_RESPONSE_ACK);

    // Return 1 on success, 0 on illegal command
    return (response != TCLDBG_TX_STATUS_ILLEGAL_CMD);
}

int main()
{
    volatile debug_data_t* my_debug_data_ptr;
    volatile debug_summary_report_t* my_summary_report_ptr;
    volatile debug_cal_report_t* my_cal_report_ptr;
    volatile debug_margin_report_t* my_margin_report_ptr;
    volatile debug_cal_observed_dq_margins_t* cal_observed_dq_margins_ptr;
    int i, j, size;
    int args[COMMAND_PARAM_WORDS];

    // Initialize pointers to the debug reports
    my_debug_data_ptr = (debug_data_t*)SEQ_CORE_DEBUG_BASE;
    my_summary_report_ptr = (debug_summary_report_t*)(IORD_32DIRECT(&(my_debug_data_ptr->summary_report_ptr), 0));
    my_cal_report_ptr = (debug_cal_report_t*)(IORD_32DIRECT(&(my_debug_data_ptr->cal_report_ptr), 0));
    my_margin_report_ptr = (debug_margin_report_t*)(IORD_32DIRECT(&(my_debug_data_ptr->margin_report_ptr), 0));

    // Activate all groups and ranks
    send_command(my_debug_data_ptr, TCLDBG_MARK_ALL_DQS_GROUPS_AS_VALID, 0, 0);
    send_command(my_debug_data_ptr, TCLDBG_MARK_ALL_RANKS_AS_VALID, 0, 0);
    send_command(my_debug_data_ptr, TCLDBG_ENABLE_MARGIN_REPORT, 0, 0);

    // Mask group 4
    args[0] = 4;
    send_command(my_debug_data_ptr, TCLDBG_MARK_GROUP_AS_SKIP, args, 1);
    send_command(my_debug_data_ptr, TCLDBG_RUN_MEM_CALIBRATE, 0, 0);

    // SUMMARY
    printf("SUMMARY REPORT\n");
    printf("mem_address_width: %u\n", IORD_32DIRECT(&(my_summary_report_ptr->mem_address_width), 0));
    printf("mem_bank_width: %u\n", IORD_32DIRECT(&(my_summary_report_ptr->mem_bank_width), 0));
    // etc...

    // CAL REPORT
    printf("CALIBRATION REPORT\n");
    // DQ read margins
    for(i = 0; i < RW_MGR_MEM_DATA_WIDTH; i++)
    {
        cal_observed_dq_margins_ptr = &(my_cal_report_ptr->cal_dq_in_margins[i]);
        printf("0x%x DQ %d Read Margin (taps): -%d : %d\n",
            (unsigned int)cal_observed_dq_margins_ptr, i,
            IORD_32DIRECT(&(cal_observed_dq_margins_ptr->left_edge), 0),
            IORD_32DIRECT(&(cal_observed_dq_margins_ptr->right_edge), 0));
    }
    // etc...
    return 0;
}

17.7 On-Chip Debug Port for Arria 10 EMIF IP

The EMIF On-Chip Debug Port allows user logic to access the same calibration data used by the EMIF Toolkit, and allows user logic to send commands to the sequencer. You can use the EMIF On-Chip Debug Port to access calibration data for your design and to send commands to the sequencer just as the EMIF Toolkit would.
The following information is available:
• Pass/fail status for each DQS group
• Read and write data valid windows for each group

In addition, user logic can request the following commands from the sequencer:
• Destructive recalibration of all groups
• Masking of groups and ranks
• Generation of per-DQ pin margining data as part of calibration

The user logic communicates through an Avalon-MM slave interface as shown below.

Figure 212. User Logic Access (user logic connects through an Avalon slave port to the Altera memory interface)

17.7.1 Access Protocol

The On-Chip Debug Port provides access to calibration data through an Avalon-MM slave interface. To send a command to the sequencer, user logic sends a command code to the command space in sequencer memory. The sequencer polls the command space for new commands after each group completes calibration, and continuously after overall calibration has completed.

The communication protocol to send commands from user logic to the sequencer uses a multistep handshake with a data structure as shown below, and an algorithm as shown in the figure which follows.

typedef struct debug_data_struct {
    ...
    // Command interaction
    alt_u32 requested_command;
    alt_u32 command_status;
    alt_u32 command_parameters[COMMAND_PARAM_WORDS];
    ...
}

To send a command to the sequencer, user logic must first poll the command_status word for a value of TCLDBG_TX_STATUS_CMD_READY, which indicates that the sequencer is ready to accept commands. When the sequencer is ready to accept commands, user logic must write the command parameters into command_parameters, and then write the command code into requested_command.

The sequencer detects the command code and replaces command_status with TCLDBG_TX_STATUS_CMD_EXE, to indicate that it is processing the command. When the sequencer has finished running the command, it sets command_status to TCLDBG_TX_STATUS_RESPONSE_READY to indicate that the result of the command is available to be read. (If the sequencer rejects the requested command as illegal, it sets command_status to TCLDBG_TX_STATUS_ILLEGAL_CMD.)

User logic acknowledges completion of the command by writing TCLDBG_CMD_RESPONSE_ACK to requested_command. The sequencer responds by setting command_status back to STATUS_CMD_READY. (If an illegal command is received, it must be cleared using CMD_RESPONSE_ACK.)

Figure 213. Debugging Algorithm Flowchart (summarized as the following sequence)

1. Read command_status until it equals CMD_READY.
2. Write the command payload.
3. Write the command code.
4. Read command_status until it equals RESPONSE_READY.
5. Write the RESPONSE_ACK code.

17.7.2 EMIF On-Chip Debug Port

In Arria 10 and later families, access to on-chip debug is provided through software running on a Nios processor connected to the external memory interface. If you enable the Use Soft Nios Processor for On-Chip Debug option, the system instantiates a soft Nios processor, and software files are provided as part of the EMIF IP.

Instructions on how to use the software are available in the following file: <variation_name>/altera_emif_arch_nf_<version number>/<synth|sim>/<variation_name>_altera_emif_arch_nf_<version number>_<unique ID>_readme.txt
17.7.3 On-Die Termination Calibration

The Calibrate Termination feature lets you determine the optimal On-Die Termination and Output Drive Strength settings for your memory interface, for Arria 10 and later families. The Calibrate Termination function runs calibration with all available termination settings and selects the optimal settings based on the calibration margins.

The Calibrate Termination feature is available for DDR3, DDR4, and RLDRAM 3 protocols, on Arria 10 devices.

17.7.4 Eye Diagram

The Generate Eye Diagram feature allows you to create read and write eye diagrams for each pin in your memory interface, for Arria 10 and later families. The Generate Eye Diagram feature uses calibration data patterns to determine margins at each Vref setting on both the FPGA pins and the memory device pins. A full calibration is done for each Vref setting. Other settings, such as DQ delay chains, change for each calibration. At the end of a Generate Eye Diagram command, a default calibration is run to restore original behavior.

The Generate Eye Diagram feature is available for DDR4 and QDR-IV protocols, on Arria 10 devices.

17.8 Driver Margining for Arria 10 EMIF IP

The Driver Margining feature lets you measure margins on your memory interface using a driver with arbitrary traffic patterns. Margins measured with this feature may differ from margins measured during calibration, because of different traffic patterns. Driver margining is not available if ECC is enabled.

To use driver margining, ensure that the following signals on the driver are connected to In-System Sources/Probes:

• Reset_n: An active-low reset signal.
• Pass: A signal which indicates that the driver test has completed successfully. No further memory transactions must be sent after this signal is asserted.
• Fail: A signal which indicates that the driver test has failed. No further memory transactions must be sent after this signal is asserted.
• PNF (Pass Not Fail): An array of signals that indicate the pass/fail status of individual bits of a data burst. The PNF should be arranged such that each bit index corresponds to (bit of burst × DQ width) + (DQ pin); a sketch of this index calculation follows below. A 1 indicates pass; a 0 indicates fail. If the PNF width exceeds the capacity of one In-System Probe, specify them in PNF[1] and PNF[2]; otherwise, leave them blank.

If you are using the example design for EMIF, you can enable the In-System Sources/Probes by adding the following line to your .qsf file:

set_global_assignment -name VERILOG_MACRO "ALTERA_EMIF_ENABLE_ISSP=1"
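The PNF bit ordering can be computed directly from the formula above. The following minimal C sketch illustrates the index calculation; the function name and the example widths are illustrative assumptions, not part of the generated design.

#include <stdio.h>

/* Index of the PNF bit for a given beat of the burst and DQ pin,
 * per the formula (bit of burst * DQ width) + (DQ pin). */
static unsigned pnf_index(unsigned beat, unsigned dq_pin, unsigned dq_width)
{
    return beat * dq_width + dq_pin;
}

int main(void)
{
    /* Example: a 32-bit DQ bus with burst length 8. The PNF bit for beat 2
     * of DQ pin 5 is (2 * 32) + 5 = 69; the full PNF is 8 * 32 = 256 bits. */
    printf("PNF bit = %u\n", pnf_index(2, 5, 32));
    return 0;
}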
17.8.1 Determining Margin

The Driver Margining feature lets you measure margins on your Arria 10 EMIF IP interface using a driver with arbitrary traffic patterns. The Driver Margining feature is available only for DDR3 and DDR4 interfaces on Arria 10 devices, when ECC is not enabled.

1. Establish a connection to the desired interface and ensure that it has calibrated successfully.
2. Select Driver Margining from the Commands folder under the target interface connection.
3. Select the appropriate In-System Sources/Probes using the drop-down menus.
4. If required, set additional options in the Advanced Options section:
   • Specify Traffic Generator 2.0 to allow margining on a per-rank basis. Otherwise, margining is performed on all ranks together.
   • Step size specifies the granularity of the driver margining process. Larger step sizes allow faster margining but reduced accuracy. It is recommended to omit this setting.
   • Adjust delays after margining causes delay settings to be adjusted to the center of the window based on driver margining results.
   • The Margin Read, Write, Write DM, and DBI checkboxes allow you to control which settings are tested during driver margining. You can uncheck boxes to allow driver margining to complete more quickly.
5. Click OK to run the tests.

The toolkit measures margins for DQ read/write and DM. The process may take several minutes, depending on the margin size and the duration of the driver tests. The test results are available in the Margin Report.

17.9 Read Setting and Apply Setting Commands for Arria 10 EMIF IP

The Read Setting command allows you to read calibration settings directly from the EMIF PHY. The Apply Setting command allows you to write calibration settings, to override existing settings for testing purposes.

17.9.1 Reading or Applying Calibration Settings

The Read Setting and Apply Setting commands let you read and write calibration settings directly. The Read Setting and Apply Setting commands are available only for DDR3 and DDR4 interfaces on Arria 10 devices.

1. Establish a connection to the desired interface.
2. Select Read Setting or Apply Setting from the Settings folder under the target interface connection.
3. Select the desired setting type.
4. Select the desired rank shadow register to modify. (Leave this field blank if it is not applicable.)
5. Select the index of the pin or group to modify.
6. (For the Apply Setting command) Enter the new value to apply to the desired location.
7. Click OK.

The setting is read (or applied) using a read/write at the address indicated in the Tcl command window. You can perform similar transactions using the On-Chip Debug Port.

17.10 Traffic Generator 2.0

The Traffic Generator 2.0 lets you emulate traffic to the external memory, and helps you test, debug, and understand the performance of your external memory interface on hardware in a standalone fashion, without having to incorporate your entire design. The Traffic Generator 2.0 lets you customize the data patterns written to the memory, the address locations accessed in the memory, and the order of write and read transactions. You can use the traffic generator code with any FPGA architecture and memory protocol.

17.10.1 Configuring the Traffic Generator 2.0

The traffic generator replaces user logic to generate traffic to the external memory. You must incorporate the traffic generator design into the EMIF IP design during IP generation. When you generate the example design in the parameter editor, the traffic generator module and EMIF IP are generated together. If you have an example design with the Traffic Generator 2.0 enabled, you can configure the traffic pattern using the EMIF Debug Toolkit.

Figure 214. Traffic Generator 2.0 Generated with EMIF IP in Example Design Mode

Generating the External Memory Interface

1. Select the FPGA and Memory parameters.
2. On the Diagnostics tab, configure the following parameters:
   a. Select Use Configurable Avalon Traffic Generator 2.0.
   b. Configure the desired traffic pattern, by specifying traffic patterns to be bypassed. The traffic pattern that is not bypassed is issued to the memory immediately after completion of calibration.
You can choose to bypass any of the following traffic patterns:
      • Bypass the default traffic pattern. Specifies not to use the default traffic patterns from the traffic generator. The default patterns include single read/write, byte-enabled read/write, and block read/write.
      • Bypass the user-configured traffic stage. Specifies to skip the stage that uses the user-defined test bench file to configure the traffic generator in simulation.
      • Bypass the traffic generator repeated-writes/repeated-reads test pattern. Bypasses the traffic generator's repeat test stage, which causes every write and read to be repeated several times.
      • Bypass the traffic generator stress pattern. Bypasses a test stage intended to stress-test signal integrity and memory interface calibration.
      • Export Traffic Generator 2.0 configuration interface. Instantiates a port for traffic generator configuration. Use this port if the traffic generator is to be configured by user logic.
3. Click Generate Example Design to generate the EMIF IP, including the Traffic Generator 2.0 design, with the traffic pattern that you have configured.

Note: If you click the Generate HDL option instead, the Traffic Generator 2.0 design is not included in the generated IP.

Figure 215. Enabling the Traffic Generator 2.0 in the Parameter Editor

17.10.2 Running the Traffic Generator 2.0

You can use the EMIF Debug Toolkit to configure the traffic generator infrastructure to send custom traffic patterns to the memory.

1. Launch the EMIF Debug Toolkit by selecting Tools ➤ System Debugging Tools ➤ External Memory Interface Toolkit.
2. After you launch the toolkit, establish the following connections before running the custom traffic generator:
   • Initialize Connections
   • Link Project to Device
   • Connections
     — Create Memory Interface Connection
     — Create Traffic Generator Connection
3. Launch the Traffic Generator by selecting Traffic Generator ➤ Settings ➤ Run Custom Traffic Pattern.

17.10.3 Understanding the Custom Traffic Generator User Interface

The Custom Traffic Generator interface lets you configure data patterns, the bytes to be enabled, the addressing mode, and the order in which traffic is organized.

The interface has three tabs:
• Data tab
• Address tab
• Loops tab

Data Tab

The Data tab is divided into Data Pins and Data Mask Pins sections.

Figure 216. Data Tab

The Data Pins section helps with customizing the patterns selected for the data pins. You can choose between two options for Data Mode:
• PRBS: The default write data to all the data pins.
• Fixed Pattern: Lets you specify a pattern to be written to the memory.

Select the All Pins option when you want to write the same data pattern to all the data pins. If data must be individually assigned to the data pins, you must enter the data value for each individual pin. The width of the data entered is based on the AVL_TO_DQ_WIDTH_RATIO, which is based on the ratio of the memory clock to the user clock.

All data bytes are enabled by default; the Data Mask Pins section lets you disable any of the bytes if you want to. To disable data bytes individually, select Test data mask.
For the Data Mask Pins section, you can choose between two options for Data Mode:
• PRBS: Specifies a PRBS pattern to enable or disable data bytes. A 1 denotes a data byte enabled, while a 0 denotes a data byte being masked or disabled.
• Fixed Pattern: Lets you enable or disable individual bytes. You can apply byte enables to all pins or to individual bytes. A 1 denotes a data byte enabled, while a 0 denotes a data byte being masked or disabled.

Address Tab

The Address tab lets you configure sequential, random, or random sequential (where the initial start address is random, but sequential thereafter) addressing schemes. The Address tab is divided into Address Mode and Address Configuration sections.

Figure 217. Address Tab

The Address Mode section lets you specify the pattern of addresses generated to access the memory. You can choose between three address modes (a sketch of the three schemes appears at the end of this Address tab description):
• Sequential: Each address is incremented by the Sequential address increment value that you specify. You also specify the Start Address from which the increments begin.
• Random: Each address is generated randomly. (You set the number of random addresses on the Loops tab.)
• Random Sequential: Each address is generated randomly, and then incremented sequentially. You specify the number of sequential increments in the Number of sequential addresses field.

The Address Configuration section contains the settings with which you configure the address mode that you chose in the Address Mode section. The following settings are available:
• Start address: Specifies the starting address for Sequential address mode. The maximum address value that can be reached is 1FF_FFFF. (The Traffic Generator 2.0 accepts higher values, but wraps back to 0 after the maximum value has been reached.) The Start address setting applies only to Sequential address mode.
• Number of sequential addresses: Specifies the number of sequential addresses generated after the first random address. This setting applies only in Random sequential mode.
• Sequential address increment: Specifies the size of the increment between each address in Sequential address mode and Random sequential address mode.
• Return to start address: Specifies that the generated address value returns to the value entered in the Start Address field after a block of transactions to the memory has completed. This setting applies only to Sequential address mode.
• Address masking: Masking provides additional options for exploring certain specific address spaces in memory:
  — Disabled does not enable masking, and increments the address based on the selected Address Mode.
  — Fixed cycling allows you to restrict the addressing to a specific row or a specific bank, which you can specify in the corresponding Mask Value field.
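As a rough illustration of the three address modes described above, the following C sketch generates example address streams. The function names, the use of rand(), and the wrap at the 25-bit maximum are illustrative assumptions rather than the toolkit's actual implementation.

#include <stdio.h>
#include <stdlib.h>

#define ADDR_MAX 0x1FFFFFFu  /* 25-bit address space: wraps to 0 beyond this */

/* Sequential mode: each address is the previous one plus a fixed increment. */
static unsigned next_sequential(unsigned addr, unsigned increment)
{
    return (addr + increment) & ADDR_MAX;
}

/* Random sequential mode: a random initial address, then a run of
 * "number of sequential addresses" fixed increments from it. */
static void random_sequential(unsigned num_seq, unsigned increment)
{
    unsigned addr = (unsigned)rand() & ADDR_MAX;  /* random initial address */
    for (unsigned i = 0; i < num_seq; i++) {
        printf("0x%07X\n", addr);
        addr = next_sequential(addr, increment);
    }
}

int main(void)
{
    /* Sequential: start address 0x0 with an increment of 2. */
    unsigned addr = 0x0;
    for (int i = 0; i < 4; i++) {
        printf("0x%07X\n", addr);
        addr = next_sequential(addr, 2);
    }
    /* Random sequential: 4 sequential addresses after each random start. */
    random_sequential(4, 2);
    return 0;
}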
Loops Tab

The Loops tab lets you order the transactions to the memory as desired. A unit size of transactions to the memory is defined as a block; a block includes a set of write transaction(s) immediately followed by a set of read transaction(s).

Figure 218. Loops Tab

The Loops tab provides the following configuration options:
• Loops: Specifies the number of blocks of transactions to be sent to the memory. This option helps to extend the range of addresses that the controller can access. The address range is incremented as each loop is executed, unless you specify Return to Start Address, which causes each loop to begin from the same start address. The range of supported values for the Loops option is 1 to 4095.
• Writes per block: Specifies the size of the block (that is, the number of consecutive write operations that can be issued in a single block). The range of values for this option is as follows:
  — When address masking is disabled, the number of writes per block supported is 1 to 4094.
  — When address masking is enabled, the maximum number of writes issued inside a block is 255.
• Reads per block: Specifies the number of consecutive read operations that can be issued in a single block, immediately following the consecutive writes issued. The number of reads per block should be identical to the number of writes per block, because data mismatches can occur when the two values are not identical. The range of values for this option is as follows:
  — When address masking is disabled, the number of reads per block supported is 1 to 4094.
  — When address masking is enabled, the maximum number of reads issued inside a block is 255.
• Write repeats: Specifies the number of times each write command is issued in repetition to the same address. A maximum of 255 repeat write transactions can be issued. The repeat writes are issued immediately after the first write command has been issued.
• Read repeats: Specifies the number of times each read command is issued in repetition to the same address. A maximum of 255 repeat read transactions can be issued. The repeat reads are issued immediately after the first read command has been issued.
• Avalon burst length: Specifies the length of each Avalon burst. The value of this field should be less than the Sequential address increment specified on the Address tab. The numbers of write and read repeats default to 1 if the Avalon burst length is greater than 1.

17.10.4 Applying the Traffic Generator 2.0

You can apply the Traffic Generator 2.0 to run stress tests, to debug your hardware platform for signal integrity problems, and to emulate actual memory transactions. This topic presents some common applications where you can benefit by using the Traffic Generator 2.0.

Testing Signal Integrity with PRBS Data Pattern

You can apply PRBS data to the data pins to help emulate an actual traffic pattern to the memory interface. The traffic generator uses a PRBS7 data pattern as the default traffic pattern on the data pins, and can support PRBS-15 and PRBS-31. (A sketch of a PRBS7 generator appears after the next subsection.)

Debugging and Monitoring an Address for Reliable Data Capture

You can send a single write followed by multiple reads to a specific address to help debug and monitor a specific address for reliable data capture. You can do this with the following settings:
• Writes per block: 1
• Reads per block: 1
• Write repeats: 1
• Read repeats: 1 to 255

Figure 219. Configuring the Loops Tab for a Single Write Followed by Multiple Reads

If you specify a Loops value greater than 1, every block of write and multiple read transactions follows the same pattern. If there is a specific address to which this transaction must be issued, you should specify that address in the Start address field on the Address tab, with the Sequential address mode selected.
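For reference, the following minimal C sketch generates a PRBS7 bit stream using the common x^7 + x^6 + 1 polynomial. It only illustrates the kind of pseudo-random pattern involved; the traffic generator's actual seed and implementation are internal to the IP and are not shown here.

#include <stdio.h>
#include <stdint.h>

/* Produce one bit per call from a PRBS7 LFSR (polynomial x^7 + x^6 + 1).
 * The 7-bit state must be nonzero; the sequence repeats every 127 bits. */
static uint8_t prbs7_next(uint8_t *state)
{
    uint8_t s = *state;
    uint8_t newbit = ((s >> 6) ^ (s >> 5)) & 1;  /* feedback from taps 7 and 6 */
    *state = (uint8_t)(((s << 1) | newbit) & 0x7F);
    return newbit;
}

int main(void)
{
    uint8_t state = 0x7F;  /* any nonzero seed */
    for (int i = 0; i < 16; i++)
        printf("%u", prbs7_next(&state));
    printf("\n");
    return 0;
}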
Accessing Large Sections of Memory

The maximum number of unique addresses that can be written in one block is 4094. With the maximum Loops value of 4095, the address range that can be supported in one test is equal to the number of loops multiplied by the number of writes per block. Further address expansion can be achieved by changing the Start address value appropriately and reissuing the tests.

To continue addressing sections of the memory beyond the address range that can be specified in one set of toolkit configurations, you can incrementally access the next set of addresses in the memory by changing the Start address value. For example, in a memory where the row address width is 15, the bank address width is 3, and the column address width is 10, the total number of address locations is 2^15 × 2^3 × 2^10 = 2^28. The maximum number of address locations that can actually be accessed is limited by the width of the internal address bus, which is 25 bits wide, giving 2^25 = 33,554,432 locations.

For the example described above, you must set the following values on the Address tab:
• Select the Sequential address mode.
• Set the Start address to 0x00.
• Ensure that you do not select Return to start address.
• Ensure that you disable address masking for rank, row, bank, and bank group.

Figure 220. Address Configuration to Access the First Set of Addresses

You must also set the following values on the Loops tab:
• Set Loops to the maximum value of 4095.
• Set Writes per block to the maximum value of 4094.
• Set Reads per block to the maximum value of 4094.
• Set Write repeats to 1.
• Set Read repeats to 1.

Figure 221. Loop Configuration to Access the First Set of Addresses

Each iteration can access a maximum of 4095 × 4094 = 16,764,930 address locations (addresses 000_0000'h through FF_D001'h). To access the next 4095 × 4094 locations, repeat the same settings, except for the Start address value, which must be set to 16,764,930 (FF_D002'h). The same process can be repeated to access further memory locations. The maximum value supported is 25'h1FF_FFFF, which is equivalent to 33,554,432 locations inside the memory. A short sketch of this address-range arithmetic follows.

Figure 222. Address Configuration to Access the Second Set of Addresses
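The following minimal C sketch reproduces the arithmetic above for planning successive test runs. The helper layout is illustrative; only the constants (4095 loops, 4094 writes per block, and the 25-bit address bus) come from the text.

#include <stdio.h>

#define MAX_LOOPS        4095u
#define MAX_WRITES_BLOCK 4094u
#define ADDR_LIMIT       (1u << 25)   /* 25-bit internal address bus */

int main(void)
{
    unsigned per_run = MAX_LOOPS * MAX_WRITES_BLOCK;  /* 16,764,930 locations per run */
    unsigned start = 0;
    int run = 1;

    /* Print the Start address for each successive run until the
     * 33,554,432-location limit of the address bus is covered. */
    while (start < ADDR_LIMIT) {
        unsigned end = start + per_run - 1;
        if (end >= ADDR_LIMIT)
            end = ADDR_LIMIT - 1;
        printf("Run %d: start 0x%07X, last address 0x%07X\n", run++, start, end);
        start += per_run;
    }
    /* Run 1 covers 0x0000000..0x0FFD001, so run 2 starts at 0x0FFD002. */
    return 0;
}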
17.11 The Traffic Generator 2.0 Report

The traffic generator report provides information about the configuration of the Traffic Generator 2.0 and the result of the most recent run of traffic.

Understanding the Traffic Generator 2.0 Report

The traffic generator report contains the following information:
• A Pass flag value of 1 indicates that the run completed with no errors.
• A Fail flag value of 1 indicates that the run encountered one or more errors.
• The Failure Count indicates the number of read transactions where data did not match the expected value.
• The First failure address indicates the address corresponding to the first data mismatch.
• The Version indicates the version number of the traffic generator.
• The Number of data generators indicates the number of data pins at the memory interface.
• The Number of byte enable generators indicates the number of byte enable and data mask pins at the memory interface.
• The Rank Address, Bank address, and Bank group width values indicate the number of bits in the Avalon address corresponding to each of those components.
• The Data/Byte enable pattern length indicates the number of bits in the fixed pattern used on each data/byte enable pin.
• The PNF (pass not fail) value indicates the persistent pass/fail status for each bit in the Avalon data. It is also presented on a per-memory-pin basis for each beat within a memory burst.
• Fail Expected Data is the data that was expected on the first failing transaction (if applicable).
• Fail Read Data is the data that was received on the first failing transaction (if applicable).

17.12 Example Tcl Script for Running the EMIF Debug Toolkit

If you want, you can run the EMIF Debug Toolkit using a Tcl script. The following example Tcl script is applicable to all device families. The script opens a project, runs the debug toolkit, and writes the resulting calibration reports to a file. You should adjust the variables in the script to match your design. You can then run the script using the command quartus_sh -t example.tcl.

# Modify the following variables for your project
set project "ed_synth.qpf"
# Index of the programming cable. Can be listed using "get_hardware_names"
set hardware_index 1
# Index of the device on the specified cable. Can be listed using "get_device_names"
set device_index 1
# SOF file containing the EMIF to debug
set sof "ed_synth.sof"
# Connection ID of the EMIF debug interface. Can be listed using "get_connections"
set connection_id 2
# Output file
set report "toolkit.rpt"

# The following code opens a project and writes its calibration reports to a file.
project_open $project
load_package ::quartus::external_memif_toolkit
initialize_connections
set hardware_name [lindex [get_hardware_names] $hardware_index]
set device_name [lindex [get_device_names -hardware_name $hardware_name] $device_index]
link_project_to_device -device_name $device_name -hardware_name $hardware_name -sof_file $sof
establish_connection -id $connection_id
create_connection_report -id $connection_id -report_type summary
create_connection_report -id $connection_id -report_type calib
write_connection_target_report -id $connection_id -file $report

17.13 Calibration Adjustment Delay Step Sizes for Arria 10 Devices

Refer to the following tables for information on delay step sizes for calibration adjustment.

17.13.1 Addressing

Each reconfigurable feature of the interface has an associated memory address; however, this address is placement dependent. If Altera PHYLite for Parallel Interfaces IP cores and the Arria 10 External Memory Interfaces IP cores share the same I/O column, you must track the addresses of the interface lanes and the pins. Addressing is done at the 32-bit word boundary, where avl_address[1:0] are always 00.

Address Map

These points apply to the address map that follows (see also the address-packing sketch below):
• id[3:0] refers to the Interface ID parameter.
• lane_addr[7:0] refers to the address of a given lane in an interface. The Fitter sets this address value. You can query this in the Parameter Table Lookup Operation Sequence, as described in the Address Lookup section of the Intel PHYLite for Parallel Interfaces IP Core User Guide.
• pin[4:0] refers to the physical location of the pin in a lane. You can use the Fitter to automatically determine a pin location, or you can manually set the pin location through a .qsf assignment. Refer to the Parameter Table Lookup Operation Sequence, as described in the Address Lookup section of the Intel PHYLite for Parallel Interfaces IP Core User Guide, for more information.
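As referenced above, the Avalon debug addresses are bit-field concatenations. The following C sketch packs the fields for the pin output phase address, {id[3:0], 3'h4, lane_addr[7:0], pin[4:0], 8-bit offset}; the function name, the example values, and the treatment of the final 8-bit field as a passed-in offset are assumptions for illustration only.

#include <stdio.h>
#include <stdint.h>

/* Pack the 28-bit debug address {id[3:0], 3'h4, lane_addr[7:0],
 * pin[4:0], offset[7:0]} from the address map. Field positions:
 * id at bits [27:24], constant 3'h4 at [23:21], lane at [20:13],
 * pin at [12:8], byte offset at [7:0]. */
static uint32_t debug_pin_addr(uint32_t id, uint32_t lane, uint32_t pin, uint32_t offset)
{
    return ((id & 0xFu) << 24) |
           (0x4u << 21) |
           ((lane & 0xFFu) << 13) |
           ((pin & 0x1Fu) << 8) |
           (offset & 0xFFu);
}

int main(void)
{
    /* Example with hypothetical values: interface ID 0, lane address 0x12, pin 3. */
    printf("0x%07X\n", (unsigned)debug_pin_addr(0, 0x12, 3, 0xD0));
    return 0;
}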
Feature: Pin output phase
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], pin[4:0], 8'hD0}
• CSR address (R): {id[3:0], 3'h4, lane_addr[7:0], pin[4:0], 8'hE8}
• Fields¹: Phase Value (bits 12..0); Reserved (bits 31..13)
• Value: Minimum and maximum settings: refer to Table 137 on page 513. Incremental delay: 1/128th VCO clock period.
  Note: The pin output phase switches from the CSR value to the Avalon value after the first Avalon write. It is only reset to the CSR value on a reset of the interface.

Feature: Pin PVT compensated input delay
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, lgc_sel[1:0], pin_off[2:0], 4'h0}, where:
  — lgc_sel[1:0] is 2'b01 for DQ[5:0] and 2'b10 for DQ[11:6]
  — pin_off[2:0] is 3'h0 for DQ[0]/DQ[6], 3'h1 for DQ[1]/DQ[7], 3'h2 for DQ[2]/DQ[8], 3'h3 for DQ[3]/DQ[9], 3'h4 for DQ[4]/DQ[10], 3'h5 for DQ[5]/DQ[11]
• CSR address: Not supported
• Fields¹: Delay Value (bits 8..0); Reserved (bits 11..9); Enable (bit 12); Reserved (bits 31..13)
• Value: Minimum setting: 0. Maximum setting: 511. Incremental delay: 1/256th VCO clock period. Enable: 0 = delay value is 0; 1 = select delay value from the Avalon register.

Feature: Strobe PVT compensated input delay²
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, lgc_sel[1:0], 3'h6, 4'h0}, with lgc_sel[1:0] = 2'b01
• CSR address: Not supported
• Fields¹: Delay Value (bits 9..0); Reserved (bits 11..10); Enable (bit 12); Reserved (bits 31..13)
• Value: Minimum setting: 0. Maximum setting: 1023. Incremental delay: 1/256th VCO clock period. Enable: 0 = select delay value from the CSR register (the CSR value is set through the Capture Strobe Phase Shift parameter during IP core instantiation); 1 = select delay value from the Avalon register.

Feature: Strobe enable phase²
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, lgc_sel[1:0], 3'h7, 4'h0}, with lgc_sel[1:0] = 2'b01
• CSR address (R): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h198}
• Fields¹: Phase Value (bits 12..0); Reserved (bits 14..13); Enable (bit 15); Reserved (bits 31..16)
• Value: Minimum and maximum settings: refer to Table 137 on page 513. Incremental delay: 1/128th VCO clock period. Enable: 0 = select delay value from the CSR register; 1 = select delay value from the Avalon register.

Feature: Strobe enable delay²
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h008}
• CSR address (R): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h1A8}
• Fields¹: Delay Value (bits 5..0); Reserved (bits 14..6); Enable (bit 15); Reserved (bits 31..16)
• Value: Minimum setting: 0 external memory clock cycles. Maximum setting: 63 external memory clock cycles. Incremental delay: 1 external memory clock cycle. Enable: 0 = select delay value from the CSR register; 1 = select delay value from the Avalon register.

Feature: Read valid delay²
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h00C}
• CSR address (R): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h1A4}
• Fields¹: Delay Value (bits 6..0); Reserved (bits 14..7); Enable (bit 15); Reserved (bits 31..16)
• Value: Minimum setting: 0 external memory clock cycles. Maximum setting: 127 external memory clock cycles. Incremental delay: 1 external memory clock cycle. Enable: 0 = select delay value from the CSR register; 1 = select delay value from the Avalon register.

Feature: Internal VREF code
• Avalon address (R/W): {id[3:0], 3'h4, lane_addr[7:0], 4'hC, 9'h014}
• CSR address: Not supported
• Fields¹: VREF Code (bits 5..0); Reserved (bits 31..6)
• Value: Refer to Calibrated VREF Settings in the Intel PHYLite for Parallel Interfaces IP Core User Guide.

1. Reserved bit ranges must be zero.
2. Modifying these values must be done on all lanes in a group.

Note: For more information about performing various clocking and delay calculations, depending on the interface frequency and rate, refer to PHYLite_delay_calculations.xlsx.

17.13.2 Output and Strobe Enable Minimum and Maximum Phase Settings

When dynamically reconfiguring the interpolator phase settings, keep the values within the ranges below to ensure proper operation of the circuitry.

Table 137. Output and Strobe Enable Minimum and Maximum Phase Settings

VCO Multiplication Factor | Core Rate | Min Phase (Output) | Min Phase (Bidirectional) | Min Phase (Bidirectional with OCT Enabled) | Maximum Interpolator Phase
1 | Full | 0x080 | 0x100 | 0x100 | 0xA80
1 | Half | 0x080 | 0x100 | 0x100 | 0xBC0
1 | Quarter | 0x080 | 0x100 | 0x100 | 0xA00
2 | Full | 0x080 | 0x100 | 0x180 | 0x1400
2 | Half | 0x080 | 0x100 | 0x180 | 0x1400
2 | Quarter | 0x080 | 0x100 | 0x180 | 0x1400
4 | Full | 0x080 | 0x100 | 0x280 | 0x1FFF
4 | Half | 0x080 | 0x100 | 0x280 | 0x1FFF
4 | Quarter | 0x080 | 0x100 | 0x280 | 0x1FFF
8 | Full | 0x080 | 0x100 | 0x480 | 0x1FFF
8 | Half | 0x080 | 0x100 | 0x480 | 0x1FFF
8 | Quarter | 0x080 | 0x100 | 0x480 | 0x1FFF

For more information about performing various clocking and delay calculations, depending on the interface frequency and rate, refer to PHYLite_delay_calculations.xlsx.

17.14 Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces

The External Memory Interface Debug Toolkit is not directly compatible with Arria 10 HPS interfaces. To debug your Arria 10 HPS interface using the EMIF Debug Toolkit, you should create an identically parameterized, non-HPS version of your interface, and apply the EMIF Debug Toolkit to that interface. When you finish debugging this non-HPS interface, you can then apply any needed changes to your HPS interface, and continue your design development.

17.15 Document Revision History

May 2017, version 2017.05.08:
• Added Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces topic.
• Added Calibration Adjustment Delay Step Sizes for Arria 10 Devices topic.
• Replaced EMIF Configurable Traffic Generator 2.0 section with new Traffic Generator 2.0 section.
• Rebranded as Intel.

October 2016, version 2016.10.31: Maintenance release.
May 2016, version 2016.05.02:
• Added additional option to step 1 of Establishing Communication to Connections.
• Added sentence to second bullet in Eye Diagram.
• Expanded step 4 and added step 5, in Determining Margin.
• Added Configuring the Traffic Generator 2.0 and The Traffic Generator 2.0 Report.

November 2015, version 2015.11.02:
• Changed title of Architecture section to User Interface.
• Added sentence to Driver Margining section stating that driver margining is not available if ECC is enabled.
• Removed note that the memory map for Arria 10 On-Chip Debug would be available in a future release.
• Created separate On-Chip Debug sections for UniPHY-based EMIF IP and Arria 10 EMIF IP.
• Changed title of Driver Margining (Arria 10 only) section to Driver Margining for Arria 10 EMIF IP.
• Changed title of Read Setting and Apply Setting Commands (Arria 10 only) to Read Setting and Apply Setting Commands for Arria 10 EMIF IP.
• Added section Example Tcl Script for Running the EMIF Debug Toolkit.
• Changed instances of Quartus II to Quartus Prime.

May 2015, version 2015.05.04:
• Added Determining the Failing Calibration Stage for a Cyclone V or Arria V HPS SDRAM Controller.
• Changed occurrences of On-Chip Debug Toolkit to On-Chip Debug Port.
• Added Driver Margining (Arria 10 only) and Determining Margin.
• Added Read Setting and Apply Setting Commands (Arria 10 only) and Reading or Applying Calibration Settings.

December 2014, version 2014.12.15:
• Added paragraph to step 5 of Generating IP With the Debug Port.
• Added mention of seq_debug_clk and seq_debug_reset_in to step 5 of Generating IP With the Debug Port.

August 2014, version 2014.08.15: Maintenance release.

December 2013, version 2013.12.16: Maintenance release.

November 2012, version 2.2:
• Changes to Setup and Use and General Workflow sections.
• Added EMIF On-Chip Debug Toolkit section.
• Changed chapter number from 11 to 13.

August 2012, version 2.1: Added table of debugging tips.

June 2012, version 2.0:
• Revised content for new UniPHY EMIF Toolkit.
• Added Feedback icon.

November 2011, version 1.0: Harvested 11.0 DDR2 and DDR3 SDRAM Controller with UniPHY EMIF Toolkit content.

18 Upgrading to UniPHY-based Controllers from ALTMEMPHY-based Controllers

The following topics describe the process of upgrading to UniPHY from DDR2 or DDR3 SDRAM High-Performance Controller II with ALTMEMPHY designs.

Note: Designs that do not use the AFI cannot be upgraded to UniPHY. If your design uses non-AFI IP cores, Intel recommends that you start a new design with the UniPHY IP core. In addition, Intel recommends that any new designs targeting Stratix III, Stratix IV, or Stratix V use the UniPHY datapath.

To upgrade your ALTMEMPHY-based DDR2 or DDR3 SDRAM High-Performance Controller II design to a DDR2 or DDR3 SDRAM controller with UniPHY IP core, you must complete the tasks listed below:
1. Generating Equivalent Design
2. Replacing the ALTMEMPHY Datapath with UniPHY Datapath
3. Resolving Port Name Differences
4. Creating OCT Signals
5. Running Pin Assignments Script
6. Removing Obsolete Files
7. Simulating your Design

The following topics describe these tasks in detail.
17.14 Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces

The External Memory Interface Debug Toolkit is not directly compatible with Arria 10 HPS interfaces. To debug your Arria 10 HPS interface using the EMIF Debug Toolkit, create an identically parameterized, non-HPS version of your interface and apply the EMIF Debug Toolkit to that interface. When you finish debugging this non-HPS interface, you can apply any needed changes to your HPS interface and continue your design development.

17.15 Document Revision History

May 2017 (version 2017.05.08):
• Added Using the EMIF Debug Toolkit with Arria 10 HPS Interfaces topic.
• Added Calibration Adjustment Delay Step Sizes for Arria 10 Devices topic.
• Replaced EMIF Configurable Traffic Generator 2.0 section with new Traffic Generator 2.0 section.
• Rebranded as Intel.
October 2016 (version 2016.10.31): Maintenance release.
May 2016 (version 2016.05.02):
• Added additional option to step 1 of Establishing Communication to Connections.
• Added sentence to second bullet in Eye Diagram.
• Expanded step 4 and added step 5, in Determining Margin.
• Added Configuring the Traffic Generator 2.0 and The Traffic Generator 2.0 Report.
November 2015 (version 2015.11.02):
• Changed title of Architecture section to User Interface.
• Added sentence to Driver Margining section stating that driver margining is not available if ECC is enabled.
• Removed note that the memory map for Arria 10 On-Chip Debug would be available in a future release.
• Created separate On-Chip Debug sections for UniPHY-based EMIF IP and Arria 10 EMIF IP.
• Changed title of Driver Margining (Arria 10 only) section to Driver Margining for Arria 10 EMIF IP.
• Changed title of Read Setting and Apply Setting Commands (Arria 10 only) to Read Setting and Apply Setting Commands for Arria 10 EMIF IP.
• Added section Example Tcl Script for Running the EMIF Debug Toolkit.
• Changed instances of Quartus II to Quartus Prime.
May 2015 (version 2015.05.04):
• Added Determining the Failing Calibration Stage for a Cyclone V or Arria V HPS SDRAM Controller.
• Changed occurrences of On-Chip Debug Toolkit to On-Chip Debug Port.
• Added Driver Margining (Arria 10 only) and Determining Margin.
• Added Read Setting and Apply Setting Commands (Arria 10 only) and Reading or Applying Calibration Settings.
December 2014 (version 2014.12.15):
• Added paragraph to step 5 of Generating IP With the Debug Port.
• Added mention of seq_debug_clk and seq_debug_reset_in to step 5 of Generating IP With the Debug Port.
August 2014 (version 2014.08.15): Maintenance release.
December 2013 (version 2013.12.16): Maintenance release.
November 2012 (version 2.2):
• Changes to Setup and Use and General Workflow sections.
• Added EMIF On-Chip Debug Toolkit section.
• Changed chapter number from 11 to 13.
August 2012 (version 2.1): Added table of debugging tips.
June 2012 (version 2.0):
• Revised content for new UniPHY EMIF Toolkit.
• Added Feedback icon.
November 2011 (version 1.0): Harvested 11.0 DDR2 and DDR3 SDRAM Controller with UniPHY EMIF Toolkit content.

18 Upgrading to UniPHY-based Controllers from ALTMEMPHY-based Controllers

The following topics describe the process of upgrading to UniPHY from DDR2 or DDR3 SDRAM High-Performance Controller II with ALTMEMPHY designs.

Note: Designs that do not use the AFI cannot be upgraded to UniPHY. If your design uses non-AFI IP cores, Intel recommends that you start a new design with the UniPHY IP core. In addition, Intel recommends that any new designs targeting Stratix III, Stratix IV, or Stratix V use the UniPHY datapath.

To upgrade your ALTMEMPHY-based DDR2 or DDR3 SDRAM High-Performance Controller II design to a DDR2 or DDR3 SDRAM controller with UniPHY IP core, you must complete the tasks listed below:
1. Generating Equivalent Design
2. Replacing the ALTMEMPHY Datapath with UniPHY Datapath
3. Resolving Port Name Differences
4. Creating OCT Signals
5. Running Pin Assignments Script
6. Removing Obsolete Files
7. Simulating your Design
The following topics describe these tasks in detail.

Related Links
• Generating Equivalent Design on page 516
  Create a new DDR2 or DDR3 SDRAM controller with UniPHY IP core by following the steps in Implementing and Parameterizing Memory IP, applying the guidelines in that topic.
• Replacing the ALTMEMPHY Datapath with UniPHY Datapath on page 516
  To replace the ALTMEMPHY datapath with the UniPHY datapath, follow the steps in that topic.
• Resolving Port Name Differences on page 517
  Several port names in the ALTMEMPHY datapath differ from those in the UniPHY datapath.
• Creating OCT Signals on page 518
  In ALTMEMPHY-based designs, the Quartus Prime Fitter creates the alt_oct block outside the IP core and connects it to the oct_ctl_rs_value and oct_ctl_rt_value signals.
• Running Pin Assignments Script on page 518
  Remap your design by running analysis and synthesis.
• Removing Obsolete Files on page 519
  After you upgrade the design, you may remove the unnecessary ALTMEMPHY design files from your project.
• Simulating your Design on page 519
  You must use the UniPHY memory model to simulate your new design.

18.1 Generating Equivalent Design

Create a new DDR2 or DDR3 SDRAM controller with UniPHY IP core by following the steps in Implementing and Parameterizing Memory IP, applying the following guidelines:
• Specify the same variation name as the ALTMEMPHY variation.
• Specify a directory different from the ALTMEMPHY design directory, to prevent files from overwriting each other during generation.

To ease the migration process, ensure that the UniPHY-based design you create is as similar as possible to the existing ALTMEMPHY-based design. In particular, ensure that the following settings are the same in your UniPHY-based design:
• PHY Settings tab:
  - FPGA speed grade
  - PLL reference clock
  - Memory clock frequency
  - There is no need to change the default Address and command clock phase settings; however, if you have board skew effects in your ALTMEMPHY design, enter the difference between that clock phase and the default clock phase into the Address and command clock phase settings.
• Memory Parameters tab—all parameters must match.
• Memory Timing tab—all parameters must match.
• Board settings tab—all parameters must match.
• Controller settings tab—all parameters must match.

Note: In ALTMEMPHY-based designs you can turn off dynamic OCT. However, all UniPHY-based designs use dynamic parallel OCT, and you cannot turn it off.
Related Links
Implementing and Parameterizing Memory IP

18.2 Replacing the ALTMEMPHY Datapath with UniPHY Datapath

To replace the ALTMEMPHY datapath with the UniPHY datapath, follow these steps:
1. In the Quartus Prime software, open the Assignment Editor: on the Assignments menu, click Assignment Editor.
2. Manually delete all of the assignments related to the external memory interface pins, except for the location assignments if you are preserving the pinout. By default, these pin names start with the mem prefix, though in your design they may have a different name.
3. Remove the old ALTMEMPHY .qip file from the project, as follows:
   a. On the Assignments menu, click Settings.
   b. Specify the old .qip, and click Remove.
Your design now uses the UniPHY datapath.

18.3 Resolving Port Name Differences

Several port names in the ALTMEMPHY datapath differ from those in the UniPHY datapath. The different names may cause compilation errors. This topic describes the changes you must make in the RTL for the entity that instantiates the memory IP core. Each change applies to a specific port in the ALTMEMPHY datapath. Unconnected ports require no changes. In some instances, multiple ports in ALTMEMPHY-based designs map to a single port in UniPHY-based designs. If you use both such ports in an ALTMEMPHY-based design, assign a temporary signal to the common port and connect it to the original wires. The following table shows the changes you must make.

Table 138. Changes to ALTMEMPHY Port Names

aux_full_rate_clk: The UniPHY-based design does not generate this signal. You can generate it if you require it.
aux_scan_clk: The UniPHY-based design does not generate this signal. You can generate it if you require it.
aux_scan_clk_reset_n: The UniPHY-based design does not generate this signal. You can generate it if you require it.
dll_reference_clk: The UniPHY-based design does not generate this signal. You can generate it if you require it.
dqs_delay_ctrl_export: This signal is for DLL sharing between ALTMEMPHY instances and is not applicable to UniPHY-based designs.
local_address: Rename to avl_addr.
local_be: Rename to avl_be.
local_burstbegin: Rename to avl_burstbegin.
local_rdata: Rename to avl_rdata.
local_rdata_valid: Rename to avl_rdata_valid.
local_read_req: Rename to avl_read_req.
local_ready: Rename to avl_ready.
local_size: Rename to avl_size.
local_wdata: Rename to avl_wdata.
local_write_req: Rename to avl_write_req.
mem_addr: Rename to mem_a.
mem_clk: Rename to mem_ck.
mem_clk_n: Rename to mem_ck_n.
mem_dqsn: Rename to mem_dqs_n.
oct_ctl_rs_value: Remove from design (see "Creating OCT Signals").
oct_ctl_rt_value: Remove from design (see "Creating OCT Signals").
phy_clk: Rename to afi_clk.
reset_phy_clk_n: Rename to afi_reset_n.
local_refresh_ack, reset_request_n: The controller no longer exposes these signals to the top-level design, so comment out these outputs. If you need either signal, bring the wire out from the High-Performance Controller II entity in <project_directory>/<variation name>.v.
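A short sketch may make the rename pattern concrete. The following hypothetical instantiation fragment shows a UniPHY instance wired to the nets carried over from an ALTMEMPHY design, using the renames from Table 138. The variation name my_ddr3_uniphy, the instance name u_mem_if, and the reuse of the existing local_*/mem_*/phy_clk nets are illustrative assumptions, not generated names.

    // Inside the entity that previously instantiated the ALTMEMPHY core.
    my_ddr3_uniphy u_mem_if (
        // Avalon-MM user interface: local_* ports become avl_*
        .avl_addr        (local_address),
        .avl_be          (local_be),
        .avl_burstbegin  (local_burstbegin),
        .avl_rdata       (local_rdata),
        .avl_rdata_valid (local_rdata_valid),
        .avl_read_req    (local_read_req),
        .avl_ready       (local_ready),
        .avl_size        (local_size),
        .avl_wdata       (local_wdata),
        .avl_write_req   (local_write_req),
        // Clock and reset: phy_clk / reset_phy_clk_n become afi_clk / afi_reset_n
        .afi_clk         (phy_clk),
        .afi_reset_n     (reset_phy_clk_n),
        // Memory device pins: mem_addr, mem_clk, mem_clk_n, mem_dqsn renamed
        .mem_a           (mem_addr),
        .mem_ck          (mem_clk),
        .mem_ck_n        (mem_clk_n),
        .mem_dqs_n       (mem_dqsn)
        // oct_ctl_rs_value and oct_ctl_rt_value no longer exist; see
        // "Creating OCT Signals" for the new oct_rup/oct_rdn pins.
    );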
Related Links
Creating OCT Signals on page 518
In ALTMEMPHY-based designs, the Quartus Prime Fitter creates the alt_oct block outside the IP core and connects it to the oct_ctl_rs_value and oct_ctl_rt_value signals.

18.4 Creating OCT Signals

In ALTMEMPHY-based designs, the Quartus Prime Fitter creates the alt_oct block outside the IP core and connects it to the oct_ctl_rs_value and oct_ctl_rt_value signals. In UniPHY-based designs, the OCT block is part of the IP core, so the design no longer requires these two ports. Instead, the UniPHY-based design requires two additional ports, oct_rup and oct_rdn (for Stratix III and Stratix IV devices), or oct_rzqin (for Stratix V devices). You must create these ports in the instantiating entity as input pins and connect them to the UniPHY instance. Then route these pins to the top-level design and connect them to the OCT RUP and RDOWN resistors on the board. For information on OCT control block sharing, refer to "The OCT Sharing Interface" in this volume.
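As an illustration of this wiring, the following minimal Verilog sketch adds the calibration-resistor pins for a Stratix III or Stratix IV migration. The names my_top, my_ddr3_uniphy, and u_mem_if are hypothetical; a Stratix V design would expose a single oct_rzqin pin instead.

    // Hypothetical top-level fragment routing the OCT reference pins.
    module my_top (
        input wire oct_rup,   // route to the external OCT RUP resistor
        input wire oct_rdn    // route to the external OCT RDOWN resistor
        // ... existing top-level ports continue here ...
    );

        // The OCT block is inside the UniPHY core, so the instance needs only
        // the calibration-resistor pins; the oct_ctl_rs_value and
        // oct_ctl_rt_value ports from the ALTMEMPHY design are removed.
        my_ddr3_uniphy u_mem_if (
            .oct_rup (oct_rup),
            .oct_rdn (oct_rdn)
            // ... remaining datapath and memory-pin connections ...
        );

    endmodule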
18.5 Running Pin Assignments Script

Remap your design by running analysis and synthesis. When analysis and synthesis completes, run the pin assignments Tcl script, and then verify the new pin assignments in the Assignment Editor.

18.6 Removing Obsolete Files

After you upgrade the design, you may remove the unnecessary ALTMEMPHY design files from your project. To identify these files, examine the original ALTMEMPHY-generated .qip file in any text editor.

18.7 Simulating your Design

You must use the UniPHY memory model to simulate your new design. To use the UniPHY memory model, follow these steps:
1. Edit your instantiation of the UniPHY datapath to ensure the local_init_done, local_cal_success, local_cal_fail, soft_reset_n, oct_rdn, oct_rup, reset_phy_clk_n, and phy_clk signals are at the top-level entity, so that an instantiating testbench can refer to those signals.
2. To use the UniPHY testbench and memory model, generate the example design when generating your IP instantiation.
3. Specify that your third-party simulator should use the UniPHY testbench and memory model instead of the ALTMEMPHY memory model, as follows:
   a. On the Assignments menu, click Settings.
   b. Select the Simulation tab, click Test Benches, click Edit, and replace the ALTMEMPHY testbench files with the following files:
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\altera_avalon_clock_source.sv or \<project directory>\<variation name>_example_design\simulation\vhdl\submodules\altera_avalon_clock_source.vhd
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\altera_avalon_reset_source.sv or \<project directory>\<variation name>_example_design\simulation\vhdl\submodules\altera_avalon_reset_source.vhd
      • \<project directory>\<variation name>_example_design\simulation\verilog\<variation name>_example_sim.v or \uniphy\<variation name>_example_design\simulation\vhdl\<variation name>_example_sim.vhd
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\verbosity_pkg.sv
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\status_checker_no_ifdef_params.sv or \<project directory>\<variation name>_example_design\simulation\vhdl\submodules\status_checker_no_ifdef_params.sv
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\alt_mem_if_common_ddr_mem_model_ddr3_mem_if_dm_pins_en_mem_if_dqsn_en.sv or \<project directory>\<variation name>_example_design\simulation\vhdl\submodules\alt_mem_if_common_ddr_mem_model_ddr3_mem_if_dm_pins_en_mem_if_dqsn_en.sv
      • \<project directory>\<variation name>_example_design\simulation\verilog\submodules\alt_mem_if_ddr3_mem_model_top_ddr3_mem_if_dm_pins_en_mem_if_dqsn_en or \<project directory>\<variation name>_example_design\simulation\vhdl\submodules\alt_mem_if_ddr3_mem_model_top_ddr3_mem_if_dm_pins_en_mem_if_dqsn_en
4. Open the <variation name>_example_sim.v file and find the UniPHY-generated simulation example design module name, <variation name>_example_sim_e0.
5. Change this module name to the name of your top-level design module.
6. Refer to the following table and update the listed port names of the example design in the UniPHY-generated <variation name>_example_sim.v file.

Table 139. Example Design Port Names

pll_ref_clk: Rename to clock_source.
mem_a: Rename to mem_addr.
mem_ck: Rename to mem_clk.
mem_ck_n: Rename to mem_clk_n.
mem_dqs_n: Rename to mem_dqsn.
drv_status_pass: Rename to pnf.
afi_clk: Rename to phy_clk.
afi_reset_n: Rename to reset_phy_clk_n.
drv_status_fail: This signal is not available, so comment out this output.
afi_half_clk: This signal is not exposed to the top-level design, so comment out this output.

For more information about generating example simulation files, refer to Simulating Memory IP, in volume 2 of the External Memory Interface Handbook.

18.8 Document Revision History

May 2017 (version 2017.05.08): Rebranded as Intel.
October 2016 (version 2016.10.31): Maintenance release.
May 2016 (version 2016.05.02): Maintenance release.
November 2015 (version 2015.11.02): Changed instances of Quartus II to Quartus Prime.
May 2015 (version 2015.05.04): Maintenance release.
December 2014 (version 2014.12.15): Maintenance release.
August 2014 (version 2014.08.15): Maintenance release.
December 2013 (version 2013.12.16): Removed local_wdata_req from port names table.
November 2012 (version 2.3): Changed chapter number from 12 to 14.
June 2012 (version 2.2): Added Feedback icon.
November 2011 (version 2.1): Revised Simulating your Design section.