External Memory Interface Handbook
Volume 1: Altera Memory Solution Overview and Design Flow

101 Innovation Drive
San Jose, CA 95134
www.altera.com

EMI_GS-1.0
Document last updated for Altera Complete Design Suite version: 11.1
Document publication date: November 2011
© 2011 Altera Corporation. All rights reserved. ALTERA, ARRIA, CYCLONE, HARDCOPY, MAX, MEGACORE, NIOS, QUARTUS and STRATIX words and logos are trademarks of Altera Corporation and registered in the U.S. Patent and Trademark Office and in other countries. All other words and logos identified as trademarks or service marks are the property of their respective holders as described at www.altera.com/common/legal.html. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.
ISO 9001:2008 Registered
Contents
Chapter Revision Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Section I. Altera Memory Solution Overview and Design Flow
Chapter 1. Introduction to Altera Memory Solution
Soft and Hard Memory IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–1
Memory Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–2
Low Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–4
Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–5
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–6
Chapter 2. Recommended Design Flow
Select Your Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Select Your FPGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Planning Pin and FPGA Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Determine Board Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Implementing and Parameterizing Memory IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Simulating Memory IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Analyzing Timing of Memory IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Perform Post-Fit Timing Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Debugging Memory IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Design Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–8
Chapter Revision Dates
The chapters in this document, External Memory Interface Handbook, Volume 1:
Altera Memory Solution Overview and Design Flow, were revised on the following
dates. Where chapters or groups of chapters are available separately, part numbers are
listed.
Chapter 1. Introduction to Altera Memory Solution
Revised: November 2011
Part Number: EMI_GS_001

Chapter 2. Recommended Design Flow
Revised: November 2011
Part Number: EMI_GS_002
Section I. Altera Memory Solution Overview and Design Flow
This section provides an overview of Altera memory solutions and the recommended
memory IP design flow.
This section includes the following chapters:
■ Chapter 1, Introduction to Altera Memory Solution
■ Chapter 2, Recommended Design Flow
f For information about the revision history for chapters in this section, refer to
“Document Revision History” in each individual chapter.
1. Introduction to Altera Memory Solution
November 2011   EMI_GS_001-1.0
This chapter describes the memory solutions that Altera provides.
Altera provides the fastest, most efficient, and lowest-latency memory controllers. The controllers are designed to allow you to interface easily with today's higher-speed memories.
Altera supports a wide variety of memory interfaces suitable for applications ranging
from routers and switches to video cameras. You can easily implement Altera’s
intellectual property (IP) using the memory MegaCore functions through the
Quartus II software. The Quartus II software also provides an external memory
toolkit that helps you test the implementation of the IP in the FPGA device.
f Refer to the External Memory Interface Spec Estimator page for the maximum speeds supported by Altera FPGAs.
Soft and Hard Memory IP
Altera's latest devices, the 28-nm FPGAs, provide two types of memory solutions: soft memory IP and hard memory IP. Arria V and Cyclone V devices offer both soft and hard memory IP, while Stratix V devices offer only soft memory IP.
The soft memory IP gives you the flexibility to design your own interfaces to meet your system requirements while still benefiting from industry-leading performance.
The hard memory IP is designed to give you a complete out-of-the-box experience
when designing a memory controller.
Table 1–1 lists the features of the soft and hard memory IP.

Table 1–1. Features of the Soft and Hard Memory IP

Soft Memory IP:
■ Consists of a DDR2 or DDR3 SDRAM high-performance memory controller with UniPHY IP.
■ Has hardened read and write data paths to ensure your design meets timing at the highest speeds. The data paths include I/O, phase-locked loops (PLLs), a delay-locked loop (DLL), and read and write FIFO buffers.
■ Allows you to choose where to place the memory controller and to size the memory controller based on the system requirements, especially in Stratix V devices.

Hard Memory IP:
■ Consists of a DDR2 or DDR3 SDRAM high-performance memory controller with a hard UniPHY IP and a multiport front-end block.
■ Has a fixed location on the die and a fixed maximum width: ×32 for Arria V devices and ×16 for Cyclone V devices.
■ Runs at full rate to decrease latency and to minimize the required bus width of signals going into the core of the device.
■ Simplifies the overall memory design in Arria V and Cyclone V devices, and provides a truly out-of-the-box experience for every designer.
Figure 1–1 shows the hardened data paths in the soft memory IP of a Stratix V device.

Figure 1–1. Hardened Data Paths in the Soft Memory IP
(Block diagram of a Stratix V device interfacing to an external memory device. The I/O structure, including the DQ I/O, FIFO, DQS path, read path, write path, address/command path, and I/O block, together with the PLL, DLL, and clock generator, is labeled as hard IP; the UniPHY block with its calibration sequencer and reconfiguration logic, and the memory controller, are labeled as soft IP.)
Memory Solutions
Altera FPGAs achieve optimal memory interface performance with external memory
IP. The IP provides the following components:
■ Physical layer interface (PHY), which handles the timing on the data path itself.
■ Memory controller block, which implements all the memory commands and addresses.
■ Multiport front-end (MPFE) block, which allows multiple processes inside the FPGA device to share a common bank of memory. The MPFE block is a new feature in Arria V and Cyclone V devices.
These blocks are critical to the design and the use of the memory interface block.
Altera provides modular memory solutions that allow you to customize your
memory interface design to any of the following configurations:
■ PHY with your own controller
■ PHY with Altera controller
■ PHY with Altera controller and the MPFE block
You can also build a custom PHY, a custom controller, or both, as desired.
Table 1–2 shows the recommended memory types and controllers that Altera offers
with the PHY IP.
Table 1–2. Altera Memory Types, PHY, and Controllers in the Quartus II Software

Quartus II Version | Memory         | PHY IP                        | Controller IP (1)
11.1, 11.0         | DDR/DDR2/DDR3  | ALTMEMPHY (AFI)               | HPC II
                   | DDR2/DDR3      | UniPHY Nios-based Sequencer   | HPC II
                   | QDR II/QDR II+ | UniPHY RTL Sequencer          | QDR/RLD II controller
                   | RLDRAM II      | UniPHY RTL Sequencer          | QDR/RLD II controller
                   | Other          | ALTDQ_DQS (2), ALTDQ_DQS2 (3), Custom | Custom
10.1, 10.0         | DDR/DDR2/DDR3  | ALTMEMPHY (AFI)               | HPC, HPC II
                   | DDR2/DDR3      | UniPHY                        | HPC II
                   | QDR II/QDR II+ | UniPHY                        | QDR/RLD II controller
                   | RLDRAM II      | UniPHY                        | QDR/RLD II controller
                   | Other          | ALTDQ_DQS (2), ALTDQ_DQS2 (3), Custom | Custom
9.1                | DDR/DDR2/DDR3  | ALTMEMPHY (AFI)               | HPC, HPC II
                   | QDR II/QDR II+ | UniPHY                        | QDR II controller
                   | RLDRAM II      | UniPHY                        | RLDRAM II controller
                   | Other          | ALTDQ_DQS (2)                 | Custom
Note to Table 1–2:
(1) AFI = Altera PHY interface
(2) Applicable for Arria II, Stratix III, and Stratix IV devices.
(3) Applicable only for Arria V and Stratix V devices.
f For more information about the controllers with the UniPHY or the ALTMEMPHY IP,
refer to the Functional Descriptions section in Volume 3 of the External Memory Interface
Handbook.
For more information about the ALTDQ_DQS megafunction, refer to the ALTDLL and
ALTDQ_DQS Megafunctions User Guide.
For more information about the ALTDQ_DQS2 megafunction, refer to the
ALTDQ_DQS2 Megafunction User Guide.
For more information and a design example for a custom PHY, refer to the Design Example - Stratix III ALTDQ DQS DDR2 SDRAM page.
Low Latency
Altera offers low-latency solutions that are drastically better than its competitors'. Altera's 28-nm FPGA devices have a balanced clock network in the periphery to reduce switching noise. The hardened read data FIFO buffer guarantees timing and makes it easier for the Fitter to place the controller. Together with the latest UniPHY IP, these design changes drastically reduce latency.
Table 1–3 shows latency comparison for Altera and its closest competition.
Table 1–3. Latency Comparison for Quarter-Rate DDR3 SDRAM Controllers (Latency in Memory Clock Cycles)

Latency Type  | Competitor (1) | Altera | Advantage
Write Command | 46             | 29     | Altera
Read Command  | 46             | 29     | Altera
Read Data     | 31             | 11     | Altera

Note to Table 1–3:
(1) Does not include AXI latency.
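To put the cycle counts in perspective, they can be converted to absolute time once a memory clock frequency is chosen. The 533-MHz memory clock used below (DDR3-1066) is an illustrative assumption, not a figure from this handbook:

  Read data latency (Altera):     11 cycles / 533 MHz ≈ 20.6 ns
  Read data latency (competitor): 31 cycles / 533 MHz ≈ 58.2 ns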
Efficiency
Altera memory controllers are also highly efficient. Figure 1–2 shows the memory
efficiency of a DDR3 SDRAM memory controller with UniPHY IP.
Figure 1–2. Memory Efficiency of DDR3 SDRAM Memory Controllers with UniPHY
(Bar chart of relative DDR3 SDRAM memory efficiency for the Altera controller versus a competitor across three traffic patterns: alternating read/write turnaround with random access, 50% read/write turnaround with 50% random access, and 0% read/write turnaround with 100% random access. The Altera controller is 14% to 28% more efficient across these patterns.)
Document Revision History
Table 1–4 shows the revision history for this document.
Table 1–4. Document Revision History

Date          | Version | Changes
November 2011 | 1.0     | Initial release.
2. Recommended Design Flow
November 2011   EMI_GS_002-2.1
This chapter describes the Altera-recommended design flow for successfully implementing external memory interfaces in Altera® devices. Altera recommends that you create an example top-level file with the desired pinouts and all interface IP instantiated, which enables the Quartus® II software to validate your design and resource allocation before PCB and schematic sign-off. Use the "Design Checklist" on page 2–5 to verify that you have performed all the recommended steps in creating a working and robust external memory interface.
Figure 2–1 shows the design flow to provide the fastest out-of-the-box experience
with external memory interfaces in Altera devices. This topic directs you where to
find information on how to perform each step of the recommended design flow. The
flow assumes that you are using Altera IP to implement the external memory
interface.
Figure 2–1. External Memory Interfaces Design Flowchart
(The flowchart shows the following sequence: start design; select your memory and FPGA; plan pin and FPGA resources; determine board layout, performing board-level simulations and adjusting termination and drive strength until the signals meet electrical requirements; instantiate the PHY and controller and specify parameters using either the MegaWizard flow or the Qsys or SOPC Builder flow, completing the SOPC Builder system where applicable (the SOPC Builder flow applies to UniPHY-based designs only on Arria II GX and Stratix IV devices); optionally perform functional simulation, debugging the design until simulation gives the expected results; add constraints and compile the design; verify timing, adjusting constraints and debugging until the design has positive margin; optionally perform post-fit timing simulation; verify design functionality on the board, debugging as needed; design done.)
Select Your Memory
When you select an external memory device, you must consider factors such as bandwidth, data storage capacity, latency, and power consumption.
f For more information about selecting your memory device, refer to the Selecting Your
Memory chapter in the External Memory Interface Handbook.
Select Your FPGA
Different Altera FPGAs support different memory types and configurations.
Depending on the requirements of your design, you need to determine the
appropriate FPGA.
f For more information about selecting your device, refer to the Selecting Your FPGA
chapter in the External Memory Interface Handbook.
Planning Pin and FPGA Resources
Before determining the board layout, you need to determine the usage of FPGA pins, phase-locked loops (PLLs), delay-locked loops (DLLs), and other resources.
f For more information about planning pins and resources, refer to the Planning Pin and
FPGA Resources chapter in the External Memory Interface Handbook.
Determine Board Layout
To improve signal integrity, you must consider the termination scheme that you use, the drive strength setting on the FPGA, and the loading seen by the driver. You must understand the tradeoffs between the different types of termination schemes and the effects of output drive strengths and loading to choose the best possible settings for your design.
f For more information about guidelines to determine your board layout for the
different memory controllers, refer to the following chapters in the External Memory
Interface Handbook:
■ DDR2 and DDR3 SDRAM Board Design Guidelines
■ Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines
■ RLDRAM II Board Design Guidelines
■ QDR II SRAM Board Design Guidelines
Implementing and Parameterizing Memory IP
After selecting the appropriate device and memory type, create a project in the
Quartus II software that targets the device and memory type.
When implementing and parameterizing external memory interfaces, Altera
recommends that you use Altera memory interface IP, which includes a PHY that you
can use with the Altera high-performance controller or with your own custom
controller.
f For more information about specifying parameters, refer to the Implementing and
Parameterizing Memory IP chapter in the External Memory Interface Handbook.
Simulating Memory IP
After implementing and parameterizing the memory IP, you need to perform
functional simulation.
f For more information about simulation, refer to the Simulating Memory IP chapter in
the External Memory Interface Handbook.
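As an illustration only, RTL functional simulation is typically scripted in a simulator such as ModelSim; the file names and testbench name in the sketch below are placeholders rather than names generated by any specific IP variation, and the Altera-generated simulation scripts perform the equivalent steps for you.

  # Minimal ModelSim sketch for RTL functional simulation (illustrative only; names are placeholders).
  vlib work                                             ;# create the working library
  vlog my_mem_if_example_top.v my_mem_if_tb.v           ;# compile the example design and testbench
  vsim -L altera_mf_ver -L altera_ver work.my_mem_if_tb ;# load the testbench with Altera simulation libraries
  run -all                                              ;# run until the testbench completes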
Analyzing Timing of Memory IP
To ensure your external memory interface meets the various timing requirements, you
need to analyze the timing paths, adjust constraints, and verify timing.
f For more information about analyzing timing, adjusting constraints, and verifying timing, refer to the Analyzing Timing of Memory IP chapter in the External Memory Interface Handbook.
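The wizard-generated .sdc file applies the detailed interface constraints for you; the fragment below is only a generic sketch of the kinds of SDC commands involved, with a hypothetical reference clock port name and an assumed 400-MHz clock, and is not a substitute for the generated constraints.

  # Generic SDC sketch (illustrative only; real designs use the wizard-generated .sdc).
  create_clock -name mem_ref_clk -period 2.5 [get_ports mem_ref_clk]  ;# assumed 400-MHz reference clock
  derive_pll_clocks          ;# create generated clocks for all PLL outputs
  derive_clock_uncertainty   ;# apply Altera-characterized clock uncertainty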
Perform Post-Fit Timing Simulation
This step is optional; it verifies that the IP operates correctly after fitting.
f For more information about simulating, refer to the Simulating Memory IP chapter in
the External Memory Interface Handbook.
Debugging Memory IP
You need to perform system-level verification to correlate the system against your design targets, using the Altera SignalTap® II logic analyzer.
f For more information about using the SignalTap II analyzer, refer to the Debugging
Memory IP chapter in the External Memory Interface Handbook.
Design Checklist
This topic contains a design checklist that you can use when implementing external
memory interfaces in Altera devices.
Select Your Memory
1. ☐ Select the memory interface frequency of operation and bus width.
   Reference: Selecting Your Memory chapter in the External Memory Interface Handbook.

Select Your FPGA
2. ☐ Select the FPGA device density and package combination that you want to target.
   Reference: Selecting Your FPGA chapter in the External Memory Interface Handbook.

Plan Pin and FPGA Resources
3. ☐ Ensure that the target FPGA device supports the desired clock rate and memory bus width, and that it has sufficient I/O pins for the DQ/DQS read and write groups.
   Reference: For detailed device resource information, refer to the relevant device handbook chapter on external memory interface support.
Determine Board Layout
References for items 4 through 8: DDR2 and DDR3 SDRAM Board Design Guidelines, Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines, RLDRAM II Board Design Guidelines, QDR II SRAM Board Design Guidelines, and Implementing and Parameterizing Memory IP chapters in the External Memory Interface Handbook.
4. ☐ Select the termination scheme and drive strength settings for all the memory interface signals on the memory side and the FPGA side.
5. ☐ Ensure you apply appropriate termination and drive strength settings on all the memory interface signals, and verify them using board-level simulations.
6. ☐ Use board-level simulations to pick the optimal setting for best signal integrity. On the memory side, Altera recommends the use of external parallel termination on input signals to the memory (write data, address, command, and clock signals).
7. ☐ Perform board-level simulations to ensure electrical and timing margins for your memory interface.
8. ☐ Ensure you have a sufficient eye opening using simulations. Use the latest FPGA and memory IBIS models, board trace characteristics, drive strength, and termination settings in your simulation. Use any board-level timing uncertainties that you calculate from simulations (for example, crosstalk, ISI, and slew rate deration) to adjust the input timing constraints, to ensure the accuracy of the Quartus II timing margin reports.
Parameterize and Implement the Memory IP
9. ☐ Parameterize and instantiate the Altera external memory IP for your target memory interface.
References for items 10 through 12: Functional Description: HPC II, Functional Description: QDR II and QDR II+ SRAM Controller, Functional Description: RLDRAM II Controller, and Simulating Memory IP chapters in the External Memory Interface Handbook.
10. ☐ Ensure that you perform the following actions:
   ■ Pick the correct memory interface data rates, width, and configurations.
   ■ For DDR, DDR2, and DDR3 SDRAM interfaces, ensure that you derate the tIS, tIH, tDS, and tDH parameters, as necessary.
   ■ Include the board skew parameters for your board.
11. ☐ Connect the PHY's local signals to your driver logic and the PHY's memory interface signals to top-level pins. Ensure that the local interface signals of the PHY are appropriately connected to your own logic. If the ALTMEMPHY IP is compiled without these local interface connections, you may encounter compilation problems when the number of signals exceeds the pins available on your target device. You may also use the example top-level file as an example of how to connect your own custom controller to the Altera memory PHY.
Perform Functional Simulation
12. ☐ Simulate your design using the RTL functional model. Use the IP functional simulation model with your own driver logic, testbench, and a memory model, to ensure correct read and write transactions to the memory. You may need to prepare the memory functional model by setting the speed grade and device bus mode.
Add Constraints
13. ☐ Add timing constraints. The wizard-generated .sdc file adds timing constraints to the interface. However, you may need to adjust these settings to best fit your memory interface configuration.
14. ☐ Add pin settings and DQ group assignments. The wizard-generated .tcl file includes I/O standard and pin loading constraints for your design (see the assignment sketch after item 19).
15. ☐ Ensure that generic pin names used in the constraint scripts are modified to match your top-level pin names. The loading on memory interface pins depends on your board topology (memory components).
16. ☐ Add pin location assignments. You need to assign the pin locations manually using the Pin Planner.
17. ☐ Ensure that the example top-level file or your top-level logic is set as the top-level entity.
18. ☐ Adjust optimization techniques to ensure the remaining unconstrained paths are routed with the highest speed and efficiency:
   a. On the Assignments menu, click Settings.
   b. Select Analysis & Synthesis Settings.
   c. Under Optimization Technique, select Speed.
   d. Expand Fitter Settings.
   e. Turn on Optimize Hold Timing and select All Paths.
   f. Turn on Optimize Fast Corner Timing.
   g. Under Fitter Effort, select Standard Fit.
19. ☐ Provide a board trace delay model. For accurate I/O timing analysis, specify the board trace and loading information in the Quartus II software. Derive and refine this information during your board development process, from prelayout (line) simulation through post-layout (board) simulation. Provide the board trace information for the output and bidirectional pins through the board trace model in the Quartus II software.
Compile Design and Verify Timing
References for items 20 through 23: Analyzing Timing of Memory IP, Simulating Memory IP, and Debugging Memory IP chapters in the External Memory Interface Handbook.
20. ☐ Compile your design and verify timing closure using all available models.
21. ☐ Run the wizard-generated <variation_name>_report_timing.tcl file to generate a custom timing report for each of your IP instances. Run this process across all device timing models (slow 0°C, slow 85°C, fast 0°C). See the sketch following item 23.
22. ☐ If there are timing violations, adjust your constraints to optimize timing.
23. ☐ As required, adjust PLL clock phase shift settings, or the appropriate timing and location assignments, to improve margins for the various timing paths within the IP.
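For illustration, the custom timing report in item 21 is typically run from the TimeQuest Timing Analyzer Tcl environment; the project name, script name, and variation name below are placeholders.

  # Illustrative TimeQuest (quartus_sta) sketch; run as: quartus_sta -t run_timing.tcl
  project_open my_project                    ;# open the compiled Quartus II project
  create_timing_netlist -model slow          ;# build the timing netlist for one timing corner
  read_sdc                                   ;# read the project SDC constraints
  update_timing_netlist                      ;# apply the constraints to the netlist
  source my_variation_report_timing.tcl      ;# run the wizard-generated custom timing report
  project_close
  # Repeat for the remaining device timing models (slow 0°C, slow 85°C, fast 0°C).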
Perform Post-Fit Timing Simulation
24. ☐ Perform post-fit timing simulation to ensure that all the memory transactions meet the timing specifications with the vendor's memory model.
Verify Design Functionality
25. ☐ Verify the functionality of your memory interface in the system.
Document Revision History
Table 2–1 shows the revision history for this document.
Table 2–1. Document Revision History

Date          | Version | Changes
November 2011 | 2.1     | Updated the design flow and the design checklist.
July 2010     | 2.0     | Updated for 10.0 release.
January 2010  | 1.1     | Improved description for Implementing Altera Memory Interface IP chapter; added timing simulation to flow chart and to design checklist.
November 2009 | 1.0     | First published.
External Memory Interface Handbook
Volume 2: Design Guidelines

101 Innovation Drive
San Jose, CA 95134
www.altera.com

EMI_DG-1.0
Document last updated for Altera Complete Design Suite version: 11.1
Document publication date: November 2011
Contents
Chapter Revision Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Section I. Design Flow Guidelines
Chapter 1. Selecting Your Memory
Memory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–1
DDR, DDR2, and DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
DDR SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
DDR2 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
DDR, DDR2, and DDR3 SDRAM Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–4
QDR, QDR II, and QDR II+ SRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–5
RLDRAM and RLDRAM II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–6
Memory Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–8
High-Speed Memory in Embedded Processor Application Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–9
High-Speed Memory in Telecom Application Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–11
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–13
Chapter 2. Selecting Your FPGA Device
Device Family Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–1
Cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–1
Memory Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–2
I/O Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–2
Wraparound Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Read and Write Leveling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Dynamic OCT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Device Settings Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Speed Grade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Operating Temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Package Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–4
Device Density and I/O Pin Counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
Device Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
I/O Pin Counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–6
Chapter 3. Planning Pin and FPGA Resources
Interface Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–1
DDR, DDR2, and DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–4
Clock Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–4
Command and Address Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
Data, Data Strobes, DM, and Optional ECC Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
DIMM Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
QDR II+ and QDR II SRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–8
Clock Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–8
Command Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–9
Address Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–9
Data and QVLD Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–10
RLDRAM II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–11
Clock Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–11
Commands and Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–11
Data, DM and QVLD Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–12
Maximum Number of Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–13
OCT Support for Arria II GX, Arria II GZ, Arria V, Cyclone V, Stratix III, Stratix IV, and Stratix V
Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–23
General Pin-out Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–24
Pin-out Rule Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–26
Exceptions for ×36 Emulated QDR II and QDR II+ SRAM Interfaces in Arria II, Stratix III and
Stratix IV Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–26
Timing Impact on x36 Emulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–29
Rules to Combine Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–29
Determining the CQ/CQn Arrival Time Skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–30
Exceptions for RLDRAM II Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–31
Interfacing with ×9 RLDRAM II CIO Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–32
Interfacing with ×18 RLDRAM II CIO Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–32
Interfacing with RLDRAM II ×36 CIO Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–33
Exceptions for QDR II and QDR II+ SRAM Burst-length-of-two Interfaces . . . . . . . . . . . . . . . . . . . 3–33
Pin Connection Guidelines Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–33
Additional Placement Rules for Cyclone III and Cyclone IV Devices . . . . . . . . . . . . . . . . . . . . . . . . 3–36
Additional Guidelines for Stratix V Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–41
Performing Manual Pin Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–41
PLLs and Clock Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–42
Using PLL Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–48
PLL Cascading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–49
DLL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–49
Other FPGA Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–51
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–51
Chapter 4. DDR2 and DDR3 SDRAM Board Design Guidelines
Leveling and Dynamic ODT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–2
Read and Write Leveling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–3
Dynamic ODT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Dynamic OCT in Stratix III and Stratix IV Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Dynamic OCT in Stratix V Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–7
Board Termination for DDR2 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–7
External Parallel Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–8
On-Chip Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–9
Recommended Termination Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–10
Dynamic On-Chip Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–15
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–15
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–18
On-Chip Termination (Non-Dynamic) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–19
Class II External Parallel Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–22
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–22
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–25
Class I External Parallel Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–27
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–28
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–29
Class I Termination Using ODT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–31
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–31
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–33
No-Parallel Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–33
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–33
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–35
Board Termination for DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–38
Single-Rank DDR3 SDRAM Unbuffered DIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–38
DQS, DQ, and DM for DDR3 SDRAM UDIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–40
Memory Clocks for DDR3 SDRAM UDIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–42
Commands and Addresses for DDR3 SDRAM UDIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–44
Stratix III, Stratix IV, and Stratix V FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–45
DQS, DQ, and DM for Stratix III, Stratix IV, and Stratix V FPGA . . . . . . . . . . . . . . . . . . . . . . . . . 4–45
Memory Clocks for Stratix III, Stratix IV, and Stratix V FPGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–46
Commands and Addresses for Stratix III and Stratix IV FPGA . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–46
Multi-Rank DDR3 SDRAM Unbuffered DIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–46
DDR3 SDRAM Registered DIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–48
DDR3 SDRAM Components With Leveling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–48
DDR3 SDRAM Components With or Without Leveling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–48
Stratix III, Stratix IV, and Stratix V FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–52
Drive Strength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–53
How Strong is Strong Enough? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–54
System Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–55
Component Versus DIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–55
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–55
FPGA Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–57
Single- Versus Dual-Rank DIMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–58
Single DIMM Versus Multiple DIMMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–60
Design Layout Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–60
Layout Guidelines for DDR2 SDRAM Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–61
Layout Guidelines for DDR3 SDRAM Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–64
Layout Guidelines for DDR3 SDRAM Wide Interface (>72 bits) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–68
Fly-By Network Design for Clock, Command, and Address Signals . . . . . . . . . . . . . . . . . . . . . . 4–68
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–71
Chapter 5. Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines
DDR2 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–1
Stratix II High Speed Board . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–2
Overview of ODT Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–3
DIMM Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–5
Dual-DIMM Memory Interface with Slot 1 Populated . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–5
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–5
Write to Memory Using an ODT Setting of 150 Ω
Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–7
Dual-DIMM with Slot 2 Populated . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–8
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–8
Write to Memory Using an ODT Setting of 150 Ω
Reading from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–10
Dual-DIMM Memory Interface with Both Slot 1 and Slot 2 Populated . . . . . . . . . . . . . . . . . . . . . . . 5–12
FPGA Writing to Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–12
Write to Memory in Slot 1 Using an ODT Setting of 75 Ω
Write to Memory in Slot 2 Using an ODT Setting of 75 Ω
Reading From Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–15
Dual-DIMM DDR2 Clock, Address, and Command Termination and Topology . . . . . . . . . . . . . . 5–20
Address and Command Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–21
Control Group Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–21
Clock Group Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–21
DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–22
Comparison of DDR3 and DDR2 DQ and DQS ODT Features and Topology . . . . . . . . . . . . . . . . . 5–22
Dual-DIMM DDR3 Clock, Address, and Command Termination and Topology . . . . . . . . . . . . . . 5–23
Address and Command Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–23
Control Group Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–23
Clock Group Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–23
Write to Memory in Slot 1 Using an ODT Setting of 75 Ω With One Slot Populated . . . . . . . . . . . . 5–24
Write to Memory in Slot 2 Using an ODT Setting of 75 Ω With One Slot Populated . . . . . . . . . . . . 5–25
Write to Memory in Slot 1 Using an ODT Setting of 150 Ω With Both Slots Populated . . . . . . . . . . 5–26
Write to Memory in Slot 2 Using an ODT Setting of 150 Ω With Both Slots Populated . . . . . . . . . . 5–27
Read from Memory in Slot 1 Using an ODT Setting of 150 Ω on Slot 2 with Both Slots Populated . . 5–28
Read From Memory in Slot 2 Using an ODT Setting of 150 Ω on Slot 1 With Both Slots Populated . . 5–29
FPGA OCT Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–30
Arria V, Cyclone V, Stratix III, Stratix IV, and Stratix V Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–30
Arria II GX Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–30
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5–31
Chapter 6. RLDRAM II Board Design Guidelines
I/O Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–1
RLDRAM II Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–2
Signal Terminations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–3
Outputs from the FPGA to the RLDRAM II Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–5
Input to the FPGA from the RLDRAM II Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–10
Termination Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–11
PCB Layout Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–12
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6–14
Chapter 7. QDR II SRAM Board Design Guidelines
I/O Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–1
QDR II SRAM Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–2
Signal Terminations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–4
Output from the FPGA to the QDR II SRAM Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–5
Input to the FPGA from the QDR II SRAM Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–13
Termination Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–17
PCB Layout Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–18
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7–22
Chapter 8. Implementing and Parameterizing Memory IP
Installation and Licensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–1
Free Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–2
OpenCore Plus Time-Out Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–2
Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–3
MegaWizard Plug-In Manager Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–4
Specifying Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–4
Constraining the Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–6
Add Pins and DQ Group Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–6
Compiling the Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–7
SOPC Builder Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–8
Specifying Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–8
Completing the SOPC Builder System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–9
Qsys System Integration Tool Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–9
Specify Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–10
Complete the Qsys System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–11
Qsys and SOPC Builder Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–12
Generated Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–29
Generated Files for Memory Controllers with the ALTMEMPHY IP . . . . . . . . . . . . . . . . . . . . . . . . 8–30
Generated Files for Memory Controllers with the UniPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–34
Parameterizing Memory Controllers with ALTMEMPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–38
Memory Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–38
Show in ‘Memory Preset’ List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–39
Memory Presets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–39
Preset Editor Settings for DDR and DDR2 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–40
Preset Editor Settings for DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–45
Derating Memory Setup and Hold Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–50
PHY Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–52
Board Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–54
Controller Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–55
Parameterizing Memory Controllers with UniPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–57
PHY Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–57
Memory Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–61
DDR2 and DDR3 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–61
QDR II and QDR II+ SRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–63
RLDRAM II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–64
Memory Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–65
Board Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–66
Setup and Hold Derating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–67
Intersymbol Interference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–69
Board Skews . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–70
Controller Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–76
Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–79
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8–80
Chapter 9. Simulating Memory IP
Memory Simulation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–1
Simulation Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–1
Simulation Walkthrough with UniPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–3
Simulation Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–3
Preparing the Vendor Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–4
Functional Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–7
Verilog HDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–7
VHDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–8
Simulating the Example Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–9
Abstract PHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–9
PHY-Only Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–10
Post-fit Functional Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–11
Simulation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–13
Simulation Walkthrough with ALTMEMPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–15
Before Simulating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–16
Preparing the Vendor Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–17
Simulating Using NativeLink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–19
IP Functional Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–20
VHDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–20
Verilog HDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–22
Simulation Tips and Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–23
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–23
DDR3 SDRAM (without Leveling) Warnings and Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–24
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9–25
Chapter 10. Analyzing Timing of Memory IP
Memory Interface Timing Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–2
Source-Synchronous Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–2
Calibrated Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–3
Internal FPGA Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–3
Other FPGA Timing Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–3
FPGA Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–4
Arria II Device PHY Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–4
Stratix III and Stratix IV PHY Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–6
Arria V, Cyclone V, and Stratix V Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–8
Cyclone III and Cyclone IV PHY Timing Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–9
Timing Constraint and Report Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–10
ALTMEMPHY Megafunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–10
UniPHY IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–12
Timing Analysis Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–13
Address and Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–14
PHY or Core . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–14
PHY or Core Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–14
Read Capture and Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–14
Cyclone III and Stratix III . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–14
Arria II, Arria V, Cyclone IV, Cyclone V, Stratix IV and Stratix V . . . . . . . . . . . . . . . . . . . . . . . . 10–21
Read Resynchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–22
Mimic Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–22
DQS versus CK—Arria II GX, Cyclone III, and Cyclone IV Devices . . . . . . . . . . . . . . . . . . . . . . . . 10–22
Write Leveling tDQSS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–23
Write Leveling tDSH/tDSS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–23
DK versus CK (RLDRAM II with UniPHY) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–23
Bus Turnaround Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–23
Timing Report DDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–24
Report SDC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–27
Calibration Effect in Timing Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–28
Calibration Emulation for Calibrated Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–28
Calibration Error or Quantization Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–28
Calibration Uncertainties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–28
Memory Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–28
Timing Model Assumptions and Design Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–29
Memory Clock Output Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–30
Cyclone III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–31
Stratix III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–31
Write Data Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–32
Cyclone III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–33
Stratix III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–33
Read Data Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–35
Cyclone III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–35
Stratix III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–36
Mimic Path Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–36
DLL Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–36
PLL and Clock Network Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–37
Stratix III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–37
Cyclone III Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–37
Timing Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–38
Common Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–38
Missing Timing Margin Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–38
Incomplete Timing Margin Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–38
Read Capture Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–38
Write Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–39
Address and Command Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–39
PHY Reset Recovery and Removal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–40
Clock-to-Strobe (for DDR and DDR2 SDRAM Only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–40
Read Resynchronization and Write Leveling Timing (for SDRAM Only) . . . . . . . . . . . . . . . . . 10–40
Optimizing Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–41
Timing Deration Methodology for Multiple Chip Select DDR2 and DDR3 SDRAM Designs . . . . . 10–43
Multiple Chip Select Configuration Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–43
ISI Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–44
Calibration Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–44
Board Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–45
Timing Deration using the Board Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–45
Slew Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–45
Intersymbol Interference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–48
Board Skews . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–49
Timing Deration Using the Excel-Based Calculator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–52
Before You Use the Excel-based Calculator for Timing Deration . . . . . . . . . . . . . . . . . . . . . . . . 10–52
Using the Excel-Based Calculator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–52
Using the Excel-based Calculator for Timing Deration (Without Board Trace Models) . . . . . 10–55
Performing I/O Timing Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–56
Perform I/O Timing Analysis with 3rd Party Simulation Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–56
Perform Advanced I/O Timing Analysis with Board Trace Delay Model . . . . . . . . . . . . . . . . . . . 10–56
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10–57
Chapter 11. Debugging Memory IP
Memory IP Debugging Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–1
Resource and Planning Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–2
Resource Issue Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–2
Dedicated IOE DQS Group Resources and Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–2
Dedicated DLL Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–3
Specific PLL Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–3
Specific Global, Regional and Dual-Regional Clock Net Resources . . . . . . . . . . . . . . . . . . . . . . . 11–3
Planning Issue Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–4
Interface Configuration Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–4
Performance Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–4
Bottleneck and Efficiency Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–5
Functional Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–6
Functional Issue Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–6
Correct Combination of the Quartus II Software and ModelSim-Altera Device Models . . . . . . 11–6
Altera IP Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–7
Vendor Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–7
Out of PC Memory Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–7
Transcript Window Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–8
Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–9
Modifying the Example Driver to Replicate the Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–9
Timing Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–10
Timing Issue Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–10
Timing Issue Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–11
FPGA Timing Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–11
External Memory Interface Timing Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–12
Verifying Memory IP Using the SignalTap II Logic Analyzer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–13
Monitoring Signals with the SignalTap II Logic Analyzer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–15
DDR, DDR2, and DDR3 ALTMEMPHY Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–15
UniPHY Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–17
Hardware Debugging Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–18
Create a Simplified Design that Demonstrates the Same Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–18
Measure Power Distribution Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–18
Measure Signal Integrity and Setup and Hold Margin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–18
Vary Voltage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–18
Use Freezer Spray and Heat Gun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–19
Operate at a Lower Speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–19
Find Out if the Issue Exists in Previous Versions of Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–19
Find out if the Issue Exists in the Current Version of Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–19
Try A Different PCB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–20
Try Other Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–20
Debugging Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–20
Categorizing Hardware Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–21
Signal Integrity Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–21
Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–21
Evaluating Signal Integrity Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–22
Hardware and Calibration Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–24
Hardware and Calibration Issue Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–24
Evaluating Hardware and Calibration Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–24
Intermittent Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–27
Intermittent Issue Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–27
Debug Toolkit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–28
ALTMEMPHY Debug Toolkit Overview and Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–28
UniPHY EMIF Debug Toolkit Overview and Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–29
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11–29
Section II. Miscellaneous Guidelines
Chapter 12. HardCopy Design Migration Guidelines
HardCopy Migration Design Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–1
Differences in UniPHY IP Generated with HardCopy Migration Support . . . . . . . . . . . . . . . . . . . . 12–3
ROM Loader for Designs Using Nios II Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–3
Passive Serial (PS) Configuration Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–5
Active Serial (AS) Configuration Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–5
Fast Passive Parallel (FPP) Configuration Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–5
PLL/DLL Run-time Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–6
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12–8
Chapter 13. Optimizing the Controller
Controller Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–1
Factors Affecting Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–2
Interface Standard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–2
Data Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–5
Ways to Improve Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–6
DDR2 SDRAM Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–6
Auto-Precharge Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–6
Additive Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–8
Bank Interleaving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–8
Additive Latency and Bank Interleaving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–11
User-Controlled Refresh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–13
Frequency of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–13
Burst Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–13
Series of Reads or Writes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–14
Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–14
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13–15
Chapter 14. PHY Considerations
Core Logic and User Interface Data Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–1
Hard and Soft Memory PHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–2
Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–3
PLL, DLL and OCT Resource Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–3
Pin Placement Consideration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–5
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14–5
Chapter 15. Power Estimation Methods for External Memory Interface Designs
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15–2
Chapter Revision Dates
The chapters in this document, External Memory Interface Handbook, were revised
on the following dates. Where chapters or groups of chapters are available separately,
part numbers are listed.
Chapter 1. Selecting Your Memory
Revised: November 2011
Part Number: EMI_DG_001-4.0
Chapter 2. Selecting Your FPGA Device
Revised: November 2011
Part Number: EMI_DG_002-4.0
Chapter 3. Planning Pin and FPGA Resources
Revised: November 2011
Part Number: EMI_DG_003-4.0
Chapter 4. DDR2 and DDR3 SDRAM Board Design Guidelines
Revised: November 2011
Part Number: EMI_DG_004-4.0
Chapter 5. Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines
Revised: November 2011
Part Number: EMI_DG_005-2.0
Chapter 6. RLDRAM II Board Design Guidelines
Revised: November 2011
Part Number: EMI_DG_006-3.0
Chapter 7. QDR II SRAM Board Design Guidelines
Revised: November 2011
Part Number: EMI_DG_007-4.0
Chapter 8. Implementing and Parameterizing Memory IP
Revised: November 2011
Part Number: EMI_DG_008-4.0
Chapter 9. Simulating Memory IP
Revised: November 2011
Part Number: EMI_DG_009-4.0
Chapter 10. Analyzing Timing of Memory IP
Revised: November 2011
Part Number: EMI_DG_010-4.0
Chapter 11. Debugging Memory IP
Revised: November 2011
Part Number: EMI_DG_011-4.0
Chapter 12. HardCopy Design Migration Guidelines
Revised: November 2011
Part Number: EMI_DG_012-2.0
Chapter 13. Optimizing the Controller
Revised: November 2011
Part Number: EMI_DG_013-2.0
Chapter 14. PHY Considerations
Revised: November 2011
Part Number: EMI_DG_014-1.0
Chapter 15. Power Estimation Methods for External Memory Interface Designs
Revised: November 2011
Part Number: EMI_DG_015-2.0
Section I. Design Flow Guidelines
This section provides guidelines on selecting your memory and FPGA device, planning pin
and FPGA resources, board design, and the memory IP design flow.
This section includes the following chapters:
■ Chapter 1, Selecting Your Memory
■ Chapter 2, Selecting Your FPGA Device
■ Chapter 3, Planning Pin and FPGA Resources
■ Chapter 4, DDR2 and DDR3 SDRAM Board Design Guidelines
■ Chapter 5, Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines
■ Chapter 6, RLDRAM II Board Design Guidelines
■ Chapter 7, QDR II SRAM Board Design Guidelines
■ Chapter 8, Implementing and Parameterizing Memory IP
■ Chapter 9, Simulating Memory IP
■ Chapter 10, Analyzing Timing of Memory IP
■ Chapter 11, Debugging Memory IP
f For information about the revision history for chapters in this section, refer to
“Document Revision History” in each individual chapter.
1. Selecting Your Memory
November 2011
EMI_DG_001-4.0
This chapter describes high-speed memory selection criteria based on the strengths and
weaknesses of each memory type, and the Altera® FPGA devices with which these memories
can interface. This chapter also describes each memory component's capabilities and
provides some typical applications in which these memories are used.
The Altera IP does not necessarily support all of the features that the memory device supports.
f For the maximum performance supported by Altera FPGAs, refer to the
External Memory Interface Spec Estimator page on the Altera website.
Memory Overview
System architects must resolve a number of complex issues in high-performance
system applications, ranging from the architecture and algorithms to the features of the
available components. Memory is typically one of the fundamental problems in these
applications, because the bottlenecks and challenges of system performance often reside
in the memory architecture. As external memories run at higher speeds, maintaining
signal integrity becomes more difficult; newer memory devices add several features to
address this issue. Altera FPGAs support these advancements with dedicated I/O
circuitry, support for a variety of I/O standards, and specialized intellectual
property (IP).
When you select an external memory device, consider the following factors:
■ Bandwidth and speed
■ Cost
■ Data storage size and capacity
■ Latency
■ Power consumption
Because no single memory type can excel in every area, system architects must
determine the right balance for their design.
Table 1–1 lists the two common types of high-speed memories and their
characteristics.
Table 1–1. Differences between DRAM and SRAM

DRAM
  Description: A dynamic random access memory (DRAM) cell consists of a capacitor and a single transistor. DRAM must be refreshed periodically to retain its data, resulting in lower overall efficiency and more complex controllers. Generally, designers select DRAM where cost per bit and capacity are important; DRAM is commonly used for main memory.
  Bandwidth and speed: Lower bandwidth, resulting in slower speed
  Cost: Lower cost
  Data storage size and capacity: Higher data storage and capacity
  Power consumption: Higher power consumption
  Latency: Higher latency

SRAM
  Description: A static random access memory (SRAM) cell consists of six transistors. SRAM does not need to be refreshed because the transistors continue to hold the data as long as the power supply is not cut off. Generally, designers select SRAM where speed is more important than capacity; SRAM is commonly used for cache memory.
  Bandwidth and speed: Higher bandwidth, resulting in faster speed
  Cost: Higher cost
  Data storage size and capacity: Lower data storage and capacity
  Power consumption: Lower power consumption
  Latency: Lower latency
DDR, DDR2, and DDR3 SDRAM
This section describes and compares the features of DDR, DDR2, and DDR3 SDRAM.
DDR SDRAM
DDR SDRAM is a 2n prefetch architecture with two data transfers per clock cycle. It
uses a single-ended strobe, DQS, which is associated with a group of data pins, DQ, for
read and write operations. Both DQS and DQ are bidirectional ports. Address ports are
shared for read and write operations.
The desktop computing market has positioned double data rate (DDR) SDRAM as a
mainstream commodity product, which means this memory is very low cost. DDR
SDRAM is also high density and low power. Relative to other high-speed memories,
DDR SDRAM has higher latency: it uses a multiplexed address bus, which reduces
the pin count (minimizing cost) at the expense of a longer and more complex bus
cycle.
DDR2 SDRAM
DDR2 SDRAM is the second generation of the DDR SDRAM standard. It is a 4n
prefetch architecture (internally the memory operates at half the interface frequency)
with two data transfers per clock cycle. DDR2 SDRAM can use a single-ended or
differential strobe, DQS or DQSn, which is associated with a group of data pins, DQ, for
read and write operations. The DQS, DQSn, and DQ are bidirectional ports. Address ports
are shared for read and write operations.
DDR2 SDRAM includes additional features such as increased bandwidth due to
higher clock speeds, improved signal integrity on DIMMs with on-die terminations,
and lower supply voltages to reduce power.
DDR3 SDRAM
DDR3 SDRAM is the latest generation of SDRAM. DDR3 SDRAM is internally
configured as an eight-bank DRAM and it uses an 8n prefetch architecture to achieve
high-speed operation. The 8n prefetch architecture is combined with an interface that
transfers two data words per clock cycle at the I/O pins. A single read or write
operation for DDR3 SDRAM consists of a single 8n-bit wide, four-clock data transfer
at the internal DRAM core and eight corresponding n-bit wide, one-half clock cycle
data transfers at the I/O pins. DDR3 SDRAMs are available as components and
modules, such as DIMMs, SODIMMs, and RDIMMs.
DDR3 SDRAM further increases system performance and maximum throughput, is more
effective at saving system power, and improves signal integrity with its fly-by
topology and dynamic on-die termination.
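The relationship between prefetch depth, memory core clock, and per-pin data rate can be summarized with a short calculation. The following minimal Python sketch is purely illustrative; the core frequencies shown are assumed round figures rather than values taken from a particular device datasheet.

    # Per-pin data rate for a double-data-rate interface with an n-bit prefetch:
    # the core fetches `prefetch` words per access, so the I/O pins can sustain
    # data rate = core clock x prefetch while the core runs comparatively slowly.
    def per_pin_data_rate_mbps(core_clock_mhz, prefetch):
        return core_clock_mhz * prefetch

    print(per_pin_data_rate_mbps(200, 2))   # DDR  (2n prefetch): 400 Mbps per pin
    print(per_pin_data_rate_mbps(133, 4))   # DDR2 (4n prefetch): ~533 Mbps per pin
    print(per_pin_data_rate_mbps(133, 8))   # DDR3 (8n prefetch): ~1,066 Mbps per pin

This is why the deeper 8n prefetch of DDR3 achieves a higher data rate without requiring a faster DRAM core.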
Read and write operations to the DDR3 SDRAM are burst oriented. Operation begins
with the registration of an active command, which is then followed by a read or write
command. The address bits registered coincident with the active command select the
bank and row to be activated (BA0 to BA2 select the bank; A0 to A15 select the row). The
address bits registered coincident with the read or write command select the starting
column location for the burst operation, determine if the auto precharge command is
to be issued (via A10), and select burst chop (BC) of 4 or burst length (BL) of 8 mode at
runtime (via A12), if enabled in the mode register. Before normal operation, the DDR3
SDRAM must be powered up and initialized in a predefined manner.
Differential strobes DQS and DQSn are mandated for DDR3 SDRAM and are associated
with a group of data pins, DQ, for read and write operations. DQS, DQSn, and DQ ports
are bidirectional. Address ports are shared for read and write operations. Write and
read operations are sent in bursts; DDR3 SDRAM supports BC of 4 and BL of 8.
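As an illustration of the address-bit usage described above, the following minimal Python sketch decodes the fields named in the preceding paragraph (BA0 to BA2, A0 to A15, A10, and A12). The column width and the burst encoding are assumptions made for this sketch only; they are not taken from the Altera controller, and the actual row and column widths of a device depend on its density and data width.

    # Illustrative decode of DDR3 command address bits (toy model, not IP code).
    def decode_activate(ba, a):
        # ACTIVATE command: BA[2:0] selects the bank, A[15:0] selects the row.
        return {"bank": ba & 0x7, "row": a & 0xFFFF}

    def decode_read_write(ba, a):
        # READ/WRITE command: starting column plus two control bits.
        # A12 selects BL8 (1) or BC4 (0) when on-the-fly burst mode is enabled.
        return {
            "bank": ba & 0x7,
            "column": a & 0x3FF,                    # assumed 10 column bits
            "auto_precharge": bool((a >> 10) & 1),  # A10 requests auto precharge
            "burst_length_8": bool((a >> 12) & 1),  # A12 on-the-fly burst select
        }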
1
The DDR3 SDRAM high-performance controller only supports local interfaces
running at half the rate of the memory interface.
f For more information, refer to the respective DDR, DDR2, and DDR3 SDRAM
datasheets.
f For more information about parameterizing the DDR2 and DDR3 SDRAM IP, refer to
the Implementing and Parameterizing Memory IP chapter.
DDR, DDR2, and DDR3 SDRAM Comparison
Table 1–2 compares DDR, DDR2, and DDR3 SDRAM features.
Table 1–2. DDR, DDR2, and DDR3 SDRAM Features
(Columns: Feature | DDR SDRAM | DDR2 SDRAM | DDR3 SDRAM | DDR3 SDRAM Advantage)

Voltage | 2.5 V | 1.8 V | 1.5 V | Reduces memory system power demand from DDR or DDR2 by 17%.
Density | 64 MB to 1 GB | 256 MB to 4 GB | 512 MB to 8 GB | High-density components simplify the memory subsystem.
Internal banks | 4 (fixed number of rows and columns) | 4 and 8 | 8 | Has higher page-to-hit ratio and better maximum throughput.
Bank interleaving | — | Allows bank interleaving | Allows bank interleaving | Is extremely effective for concurrent operations and can hide the timing overhead.
Prefetch | 2 | 4 | 8 | Lower memory core speed results in higher operating frequency and lower-power operation.
Speed | 100 to 200 MHz | 200 to 533 MHz | 300 to 1,066 MHz | Higher data rate.
Maximum frequency | 200 MHz or 400 Mbps per DQ pin | 533 MHz or 1,066 Mbps per DQ pin | 1,066 MHz or 2,133 Mbps per DQ pin | Higher data rate.
Read latency | 2, 2.5, 3 clocks | 3, 4, 5 clocks | 5, 6, 7, 8, 9, 10, and 11 clocks | Eliminating the half-clock setting allows the 8n prefetch architecture.
Additive latency (1) | — | 0, 1, 2, 3, 4 | 0, CL-1, or CL-2 | Improves command efficiency.
Write latency | One clock | Read latency – 1 | 5, 6, 7, or 8 clocks | Improves command efficiency.
CAS latency | 2, 2.5, 3 | 2, 3, 4, 5 | 5, 6, 7, 8, 9, 10 | Improves command efficiency.
Burst length | 2, 4, 8 | 4, 8 | 8 | Improves command efficiency.
Termination | PCB, discrete to VTT | Discrete to VTT or ODT | Discrete to VTT or ODT parallel termination; controlled impedance output | Improves signaling, eases PCB layout, reduces system cost.
ODT | — | ODT signal options of 50, 75, or 150 Ω on all DQ, DM, DQS, and DQSn signals | Parallel ODT options of RZQ/2, RZQ/4, or RZQ/6 Ω on all DQ, DM, DQS, and DQSn signals | DDR3 supports calibrated parallel ODT through an external resistor (RZQ) signal termination. DDR3 also supports dynamic ODT.
Data strobes | Single-ended | Differential or single-ended | Differential mandated | Improves timing margin.
Clock, address, and command (CAC) layout | Balanced tree | Balanced tree | Series or daisy chained | The DDR3 SDRAM read and write leveling feature allows for a much simplified PCB and DIMM layout. You can still optionally use the balanced tree topology by using DDR3 without the leveling option.

Note to Table 1–2:
(1) The Altera DDR and DDR2 SDRAM high-performance controllers do not support additive latency, but the high-performance controller II does.
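The DDR3 on-die termination values in Table 1–2 are expressed as fractions of the external reference resistor RZQ, which the JEDEC DDR3 standard defines as 240 Ω. The following short Python snippet is included only to make those table entries concrete:

    RZQ_OHMS = 240  # external ZQ calibration resistor defined by the DDR3 standard
    for divisor in (2, 4, 6):
        print("RZQ/%d = %d ohms" % (divisor, RZQ_OHMS // divisor))
    # Prints: RZQ/2 = 120 ohms, RZQ/4 = 60 ohms, RZQ/6 = 40 ohms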
QDR, QDR II, and QDR II+ SRAM
Quad Data Rate (QDR) SRAM has independent read and write ports that run
concurrently at double data rate. QDR SRAM is true dual-port (although the address
bus is still shared), which gives this memory a significantly higher bandwidth,
allowing back-to-back transactions without the contention issues that can occur when
using a single bidirectional data bus. Write and read operations share address ports.
The QDR II SRAM devices are available in ×8, ×9, ×18, and ×36 data bus width
configurations. The QDR II+ SRAM devices are available in ×9, ×18, and ×36 data bus
width configurations. Write and read operations are burst-oriented. All the data bus
width configurations of QDR II SRAM support burst lengths of two and four. QDR II+
SRAM supports only a burst length of four. Burst-of-two and burst-of-four for QDR II
and burst-of-four for QDR II+ SRAM devices provide the same overall bandwidth at a
given clock speed.
For QDR II SRAM devices, the read latency is 1.5 clock cycles, while for QDR II+
SRAM devices it is 2 or 2.5 clock cycles, depending on the memory device. For
QDR II+ and burst-of-four QDR II SRAM devices, the write commands and addresses
are clocked on the rising edge of clock and write latency is one clock cycle. For
burst-of-two QDR II SRAM devices, the write command is clocked on the rising edge
of clock and the write address is clocked on the falling edge of clock. Therefore, the
write latency is zero, because the write data is presented at the same time as the write
command.
November 2011
Altera Corporation
External Memory Interface Handbook
Volume 2: Design Guidelines
1–6
Chapter 1: Selecting Your Memory
Memory Overview
QDR II+ and QDR II SRAM interfaces use a delay-locked loop (DLL) inside the device
to edge-align the data with respect to the K and Kn or C and Cn pins. You can optionally
turn off the DLL, but the performance of the QDR II+ and QDR II SRAM devices is
degraded. All timing specifications listed in this document assume that the DLL is on.
QDR II+ and QDR II SRAM devices also offer programmable impedance output
buffers. You can set the buffers by terminating the ZQ pin to VSS through a resistor,
RQ. The value of RQ should be five times the desired output impedance. The range
for RQ should be between 175 Ω and 350 Ω with a tolerance of 10%.
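For example, the RQ rule can be checked with a few lines of Python; this is a minimal sketch, and the 50-Ω target is only an illustrative value:

    def rq_for_output_impedance(z_out_ohms):
        # RQ = 5 x desired output impedance; the device expects RQ between
        # 175 and 350 ohms (with 10% tolerance), tied from the ZQ pin to VSS.
        rq = 5 * z_out_ohms
        if not 175 <= rq <= 350:
            raise ValueError("RQ outside the supported 175-350 ohm range")
        return rq

    print(rq_for_output_impedance(50))  # 250 ohms for a 50-ohm output impedance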
QDR II/+ SRAM is best suited for applications where the required read/write ratio is
near one-to-one. QDR II/+ SRAM includes additional features such as increased
bandwidth due to higher clock speeds, lower voltages to reduce power, and on-die
termination to improve signal integrity. QDR II+ SRAM is the latest and fastest
generation. For QDR II+ and QDR II SRAM interfaces, Altera supports both 1.5-V and
1.8-V HSTL I/O standards.
f For more information, refer to the respective QDR II and QDR II+ datasheets.
f For more information about parameterizing the QDR II and QDR II+ SRAM IP, refer to
the Implementing and Parameterizing Memory IP chapter.
RLDRAM and RLDRAM II
Reduced latency DRAM II (RLDRAM II) is a DRAM-based point-to-point memory
device designed for communications, imaging, server systems, networking, and cache
applications requiring high density, high memory bandwidth, and low latency. The
fast random access speeds in RLDRAM II devices make them a viable alternative to
SRAM devices at a lower cost.
RLDRAM is partitioned into eight smaller banks. This partitioning reduces the
parasitic capacitance of the address and data lines, allowing faster accesses and
reducing the probability of random access conflicts. Also, most DRAM memory types
need both a row and column phase on a multiplexed address bus to support full
random access, while RLDRAM supports a non-multiplexed address, saving bus
cycles at the expense of more pins. RLDRAM utilizes higher operating frequencies
and uses the 1.8V High-Speed Transceiver Logic (HSTL) standard with DDR data
transfer to provide a very high throughput.
There are two types of RLDRAM II devices—common I/O (CIO) and separate I/O
(SIO). CIO devices share a single data I/O bus which is similar to the double data rate
(DDR) SDRAM interface. SIO devices, with separate data read and write buses, have
an interface similar to SRAM.
Compared to DDR SDRAM, RLDRAM II has simpler bank management and lower
latency inside the memory. RLDRAM II devices are divided into eight banks instead
of the typical four banks in most memory devices, providing a more efficient data
flow within the device. Each bank has a fixed number of rows and columns. Only one
row per bank is accessed at a time. The memory (instead of the controller) controls the
opening and closing of a row, which is similar to an SRAM interface. RLDRAM II
offers up to 2.4 gigabytes per second (GBps) of aggregate bandwidth.
RLDRAM II uses a DDR scheme, performing two data transfers per clock cycle.
RLDRAM II CIO devices use the bidirectional data pins (DQ) for both read and write
data, while RLDRAM II SIO devices use D pins for write data (input to the memory)
and Q pins for read data (output from the memory). Both types use two pairs of unidirectional free-running clocks. The memory uses DK and DK# pins during write
operations, and generates QK and QK# pins during read operations. In addition,
RLDRAM II uses the system clocks (CK and CK# pins) to sample commands and
addresses and generate the QK and QK# read clocks. Address ports are shared for write
and read operations.
The RLDRAM II SIO devices are available in ×9 and ×18 data bus width
configurations, while the RLDRAM II CIO devices are available in ×9, ×18, and
×36 data bus width configurations. RLDRAM II CIO interfaces may require an extra
cycle for bus turnaround time for switching read and write operations.
Write and read operations are burst oriented and all the data bus width configurations
of RLDRAM II support burst lengths of two and four. In addition, RLDRAM II
devices with data bus width configurations of ×9 and ×18 also support burst length of
eight.
The RLDRAM devices have up to five programmable configuration settings that
determine the row cycle times, read latency, and write latency of the interface at a
given frequency of operation.
RLDRAM II also offers programmable impedance output buffers and on-die
termination. The programmable impedance output buffers are for impedance
matching and are guaranteed to produce 25- to 60-ohm output impedance. The on-die
termination is dynamically switched on during read operations and switched off
during write operations. Perform an IBIS simulation to observe the effects of this
dynamic termination on your system. IBIS simulation can also show the effects of
different drive strengths, termination resistors, and capacitive loads on your system.
RLDRAM II devices use either the 1.5-V HSTL or 1.8-V HSTL I/O standard. You can
use either I/O standard to interface with Altera FPGAs.
f For more information, refer to the RLDRAM II datasheets.
f For more information about parameterizing the RLDRAM II IP, refer to the
Implementing and Parameterizing Memory IP chapter.
Memory Selection
One of the first considerations in choosing a high-speed memory is data bandwidth.
Based on the system requirements, determine an approximate data rate to the external
memory. You must also consider other memory attributes, including how much memory is
required (density), how much latency can be tolerated, the available power budget,
and whether the system is cost sensitive.
Table 1–3 lists the memory bandwidth, the features and target markets of each
technology.
Table 1–3. Memory Selection Overview (Part 1 of 2)

| Parameter | DDR3 SDRAM | DDR2 SDRAM | DDR SDRAM | RLDRAM II | QDR II/+ SRAM |
|---|---|---|---|---|---|
| Bandwidth for 32 bits (Gbps) (1) | 34.1 | 25.6 | 12.8 | 25.6 | 44.8 |
| Bandwidth at % efficiency (Gbps) (2) | 23.9 | 17.9 | 9 | 17.9 | 38.1 |
| Performance (clock frequency) | 400–1,066 MHz | 200–533 MHz | 100–200 MHz | 200–533 MHz | 154–350 MHz |
| Altera-supported data rate | Up to 2,133 Mbps | Up to 1,066 Mbps | Up to 400 Mbps | Up to 2,132 Mbps | Up to 1,400 Mbps |
| Density | 512 Mbytes–8 Gbytes; 32 Mbytes–8 Gbytes (DIMM) | 256 Mbytes–1 Gbytes; 32 Mbytes–4 Gbytes (DIMM) | 128 Mbytes–1 Gbytes; 32 Mbytes–2 Gbytes (DIMM) | 288 Mbytes, 576 Mbytes | 8–72 Mbytes |
| I/O standard | SSTL-15 Class I, II | SSTL-18 Class I, II | SSTL-2 Class I, II | HSTL-1.8V/1.5V | HSTL-1.8V/1.5V |
| Data width (bits) | 4, 8, 16 | 4, 8, 16 | 4, 8, 16, 32 | 9, 18, 36 | 8, 9, 18, 36 |
| Burst length | 8 | 4, 8 | 2, 4, 8 | 2, 4, 8 | 2, 4 |
| Number of banks | 8 | 8 (>1 GB), 4 | 4 | 8 | N/A |
| Row/column access | Row before column | Row before column | Row before column | Row and column together or multiplexed option | N/A |
| CAS latency (CL) | 5, 6, 7, 8, 9, 10 | 3, 4, 5 | 2, 2.5, 3 | 4, 6, 8 | N/A |
| Posted CAS additive latency (AL) | 0, CL-1, CL-2 | 0, 1, 2, 3, 4 | N/A | N/A | N/A |
| Read latency (RL) | RL = CL + AL | RL = CL + AL | RL = CL | RL = CL/CL + 1 | 1.5, 2, and 2.5 clock cycles |
| On-die termination | Yes | Yes | No | Yes | Yes |
| Data strobe | Differential bidirectional strobe only | Differential or single-ended bidirectional strobe | Single-ended bidirectional strobe | Free-running differential read and write clocks | Free-running read and write clocks |
| Refresh requirement | Yes | Yes | Yes | Yes | No |
| Relative cost comparison | Presently lower than DDR2 | Less than DDR SDRAM with market acceptance | Low | Higher than DDR SDRAM, less than SRAM | Highest |
Table 1–3. Memory Selection Overview (Part 2 of 2)

| Parameter | DDR3 SDRAM | DDR2 SDRAM | DDR SDRAM | RLDRAM II | QDR II/+ SRAM |
|---|---|---|---|---|---|
| Target market | Desktops, servers, storage, LCDs, displays, networking, and communication equipment | Desktops, servers, storage, LCDs, displays, networking, and communication equipment | Desktops, servers, storage, LCDs, displays, networking, and communication equipment | Main memory, cache memory, networking, packet processing, and traffic management | Cache memory, routers, ATM switches, packet memories, lookup, and classification memories |

Notes to Table 1–3:
(1) 32-bit data bus operating at the maximum supported frequency in a Stratix® IV FPGA.
(2) Assumes 70% efficiency for DDR memories, which takes into consideration the bus turnaround, refresh, burst length, and random access latency; assumes 85% efficiency for QDR memories.
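The efficiency-adjusted bandwidth figures in Table 1–3 follow directly from the raw 32-bit bandwidth and the efficiency assumptions in note (2). The following sketch (Python, for illustration only) reproduces the arithmetic; the per-pin data rates are back-calculated from the table's raw bandwidth row and are assumptions of this example, not device specifications.

```python
# Illustrative check of the Table 1-3 bandwidth figures (not an Altera tool).
# Raw bandwidth (Gbps) = per-pin data rate (Mbps) x 32 pins / 1000.
# Effective bandwidth  = raw bandwidth x efficiency (0.70 DDR-type, 0.85 QDR).

interfaces = {
    # name: (assumed per-pin data rate in Mbps, efficiency)
    "DDR3 SDRAM":    (1066, 0.70),
    "DDR2 SDRAM":    (800,  0.70),
    "DDR SDRAM":     (400,  0.70),
    "RLDRAM II":     (800,  0.70),
    "QDR II/+ SRAM": (1400, 0.85),  # read and write ports combined
}

for name, (rate_mbps, eff) in interfaces.items():
    raw_gbps = rate_mbps * 32 / 1000
    print(f"{name}: raw {raw_gbps:.1f} Gbps, effective {raw_gbps * eff:.1f} Gbps")
```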
Altera supports these memory interfaces, provides various IP for the physical
interface and the controller, and offers many reference designs (refer to Altera’s
Memory Solutions Center).
f For Altera support and the maximum performance for the various high-speed
memory interfaces, refer to the External Memory Interface Spec Estimator page on the
Altera website.
High-Speed Memory in Embedded Processor Application Example
In embedded processor applications—any system that uses processors, excluding
desktop processors—DDR SDRAM is typically used for main memory due to its very
low cost, high density, and low power. Next-generation processors invest a large amount of die area in on-chip cache memory to prevent the execution pipelines from sitting idle. Unfortunately, these on-chip caches are limited in size, because a balance of performance, cost, and power must be taken into consideration. In many systems, external memories are used to add another level of cache. In high-performance systems, three levels of cache memory are common: level one (8 Kbytes is common) and level two (512 Kbytes) on chip, and level three off chip (2 Mbytes).
High-end servers, routers, and even video game systems are examples of high-performance embedded products that require memory architectures that are both
high speed and low latency. Advanced memory controllers are required to manage
transactions between embedded processors and their memories. Altera Arria® series
and Stratix series FPGAs optimally implement advanced memory controllers by
utilizing their built-in DQS (strobe) phase shift circuitry. Figure 1–1 highlights some of
the features available in an Altera FPGA in an embedded application, where DDR2
SDRAM is used as the main memory and QDR II SRAM or RLDRAM II is an external
cache level.
Figure 1–1. Memory Controller Example Using FPGA
(Block diagram: an embedded processor connects through a processor interface and memory controller in the Altera FPGA to a 533-Mbps DDR2 SDRAM DIMM (1) over the DDR2 interface, and to RLDRAM II or QDR II SRAM over a second memory interface running 600-Mbps RLDRAM II (3) or 1-Gbps QDR II SRAM (4). The FPGA also uses 350-MHz embedded SRAM (2) and a PCI interface with PCI master/target cores capable of 64-bit, 66-MHz operation (1,361 LEs, 4% of an EP2S30) (5). IP is available for processor interfaces such as PowerPC, MIPS, and ARM.)
Notes to Figure 1–1:
(1) 533-Mbps DDR2 SDRAM operation using dedicated DQS circuitry, post-amble circuitry, automatic phase shifting, and six registers in the I/O element: 790 LEs, 3% of an EP2S30, and four clock buffers (for a 72-bit interface).
(2) High-speed memory interfaces such as QDR II SRAM require at least four clock buffers to handle all the different clock phases and data directions.
(3) 600-Mbps RLDRAM II operation: 740 logic elements (LEs), 3% of an EP2S30, and four clock buffers (for a 36-bit wide interface).
(4) Embedded SRAM with features such as true dual-port and 350-MHz operation allows complex "store and forward" memory controller architectures.
(5) The Quartus® II software reports the number of adaptive look-up tables (ALUTs) that the design uses in the FPGA. The LE count is based on this number of ALUTs.
One of the target markets of RLDRAM II and QDR/QDR II SRAM is external cache memory. RLDRAM II has a read latency close to SSRAM, but with the density of SDRAM. A single RLDRAM II device provides roughly sixteen times the external cache density of an SSRAM device. In contrast, consider QDR and QDR II SRAM for systems that require high bandwidth and minimal latency. Architecturally, the dual-port nature of QDR and QDR II SRAM allows cache controllers to handle read data and instruction fetches completely independently of writes.
High-Speed Memory in Telecom Application Example
Because telecommunication network architectures are becoming more complex,
high-end network systems are running multiple 10-Gbps line cards that connect to
multi-shelf switch fabrics scaling to Terabits per second. Figure 1–2 shows an example
of a typical system line interface card. These line cards offer interfaces ranging from a
single-port OC-192 to multi-port Gigabit Ethernet, and consist of a number of devices,
including a PHY/framer, network processors, traffic managers, fabric interface
devices, and high-speed memories.
Figure 1–2. Typical Telecom System Line Interface Card
(Block diagram of the telecom line card datapath: a PHY/framer feeds two parallel paths of pre-processor, network processor, and traffic manager into the switch fabric interface. Buffer memories attach to the pre-processors, network processors, and traffic managers, and lookup tables attach to the coprocessor and pre-processors.)
As packets traverse from the PHY/framer device to the switch fabric interface, they
are buffered into memories, while the data path devices process headers (determining
the destination, classifying packets, and storing statistics for billing) and control the
flow of packets into the network to avoid congestion. Typically DDR/DDR2/DDR3
SDRAM and RLDRAM II are used for large buffer memories off network processors,
traffic managers, and fabric interfaces, while QDR and QDR II SRAMs are used for
look-up tables (LUTs) off preprocessors and coprocessors.
In many designs, FPGAs connect devices together for interoperability and
coprocessing, implement features that are not supported by ASIC devices, or
implement a device function entirely. Altera Stratix series FPGAs implement traffic
management, packet processing, switch fabric interfaces, and coprocessor functions,
using features such as 1-Gbps LVDS I/O, high-speed memory interface support,
multi-gigabit transceivers, and IP cores. Figure 1–3 highlights some of these features
in a packet buffering application where RLDRAM II is used for packet buffer memory
and QDR II SRAM is used for control memory.
Figure 1–3. FPGA Example in Packet Buffering Application
(Block diagram: an Altera FPGA (1), (8) containing core logic, an RLDRAM II interface (2) to external RLDRAM II, a QDR II SRAM interface (7) to external QDR II SRAM, SPI 4.2 receive and transmit cores (5) with dedicated SERDES and DPA circuitry (3) and differential termination (4), and a PCI interface (6).)
Notes to Figure 1–3:
(1) As an example, 85% of the LEs are still available in an EP2S90.
(2) 600-Mbps RLDRAM II operation: 740 LEs, 1% of an EP2S90, and four clock buffers (for a 36-bit wide interface).
(3) Dedicated hardware SERDES and DPA circuitry allows clean and reliable implementation of 1-Gbps LVDS.
(4) Differential termination is built into Stratix FPGAs, simplifying board layout and improving signal quality.
(5) SPI 4.2 core capable of 1 Gbps: 5,178 LEs per Rx, 6,087 LEs per Tx, 12% of an EP2S90, and four clock buffers (for both directions using individual buffer mode, 32-bit data path, and 10 logical ports).
(6) PCI cores capable of 64-bit, 66-MHz operation: 656 LEs, 1% of an EP2S90 for a 32-bit target.
(7) 1-Gbps QDR II SRAM operation: 100 LEs, 0.1% of an EP2S90, and four clock buffers (for an 18-bit interface).
(8) Note that the Quartus II software reports the number of ALUTs that the design uses in Stratix II devices. The LE count is based on this number of ALUTs.
SDRAM is usually the best choice for buffering at high data rates due to the large
amounts of memory required. Some system designers take a hybrid approach to the
memory architecture, using SRAM to store the packet headers and DRAM to store the
payload. The depth of the memories depends on the architecture and throughput of
the system.
The buffer memory for the packet buffering application of an OC-192 line card
(approximately 10 Gbps) must be able to sustain a minimum of one write and one
read operation, which requires a memory bandwidth of 20 Gbps to operate at full line
rate (more bandwidth is required if the headers are modified). The bandwidth
requirement for memory is a key factor in memory selection (see Table 1–3). As an
example, a simple first-order calculation using RLDRAM II as buffer memory requires
a bus width of 48 bits to sustain 20 Gbps (300 MHz × 2 DDR × 0.70 efficiency × 48 bits
= 20.1 Gbps), which needs two RLDRAM II parts (one ×18 and one ×36). RLDRAM II
also inherently includes the additional memory bits used for parity or error correction
code (ECC).
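The same first-order arithmetic generalizes to other line rates: divide the required bandwidth by the per-pin throughput (clock frequency x 2 transfers per clock x assumed efficiency) to estimate the minimum bus width. The sketch below (Python, for illustration only) reproduces the 48-bit result under the 300-MHz, 70%-efficiency assumptions quoted above.

```python
import math

def min_bus_width(required_gbps, clock_mhz, efficiency):
    """First-order estimate of the data bus width needed to sustain a line rate.

    Per-pin throughput = clock (MHz) x 2 transfers/clock (DDR) x efficiency, in Mbps.
    """
    per_pin_mbps = clock_mhz * 2 * efficiency
    return math.ceil(required_gbps * 1000 / per_pin_mbps)

# OC-192 packet buffering: one write plus one read stream is about 20 Gbps total.
width = min_bus_width(required_gbps=20, clock_mhz=300, efficiency=0.70)
print(width)  # 48 -> matches the 48-bit RLDRAM II bus in the text
```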
QDR and QDR II SRAM have bandwidth and low random access latency advantages
that make them useful for control memory in queue management and traffic
management applications. Another typical implementation for this memory is billing
and packet statistics, where each packet requires counters to be read from memory,
incremented, and then rewritten to memory. The high bandwidth, low latency, and
optimal one-to-one read/write ratio make QDR SRAM ideal for this feature.
Document Revision History
Table 1–4 lists the revision history for this document.
Table 1–4. Document Revision History

| Date | Version | Changes |
|---|---|---|
| November 2011 | 4.0 | Moved and reorganized "Selecting your Memory" section to Volume 2: Design Guidelines. |
| June 2011 | 3.0 | Added "Selecting Memory IP" chapter from Volume 2, Section I. |
| December 2010, July 2010 | 2.1, 2.0 | Moved protocol-specific feature information to the memory interface user guides in Volume 3. Updated maximum clock rate information for 10.1. Added specifications for DDR2 and DDR3 SDRAM Controllers with UniPHY. Streamlined the specification tables. Added reference to the web-based Specification Estimator Tool. |
| January 2010 | 1.1 | Updated DDR, DDR2, and DDR3 specifications. |
| November 2009 | 1.0 | First published. |
2. Selecting Your FPGA Device
November 2011
EMI_DG_002-4.0
This chapter discusses the following topics about selecting the right Altera® FPGA
device for your external memory interface:
■ "Device Family Selection" on page 2–1
■ "Device Settings Selection" on page 2–4
f Use this document with the Planning Pin and FPGA Resources chapter, before you start
implementing your external memory interface.
Device Family Selection
Altera external memory interfaces support three FPGA device families—the Arria®, Stratix®, and Cyclone® device families. These device families vary in terms of cost, memory standards, speed grades, and features.
1 Use the Altera Product Selector to find and compare specifications and features of Altera devices.
The following sections describe the factors that you must consider when selecting an FPGA device family.
Cost
The cost of an FPGA is the main factor in selecting a device family that suits your
design. The Stratix FPGA family delivers the industry’s highest bandwidth and
density. It also has the highest level of system integration with ultimate flexibility at a
reduced cost, and the lowest total power for high-end applications. By combining
high density, high performance, and a rich feature set, the Stratix series FPGAs allow
you to integrate more functions and maximize system bandwidth.
Altera's Arria FPGA series is designed to deliver a balance between cost, power, and
performance. This device family targets cost- and power-sensitive
transceiver-based applications. The Arria FPGA series has a rich feature set of
memory, logic, and digital signal processing (DSP) blocks combined with superior
signal integrity of up to 10G transceivers that allows you to integrate more functions
and maximize system bandwidth.
The Cyclone FPGA series is designed for the lowest power consumption and
cost-sensitive design needs. The Cyclone FPGA family provides the market’s lowest
system cost and lowest power FPGA solution for applications in various fields.
Memory Standards
There are two common types of high-speed memories that are supported by Altera
devices—dynamic random access memory (DRAM) and static random access
memory (SRAM). The commonly used DRAM devices include DDR, DDR2, DDR3
SDRAM, and RLDRAM II, while SRAM devices include QDR II and QDR II+ SRAM.
f For more information about these memory types, refer to the Selecting Your Memory
chapter.
Different Altera FPGA devices support different memory types; not all Altera devices support all memory types and configurations. Before you start your design, you must select an Altera device that supports the memory standard and configurations you plan to use.
In addition, Altera's FPGA devices support various data widths for different memory interfaces. Memory interface support differs between device density and package combinations, so you must determine which FPGA device density and package combination suits your application.
f For more information about the supported memory types and configurations, refer to
the External Memory Interface Spec Estimator page on the Altera website.
I/O Interfaces
Ideally, any interface should reside entirely in a single bank. However, interfaces that span multiple adjacent banks or the entire side of a device are also fully supported. Interfaces that span across sides (top and bottom, or left and right) and wraparound interfaces are also supported, but may not provide the same level of performance.
Table 2–1 lists the location of interfaces for various device families.

Table 2–1. Location of I/O Interfaces (Part 1 of 2)

| Devices | Interface Location | Exceptions |
|---|---|---|
| Arria II GX (2) | top and bottom sides | — |
| Arria II GZ | top and bottom sides | — |
| Arria V (2) | top and bottom sides | 5AGXA1 and 5AGXA3 devices support interfaces on the left side. |
| Cyclone III | top and bottom sides | Cyclone III E devices support interfaces spanning the left and right sides. |
| Cyclone IV (2) | top and bottom sides | — |
| Stratix II (1) | all sides | — |
| Stratix III (1) | all sides | — |
| Stratix IV (1) | all sides | EP4SGX290 and EP4SGX360 devices do not support interfaces on the left and right sides. EP4SGX70, EP4SGX110, EP4SGX180, and EP4SGX230 devices do not support interfaces on the right side. |
Table 2–1. Location of I/O Interfaces (Part 2 of 2)

| Devices | Interface Location | Exceptions |
|---|---|---|
| Stratix V (2) | top and bottom sides | — |

Notes to Table 2–1:
(1) Although vertical and horizontal I/O timing parameters are not identical, timing closure can be achieved on all sides of the FPGA for the maximum interface frequency.
(2) There are no user I/O pins on the right and left sides of the device, other than the transceiver pins available in these devices.
f For more information about I/O interfaces supported for each device, refer to the
respective device handbooks.
Wraparound Interfaces
For maximum performance, Altera recommends that data groups for external memory interfaces always be placed within the same side of a device, ideally within a single bank. High-speed memory interfaces that use a top or bottom I/O bank versus a left or right I/O bank have different timing characteristics, so the timing margins are also different. However, Altera can support interfaces with data groups that wrap around a corner of the device between vertical and horizontal I/O banks at some speeds. In some devices, wraparound interfaces run at the same speed as row or column interfaces.
The Arria II GX, Cyclone III, and Cyclone IV devices can support wraparound interfaces across all sides of the device that are not used for transceivers. Other Altera devices only support interfaces with data groups that wrap around a corner of the device.
Read and Write Leveling
The Stratix III, Stratix IV, and Stratix V I/O registers include read and write leveling
circuitry to enable skew to be removed or applied to the interface on a DQS group
basis. There is one leveling circuit located in each I/O subbank.
1 UniPHY-based designs do not require read leveling circuitry during read leveling operation.
For more information about read and write leveling, refer to the "Leveling Circuitry" section in Volume 3: Reference Material of the External Memory Interface Handbook.
Dynamic OCT
The Arria II GZ, Arria V, Cyclone V, Stratix III, Stratix IV, and Stratix V devices
support dynamic calibrated OCT. This feature allows the specified series termination
to be enabled during writes, and parallel termination to be enabled during reads.
These I/O features allow you to simplify PCB termination schemes.
Device Settings Selection
After you have selected the appropriate FPGA device family for your memory
interface, configure the device settings of your selected FPGA device family to meet
your design needs.
Refer to the device ordering code and determine the appropriate device settings for
your target device family.
f For more information about the ordering code for your target device, refer to the
“Ordering Information” section in volume 1 of the respective device handbooks.
The following sections describe the ordering code and how to select the appropriate
device settings based on the ordering code to meet the requirements of your external
memory interface.
Speed Grade
The device speed grade affects the device timing performance, timing closure, and
power utilization. The device with the smallest speed grade number is the fastest, and vice versa. Generally, faster devices cost more.
Operating Temperature
The operating temperature of the FPGA is divided into the following categories:
■ Commercial grade—Used for all device families. The operating temperature ranges from 0°C to 85°C.
■ Industrial grade—Used for all device families. The operating temperature ranges from –40°C to 100°C.
■ Military grade—Used for the Stratix IV device family only. The operating temperature ranges from –55°C to 125°C.
■ Automotive grade—Used for the Cyclone IV and Cyclone V device families only. The operating temperature ranges from –40°C to 125°C.
Package Size
Each FPGA family offers a range of package sizes. Package size refers to the physical size of an FPGA device and corresponds to the pin count. For example, the package size for the smallest FPGA device in the Stratix IV family is 29 mm × 29 mm, categorized under the F780 package option, where F780 refers to a device with 780 pins.
f For more information about the package size available for your device, refer to the
respective device handbooks.
Device Density and I/O Pin Counts
FPGA devices of the same family and package size also vary in terms of device density and I/O pin count. For example, after you have selected the Stratix IV device family with the F780 package option, you must choose among device models that range from EP4SGX70 to EP4SGX230. Each of these devices has similar speed grades, ranging from grade 2 to grade 4, but differs in density.
Device Density
Device density refers to the amount of logic resources in the device, such as logic elements (LEs), PLLs, and memory blocks. An FPGA device with higher density contains more logic elements in less area.
I/O Pin Counts
To meet the growing demand for memory bandwidth and memory data rates,
memory interface systems use parallel memory channels and multiple controller
interfaces. However, the number of memory channels is limited by the package pin
count of the Altera devices. Therefore, you must consider device pin count when you
select a device; you must select a device with enough I/O pins for your memory
interface requirement.
The number of device pins depends on the memory standard, the number of memory interfaces, and the memory data width. For example, a ×72 DDR3 SDRAM single-rank interface requires 125 I/O pins, as tallied in the sketch after this list:
■ 72 DQ pins (including ECC)
■ 9 DM pins
■ 9 DQS, DQSn differential pin pairs
■ 17 address pins (address and bank address)
■ 7 command pins (CAS, RAS, WE, CKE, ODT, reset, and CS)
■ 1 CK, CK# differential pin pair
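A quick way to check such a budget is to add the pin groups directly. The short sketch below (Python, for illustration only; the group sizes are the ones listed above) tallies the ×72 single-rank DDR3 example.

```python
# Illustrative pin-budget tally for the x72 single-rank DDR3 example above.
ddr3_x72_pins = {
    "DQ (including ECC)":                          72,
    "DM":                                           9,
    "DQS/DQSn pairs":                           9 * 2,   # 9 differential pairs
    "address and bank address":                    17,
    "command (CAS, RAS, WE, CKE, ODT, reset, CS)":  7,
    "CK/CK# pair":                              1 * 2,   # 1 differential pair
}
print(sum(ddr3_x72_pins.values()))  # 125
```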
f For more information about the number of embedded memory, PLLs and user I/O
counts that are available for your device, refer to the respective device handbooks.
f For the number of DQS groups available for each FPGA device, refer to the respective
device handbooks.
f For the maximum number of controllers that is supported by the FPGAs for different
memory types, refer to the Planning Pin and FPGA Resources chapter.
Altera devices do not limit the interface widths beyond the following requirements:
■ The DQS, DQ, clock, and address signals of the entire interface must reside within the same bank or side of the device if possible, to achieve better performance, although wraparound interfaces are also supported at limited frequencies.
■ The maximum possible interface width in any particular device is limited by the number of DQS and DQ groups available within that bank or side.
■ Sufficient regional clock networks are available to the interface PLL to allow implementation within the required number of quadrants.
■ Sufficient spare pins exist within the chosen bank or side of the device to include all other clock, address, and command pin placement requirements.
■ The greater the number of banks, the greater the skew. Altera recommends that you always compile a test project of your desired configuration and confirm that it meets timing requirements.
Your pin count calculation also determines which device side to use (top or bottom, left or right, and wraparound).
1 There is a constraint in Arria® II GX devices when assigning DQS and DQ pins. You are only allowed to use twelve of the sixteen I/O pins in an I/O module as DQ pins. The remaining four pins can only be used as input pins.
f For DQS group pin-out restrictions, refer to the Arria II GX Pin Connection Guidelines.
1 The Arria II GX, Cyclone IV, and Stratix V devices do not support the left interface. There are no user I/O pins, other than the transceiver pins, available in these devices.
Document Revision History
Table 2–2 lists the revision history for this document.
Table 2–2. Document Revision History

| Date | Version | Changes |
|---|---|---|
| November 2011 | 4.0 | Moved and reorganized "Selecting your FPGA device" section to Volume 2: Design Guidelines. |
| June 2011 | 3.0 | Added "Selecting a Device" chapter from Volume 2, Section I. |
| December 2010, July 2010 | 2.1, 2.0 | Moved protocol-specific feature information to the memory interface user guides in Volume 3. Updated maximum clock rate information for 10.1. Added specifications for DDR2 and DDR3 SDRAM Controllers with UniPHY. Streamlined the specification tables. Added reference to the web-based Specification Estimator Tool. |
| January 2010 | 1.1 | Updated DDR, DDR2, and DDR3 specifications. |
| November 2009 | 1.0 | First published. |
3. Planning Pin and FPGA Resources
November 2011
EMI_DG_003-4.0
This chapter is for board designers who need to determine FPGA pin usage in order to create the board layout for the system, because the board design process sometimes occurs concurrently with the RTL design process.
f Use this document with the External Memory Interfaces chapter of the relevant device
family handbook.
All external memory interfaces typically require the following FPGA resources:
■ Interface pins
■ PLL and clock network
■ DLL (not applicable in Cyclone® III and Cyclone IV devices)
■ Other FPGA resources—for example, core fabric logic and on-chip termination (OCT) calibration blocks
When you know the requirements for your memory interface, you can then start
planning how you can architect your system. The I/O pins and internal memory
cannot be shared for other applications or memory interfaces. However, if you do not
have enough PLLs, DLLs, or clock networks for your application, you may share
these resources among multiple memory interfaces or modules in your system.
Ideally, any interface should reside entirely in a single bank. However, interfaces that
span multiple adjacent banks or the entire side of a device are also fully supported. In
addition, you may also have wraparound memory interfaces, where the design uses
two adjacent sides of the device and the memory interface logic resides in a device
quadrant. In some cases, top or bottom bank interfaces have a higher supported clock rate than left, right, or wraparound interfaces.
Interface Pins
Any I/O banks that do not support transceiver operations in Arria® II, Cyclone III,
Cyclone IV, Stratix® III, Stratix IV, and Stratix V devices support memory interfaces.
However, DQS (data strobe or data clock) and DQ (data) pins are listed in the device
pin tables and fixed at specific locations in the device. You must adhere to these pin
locations as these locations are optimized in routing to minimize skew and maximize
margin. Always check the external memory interfaces chapters from the device
handbooks for the number of DQS and DQ groups supported in a particular device
and the pin table for the actual locations of the DQS and DQ pins.
For maximum performance and best skew across the interface, each required memory interface should completely reside within a single I/O bank, or at least within one side of the device. Address and command pins can be constrained to a different side of the device if there are not enough pins available. For example, you may have the read and write data pins on the top side of the device, and the address and command pins on the left side of the device. In memory interfaces with unidirectional data, you may also have all the read data pins on the top side of the device and the write data pins on the left side of the device. However, you should not break a unidirectional pin group across multiple sides of the device. Memory interfaces typically have the following pin groups:
■ Write data pin group and read data pin group
■ Address and command pin group
Table 3–1 lists a summary of the number of pins required for various example
memory interfaces. Table 3–1 uses series OCT with calibration, parallel OCT with
calibration, or dynamic calibrated OCT, when applicable, shown by the usage of RUP
and RDN pins or RZQ pin.
Table 3–1. Pin Counts for Various Example Memory Interfaces (1), (2)

| Memory Interface | FPGA DQS Bus Width | Number of DQ Pins | Number of DQS Pins | Number of DM/BWSn Pins (7) | Number of Address Pins (3) | Number of Command Pins | Clock Pins | Number of RUP/RDN Pins (4) | RZQ Pins (11) | Total Pins with RUP/RDN | Total Pins with RZQ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DDR3 SDRAM (5), (6) | ×4 | 4 | 2 | 0 | 14 | 10 | 2 | 2 | 1 | 34 | 33 |
| DDR3 SDRAM (5), (6) | ×8 | 8 | 2 | 1 | 14 | 10 | 2 | 2 | 1 | 39 | 38 |
| DDR3 SDRAM (5), (6) | ×8 | 16 | 4 | 2 | 14 | 10 | 2 | 2 | 1 | 50 | 49 |
| DDR3 SDRAM (5), (6) | ×8 | 72 | 18 | 9 | 14 | 14 | 4 | 2 | 1 | 134 | 133 |
| DDR2 SDRAM (8) | ×4 | 4 | 1 (9) | 1 | 15 | 9 | 2 | 2 | 1 | 34 | 33 |
| DDR2 SDRAM (8) | ×8 | 8 | 1 (9) | 1 | 15 | 9 | 2 | 2 | 1 | 38 | 37 |
| DDR2 SDRAM (8) | ×8 | 16 | 2 (9) | 2 | 15 | 9 | 2 | 2 | 1 | 48 | 47 |
| DDR2 SDRAM (8) | ×8 | 72 | 9 (9) | 9 | 15 | 12 | 6 | 2 | 1 | 125 | 124 |
| DDR SDRAM (6) | ×4 | 4 | 1 | 1 | 14 | 7 | 2 | 2 | 1 | 29 | 28 |
| DDR SDRAM (6) | ×8 | 8 | 1 | 1 | 14 | 7 | 2 | 2 | 1 | 33 | 35 |
| DDR SDRAM (6) | ×8 | 16 | 2 | 2 | 14 | 7 | 2 | 2 | 1 | 43 | 42 |
| DDR SDRAM (6) | ×8 | 72 | 9 | 9 | 13 | 9 | 6 | 2 | 1 | 118 | 117 |
| QDR II+ SRAM | ×9 | 18 | 2 | 1 | 19 | 3 (10) | 4 | 2 | 1 | 49 | 48 |
| QDR II+ SRAM | ×18 | 36 | 2 | 2 | 18 | 3 (10) | 4 | 2 | 1 | 67 | 66 |
| QDR II+ SRAM | ×36 | 72 | 2 | 4 | 17 | 3 (10) | 4 | 2 | 1 | 104 | 103 |
| QDR II SRAM | ×9 | 18 | 2 | 1 | 19 | 2 | 4 | 2 | 1 | 48 | 47 |
| QDR II SRAM | ×18 | 36 | 2 | 2 | 18 | 2 | 4 | 2 | 1 | 66 | 65 |
| QDR II SRAM | ×36 | 72 | 2 | 4 | 17 | 2 | 4 | 2 | 1 | 103 | 102 |
| RLDRAM II CIO | ×9 | 9 | 2 | 1 | 22 | 7 (10) | 4 | 2 | 1 | 47 | 46 |
| RLDRAM II CIO | ×18 | 18 | 2 | 1 | 21 | 7 (10) | 6 | 2 | 1 | 57 | 56 |
| RLDRAM II CIO | ×36 | 36 | 2 | 1 | 20 | 7 (10) | 8 | 2 | 1 | 76 | 75 |
| RLDRAM II SIO | ×9 | 18 | 2 | 1 | 22 | 7 (10) | 4 | 2 | 1 | 56 | 55 |
| RLDRAM II SIO | ×18 | 36 | 2 | 1 | 21 | 7 (10) | 6 | 2 | 1 | 75 | 74 |
| RLDRAM II SIO | ×18 | 72 | 2 | 1 | 20 | 7 (10) | 8 | 2 | 1 | 112 | 111 |
Notes to Table 3–1:
(1) These example pin counts are derived from memory vendor data sheets. Check the exact number of address and command pins of the memory devices in the configuration that you are using.
(2) PLL and DLL input reference clock pins are not counted in this calculation.
(3) The number of address pins depends on the memory device density.
(4) Some DQS or DQ pins are dual purpose and can also be required as RUP, RDN, or configuration pins. A DQS group is lost if you use these pins for configuration or as RUP or RDN pins for calibrated OCT. Pick RUP and RDN pins in a DQS group that is not used for memory interface purposes. You may need to place the DQS and DQ pins manually if you place the RUP and RDN pins in the same DQS group pins.
(5) The TDQS and TDQS# pins are not counted in this calculation, as these pins are not used in the memory controller.
(6) Numbers are based on 1-GB memory devices.
(7) Altera® FPGAs do not support DM pins in ×4 mode with differential DQS signaling.
(8) Numbers are based on 2-GB memory devices without using differential DQS, RDQS, and RDQS# pin support.
(9) Assumes single-ended DQS mode. DDR2 SDRAM also supports differential DQS, which makes these DQS and DM numbers identical to DDR3 SDRAM.
(10) The QVLD pin, which indicates read data valid from the QDR II+ SRAM or RLDRAM II device, is included in this number.
(11) RZQ pins are supported by Stratix V devices only.
1 Maximum interface width varies from device to device depending on the number of I/O pins and DQS or DQ groups available. Achievable interface width also depends on the number of address and command pins that the design requires. To ensure adequate PLL, clock, and device routing resources are available, you should always test fit any IP in the Quartus® II software before PCB sign-off.
Altera devices do not limit the width of external memory interfaces beyond the following requirements:
■ Maximum possible interface width in any particular device is limited by the number of DQS groups available.
■ Sufficient clock networks are available to the interface PLL as required by the IP.
■ Sufficient spare pins exist within the chosen bank or side of the device to include all other address and command, and clock pin placement requirements.
■ The greater the number of banks, the greater the skew, hence Altera recommends that you always generate a test project of your desired configuration and confirm that it meets timing.
While you should use the Quartus II software for final pin fitting, you can estimate whether you have enough pins for your memory interface using the following steps:
1. Find out how many read data pins are associated with each read data strobe or clock pair, to determine which column of the DQS and DQ group availability (×4, ×8/×9, ×16/×18, or ×32/×36) to look at in the pin table.
2. Check the device density and package offering information to see if you can implement the interface in one I/O bank, on one side, or on two adjacent sides.
1 If you target Arria II GX, Cyclone III, or Cyclone IV devices and you do not have enough I/O pins to have the memory interface on one side of the device, you may place them on the other side of the device. These device families allow a memory interface to span across the top and bottom, or left and right sides of the device. For any interface that spans across two different sides, use the wraparound interface performance.
3. Calculate the number of other memory interface pins needed, including any other clocks (write clock or memory system clock), address, command, RUP, RDN, RZQ, and any other pins to be connected to the memory components. Ensure you have enough pins to implement the interface in one I/O bank, on one side, or on two adjacent sides.
1 The DQS groups in Arria II GX devices reside on I/O modules, each consisting of 16 I/O pins. You can only use a maximum of 12 pins per I/O module when the pins are used as DQS or DQ pins or as HSTL/SSTL output or HSTL/SSTL bidirectional pins. When counting the number of available pins for the rest of your memory interface, ensure you do not count the leftover four pins per I/O module used for DQS, DQ, address, and command pins. The leftover four pins can be used as input pins only.
f Refer to the device pin-out tables and look for the blank space in the
relevant DQS group column to identify the four pins that cannot be used in
an I/O module for Arria II GX devices.
You should always try the proposed pin-outs with the rest of your design in the Quartus II software (with the correct I/O standard and OCT connections) before finalizing the pin-outs, as there may be interactions between modules that are illegal in the Quartus II software, which you may not discover until you compile the design and use the Quartus II Pin Planner.
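As a rough illustration of steps 1 through 3, the sketch below (Python, for illustration only) checks whether a proposed interface fits within the DQS/DQ groups and spare pins of one bank or side. The group count, group width, and spare-pin budget are hypothetical values; real numbers must come from the device pin table and a Quartus II test fit.

```python
# Hypothetical first-pass pin-budget check; not a substitute for a test fit.
def fits_one_side(dq_pins, dm_pins, addr_pins, cmd_pins, clock_pins,
                  groups_available, group_width, spare_pins):
    groups_needed = -(-dq_pins // group_width)      # ceiling division
    other_pins = dm_pins + addr_pins + cmd_pins + clock_pins
    return groups_needed <= groups_available and other_pins <= spare_pins

# Example: a x72 DDR3 interface checked against an assumed side with
# 18 x8 DQS/DQ groups and 60 spare non-DQS pins.
print(fits_one_side(dq_pins=72, dm_pins=9, addr_pins=17, cmd_pins=7,
                    clock_pins=2, groups_available=18, group_width=8,
                    spare_pins=60))
```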
The following sections describe the pins for each memory interface.
DDR, DDR2, and DDR3 SDRAM
This section provides a description of the clock, command, address, and data signals
for DDR, DDR2, and DDR3 SDRAM interfaces.
Clock Signals
DDR, DDR2, and DDR3 SDRAM devices use CK and CK# signals to clock the address
and command signals into the memory. Furthermore, the memory uses these clock
signals to generate the DQS signal during a read through the DLL inside the memory.
The SDRAM data sheet specifies the following timings:
■ tDQSCK is the skew between the CK or CK# signals and the SDRAM-generated DQS signal
■ tDSH is the DQS falling edge from CK rising edge hold time
■ tDSS is the DQS falling edge from CK rising edge setup time
■ tDQSS is the positive DQS latching edge to CK rising edge
These SDRAM devices have a write requirement (tDQSS) that states the positive edge of the DQS signal on writes must be within ±25% (±90°) of the positive edge of the SDRAM clock input. Therefore, you should generate the CK and CK# signals using the DDR registers in the IOE to match the DQS signal and reduce any variations across process, voltage, and temperature. The positive edge of the SDRAM clock, CK, is aligned with the DQS write to satisfy tDQSS.
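To make the ±25% window concrete, the sketch below (Python, for illustration only; the 400-MHz memory clock is an assumed example, not a requirement) converts the tDQSS limit into picoseconds for a given clock rate.

```python
# Illustrative tDQSS window: the DQS rising edge on writes must land within
# +/-25% of a clock period (+/-90 degrees) of the CK rising edge.
clock_mhz = 400                       # assumed example memory clock
period_ps = 1_000_000 / clock_mhz     # clock period in picoseconds
window_ps = 0.25 * period_ps
print(f"tCK = {period_ps:.0f} ps, tDQSS window = +/-{window_ps:.0f} ps")
# -> tCK = 2500 ps, tDQSS window = +/-625 ps
```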
DDR3 SDRAM can use a daisy-chained control address command (CAC) topology, in
which the memory clock must arrive at each chip at a different time. To compensate
for this flight-time skew between devices across a typical DIMM, write leveling must
be employed.
Command and Address Signals
Command and address signals in SDRAM devices are clocked into the memory
device using the CK or CK# signal. These pins operate at single data rate (SDR) using
only one clock edge. The number of address pins depends on the SDRAM device
capacity. The address pins are multiplexed, so two clock cycles are required to send
the row, column, and bank address. The CS#, RAS, CAS, WE, CKE, and ODT pins are
SDRAM command and control pins. DDR3 SDRAM has an additional pin, RESET#,
while some DDR3 DIMMs have these additional pins: RESET#, PAR_IN, and ERR_OUT#.
The RESET# pin uses the 1.5-V LVCMOS I/O standard, and the PAR_IN and ERR_OUT#
pins use the SSTL-15 I/O standard.
The DDR2 SDRAM command and address inputs do not have a symmetrical setup
and hold time requirement with respect to the SDRAM clocks, CK, and CK#.
For ALTMEMPHY or Altera SDRAM high-performance controllers in Stratix III and
Stratix IV devices, the command and address clock is a dedicated PLL clock output
whose phase can be adjusted to meet the setup and hold requirements of the memory
clock. The command and address clock is also typically half-rate, although a full-rate
implementation can also be created. The command and address pins use the DDIO
output circuitry to launch commands from either the rising or falling edges of the
clock. The chip select (mem_cs_n), clock enable (mem_cke), and ODT (mem_odt) pins are
only enabled for one memory clock cycle and can be launched from either the rising
or falling edge of the command and address clock signal. The address and other
command pins are enabled for two memory clock cycles and can also be launched
from either the rising or falling edge of the command and address clock signal.
1
In ALTMEMPHY-based designs, the command and address clock ac_clk_1x is always
half rate. However, because of the output enable assertion, CS#, CKE, and ODT
behave like full-rate signals even in a half-rate PHY.
In Arria II GX and Cyclone III devices, the command and address clock is either
shared with the write_clk_2x or the mem_clk_2x clock.
Data, Data Strobes, DM, and Optional ECC Signals
DDR SDRAM uses bidirectional single-ended data strobe (DQS); DDR3 SDRAM uses
bidirectional differential data strobes. The DQSn pins in DDR2 SDRAM devices are
optional but recommended for DDR2 SDRAM designs operating at more than 333
MHz. Differential DQS operation enables improved system timing due to reduced
crosstalk and less simultaneous switching noise on the strobe output drivers. The DQ
pins are also bidirectional. Regardless of interface width, DDR SDRAM always
operates in ×8 mode DQS groups. DQ pins in DDR2 and DDR3 SDRAM interfaces can
operate in either ×4 or ×8 mode DQS groups, depending on your chosen memory
device or DIMM, regardless of interface width. The ×4 and ×8 configurations use one
pair of bidirectional data strobe signals, DQS and DQSn, to capture input data.
However, two pairs of data strobes, UDQS and UDQS# (upper byte) and LDQS and
LDQS# (lower byte), are required by the ×16 configuration devices. A group of DQ
pins must remain associated with its respective DQS and DQSn pins.
The DQ signals are edge-aligned with the DQS signal during a read from the memory and are center-aligned with the DQS signal during a write to the memory. The memory controller shifts the DQ signals by –90° during a write operation to center-align the DQ and DQS signals. The PHY IP delays the DQS signal during a read, so that the DQ and DQS signals are center-aligned at the capture register. Altera devices use a phase-locked loop (PLL) to center-align the DQS signal with respect to the DQ signals during writes, and Altera devices (except Cyclone III and Cyclone IV devices) use dedicated DQS phase-shift circuitry to shift the incoming DQS signal during reads. Figure 3–1 shows an example where the DQS signal is shifted by 90° for a read from the DDR2 SDRAM.
Figure 3–1. Edge-aligned DQ and DQS Relationship During a DDR2 SDRAM Read in Burst-of-Four Mode
(Timing diagram: at the FPGA pin, DQS is edge-aligned with DQ, with preamble and postamble shown; after the DQS phase shift, DQS is center-aligned with DQ at the DQ IOE registers.)
Figure 3–2 shows an example of the relationship between the data and data strobe
during a burst-of-four write.
Figure 3–2. DQ and DQS Relationship During a DDR2 SDRAM Write in Burst-of-Four Mode
(Timing diagram: at the FPGA pin, the DQS edges are centered in the DQ data eyes during the write burst.)
The memory device's setup (tDS) and hold (tDH) times for the write DQ and DM pins are relative to the edges of the DQS write signals and not the CK or CK# clock. Setup and hold requirements are not necessarily balanced in DDR2 and DDR3 SDRAM, unlike in DDR SDRAM devices.
The DQS signal is generated on the positive edge of the system clock to meet the tDQSS requirement. DQ and DM signals use a clock shifted –90° from the system clock, so that the DQS edges are centered on the DQ or DM signals when they arrive at the DDR2 SDRAM. The DQS, DQ, and DM board trace lengths need to be tightly matched (within 20 ps).
The SDRAM uses the DM pins during a write operation. Driving the DM pins low indicates that the write is valid. The memory masks the DQ signals if the DM pins are driven high. To generate the DM signal, Altera recommends that you use the spare DQ pin within the same DQS group as the respective data, to minimize skew.
The DM signal's timing requirements at the SDRAM input are identical to those for DQ data. The DDR registers, clocked by the –90° shifted clock, create the DM signals.
Some SDRAM modules support error correction coding (ECC) to allow the controller to detect and automatically correct errors in data transmission. The 72-bit SDRAM modules contain eight extra data pins in addition to the 64 data pins. The eight extra ECC pins should be connected to a single DQS or DQ group on the FPGA.
DIMM Options
Compared to unbuffered DIMMs (UDIMM), both single-rank and dual-rank registered DIMMs (RDIMM) use only one pair of clocks and two chip selects, CS#[1:0], in DDR3. An RDIMM has extra parity signals for the address, RAS#, CAS#, and WE# signals.
Dual-rank DIMMs have the following extra signals for each side of the DIMM:
■ CS# (an RDIMM always has two chip selects; DDR3 uses a minimum of two chip selects, even on a single-rank module)
■ CK (UDIMM only)
■ ODT signal
■ CKE signal
Table 3–2 compares the UDIMM and RDIMM pin options.
Table 3–2. UDIMM and RDIMM Pin Options (1)

| Pins | UDIMM Pins (Single Rank) | UDIMM Pins (Dual Rank) | RDIMM Pins (Single Rank) | RDIMM Pins (Dual Rank) |
|---|---|---|---|---|
| Data | 72 bit DQ[71:0] = {CB[7:0], DQ[63:0]} | 72 bit DQ[71:0] = {CB[7:0], DQ[63:0]} | 72 bit DQ[71:0] = {CB[7:0], DQ[63:0]} | 72 bit DQ[71:0] = {CB[7:0], DQ[63:0]} |
| Data Mask | DM[8:0] | DM[8:0] | DM[8:0] | DM[8:0] |
| Data Strobe | DQS[8:0] and DQS#[8:0] | DQS[8:0] and DQS#[8:0] | DQS[8:0] and DQS#[8:0] | DQS[8:0] and DQS#[8:0] |
| Address | BA[2:0], A[15:0] (2 GB: A[13:0], 4 GB: A[14:0], 8 GB: A[15:0]) | BA[2:0], A[15:0] (2 GB: A[13:0], 4 GB: A[14:0], 8 GB: A[15:0]) | BA[2:0], A[15:0] (2 GB: A[13:0], 4 GB: A[14:0], 8 GB: A[15:0]) | BA[2:0], A[15:0] (2 GB: A[13:0], 4 GB: A[14:0], 8 GB: A[15:0]) |
| Clock | CK0/CK0# | CK0/CK0#, CK1/CK1# | CK0/CK0# | CK0/CK0# |
| Command | ODT, CS#, CKE, RAS#, CAS#, WE# | ODT[1:0], CS#[1:0], CKE[1:0], RAS#, CAS#, WE# | ODT, CS#[1:0], CKE, RAS#, CAS#, WE# | ODT[1:0], CS#[1:0], CKE[1:0], RAS#, CAS#, WE# |
| Parity | — | — | PAR_IN, ERR_OUT | PAR_IN, ERR_OUT |
| Other Pins | SA[2:0], SDA, SCL, EVENT#, RESET# | SA[2:0], SDA, SCL, EVENT#, RESET# | SA[2:0], SDA, SCL, EVENT#, RESET# | SA[2:0], SDA, SCL, EVENT#, RESET# |

Note to Table 3–2:
(1) DQS#[8:0] is optional in DDR2 SDRAM and is not supported in DDR SDRAM interfaces.
QDR II+ and QDR II SRAM
This section provides a description of the clock, command, address, and data signals
for QDR II and QDR II+ SRAM interfaces.
Clock Signals
QDR II+ and QDR II SRAM devices have three pairs of clocks:
■ Input clocks K and K#
■ Input clocks C and C#
■ Echo clocks CQ and CQ#
The positive input clock, K, is the logical complement of the negative input clock, K#.
Similarly, C and CQ are complements of C# and CQ#, respectively. With these
complementary clocks, the rising edges of each clock leg latch the DDR data.
The QDR II+ and QDR II SRAM devices use the K and K# clocks for write access and
the C and C# clocks for read accesses only when interfacing more than one QDR II+ or
QDR II SRAM device. Because the number of loads that the K and K# clocks drive
affects the switching times of these outputs when a controller drives a single QDR II+
or QDR II SRAM device, C and C# are unnecessary. This is because the propagation
delays from the controller to the QDR II+ or QDR II SRAM device and back are the
same. Therefore, to reduce the number of loads on the clock traces, QDR II+ and QDR
II SRAM devices have a single clock mode, and the K and K# clocks are used for both
reads and writes. In this mode, the C and C# clocks are tied to the supply voltage
(VDD).
CQ and CQ# are the source-synchronous output clocks from the QDR II or QDR
II+ SRAM device that accompanies the read data.
The Altera device outputs the K and K# clocks, data, address, and command lines to
the QDR II+ or QDR II SRAM device. For the controller to operate properly, the write
data (D), address (A), and control signal trace lengths (and therefore the propagation
times) should be equal to the K and K# clock trace lengths.
You can generate C, C#, K, and K# clocks using any of the PLL registers via the DDR
registers. Because of strict skew requirements between K and K# signals, use adjacent
pins to generate the clock pair. The propagation delays for K and K# from the FPGA to
the QDR II+ or QDR II SRAM device are equal to the delays on the data and address
(D, A) signals. Therefore, the signal skew effect on the write and read request
operations is minimized by using identical DDR output circuits to generate clock and
data inputs to the memory.
Command Signals
QDR II+ and QDR II SRAM devices use the write port select (WPSn) signal to control
write operations and the read port select (RPSn) signal to control read operations. The
byte write select signal (BWSn) is a third control signal that indicates to the QDR II+ or
QDR II SRAM device which byte to write into the QDR II+ or QDR II SRAM device.
You can use any of the FPGA's user I/O pins to generate control signals, preferably on
the same side and the same bank. Assign the BWSn pin within the same DQS group as
the corresponding write data.
Address Signals
QDR II+ and QDR II SRAM devices use one address bus (A) for both read and write
addresses. You can use any of the FPGA's user I/O pins to generate address signals,
preferably on the same side and the same banks.
Data and QVLD Signals
QDR II+ and QDR II SRAM devices use two unidirectional data buses: one for
writes (D) and one for reads (Q). The read data is edge-aligned with the CQ and CQ#
clocks while the write data is center-aligned with the K and K# clocks (see Figure 3–3
and Figure 3–4).
Figure 3–3. Edge-aligned CQ and Q Relationship During QDR II+ SRAM Read
(Timing diagram: Q is edge-aligned with CQ and CQ# at the FPGA pin; after the DQS phase shift, CQ is center-aligned with Q at the capture registers.)
Figure 3–4. Center-aligned K and D Relationship During QDR II+ SRAM Write
(Timing diagram: at the FPGA pin, the K and K# edges are centered in the D data eyes.)
QDR II+ SRAM devices also have a QVLD pin that indicates valid read data. The
QVLD signal is edge-aligned with the echo clock and is asserted high for
approximately half a clock cycle before data is output from memory.
1
The Altera QDR II+ SRAM Controller with UniPHY IP does not use the QVLD signal.
RLDRAM II
This section provides a description of the clock, command, address, and data signals
for RLDRAM II interfaces.
Clock Signals
RLDRAM II devices use CK and CK# signals to clock the command and address bus in
single data rate (SDR). There is one pair of CK and CK# pins per RLDRAM II device.
Instead of a strobe, RLDRAM II devices use two sets of free-running differential
clocks to accompany the data. The DK and DK# clocks are the differential input data
clocks used during writes while the QK or QK# clocks are the output data clocks used
during reads. Even though QK and QK# signals are not differential signals according to
the RLDRAM II data sheets, Micron treats these signals as such for their testing and
characterization. Each pair of DK and DK#, or QK and QK#, clocks is associated with
either 9 or 18 data bits.
The exact clock-data relationships are as follows:
■ For the ×36 data bus width configuration, there are 18 data bits associated with each pair of write and read clocks. So, there are two pairs of DK and DK# pins and two pairs of QK or QK# pins.
■ For the ×18 data bus width configuration, there are 18 data bits per one pair of write clocks and nine data bits per one pair of read clocks. So, there is one pair of DK and DK# pins, but there are two pairs of QK and QK# pins.
■ For the ×9 data bus width configuration, there are nine data bits associated with each pair of write and read clocks. So, there is one pair of DK and DK# pins and one pair of QK and QK# pins each.
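The pairing rules above reduce to a simple lookup per bus width, as the sketch below shows (Python, for illustration only).

```python
# Write (DK/DK#) and read (QK/QK#) clock pairs per RLDRAM II CIO bus width,
# as described above: each pair serves 9 or 18 data bits.
clock_pairs = {
    #  width: (DK/DK# pairs, QK/QK# pairs)
    9:  (1, 1),   # 9 bits per write-clock pair and per read-clock pair
    18: (1, 2),   # 18 bits per write-clock pair, 9 bits per read-clock pair
    36: (2, 2),   # 18 bits per write-clock pair and per read-clock pair
}
for width, (dk, qk) in clock_pairs.items():
    print(f"x{width}: {dk} DK/DK# pair(s), {qk} QK/QK# pair(s)")
```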
There are tCKDK timing requirements for skew between CK and DK or CK# and DK#.
Because of the loads on these I/O pins, the maximum frequency you can achieve
depends on the number of RLDRAM II devices you are connecting to the Altera
device. Perform SPICE or IBIS simulations to analyze the loading effects of the
pin-pair on multiple RLDRAM II devices.
Commands and Addresses
The CK and CK# signals clock the commands and addresses into RLDRAM II devices.
These pins operate at single data rate using only one clock edge. RLDRAM II devices
have 18 to 21 address pins, depending on the data bus width configuration and burst
length. RLDRAM II supports both non-multiplexed and multiplexed addressing.
Multiplexed addressing allows you to save a few user I/O pins while
non-multiplexed addressing allows you to send the address signal within one clock
cycle instead of two clock cycles. CS#, REF#, and WE# pins are input commands to the
RLDRAM II device.
The commands and addresses must meet the memory address and command setup
(tAS, tCS) and hold (tAH, tCH) time requirements.
1 UniPHY IP does not support multiplexed addressing.
Data, DM and QVLD Signals
The read data is edge-aligned with the QK or QK# clocks while the write data is
center-aligned with the DK and DK# clocks (see Figure 3–5 and Figure 3–6). The
memory controller shifts the DK or DK# signal to center-align it with the DQ signals during a write, and shifts the QK signal during a read, so that the read data (DQ or Q signals) and the QK clock are center-aligned at the capture register. Altera devices use
dedicated DQS phase-shift circuitry to shift the incoming QK signal during reads and
use a PLL to center-align the DK and DK# signals with respect to the DQ signals during
writes.
Figure 3–5. Edge-aligned DQ and QK Relationship During RLDRAM II Read
(Timing diagram: DQ is edge-aligned with QK at the FPGA pin; after the DQS phase shift, QK is center-aligned with DQ at the DQ LE registers.)
Figure 3–6. Center-aligned DQ and DK Relationship During RLDRAM II Write
(Timing diagram: at the FPGA pin, the DK edges are centered in the DQ data eyes.)
The RLDRAM II data mask (DM) pins are only used during a write. The memory
controller drives the DM signal low when the write is valid and drives it high to mask
the DQ signals. There is one DM pin per RLDRAM II device.
The DM timing requirements at the input to the RLDRAM II are identical to those for
DQ data. The DDR registers, clocked by the write clock, create the DM signals. This
reduces any skew between the DQ and DM signals.
The RLDRAM II device's setup time (tDS) and hold (tDH) time for the write DQ and DM
pins are relative to the edges of the DK or DK# clocks. The DK and DK# signals are
generated on the positive edge of the system clock, so that the positive edge of CK or CK# is aligned with the positive edge of DK or DK#, respectively, to meet the RLDRAM II
tCKDK requirement. The DQ and DM signals are clocked using a shifted clock so that
the edges of DK or DK# are center-aligned with respect to the DQ and DM signals when
they arrive at the RLDRAM II device.
The clocks, data, and DM board trace lengths should be tightly matched to minimize
the skew in the arrival time of these signals.
RLDRAM II devices also have a QVLD pin indicating valid read data. The QVLD
signal is edge-aligned with QK or QK# and is high approximately half a clock cycle
before data is output from the memory.
The Altera RLDRAM II Controller with UniPHY IP does not use the QVLD signal.
Maximum Number of Interfaces
Table 3–3 through Table 3–7 list the available device resources for DDR, DDR2, DDR3
SDRAM, RLDRAM II, and QDR II and QDR II+ SRAM controller interfaces.
Unless otherwise noted, the calculation for the maximum number of interfaces is
based on independent interfaces where the address or command pins are not shared.
The maximum number of independent interfaces is limited to the number of PLLs
each FPGA device has.
Timing closure depends on device resource and routing utilization. For more
information about timing closure, refer to the Area and Timing Optimization Techniques
chapter in the Quartus II Handbook.
You need to share DLLs if the total number of interfaces exceeds the number of DLLs
available in a specific FPGA device. You may also need to share PLL clock outputs
depending on your clock network usage; refer to “PLLs and Clock Networks” on
page 3–42.
For information about the number of DQ and DQS pins in other packages, refer to the DQ and DQS tables in the relevant device handbook.
Table 3–3 describes the maximum number of ×8 DDR SDRAM components that can be fitted in the smallest and biggest devices and pin packages, assuming the device is blank.
Each interface of size n, where n is a multiple of 8, consists of the following pins (a budgeting sketch follows this list):
■ n DQ pins (including error correction coding (ECC))
■ n/8 DM pins
■ n/8 DQS pins
■ 18 address pins
■ 6 command pins (CAS, RAS, WE, CKE, reset, and CS)
■ 1 CK, CK# pin pair for up to every three ×8 DDR SDRAM components
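To make the budget concrete, the sketch below totals the FPGA pins implied by this list. It is a hypothetical helper for planning only, not part of the Quartus II software; for DDR2 and DDR3 SDRAM, adjust the command-pin count and use DQS/DQSn pairs as listed in the following sections.

    # Rough pin budget for one DDR SDRAM interface of width n (a multiple of 8),
    # following the per-interface counts listed above. Illustrative helper only.
    proc ddr_interface_pins {n num_x8_components} {
        set dq   $n                      ;# DQ pins, including ECC
        set dm   [expr {$n / 8}]         ;# one DM pin per byte lane
        set dqs  [expr {$n / 8}]         ;# one DQS pin per byte lane
        set addr 18                      ;# address pins
        set cmd  6                       ;# CAS, RAS, WE, CKE, reset, CS
        set ck   [expr {2 * int(ceil($num_x8_components / 3.0))}] ;# one CK/CK# pair per up to three x8 components
        return [expr {$dq + $dm + $dqs + $addr + $cmd + $ck}]
    }

    # Example: a 72-bit (64 data + 8 ECC) interface built from nine x8 components.
    puts [ddr_interface_pins 72 9]   ;# => 120 pins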
Table 3–3. Maximum Number of DDR SDRAM Interfaces Supported per FPGA (Part 1 of 2)
Device
Arria II GX
Device Type
EP2AGX190
Package
Pin Count
1,152
Four ×8 interfaces or one ×72 interface on each
side (no DQ pins on left side)
358
■
On top side, one ×16 interface
■
On bottom side, one ×16 interface
■
On right side (no DQ pins on left side), one ×8
interface
EP2AGX260
EP2AGX45
EP2AGX65
Arria II GZ
EP2AGZ300
Maximum Number of Interfaces
F1,517
Four ×8 interfaces or one ×72 interface on each
side
EP2AGZ350
EP2AGZ225
EP2AGZ300
F780
■
On top side, three ×8 interfaces or one ×64
interface
■
On bottom side, three ×8 interfaces or one
×64 interface
■
No DQ pins on the left and right sides
■
Three ×16 interfaces on top and bottom sides
■
Two ×16 interfaces on right and left sides
■
Two ×8 interfaces on top and bottom sides
■
One ×8 interface on right and left sides
■
One ×48 interface or two ×8 interfaces on top
and bottom sides
■
Four ×8 interfaces on right and left sides
EP2AGZ350
Cyclone III
EP3C120
EP3C5
Cyclone IV E
Cyclone IV GX
EP4CE115
780
144
On top side, one ×8 interface with address pins
wrapped around the left or right side
EP4CGX150
896
■
One ×48 interface or four ×8 interfaces on top
and bottom sides
■
On right side, three ×8 interfaces
■
No DQ pins on the left side
■
One ×8 interface on top and bottom sides
■
On right side, one ×8 interface with address
pins wrapped around the top or bottom side
■
No DQ pins on the left side
■
Two ×72 interfaces on top and bottom sides
■
One ×72 interface on right and left sides
■
Two ×8 interfaces on top and bottom sides
■
Three ×8 interface on right and left sides
EP3SL340
EP3SE50
256
EP4CE10
EP4CGX22
Stratix III
780
324
1,760
484
Table 3–3. Maximum Number of DDR SDRAM Interfaces Supported per FPGA (Part 2 of 2)
Device
Package
Pin Count
Device Type
Stratix IV
EP4SGX290
Maximum Number of Interfaces
1,932
■
EP4SGX360
or
EP4SGX530
■
One ×72 interface on each side and two
additional ×72 wraparound interfaces, only if
sharing DLL and PLL resources
■
Three ×8 interfaces or one ×64 interface on
top and bottom sides
■
On left side, one ×48 interface or two ×8
interfaces
■
No DQ pins on the right side
■
Three ×72 interfaces on top and bottom sides
■
No DQ pins on left and right sides
EP4SE530
1,760
EP4SE820
EP4SGX70
780
EP4SGX110
EP4SGX180
EP4SGX230
Stratix V
One ×72 interface on each side
5SGXA5
1,932
5SGXA7
5SGXA3
780
■
5SGXA4
On top side, two ×8 interfaces
■
On bottom side, four ×8 interfaces or one ×72
interface
■
No DQ pins on left and right sides
Table 3–4 describes the maximum number of ×8 DDR2 SDRAM components that can
be fitted in the smallest and biggest devices and pin packages assuming the device is
blank.
Each interface of size n, where n is a multiple of 8, consists of:
■ n DQ pins (including ECC)
■ n/8 DM pins
■ n/8 DQS, DQSn pin pairs
■ 18 address pins
■ 7 command pins (CAS, RAS, WE, CKE, ODT, reset, and CS)
■ 1 CK, CK# pin pair for up to every three ×8 DDR2 components
Table 3–4. Maximum Number of DDR2 SDRAM Interfaces Supported per FPGA (Part 1 of 3)
Device
Device Type
Package Pin
Count
Arria II GX
EP2AGX190
1,152
Four ×8 interfaces or one ×72 interface on each
side (no DQ pins on left side)
358
■
One ×16 interface on top and bottom sides
■
On right side (no DQ pins on left side), one ×8
interface
EP2AGX260
EP2AGX45
EP2AGX65
Maximum Number of Interfaces
Table 3–4. Maximum Number of DDR2 SDRAM Interfaces Supported per FPGA (Part 2 of 3)
Device
Arria II GZ
Device Type
Package Pin
Count
EP2AGZ300
F1,517
Maximum Number of Interfaces
Four ×8 interfaces or one ×72 interface on each
side
EP2AGZ350
EP2AGZ225
EP2AGZ300
F780
■
Three ×8 interfaces or one ×64 interface on top
and bottom sides
■
No DQ pins on the left and right sides
■
One ×72 interface on top and bottom sides
■
No DQ pins on left and right sides
■
One ×64 interface or three ×8 interfaces on top
and bottom sides
■
One ×32 interface on the right side
■
No DQ pins on the left side
■
One ×64 interface or three ×8 interfaces on top
and bottom sides
■
No DQ pins on the left side
■
Three ×16 interfaces on top and bottom sides
■
Two ×16 interfaces on left and right sides
■
Two ×8 interfaces on top and bottom sides
■
One ×8 interface on right and left sides
■
One ×48 interface or two ×8 interfaces on top
and bottom sides
■
Four ×8 interfaces on right and left sides
EP2AGZ350
Arria V
5AGXB1
1,517
5AGXB3
5AGXB5
5AGXB7
5AGTD3
5AGTD7
5AGXA1
672
5AGXA3
5AGXA5
672
5AGXA7
Cyclone III
EP3C120
EP3C5
Cyclone IV E
Cyclone IV GX
EP4CE115
780
144
On top side, one ×8 interface with address pins
wrapped around the left or right side
EP4CGX150
896
■
One ×48 interface or four ×8 interfaces on top
and bottom sides
■
On right side, three ×8 interfaces
■
No DQ pins on the left side
■
One ×8 interface on top and bottom sides
■
On right side, one ×8 interface with address
pins wrapped around the top or bottom side
■
No DQ pins on the left side
■
Two ×72 interfaces on top and bottom sides
■
One ×72 interface on right and left sides
■
Two ×8 interfaces on top and bottom sides
■
Three ×8 interfaces on right and left sides
EP3SL340
EP3SE50
256
EP4CE10
EP4CGX22
Stratix III
780
324
1,760
484
Table 3–4. Maximum Number of DDR2 SDRAM Interfaces Supported per FPGA (Part 3 of 3)
Device
Stratix IV
Device Type
Package Pin
Count
EP4SGX290
1,932
Maximum Number of Interfaces
■
EP4SGX360
or
EP4SGX530
■
One ×72 interface on each side and two
additional ×72 wraparound interfaces only if
sharing DLL and PLL resources
■
Three ×8 interfaces or one ×64 interface on top
and bottom sides
■
On left side, one ×48 interface or two ×8
interfaces
■
No DQ pins on the right side
■
Three ×72 interfaces on top and bottom sides
■
No DQ pins on left and right sides
■
On top side, two ×8 interfaces
■
On bottom side, four ×8 interfaces or one ×72
interface
■
No DQ pins on left and right sides
EP4SE530
1,760
EP4SE820
EP4SGX70
780
EP4SGX110
EP4SGX180
EP4SGX230
Stratix V
One ×72 interface on each side
5SGXA5
1,932
5SGXA7
5SGXA3
780
5SGXA4
Table 3–5 describes the maximum number of ×8 DDR3 SDRAM components that can
be fitted in the smallest and biggest devices and pin packages assuming the device is
blank.
Each interface of size n, where n is a multiple of 8, consists of:
■ n DQ pins (including ECC)
■ n/8 DM pins
■ n/8 DQS, DQSn pin pairs
■ 17 address pins
■ 7 command pins (CAS, RAS, WE, CKE, ODT, reset, and CS)
■ 1 CK, CK# pin pair
Table 3–5. Maximum Number of DDR3 SDRAM Interfaces Supported per FPGA (Part 1 of 2)
Device
Arria II GX
Device
Type
Package Pin
Count
EP2AGX190 1,152
EP2AGX260
EP2AGX45
EP2AGX65
358
Maximum Number of Interfaces
Four ×8 interfaces or one ×72 interface on each
side (no DQ pins on left side)
■
One ×16 interface on top and bottom sides
■
On right side, one ×8 interface (no DQ pins on
left side)
Table 3–5. Maximum Number of DDR3 SDRAM Interfaces Supported per FPGA (Part 2 of 2)
Device
Arria II GZ
Device
Type
Package Pin
Count
Maximum Number of Interfaces
EP2AGZ300 F1,517
EP2AGZ350
Four ×8 interfaces on each side
EP2AGZ225
Arria V
EP2AGZ300 F780
■
Three ×8 interfaces on top and bottom sides
EP2AGZ350
■
No DQ pins on the left and right sides
■
One ×72 interface on top and bottom sides
■
No DQ pins on left and right sides
■
One ×64 interface or three ×8 interfaces on top
and bottom sides
■
One ×32 interface on the right side
■
No DQ pins on the left side
■
One ×64 interface or three ×8 interfaces on top
and bottom sides
■
No DQ pins on the left side
5AGXB1
1,517
5AGXB3
5AGXB5
5AGXB7
5AGTD3
5AGTD7
5AGXA1
672
5AGXA3
5AGXA5
672
5AGXA7
Stratix III
EP3SL340
EP3SE50
Stratix IV
1,760
484
EP4SGX290 1,932
■
■
One ×72 interface on right and left sides
■
Two ×8 interfaces on top and bottom sides
■
Three ×8 interfaces on right and left sides
■
One ×72 interface on each side
EP4SGX360
or
EP4SGX530
■
One ×72 interface on each side and 2 additional
×72 wraparound interfaces only if sharing DLL
and PLL resources
■
Three ×8 interfaces or one ×64 interface on top
and bottom sides
■
On left side, one ×48 interface or two ×8
interfaces (no DQ pins on right side)
■
Two ×72 interfaces (800 MHz) on top and
bottom sides
■
No DQ pins on left and right sides
■
On top side, two ×8 interfaces
■
On bottom side, four ×8 interfaces
■
No DQ pins on left and right sides
EP4SE530
1,760
EP4SE820
EP4SGX70
780
EP4SGX110
EP4SGX180
EP4SGX230
Stratix V
5SGXA5
1,932
5SGXA7
5SGXA3
5SGXA4
Two ×72 interfaces on top and bottom sides
780
Table 3–6 on page 3–19 describes the maximum number of independent QDR II+ or
QDR II SRAM interfaces that can be fitted in the smallest and biggest devices and pin
packages assuming the device is blank.
One interface of ×36 consists of:
■ 36 Q pins
■ 36 D pins
■ 1 K, K# pin pair
■ 1 CQ, CQ# pin pair
■ 19 address pins
■ 4 BWSn pins
■ WPS, RPS
One interface of ×9 consists of (a budgeting sketch for both widths follows this list):
■ 9 Q pins
■ 9 D pins
■ 1 K, K# pin pair
■ 1 CQ, CQ# pin pair
■ 21 address pins
■ 1 BWSn pin
■ WPS, RPS
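As with the SDRAM interfaces, these counts are easy to total. The sketch below is a hypothetical budgeting helper only, summing exactly the pin counts listed above for the two widths.

    # Pin totals for one QDR II / QDR II+ SRAM interface, per the lists above.
    # Hypothetical budgeting helper only.
    proc qdrii_interface_pins {width} {
        if {$width == 36} {
            # 36 Q + 36 D + K/K# + CQ/CQ# + 19 address + 4 BWSn + WPS/RPS
            return [expr {36 + 36 + 2 + 2 + 19 + 4 + 2}]
        } elseif {$width == 9} {
            # 9 Q + 9 D + K/K# + CQ/CQ# + 21 address + 1 BWSn + WPS/RPS
            return [expr {9 + 9 + 2 + 2 + 21 + 1 + 2}]
        } else {
            error "width must be 9 or 36"
        }
    }

    # Example: a x36 interface uses 101 FPGA pins; a x9 interface uses 46.
    puts [qdrii_interface_pins 36]   ;# => 101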
Table 3–6. Maximum Number of QDR II and QDR II+ SRAM Interfaces Supported per FPGA (Part 1 of 2)
Device
Arria II GX
Device
Type
Package Pin
Count
EP2AGX190 1,152
EP2AGX260
EP2AGX45
358
EP2AGX65
Arria II GZ
EP2AGZ300 F1,517
Maximum Number of Interfaces
One ×36 interface and one ×9 interface one each
side
One ×9 interface on each side (no DQ pins on left
side)
■
Two ×36 interfaces and one ×9 interface on top
and bottom sides
EP2AGZ225
■
Four ×9 interfaces on right and left sides
EP2AGZ300 F780
■
Three ×9 interfaces on top and bottom sides
EP2AGZ350
■
No DQ pins on right and left sides
EP2AGZ350
Table 3–6. Maximum Number of QDR II and QDR II+ SRAM Interfaces Supported per FPGA (Part 2 of 2)
Device
Arria V
Device
Type
5AGXB1
Package Pin
Count
1,517
Maximum Number of Interfaces
■
Two ×36 interfaces on top and bottom sides
■
No DQ pins on left and right sides
■
Two ×9 interfaces on top and bottom sides
■
One ×9 interface on the right side
■
No DQ pins on the left side
■
Two ×9 interfaces on top and bottom sides
■
No DQ pins on the left side
■
Two ×36 interfaces and one ×9 interface on top
and bottom sides
■
On left side, five ×9 interfaces on right and left
sides
■
One ×9 interface on top and bottom sides
■
Two ×9 interfaces on right and left sides
EP4SGX290 1,932
■
Two ×36 interfaces on top and bottom sides
EP4SGX360
■
One ×36 interface on right and left sides
5AGXB3
5AGXB5
5AGXB7
5AGTD3
5AGTD7
5AGXA1
672
5AGXA3
5AGXA5
672
5AGXA7
Stratix III
EP3SL340
EP3SE50
1,760
484
EP3SL50
EP3SL70
Stratix IV
EP4SGX530
EP4SE530
1,760
EP4SE820
EP4SGX70
780
EP4SGX110
Two ×9 interfaces on each side (no DQ pins on
right side)
EP4SGX180
EP4SGX230
Stratix V
5SGXA5
1,932
5SGXA7
5SGXA3
780
■
Two ×36 interfaces on top and bottom sides
■
No DQ pins on left and right sides
■
On top side, one ×36 interface or three ×9
interfaces
■
On bottom side, two ×9 interfaces
■
No DQ pins on left and right sides
5SGXA4
Table 3–7 on page 3–21 describes the maximum number of independent RLDRAM II
interfaces that can be fitted in the smallest and biggest devices and pin packages
assuming the device is blank.
One common I/O ×36 interface consists of:
■ 36 DQ pins
■ 1 DM pin
■ 2 DK, DK# pin pairs
■ 2 QK, QK# pin pairs
■ 1 CK, CK# pin pair
■ 24 address pins
■ 1 CS# pin
■ 1 REF# pin
■ 1 WE# pin
■ 1 QVLD pin
One common I/O ×9 interface consists of (a budgeting sketch for both widths follows this list):
■ 9 DQ pins
■ 1 DM pin
■ 1 DK, DK# pin pair
■ 1 QK, QK# pin pair
■ 1 CK, CK# pin pair
■ 25 address pins
■ 1 CS# pin
■ 1 REF# pin
■ 1 WE# pin
■ 1 QVLD pin
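The totals implied by these two lists can be sketched the same way as for the other memory types; the helper below is illustrative only and simply sums the counts listed above.

    # Rough pin budget for one RLDRAM II CIO interface, per the lists above.
    # Illustrative helper only.
    proc rldram2_cio_pins {width} {
        if {$width == 36} {
            # 36 DQ + DM + 2 DK/DK# pairs + 2 QK/QK# pairs + CK/CK# + 24 address + CS#/REF#/WE# + QVLD
            return [expr {36 + 1 + 4 + 4 + 2 + 24 + 3 + 1}]
        } elseif {$width == 9} {
            # 9 DQ + DM + 1 DK/DK# pair + 1 QK/QK# pair + CK/CK# + 25 address + CS#/REF#/WE# + QVLD
            return [expr {9 + 1 + 2 + 2 + 2 + 25 + 3 + 1}]
        } else {
            error "width must be 9 or 36"
        }
    }

    puts [rldram2_cio_pins 36]   ;# => 75 pins
    puts [rldram2_cio_pins 9]    ;# => 45 pins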
Table 3–7. Maximum Number of RLDRAM II Interfaces Supported per FPGA (Part 1 of 2)
Device
Arria II GZ
Device
Type
Package Pin
Count
Maximum Number of RLDRAM II CIO Interfaces
EP2AGZ300 F1,517
EP2AGZ350
Two ×36 interfaces on each side
EP2AGZ225
EP2AGZ300 F780
■
Three ×9 interfaces or one ×36 interface on top
and bottom sides
■
No DQ pins on the left and right sides
EP2AGZ350
Table 3–7. Maximum Number of RLDRAM II Interfaces Supported per FPGA (Part 2 of 2)
Device
Arria V
Device
Type
5AGXB1
Package Pin
Count
1,517
5AGXB3
Maximum Number of RLDRAM II CIO Interfaces
■
Three ×36 interfaces on top and bottom sides
■
No DQ pins on left and right sides
■
One ×36 interface on top and bottom sides
■
One ×9 interface on the right side
■
No DQ pins on the left side
■
One ×36 interface on top and bottom sides
■
No DQ pins on the left side
■
Four ×36 components on top and bottom sides
■
Three ×36 interfaces on right and left sides
5AGXB5
5AGXB7
5AGTD3
5AGTD7
5AGXA1
672
5AGXA3
5AGXA5
672
5AGXA7
Stratix III
EP3SL340
EP3SE50
1,760
484
EP3SL50
One ×9 interface on right and left sides
EP3SL70
Stratix IV
EP4SGX290 1,932
■
Three ×36 interfaces on top and bottom sides
EP4SGX360
■
Two ×36 interfaces on right and left sides
1,760
■
Three ×36 interfaces on each side
780
One ×36 interface on each side (no DQ pins on
right side)
1,932
■
Four ×36 interfaces on top and bottom sides
■
No DQ pins on left and right sides
■
On top side, two ×9 interfaces or one ×18
interfaces
■
On bottom side, three ×9 interfaces or two ×36
interfaces
■
No DQ pins on left and right sides
EP4SGX530
EP4SE530
EP4SE820
EP4SGX70
EP4SGX110
EP4SGX180
EP4SGX230
Stratix V
5SGXA5
5SGXA7
5SGXA3
780
5SGXA4
OCT Support for Arria II GX, Arria II GZ, Arria V, Cyclone V, Stratix III,
Stratix IV, and Stratix V Devices
This section is not applicable to Cyclone III and Cyclone IV devices as OCT is not used
by the Altera IP.
If you use OCT for your memory interfaces with Cyclone III and Cyclone IV devices,
refer to the Device I/O Features chapter in the Cyclone III or Cyclone IV Device Handbook.
If the memory interface uses any FPGA OCT calibrated series, parallel, or dynamic
termination for any I/O in your design, you need a calibration block for the OCT
circuitry. This calibration block is not required to be within the same bank or side of
the device as the memory interface pins. However, the block requires a pair of RUP and RDN pins, or an RZQ pin, placed within an I/O bank that has the same VCCIO voltage as the I/O pins that use the OCT calibration block.
The RZQ pin in Stratix V, Arria V, and Cyclone V devices is a dual-function pin that can also be used as a DQ or DQS pin when it is not used to support OCT. You can use
the DQS group in ×4 mode with non-differential DQS pins if the RZQ pin is part of a
×4 DQS group.
The RUP and RDN pins in Arria II GX, Arria II GZ, Stratix III, and Stratix IV devices are dual-function pins that can also be used as DQ and DQS pins when they are not used to support OCT, which has the following impact on your DQS groups:
■ If the RUP and RDN pins are part of a ×4 DQS group, you cannot use that DQS group in ×4 mode.
■ If the RUP and RDN pins are part of a ×8 DQS group, you can only use this group in ×8 mode if any of the following conditions apply:
  ■ You are not using DM or BWSn pins.
  ■ You are not using ×8 or ×9 QDR II and QDR II+ SRAM devices, as the RUP and RDN pins may also function as the CQn pins. In this case, pick different pin locations for the RUP and RDN pins to avoid conflict with the memory interface pin placement. You have the choice of placing the RUP and RDN pins in the same bank as the write data pin group or the address and command pin group.
  ■ You are not using complementary or differential DQS pins.
The QDR II and QDR II+ SRAM controller with UniPHY does not support ×8 QDR II and QDR II+ SRAM devices in the Quartus II software.
A DQS/DQ ×8/×9 group in Arria II GZ, Stratix III, and Stratix IV devices comprises
12 pins. A typical ×8 memory interface consists of one DQS, one DM, and eight DQ
pins which add up to 10 pins. If you choose your pin assignment carefully, you can
use the two extra pins for RUP and RDN. However, if you are using differential DQS, you do not have enough pins for RUP and RDN because you only have one pin left over. In this case, as you do not have to put the OCT calibration block with the DQS or DQ pins, you can pick different locations for the RUP and RDN pins. For example, you can place them in the I/O bank that contains the address and command pins, as this I/O
bank has the same VCCIO voltage as the I/O bank containing the DQS and DQ pins.
There is no restriction on using ×16/×18 or ×32/×36 DQS groups (which include the ×4 groups) when pin members are used as RUP and RDN pins, because there are enough extra pins that can be used as DQS or DQ pins.
You need to pick your DQS and DQ pins manually for the ×8, ×9, ×16 and ×18, or ×32 and ×36 groups if they use RUP and RDN pins within the group. The Quartus II software may not place these pins optimally and may give you a no-fit.
General Pin-out Guidelines
Altera recommends that you place all the pins for one memory interface (attached to
one controller) on the same side of the device. For projects where I/O availability is a challenge and it is therefore necessary to spread the interface across two sides, for optimal performance place all the input pins on one side, and the output pins on an adjacent side of the device along with their corresponding source-synchronous clock.
For a unidirectional data bus as in QDR II and QDR II+ SRAM interfaces, do not split
a read data pin group or a write data pin group onto two sides. It is also strongly
recommended not to split the address and command group onto two sides either,
especially when you are interfacing with QDR II and QDR II+ SRAM
burst-length-of-two devices, where the address signals are double data rate also.
Failure to adhere to these rules may result in timing failure.
In addition, there are some exceptions for the following interfaces:
■ ×36 emulated QDR II and QDR II+ SRAM in Arria II, Stratix III, and Stratix IV devices
■ RLDRAM II CIO devices
■ QDR II and QDR II+ SRAM burst-length-of-two devices
You need to compile the design in Quartus II to ensure that you are not violating signal integrity and Quartus II placement rules, which is critical when you have transceivers in the same design.
The following list gives some general guidelines on how to place pins optimally for
your memory interfaces:
1. For Arria II GZ, Arria V, Cyclone V, Stratix III, Stratix IV, and Stratix V designs, if
you are using OCT, the RUP and RDN, or RZQ pins need to be in any bank with the
same I/O voltage as your memory interface signals and often use two DQS and
DQ pins from a group. If you decide to place the RUP and RDN, or RZQ pins in a
bank where the DQS and DQ groups are used, place these pins first and then check how many DQ pins remain, to determine whether your data pins can fit in the remaining pins. Refer to “OCT Support for Arria II GX, Arria II GZ, Arria V,
Cyclone V, Stratix III, Stratix IV, and Stratix V Devices” on page 3–23.
2. Use the PLL that is on the same side as the memory interface. If the interface is spread out on two adjacent sides, you may use the PLL that is located on either adjacent side. You must use the dedicated input clock pin of that particular PLL as the reference clock, because the input of the memory interface PLL cannot come from the FPGA clock network.
3. The Altera IP uses the output of the memory interface PLL for the DLL input
reference clock. Therefore, ensure you pick a PLL that can directly feed a suitable
DLL.
Alternatively, you can use an external pin to feed into the DLL input
reference clock. The available pins are also listed in the External Memory
Interfaces chapter of the relevant device family handbook. You can also activate an unused PLL clock output, set it to the desired DLL frequency, and route it to a PLL dedicated output pin. Connect a trace on the PCB from
this output pin to the DLL reference clock pin, but be sure to include any
signal integrity requirements such as terminations.
4. Read data pins must use DQS and DQ group pins so that they have access to the DLL control signals.
In addition, QVLD pins in RLDRAM II and QDR II+ SRAM interfaces must use DQS group pins when the design uses the QVLD signal. None of the Altera IP uses QVLD pins as part of read capture, so you do not need to connect the QVLD pins if you are using the Altera solution. It is good practice to connect them anyway, in case a future version of the Altera solution uses the QVLD pins.
5. In differential clocking (DDR3/DDR2 SDRAM and RLDRAM II interfaces),
connect the positive leg of the read strobe or clock to a DQS pin, and the negative
leg of the read strobe or clock to a DQSn pin. For QDR II or QDR II+ SRAM
devices with 2.5 or 1.5 cycles of read latency, connect the CQ pin to a DQS pin, and
the CQn pin to a CQn pin (and not the DQSn pin). For QDR II or QDR II+ SRAM
devices with 2.0 cycles of read latency, connect the CQ pin to a CQn pin, and the
CQn pin to a DQS pin.
6. Write data pins (if unidirectional) and data mask pins (DM or BWSn) must use DQS groups. While the DLL phase shift is not used, using DQS groups for write data minimizes skew, and the write path must use the SW and TCCS timing analysis methodology.
7. Assign the write data strobe or write data clock (if unidirectional) to the corresponding DQS/DQSn pins of the DQS group whose DQ pins carry the associated write data (except in RLDRAM II CIO devices; refer to “Pin-out Rule Exceptions” on page 3–26).
When interfacing with DDR, DDR2, or DDR3 SDRAM without leveling, put the three CK and CK# pairs in a single ×4 DQS group to minimize skew between clocks and maximize margin for the tDQSS, tDSS, and tDSH specifications of the memory devices.
8. You can assign address pins to any user I/O pins. To minimize skew within the address pin group, you should assign the address and command pins to the same bank or side of the device.
9. Assign the command pins to any I/O pins and assign them in the same bank or device side as the other memory interface pins, especially the address and memory clock pins. The memory device usually uses the same clock to register address and command signals. (A pin-assignment sketch follows the notes below.)
In QDR II and QDR II+ SRAM interfaces where the memory clock also
registers the write data, assign the address and command pins in the same
I/O bank or same side as the write data pins, to minimize skew.
For more information about assigning memory clock pins for different
device families and memory standards, refer to “Pin Connection Guidelines
Tables” on page 3–33.
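In practice, guidelines 8 and 9 are usually captured in the project's pin assignments. The fragment below is only a sketch of that idea using the standard Quartus II set_location_assignment command; the pin locations and signal names are placeholders for illustration, not recommendations for any particular device.

    # Sketch: keep address and command pins in the same bank or side as the memory clock pins.
    # Pin locations and signal names below are placeholders for illustration only.
    set_location_assignment PIN_AA12 -to mem_ck[0]
    set_location_assignment PIN_AB12 -to mem_ck_n[0]

    # Address and command pins placed in the same I/O bank as mem_ck/mem_ck_n.
    set_location_assignment PIN_AA14 -to mem_a[0]
    set_location_assignment PIN_AB14 -to mem_a[1]
    set_location_assignment PIN_AC13 -to mem_we_n
    set_location_assignment PIN_AD13 -to mem_cas_n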
Pin-out Rule Exceptions
The following subsections describe exceptions to the rules described in “General Pin-out Guidelines” on page 3–24.
Exceptions for ×36 Emulated QDR II and QDR II+ SRAM Interfaces in
Arria II, Stratix III and Stratix IV Devices
A few packages in the Arria II, Stratix III, Stratix IV, and Stratix V device families do
not offer any ×32/×36 DQS groups where one read clock or strobe is associated with
32 or 36 read data pins. This limitation exists in the following I/O banks:
■ All I/O banks in U358- and F572-pin packages for all Arria II GX devices
■ All I/O banks in F484-pin packages for all Stratix III devices
■ All I/O banks in F780-pin packages for all Arria II GZ, Stratix III, and Stratix IV devices; top and side I/O banks in F780-pin packages for all Stratix V devices
■ All I/O banks in F1152-pin packages for all Arria II GZ, Stratix III, and Stratix IV devices, except EP4SGX290, EP4SGX360, EP4SGX530, EPAGZ300, and EPAGZ350 devices
■ Side I/O banks in F1517- and F1760-pin packages for all Stratix III devices
■ All I/O banks in F1517-pin packages for EP4SGX180, EP4SGX230, EP4S40G2, EP4S40G5, EP4S100G2, EP4S100G5, and EPAGZ225 devices
■ Side I/O banks in F1517-, F1760-, and F1932-pin packages for all Arria II GZ and Stratix IV devices
This restricts support for ×36 QDR II and QDR II+ SRAM devices. To support these memory devices, the following section describes how you can emulate the ×32/×36 DQS groups.
The maximum frequency supported in ×36 QDR II and QDR II+ SRAM interfaces using ×36 emulation is lower than the maximum frequency when using a native ×36 DQS group.
The F484-pin package in Stratix III devices cannot support ×32/×36 DQS group
emulation, as it does not support ×16/×18 DQS groups.
To emulate a ×32/×36 DQS group, combine two ×16/×18 DQS groups together. For
×36 QDR II and QDR II+ SRAM interfaces, the 36-bit wide read data bus uses two
×16/×18 groups; the 36-bit wide write data uses another two ×16/×18 groups or four
×8/×9 groups. The CQ and CQn signals from the QDR II and QDR II+ SRAM device
traces are then split on the board to connect to two pairs of CQ/CQn pins in the
External Memory Interface Handbook
Volume 2: Design Guidelines
November 2011 Altera Corporation
Chapter 3: Planning Pin and FPGA Resources
Pin-out Rule Exceptions
3–27
FPGA. You may then need to split the QVLD pins also (if you are connecting them).
These connections are the only connections on the board that you need to change for
this implementation. There is still only one pair of K and K# connections on the
board from the FPGA to the memory (see Figure 3–7). Use an external termination for
the CQ/CQn signals at the FPGA end. You can use the FPGA OCT features on the
other QDR II interface signals with ×36 emulation. In addition, there may be extra
assignments to be added with ×36 emulation.
Other QDR II and QDR II+ SRAM interface rules also apply for this implementation.
You may also combine four ×9 DQS groups (or two ×9 DQS groups and one ×18 group) on the same side of the device, if not in the same I/O bank, to emulate a ×36 write data group, if you need to fit the QDR II interface on a particular side of the device that does not have enough ×18 DQS groups available for write data pins. Altera does not recommend using ×4 groups, as the skew may be too large because you need eight ×4 groups to emulate the ×36 write data bits.
You cannot combine four ×9 groups to create a ×36 read data group as the loading on
the CQ pin is too large and hence the signal is degraded too much.
When splitting the CQ and CQn signals, the two trace lengths that go to the FPGA
pins must be as short as possible to reduce reflection. These traces must also have the
same trace delay from the FPGA pin to the Y or T junction on the board. The total trace delay from the memory device to each pin on the FPGA should match the Q trace delay (l2).
You must match the trace delays. However, matching trace length is only an
approximation to matching actual delay.
Figure 3–7. Board Trace Connection for Emulated ×36 QDR II and QDR II+ SRAM Interface
(The figure shows the D, A, and K/Kn traces routed with length l1, and the Q and split CQ/CQn traces routed with length l2; the CQ/CQn pair is split to two pairs of FPGA pins, each feeding an 18-bit DQ group.)
Timing Impact on ×36 Emulation
With ×36 emulation, the CQ/CQn signals are split on the board, so these signals see
two loads (to the two FPGA pins)—the DQ signals still only have one load. The
difference in loading gives some slew rate degradation, and a later CQ/CQn arrival
time at the FPGA pin.
The slew rate degradation factor is taken into account during timing analysis when
you indicate in the UniPHY Preset Editor that you are using ×36 emulation mode.
However, you must determine the difference in CQ/CQn arrival time as it is highly
dependent on your board topology.
The slew rate degradation factor for ×36 emulation assumes that CQ/CQn has a
slower slew rate than a regular ×36 interface. The slew rate degradation is assumed
not to be more than 500 ps (from 10% to 90% VCCIO swing). You may also modify your
board termination resistor to improve the slew rate of the ×36-emulated CQ/CQn
signals. If your modified board does not have any slew rate degradation, you do not
need to enable the ×36 emulation timing in the UniPHY-based controller
MegaWizard™ interface.
For more information about how to determine the CQ/CQn arrival time skew, refer to “Determining the CQ/CQn Arrival Time Skew” on page 3–30.
Because of this effect, the maximum frequency supported using ×36 emulation is lower than the maximum frequency supported using a native ×36 DQS group.
Rules to Combine Groups
For devices that do not have four ×16/×18 groups in a single side of the device to
form two ×36 groups for read and write data, you can form one ×36 group on one side
of the device, and another ×36 group on the other side of the device. All the read
groups have to be on the same edge (column I/O or row I/O) and all write groups
have to be on the same type of edge (column I/O or row I/O), so you can have an
interface with the read group in column I/O and the write group in row I/O. The only restriction is that you cannot combine an ×18 group from column I/O with an ×18 group from row I/O to form a ×36-emulated group.
For vertical migration with the ×36 emulation implementation, check if migration is
possible and enable device migration in the Quartus II software.
I/O bank 1C in both Stratix III and Stratix IV devices has dual-function configuration
pins. Some of the DQS pins may not be available for memory interfaces if these are
used for device configuration purposes.
Each side of the device in these packages has four remaining ×8/×9 groups. You can combine four of these remaining groups for the write side only, if you want to keep the ×36 QDR II and QDR II+ SRAM interface on one side of the device, by changing the Memory Interface Data Group assignment from the default of 18 to 9.
The ALTMEMPHY megafunction does not support ×36 mode emulation wraparound
interface, where the ×36 group consists of a ×18 group from the top/bottom I/O bank
and a ×18 group from the side I/O banks.
For more information about rules to combine groups for your target device, refer to the External Memory Interfaces chapter in the respective device handbooks.
Determining the CQ/CQn Arrival Time Skew
Before compiling a design in Quartus II, you need to determine the CQ/CQn arrival
time skew based on your board simulation. You then need to apply this skew in the
report_timing.tcl file of your QDR II and QDR II+ SRAM interface in the Quartus II
software. Figure 3–8 shows an example of a board topology comparing an emulated
case where CQ is double-loaded and a non-emulated case where CQ only has a single
load.
Figure 3–8. Board Simulation Topology Example
Run the simulation and look at the signal at the FPGA pin. Figure 3–9 shows an
example of the simulation results from Figure 3–8. As expected, the double-loaded
emulated signal, in pink, arrives at the FPGA pin later than the single-loaded signal,
in red. You then need to calculate the difference of this arrival time at VREF level (0.75
V in this case). Record the skew and rerun the simulation for the other two corners (slow-weak and fast-strong). To pick the largest and smallest skew to be included in the Quartus II timing analysis, follow these steps (an example of the edited values follows the steps):
1. Open the <variation_name>_report_timing.tcl and search for
tmin_additional_dqs_variation.
2. Set the minimum skew value from your board simulation to
tmin_additional_dqs_variation.
3. Set the maximum skew value from your board simulation to
tmax_additional_dqs_variation.
4. Save the .tcl file.
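For example, if board simulation showed a worst-case arrival-time skew between -20 ps and +60 ps, the edited variables in <variation_name>_report_timing.tcl might look like the following. The values are illustrative only, and the exact surrounding formatting of the generated file may differ; units are assumed to be nanoseconds, matching the rest of the generated timing script.

    # Skew values from board simulation (illustrative values, assumed in nanoseconds).
    set tmin_additional_dqs_variation -0.020
    set tmax_additional_dqs_variation  0.060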
Figure 3–9. Board Simulation Results
Exceptions for RLDRAM II Interfaces
RLDRAM II CIO devices have one bidirectional bus for the data, but two different sets of clocks: one for read and one for write. Because the QK and QK# signals already occupy the DQS and DQSn pins needed for read, placement of the DK and DK# pins is restricted by the limited number of pins in the FPGA. This limitation causes the exceptions to the previous rules, which are discussed in the following sections.
The address or command pins of RLDRAM II must be placed in a DQ-group because
these pins are driven by the PHY clock. Two master RLDRAM II interfaces must not
share a single I/O sub-bank because these interfaces require strict timing at 800 MHz
and above. Half-rate RLDRAM II interfaces use the PHY clock for both the DQ pins
and the address or command pins.
Currently, full-rate interfaces do not use the PHY clock tree. However, in future
software releases, full-rate interfaces will have the same pin requirements as the
half-rate interfaces.
DK and DK# signals need to use DQS- and DQSn-capable pins to ensure accurate
timing analysis, as the TCCS specifications are characterized using DQS and DQSn
pins. As you must use the DQS and DQSn pins for the DQS group to connect to QK
and QK# pins, pick a pair of DQ pins that are DQS and DQSn pins when configured
as a smaller DQS group size. For example, if the interface uses a ×16/×18 DQS group whose DQS and DQSn pins connect to the QK and QK# pins, pick differential DQ pin pairs from that DQS group that are the DQS and DQSn pins of the ×8/×9 or ×4 DQS groups.
Interfacing with ×9 RLDRAM II CIO Devices
These devices have the following pins:
■ 2 pins for QK and QK# signals
■ 9 DQ pins (in a ×8/×9 DQS group)
■ 2 pins for DK and DK# signals
■ 1 DM pin
■ 1 QVLD pin
■ 15 pins total
In the FPGA, the ×8/×9 DQS group consists of 12 pins: 2 for the read clocks and 10 for
the data. In this case, move the QVLD (if you want to keep this connected even
though this is not used in the Altera memory interface solution) and the DK and DK#
pins to the adjacent DQS group. If that group is in use, move to any available user I/O
pins in the same I/O bank. The DK and DK# must use DQS- and DQSn-capable pins.
Interfacing with ×18 RLDRAM II CIO Devices
These devices have the following pins:
■ 4 pins for QK/QK# signals
■ 18 DQ pins (in ×8/×9 DQS groups)
■ 2 pins for DK/DK# signals
■ 1 DM pin
■ 1 QVLD pin
■ 26 pins total
In the FPGA, you use two ×8/×9 DQS groups totaling 24 pins: 4 for the read clocks and 18 for the read data. In this case, move the DK and DK# pins to DQS- and DQSn-capable pins in the adjacent DQS group. If that group is in use, move them to any DQS- and DQSn-capable pins in the same I/O bank.
Each ×8/×9 group has one DQ pin left over that can either use QVLD or DM, so one
×8/×9 group has the DM pin associated with that group and one ×8/×9 group has the
QVLD pin associated with that group.
Interfacing with RLDRAM II ×36 CIO Devices
These devices have the following pins:
■ 4 pins for QK/QK# signals
■ 36 DQ pins (in ×16/×18 DQS groups)
■ 4 pins for DK/DK# signals
■ 1 DM pin
■ 1 QVLD pin
■ 46 pins total
In the FPGA, you use two ×16/×18 DQS groups totaling 48 pins: 4 for the read clocks
and 36 for the read data. Configure each ×16/×18 DQS group to have:
■ Two QK/QK# pins occupying the DQS/DQSn pins.
■ Two DQ pins for the DK and DK# pins; pick DQ pins in the ×16/×18 DQS group that are DQS and DQSn pins in the ×4 or ×8/×9 DQS groups.
■ 18 DQ pins occupying the DQ pins.
■ Two leftover DQ pins that you can use for the QVLD or DM pins. Put the DM pin in the group associated with DK[1] and the QVLD pin in the group associated with DK[0].
Check that DM is associated with DK[1] for your chosen memory
component.
Exceptions for QDR II and QDR II+ SRAM Burst-length-of-two Interfaces
If you are using the QDR II and QDR II+ SRAM burst-length-of-two devices, you may
want to place the address pins in a DQS group to minimize skew, because these pins
are now double data rate too. The address pins typically do not exceed 22 bits, so you may use one ×18 DQS group or two ×9 DQS groups on the same side of the device, if not in the same I/O bank. In Stratix III, Stratix IV, and Stratix V devices, one ×18 group
typically has 22 DQ bits and 2 pins for DQS/DQSn pins, while one ×9 group typically
has 10 DQ bits with 2 pins for DQS/DQSn pins. Using ×4 DQS groups should be a last
resort.
Pin Connection Guidelines Tables
Table 3–8 on page 3–34 lists the FPGA pin utilization for DDR, DDR2, and DDR3 SDRAM without leveling interfaces.
Interface
Pin
Description
Memory
Device Pin
Name
Memory
System
Clock
CK and
CK# (1), (2)
Table 3–8. FPGA Pin Utilization for DDR, DDR2, and DDR3 SDRAM without Leveling Interfaces (Part 1 of 2)
FPGA Pin Utilization
Arria II GX
Cyclone III and
Cyclone IV
Arria II GX: If you are using differential DQS signaling in ALTMEMPHY IP, to improve timing place the CK/CK# pair on the DQ or DQS pins with DIFFIO_RX or DIFFIN capability in the same bank or on the same side as the data pins. You can use either side of the device for wraparound interfaces. If there are other CK/CK# pairs, place them on DIFFOUT in the same single DQ group of adequate width to minimize skew.
Cyclone III and Cyclone IV: Place any differential I/O pin pair (DIFFIO) in the same bank or on the same side as the data pins. You can use either side of the device for wraparound interfaces. The first CK/CK# pair cannot be placed in the same row or column pad group as any of the DQ pins (Figure 3–10 and Figure 3–11).
For example, DIMMs requiring three memory
clock pin-pairs must use a ×4 DQS group,
where the mem_clk[0] and mem_clk_n[0] use the DIFF_RX or DIFFIN pins in the group, while mem_clk[2:1] and mem_clk_n[2:1]
pins use DIFFOUT pins in that DQS group.
If you are using single-ended DQS signaling,
place any unused DQ or DQS pins with
DIFFOUT capability located in the same bank or
on the same side as the data pins.
Arria V, Cyclone V,
and Stratix V
If you are using single-ended DQS signaling,
place any DIFFOUT pins in the same bank or
on the same side as the data pins
If you are using
single-ended DQS
signaling, place any
unused DQ or DQS
pins with DIFFOUT
capability in the same
bank or on the same
side as the data pins.
If you are using differential DQS
signaling in ALTMEMPHY IP, the first CK/CK#
pair must use any unused DIFFIO_RX pins in
the same bank or on the same side as the data
pins. You can use either side of the device for
wraparound interfaces.
If there are other CK/CK# pairs, place them on
DIFFOUT in the same single DQ group of
adequate width to minimize skew.
For example, DIMMs requiring three memory
clock pin-pairs must use a ×4 DQS group,
where mem_clk[0] and mem_clk_n[0] pins
use the DIFFIO_RX or DIFFIN pins in that
group, while,
If you are using
differential DQS
signaling, place any
unused DQ or DQS
pins with DIFFOUT
capability for the
mem_clk[n:0] and
mem_clk_n[n:0]
signals (where n>=0).
mem_clk[2:1] and mem_clk_n[2:1] pins
use DIFFOUT pins in that DQS group.
Do not place CK and
CK# pins in the same
group as any other
DQ or DQS pins.
If you are using differential DQS signaling in
UniPHY IP, place any DIFFOUT pins in the
same bank or on the same side as the data
pins. If there are multiple CK/CK# pairs, place
them on DIFFOUT in the same single DQ
group of adequate width.
If there are multiple
CK and CK# pin pairs,
place them on
DIFFOUT in the same
single DQ group of
adequate width.
For example, DIMMs requiring three memory
clock pin-pairs must use a ×4 DQS group.
Clock
Source
—
Dedicated PLL clock input pin with direct connection to the PLL (not using the global clock network).
For Arria II GX, Arria II GZ, Stratix III, Stratix IV, and Stratix V devices, also ensure that the PLL can supply the input reference clock to the DLL.
Otherwise, refer to alternative DLL input reference clocks (“General Pin-out Guidelines” on page 3–24).
If you are using differential DQS signaling in
UniPHY IP, place on DIFFOUT in the same
single DQ group of adequate width to minimize
skew.
Arria II GZ, Stratix III, and
Stratix IV
Interface
Pin
Description
Memory
Device Pin
Name
FPGA Pin Utilization
Arria II GX
Cyclone III and
Cyclone IV
Arria II GZ, Stratix III, and
Stratix IV
Arria V, Cyclone V,
and Stratix V
Reset
—
Dedicated clock input pin to accommodate the high fan-out signal.
Data
DQ
Data mask
DM
DQ in the pin table, marked as Q in the Quartus II Pin Planner. Each DQ group has a common background color for all of the DQ and DM pins,
associated with DQS (and DQSn) pins.
Data strobe
DQS or
DQS and
DQSn
(DDR2 and
DDR3
SDRAM
only)
DQS (S in the Quartus II Pin Planner) for single-ended DQS signaling or DQS and DQSn (S and Sbar in the Quartus II Pin Planner) for differential
DQS signaling. DDR2 supports either single-ended or differential DQS signaling. However, Cyclone III and Cyclone IV devices do not support
differential DQS signaling. DDR3 SDRAM mandates differential DQS signaling.
Address and
command
A[], BA[],
CAS#,
CKE, CS#,
ODT,
RAS#,
WE#,
RESET#
Any user I/O pin. To minimize skew, you must place the address and command pins in the same bank or side of the device as the CK/CK# pins,
DQ, DQS, or DM pins. The reset# signal is only available in DDR3 SDRAM interfaces. Altera devices use the SSTL-15 I/O standard on the
RESET# signal to meet the voltage requirements of 1.5 V CMOS at the memory device. Altera recommends that you do not terminate the
RESET# signal to VTT.
Table 3–8. FPGA Pin Utilization for DDR, DDR2, and DDR3 SDRAM without Leveling Interfaces (Part 2 of 2)
Notes to Table 3–8:
(1) The first CK/CK# pair refers to mem_clk[0] or mem_clk_n[0] in the IP core.
(2) The restriction on the placement for the first CK/CK# pair is required because this placement allows the mimic path that the IP VT tracking uses to go through differential I/O buffers to mimic the
differential DQS signals.
Additional Placement Rules for Cyclone III and Cyclone IV Devices
Assigning the mem_clk[0] pin to the same row or column pad group as the DQ pins results in a failure to constrain the DDIO input nodes correctly and to close timing. Hence, the Read Capture and Write timing margins computed by TimeQuest may not be valid because the assumptions made by the timing scripts are violated.
Figure 3–10 shows an example of assigning mem_clk[0] and mem_clk_n[0] incorrectly. As you can see, the mem_clk[0] pin is assigned to the same column pad group as a mem_dq pin (in column X = 1). This assignment results in the Quartus II software showing the
following critical warning:
Register <name> fed by pin mem_clk[0] must be placed in adjacent LAB X:1
Y:0 instead of X:2 Y:0
Figure 3–10. Incorrect Placement of mem_clk[0] and mem_clk_n[0] in Cyclone III and Cyclone IV
Devices.
To eliminate this critical warning, assign the mem_clk[0] pin to a different column or row from the data pins (Figure 3–11).
Figure 3–11. Correct Placement of mem_clk[0] and mem_clk_n[0] in Cyclone III and Cyclone IV
Devices.
Table 3–9 lists the FPGA pin utilization for DDR3 SDRAM with leveling interfaces.
Table 3–9. DDR3 SDRAM With Leveling Interface Pin Utilization Applicable for Stratix III, Stratix IV, and Stratix V
Devices (Part 1 of 2)
Interface Pin
Description
Data
Memory Device Pin
Name
DQ
Data Mask
DM
FPGA Pin Utilization
DQ in the pin table, marked as Q in the Quartus II Pin Planner. Each DQ
group has a common background color for all of the DQ and DM pins,
associated with DQS (and DQSn) pins. The ×4 DIMM has the following
mapping between DQS and DQ pins:
■ DQS[0] maps to DQ[3:0]
■ DQS[9] maps to DQ[7:4]
■ DQS[1] maps to DQ[11:8]
■ DQS[10] maps to DQ[15:12]
The DQS pin index in other DIMM configurations typically increases sequentially with the DQ pin index (DQS[0]: DQ[3:0]; DQS[1]: DQ[7:4]; DQS[2]: DQ[11:8]). In this DIMM configuration, the DQS pins are indexed this way to ensure that the pin-out is compatible with both ×4 and ×8 DIMMs.
Data Strobe
DQS and DQSn
DQS and DQSn (S and Sbar in the Quartus II Pin Planner)
A[], BA[], CAS#, CKE,
CS#, ODT, RAS#, WE#,
Any user I/O pin. To minimize skew, you should place address and
command pins in the same bank or side of the device as the following
pins: CK/CK# pins, DQ, DQS, or DM pins.
RESET#
ALTMEMPHY uses the SSTL-15 I/O standard and UniPHY uses the
1.5 V CMOS I/O standard on the RESET# signal. Both standards are
valid. However, Altera recommends that you use the 1.5V CMOS I/O
standard. If your board is already using the SSTL-15 I/O standard, you
do not terminate the RESET# signal to VTT.
Address and Command
For controllers with ALTMEMPHY IP, the first CK/CK# pin pairs
(namely mem_clk[0] or mem_clk_n[0] in the IP) must use any
unused DQ or DQS pins with DIFFIO_RX capability pins in the same
bank or on the same side as the data pins. You can use either side of
the device for wraparound interfaces. This placement is to allow the
mimic path used in the IP VT tracking to go through differential I/O
buffers to mimic the differential DQS signals. Any other CK/CK# pin
pairs (mem_clk[n:1] and mem_clk_n [n:1]) can use any unused
DQ or DQS pins in the same bank or on the same side as the data pins.
Memory system clock
CK and CK#
For controllers with UniPHY IP, you can assign the memory clock to
any unused DIFF_OUT pins in the same bank or on the same side as
the data pins. However, for Stratix V devices, place the memory clock
pins to any unused DQ or DQS pins. Do not place the memory clock
pins in the same DQ group as any other DQ or DQS pins.
If there are multiple CK/CK# pin pairs using Stratix V devices, you
must place them on DIFFOUT in the same single DQ groups of
adequate width. For example, DIMMs requiring three memory clock
pin-pairs must use a ×4 DQS group.
Placing the multiple CK/CK# pin pairs on DIFFOUT in the same single
DQ groups for Stratix III and Stratix IV devices improves timing.
Table 3–9. DDR3 SDRAM With Leveling Interface Pin Utilization Applicable for Stratix III, Stratix IV, and Stratix V
Devices (Part 2 of 2)
Interface Pin
Description
Memory Device Pin
Name
FPGA Pin Utilization
Clock Source
—
Dedicated PLL clock input pin with direct (not using a global clock net)
connection to the PLL and optional DLL required by the interface.
Reset
—
Dedicated clock input pin to accommodate the high fan-out signal.
Table 3–10 lists the FPGA pin utilization for QDR II and QDR II+ SRAM interfaces.
Table 3–10. QDR II and QDR II+ SRAM Pin Utilization for Arria II, Arria V, Stratix III, Stratix IV, and Stratix V Devices
Memory Device Pin
Name
Interface Pin Description
Read Clock
CQ and CQn
FPGA Pin Utilization
For QDR II SRAM devices with 1.5 or 2.5 cycles of read
latency or QDR II+ SRAM devices with 2.5 cycles of read
latency, connect CQ to DQS pin (S in the Quartus II Pin
Planner), and CQn to CQn pin (Qbar in the Quartus II Pin
Planner).
For QDR II or QDR II+ SRAM devices with 2.0 cycles of read
latency, connect CQ to CQn pin (Qbar in the Quartus II Pin
Planner), and CQn to DQS pin (S in the Quartus II Pin
Planner).
Read Data
Q
Data Valid
QVLD
Memory and Write Data Clock
K and K#
Write Data
D
Byte Write Select
BWS#, NWS#
DQ pins (Q in the Quartus II Pin Planner). Ensure that you are
using the DQ pins associated with the chosen read clock pins
(DQS and CQn pins). QVLD pins are only available for QDR II+
SRAM devices and note that Altera IP does not use the QVLD
pin.
DQS and DQSn pins associated with the write data pins, S
and Sbar in the Quartus II Pin Planner.
DQ pins. Ensure that you are using the DQ pins associated
with the chosen memory and write data clock pins (DQS and
DQSn pins).
Address and Control Signals
A, WPS#, RPS#
Any user I/O pin. To minimize skew, you should place address
and command pins in the same bank or side of the device as
the following pins: K and K# pins, DQ, DQS, BWS#, and
NWS# pins. If you are using burst-length-of-two devices,
place the address signals in a DQS group pin as these signals
are now double data rate.
Clock source
—
Dedicated PLL clock input pin with direct (not using a global
clock net) connection to the PLL and optional DLL required by
the interface.
Reset
—
Dedicated clock input pin to accommodate the high fan-out
signal
Table 3–11 lists the FPGA pin utilization for RLDRAM II CIO interfaces.
Table 3–11. RLDRAM II CIO Pin Utilization for Arria II GZ, Arria V, Stratix III, Stratix IV, and Stratix V Devices
Interface
Pin
Description
Memory Device
Pin Name
FPGA Pin Utilization
Read Clock
QK and QK#
DQS and DQSn pins (S and Sbar in the Quartus II Pin Planner)
Data
Q
Data Valid
QVLD
Data Mask
DM
DQ pins (Q in the Quartus II Pin Planner). Ensure that you are using the DQ pins
associated with the chosen read clock pins (DQS and DQSn pins). Altera IP does not
use the QVLD pin. You may leave this pin unconnected on your board. You may not be
able to fit these pins in a DQS group. For more information about how to place these
pins, refer to “Exceptions for RLDRAM II Interfaces” on page 3–31.
Write Data
Clock
DK and DK#
DQ pins in the same DQS group as the read data (Q) pins or in adjacent DQS group or
in the same bank as the address and command pins. For more information, refer to
“Exceptions for RLDRAM II Interfaces” on page 3–31. DK/DK# must use differential
output-capable pins.
For Nios-based configuration, the DK pins must be in a DQ group but the DK pins do
not have to be in the same group as the data or QK pins.
Any differential output-capable pins.
For Stratix V devices, place any unused DQ or DQS pins with DIFFOUT capability. Place
the memory clock pins either in the same bank as the DK or DK# pins to improve DK
versus CK timing, or in the same bank as the address and command pins to improve
address command timing. Do not place CK and CK# pins in the same DQ group as any
other DQ or DQS pins.
Memory
Clock
CK and CK#
Address
and Control
Signals
A, BA, CS#, REF#,
WE#
Any user I/O pins. To minimize skew, you should place address and command pins in
the same bank or side of the device as the following pins: CK/CK# pins, DQ, DQS, and
DM pins.
Clock
source
—
Dedicated PLL clock input pin with direct (not using a global clock net) connection to
the PLL and optional DLL required by the interface.
Reset
—
Dedicated clock input pin to accommodate the high fan-out signal
Table 3–12 lists the FPGA pin utilization for RLDRAM II SIO interfaces.
Table 3–12. RLDRAM II SIO Pin Utilization Applicable for Arria II GZ, Arria V, Stratix III, Stratix IV, and Stratix V Devices
Read Clock (QK and QK#): DQS and DQSn pins (S and Sbar in the Quartus II Pin Planner) in the same DQS group as the respective read data (Q) pins.
Read Data (Q) and Data Valid (QVLD): DQ pins (Q in the Quartus II Pin Planner). Ensure that you are using the DQ pins associated with the chosen read clock (DQS and DQSn) pins. Altera does not use the QVLD pin; you may leave this pin unconnected on your board.
Memory and Write Data Clock (DK and DK#): DQS and DQSn pins (S and Sbar in the Quartus II Pin Planner) in the same DQS group as the respective write data (D) pins. For Nios-based configuration, the DK pins must be in a DQ group, but they do not have to be in the same group as the data or QK pins.
Write Data (D) and Data Mask (DM): DQ pins. Ensure that you are using the DQ pins associated with the chosen write data clock (DQS and DQSn) pins.
Memory Clock (CK and CK#): Any differential output-capable pins. For Stratix V devices, place any unused DQ or DQS pins with DIFFOUT capability. Place the memory clock pins either in the same bank as the DK or DK# pins to improve DK versus CK timing, or in the same bank as the address and command pins to improve address and command timing. Do not place CK and CK# pins in the same DQ group as any other DQ or DQS pins.
Address and Control Signals (A, BA, CS#, REF#, WE#): Any user I/O pin. To minimize skew, you should place address and command pins in the same bank or side of the device as the following pins: CK/CK#, DQ, DQS, or DM pins.
Clock source: Dedicated PLL clock input pin with direct (not using a global clock net) connection to the PLL and optional DLL required by the interface.
Reset: Dedicated clock input pin to accommodate the high fan-out signal.
Additional Guidelines for Stratix V Devices
This section provides guidelines on how to improve timing for Stratix V devices and
the rules that you must follow to overcome timing failures.
Performing Manual Pin Placement
Table 3–13 lists a set of rules that you can follow to perform proper manual pin placement and avoid timing failures.
The rules are categorized as follows:
■ Mandatory—This rule is mandatory and cannot be violated; violating it results in a no-fit error.
■ Recommended—This rule is recommended; if it is violated, the implementation is legal but timing is degraded.
■ Highly Recommended—This rule is not mandatory but is highly recommended, because disregarding it might result in timing violations.
Table 3–13. Manual Pin Placement Rules

Mandatory:
■ Must place all CK, CK#, address, control, and command pins of an interface in the same I/O sub-bank. (Frequency: > 800 MHz; Device: All.) Reason: For optimum timing, clock and data output paths must share as much hardware as possible. For write data pins (for example, DQ/DQS), the best timing is achieved through the DQS groups.
■ Must not split an interface between the top and bottom sides of the device. (Frequency: Any; Device: All.) Reason: PLLs and DLLs on the top edge cannot access the bottom edge of a device, and vice-versa.
■ Must not place pins from separate interfaces in the same I/O sub-banks unless the interfaces share PLL or DLL resources. (Frequency: Any; Device: All.) Reason: All pins require access to the same leveling block.
■ Must not share the same PLL input reference clock unless the interfaces share PLL or DLL resources. (Frequency: Any; Device: All.) Reason: Sharing the same PLL input reference clock forces the same ff-PLL to be used; each ff-PLL can drive only one PHY clock tree, and interfaces not sharing a PLL cannot share a PHY clock tree.

Recommended:
■ Place all CK, CK#, address, control, and command pins of an interface in the same I/O sub-bank. (Frequency: < 800 MHz; Device: All.) Reason: For optimum timing, clock and data output paths should share as much hardware as possible. For write data pins (for example, DQ/DQS), the best timing is achieved through the DQS groups.
■ Avoid using I/Os at the device corners (for example, sub-bank "A"). (Frequency: Any; Device: A7 (1).) Reason: Extra delay is incurred to reach the sub-banks in the corners.
■ Avoid straddling an interface across the center PLL. (Frequency: Any; Device: All.) Reason: Straddling the center PLL causes timing degradation, because it increases the length of the PHY clock tree and generates higher jitter.
■ Use the center PLL (f-PLL1) for a wide interface that must straddle across the center PLL. (Frequency: >= 800 MHz; Device: All.) Reason: Using a non-center PLL results in driving a sub-bank in the opposite quadrant due to the long PHY clock tree delay.
■ Place the DQS/DQS# pins such that all DQ groups of the same interface are next to each other and do not span across the center PLL. (Frequency: Any; Device: All.)
■ Place CK, CK#, address, control, and command pins in the same quadrant as the DQ groups for improved timing in general. (Frequency: Any; Device: All.) Reason: To ease core timing closure; if the pins are too far apart, the core logic is also placed far apart, which makes timing closure difficult.

Highly Recommended:
■ Place all CK, CK#, address, control, and command pins of an interface in the same I/O sub-bank. (Frequency: 800 MHz; Device: All.) Reason: For optimum timing, clock and data output paths should share as much hardware as possible. For write data pins (for example, DQ/DQS), the best timing is achieved through the DQS groups.
■ Use the center PLL and ensure that the PLL input reference clock pin is placed at a location that can drive the center PLL. (Frequency: >= 800 MHz; Device: All.) Reason: Using a non-center PLL results in driving a sub-bank in the opposite quadrant due to the long PHY clock tree delay.
■ If the center PLL is not accessible, place pins in the same quadrant as the PLL. (Frequency: >= 800 MHz; Device: All.)
Note to Table 3–13:
(1) This rule is currently applicable to A7 devices only. This rule might be applied to other devices in the future if they show the same failure.
PLLs and Clock Networks
The exact number of clocks and PLLs required in your design depends greatly on the
memory interface frequency, and the IP that your design uses.
For example, you can build simple DDR slow-speed interfaces that typically require
only two clocks: system and write. You can then use the rising and falling edges of
these two clocks to derive four phases (0°, 90°, 180°, and 270°). However, as clock
speeds increase, the timing margin decreases and additional clocks are required to
optimize setup and hold and to meet timing. Typically, at higher clock speeds, you need
dedicated clocks for resynchronization and for the address and command paths.
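For a sense of the timing that these phases represent, the following illustrative Python sketch converts each of the four phases into a time offset. The 167-MHz example frequency is an assumption chosen for illustration and is not a value taken from this handbook.

def phase_to_ps(freq_mhz: float, phase_deg: float) -> float:
    """Return the phase offset in picoseconds for a clock of the given frequency."""
    period_ps = 1.0e6 / freq_mhz  # clock period in picoseconds
    return period_ps * phase_deg / 360.0

for phase in (0, 90, 180, 270):
    # for a hypothetical 167-MHz interface, 90 degrees corresponds to roughly 1.5 ns
    print(f"{phase:>3} deg -> {phase_to_ps(167, phase):7.1f} ps")

As the numbers show, the usable phase steps shrink proportionally as the interface frequency rises, which is why higher-speed interfaces need the additional dedicated clocks described above.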
In addition, ALTMEMPHY-based interfaces use a VT tracking clock to measure and
compensate for VT changes and their effects.
Altera memory interface IP uses one PLL, which generates the various clocks needed
in the memory interface data path and controller, and provides the required phase
shifts for the write clock and address and command clock. The PLL is instantiated
when you generate the Altera memory IPs.
By default, the memory interface IP uses the PLL to generate the input reference clock
for the DLL; this is available in all device families except the Cyclone III and Cyclone IV
devices. This method eliminates the need for an extra pin for the DLL input reference
clock.
The input reference clock to the DLL can come from certain input clock pins or from
the clock outputs of certain PLLs.
f For the actual pins and PLLs connected to the DLLs, refer to the External Memory
Interfaces chapter of the relevant device family handbook.
You must use the PLL located in the same device quadrant or side as the memory
interface and the corresponding dedicated clock input pin for that PLL, to ensure
optimal performance and accurate timing results from the Quartus II software. The
input clock to the PLL should not fan out to any logic other than the PHY, as you
cannot use a global clock resource for the path from the clock input pin to the PLL.
Table 3–14 and Table 3–15 list a comparison of the number of PLLs and dedicated
clock outputs available respectively in Arria II, Cyclone III, Cyclone IV, Stratix III,
Stratix IV, and Stratix V devices.
Table 3–14. Number of PLLs Available in Altera Device Families (1)

Device Family                 Enhanced PLLs Available
Arria II GX                   4-6
Arria II GZ                   3-8
Arria V                       16-24
Cyclone III and Cyclone IV    2-4
Cyclone V                     4-8
Stratix III                   4-12
Stratix IV                    3-12
Stratix V (fPLL)              22-28

Note to Table 3–14:
(1) For more details, refer to the Clock Networks and PLL chapter of the respective device family handbook.
Table 3–15. Number of Enhanced PLL Clock Outputs and Dedicated Clock Outputs Available in
Altera Device Families (1) (Part 1 of 2)
Device Family
Number of Enhanced PLL Clock
Outputs
Number Dedicated Clock Outputs
1 single-ended or 1 differential pair
Arria II GX
(2)
7 clock outputs each
3 single-ended or 3 differential pair total
(3)
Arria V
18 clock outputs each
4 single-ended or 2 single-ended and 1
differential pair
Cyclone III and
Cyclone IV
5 clock outputs each
1 single-ended or 1 differential pair total
(not for memory interface use)
Table 3–15. Number of Enhanced PLL Clock Outputs and Dedicated Clock Outputs Available in
Altera Device Families (1) (Part 2 of 2)
Device Family
Stratix III
Number of Enhanced PLL Clock
Outputs
Left/right: 7 clock outputs
Top/bottom: 10 clock outputs
Arria II GZ and
Stratix IV
Left/right: 7 clock outputs
Stratix V
18 clock outputs each
Top/bottom: 10 clock outputs
Number Dedicated Clock Outputs
Left/right: 2 single-ended or 1
differential pair
Top/bottom: 6 single-ended or 4
single-ended and 1 differential pair
Left/right: 2 single-ended or 1
differential pair
Top/bottom: 6 single-ended or 4
single-ended and 1 differential pair
4 single-ended or 2 single-ended and 1
differential pair
Notes to Table 3–15:
(1) For more details, refer to the Clock Networks and PLL chapter of the respective device family handbook.
(2) PLL_5 and PLL_6 of Arria II GX devices do not have dedicated clock outputs.
(3) The same PLL clock outputs drives three single-ended or three differential I/O pairs, which are only supported in
PLL_1 and PLL_3 of the EP2AGX95, EP2AGX125, EP2AGX190, and EP2AGX260 devices.
Table 3–16 lists the number of clock networks available in the Altera device families.
Table 3–16. Number of Clock Networks Available in Altera Device Families (1)

Device Family                 Global Clock Network    Regional Clock Network
Arria II GX                   16                      48
Arria II GZ                   16                      64–88
Arria V                       16                      88
Cyclone III and Cyclone IV    10-20                   N/A
Cyclone V                     16                      N/A
Stratix III                   16                      64–88
Stratix IV                    16                      64–88
Stratix V                     16                      92

Note to Table 3–16:
(1) For more information on the number of available clock network resources per device quadrant, to better understand the number of clock networks available for your interface, refer to the Clock Networks and PLL chapter of the respective device family handbook.
1
You must decide whether you need to share clock networks, PLL clock outputs, or
PLLs if you are implementing multiple memory interfaces.
Table 3–17 through Table 3–19 list the number of PLL outputs and clock networks
required for the memory standards using Altera IP. Table 3–20 lists the names and
frequency of the clocks used.
Table 3–17. Clock Network Usage in ALTMEMPHY-based Memory Standards
DDR3 SDRAM
DDR2/DDR SDRAM
Half-Rate
Half-Rate
Full-Rate
Device
Number of
full-rate
clock
Arria II GX
4 global
Number of full- Number of half- Number of full- Number of halfrate clock
rate clock
rate clock
rate clock
2 global
Cyclone III and
Cyclone IV
—
1 global
Stratix III and
Stratix IV
Number of
half-rate
clock
2 regional
2 regional
4 global
2 global
5 global
1 global
4 global
1 global
5 global
1 regional
1 global
1 global
2 dual-regional
2 dual-regional
2 dual-regional
—
Table 3–18. Clock Network Usage in UniPHY-based Memory Interfaces—DDR2 and DDR3 SDRAM
Device
Stratix III
3 global
Arria II GZ and Stratix IV
3 global
DDR2 SDRAM
Half-Rate
Half-Rate
1 global
Stratix V
(1) (1)
DDR3 SDRAM
Number of full-rate
clock
Number of half-rate
clock
Number of full-rate
clock
Number of half-rate
clock
1 global
1 global
1 global
1 regional
2 global
1 regional
1 global
1 regional
1 global
1 regional
2 regional
1 regional
2 global
2 regional
2 dual-regional
1 regional
2 global
2 regional
Notes to Table 3–18:
(1) There are two additional regional clocks, pll_avl_clk and pll_config_clk for DDR2 and DDR3 SDRAM with UniPHY memory interfaces.
(2) In multiple interface designs with other IP, the clock network might need to be modified to get a design to fit. For more information, refer to
Clock Networks and PLLs chapter in the respective device handbooks.
Table 3–19. Clock Network Usage in UniPHY-based Memory Interfaces—RLDRAM II, and QDR II and QDR II+ SRAM
RLDRAM II
Half-Rate
QDR II/QDR II+ SRAM
Full-Rate
Half-Rate
Full-Rate
Device
Arria II GX
Number of
full-rate
clock
Number of
half-rate
clock
Number of
full-rate
clock
—
—
Stratix III
2 regional
Arria II GZ and Stratix IV
2 regional
Number of
full-rate
clock
Number of
half-rate
clock
Number of
full-rate
clock
—
2 global
2 global
4 global
1 global
1 global
1 global
1 regional
2 regional
1 regional
1 global
1 global
1 global
1 regional
2 regional
1 regional
2 regional
2 regional
1 global
2 regional
1 global
2 regional
1
For more information about the clocks used in UniPHY-based memory standards,
refer to the Functional Description—UniPHY chapter in volume 3 of the External
Memory Interface Handbook.
Table 3–20. Clocks Used in the ALTMEMPHY Megafunction (1)

phy_clk_1x: Static system clock for the half-rate data path and controller.
mem_clk_2x: Static DQS output clock that generates the DQS and CK/CK# signals, the input reference clock to the DLL, and the system clock for the full-rate datapath and controller.
mem_clk_1x: This clock drives the aux_clk output, used for clocking DQS or as a reference clock for the memory devices.
write_clk_2x: Static DQ output clock used to generate DQ signals 90° earlier than DQS signals. May also generate the address and command signals.
mem_clk_ext_2x: Used only if the memory clock generation uses dedicated output pins. Applicable only in HardCopy® II or Stratix II prototyping for HardCopy II designs.
resync_clk_2x: Dynamic-phase clock used for resynchronization and postamble paths. Currently, this clock cannot be shared by multiple interfaces.
measure_clk_2x/measure_clk_1x (2): Dynamic-phase clock used for VT tracking purposes. Currently, this clock cannot be shared by multiple interfaces.
ac_clk_2x and ac_clk_1x: Dedicated static clocks for the address and command signals.
scan_clk: Static clock to reconfigure the PLL.
seq_clk: Static clock for the sequencer logic.

Notes to Table 3–20:
(1) For more information about the clocks used in the ALTMEMPHY megafunction, refer to the Clock Networks and PLL chapter of the respective device family handbook.
(2) This clock should use the same type of clock network as the resync_clk_2x clock.
In every ALTMEMPHY solution, the measure_clk and resync_clk_2x clocks
(Table 3–20) are calibrated and hence may not be shared or used for other modules in
your system. You may be able to share the other statically phase-shifted clocks with
other modules in your system provided that you do not change the clock network
used.
Changing the clock network that the ALTMEMPHY solution uses may affect the
output jitter, especially if the clock is used to generate the memory interface output
pins. Before changing the ALTMEMPHY clock network, always check the clock network
output jitter specification in the DC and Switching Characteristics chapter of the device
handbook to ensure that it meets the memory standard jitter specifications, which
include period jitter, cycle-to-cycle jitter, and half duty cycle jitter.
If you need to change the resync_clk_2x clock network, you have to change the
measure_clk_1x clock network also to ensure accurate VT tracking of the memory
interface.
f For more information about sharing clocks in multiple controllers, refer to the design
tutorials on the List of designs using Altera External Memory IP page of the Altera
Wiki website.
In addition, you should not change the PLL clock numbers, because the wizard-generated
Synopsys Design Constraints File (.sdc) assumes certain counter outputs from the PLL
(Table 3–21 and Table 3–22).
Table 3–21. PLL Usage for DDR, DDR2, and DDR3 SDRAM Without Leveling Interfaces
Clock
C0
C1
Cyclone III and Cyclone IV
Devices
Arria II GX Devices
Stratix III and Stratix IV Devices
■
phy_clk_1x in half-rate
designs
■
phy_clk_1x in half-rate
designs
■
phy_clk_1x in half-rate
designs
■
aux_half_rate_clk
■
aux_half_rate_clk
■
aux_half_rate_clk
■
PLL scan_clk
■
PLL scan_clk
■
phy_clk_1x in full-rate
designs
■
phy_clk_1x in full-rate
designs
■
mem_clk_2x
■
aux_full_rate_clk
■
aux_full_rate_clk
■
mem_clk_2x to generate DQS
and CK/CK# signals
■
mem_clk_2x to generate DQS
and CK/CK# signals
■
ac_clk_2x
■
ac_clk_2x
■
cs_n_clk_2x
■
cs_n_clk_2x
■
Unused
■
write_clk_2x (for DQ)
■
■
ac_clk_2x
phy_clk_1x in full-rate
designs
■
cs_n_clk_2x
■
aux_full_rate_clk
■
resync_clk_2x
■
write_clk_2x
measure_clk_2x
■
resync_clk_2x
C2
■
write_clk_2x (for DQ)
■
ac_clk_2x
■
cs_n_clk_2x
C4
■
resync_clk_2x
■
C5
■
measure_clk_2x
—
■
measure_clk_1x
C6
—
—
■
ac_clk_1x
C3
Table 3–22. PLL Usage for DDR3 SDRAM With Leveling Interfaces

Clock    Stratix III and Stratix IV Devices
C0       phy_clk_1x in half-rate designs; aux_half_rate_clk; PLL scan_clk
C1       mem_clk_2x
C2       aux_full_rate_clk
C3       write_clk_2x
C4       resync_clk_2x
C5       measure_clk_1x
C6       ac_clk_1x
Using PLL Guidelines
When using PLLs for external memory interfaces, you must consider the following
guidelines:
■
For the clock source, use the clock input pin specifically dedicated to the PLL that
you want to use with your external memory interface. The input and output pins
are only fully compensated when you use the dedicated PLL clock input pin. If the
clock source for the PLL is not a dedicated clock input pin for that PLL, you need an
additional clock network to connect the clock source to the PLL block. Using an
additional clock network may increase clock jitter and degrade the timing margin.
■
Pick a PLL and PLL input clock pin that are located on the same side of the device
as the memory interface pins.
■
Share the DLL and PLL static clocks for multiple memory interfaces provided the
controllers are on the same or adjacent side of the device and run at the same
memory clock frequency.
■
If you are using Cyclone III or Cyclone IV devices, you need not set the PLL mode
to No Compensation in the Quartus II software. The PLL for these devices in
Normal mode has low jitter. Changing the compensation mode may result in
inaccurate timing results.
■
If your design uses a dedicated PLL to only generate a DLL input reference clock
(not available for Cyclone III or Cyclone IV device), you must set the PLL mode to
No Compensation in the Quartus II software to minimize the jitter, or the software
forces this setting automatically. The PLL does not generate other output, so it
does not need to compensate for any clock path.
■
If your design cascades PLLs, the source (upstream) PLL must have a
low-bandwidth setting, while the destination (downstream) PLL must have a
high-bandwidth setting to minimize jitter. Altera does not recommend using
cascaded PLLs for external memory interfaces, because the jitter accumulates
through the cascade and the memory output clock may violate the memory device
jitter specification.
c
Use this feature at your own risk. For more information, refer to “PLL
Cascading” on page 3–49.
■
If you are using Arria II GX devices, for a single memory instance that spans two
right-side quadrants, use a middle-side PLL as the source for that interface.
■
If you are using Arria II GZ, Stratix III, Stratix IV, or Stratix V devices, for a single
memory instance that spans two top or bottom quadrants, use a middle top or
bottom PLL as the source for that interface. The ten dual regional clocks that the
single interface requires must not block the design using the adjacent PLL (if
available) for a second interface.
PLL Cascading
Arria II GZ PLLs, Stratix III PLLs, Stratix IV PLLs, Stratix V fPLLs, and the two
middle PLLs in Arria II GX EP2AGX95, EP2AGX125, EP2AGX190, and EP2AGX260
devices can be cascaded using either the global or regional clock trees, or the cascade
path between two adjacent PLLs.
1
Use this feature at your own risk. You should use faster memory devices to maximize
timing margins.
Cyclone III and Cyclone IV devices do not support PLL cascading for external
memory interfaces.
The UniPHY IP supports PLL cascading using the cascade path without any
additional timing derating when the bandwidth and compensation rules are
followed. The timing constraints and analysis assume that there is no additional jitter
due to PLL cascading when the upstream PLL uses no compensation and low
bandwidth, and the downstream PLL uses no compensation and high bandwidth.
The UniPHY IP does not support PLL cascading using the global and regional clock
networks. You can implement PLL cascading at your own risk without any additional
guidance and specifications from Altera. The Quartus II software does issue a critical
warning suggesting use of the cascade path to minimize jitter, but does not explicitly
state that Altera does not support cascading using global and regional clock networks.
1
The Quartus II software does not issue a critical warning stating that Cyclone III and
Cyclone IV ALTMEMPHY designs do not support PLL cascading; it issues the Stratix
III warning message requiring use of cascade path.
Some Arria II GX devices (EP2AGX95, EP2AGX125, EP2AGX190, and EP2AGX260)
have direct cascade path for two middle right PLLs. Arria II GX PLLs have the same
bandwidth options as Stratix IV GX left and right PLLs.
DLL
The Altera memory interface IP uses one DLL (except in Cyclone III and Cyclone IV
devices, where this resource is not available). The DLL is located at the corner of the
device and can send the control signals to shift the DQS pins on its adjacent sides for
Stratix-series devices, or DQS pins in any I/O banks in Arria II GX devices.
For example, the top-left DLL can shift DQS pins on the top side and left side of the
device. The DLL generates the same phase shift resolution for both sides, but can
generate different phase offset to the two different sides, if needed. Each DQS pin can
be configured to use or ignore the phase offset generated by the DLL.
The DLL cannot generate two different phase offsets to the same side of the device.
However, you can use two different DLLs to achieve this functionality.
DLL reference clocks must come from either dedicated clock input pins located on
either side of the DLL or from specific PLL output clocks. Any clock running at the
memory frequency is valid for the DLLs.
To minimize the number of clocks routed directly on the PCB, typically this reference
clock is sourced from the memory controller's PLL. In general, DLLs can use the PLLs
directly adjacent to them (corner PLLs when available) or the closest PLL located in
the two sides adjacent to its location.
1
By default, the DLL reference clock in Altera external memory IP is from a PLL
output.
When designing for 780-pin packages with EP3SE80, EP3SE110, EP3SL150, EP4SE230,
EP4SE360, EP4SGX180, and EP4SGX230 devices, the PLL to DLL reference clock
connection is limited. DLL2 is isolated from a direct PLL connection and can only
receive a reference clock externally from pins CLK[11:4]p in EP3SE80, EP3SE110,
EP3SL150, EP4SE230, and EP4SE360 devices. In EP4SGX180 and EP4SGX230 devices,
DLL2 and DLL3 are not directly connected to PLL. DLL2 and DLL3 receive a reference
clock externally from pins CLK[7:4]p and CLK[15:12]p respectively.
f For more DLL information, refer to the respective device handbooks.
The DLL reference clock should be the same frequency as the memory interface, but
the phase is not important.
The required DQS capture phase is optimally chosen based on operating frequency
and external memory interface type (DDR, DDR2, DDR3 SDRAM, and QDR II SRAM,
or RLDRAM II). As each DLL supports two possible phase offsets, two different
memory interface types operating at the same frequency can easily share a single
DLL. More may be possible, depending on the phase shift required.
f Altera memory IP always specifies a default optimal phase setting; to override this
setting, refer to the Implementing and Parameterizing Memory IP chapter.
When sharing DLLs, your memory interfaces must be of the same frequency. If the
required phase shift is different amongst the multiple memory interfaces, you can use
a different delay chain in the DQS logic block or use the DLL phase offset feature.
To simplify the interface to IP connections, multiple memory interfaces operating at
the same frequency usually share the same system and static clocks as each other
where possible. This sharing minimizes the number of dedicated clock nets required
and reduces the number of different clock domains found within the same design.
As each DLL can directly drive four banks, but each PLL only has complete C (output)
counter coverage of two banks (using dual regional networks), situations can occur
where a second PLL operating at the same frequency is required. Because cascaded PLLs
increase jitter and reduce timing margin, first ascertain whether an alternative second
DLL and PLL combination is available and more optimal.
Select a DLL that is available for the side of the device where the memory interface
resides. If you select a PLL or a PLL input clock reference pin that can also serve as the
DLL input reference clock, you do not need an extra input pin for the DLL input
reference clock.
Other FPGA Resources
The Altera memory interface IP uses FPGA fabric, including registers and memory
blocks, to implement the memory interface.
f For resource utilization examples to ensure that you can fit your other modules in the
device, refer to the “Resource Utilization” section in the Introduction to UniPHY IP and
the Introduction to ALTMEMPHY IP chapters of the External Memory Interface
Handbook.
In addition, one OCT calibration block is used if you are using the FPGA OCT feature
in the memory interface. The OCT calibration block uses two pins (RUP and RDN), or a
single pin (RZQ) ("OCT Support for Arria II GX, Arria II GZ, Arria V, Cyclone V,
Stratix III, Stratix IV, and Stratix V Devices" on page 3–23). You can select any of the
available OCT calibration blocks, because you do not need to place this block in the same
bank or device side as your memory interface. The only requirement is that the I/O
bank where you place the OCT calibration block uses the same VCCIO voltage as the
memory interface. Multiple memory interfaces can share the same OCT calibration
block if the VCCIO voltage is the same.
Even though Cyclone III and Cyclone IV devices support OCT, this feature is not
turned on by default in the Altera IP solution.
Document Revision History
Table 3–23 lists the revision history for this document.
Table 3–23. Document Revision History
Date
Version
November 2011
June 2011
4.0
3.0
Changes
■
Moved and reorganized “Planning Pin and Resource” section to Volume 2:Design
Guidelines.
■
Added Additional Guidelines for Stratix V Devices section.
■
Added Arria V and Cyclone V information.
■
Moved Select a Device and Memory IP Planning chapters to Volume 1.
■
Added information about interface pins.
■
Added guidelines for using PLL.
■
Added a new section on controller efficiency.
■
Added Arria II GX and Stratix V information.
December 2010
2.1
July 2010
2.0
Updated information about UniPHY-based interfaces and Stratix V devices.
April 2010
1.0
Initial release.
4. DDR2 and DDR3 SDRAM Board Design
Guidelines
November 2011
EMI_DG_004-4.0
This chapter provides guidelines on how to improve the signal integrity of your
system and layout guidelines to help you successfully implement a DDR2 or DDR3
SDRAM interface on your system.
DDR3 SDRAM is the third generation of the DDR SDRAM family, and offers
improved power, higher data bandwidth, and enhanced signal quality with multiple
on-die termination (ODT) selection and output driver impedance control while
maintaining partial backward compatibility with the existing DDR2 SDRAM
standard.
This chapter focuses on the following key factors that affect signal quality at the
receiver:
■
Leveling and dynamic ODT
■
Proper use of termination
■
Output driver drive strength setting
■
Loading at the receiver
■
Layout guidelines
As memory interface performance increases, board designers must pay closer
attention to the quality of the signal seen at the receiver because poorly transmitted
signals can dramatically reduce the overall data-valid margin at the receiver.
Figure 4–1 shows the differences between an ideal and real signal seen by the receiver.
Figure 4–1. Ideal and Real Signal at the Receiver (voltage versus time, showing the VIH and VIL levels for the ideal and real cases)
In addition, this chapter compares various types of termination schemes, and their
effects on the signal quality on the receiver. It also discusses the proper drive strength
setting on the FPGA to optimize the signal integrity at the receiver, and the effects of
different loading types, such as components versus DIMM configuration, on signal
quality. The objective of this chapter is to understand the trade-offs between different
types of termination schemes, the effects of output drive strengths, and different
loading types, so you can swiftly navigate through the multiple combinations and
choose the best possible settings for your designs.
Leveling and Dynamic ODT
DDR3 SDRAM DIMMs, as specified by JEDEC, always use a fly-by topology for the
address, command, and clock signals. This standard DDR3 SDRAM topology requires
the use of Altera® DDR3 SDRAM Controller with UniPHY or ALTMEMPHY with
read and write leveling.
Altera recommends that for full DDR3 SDRAM compatibility when using discrete
DDR3 SDRAM components, you should mimic the JEDEC DDR3 UDIMM fly-by
topology on your custom printed circuit boards (PCB).
1
Arria® II, Arria V, and Cyclone® V devices do not support DDR3 SDRAM with read or
write leveling, so these devices do not support standard DDR3 SDRAM DIMMs or
DDR3 SDRAM components using the standard DDR3 SDRAM fly-by address,
command, and clock layout topology.
Read and Write Leveling
One major difference between DDR2 and DDR3 SDRAM is the use of leveling. To
improve signal integrity and support higher frequency operations, the JEDEC
committee defined a fly-by termination scheme used with clocks, and command and
address bus signals. Fly-by topology reduces simultaneous switching noise (SSN) by
deliberately causing flight-time skew between the data and strobes at every DRAM as
the clock, address, and command signals traverse the DIMM (Figure 4–2).
Figure 4–2. DDR3 DIMM Fly-By Topology Requiring Write Leveling
(The figure shows command, address, and clock routed in the fly-by topology on the DDR3 DIMM; the resulting data skew is calibrated out at power up with write leveling.)
The flight-time skew caused by the fly-by topology led the JEDEC committee to
introduce the write leveling feature on the DDR3 SDRAMs; thus requiring controllers
to compensate for this skew by adjusting the timing per byte lane.
During a write, DQS groups launch at separate times to coincide with a clock arriving
at components on the DIMM, and must meet the timing parameter between the
memory clock and DQS defined as tDQSS of ± 0.25 tCK.
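To put the tDQSS limit in concrete terms, the following Python sketch converts ± 0.25 tCK into picoseconds. The 533-MHz memory clock used here (a DDR3-1066 data rate) is an assumed example frequency, not a value specified by this handbook.

def tdqss_window_ps(mem_clk_mhz: float) -> float:
    """Allowed CK-to-DQS launch skew (+/-) in picoseconds: 0.25 * tCK."""
    tck_ps = 1.0e6 / mem_clk_mhz  # memory clock period in picoseconds
    return 0.25 * tck_ps

# hypothetical DDR3-1066 interface (533-MHz memory clock)
print(f"tDQSS window: +/- {tdqss_window_ps(533):.0f} ps")  # roughly +/- 469 ps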
During the read operation, the memory controller must compensate for the delays
introduced by the fly-by topology. The Stratix® III, Stratix IV, and Stratix V FPGAs
have alignment and synchronization registers built in the I/O element (IOE) to
properly capture the data.
In DDR2 SDRAM, there are only two drive strength settings, full or reduced, which
correspond to output impedances of 18 Ω and 40 Ω, respectively. These output
drive strength settings are static and are not calibrated; as a result, the output
impedance varies as the voltage and temperature drift.
The DDR3 SDRAM uses a programmable impedance output buffer. Currently, there
are two drive strength settings, 34 Ω and 40 Ω. The 40-Ω drive strength setting is
currently a reserved specification defined by JEDEC, but available on the DDR3
SDRAM, as offered by some memory vendors. Refer to the datasheet of the respective
memory vendors for more information about the output impedance setting. You
select the drive strength settings by programming the memory mode register defined
by mode register 1 (MR1). To calibrate output driver impedance, an external precision
resistor, RZQ, connects the ZQ pin and VSSQ. The value of this resistor must be
240 Ω ± 1%.
If you are using a DDR3 SDRAM DIMM, RZQ is soldered on the DIMM so you do not
need to lay out your board to account for it. Output impedance is set during
initialization. To calibrate output driver impedance after power-up, the DDR3
SDRAM needs a calibration command that is part of the initialization and reset
procedure and is updated periodically when the controller issues a calibration
command.
In addition to calibrated output impedance, the DDR3 SDRAM also supports
calibrated parallel ODT through the same external precision resistor, RZQ. This is made
possible by a merged output driver structure in the DDR3 SDRAM, which also
helps to improve pin capacitance on the DQ and DQS pins. The ODT values supported
in DDR3 SDRAM are 20 Ω, 30 Ω, 40 Ω, 60 Ω, and 120 Ω, assuming that RZQ is 240 Ω.
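As a small illustrative sketch (the divisor framing follows the usual JEDEC presentation of these settings and is shown here only to make the arithmetic visible), the supported ODT values are integer fractions of RZQ:

RZQ = 240  # ohms, external precision resistor on the ZQ pin

for divisor in (12, 8, 6, 4, 2):
    print(f"RZQ/{divisor:<2} = {RZQ // divisor:>3} ohm")
# prints 20, 30, 40, 60, and 120 ohm, matching the ODT values listed above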
In DDR3 SDRAM, there are two commands related to the calibration of the output
driver impedance and ODT. The controller often uses the first calibration command,
ZQ CALIBRATION LONG (ZQCL), at initial power-up or when the DDR3 SDRAM is
in a reset condition. This command calibrates the output driver impedance and ODT
to the initial temperature and voltage condition, and compensates for any process
variation due to manufacturing. If the controller issues the ZQCL command at
initialization or reset, it takes 512 memory clock cycles to complete; otherwise, it
requires 256 memory clock cycles to complete. The controller uses the second
calibration command, ZQ CALIBRATION SHORT (ZQCS) during regular operation
to track any variation in temperature or voltage. The ZQCS command takes
64 memory clock cycles to complete. Use the ZQCL command any time there is more
impedance error than can be corrected with a ZQCS command.
For more information about using ZQ Calibration in DDR3 SDRAM, refer to the
application note by Micron, TN-41-02 DDR3 ZQ Calibration.
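For a rough idea of how long these commands occupy the memory bus, the following Python sketch applies the cycle counts quoted above; the 400-MHz memory clock is an assumed example value, not a figure from this handbook.

def cal_time_us(cycles: int, mem_clk_mhz: float) -> float:
    """Time in microseconds for a command that occupies `cycles` memory clock cycles."""
    return cycles / mem_clk_mhz

f_mhz = 400.0  # hypothetical DDR3-800 memory clock
print(f"ZQCL at init/reset : {cal_time_us(512, f_mhz):.2f} us")  # 1.28 us
print(f"ZQCL otherwise     : {cal_time_us(256, f_mhz):.2f} us")  # 0.64 us
print(f"ZQCS (periodic)    : {cal_time_us(64,  f_mhz):.2f} us")  # 0.16 us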
Dynamic ODT
Dynamic ODT is a new feature in DDR3 SDRAM, and not available in DDR2 SDRAM.
Dynamic ODT can change the ODT setting without issuing a mode register set (MRS)
command. When you enable dynamic ODT, and there is no write operation, the DDR3
SDRAM terminates to a termination setting of RTT_NORM; when there is a write
operation, the DDR3 SDRAM terminates to a setting of RTT_WR. You can preset the
values of RTT_NORM and RTT_WR by programming the mode registers, MR1 and MR2.
Figure 4–3 shows the behavior of ODT when you enable dynamic ODT.
Figure 4–3. Dynamic ODT: Behavior with ODT Asserted Before and After the Write
(1)
Note to Figure 4–3:
(1) Source: TN-41-04 DDR3 Dynamic On-Die Termination, Micron.
In the two-DIMM DDR3 SDRAM configuration, dynamic ODT helps reduce the jitter
at the module being accessed, and minimizes reflections from any secondary
modules.
f For more information about using the dynamic ODT on DDR3 SDRAM, refer to the
application note by Micron, TN-41-04 DDR3 Dynamic On-Die Termination.
Dynamic OCT in Stratix III and Stratix IV Devices
Stratix III and Stratix IV devices support on-off dynamic series and parallel
termination for a bidirectional I/O in all I/O banks. Dynamic OCT is a new feature in
Stratix III and Stratix IV FPGA devices. You enable dynamic parallel termination only
when the bidirectional I/O acts as a receiver and disable it when the bidirectional I/O
acts as a driver. Similarly, you enable dynamic series termination only when the
bidirectional I/O acts as a driver and is disable it when the bidirectional I/O acts as a
receiver. The default setting for dynamic OCT is series termination, to save power
when the interface is idle—no active reads or writes.
1
Additionally, the dynamic control operation of the OCT is separate from the output
enable signal for the buffer. Hence, the UniPHY IP can enable parallel OCT only during
read cycles, saving power when the interface is idle.
Figure 4–4. Dynamic OCT Between Stratix III and Stratix IV FPGA Devices
This feature is useful for terminating any high-performance bidirectional path
because signal integrity is optimized depending on the direction of the data. In
addition, dynamic OCT also eliminates the need for external termination resistors
when used with memory devices that support ODT (such as DDR3 SDRAM), thus
reducing cost and easing board layout.
However, dynamic OCT in Stratix III and Stratix IV FPGA devices is different from
dynamic ODT in DDR3 SDRAM mentioned in previous sections and these features
should not be assumed to be identical.
f For detailed information about the dynamic OCT feature in the Stratix III FPGA, refer
to the Stratix III Device I/O Features chapter in volume 1 of the Stratix III Device
Handbook.
f For detailed information about the dynamic OCT feature in the Stratix IV FPGA, refer
to the I/O Features in Stratix IV Devices chapter in volume 1 of the Stratix IV Device
Handbook.
Dynamic OCT in Stratix V Devices
Stratix V devices also support the dynamic OCT feature and provide more flexibility.
Stratix V OCT calibration uses one RZQ pin that exists in every OCT block. You can
use either of the following reference resistors on the RZQ pin to implement different
OCT values, as summarized in the sketch after this list:
■ 240-Ω reference resistor—to implement RS OCT of 34 Ω, 40 Ω, 48 Ω, 60 Ω, and 80 Ω;
and RT OCT resistance of 20 Ω, 30 Ω, 40 Ω, and 120 Ω
■ 100-Ω reference resistor—to implement RS OCT of 25 Ω and 50 Ω; and RT OCT
resistance of 50 Ω
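The following Python sketch simply restates the values above as a lookup keyed by the RZQ reference resistor; the dictionary and function names are illustrative only.

STRATIX_V_OCT = {
    240: {"rs_ohms": (34, 40, 48, 60, 80), "rt_ohms": (20, 30, 40, 120)},
    100: {"rs_ohms": (25, 50),             "rt_ohms": (50,)},
}

def supported_oct(reference_ohms: int) -> dict:
    """Look up the series (RS) and parallel (RT) OCT values for a given RZQ resistor."""
    return STRATIX_V_OCT[reference_ohms]

print(supported_oct(240)["rt_ohms"])  # (20, 30, 40, 120)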
f For detailed information about the dynamic OCT feature in the Stratix V FPGA, refer
to the I/O Features in Stratix V Devices chapter in volume 1 of the Stratix V Device
Handbook.
Board Termination for DDR2 SDRAM
DDR2 adheres to the JEDEC standard governing Stub-Series Terminated Logic
(SSTL), JESD8-15a, which includes four different termination schemes.
Two commonly used termination schemes of SSTL are:
■
Single parallel terminated output load with or without series resistors (Class I, as
stated in JESD8-15a)
■
Double parallel terminated output load with or without series resistors (Class II,
as stated in JESD8-15a)
Depending on the type of signals you choose, you can use either termination scheme.
Also, depending on your design’s FPGA and SDRAM memory devices, you may
choose external or internal termination schemes.
With the ever-increasing requirements to reduce system cost and simplify printed
circuit board (PCB) layout design, you may choose not to have any parallel
termination on the transmission line, and use point-to-point connections between the
memory interface and the memory. In this case, you may take advantage of internal
termination schemes such as on-chip termination (OCT) on the FPGA side and on-die
termination (ODT) on the SDRAM side when it is offered on your chosen device.
External Parallel Termination
If you use external termination, you must study the locations of the termination
resistors to determine which topology works best for your design. Figure 4–5 and
Figure 4–6 illustrate the two most commonly used termination topologies: fly-by
topology and non-fly-by topology, respectively.
Figure 4–5. Fly-By Placement of a Parallel Resistor
With fly-by topology (Figure 4–5), you place the parallel termination resistor after the
receiver. This termination placement resolves the undesirable unterminated stub
found in the non-fly-by topology. However, using this topology can be costly and
complicate routing. The Stratix II Memory Board 2 uses the fly-by topology for the
parallel terminating resistors placement. The Stratix II Memory Board 2 is a memory
test board available only within Altera for the purpose of testing and validating
Altera’s memory interface.
Figure 4–6. Non-Fly-By Placement of a Parallel Resistor
With non-fly-by topology (Figure 4–6), the parallel termination resistor is placed
between the driver and receiver (closest to the receiver). This termination placement is
easier for board layout, but results in a short stub, which causes an unterminated
transmission line between the terminating resistor and the receiver. The unterminated
transmission line results in ringing and reflection at the receiver.
If you do not use external termination, DDR2 offers ODT and Altera FPGAs have
varying levels of OCT support. You should explore using ODT and OCT to decrease
the board power consumption and reduce the required board real estate.
On-Chip Termination
OCT technology is offered on Arria II GX, Arria II GZ, Arria V, Cyclone III,
Cyclone IV, Cyclone V, Stratix III, Stratix IV, and Stratix V devices. Table 4–1
summarizes the extent of OCT support for each device. This table provides
information about SSTL-18 standards because SSTL-18 is the supported standard for
DDR2 memory interface by Altera FPGAs.
On-chip series (RS) termination is supported only on output and bidirectional buffers.
The value of RS with calibration is calibrated against a 25-Ω resistor for Class II and a
50-Ω resistor for Class I connected to the RUP and RDN pins, and adjusted to ±1% of
25 Ω or 50 Ω. On-chip parallel (RT) termination is supported only on inputs and
bidirectional buffers. The value of RT is calibrated against 100 Ω connected to the RUP
and RDN pins. Calibration occurs at the end of device configuration. Dynamic OCT is
supported only on bidirectional I/O buffers.
Table 4–1. On-Chip Termination Schemes
FPGA Device
Termination
Scheme
On-Chip
Series
Termination
without
Calibration
On-Chip
Series
Termination
with
Calibration
On-Chip
Parallel
Termination
with
Calibration
Arria II GX
Arria II GZ
Arria V
Cyclone III
and
Cyclone IV
Cyclone V
Stratix III
and
Stratix IV
Stratix V (1)
Column and
Row I/O
Column and
Row I/O
Column and
Row I/O
Column and
Row I/O
Column
and Row
I/O
Column
and Row
I/O
Column I/O
Class I
50
50
50
50
50
50
50
Class II
25
25
25
25
25
25
25
Class I
50
50
50
50
50
50
50
Class II
25
25
25
25
25
25
25
Class I
and
Class II
—
50
50
—
50
50
50
SSTL-18
Note to Table 4–1:
(1) Row I/O is not available for external memory interfaces in Stratix V devices.
The dynamic OCT scheme is only available in Stratix III, Stratix IV, and Stratix V
FPGAs. The dynamic OCT scheme enables series termination (RS) and parallel
termination (RT) to be dynamically turned on and off during the data transfer.
The series and parallel terminations are turned on or off depending on the read and
write cycle of the interface. During the write cycle, the RS is turned on and the RT is
turned off to match the line impedance. During the read cycle, the RS is turned off and
the RT is turned on as the Stratix III FPGA implements the far-end termination of the
bus (Figure 4–7).
Figure 4–7. Dynamic OCT for Memory Interfaces
Recommended Termination Schemes
Table 4–2 provides the recommended termination schemes for major DDR2 memory
interface signals. Signals include data (DQ), data strobe (DQS/DQSn), data mask
(DM), clocks (mem_clk/mem_clk_n), and address and command signals.
When interfacing with multiple DDR2 SDRAM components where the address,
command, and memory clock pins are connected to more than one load, follow these
steps:
1. Simulate the system to get the new slew-rate for these signals.
2. Use the derated tIS and tIH specifications from the DDR2 SDRAM datasheet based
on the simulation results.
3. If timing deration causes your interface to fail timing requirements, consider
signal duplication of these signals to lower their loading, and hence improve
timing.
1
Altera uses Class I and Class II termination in this table to refer to drive strength, and
not physical termination.
1
You must simulate your design for your system to ensure correct functionality.
Table 4–2. Termination Recommendations (Part 1 of 4)
Device Family
(1)
SSTL 18 IO Standard
(2), (3), (4), (5), (6)
FPGA-End
Discrete
Termination
Class I R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
HALF
(8)
DIFF Class R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
HALF
(8)
DQS SE (12)
Class I R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
HALF
(8)
DM
Class I R50 CAL
N/A
ODT75
(7)
Class I MAX
N/A
Signal Type
Memory-End
Termination 1
(Rank/DIMM)
Memory
I/O
Standard
Arria II GX
DQ
DQS DIFF (13)
DDR2 component
Address and
command
56  parallel to VTT
discrete
N/A
N/A
×1 = 100  differential
(10)
Clock
DIFF Class I R50 CAL
N/A
×2 = 200  differential
N/A
(11)
Class I R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
FULL
(9)
DIFF Class I R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
FULL
(9)
DQS SE (12)
Class I R50 CAL
50  Parallel to
VTT discrete
ODT75
(7)
FULL
(9)
DM
Class I R50 CAL
N/A
ODT75
(7)
Class I MAX
N/A
56  parallel to VTT
discrete
N/A
DIFF Class I R50 CAL
N/A
N/A = on DIMM
N/A
DQ
Class I R50/P50 DYN CAL
N/A
ODT75 (7)
HALF (8)
DQS DIFF (13)
DIFF Class I R50/P50 DYN
CAL
N/A
ODT75 (7)
HALF (8)
DQS SE (12)
Class I R50/P50 DYN CAL
N/A
ODT75 (7)
HALF (8)
Class I R50 CAL
N/A
ODT75 (7)
N/A
Class I MAX
N/A
56  parallel to VTT
discrete
N/A
DIFF Class I R50 NO CAL
N/A
DQ
DQS DIFF (13)
DDR2 DIMM
Address and
command
Clock
N/A
Arria V and Cyclone V
DDR2 component
DM
Address and
command
Clock
×1 = 100  differential (10)
×2 = 200  differential (11)
N/A
Table 4–2. Termination Recommendations (Part 2 of 4)
Device Family
DDR2 DIMM
(1)
SSTL 18 IO Standard
(2), (3), (4), (5), (6)
FPGA-End
Discrete
Termination
Memory-End
Termination 1
(Rank/DIMM)
Memory
I/O
Standard
DQ
Class I R50/P50 DYN CAL
N/A
ODT75 (7)
FULL (9)
DQS DIFF (13)
DIFF Class I R50/P50 DYN
CAL
N/A
ODT75 (7)
FULL (9)
DQS SE (12)
Class I R50/P50 DYN CAL
N/A
ODT75 (7)
FULL (9)
Class I R50 CAL
N/A
ODT75 (7)
N/A
Class I MAX
N/A
56  parallel to VTT
discrete
N/A
DIFF Class I R50 NO CAL
N/A
N/A = on DIMM
N/A
Signal Type
DM
Address and
command
Clock
Table 4–2. Termination Recommendations (Part 3 of 4)
Device Family
(1)
SSTL 18 IO Standard
(2), (3), (4), (5), (6)
FPGA-End
Discrete
Termination
DQ/DQS
Class I 12 mA
50  Parallel to
VTT discrete
DM
Class I 12 mA
N/A
Class I MAX
N/A
Signal Type
Memory-End
Termination 1
(Rank/DIMM)
Memory
I/O
Standard
Cyclone III and Cyclone IV
DDR2 component
Address and
command
ODT75
(7)
56  parallel to VTT
discrete
HALF
(8)
N/A
N/A
×1 = 100  differential
(10)
Clock
Class I 12 mA
N/A
×2 = 200  differential
N/A
(11)
DDR2 DIMM
DQ/DQS
Class I 12 mA
50  Parallel to
VTT discrete
DM
Class I12 mA
N/A
Address and
command
Class I MAX
N/A
56  parallel to VTT
discrete
Class I 12 mA
N/A
N/A = on DIMM
Clock
ODT75
(7)
FULL
(9)
N/A
N/A
N/A
Arria II GZ, Stratix III, Stratix IV, and Stratix V
DDR2 component
DQ
Class I R50/P50 DYN CAL
N/A
ODT75
(7)
HALF
(8)
DQS DIFF (13)
DIFF Class I R50/P50 DYN
CAL
N/A
ODT75
(7)
HALF
(8)
DQS SE (12)
DIFF Class I R50/P50 DYN
CAL
N/A
ODT75
(7)
HALF
(8)
Class I R50 CAL
N/A
ODT75
(7)
Class I MAX
N/A
DM
Address and
command
56  Parallel to VTT
discrete
N/A
N/A
x1 = 100  differential
(10)
Clock
DIFF Class I R50 NO CAL
N/A
x2 = 200  differential
N/A
(11)
Table 4–2. Termination Recommendations (Part 4 of 4)
SSTL 18 IO Standard
(2), (3), (4), (5), (6)
FPGA-End
Discrete
Termination
DQ
Class I R50/P50 DYN CAL
N/A
ODT75
(7)
FULL
(9)
DQS DIFF (13)
DIFF Class I R50/P50 DYN
CAL
N/A
ODT75
(7)
FULL
(9)
DQS SE (12)
Class I R50/P50 DYN CAL
N/A
ODT75
(7)
FULL
(9)
Class I R50 CAL
N/A
ODT75
(7)
Device Family
DDR2 DIMM
(1)
Signal Type
DM
Address and
command
Clock
Memory-End
Termination 1
(Rank/DIMM)
Class I MAX
N/A
56  Parallel to VTT
discrete
DIFF Class I R50 NO CAL
N/A
N/A = on DIMM
Memory
I/O
Standard
N/A
N/A
N/A
Notes to Table 4–2:
(1) N/A is not available.
(2) R is series resistor.
(3) P is parallel resistor.
(4) DYN is dynamic OCT.
(5) NO CAL is OCT without calibration.
(6) CAL is OCT with calibration.
(7) ODT75 vs. ODT50 on the memory has the effect of opening the eye more, with a limited increase in overshoot/undershoot.
(8) HALF is reduced drive strength.
(9) FULL is full drive strength.
(10) x1 is a single-device load.
(11) x2 is two-device load. For example, you can feed two out of nine devices on a single rank DIMM with a single clock pair.
(12) DQS SE is single-ended DQS.
(13) DQS DIFF is differential DQS
Dynamic On-Chip Termination
The termination schemes are described in JEDEC standard JESD8-15a for
SSTL 18 I/O. Dynamic OCT is available in Stratix III and Stratix IV. When the
Stratix III FPGA (driver) is writing to the DDR2 SDRAM DIMM (receiver), series OCT
is enabled dynamically to match the impedance of the transmission line. As a result,
reflections are significantly reduced. Similarly, when the FPGA is reading from the
DDR2 SDRAM DIMM, the parallel OCT is dynamically enabled.
f For information about setting the proper value for termination resistors, refer to the
Stratix III Device I/O Features chapter in the Stratix III Device Handbook and the I/O
Features in Stratix IV Devices chapter in the Stratix IV Device Handbook.
FPGA Writing to Memory
Figure 4–8 shows the dynamic series OCT scheme when the FPGA is writing to the
memory. The benefit of using dynamic series OCT is that when the driver is driving the
transmission line, it "sees" a matched transmission line with no external resistor
termination.
Figure 4–8. Dynamic Series OCT Scheme with ODT on the Memory
Figure 4–9 and Figure 4–10 show the simulation and measurement results of a write
to the DDR2 SDRAM DIMM. The system uses Class I termination with a 50-Ω series
OCT measured at the FPGA with a full drive strength and a 75-Ω ODT at the DIMM.
Both simulation and bench measurements are in 200 ps/div and 200 mV/div.
Figure 4–9. HyperLynx Simulation FPGA Writing to Memory
Figure 4–10. Board Measurement, FPGA Writing to Memory
Table 4–3 lists the comparison between the simulation and the board measurement of
the signal seen at the DDR2 SDRAM DIMM.
Table 4–3. Signal Comparison When the FPGA is Writing to the Memory (1)

                      Eye Width (ns) (2)    Eye Height (V)    Overshoot (V)    Undershoot (V)
Simulation            1.194                 0.740             N/A              N/A
Board Measurement     1.08                  0.7               N/A              N/A

Notes to Table 4–3:
(1) N/A is not applicable.
(2) The eye width is measured from VIH/VIL(ac) = VREF ±250 mV to VIH/VIL(dc) = VREF ±125 mV, where VIH and VIL are determined per the JEDEC specification for SSTL-18.
The data in Table 4–3, Figure 4–9, and Figure 4–10 suggest that when the FPGA is
writing to the memory, the bench measurements closely match the simulation. They
indicate that using the series dynamic on-chip termination scheme for your
bidirectional I/Os maintains the integrity of the signal while removing the need for
external termination.
Depending on the I/O standard, you should consider the four parameters listed in
Table 4–3 when designing a memory interface. Although the simulation and board
measurement appear similar, there are some discrepancies when the key parameters
are measured. Simulation does not fully model the duty-cycle distortion of the I/O,
crosstalk, or board power plane degradation, but it still provides a good indication of
the performance of the board.
For memory interfaces, the eye width is important in determining whether there is a
sufficient window to correctly capture the data. Regarding the eye height, even
though most memory interfaces use voltage-referenced I/O standards (in this case,
SSTL-18), as long as there is sufficient eye opening above VIH and below VIL, there
should be enough margin to correctly capture the data. However, because effects
such as crosstalk are not taken into account, it is critical to design the system for the
optimum eye height, because eye height affects the overall margin of a system with a
memory interface.
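As a numeric illustration of the margin argument above, the sketch below checks an eye height against the SSTL-18 AC input thresholds used in the notes to Table 4–3 (VREF = 0.9 V, VIH/VIL(ac) = VREF ± 250 mV). It is a back-of-the-envelope check under those assumptions, not a signoff calculation.

# Rough margin check against the SSTL-18 AC input thresholds (VREF = 0.9 V assumed).
VREF = 0.9
VIH_AC = VREF + 0.250  # minimum level the receiver must see for a valid high
VIL_AC = VREF - 0.250  # maximum level the receiver may see for a valid low

def eye_margin(eye_height_v):
    # Voltage left over once the eye spans the AC threshold window (VIH_AC - VIL_AC).
    return eye_height_v - (VIH_AC - VIL_AC)

for label, height_v in (("Simulation", 0.740), ("Board measurement", 0.70)):
    print(label, "margin beyond the 0.5 V AC window:", round(eye_margin(height_v), 3), "V")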
f Refer to the memory vendors’ data sheets when determining the allowable over- and
undershoot; they typically specify a maximum limit on the input voltage to prevent reliability issues.
FPGA Reading from Memory
Figure 4–11 shows the dynamic parallel termination scheme when the FPGA is
reading from memory. When the DDR2 SDRAM DIMM is driving the transmission
line, the ringing and reflection are minimal because the FPGA-side 50-Ω parallel
termination (pull-up) is matched to the transmission line. Figure 4–12 shows the
simulation and measurement results of a read from the DDR2 SDRAM DIMM. The
system uses Class I termination with a 50-Ω calibrated parallel OCT measured at the
FPGA end with a full drive strength and a 75-Ω ODT at the memory. Both simulation
and bench measurements are in 200 ps/div and 200 mV/div.
Figure 4–11. Dynamic Parallel OCT Scheme with Memory-Side Series Resistor
(The figure shows the DDR2 SDRAM DIMM at full drive strength driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the FPGA receiver; labeled resistor values include 100 Ω.)
Figure 4–12. HyperLynx Simulation and Board Measurement, FPGA Reading from Memory
Table 4–4 lists the comparison between the simulation and the board measurement of
the signal seen at the FPGA end.
Table 4–4. Signal Comparison When the FPGA is Reading from the Memory (1), (2), (3)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Simulation           1.206            0.740            N/A             N/A
Board Measurement    1.140            0.680            N/A             N/A
Notes to Table 4–4:
(1) The drive strength on the memory DIMM is set to Full.
(2) N/A is not applicable.
(3) The eye width is measured from VIH/VIL(ac) = VREF ±250 mV to VIH/VIL(dc) = VREF ±125 mV, in which VIH and VIL are determined per the JEDEC specification for SSTL-18.
The data in Table 4–4 and Figure 4–12 suggest that the bench measurements closely
match the simulation when the FPGA is reading from the memory.
They indicate that using the parallel dynamic on-chip termination scheme on
bidirectional I/Os maintains the integrity of the signal while removing the need for
external termination.
On-Chip Termination (Non-Dynamic)
When you use the 50- OCT feature in a Class I termination scheme using ODT with
a memory-side series resistor, the output driver is tuned to 50 , which matches the
characteristic impedance of the transmission line. Figure 4–13 shows the Class I
termination scheme using ODT when the 50- OCT on the FPGA is turned on.
Figure 4–13. Class I Termination Using ODT with 50-Ω OCT
(The figure shows the FPGA driver with 50-Ω OCT driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the DDR2 component receiver, with VREF = 0.9 V at the memory.)
The resulting signal quality has a similar eye opening to the 8-mA drive strength
setting (refer to “Drive Strength” on page 4–53) without any over- or undershoot.
Figure 4–14 shows the simulation and measurement of the signal at the memory side
(DDR2 SDRAM DIMM) with the 50-Ω OCT drive strength setting in the FPGA.
Figure 4–14. HyperLynx Simulation and Measurement, FPGA Writing to Memory
Table 4–5 lists data for the signal at the DDR2 SDRAM DIMM of a Class I termination
scheme using ODT with a memory-side series resistor. The FPGA is writing to the
memory with 50-Ω OCT.
Table 4–5. Simulation and Board Measurement Results for 50-Ω OCT and 8-mA Drive Strength Settings (1)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
50-Ω OCT Drive Strength Setting
Simulation           1.68             0.82             N/A             N/A
Board Measurement    1.30             0.70             N/A             N/A
Note to Table 4–5:
(1) N/A is not applicable.
When you use the 50- OCT setting on the FPGA, the signal quality for the Class I
termination using ODT with a memory-side series resistor is further improved with
lower over- and undershoot.
In addition to the 50- OCT setting, Stratix II devices have a 25- OCT setting that
you can use to improve the signal quality in a Class II terminated transmission line.
Figure 4–15 shows the Class II termination scheme using ODT when the 25- OCT on
the FPGA is turned on.
Figure 4–15. Class II Termination Using ODT with 25-Ω OCT
(The figure shows the FPGA driver with 25-Ω OCT driving a 3-inch, 50-Ω trace to the DDR2 component, with an RT = 56-Ω termination to VTT = 0.9 V, the DIMM series resistor RS = 22 Ω, and VREF = 0.9 V at the memory.)
Figure 4–16 shows the simulation and measurement of the signal at the DDR2
SDRAM DIMM (receiver) with the 25-Ω OCT drive strength setting in the FPGA.
Figure 4–16. HyperLynx Simulation and Measurement, FPGA Writing to Memory
Table 4–6 lists the data for the signal at the DDR2 SDRAM DIMM of a Class II
termination with a memory-side series resistor. The FPGA is writing to the memory
with 25-Ω OCT.
Table 4–6. Simulation and Board Measurement Results for 25-Ω OCT and 16-mA Drive Strength Settings (1)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
25-Ω OCT Drive Strength Setting
Simulation           1.70             0.81             N/A             N/A
Board Measurement    1.47             0.51             N/A             N/A
Note to Table 4–6:
(1) N/A is not applicable.
This type of termination scheme is only used for bidirectional signals, such as data
(DQ), data strobe (DQS), data mask (DM), and memory clocks (CK) found in DRAMs.
Class II External Parallel Termination
The double parallel (Class II) termination scheme is described in JEDEC standards
JESD8-6 for HSTL I/O, JESD8-9b for SSTL-2 I/O, and JESD8-15a for SSTL-18 I/O.
When the FPGA (driver) is writing to the DDR2 SDRAM DIMM (receiver), the
transmission line is terminated at the DDR2 SDRAM DIMM. Similarly, when the
FPGA is reading from the DDR2 SDRAM DIMM, the DDR2 SDRAM DIMM is now
the driver and the transmission line is terminated at the FPGA (receiver). This type of
termination scheme is typically used for bidirectional signals, such as data (DQ) and
data strobe (DQS) signal found in DRAMs.
FPGA Writing to Memory
Figure 4–17 shows the Class II termination scheme when the FPGA is writing to the
memory. The benefit of using Class II termination is that when either driver is driving
the transmission line, it sees a matched transmission line because of the termination
resistor at the receiver-end, thereby reducing ringing and reflection.
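One practical consequence of Class II termination is the static current that each parallel resistor sinks into, or sources from, the VTT rail, which is why this scheme needs a VTT supply. The sketch below estimates that current for the resistor values shown in Figure 4–17; it is a simplified DC estimate under assumed SSTL-18 signal levels, not a vendor formula.

# Simplified DC estimate of the VTT current for one Class II terminated signal.
# Assumes SSTL-18 rails (VDDQ = 1.8 V) and the Figure 4-17 values: 50-ohm
# resistors to VTT = 0.9 V at each end of the line.
VDDQ, VTT, RT = 1.8, 0.9, 50.0

def vtt_current_per_resistor(v_signal):
    # Current through one termination resistor when the line sits at v_signal.
    return (v_signal - VTT) / RT

# Driving a static high (~VDDQ) or low (~0 V), each resistor carries about 18 mA.
print(round(vtt_current_per_resistor(VDDQ) * 1e3, 1), "mA into VTT per resistor at a logic high")
print(round(abs(vtt_current_per_resistor(0.0)) * 1e3, 1), "mA out of VTT per resistor at a logic low")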
Figure 4–17. Class-II Termination Scheme with Memory-Side Series Resistor
(The figure shows the FPGA driver at the 16-mA setting driving a 3-inch, 50-Ω trace to the DDR2 component; RT = 50-Ω terminations to VTT = 0.9 V appear at both ends of the line, along with the DIMM series resistor RS = 22 Ω and VREF = 0.9 V at the memory.)
Figure 4–18 and Figure 4–19 show the simulation and measurement result of a write
to the DDR2 SDRAM DIMM. The system uses Class II termination with a source-series resistor measured at the DIMM with a drive strength setting of 16 mA.
Figure 4–18. HyperLynx Simulation, FPGA Writing to Memory
The simulation shows a clean signal with a good eye opening, but there is slight over- and undershoot of the 1.8-V signal specified by DDR2 SDRAM. The over- and
undershoot can be attributed to either overdriving the transmission line using a
higher than required drive strength setting on the driver or the over-termination on
the receiver side by using an external resistor value that is higher than the
characteristic impedance of the transmission line. As long as the over- and undershoot
do not exceed the absolute maximum rating specification listed in the memory
vendor’s DDR2 SDRAM data sheet, it does not result in any reliability issues. The
simulation results are then correlated with actual board level measurements.
Figure 4–19 shows the measurement obtained from the Stratix II Memory Board 2.
The FPGA is using a 16 mA drive strength to drive the DDR2 SDRAM DIMM on a
Class II termination transmission line.
Figure 4–19. Board Measurement, FPGA Writing to Memory
Table 4–7 lists the comparison between the simulation and the board measurement of
the signal seen at the DDR2 SDRAM DIMM.
Table 4–7. Signal Comparison When the FPGA is Writing to the Memory (1)
                     Eye Width (ns) (2)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Simulation           1.65                 1.28             0.16            0.14
Board Measurement    1.35                 0.83             0.16            0.18
Notes to Table 4–7:
(1) The drive strength on the FPGA is set to 16 mA.
(2) The eye width is measured from VREF ± 125 mV, where VIH and VIL are determined per the JEDEC specification for SSTL-18.
A closer inspection of the simulation shows an ideal duty cycle of 50%–50%, while the
board measurement shows that the duty cycle is non-ideal, around 53%–47%,
resulting in the difference between the simulation and measured eye width. In
addition, the board measurement is conducted on a 72-bit memory interface, but the
simulation is performed on a single I/O.
FPGA Reading from Memory
Figure 4–20 shows the Class II termination scheme when the FPGA is reading from
memory. When the DDR2 SDRAM DIMM is driving the transmission line, the ringing
and reflection are minimal because the FPGA-side termination pull-up resistor
matches the transmission line impedance.
Figure 4–20. Class II Termination Scheme with Memory-Side Series Resistor
(The figure shows the DDR2 SDRAM DIMM at full drive strength driving a 3-inch, 50-Ω trace to the FPGA receiver; RT = 56-Ω terminations to VTT = 0.9 V appear at both ends of the line, with VREF = 0.9 V at the FPGA.)
Figure 4–21 and Figure 4–22 show the simulation and measurement, respectively, of
the signal at the FPGA side with the full drive strength setting on the DDR2 SDRAM
DIMM. The simulation uses a Class II termination scheme with a source-series
resistor transmission line. The FPGA is reading from the memory with a full drive
strength setting on the DIMM.
Figure 4–21. HyperLynx Simulation, FPGA Reading from Memory
Figure 4–22. Board Measurement, FPGA Reading from Memory
Table 4–8 lists the comparison between the simulation and board measurements of the
signal seen by the FPGA (receiver) when the FPGA is reading from memory.
Table 4–8. Signal Comparison, FPGA is Reading from Memory (1), (2)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Simulation           1.73             0.76             N/A             N/A
Board Measurement    1.28             0.43             N/A             N/A
Notes to Table 4–8:
(1) The drive strength on the DDR2 SDRAM DIMM is set to full strength.
(2) N/A is not applicable.
Both simulation and measurement show a clean signal and a good eye opening
without any over- and undershoot. However, the eye height when the FPGA is
reading from the memory is smaller compared to the eye height when the FPGA is
writing to the memory. The reduction in eye height is attributed to the voltage drop
on the series resistor present on the DIMM. With the drive strength setting on the
memory already set to full, you cannot increase the memory drive strength to
improve the eye height. One option is to remove the series resistor on the DIMM
when the FPGA is reading from memory (refer to the section “Component Versus
DIMM” on page 4–55). Another option is to remove the external parallel resistor near
the memory so that the memory driver sees less loading. For a DIMM configuration,
the latter option is a better choice because the series resistors are part of the DIMM
and you can easily turn on the ODT feature to use it as the termination resistor when the
FPGA is writing to the memory and turn it off when the FPGA is reading from memory.
The results for the Class II termination scheme demonstrate that the scheme is ideal
for bidirectional signals such as data strobe and data for DDR2 SDRAM memory.
Terminations at the receiver eliminate reflections back to the driver and suppress any
ringing at the receiver.
Class I External Parallel Termination
The single parallel (Class I) termination scheme refers to when the termination is
located near the receiver side. Typically, this scheme is used for terminating
unidirectional signals (such as clocks, address, and command signals) for DDR2
SDRAM.
However, because of board constraints, this form of termination scheme is sometimes
used in bidirectional signals, such as data (DQ) and data strobe (DQS) signals. For
bidirectional signals, you can place the termination on either the memory or the FPGA
side. This section focuses only on the Class I termination scheme with memory-side
termination. The memory-side termination ensures impedance matching when the
signal reaches the receiver of the memory. However, when the FPGA is reading from
the memory, there is no termination on the FPGA side, resulting in impedance
mismatch. This section describes the signal quality of this termination scheme.
FPGA Writing to Memory
When the FPGA is writing to the memory (Figure 4–23), the transmission line is
parallel-terminated at the memory side, resulting in minimal reflection on the receiver
side because of the matched impedance seen by the transmission line. The benefit of
this termination scheme is that only one external resistor is required. Alternatively,
you can implement this termination scheme using an ODT resistor instead of an
external resistor.
Refer to the section “Class I Termination Using ODT” on page 4–31 for more
information about how an ODT resistor compares to an external termination resistor.
Figure 4–23. Class I Termination Scheme with Memory-Side Series Resistor
(The figure shows the FPGA driver driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the DDR2 component; a single RT = 56-Ω termination to VTT = 0.9 V is placed at the memory end, with VREF = 0.9 V at the memory.)
Figure 4–24 shows the simulation and measurement of the signal at the memory
(DDR2 SDRAM DIMM) of Class I termination with a memory-side resistor. The FPGA
writes to the memory with a 16 mA drive strength setting.
Figure 4–24. HyperLynx Simulation and Board Measurement, FPGA Writing to Memory
Table 4–9 lists the comparison of the signal at the DDR2 SDRAM DIMM of a Class I
and Class II termination scheme using external resistors with memory-side series
resistors. The FPGA (driver) writes to the memory (receiver).
Table 4–9. Signal Comparison When the FPGA is Writing to Memory (1)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Class I Termination Scheme With External Parallel Resistor
Simulation           1.69             1.51             0.34            0.29
Board Measurement    1.25             1.08             0.41            0.34
Class II Termination Scheme With External Parallel Resistor
Simulation           1.65             1.28             0.16            0.14
Board Measurement    1.35             0.83             0.16            0.18
Note to Table 4–9:
(1) The drive strength on the FPGA is set to 16 mA.
Table 4–9 shows that the overall signal quality of a Class I termination scheme is comparable
to that of a Class II termination scheme, except that the eye height of the
Class I termination scheme is approximately 30% larger. The increase in eye height is
due to the reduced loading “seen” by the driver, because the Class I termination
scheme does not have an FPGA-side parallel termination resistor. However, the increased
eye height comes at a price: a 50% increase in the over- and undershoot of the
signal compared with the Class II termination scheme. You can decrease the FPGA
drive strength to compensate for the reduced loading seen by the driver and thereby reduce
the over- and undershoot.
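The eye-height comparison quoted above can be checked directly against the board-measurement rows of Table 4–9; the snippet below is only that arithmetic, included to make the figure concrete.

# Board-measurement rows of Table 4-9: how much larger is the Class I eye height?
class_i_eye_height_v = 1.08
class_ii_eye_height_v = 0.83
increase_pct = 100.0 * (class_i_eye_height_v - class_ii_eye_height_v) / class_ii_eye_height_v
print("Class I eye height is about %.0f%% larger than Class II" % increase_pct)  # roughly 30%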
For more information about how drive strength affects the signal quality, refer to
“Drive Strength” on page 4–53.
FPGA Reading from Memory
As described in the section “FPGA Writing to Memory” on page 4–28, in Class I
termination, the termination is located near the receiver. However, if you use this
termination scheme to terminate a bidirectional signal, the receiver can also be the
driver. For example, in DDR2 SDRAM, the data signals are both receiver and driver.
Figure 4–25 shows a Class I termination scheme with a memory-side resistor. The
FPGA reads from the memory.
Figure 4–25. Class I Termination Scheme with Memory-Side Series Resistor
(The figure shows the DDR2 SDRAM DIMM at full drive strength driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the FPGA receiver; a single RT = 56-Ω termination to VTT = 0.9 V is placed at the memory end, with VREF = 0.9 V at the FPGA.)
When the FPGA reads from the memory (Figure 4–25), the transmission line is not
terminated at the FPGA, resulting in an impedance mismatch, which then results in
over- and undershoot. Figure 4–26 shows the simulation and measurement of the
signal at the FPGA side (receiver) of a Class I termination. The FPGA reads from the
memory with a full drive strength setting on the DDR2 SDRAM DIMM.
Figure 4–26. HyperLynx Simulation and Board Measurement, FPGA Reading from Memory
Table 4–10 lists the comparison of the signal “seen” at the FPGA of a Class I and
Class II termination scheme using an external resistor with a memory-side series
resistor. The FPGA (receiver) reads from the memory (driver).
Table 4–10. Signal Comparison When the FPGA is Reading From Memory (1), (2)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Class I Termination Scheme with External Parallel Resistor
Simulation           1.73             0.74             0.20            0.18
Board Measurement    1.24             0.58             0.09            0.14
Class II Termination Scheme with External Parallel Resistor
Simulation           1.73             0.76             N/A             N/A
Board Measurement    1.28             0.43             N/A             N/A
Notes to Table 4–10:
(1) The drive strength on the DDR2 SDRAM DIMM is set to full strength.
(2) N/A is not applicable.
When the FPGA reads from the memory using the Class I scheme, the signal quality is
comparable to that of the Class II scheme in terms of eye height and width.
However, Table 4–10 shows that the lack of termination at the receiver (FPGA) results in an
impedance mismatch, causing reflection and ringing that is not visible in the Class II
termination scheme. As such, Altera recommends using the Class I termination scheme for
unidirectional signals (such as command and address signals) between the FPGA and
the memory.
Class I Termination Using ODT
ODT is becoming a common feature in memory devices, including SDRAMs,
graphics DRAMs, and SRAMs. ODT helps reduce board termination cost and
simplifies board routing. This section describes the ODT feature of DDR2 SDRAM and
the signal quality when the ODT feature is used.
FPGA Writing to Memory
DDR2 SDRAM has built-in ODT that eliminates the need for external termination
resistors. To use the ODT feature of the memory, you must configure the memory to
turn on the ODT feature during memory initialization. For DDR2 SDRAM, set the
ODT feature by programming the extended mode register. In addition to
programming the extended mode register during initialization of the DDR2 SDRAM,
an ODT input pin on the DDR2 SDRAM must be driven high to activate the ODT.
f For additional information about setting the ODT feature and the timing
requirements for driving the ODT pin in DDR2 SDRAM, refer to the respective
memory data sheet.
The ODT feature in DDR2 SDRAM is controlled dynamically: it is turned on while
the FPGA is writing to the memory and turned off while the FPGA is reading from the
memory. The ODT feature in DDR2 SDRAM has three settings: 50 Ω, 75 Ω, and 150 Ω.
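To get a feel for how each ODT setting loads a typical 50-Ω trace, you can estimate the receiver-side reflection coefficient for each value. The sketch below is an idealized calculation that models the ODT as a single resistor terminating a lossless 50-Ω line; it is offered only as an illustration, not as a substitute for simulation.

# Idealized receiver-side reflection for each DDR2 ODT setting on a 50-ohm trace.
Z0 = 50.0
for odt_ohms in (50.0, 75.0, 150.0):
    gamma = (odt_ohms - Z0) / (odt_ohms + Z0)
    print("ODT = %3.0f ohm -> reflection coefficient %+.2f" % (odt_ohms, gamma))
# The 50-ohm setting terminates the line exactly; 75 and 150 ohm leave progressively
# more of the incident wave to reflect back toward the driver.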
If there are no external parallel termination resistors and the ODT feature is turned on,
the termination scheme resembles the Class I termination described in “Class I
External Parallel Termination” on page 4–27.
Figure 4–27 shows the termination scheme when the ODT on the DDR2 SDRAM is
turned on.
Figure 4–27. Class I Termination Scheme Using ODT
(The figure shows the FPGA driver at the 16-mA setting driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the DDR2 component, whose ODT provides the termination; VREF = 0.9 V at the memory.)
Figure 4–28 shows the simulation and measurement of the signal visible at the
memory (receiver) using 50-Ω ODT with a memory-side series resistor transmission
line. The FPGA writes to the memory with a 16-mA drive strength setting.
Figure 4–28. Simulation and Board Measurement, FPGA Writing to Memory
Table 4–11 lists the comparison of the signal seen at the DDR2 SDRAM DIMM for a
Class I termination scheme using an external resistor and a Class I termination
scheme using ODT, each with a memory-side series resistor. The FPGA (driver) writes to
the memory (receiver).
Table 4–11. Signal Comparison When the FPGA is Writing to Memory (1), (2)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Class I Termination Scheme with ODT
Simulation           1.63             0.84             N/A             0.12
Board Measurement    1.51             0.76             0.05            0.15
Class I Termination Scheme with External Parallel Resistor
Simulation           1.69             1.51             0.34            0.29
Board Measurement    1.25             1.08             0.41            0.34
Notes to Table 4–11:
(1) The drive strength on the FPGA is set to 16 mA.
(2) N/A is not applicable.
When the ODT feature is enabled in the DDR2 SDRAM, the eye width is improved.
There is some degradation to the eye height, but it is not significant. When ODT is
enabled, the most significant improvement in signal quality is the reduction of the
over- and undershoot, which helps mitigate any potential reliability issues on the
memory devices.
Using memory ODT also eliminates the need for external resistors, which reduces
board cost and simplifies board routing, allowing you to shrink your boards.
Therefore, Altera recommends using the ODT feature on the DDR2 SDRAM memory.
FPGA Reading from Memory
Altera’s Arria GX, Arria II GX, Cyclone series, and Stratix II series of devices are not
equipped with parallel ODT. Because the DDR2 SDRAM ODT feature is turned off
while the FPGA is reading from the memory, the termination scheme resembles the
no-parallel termination scheme illustrated by Figure 4–31 on page 4–35.
No-Parallel Termination
The no-parallel termination scheme is described in the JEDEC standards JESD8-6 for
HSTL I/O, JESD8-9b for SSTL-2 I/O, and JESD8-15a for SSTL-18 I/O. Designers who
attempt series-only termination schemes such as this often do so to eliminate the need
for a VTT power supply.
This scheme is typically not recommended for any signals between the FPGA and the DDR2
SDRAM interface; however, information about this topic is included here as a reference point
to clarify the challenges that may occur if you attempt to avoid parallel termination
entirely.
FPGA Writing to Memory
Figure 4–29 shows a no-parallel termination transmission line of the FPGA driving
the memory. When the FPGA is driving the transmission line, the signals at the
memory-side (DDR2 SDRAM DIMM) may suffer from signal degradation (for
example, degradation in rise and fall time). This is due to impedance mismatch,
because there is no parallel termination at the memory-side. Also, because of factors
such as trace length and drive strength, the degradation seen at the receiver-end
might be sufficient to result in a system failure. To understand the effects of each
termination scheme on a system, perform system-level simulations before and after
the board is designed.
Figure 4–29. No-Parallel Termination Scheme
(The figure shows the FPGA driver driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the DDR2 component receiver, with VREF = 0.9 V at the memory and no parallel termination at either end.)
Figure 4–30 shows a HyperLynx simulation and measurement of the FPGA writing to
the memory at 533 MHz with a no-parallel termination scheme using a 16 mA drive
strength option. The measurement point is on the DDR2 SDRAM DIMM.
Figure 4–30. HyperLynx Simulation and Board Measurement, FPGA Writing to Memory
The simulated and measured signal shows that there is sufficient eye opening but also
significant over- and undershoot of the 1.8-V signal specified by the DDR2 SDRAM.
From the simulation and measurement, the overshoot is approximately 1 V higher
than 1.8 V, and undershoot is approximately 0.8 V below ground. This over- and
undershoot might result in a reliability issue, because it has exceeded the absolute
maximum rating specification listed in the memory vendors’ DDR2 SDRAM data
sheet.
Table 4–12 lists the comparison of the signal visible at the DDR2 SDRAM DIMM of a
no-parallel and a Class II termination scheme when the FPGA writes to the DDR2
SDRAM DIMM.
Table 4–12. Signal Comparison When the FPGA is Writing to Memory (1)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
No-Parallel Termination Scheme
Simulation           1.66             1.10             0.90            0.80
Board Measurement    1.25             0.60             1.10            1.08
Class II Termination Scheme With External Parallel Resistor
Simulation           1.65             1.28             0.16            0.14
Board Measurement    1.35             0.83             0.16            0.18
Note to Table 4–12:
(1) The drive strength on the FPGA is set to Class II 16 mA.
Although the appearance of the signal in a no-parallel termination scheme is not
clean, when you take the key parameters into consideration, the eye width and height
are comparable to those of a Class II termination scheme. The major disadvantage of
using a no-parallel termination scheme is the over- and undershoot. There is no
termination on the receiver, so there is an impedance mismatch when the signal
arrives at the receiver, resulting in ringing and reflection. In addition, the 16-mA drive
strength setting on the FPGA overdrives the transmission line, adding to
the over- and undershoot. By reducing the drive strength setting, the over- and
undershoot decrease, improving the signal quality “seen” by the receiver.
For more information about how drive strength affects the signal quality, refer to
“Drive Strength” on page 4–53.
FPGA Reading from Memory
In a no-parallel termination scheme (Figure 4–31), when the memory is driving the
transmission line, the resistor RS acts as a source termination resistor. The DDR2
SDRAM driver has two drive strength settings:
■ Full strength, in which the output impedance is approximately 18 Ω
■ Reduced strength, in which the output impedance is approximately 40 Ω
When the DDR2 SDRAM DIMM drives the transmission line, the combination of the
22-Ω source-series resistor and the driver impedance should match the
characteristic impedance of the transmission line, as the short sketch below illustrates.
As such, there is less over- and undershoot of the signal visible at the receiver (FPGA).
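That matching argument reduces to simple arithmetic: the memory driver impedance plus the 22-Ω series resistor should land close to the 50-Ω trace impedance. The sketch below uses the approximate driver impedances quoted above; it is an estimate for illustration, not a data sheet calculation.

# Source impedance seen by the trace = DDR2 driver impedance + DIMM series resistor.
Z0, RS = 50.0, 22.0
for setting, driver_ohms in (("full strength", 18.0), ("reduced strength", 40.0)):
    z_source = driver_ohms + RS
    gamma = (z_source - Z0) / (z_source + Z0)
    print("%s: source %.0f ohm, reflection back into the driver %+.2f" % (setting, z_source, gamma))
# Full strength lands near 40 ohm (slightly under-matched); reduced strength is about 62 ohm.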
Figure 4–31. No-Parallel Termination Scheme, FPGA Reading from Memory
(The figure shows the DDR2 SDRAM DIMM at full drive strength driving a 3-inch, 50-Ω trace through the DIMM series resistor RS = 22 Ω to the FPGA receiver, with VREF = 0.9 V and no parallel termination at either end.)
Figure 4–32 shows the simulation and measurement of the signal visible at the FPGA
(receiver) when the memory is driving the no-parallel termination transmission line
with a memory-side series resistor.
Figure 4–32. HyperLynx Simulation and Board Measurement, FPGA Reading from Memory
Table 4–13 lists the comparison of the signal seen on the FPGA with a no-parallel and
a Class II termination scheme when the FPGA is reading from memory.
Table 4–13. Signal Comparison, FPGA Reading From Memory (1), (2)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
No-Parallel Termination Scheme
Simulation           1.82             1.57             0.51            0.51
Board Measurement    1.62             1.29             0.28            0.37
Class II Termination Scheme with External Parallel Resistor
Simulation           1.73             0.76             N/A             N/A
Board Measurement    1.28             0.43             N/A             N/A
Notes to Table 4–13:
(1) The drive strength on the DDR2 SDRAM DIMM is set to full strength.
(2) N/A is not applicable.
As in the section “FPGA Writing to Memory” on page 4–33, the eye width and height
of the signal in a no-parallel termination scheme are comparable to those of a Class II
termination scheme, but the disadvantage is the over- and undershoot. There is over- and
undershoot because of the lack of termination on the transmission line, but its
magnitude is not as severe as that described in “FPGA Writing to Memory” on page 4–33.
This is attributed to the presence of the series resistor at the source (memory side),
which dampens any reflection coming back to the driver and further reduces the effect of
the reflection on the FPGA side.
When the memory-side series resistor is removed (Figure 4–33), the memory driver
impedance no longer matches the transmission line and there is no series resistor at
the driver to dampen the reflection coming back from the unterminated FPGA side.
Figure 4–33. No-Parallel Termination Scheme, FPGA Reading from Memory
(The figure shows the DDR2 component at full drive strength driving a 3-inch, 50-Ω trace directly to the FPGA receiver, with VREF = 0.9 V and no series or parallel termination.)
Figure 4–34 shows the simulation and measurement of the signal at the FPGA side in
a no-parallel termination scheme with the full drive strength setting on the memory.
Figure 4–34. HyperLynx Simulation and Measurement, FPGA Reading from Memory
Table 4–14 lists the difference between no-parallel termination with and without
memory-side series resistor when the memory (driver) writes to the FPGA (receiver).
Table 4–14. No-Parallel Termination with and without Memory-Side Series Resistor (1)
                     Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
Without Series Resistor
Simulation           1.81             0.85             1.11            0.77
Board Measurement    1.51             0.92             0.96            0.99
With Series Resistor
Simulation           1.82             1.57             0.51            0.51
Board Measurement    1.62             1.29             0.28            0.37
Note to Table 4–14:
(1) The drive strength on the memory is set to full drive strength.
Table 4–14 highlights the effect of the memory-side series resistor: removing it dramatically
increases the over- and undershoot and decreases the eye height. This
result is similar to that described in “FPGA Writing to Memory” on page 4–33. In that
simulation, there is a series resistor but it is located at the receiver side (memory-side),
so it does not have the desired effect of reducing the drive strength of the driver and
suppressing the reflection coming back from the unterminated receiver-end. As such,
in a system without receiver-side termination, the series resistor on the driver helps
reduce the drive strength of the driver and dampen the reflection coming back from
the unterminated receiver-end.
Board Termination for DDR3 SDRAM
The following sections describe the correct way to terminate a DDR3 SDRAM
interface together with Stratix III, Stratix IV, and Stratix V FPGA devices.
DDR3 DIMMs have terminations on all unidirectional signals, such as memory clocks
and address and command signals, eliminating the need for termination on the FPGA PCB.
In addition, using the ODT feature on the DDR3 SDRAM and the dynamic OCT
feature of Stratix III, Stratix IV, and Stratix V FPGA devices completely eliminates any
external termination resistors; thus simplifying the layout for the DDR3 SDRAM
interface when compared to that of the DDR2 SDRAM interface.
This section describes the termination for the following DDR3 SDRAM components:
■ Single-Rank DDR3 SDRAM Unbuffered DIMM
■ Multi-Rank DDR3 SDRAM Unbuffered DIMM
■ DDR3 SDRAM Registered DIMM
■ DDR3 SDRAM Components With Leveling
If you are using a DDR3 SDRAM without leveling interface, refer to “Board
Termination for DDR2 SDRAM” on page 4–7.
Single-Rank DDR3 SDRAM Unbuffered DIMM
The most common implementation of the DDR3 SDRAM interface is the unbuffered
DIMM (UDIMM). You can find DDR3 SDRAM UDIMMs in many applications,
especially in PC applications.
Table 4–15 lists the recommended termination and drive strength setting for UDIMM
and Stratix III, Stratix IV, and Stratix V FPGA devices.
These settings are recommendations to get you started. Simulate with your real
board and try different settings to achieve the best signal integrity.
Table 4–15. Drive Strength and ODT Setting Recommendations for Single-Rank UDIMM (SSTL 15 I/O Standard) (1)
Signal Type           FPGA End                              On-Board Termination (2)   Memory-End Termination for Write       Memory Driver Strength for Read
DQ                    Class I R50C/G50C (3)                 —                          60-Ω ODT (4)                           40 Ω (4)
DQS                   Differential Class I R50C/G50C (3)    —                          60-Ω ODT (4)                           40 Ω (4)
DM                    Class I R50C (3)                      —                          60-Ω ODT (4)                           40 Ω (4)
Address and Command   Class I with maximum drive strength   —                          39-Ω on-board termination to VTT (5)
CK/CK#                Differential Class I R50C             —                          On-board (5): 2.2-pF compensation capacitor before the first component; 36-Ω termination to VTT for each arm (72 Ω differential); add 0.1 uF just before VTT. For more information, refer to Figure 4–38 on page 4–42.
Notes to Table 4–15:
(1) UniPHY IP automatically implements these settings.
(2) Altera recommends that you use dynamic on-chip termination (OCT) for Stratix III and Stratix IV device families.
(3) R50C is series 50 Ω with calibration for write; G50C is parallel 50 Ω with calibration for read.
(4) You can specify these settings in the parameter editor.
(5) For a DIMM, these settings are already implemented on the DIMM card; for component topology, Altera recommends that you mimic the termination scheme of the DIMM card on your board.
You can implement a DDR3 SDRAM UDIMM interface in several permutations, such
as single DIMM or multiple DIMMs, using either single-ranked or dual-ranked
UDIMMs. In addition to the UDIMM’s form factor, these termination
recommendations are also valid for small-outline (SO) DIMMs and MicroDIMMs.
DQS, DQ, and DM for DDR3 SDRAM UDIMM
On a single-ranked DIMM, DQS and DQ signals are point-to-point signals.
Figure 4–35 shows the net structure for differential DQS and DQ signals. There is an
external 15-Ω stub resistor, RS, soldered on the DIMM on each of the DQS and DQ
signals, which helps improve signal quality by dampening reflections from unused
slots in a multi-DIMM configuration.
Figure 4–35. DQ and DQS Net Structure for 64-Bit DDR3 SDRAM UDIMM
Notes to Figure 4–35:
(1) Source: PC3-6400/PC3-8500/PC3-10600/PC3-12800 DDR3 SDRAM Unbuffered DIMM Design Specification, July 2007, JEDEC Solid State Technology Association.
(2) For clarity of the signal connections in the illustration, the same SDRAM is drawn as two separate SDRAMs.
As mentioned in “Dynamic ODT” on page 4–5, DDR3 SDRAM supports calibrated
ODT with different ODT value settings. If you do not enable dynamic ODT, there are
three possible ODT settings available for RTT_NORM: 40 Ω, 60 Ω, and 120 Ω. If you
enable dynamic ODT, the number of possible ODT settings available for RTT_NORM
increases from three to five with the addition of 20 Ω and 30 Ω. The trace impedance on
the DIMM is 60 Ω, and the recommended ODT setting is 60 Ω.
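A simple way to think about the setting is to pick the RTT_NORM value closest to the impedance of the trace being terminated. The helper below does exactly that; it is a hypothetical illustration only, and it does not replace board-level simulation or the settings you enter in the parameter editor.

# Hypothetical helper: pick the DDR3 RTT_NORM value nearest the trace impedance.
RTT_NORM_CHOICES = (20.0, 30.0, 40.0, 60.0, 120.0)  # ohms, with dynamic ODT enabled

def nearest_rtt(trace_impedance_ohms):
    return min(RTT_NORM_CHOICES, key=lambda rtt: abs(rtt - trace_impedance_ohms))

print(nearest_rtt(60.0))  # 60-ohm DIMM trace -> 60-ohm ODT, matching the recommendation above
print(nearest_rtt(50.0))  # a 50-ohm trace falls between the 40- and 60-ohm settings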
Figure 4–36 shows the simulated write-eye diagram at the DQ0 of a DDR3 SDRAM
DIMM using the 60-Ω ODT setting, driven by a Stratix III or Stratix IV FPGA using a
calibrated series 50-Ω OCT setting.
Figure 4–36. Simulated Write-Eye Diagram of a DDR3 SDRAM DIMM Using a 60-Ω ODT Setting
Figure 4–37 shows the measured write eye diagram using Altera’s Stratix III or Stratix
IV memory board.
Figure 4–37. Measured Write-Eye Diagram of a DDR3 SDRAM DIMM Using the 60-Ω ODT Setting
The measured eye diagram correlates well with the simulation. The faint line in the
middle of the eye diagram is the effect of the refresh operation during a regular
operation. Because these simulations and measurements are based on a narrow set of
constraints, you must perform your own board-level simulation to ensure that the
chosen ODT setting is right for your setup.
Memory Clocks for DDR3 SDRAM UDIMM
For the DDR3 SDRAM UDIMM, you do not need to place any termination on your
board because the memory clocks are already terminated on the DIMM. Figure 4–38
shows the net structure for the memory clocks and the location of the termination
resistors, RTT. The value of RTT is 36 Ω, which results in an equivalent differential
termination value of 72 Ω. The DDR3 SDRAM DIMM also has a compensation
capacitor, CCOMP, of 2.2 pF placed between the differential memory clocks to improve
signal quality. The recommended capacitor value at the center-tap termination (CTT) is
0.1 uF, placed just before VTT.
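The 72-Ω figure follows directly from the two 36-Ω arms appearing in series across the clock pair. A minimal sketch of that arithmetic, including the 100-Ω case simulated later in this section, is shown below; the resistor values come from the DIMM specification referenced in Figure 4–38.

# Differential termination across CK/CK#: the two single-ended arms add in series.
def differential_termination(r_per_arm_ohms):
    return 2.0 * r_per_arm_ohms

print(differential_termination(36.0))  # DIMM value: 72 ohm differential
print(differential_termination(50.0))  # 100 ohm differential, as simulated in Figure 4-46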
Figure 4–38. Clock Net Structure for a 64-Bit DDR3 SDRAM UDIMM
Notes to Figure 4–38:
(1) Source: PC3-6400/PC3-8500/PC3-10600/PC3-12800 DDR3 SDRAM Unbuffered DIMM Design Specification, July 2007, JEDEC Solid State Technology Association.
(2) The recommended CTT value is 0.1 uF just before VTT.
From Figure 4–38, you can see that the DDR3 SDRAM clocks are routed in a fly-by
topology, as mentioned in “Read and Write Leveling” on page 4–3, resulting in the
need for write and read leveling. Figure 4–39 shows the HyperLynx simulation of the
differential clock seen at the die of the first and last DDR3 SDRAM component on the
UDIMM using the 50-Ω OCT setting on the output driver of the Stratix III or Stratix IV
FPGA.
Figure 4–39. Differential Memory Clock of a DDR3 SDRAM DIMM at the First and Last Component on the DIMM
Figure 4–39 shows that the memory clock seen at the first DDR3 SDRAM component
(the yellow signal) leads the memory clock seen at the last DDR3 SDRAM component
(the green signal) by 1.3 ns, which is about 0.69 tCK for a 533 MHz operation.
Commands and Addresses for DDR3 SDRAM UDIMM
Similar to the memory clock signals, you do not need to place any termination on your
board because the command and address signals are also terminated on the DIMM.
Figure 4–40 shows the net structure for the command and address signals and the
location of the termination resistor, RTT, which has a value of 39 Ω.
Figure 4–40. Command and Address Net Structure for a 64-Bit DDR3 SDRAM Unbuffered DIMM (1)
Note to Figure 4–40:
(1) Source: PC3-6400/PC3-8500/PC3-10600/PC3-12800 DDR3 SDRAM Unbuffered DIMM Design Specification, July 2007, JEDEC Solid State Technology Association.
In Figure 4–40, observe that the DDR3 SDRAM command and address signals are
routed in a fly-by topology, as mentioned in “Read and Write Leveling” on page 4–3,
resulting in the need for write-and-read leveling.
Figure 4–41 shows the HyperLynx simulation of the command and address signal
seen at the die of the first and last DDR3 SDRAM component on the UDIMM, using
an OCT setting on the output driver of the Stratix III or Stratix IV FPGA.
Figure 4–41. Command and Address Eye Diagram of a DDR3 SDRAM DIMM at the First and Last DDR3 SDRAM Component
at 533 MHz (1)
Note to Figure 4–41:
(1) The command and address simulation is performed using a bit period of 1.875 ns.
Figure 4–41 shows that the command and address signal seen at the first DDR3
SDRAM component (the green signal) leads the command and address signals seen at
the last DDR3 SDRAM component (the red signal) by 1.2 ns, which is 0.64 tCK for a
533-MHz operation.
Stratix III, Stratix IV, and Stratix V FPGAs
The following sections review the termination on the single-ranked single DDR3
SDRAM DIMM interface side and investigate the use of different termination features
available in Stratix III, Stratix IV, and Stratix V FPGA devices to achieve optimum
signal integrity for your DDR3 SDRAM interface.
DQS, DQ, and DM for Stratix III, Stratix IV, and Stratix V FPGA
As mentioned in “Dynamic OCT in Stratix III and Stratix IV Devices” on page 4–5,
Stratix III, Stratix IV, and Stratix V FPGAs support the dynamic OCT feature, which
switches from series termination to parallel termination depending on the mode of
the I/O buffer. Because DQS and DQ are bidirectional signals, DQS and DQ can be
both transmitters and receivers. “DQS, DQ, and DM for DDR3 SDRAM UDIMM” on
page 4–40 describes the signal quality of DQ, DQS, and DM when the Stratix III,
Stratix IV, or Stratix V FPGA device is the transmitter with the I/O buffer set to a 50-Ω
series termination.
This section details the condition when the Stratix III, Stratix IV, or Stratix V device is
the receiver, the Stratix III, Stratix IV, and Stratix V I/O buffer is set to a 50-Ω parallel
termination, and the memory is the transmitter. DM is a unidirectional signal, so the
DDR3 SDRAM component is always the receiver.
For receiver termination recommendations and transmitter output drive strength
settings, refer to “DQS, DQ, and DM for DDR3 SDRAM UDIMM” on page 4–40.
Figure 4–42 illustrates the DDR3 SDRAM interface when the Stratix III, Stratix IV, or
Stratix V FPGA device is reading from the DDR3 SDRAM using a 50-Ω parallel OCT
termination on the Stratix III, Stratix IV, or Stratix V FPGA device, and the DDR3
SDRAM driver output impedance is set to 34 Ω.
Figure 4–42. DDR3 SDRAM Component Driving the Stratix III, Stratix IV, and Stratix V FPGA Device with Parallel 50-Ω OCT Turned On
Figure 4–43 shows the simulation of a read from the DDR3 SDRAM DIMM with a
50-Ω parallel OCT setting on the Stratix III or Stratix IV FPGA device.
Figure 4–43. Read-Eye Diagram of a DDR3 SDRAM DIMM at the Stratix III and Stratix IV FPGA Using a Parallel 50-Ω OCT Setting
Use of the Stratix III, Stratix IV, or Stratix V parallel 50-Ω OCT feature matches the
receiver impedance with the transmission line characteristic impedance. This
eliminates any reflection that causes ringing, and results in a clean eye diagram at the
Stratix III, Stratix IV, or Stratix V FPGA.
Memory Clocks for Stratix III, Stratix IV, and Stratix V FPGA
Memory clocks are unidirectional signals. Refer to “Memory Clocks for DDR3
SDRAM UDIMM” on page 4–42 for receiver termination recommendations and
transmitter output drive strength settings.
Commands and Addresses for Stratix III and Stratix IV FPGA
Commands and addresses are unidirectional signals. Refer to “Commands and
Addresses for DDR3 SDRAM UDIMM” on page 4–44 for receiver termination
recommendations and transmitter output drive strength settings.
Multi-Rank DDR3 SDRAM Unbuffered DIMM
You can implement a DDR3 SDRAM UDIMM interface in several permutations, such
as single DIMM or multiple DIMMs, using either single-ranked or dual-ranked
UDIMMs. In addition to the UDIMM’s form factor, these termination
recommendations are also valid for small-outline (SO) DIMMs and MicroDIMMs.
Table 4–16 lists the different permutations of a two-slot DDR3 SDRAM interface and
the recommended ODT settings on both the memory and controller when writing to
memory.
Table 4–16. DDR3 SDRAM ODT Matrix for Writes (1), (2)
Slot 1   Slot 2   Write To   Controller OCT (3)   Slot 1, Rank 1   Slot 1, Rank 2   Slot 2, Rank 1   Slot 2, Rank 2
DR       DR       Slot 1     Series 50 Ω          120 Ω (4)        ODT off          ODT off          40 Ω (4)
DR       DR       Slot 2     Series 50 Ω          ODT off          40 Ω (4)         120 Ω (4)        ODT off
SR       SR       Slot 1     Series 50 Ω          120 Ω (4)        Unpopulated      40 Ω (4)         Unpopulated
SR       SR       Slot 2     Series 50 Ω          40 Ω (4)         Unpopulated      120 Ω (4)        Unpopulated
DR       Empty    Slot 1     Series 50 Ω          120 Ω            ODT off          Unpopulated      Unpopulated
Empty    DR       Slot 2     Series 50 Ω          Unpopulated      Unpopulated      120 Ω            ODT off
SR       Empty    Slot 1     Series 50 Ω          120 Ω            Unpopulated      Unpopulated      Unpopulated
Empty    SR       Slot 2     Series 50 Ω          Unpopulated      Unpopulated      120 Ω            Unpopulated
Notes to Table 4–16:
(1) SR: single-ranked DIMM; DR: dual-ranked DIMM.
(2) These recommendations are taken from the DDR3 ODT and Dynamic ODT session of the JEDEC DDR3 2007 Conference, Oct 3-4, San Jose, CA.
(3) The controller in this case is the FPGA.
(4) Dynamic ODT is required. For example, the ODT of Slot 2 is set to the lower ODT value of 40 Ω when the memory controller is writing to Slot 1, resulting in termination and thus minimizing any reflection from Slot 2. Without dynamic ODT, Slot 2 will not be terminated.
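The rule captured by Table 4–16 and note (4) can be summarized as follows: the controller always writes through its series 50-Ω OCT, the slot being written terminates with 120 Ω, and the other populated slot drops to 40 Ω through dynamic ODT. The sketch below encodes that rule for the single-rank-per-slot case only; the function and argument names are hypothetical, it is a simplification of the table, and it does not cover every rank permutation.

# Simplified write-ODT rule for a two-slot, single-rank-per-slot DDR3 system,
# following Table 4-16 and note (4). Hypothetical helper, not an Altera API.
def write_odt_settings(target_slot, populated_slots):
    settings = {"controller_oct": "series 50 ohm"}
    for slot in (1, 2):
        if slot not in populated_slots:
            settings["slot%d_odt" % slot] = "unpopulated"
        elif slot == target_slot:
            settings["slot%d_odt" % slot] = "120 ohm"
        else:
            settings["slot%d_odt" % slot] = "40 ohm (dynamic ODT)"
    return settings

print(write_odt_settings(target_slot=1, populated_slots={1, 2}))
print(write_odt_settings(target_slot=1, populated_slots={1}))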
Table 4–17 lists the different permutations of a two-slot DDR3 SDRAM interface and
the recommended ODT settings on both the memory and controller when reading
from memory.
Table 4–17. DDR3 SDRAM ODT Matrix for Reads (1), (2)
Slot 1   Slot 2   Read From   Controller OCT (3)   Slot 1, Rank 1   Slot 1, Rank 2   Slot 2, Rank 1   Slot 2, Rank 2
DR       DR       Slot 1      Parallel 50 Ω        ODT off          ODT off          ODT off          40 Ω
DR       DR       Slot 2      Parallel 50 Ω        ODT off          40 Ω             ODT off          ODT off
SR       SR       Slot 1      Parallel 50 Ω        ODT off          Unpopulated      40 Ω             Unpopulated
SR       SR       Slot 2      Parallel 50 Ω        40 Ω             Unpopulated      ODT off          Unpopulated
DR       Empty    Slot 1      Parallel 50 Ω        ODT off          ODT off          Unpopulated      Unpopulated
Empty    DR       Slot 2      Parallel 50 Ω        Unpopulated      Unpopulated      ODT off          ODT off
SR       Empty    Slot 1      Parallel 50 Ω        ODT off          Unpopulated      Unpopulated      Unpopulated
Empty    SR       Slot 2      Parallel 50 Ω        Unpopulated      Unpopulated      ODT off          Unpopulated
Notes to Table 4–17:
(1) SR: single-ranked DIMM; DR: dual-ranked DIMM.
(2) These recommendations are taken from the DDR3 ODT and Dynamic ODT session of the JEDEC DDR3 2007 Conference, Oct 3-4, San Jose, CA.
(3) The controller in this case is the FPGA. JEDEC typically recommends 60 Ω, but this value assumes that the typical motherboard trace impedance is 60 Ω and that the controller supports this termination. Altera recommends using a 50-Ω parallel OCT when reading from the memory.
DDR3 SDRAM Registered DIMM
The difference between a registered DIMM (RDIMM) and a UDIMM is that the clock,
address, and command pins of the RDIMM are registered or buffered on the DIMM
before they are distributed to the memory devices. For a controller, each clock,
address, or command signal has only one load, which is the register or buffer. In a
UDIMM, each controller pin must drive a fly-by wire with multiple loads.
You do not need to terminate the clock, address, and command signals on your board
because these signals are terminated at the register. Because of the register, these
signals become point-to-point signals and have improved signal integrity, which
relaxes the drive strength requirements of the FPGA driver pins.
Similar to the signals in a UDIMM, the DQS, DQ, and DM signals on a RDIMM are
not registered. To terminate these signals, refer to “DQS, DQ, and DM for DDR3
SDRAM UDIMM” on page 4–40.
DDR3 SDRAM Components With Leveling
This section discusses terminations used to achieve optimum performance for
designing the DDR3 SDRAM interface using discrete DDR3 SDRAM components.
In addition to using a DDR3 SDRAM DIMM to implement your DDR3 SDRAM
interface, you can also use DDR3 SDRAM components. For applications with limited
board real estate, using DDR3 SDRAM components removes the need for a DIMM
connector and places the components closer together, resulting in denser layouts.
DDR3 SDRAM Components With or Without Leveling
The DDR3 SDRAM UDIMM is laid out to the JEDEC specification. The JEDEC
specification is available from either the JEDEC Organization website
(www.JEDEC.org) or from the memory vendors. However, when you are designing
the DDR3 SDRAM interface using discrete SDRAM components, you may want a
layout scheme that is different from the DIMM specification. You have the following
two options:
■ Mimic the standard DDR3 SDRAM DIMM, using a fly-by topology for the
memory clocks, address, and command signals. This option needs read and write
leveling, so you must use the UniPHY IP with leveling.
f For more information about this fly-by configuration, continue reading this
chapter.
■ Mimic a standard DDR2 SDRAM DIMM, using a balanced (symmetrical) tree-type
topology for the memory clocks, address, and command signals. Using this
topology results in unwanted stubs on the command, address, and clock signals, which
degrade signal integrity and limit the performance of the DDR3 SDRAM
interface.
DQS, DQ, and DM for DDR3 SDRAM Components
When you are laying out the DDR3 SDRAM interface using Stratix III, Stratix IV, or
Stratix V devices, Altera recommends that you not include the 15-Ω stub series
resistor on the DQS, DQ, and DM signals, unless your simulation shows that the
absence of this resistor causes extra reflection. Although adding the 15-Ω stub
series resistor may help to maintain constant impedance in some cases, it also slightly
reduces the signal swing at the receiver. It is unlikely that removing this resistor
produces a noticeable reflection in the waveform, but it is your responsibility to
verify this by simulating your board traces. Therefore, Altera recommends the DQS, DQ, and DM
topology shown in Figure 4–44 when the Stratix III, Stratix IV, or Stratix V FPGA is
writing to the DDR3 SDRAM.
Figure 4–44. Stratix III, Stratix IV, and Stratix V FPGA Writing to DDR3 SDRAM Components
When you are using DDR3 SDRAM components, there are no DIMM connectors. This
minimizes any impedance discontinuity, resulting in better signal integrity.
Memory Clocks for DDR3 SDRAM Components
When you use DDR3 SDRAM components, you must account for the compensation
capacitor and differential termination resistor that are present between the differential
memory clocks on the DIMM. Figure 4–45 shows the HyperLynx simulation of the
differential clock seen at the die of the first and last DDR3 SDRAM component using a
fly-by topology on a board, without the 2.2-pF compensation capacitor, using the 50-Ω
OCT setting on the output driver of the Stratix III, Stratix IV, or Stratix V FPGA.
Figure 4–45. Differential Memory Clock of a DDR3 SDRAM Component without the Compensation Capacitor at the First
and Last Component Using a Fly-by Topology on a Board
Without the compensation capacitor, the memory clock at the first component (the
yellow signal) has significant ringing, whereas with the compensation capacitor the
ringing is dampened. Similarly, the differential termination resistor must be
included in the design. Choose your differential termination resistor value depending
on your board stackup and layout requirements. Figure 4–46 shows the
HyperLynx simulation of the differential clock seen at the die of the first and last
DDR3 SDRAM component using a fly-by topology on a board, terminated with
100 Ω instead of the 72 Ω used on the DIMM.
Figure 4–46. Differential Memory Clock of a DDR3 SDRAM DIMM Terminated with 100 Ω at the First and Last Component
Using a Fly-by Topology on a Board
Terminating with 100 Ω instead of 72 Ω results in a slight reduction in peak-to-peak
amplitude. To simplify your design, use the terminations outlined in the JEDEC
specification for the DDR3 SDRAM UDIMM as your guide and perform simulation to
ensure that the DDR3 SDRAM UDIMM terminations provide you with optimum
signal quality.
In addition to choosing the value of the differential termination, you must consider
the trace length of the memory clocks. Altera’s DDR3 UniPHY IP currently supports a
flight-time skew of no more than 0.69 tCK between the first and last memory
component. If you use Altera’s DDR3 UniPHY IP to create your DDR3 SDRAM
interface, ensure that the flight-time skew of your memory clocks is not more than
0.69 tCK. UniPHY IP also requires that the total combination of the clock fly-by
skew and the DQS skew is less than 1 clock cycle.
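Both UniPHY limits are easy to check once the flight-time numbers are available from simulation. The sketch below converts a skew in nanoseconds to clock cycles at a given memory clock rate and tests the two limits; the 1.3-ns clock skew reproduces the 0.69 tCK figure from the DIMM simulation earlier in this section, while the 0.2-ns DQS skew is an assumed placeholder used only for illustration.

# Check fly-by flight-time skew against the UniPHY leveling limits discussed above.
def skew_in_tck(skew_ns, clock_mhz):
    tck_ns = 1000.0 / clock_mhz
    return skew_ns / tck_ns

clock_mhz = 533.0
clock_skew_tck = skew_in_tck(1.3, clock_mhz)  # first-to-last component clock skew
dqs_skew_tck = skew_in_tck(0.2, clock_mhz)    # assumed DQS skew, placeholder value
print("Clock fly-by skew: %.2f tCK (limit 0.69 tCK)" % clock_skew_tck)
print("Clock + DQS skew:  %.2f tCK (limit 1.00 tCK)" % (clock_skew_tck + dqs_skew_tck))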
Refer to “Layout Guidelines for DDR3 SDRAM Interface” on page 4–64 for more
information about layout guidelines for DDR3 SDRAM components.
Command and Address Signals for DDR3 SDRAM
As with the memory clock signals, you must account for the termination resistor on the
command and address signals when you use DDR3 SDRAM components. Choose
your termination resistor value depending on your board stackup and layout
requirements. Figure 4–47 shows the HyperLynx simulation of the command and
address signal seen at the die of the first and last DDR3 SDRAM component using a fly-by
topology on a board, terminated with 60 Ω instead of the 39 Ω used on the DIMM.
Figure 4–47. Command and Address Eye Diagram of a DDR3 SDRAM Component Using Fly-by Topology on a Board at the First and Last DDR3 SDRAM Component at 533 MHz, Terminated with 60 Ω
Terminating with 60 Ω instead of 39 Ω results in eye closure in the signal at the first
component (the green signal), while there is no effect on the signal at the last
component (the red signal). To simplify your design with discrete DDR3 SDRAM
components, use the terminations outlined in the JEDEC specification for DDR3
SDRAM UDIMM as your guide, and perform simulation to ensure that the DDR3
SDRAM UDIMM terminations provide you with the optimum signal quality.
As with memory clocks, you must consider the trace length of the command and
address signals so that they match the flight-time skew of the memory clocks.
Stratix III, Stratix IV, and Stratix V FPGAs
The Stratix III, Stratix IV, or Stratix V FPGA termination settings for DIMMs also apply to DDR3 SDRAM component interfaces.
Table 4–18 compares the effects of the series stub resistor on the eye diagram at the
Stratix III or Stratix IV FPGA (receiver) when the Stratix III or Stratix IV FPGA is
reading from the memory.
Table 4–18. Read-Eye Diagram with and without RS Using 50-Ω Parallel OCT

ODT          Eye Height (V)   Eye Width (ps)   Overshoot (V)   Undershoot (V)
With RS      0.70             685              —               —
Without RS   0.73             724              —               —
Without the 15-Ω stub series resistor to dampen the signal, the signal at the receiver of
the Stratix III or Stratix IV FPGA driven by the DDR3 SDRAM component is larger
than the signal at the receiver of the Stratix III or Stratix IV FPGA driven by DDR3
SDRAM DIMM (Figure 4–42), and similar to the write-eye diagram in “DQS, DQ, and
DM for DDR3 SDRAM Components” on page 4–49.
Drive Strength
Altera’s FPGA products offer numerous drive strength settings, allowing you to
optimize your board designs to achieve the best signal quality. This section focuses on
the most commonly used drive strength settings of 8 mA and 16 mA, as
recommended by JEDEC for Class I and Class II termination schemes.
You are not restricted to using only these drive strength settings for your board
designs. You should perform simulations using I/O models available from Altera and
memory vendors to ensure that you use the proper drive strength setting to achieve
optimum signal integrity.
How Strong is Strong Enough?
Figure 4–19 on page 4–24 shows a signal probed at the DDR2 SDRAM DIMM
(receiver) of a far-end series-terminated transmission line when the FPGA writes to
the DDR2 SDRAM DIMM using a drive strength setting of 16 mA. The resulting
signal quality on the receiver shows excessive over- and undershoot. To reduce the
over- and undershoot, you can reduce the drive strength setting on the FPGA from
16 mA to 8 mA. Figure 4–48 shows the simulation and measurement of the FPGA
with a drive strength setting of 8 mA driving a no-parallel termination transmission
line.
Figure 4–48. HyperLynx Simulation and Measurement, FPGA Writing to Memory
Table 4–19 compares the signals at the DDR2 SDRAM DIMM with no-parallel
termination and memory-side series resistors when the FPGA is writing to the
memory with 8-mA and 16-mA drive strength settings.
Table 4–19. Simulation and Board Measurement Results for 8-mA and 16-mA Drive Strength Settings

                      Eye Width (ns)   Eye Height (V)   Overshoot (V)   Undershoot (V)
8-mA Drive Strength Setting
  Simulation          1.48             1.71             0.24            0.35
  Board Measurement   1.10             1.24             0.24            0.50
16-mA Drive Strength Setting
  Simulation          1.66             1.10             0.90            0.80
  Board Measurement   1.25             0.60             1.10            1.08
With the lower drive strength setting, the overall signal quality is improved. The eye width is reduced, but the eye height is significantly larger, and the over- and undershoot are reduced dramatically.
To improve the signal quality further, you should use 50-Ω on-chip series termination in place of an 8-mA drive strength, and 25-Ω on-chip series termination in place of a 16-mA drive strength. Refer to “On-Chip Termination (Non-Dynamic)” on page 4–19 for simulation and board measurements.
The drive strength setting is highly dependent on the termination scheme, so it is
critical that you perform pre- and post-layout board-level simulations to determine
the proper drive strength settings.
System Loading
You can use memory in a variety of forms, such as individual components or multiple DIMMs, resulting in different loading seen by the FPGA. This section describes the effect on signal quality when interfacing with memory in component, dual-rank DIMM, and dual-DIMM formats.
Component Versus DIMM
When using discrete DDR2 SDRAM components, the additional loading from the
DDR2 SDRAM DIMM connector is eliminated and the memory-side series resistor on
the DDR2 SDRAM DIMM is no longer there. You must decide if the memory-side
series resistor near the DDR2 SDRAM is required.
FPGA Writing to Memory
Figure 4–49 shows the Class II termination scheme without the memory-side series
resistor when the FPGA is writing to the memory in the component format.
Figure 4–49. Class II Termination Scheme without Memory-Side Series Resistor
(The figure shows the FPGA 16-mA driver driving a 50-Ω, 3-inch trace to the DDR2 component receiver, with RT = 56 Ω pull-ups to VTT = 0.9 V at both ends of the line and VREF = 0.9 V at the receiver.)
Figure 4–50 shows the simulation and measurement results of the signal seen at a
DDR2 SDRAM component of a Class II termination scheme without the DIMM
connector and the memory-side series resistor. The FPGA is writing to the memory
with a 16-mA drive strength setting.
Figure 4–50. HyperLynx Simulation and Measurement of the Signal, FPGA Writing to Memory
Table 4–20 compares the signal for a single rank DDR2 SDRAM DIMM and a single
DDR2 SDRAM component in a Class II termination scheme when the FPGA is writing
to the memory.
Table 4–20. Simulation and Board Measurement Results for Single Rank DDR2 SDRAM DIMM and Single DDR2 SDRAM Component (1), (2)

                Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Single DDR2 SDRAM Component
  Simulation    1.79        1.15         0.39        0.33         3.90           3.43
  Measurement   1.43        0.96         0.10        0.13         1.43           1.43
Single Rank DDR2 SDRAM DIMM
  Simulation    1.65        0.86         N/A         N/A          1.71           1.95
  Measurement   1.36        0.41         N/A         N/A          1.56           1.56

Notes to Table 4–20:
(1) The drive strength on the FPGA is set to Class II 16 mA.
(2) N/A is not applicable.
The overall signal quality is comparable between the single rank DDR2 SDRAM
DIMM and the single DDR2 SDRAM component, but the elimination of the DIMM
connector and memory-side series resistor results in a more than 50% improvement in
the eye height.
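As a quick check of that figure, the arithmetic behind the “more than 50%” statement can be reproduced directly from the measured eye heights in Table 4–20; the short sketch below simply does that calculation.

```python
# Percentage improvement in measured eye height (Table 4-20, board measurements).
dimm_eye_height_v = 0.41        # single-rank DDR2 SDRAM DIMM
component_eye_height_v = 0.96   # single DDR2 SDRAM component

improvement = (component_eye_height_v - dimm_eye_height_v) / dimm_eye_height_v
print(f"Eye height improvement: {improvement:.0%}")  # well over 50%
```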
FPGA Reading from Memory
Figure 4–51 shows the Class II termination scheme without the memory-side series
resistor when the FPGA is reading from memory. Without the memory-side series
resistor, the memory driver has less loading to drive the Class II termination.
Compare this result to the result of the DDR2 SDRAM DIMM described in “FPGA
Reading from Memory” on page 4–35 where the memory-side series resistor is on the
DIMM.
Figure 4–51. Class II Termination Scheme without Memory-Side Series Resistor
(The figure shows the DDR2 DIMM full-strength driver driving a 50-Ω, 3-inch trace back to the FPGA receiver, with RT = 56 Ω pull-ups to VTT = 0.9 V at both ends of the line and VREF = 0.9 V.)
Figure 4–52 shows the simulation and measurement results of the signal seen at the
FPGA. The FPGA reads from memory without the source-series resistor near the
DDR2 SDRAM component on a Class II-terminated transmission line. The FPGA
reads from memory with a full drive strength setting.
Figure 4–52. HyperLynx Simulation and Measurement, FPGA Reading from the DDR2 SDRAM Component
Table 4–21 compares the signal at a single rank DDR2 SDRAM DIMM and a single
DDR2 SDRAM component of a Class II termination scheme. The FPGA is reading
from memory with a full drive strength setting.
Table 4–21. Simulation and Board Measurement Results of Single Rank DDR2 SDRAM DIMM and DDR2 SDRAM Component (1)

                Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Single DDR2 SDRAM Component
  Simulation    1.79        1.06         N/A         N/A          2.48           3.03
  Measurement   1.36        0.63         0.13        0.00         1.79           1.14
Single Rank DDR2 SDRAM DIMM
  Simulation    1.73        0.76         N/A         N/A          1.71           1.95
  Measurement   1.28        0.43         N/A         N/A          0.93           0.86

Note to Table 4–21:
(1) N/A is not applicable.
The effect of eliminating the DIMM connector and memory-side series resistor is
evident in the improvement in the eye height.
Single- Versus Dual-Rank DIMM
DDR2 SDRAM DIMMs are available in either single- or dual-rank DIMM. Single-rank
DIMMs are DIMMs with DDR2 SDRAM memory components on one side of the
DIMM. Higher-density DIMMs are available as dual-rank, which has DDR2 SDRAM
memory components on both sides of the DIMM. With the dual-rank DIMM
configuration, the loading is twice that of a single-rank DIMM. Depending on the
board design, you must adjust the drive strength setting on the memory controller to account for this increase in loading. Figure 4–53 shows the simulation result of the signal seen at a dual-rank DDR2 SDRAM DIMM. The simulation uses Class II termination with a memory-side series resistor transmission line. The FPGA uses a 16-mA drive strength setting.
Figure 4–53. HyperLynx Simulation with a 16-mA Drive Strength Setting on the FPGA
Table 4–22 compares the signals at a single- and dual-rank DDR2 SDRAM DIMM of a
Class II and far-end source-series termination when the FPGA is writing to the
memory with a 16-mA drive strength setting.
Table 4–22. Simulation Results of Single- and Dual-Rank DDR2 SDRAM DIMM (1)

                Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Dual Rank DDR2 SDRAM DIMM
  Simulation    1.34        1.27         0.12        0.12         0.99           0.94
Single Rank DDR2 SDRAM DIMM
  Simulation    1.65        1.27         0.10        0.10         1.71           1.95

Note to Table 4–22:
(1) The drive strength on the FPGA is set to Class II 16 mA.
In a dual-rank DDR2 SDRAM DIMM, the additional loading leads to a slower edge rate, which reduces the eye width. The slower edge rate also degrades the setup and hold time required by the memory. The overall signal quality remains comparable, but the reduced eye width in the dual-rank DIMM results in a smaller data capture window, which you must take into account when performing timing analysis for the memory interface.
Single DIMM Versus Multiple DIMMs
Some applications, such as packet buffering, require deeper memory, making a single-DIMM interface insufficient. If you use a multiple-DIMM configuration to increase memory depth, the memory controller must interface with multiple sets of data strobes and data lines instead of the point-to-point connections of a single-DIMM configuration. This results in heavier loading on the interface, which can potentially impact the overall performance of the memory interface.
For detailed information about a multiple DIMM DDR2 SDRAM memory interface, refer to the Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines chapter.
Design Layout Guidelines
This section discusses general layout guidelines for designing your DDR2 and DDR3
SDRAM interfaces. These layout guidelines help you plan your board layout, but are
not meant as strict rules that must be adhered to. Altera recommends that you
perform your own board-level simulations to ensure that the layout you choose for
your board allows you to achieve your desired performance.
These layout guidelines are for both ALTMEMPHY- and UniPHY-based IP designs,
unless specified otherwise.
For more information about how the memory manufacturers route these address and control signals on their DIMMs, refer to the Cadence PCB browser from the Cadence website, at www.cadence.com. The various JEDEC example DIMM layouts are available from the JEDEC website, at www.jedec.org.
The following layout guidelines include several length-based (±) rules. These length-based guidelines are first-order timing approximations for use when you cannot simulate the actual delay characteristics of the interface. They do not include any margin for crosstalk.
Altera recommends that you obtain accurate time-based skew numbers for your design by simulating your specific implementation.
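If you do apply the length-based numbers instead of simulated delays, the time-to-length conversion behind them is straightforward. The sketch below is a rough helper, assuming roughly 200 ps per inch of trace, which is the conversion implied by the ±10 ps ≈ ±0.050 inch pairs used in these guidelines; substitute your own stackup value.

```python
# Convert a skew budget in ps to a trace-length matching budget in mils.
# Assumes ~200 ps/inch, the conversion implied by the +/-10 ps ~ +/-0.050 inch
# pairs in these guidelines; replace with your simulated stackup value.

PS_PER_INCH = 200.0

def skew_budget_to_mils(skew_ps):
    inches = skew_ps / PS_PER_INCH
    return inches * 1000.0  # 1 inch = 1000 mils

for budget in (5, 10, 25, 50):
    print(f"+/-{budget} ps -> +/-{skew_budget_to_mils(budget):.0f} mils")
```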
Layout Guidelines for DDR2 SDRAM Interface
Table 4–23 lists DDR2 SDRAM layout guidelines.
These layout guidelines also apply to DDR3 SDRAM without leveling interfaces.
Table 4–23. DDR2 SDRAM Layout Guidelines (1)

Parameter: DIMMs
If you consider a normal DDR2 unbuffered, unregistered DIMM, essentially you are planning to perform the DIMM routing directly on your PCB. Therefore, each address and control pin must route from the FPGA (single pin) to all memory devices and must be on the same side of the FPGA.

Parameter: Impedance
■ All signal planes must be 50–60 Ω, single-ended, ±10%.
■ All signal planes must be 100 Ω, differential ±10%.
■ All unused via pads must be removed, because they cause unwanted capacitance.

Parameter: Decoupling Parameter
■ Use 0.1-µF capacitors in 0402 size to minimize inductance.
■ Make VTT voltage decoupling close to the pull-up resistors.
■ Connect decoupling caps between VTT and ground.
■ Use a 0.1-µF cap for every other VTT pin and a 0.01-µF cap for every VDD and VDDQ pin.

Parameter: Power
■ Route GND and 1.8 V as planes.
■ Route VCCIO for memories in a single split plane with at least a 20-mil (0.020 inches, or 0.508 mm) gap of separation.
■ Route VTT as islands or 250-mil (6.35-mm) power traces.
■ Route oscillators and PLL power as islands or 100-mil (2.54-mm) power traces.

Parameter: General Routing
All specified delay matching requirements include PCB trace delays, different layer propagation velocity variance, and crosstalk. To minimize PCB layer propagation variance, Altera recommends that signals from the same net group always be routed on the same layer.
■ Use 45° angles (not 90° corners).
■ Avoid T-junctions for critical nets or clocks.
■ Avoid T-junctions greater than 250 mils (6.35 mm).
■ Disallow signals across split planes.
■ Restrict routing other signals close to system reset signals.
■ Avoid routing memory signals closer than 0.025 inch (0.635 mm) to PCI or system clocks.
■ All data, address, and command signals must have matched-length traces within ±50 ps (±0.250 inches, or 6.35 mm).
■ All signals within a given byte-lane group should be matched in length with a maximum deviation of ±10 ps, or approximately ±0.050 inches (1.27 mm), and routed on the same layer.

Parameter: Clock Routing
■ Route clocks on inner layers with outer-layer run lengths held to under 500 mils (12.7 mm).
■ These signals should maintain a 10-mil (0.254 mm) spacing from other nets.
■ Clocks should maintain a length-matching between clock pairs of ±5 ps, or approximately ±25 mils (0.635 mm).
■ Differential clocks should maintain a length-matching between P and N signals of ±2 ps, or approximately ±10 mils (0.254 mm), routed in parallel.
■ Space between different pairs should be at least three times the space between the differential pairs. Route the pairs differentially (5-mil trace, 10–15 mil space on centers), and equal to the signals in the Address/Command Group, or up to 100 mils (2.54 mm) longer than the signals in the Address/Command Group.

Parameter: Address and Command Routing
■ Unbuffered address and command lines are more susceptible to crosstalk and are generally noisier than buffered address or command lines. Therefore, unbuffered address and command signals should be routed on a different layer than data signals (DQ) and data mask signals (DM), and with greater spacing.
■ Do not route differential clock (CK) and clock enable (CKE) signals close to address signals.

Parameter: External Memory Routing Rules
■ Keep the distance from the pin on the DDR2 DIMM or component to the termination resistor pack (VTT) to less than 500 mils for the DQS[x] Data Groups.
■ Keep the distance from the pin on the DDR2 DIMM or component to the termination resistor pack (VTT) to less than 1000 mils for the ADR_CMD_CTL Address Group.
■ Parallelism rules for the DQS[x] Data Groups are as follows:
  ■ 4 mils for parallel runs < 0.1 inch (approximately 1× spacing relative to plane distance)
  ■ 5 mils for parallel runs < 0.5 inch (approximately 1× spacing relative to plane distance)
  ■ 10 mils for parallel runs between 0.5 and 1.0 inches (approximately 2× spacing relative to plane distance)
  ■ 15 mils for parallel runs between 1.0 and 6.0 inches (approximately 3× spacing relative to plane distance)
■ Parallelism rules for the ADR_CMD_CTL group and CLOCKS group are as follows:
  ■ 4 mils for parallel runs < 0.1 inch (approximately 1× spacing relative to plane distance)
  ■ 10 mils for parallel runs < 0.5 inch (approximately 2× spacing relative to plane distance)
  ■ 15 mils for parallel runs between 0.5 and 1.0 inches (approximately 3× spacing relative to plane distance)
  ■ 20 mils for parallel runs between 1.0 and 6.0 inches (approximately 4× spacing relative to plane distance)
■ All signals are to maintain a 20-mil separation from other, non-related nets.
■ All signals must have a total length of < 6 inches.

Parameter: Termination Rules
■ When pull-up resistors are used, a fly-by termination configuration is recommended. Fly-by helps reduce stub reflection issues.
■ Pull-ups should be within 0.5 inch to no more than 1 inch.
■ The pull-up is typically 56 Ω.
■ If using resistor networks:
  ■ Do not share R-pack series resistors between address/command and data lines (DQ, DQS, and DM), to eliminate crosstalk within the pack.
  ■ Series and pull-up tolerances are 1–2%.
  ■ Series resistors are typically 10 to 20 Ω.
  ■ Address and control series resistors are typically at the FPGA end of the link.
  ■ DM, DQS, and DQ series resistors are typically at the memory end of the link (or just before the first DIMM).
■ If termination resistor packs are used:
  ■ The distance to your memory device should be less than 750 mils.
  ■ The distance from your Altera FPGA device should be less than 1250 mils.

Parameter: Quartus II Software Settings for Board Layout
■ To perform timing analyses of the board and I/O buffers, use a third-party simulation tool to simulate all timing information such as skew, ISI, and crosstalk, and enter the simulation results into the UniPHY board settings panel.
■ Do not use the advanced I/O timing model (AIOT) or board trace model unless you do not have access to any third-party tool. AIOT provides reasonable accuracy, but tools such as HyperLynx provide better results. At higher operating frequencies, it is crucial to properly simulate all signal-integrity-related uncertainties.
■ The Quartus II software performs a timing check of how soon the controller can issue a write command after a read command, which limits the maximum length of the DQ/DQS traces. Turn on the bus turnaround timing option and make sure the margin is positive before board fabrication. Functional failure occurs if the margin is negative.

Note to Table 4–23:
(1) For point-to-point and DIMM interface designs, refer to the Micron website, www.micron.com.
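A first-order way to apply the byte-lane matching rule in the table during a layout review is to check the spread of routed lengths within each group against the ±10 ps (approximately ±50 mil) budget. The following is a minimal sketch, assuming you export net lengths in mils from your layout tool; the group contents and lengths shown are hypothetical.

```python
# Check DQ/DQS/DM length matching within a byte-lane group against the
# Table 4-23 rule: maximum deviation of +/-10 ps, approximately +/-50 mils.

def check_byte_lane_group(lengths_mils, budget_mils=50.0):
    """lengths_mils: routed lengths of every net in one byte-lane group."""
    center = (max(lengths_mils) + min(lengths_mils)) / 2.0
    worst = max(abs(length - center) for length in lengths_mils)
    return worst <= budget_mils, worst

# Hypothetical net-length report for one byte-lane group.
group0 = {"DQ0": 2510, "DQ1": 2535, "DQ2": 2490, "DQS0": 2520, "DM0": 2505}
ok, worst = check_byte_lane_group(list(group0.values()))
print(f"byte lane group 0: worst deviation {worst:.1f} mils, pass={ok}")
```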
Layout Guidelines for DDR3 SDRAM Interface
Table 4–24 lists DDR3 SDRAM layout guidelines.
These layout guidelines are specifically for DDR3 UDIMMs and for interfaces with discrete components using fly-by networks clocked at 1,066 MHz.
You must account for FPGA package delay when determining trace-length matching. The trace-length matching requirements in Table 4–24 apply only to interfaces with frequencies above 533 MHz.
To get the package net length report for Altera devices, refer to the Net Length Reports on the Board Design Report page, or refer to the Package Delay column in the .pin file generated by the Quartus II software.
Table 4–24. DDR3 SDRAM UDIMM Layout Guidelines (1)

Parameter: DIMMs
If you consider a normal DDR3 unbuffered, unregistered DIMM, essentially you are planning to perform the DIMM routing directly on your PCB. Therefore, each address and control pin must route from the FPGA (single pin) to all memory devices and must be on the same side of the FPGA.

Parameter: Impedance
■ All signal planes must be 50 Ω, single-ended, ±10%.
■ All signal planes must be 100 Ω, differential ±10%.
■ All unused via pads must be removed, because they cause unwanted capacitance.

Parameter: Decoupling Parameter
■ Use 0.1-µF capacitors in 0402 size to minimize inductance.
■ Make VTT voltage decoupling close to the DDR3 SDRAM components and pull-up resistors.
■ Connect decoupling caps between VTT and VDD using a 0.1-µF cap for every other VTT pin.
■ Use a 0.1-µF cap and a 0.01-µF cap for every VDDQ pin.

Parameter: Power
■ Route GND, 1.5 V, and 0.75 V as planes.
■ Route VCCIO for memories in a single split plane with at least a 20-mil (0.020 inches, or 0.508 mm) gap of separation.
■ Route VTT as islands or 250-mil (6.35-mm) power traces.
■ Route oscillators and PLL power as islands or 100-mil (2.54-mm) power traces.

Parameter: Maximum Trace Length (2)
■ Maximum trace length for all signals from the FPGA to the first DIMM slot is 4.5 inches.
■ Maximum trace length for all signals from DIMM slot to DIMM slot is 0.425 inches.
■ When interfacing with multiple DDR3 SDRAM components, the maximum trace length for address, command, control, and clock from the FPGA to the first component must not be more than 7 inches.
■ Maximum trace length for DQ, DQS, DQS#, and DM from the FPGA to the first component is 5 inches.
■ Even though there are no hard requirements for minimum trace length, you must simulate the traces to ensure signal integrity.

Parameter: General Routing
All specified delay matching requirements include PCB trace delays, different layer propagation velocity variance, and crosstalk. To minimize PCB layer propagation variance, Altera recommends that you route signals from the same net group on the same layer.
■ Use 45° angles (not 90° corners).
■ Disallow critical signals across split planes.
■ Route over appropriate VCC and GND planes.
■ Keep signal routing layers close to GND and power planes.
■ Avoid routing memory signals closer than 0.025 inch (0.635 mm) to memory clocks.

Parameter: Clock Routing
■ Route clocks on inner layers with outer-layer run lengths held to under 500 mils (12.7 mm). The maximum length from the first SDRAM to the last SDRAM must not be more than 5 inches (127 mm), or 0.69 tCK at 1.066 GHz.
■ These signals should maintain the following spacings:
  ■ 10-mil (0.254 mm) spacing for parallel runs less than 0.5 inches, or 2× trace-to-plane distance.
  ■ 15-mil spacing for parallel runs between 0.5 and 1 inches, or 3× trace-to-plane distance.
  ■ 20-mil spacing for parallel runs between 1 and 6 inches, or 4× trace-to-plane distance.
■ Clocks should maintain a length-matching between clock pairs of ±5 ps, or approximately ±25 mils (0.635 mm).
■ Differential clocks should maintain a length-matching between positive (p) and negative (n) signals of ±2 ps, or approximately ±10 mils (0.254 mm), routed in parallel.
■ Space between different pairs should be at least two times the trace width of the differential pair to minimize loss and maximize interconnect density.
■ Route differential clocks differentially (5-mil trace, 10–15 mil space on centers) and equal to the length of signals in the Address/Command Group.
■ To avoid mismatched transmission line to via, Altera recommends that you use a Ground Signal Signal Ground (GSSG) topology for your clock pattern—GND|CLKP|CLKN|GND.

Parameter: Address and Command Routing
■ Route address and command signals in a daisy-chain topology from the first SDRAM to the last SDRAM. The maximum length from the first SDRAM to the last SDRAM must not be more than 5 inches (127 mm), or 0.69 tCK at 1.066 GHz. For different DIMM configurations, check the appropriate JEDEC specifications.
■ UDIMMs are more susceptible to crosstalk and are generally noisier than buffered DIMMs. Therefore, route address and command signals of UDIMMs on a different layer than data signals (DQ) and data mask signals (DM), and with greater spacing. Make sure that each net maintains the same consecutive order.
■ Do not route differential clock (CK) and clock enable (CKE) signals close to address signals.
■ Route all addresses and commands to match the clock signals to within ±25 ps, or approximately ±125 mil (±3.175 mm), to each discrete memory component. Refer to Figure 4–54.

Parameter: External Memory Routing Rules
■ Match in length all DQ, DQS, and DM signals within a given byte-lane group with a maximum deviation of ±10 ps, or approximately ±50 mils (±1.27 mm).
■ Ensure that you route all DQ, DQS, and DM signals within a given byte-lane group on the same layer to avoid layer-to-layer transmission velocity differences, which otherwise increase the skew within the group.
■ For ALTMEMPHY-based interfaces, keep the maximum byte-lane group-to-byte-lane group matched-length deviation to ±150 ps, or ±0.8 inches (±20 mm).
■ Parallelism rules for address, command, and clock signals are as follows:
  ■ 4 mils for parallel runs < 0.1 inch (approximately 1× spacing relative to plane distance)
  ■ 10 mils for parallel runs < 0.5 inch (approximately 2× spacing relative to plane distance)
  ■ 15 mils for parallel runs between 0.5 and 1.0 inches (approximately 3× spacing relative to plane distance)
  ■ 20 mils for parallel runs between 1.0 and 6.0 inches (approximately 4× spacing relative to plane distance)
■ Parallelism rules for all other signals are as follows:
  ■ 5 mils for parallel runs < 0.5 inch (approximately 1× spacing relative to plane distance)
  ■ 10 mils for parallel runs between 0.5 and 1.0 inches (approximately 2× spacing relative to plane distance)
  ■ 15 mils for parallel runs between 1.0 and 6.0 inches (approximately 3× spacing relative to plane distance)
■ Do not use DDR3 deskew to correct for more than 20 ps of DQ group skew. The deskew algorithm only removes the following possible uncertainties:
  ■ Minimum and maximum die IOE skew or delay mismatch
  ■ Minimum and maximum device package skew or mismatch
  ■ Board delay mismatch of 20 ps
  ■ Memory component DQ skew mismatch
  Increasing any of these four parameters runs the risk of the deskew algorithm limiting, failing to correct for the total observed system skew. If the algorithm cannot compensate without limiting the correction, timing analysis shows reduced margins.
■ All the trace-length matching requirements are from the FPGA package ball to the DDR3 package ball, which means you have to take into account trace mismatch on different DIMM raw cards.
■ For UniPHY-based interfaces, the timing between the DQS and clock signals on each device calibrates dynamically when you enable leveling, to meet tDQSS. To make sure the skew is not too large for the leveling circuit's capability, refer to Figure 4–55 and follow these rules:
  ■ The propagation delay of the clock signal must not be shorter than the propagation delay of the DQS signal at every device:
    (CKi – CK) – DQSi > 0; 0 < i < number of components – 1
  ■ The total skew of the CLK and DQS signals between groups must be less than one clock cycle:
    (CKi – CK + DQSi)max – (CKi – CK + DQSi)min < 1 × tCK

Parameter: Termination Rules
■ When using DIMMs, you have no concerns about terminations on memory clocks, addresses, and commands.
■ If you are using components, use an external parallel termination of 40 Ω to VTT at the end of the fly-by daisy-chain topology on the addresses and commands.
■ For memory clocks, use an external parallel termination of 75 Ω differential at the end of the fly-by daisy-chain topology on the memory clocks. The fly-by daisy-chain topology helps reduce stub reflection issues.
■ Keep the length of the traces to the termination to within 0.5 inch (14 mm).
■ Use resistors with tolerances of 1 to 2%.

Parameter: Quartus II Software Settings for Board Layout
■ To perform timing analyses of the board and I/O buffers, use a third-party simulation tool to simulate all timing information such as skew, ISI, and crosstalk, and enter the simulation results into the UniPHY board settings panel.
■ Do not use the advanced I/O timing model (AIOT) or board trace model unless you do not have access to any third-party tool. AIOT provides reasonable accuracy, but tools such as HyperLynx provide better results. In 1,066-MHz operation, it is crucial to properly simulate all signal-integrity-related uncertainties.
■ The Quartus II software performs a timing check of how soon the controller can issue a write command after a read command, which limits the maximum length of the DQ/DQS traces. Turn on the bus turnaround timing option and make sure the margin is positive before board fabrication. Functional failure occurs if the margin is negative.

Notes to Table 4–24:
(1) For point-to-point and DIMM interface designs, refer to the Micron website, www.micron.com.
(2) For better efficiency, the UniPHY IP requires faster turnarounds from read commands to write.
Figure 4–54 shows the DDR3 SDRAM component routing guidelines for address and
command signals.
Figure 4–54. DDR3 SDRAM Component Address and Command Routing Guidelines
(The figure shows the clock and the address and command signals routed from the FPGA in a fly-by chain through four DDR3 SDRAM components to VTT terminations, with a propagation delay of less than 0.69 tCK and a maximum total length of 6 inches. With x, x1, x2, x3 the clock segment lengths and y, y1, y2, y3 the address and command segment lengths between successive components, match the cumulative lengths at each component:
x = y ± 125 mil
x + x1 = y + y1 ± 125 mil
x + x1 + x2 = y + y1 + y2 ± 125 mil)
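One way to apply the Figure 4–54 rule during layout review is to accumulate the clock and address/command segment lengths component by component and compare them at each drop. The following is a minimal sketch, assuming per-segment lengths in mils taken from your own layout; the example numbers are hypothetical.

```python
# Check clock vs. address/command length matching at each DDR3 component
# along the fly-by chain (Figure 4-54: cumulative lengths within +/-125 mil).

def check_fly_by_matching(clock_segments, addr_cmd_segments, budget_mils=125.0):
    """Each list holds the trace length (mils) of one hop:
    FPGA -> component 1, component 1 -> component 2, and so on."""
    ok = True
    clk_total = ac_total = 0.0
    for i, (clk, ac) in enumerate(zip(clock_segments, addr_cmd_segments), start=1):
        clk_total += clk
        ac_total += ac
        delta = clk_total - ac_total
        print(f"component {i}: clock {clk_total:.0f} mils, "
              f"addr/cmd {ac_total:.0f} mils, delta {delta:+.0f} mils")
        ok = ok and abs(delta) <= budget_mils
    return ok

# Hypothetical example: four components on one fly-by chain.
print(check_fly_by_matching([1800, 500, 500, 500], [1750, 520, 480, 540]))
```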
Figure 4–55 shows the delay requirements to align DQS and clock signals.
Figure 4–55. Delaying DQS Signal to Align DQS and Clock
(The figure shows the memory clock (CK) routed in a fly-by chain through the DDR3 components to a VTT termination, and the DQ/DQS groups routed point-to-point from the FPGA to each component, where (CKi – CK) is the clock signal propagation delay to device i and DQSi is the DQ/DQS signal propagation delay to group i.)
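The two leveling rules quoted in Table 4–24 can be checked numerically once you have the CKi and DQSi propagation delays defined above. The following is a rough sketch, assuming delays in ps taken from board simulation; the example numbers are hypothetical.

```python
# Check the UniPHY write-leveling skew rules from Table 4-24 / Figure 4-55.
# ck_delay[i]  = (CKi - CK): clock propagation delay to device i (ps)
# dqs_delay[i] = DQSi: DQ/DQS propagation delay to group i (ps)

def check_leveling(ck_delay, dqs_delay, tck_ps):
    # Rule 1: clock delay must not be shorter than DQS delay at every device.
    rule1 = all(ck - dqs > 0 for ck, dqs in zip(ck_delay, dqs_delay))
    # Rule 2: spread of (CKi - CK + DQSi) across groups must be under one tCK.
    sums = [ck + dqs for ck, dqs in zip(ck_delay, dqs_delay)]
    rule2 = (max(sums) - min(sums)) < tck_ps
    return rule1, rule2

# Hypothetical example at 533 MHz (tCK = 1876 ps).
print(check_leveling(ck_delay=[700, 880, 1060, 1240],
                     dqs_delay=[650, 640, 660, 655], tck_ps=1876))
```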
Layout Guidelines for DDR3 SDRAM Wide Interface (>72 bits)
This section discusses the different ways to lay out a wider DDR3 SDRAM interface to
the FPGA. Choose the topology based on board trace simulation and the timing
budget of your system.
The UniPHY IP supports up to a 144-bit wide DDR3 interface. You can either use
discrete components or DIMMs to implement a wide interface (any interface wider
than 72 bits). Altera recommends using leveling when you implement a wide
interface with DDR3 components.
When you lay out for a wider interface, all rules and constraints discussed in the
previous sections still apply. The DQS, DQ, and DM signals are point-to-point, and all
the same rules discussed in “Design Layout Guidelines” on page 4–60 apply.
The main challenge for the design of the fly-by network topology for the clock,
command, and address signals is to avoid signal integrity issues, and to make sure
you route the DQS, DQ, and DM signals with the chosen topology.
Fly-By Network Design for Clock, Command, and Address Signals
As described in “DDR3 SDRAM Components With Leveling” on page 4–48, the
UniPHY IP requires the flight-time skew between the first DDR3 SDRAM component
and the last DDR3 SDRAM component to be less than 0.69 tCK for memory clocks.
This constraint limits the number of components you can have for each fly-by
network.
If you design with discrete components, you can choose to use one or more fly-by
networks for the clock, command, and address signals.
Figure 4–56 shows an example of a single fly-by network topology.
Figure 4–56. Single Fly-By Network Topology
(The figure shows a single fly-by chain from the FPGA through six DDR3 SDRAM components to a VTT termination, with less than 0.69 tCK of flight-time skew between the first and last component.)
Every DDR3 SDRAM component connected to the signal is a small load that causes
discontinuity and degrades the signal. When using a single fly-by network topology,
to minimize signal distortion, follow these guidelines:
■ Use ×16 devices instead of ×4 or ×8 devices to minimize the number of devices connected to the trace.
■ Keep the stubs as short as possible.
■ Even with added loads from additional components, keep the total trace length short; keep the distance between the FPGA and the first DDR3 SDRAM component less than 5 inches.
■ Simulate clock signals to ensure a decent waveform.
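When budgeting a single fly-by chain, a rough first-pass estimate (before proper simulation) is to add up the per-segment trace delays plus an allowance for the loading of each component. The sketch below assumes roughly 200 ps/inch of unloaded trace delay and a nominal per-component loading penalty; both numbers are placeholders that you should replace with simulated values.

```python
# First-pass flight-time skew estimate along a single DDR3 fly-by chain.
# Assumptions (replace with simulated numbers): ~200 ps/inch trace delay and
# ~30 ps of added delay per component due to its input loading.

PS_PER_INCH = 200.0
LOADING_PS_PER_COMPONENT = 30.0

def first_to_last_skew_ps(segment_lengths_in):
    """segment_lengths_in: trace lengths (inches) between successive components,
    i.e. component 1 -> 2, 2 -> 3, ... (the FPGA-to-first-component segment is
    common to all loads and does not add first-to-last skew)."""
    trace_ps = sum(segment_lengths_in) * PS_PER_INCH
    loading_ps = LOADING_PS_PER_COMPONENT * len(segment_lengths_in)
    return trace_ps + loading_ps

tck_ps = 1876.0  # 533-MHz memory clock
skew = first_to_last_skew_ps([0.6, 0.6, 0.6, 0.6, 0.6])  # six components, five hops
print(f"estimated skew {skew:.0f} ps, within 0.69 tCK: {skew <= 0.69 * tck_ps}")
```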
Figure 4–57 shows an example of a double fly-by network topology. This topology is
not rigid but you can use it as an alternative option. The advantage of using this
topology is that you can have more DDR3 SDRAM components in a system without
violating the 0.69 tCK rule. However, as the signals branch out, the components still
create discontinuity.
Figure 4–57. Double Fly-By Network Topology
(The figure shows the fly-by network from the FPGA splitting into two branches, each driving six DDR3 SDRAM components and ending in a VTT termination, with less than 0.69 tCK of flight-time skew on each branch.)
You need to carry out some simulations to find the location of the split, and the best
impedance for the traces before and after the split.
Figure 4–58 shows a way to minimize the discontinuity effect. In this example, keep TL2 and TL3 matched in length. Keep TL1 longer than TL2 and TL3, so that it is easier to route all the signals during layout.
Figure 4–58. Minimizing Discontinuity Effect
(The figure shows the trace from the FPGA, TL1 with ZQ = 25 Ω, splitting at the splitting point into two branches, TL2 and TL3, each with ZQ = 50 Ω.)
You can also consider using a DIMM on each branch to replace the components. Because the trace impedance on the DIMM card is 40 Ω to 60 Ω, perform a board trace simulation to control the reflection to within the level your system can tolerate.
By using the new features of the DDR3 SDRAM controller with UniPHY and the
Stratix III, Stratix IV, or Stratix V devices, you simplify your design process. Using the
fly-by daisy chain topology increases the complexity of the datapath and controller
design to achieve leveling, but also greatly improves performance and eases board
layout for DDR3 SDRAM.
You can also use DDR3 SDRAM components without leveling in a design if that results in a more optimal solution, or with devices that support the required electrical interface standard but do not support the required read and write leveling functionality.
Document Revision History
Table 4–25 lists the revision history for this document.
Table 4–25. Document Revision History

Date            Version   Changes
November 2011   4.0       Added Arria V and Cyclone V information.
June 2011       3.0       ■ Merged DDR2 and DDR3 chapters to DDR2 and DDR3 SDRAM Interface Termination and Layout Guidelines and updated with leveling information.
                          ■ Added Stratix V information.
December 2010   2.1       Added DDR3 SDRAM Interface Termination, Drive Strength, Loading, and Board Layout Guidelines chapter with Stratix V information.
July 2010       2.0       Updated Arria II GX information.
April 2010      1.0       Initial release.
5. Dual-DIMM DDR2 and DDR3 SDRAM Board Design Guidelines
This chapter describes guidelines for implementing dual unbuffered DIMM
(UDIMM) DDR2 and DDR3 SDRAM interfaces. This chapter discusses the impact on
signal integrity of the data signal with the following conditions in a dual-DIMM
configuration:
■ Populating just one slot versus populating both slots
■ Populating slot 1 versus slot 2 when only one DIMM is used
■ On-die termination (ODT) setting of 75 Ω versus an ODT setting of 150 Ω
For detailed information about a single-DIMM DDR2 SDRAM interface, refer to the DDR2 and DDR3 SDRAM Board Design Guidelines chapter.
DDR2 SDRAM
This section describes guidelines for implementing a dual slot unbuffered DDR2
SDRAM interface, operating at up to 400-MHz and 800-Mbps data rates. Figure 5–1
shows a typical DQS, DQ, and DM signal topology for a dual-DIMM interface
configuration using the ODT feature of the DDR2 SDRAM components.
Figure 5–1. Dual-DIMM DDR2 SDRAM Interface Configuration
(The figure shows the FPGA driver driving a board trace to the DDR2 SDRAM DIMM receiver in slot 1 and, through a further board trace, to slot 2, with a parallel termination RT = 54 Ω to VTT at the FPGA end of the line. (1))
Note to Figure 5–1:
(1) The parallel termination resistor RT = 54 Ω to VTT at the FPGA end of the line is optional for devices that support dynamic on-chip termination (OCT).
The simulations in this section use a Stratix® II device-based board. Because of
limitations of this FPGA device family, simulations are limited to 266 MHz and
533 Mbps so that comparison to actual hardware results can be directly made.
Stratix II High Speed Board
To properly study the dual-DIMM DDR2 SDRAM interface, the simulation and
measurement setup evaluated in the following analysis features a Stratix II FPGA
interfacing with two 267-MHz DDR2 SDRAM UDIMMs. This DDR2 SDRAM
interface is built on the Stratix II High-Speed High-Density Board (Figure 5–2).
f For more information about the Stratix II High-Speed High-Density Board, contact
your Altera® representative.
Figure 5–2. Stratix II High-Speed Board with Dual-DIMM DDR2 SDRAM Interface
The Stratix II High-Speed Board uses a Stratix II 2S90F1508 device. For DQS, DQ, and
DM signals, the board is designed without external parallel termination resistors near
the DDR2 SDRAM DIMMs, to take advantage of the ODT feature of the DDR2
SDRAM components. Stratix II FPGA devices are not equipped with dynamic OCT, so
external parallel termination resistors are used at the FPGA end of the line.
Stratix III and Stratix IV devices, which support dynamic OCT, do not require FPGA
end parallel termination. Hence this discrete parallel termination is optional.
The DDR2 SDRAM DIMM contains a 22-Ω external series termination resistor for each data strobe and data line, so all the measurements and simulations need to account for the effect of these series termination resistors.
To correlate the bench measurements done on the Stratix II High Speed High Density
Board, the simulations are performed using HyperLynx LineSim Software with IBIS
models from Altera and memory vendors. Figure 5–3 is an example of the simulation
setup in HyperLynx used for the simulation.
Figure 5–3. HyperLynx Setup for Simulating the Stratix II High Speed High Density with Dual-DIMM DDR2 SDRAM
Interface
Overview of ODT Control
When there is only a single-DIMM on the board, the ODT control is relatively
straightforward. During write to the memory, the ODT feature of the memory is
turned on; during read from the memory, the ODT feature of the memory is turned
off. However, when there are multiple DIMMs on the board, the ODT control becomes
more complicated.
With a dual-DIMM interface on the system, the controller has different options for
turning the memory ODT on or off during read or write. Table 5–1 lists the DDR2
SDRAM ODT control during write to the memory; Table 5–2 during read from the
memory. These DDR2 SDRAM ODT controls are recommended by Samsung
Electronics. The JEDEC DDR2 specification was updated to include optional support
for RTT(nominal) = 50 Ω.
For more information about the DDR2 SDRAM ODT controls recommended by Samsung, refer to the Samsung DDR2 Application Note: ODT (On Die Termination) Control.
Table 5–1. DDR2 SDRAM ODT Control—Writes

Slot 1 (2)   Slot 2 (2)   Write To   FPGA           Module in Slot 1               Module in Slot 2
                                                    Rank 1        Rank 2           Rank 3        Rank 4
DR           DR           Slot 1     Series 50 Ω    Infinite      Infinite         75 or 50 Ω (1)   Infinite
DR           DR           Slot 2     Series 50 Ω    75 or 50 Ω (1)   Infinite         Infinite      Infinite
SR           SR           Slot 1     Series 50 Ω    Infinite      Unpopulated      75 or 50 Ω (1)   Unpopulated
SR           SR           Slot 2     Series 50 Ω    75 or 50 Ω (1)   Unpopulated      Infinite      Unpopulated
DR           Empty        Slot 1     Series 50 Ω    150 Ω         Infinite         Unpopulated   Unpopulated
Empty        DR           Slot 2     Series 50 Ω    Unpopulated   Unpopulated      150 Ω         Infinite
SR           Empty        Slot 1     Series 50 Ω    150 Ω         Unpopulated      Unpopulated   Unpopulated
Empty        SR           Slot 2     Series 50 Ω    Unpopulated   Unpopulated      150 Ω         Unpopulated

Notes to Table 5–1:
(1) For DDR2 at 400 MHz and 533 Mbps = 75 Ω; for DDR2 at 667 MHz and 800 Mbps = 50 Ω.
(2) SR = single ranked; DR = dual ranked.
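The pattern in Table 5–1 is easier to see written out as a decision: during a write, the FPGA always drives with its series 50-Ω output; when both slots are populated, the ODT of the non-target DIMM is turned on (75 or 50 Ω, depending on data rate); and when only one slot is populated, the ODT of the target DIMM itself is set to 150 Ω. The following is a minimal sketch of that decision (illustrative only, not controller code):

```python
# Recommended DDR2 ODT selection for writes, restating the pattern of Table 5-1.
# Returns (fpga_termination, odt_slot1, odt_slot2). "high_rate" selects the
# 50-ohm value used at 667 MHz / 800 Mbps instead of 75 ohms.

def write_odt(target_slot, slot1_populated, slot2_populated, high_rate=False):
    dual_slot_odt = 50 if high_rate else 75
    odt = {1: "off", 2: "off"}
    if slot1_populated and slot2_populated:
        other = 2 if target_slot == 1 else 1
        odt[other] = f"{dual_slot_odt} ohm"   # terminate at the non-target DIMM
    else:
        odt[target_slot] = "150 ohm"          # single populated DIMM terminates itself
    return "series 50-ohm output", odt[1], odt[2]

print(write_odt(target_slot=1, slot1_populated=True, slot2_populated=True))
print(write_odt(target_slot=1, slot1_populated=True, slot2_populated=False))
```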
Table 5–2. DDR2 SDRAM ODT Control—Reads

Slot 1 (2)   Slot 2 (2)   Read From   FPGA            Module in Slot 1               Module in Slot 2
                                                      Rank 1        Rank 2           Rank 3        Rank 4
DR           DR           Slot 1      Parallel 50 Ω   Infinite      Infinite         75 or 50 Ω (1)   Infinite
DR           DR           Slot 2      Parallel 50 Ω   75 or 50 Ω (1)   Infinite         Infinite      Infinite
SR           SR           Slot 1      Parallel 50 Ω   Infinite      Unpopulated      75 or 50 Ω (1)   Unpopulated
SR           SR           Slot 2      Parallel 50 Ω   75 or 50 Ω (1)   Unpopulated      Infinite      Unpopulated
DR           Empty        Slot 1      Parallel 50 Ω   Infinite      Infinite         Unpopulated   Unpopulated
Empty        DR           Slot 2      Parallel 50 Ω   Unpopulated   Unpopulated      Infinite      Infinite
SR           Empty        Slot 1      Parallel 50 Ω   Infinite      Unpopulated      Unpopulated   Unpopulated
Empty        SR           Slot 2      Parallel 50 Ω   Unpopulated   Unpopulated      Infinite      Unpopulated

Notes to Table 5–2:
(1) For DDR2 at 400 MHz and 533 Mbps = 75 Ω; for DDR2 at 667 MHz and 800 Mbps = 50 Ω.
(2) SR = single ranked; DR = dual ranked.
A 54- external parallel termination resistor is placed on all the data strobes and data
lines near the Stratix II device on the Stratix II High Speed High Density Board.
Although the characteristic impedance of the transmission is designed for 50 , to
account for any process variation, it is advisable to underterminate the termination
seen at the receiver. This is why the termination resistors at the FPGA side use 54-
resistors.
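The choice of 54 Ω rather than exactly 50 Ω can be put in perspective with the usual reflection-coefficient expression, Γ = (RT − Z0)/(RT + Z0). The sketch below simply evaluates it for the nominal line impedance and a few percent of impedance variation; the ±10% spread is an illustrative assumption.

```python
# Reflection coefficient at the FPGA-end parallel termination for a nominal
# 50-ohm line and the 54-ohm resistor used on the board.

def gamma(rt_ohm, z0_ohm=50.0):
    return (rt_ohm - z0_ohm) / (rt_ohm + z0_ohm)

for rt in (50.0, 54.0):
    for z0 in (45.0, 50.0, 55.0):   # assumed +/-10% trace impedance variation
        print(f"RT={rt:>4} ohm, Z0={z0:>4} ohm -> gamma={gamma(rt, z0):+.3f}")
```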
DIMM Configuration
While populating both memory slots is common in a dual-DIMM memory system,
there are some instances when only one slot is populated. For example, some systems
are designed to have a certain amount of memory initially and as applications get
more complex, the system can be easily upgraded to accommodate more memory by
populating the second memory slot without redesigning the system. The following sections discuss a dual-DIMM system with only one slot populated at a time, and a dual-DIMM system with both slots populated. The ODT controls recommended by the memory vendors (listed in Table 5–1), as well as other possible ODT settings, are evaluated for usefulness in an FPGA system.
Dual-DIMM Memory Interface with Slot 1 Populated
This section focuses on a dual-DIMM memory interface where slot 1 is populated and
slot 2 is unpopulated. This section examines the impact on the signal quality due to an
unpopulated DIMM slot and compares it to a single-DIMM memory interface.
FPGA Writing to Memory
In DDR2 SDRAM, the ODT feature has two settings: 150 Ω and 75 Ω. In Table 5–1, the recommended ODT setting for a dual-DIMM configuration with one slot occupied is 150 Ω.
On DDR2 SDRAM devices running at 333 MHz/667 Mbps and above, the ODT feature supports an additional setting of 50 Ω.
Refer to the respective memory data sheet for additional information about the ODT settings in DDR2 SDRAM devices.
Write to Memory Using an ODT Setting of 150 Ω
Figure 5–4 shows a double parallel termination scheme (Class II) using ODT on the memory with a memory-side series resistor when the FPGA is writing to the memory using a 25-Ω OCT drive strength setting on the FPGA.
Figure 5–4. Double Parallel Termination Scheme (Class II) Using ODT on DDR2 SDRAM DIMM with Memory-Side Series
Resistor
(The figure shows the FPGA 25-Ω OCT driver driving a 50-Ω, 3-inch trace to the DDR2 DIMM, through the 22-Ω memory-side series resistor (RS) to the DDR2 component receiver with its 300-Ω/150-Ω ODT legs enabled; RT = 54 Ω to VTT = 0.9 V at the FPGA end, VREF = 0.9 V.)
Figure 5–5 shows a HyperLynx simulation and board measurement of a signal at the memory for a double parallel termination using an ODT setting of 150 Ω with a memory-side series resistor transmission line when the FPGA is writing to the memory with a 25-Ω OCT drive strength setting.
Figure 5–5. HyperLynx Simulation and Board Measurement of the Signal at the Memory in Slot 1 with Slot 2 Unpopulated
Table 5–3 summarizes the comparison between the simulation and board measurements of the signal at the memory of a single-DIMM and a dual-DIMM memory interface with slot 1 populated, using a double parallel termination with an ODT setting of 150 Ω, a memory-side series resistor, and a 25-Ω OCT strength setting on the FPGA.
Table 5–3. Comparison of Signal at the Memory of a Single-DIMM and a Dual-DIMM Interface with Slot 1 Populated (1)

                 Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                 (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Dual-DIMM memory interface with slot 1 populated
  Simulation     1.68        0.97         0.06        NA           2.08           1.96
  Measurements   1.30        0.63         0.22        0.20         1.74           1.82
Single-DIMM
  Simulation     1.62        0.94         0.10        0.05         2.46           2.46
  Measurements   1.34        0.77         0.04        0.13         1.56           1.39

Note to Table 5–3:
(1) The simulation and board measurements of the single-DIMM DDR2 SDRAM interface are based on the Stratix II Memory Board 2. For more information about the single-DIMM DDR2 SDRAM interface, refer to the DDR2 and DDR3 SDRAM Board Design Guidelines chapter.
Table 5–3 indicates that there is not much difference between a single-DIMM memory interface and a dual-DIMM memory interface with slot 1 populated. The over- and undershoot observed in both the simulations and the board measurements can be attributed to the use of the 150-Ω ODT setting on the memory, resulting in under-termination at the receiver. In addition, there is no significant effect from the extra DIMM connector of the unpopulated slot.
When the ODT setting is set to 75 Ω, there is no difference in the eye width and height compared to the ODT setting of 150 Ω. However, there is no overshoot and undershoot when the ODT setting is set to 75 Ω, which is attributed to proper termination resulting in matched impedance seen by the DDR2 SDRAM devices.
For information about results obtained from using an ODT setting of 75 Ω, refer to page 5–24.
Reading from Memory
During read from the memory, the ODT feature is turned off. Thus, there is no difference between using an ODT setting of 150 Ω and 75 Ω. As such, the termination scheme becomes a single parallel termination scheme (Class I) where there is an external resistor on the FPGA side and a series resistor on the memory side, as shown in Figure 5–6.
Figure 5–6. Single Parallel Termination Scheme (Class I) Using External Resistor and Memory-Side Series Resistor
(The figure shows the DDR2 component driver driving through the 22-Ω memory-side series resistor (RS) and a 50-Ω, 3-inch trace to the FPGA receiver, with RT = 54 Ω to VTT = 0.9 V at the FPGA end and VREF = 0.9 V; the 300-Ω/150-Ω ODT legs at the memory are shown but are turned off during reads.)
Figure 5–7 shows the simulation and board measurement of the signal at the FPGA of
a single parallel termination using an external parallel resistor on the FPGA side with
a memory-side series resistor with full drive strength setting on the memory.
Figure 5–7. HyperLynx Simulation and Board Measurement of the Signal at the FPGA When Reading From Slot 1 With
Slot 2 Unpopulated
Table 5–4 summarizes the comparison between the simulation and board measurements of the signal seen at the FPGA of a single-DIMM and a dual-DIMM memory interface with slot 1 populated, using a single parallel termination with an external parallel resistor at the FPGA, a memory-side series resistor, and a full drive strength setting on the memory.
Table 5–4. Comparison of Signal at the FPGA of a Dual-DIMM Memory Interface with Slot 1 Populated (1)

                 Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                 (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Dual-DIMM memory interface with slot 1 populated
  Simulation     1.76        0.80         NA          NA           2.29           2.29
  Measurements   1.08        0.59         NA          NA           1.14           1.59
Single-DIMM
  Simulation     1.80        0.95         NA          NA           2.67           2.46
  Measurements   1.03        0.58         NA          NA           1.10           1.30

Note to Table 5–4:
(1) The simulation and board measurements of the single-DIMM DDR2 SDRAM interface are based on the Stratix II Memory Board 2. For more information about the single-DIMM DDR2 SDRAM interface, refer to the DDR2 and DDR3 SDRAM Board Design Guidelines chapter.
Table 5–4 demonstrates that there is not much difference between a single-DIMM memory interface and a dual-DIMM memory interface with only slot 1 populated. There is no significant effect from the extra DIMM connector of the unpopulated slot.
Dual-DIMM with Slot 2 Populated
This section focuses on a dual-DIMM memory interface where slot 2 is populated and
slot 1 is unpopulated. Specifically, this section discusses the impact of location of the
DIMM on the signal quality.
FPGA Writing to Memory
The previous section focused on the dual-DIMM memory interface where slot 1 is
populated resulting in the memory being located closer to the FPGA. When slot 2 is
populated, the memory is located further away from the FPGA, resulting in
additional trace length that potentially affects the signal quality seen by the memory.
The next section explores if there are any differences between populating slot 1 and
slot 2 of the dual-DIMM memory interface.
Write to Memory Using an ODT Setting of 150 Ω
Figure 5–8 shows the double parallel termination scheme (Class II) using ODT on the memory with the memory-side series resistor when the FPGA is writing to the memory using a 25-Ω OCT drive strength setting on the FPGA.
Figure 5–8. Double Parallel Termination Scheme (Class II) Using ODT on DDR2 SDRAM DIMM with Memory-side Series
Resistor
(The figure shows the FPGA 25-Ω OCT driver driving a 50-Ω, 3-inch trace to the DDR2 DIMM, through the 22-Ω memory-side series resistor (RS) to the DDR2 component receiver with its 300-Ω/150-Ω ODT legs enabled; RT = 54 Ω to VTT = 0.9 V at the FPGA end, VREF = 0.9 V.)
Figure 5–9 shows the simulation and board measurement of the signal at the memory for a double parallel termination using an ODT setting of 150 Ω with a memory-side series resistor transmission line when the FPGA is writing to the memory with a 25-Ω OCT drive strength setting.
Figure 5–9. HyperLynx Simulation and Board Measurement of the Signal at the Memory in Slot 2 With Slot 1 Unpopulated
Table 5–5 summarizes the comparison between the simulation and board measurements of the signal seen at the DDR2 SDRAM DIMM of a dual-DIMM memory interface with either only slot 1 populated or only slot 2 populated, using a double parallel termination with an ODT setting of 150 Ω, a memory-side series resistor, and a 25-Ω OCT strength setting on the FPGA.
Table 5–5. Comparison of Signal at the Memory of a Dual-DIMM Interface with Either Only Slot 1 Populated or Only Slot 2 Populated

                 Eye Width   Eye Height   Overshoot   Undershoot   Rising Edge    Falling Edge
                 (ns)        (V)          (V)         (V)          Rate (V/ns)    Rate (V/ns)
Dual-DIMM memory interface with slot 2 populated
  Simulation     1.69        0.94         0.07        0.02         1.96           2.08
  Measurements   1.28        0.68         0.24        0.20         1.60           1.60
Dual-DIMM memory interface with slot 1 populated
  Simulation     1.68        0.97         0.06        NA           2.08           2.08
  Measurements   1.30        0.63         0.22        0.20         1.74           1.82
Table 5–5 shows that there is not much difference between populating slot 1 or slot 2 in a dual-DIMM memory interface. The over- and undershoot observed in both the simulations and the board measurements can be attributed to the use of the 150-Ω ODT setting on the memory, resulting in under-termination at the receiver.
When the ODT setting is set to 75 Ω, there is no difference in the eye width and height compared to the ODT setting of 150 Ω. However, there is no overshoot and undershoot when the ODT setting is set to 75 Ω, which is attributed to proper termination resulting in matched impedance seen by the DDR2 SDRAM devices.
For detailed results for the ODT setting of 75 Ω, refer to page 5–25.
Reading from Memory
During reads from memory, the ODT feature is turned off, so there is no difference
between using an ODT setting of 150 Ω and 75 Ω. As such, the termination scheme
becomes a single parallel termination scheme (Class I) where there is an external
resistor on the FPGA side and a series resistor on the memory side, as shown in
Figure 5–10.
Figure 5–10. Single Parallel Termination Scheme (Class I) Using External Resistor and Memory-Side Series Resistor
Figure 5–11 shows the simulation and board measurement of the signal at the FPGA
of a single parallel termination using an external parallel resistor on the FPGA side
with a memory-side series resistor and a full drive strength setting on the memory.
Figure 5–11. HyperLynx Simulation and Board Measurement of the Signal at the FPGA When Reading From Slot 2 With
Slot 1 Unpopulated
Table 5–6 summarizes the comparison between the simulation and board
measurements of the signal seen at the FPGA of a dual-DIMM memory interface with
either slot 1 or slot 2 populated, using a single parallel termination with an external
parallel resistor at the FPGA and a memory-side series resistor, with a full drive
strength setting on the memory.
Table 5–6. Comparison of the Signal at the FPGA of a Dual-DIMM Memory Interface with Either Slot 1 or Slot 2 Populated

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

Slot 2 Populated
Simulation    1.80            0.80            NA             NA              3.09                     2.57
Measurements  1.17            0.66            NA             NA              1.25                     1.54

Slot 1 Populated
Simulation    1.80            0.95            NA             NA              2.67                     2.46
Measurements  1.08            0.59            NA             NA              1.14                     1.59
From Table 5–6, you can see that the signal seen at the FPGA is similar whether the
memory DIMM is located in slot 1 or slot 2.
Dual-DIMM Memory Interface with Both Slot 1 and Slot 2 Populated
This section focuses on a dual-DIMM memory interface where both slot 1 and slot 2
are populated. As such, you can write to either the memory in slot 1 or the memory in
slot 2.
FPGA Writing to Memory
In Table 5–1, the recommended ODT setting for a dual-DIMM configuration with both
slots occupied is 75 Ω. Since there is an option for an ODT setting of 150 Ω, this section
explores the usage of the 150-Ω setting and compares the results to that of the
recommended 75 Ω.
Write to Memory in Slot 1 Using an ODT Setting of 75 Ω
Figure 5–12 shows the double parallel termination scheme (Class II) using ODT on the
memory with the memory-side series resistor when the FPGA is writing to the
memory using a 25-Ω OCT drive strength setting on the FPGA. In this scenario, the
FPGA is writing to the memory in slot 1 and the ODT feature of the memory at slot 2
is turned on.
Figure 5–12. Double Parallel Termination Scheme (Class II) Using ODT on DDR2 SDRAM DIMM with a Memory-Side Series
Resistor
Figure 5–13 shows a HyperLynx simulation and board measurement of the signal at
the memory in slot 1 of a double parallel termination using an ODT setting of 75 Ω
with a memory-side series resistor transmission line when the FPGA is writing to the
memory with a 25-Ω OCT drive strength setting.
Figure 5–13. HyperLynx Simulation and Board Measurements of the Signal at the Memory in Slot 1 with Both Slots
Populated
Table 5–7 summarizes the comparison of the signal at the memory of a dual-DIMM
memory interface with one slot and with both slots populated using a double parallel
termination using an ODT setting of 75 Ω with a memory-side series resistor with a
25-Ω OCT strength setting on the FPGA.
Table 5–7. Comparison of the Signal at the Memory of a Dual-DIMM Interface With One Slot and With Both Slots Populated

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

Dual-DIMM Interface with Both Slots Populated Writing to Slot 1
Simulation    1.60            1.18            0.02           NA              1.71                     1.71
Measurements  0.97            0.77            0.05           0.04            1.25                     1.25

Dual-DIMM Interface with Slot 1 Populated
Simulation    1.68            0.97            0.06           NA              2.08                     2.08
Measurements  1.30            0.63            0.22           0.20            1.74                     1.82
Table 5–7 shows that there is not much difference in the eye height between
populating one slot or both slots. However, the additional loading due to the
additional memory DIMM results in a slower edge rate, which results in a smaller eye
width and degrades the setup and hold time of the memory. This reduces the
available data valid window.
When the ODT setting is set to 150 Ω, there is no difference in the eye width and
height compared to the ODT setting of 75 Ω. However, there is some overshoot and
undershoot when the ODT setting is set to 150 Ω, which is attributed to
under-termination resulting in mismatched impedance seen by the DDR2 SDRAM devices.
1 For more information about the results obtained from using an ODT setting of 150 Ω,
refer to page 5–26.
Write to Memory in Slot 2 Using an ODT Setting of 75 Ω
In this scenario, the FPGA is writing to the memory in slot 2 and the ODT feature of
the memory at slot 1 is turned on. Figure 5–14 shows the HyperLynx simulation and
board measurement of the signal at the memory in slot 2 of a double parallel
termination using an ODT setting of 75 Ω with a memory-side series resistor
transmission line when the FPGA is writing to the memory with a 25-Ω OCT drive
strength setting.
Figure 5–14. HyperLynx Simulation and Board Measurements of the Signal at the Memory in Slot 2 With Both Slots
Populated
Table 5–8 summarizes the comparison of the signal at the memory of a dual-DIMM
memory interface with both slots populated using a double parallel termination using an
ODT setting of 75 Ω with a memory-side series resistor with a 25-Ω OCT strength
setting on the FPGA.
Table 5–8. Comparison of the Signal at the Memory of a Dual-DIMM Interface With Both Slots Populated

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

Dual-DIMM Interface with Both Slots Populated Writing to Slot 2
Simulation    1.60            1.16            0.10           0.08            1.68                     1.60
Measurements  1.10            0.85            0.16           0.19            1.11                     1.25

Dual-DIMM Interface with Both Slots Populated Writing to Slot 1
Simulation    1.60            1.18            0.02           NA              1.71                     1.71
Measurements  1.30            0.77            0.05           0.04            1.25                     1.25
From Table 5–8, you can see that both simulations and board measurements
demonstrate that the eye width is larger when writing to slot 1, which is due to the
better edge rate seen when writing to slot 1. The improvement in the eye when writing to
slot 1 can be attributed to the location of the termination. When you are writing to slot
1, the ODT feature of slot 2 is turned on, resulting in a fly-by topology. When you are
writing to slot 2, the ODT feature of slot 1 is turned on, resulting in a non-fly-by
topology.
When the ODT setting is set to 150 Ω, there is no difference in the eye width and
height compared to the ODT setting of 75 Ω. However, there is some overshoot and
undershoot when the ODT setting is set to 150 Ω, which is attributed to
under-termination resulting in mismatched impedance seen by the DDR2 SDRAM devices.
For more information about the results obtained from using an ODT setting of 150 Ω,
refer to “Write to Memory in Slot 2 Using an ODT Setting of 150 Ω With Both Slots
Populated” on page 5–27.
Reading From Memory
In Table 5–2, the recommended ODT setting for a dual-DIMM configuration with both
slots occupied is to turn on the ODT feature using a setting of 75 Ω on the slot that is
not read from. As there is an option for an ODT setting of 150 Ω, this section explores
the usage of the 150-Ω setting and compares the results to that of the recommended
75 Ω.
Read From Memory in Slot 1 Using an ODT Setting of 75 Ω on Slot 2
Figure 5–15 shows the double parallel termination scheme (Class II) using ODT on the
memory with the memory-side series resistor when the FPGA is reading from the
memory using a full drive strength setting on the memory. In this scenario, the FPGA
is reading from the memory in slot 1 and the ODT feature of the memory at slot 2 is
turned on.
Figure 5–15. Double Parallel Termination Scheme (Class II) Using External Resistor and Memory-Side Series Resistor
and ODT Feature Turned On
Figure 5–16 shows the simulation and board measurement of the signal at the FPGA
when the FPGA is reading from the memory in slot 1 using a full drive strength
setting on the memory.
Figure 5–16. HyperLynx Simulation and Board Measurement of the Signal at the FPGA When Reading From Slot 1 With
Both Slots Populated (1)
Note to Figure 5–16:
(1) The vertical scale used for the simulation and measurement is set to 200 mV per division.
Table 5–9 summarizes the comparison between the simulation and board
measurements of the signal seen at the FPGA of a dual-DIMM memory interface with
both slots populated and a dual-DIMM memory interface with only slot 1 populated.
Table 5–9. Comparison of the Signal at the FPGA of a Dual-DIMM Interface Reading From Slot 1 With One Slot and With Both Slots Populated

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

Dual-DIMM with Both Slots Populated with an ODT Setting of 75 Ω on Slot 2
Simulation    1.74            0.87            NA             NA              1.91                     1.88
Measurements  0.86            0.58            NA             NA              1.11                     1.09

Dual-DIMM with One Slot Populated in Slot 1 without ODT Setting
Simulation    1.76            0.80            NA             NA              2.29                     2.29
Measurements  1.08            0.59            NA             NA              1.14                     1.59
Table 5–9 shows that when both slots are populated, the additional loading due to the
additional memory DIMM results in a slower edge rate, which results in a
degradation in the eye width.
For more information about the results obtained from using an ODT setting of 150 Ω,
refer to “Read from Memory in Slot 1 Using an ODT Setting of 150 Ω on Slot 2 with
Both Slots Populated” on page 5–28.
Read From Memory in Slot 2 Using an ODT Setting of 75 Ω on Slot 1
In this scenario, the FPGA is reading from the memory in slot 2 and the ODT feature
of the memory at slot 1 is turned on.
Figure 5–17. Double Parallel Termination Scheme (Class II) Using External Resistor and a Memory-Side Series Resistor
and ODT Feature Turned On
Figure 5–18 shows the HyperLynx simulation and board measurement of the signal at
the FPGA of a double parallel termination using an external parallel resistor on the
FPGA side with a memory-side series resistor and an ODT setting of 75 Ω with a full
drive strength setting on the memory.
Figure 5–18. HyperLynx Simulation and Board Measurements of the Signal at the FPGA When Reading From Slot 2 With
Both Slots Populated (1)
Note to Figure 5–18:
(1) The vertical scale used for the simulation and measurement is set to 200 mV per division.
Table 5–10 summarizes the comparison between the simulation and board
measurements of the signal seen at the FPGA of a dual-DIMM memory interface with
both slots populated and a dual-DIMM memory interface with only slot 2 populated.
Table 5–10. Comparison of the Signal at the FPGA of a Dual-DIMM Interface Reading From Slot 2 With One Slot and With Both Slots Populated

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

Dual-DIMM with Both Slots Populated with an ODT Setting of 75 Ω on Slot 1
Simulation    1.70            0.81            NA             NA              1.72                     1.99
Measurements  0.87            0.59            NA             NA              1.09                     1.14

Dual-DIMM with One Slot Populated in Slot 2 without an ODT Setting
Simulation    1.80            0.80            NA             NA              3.09                     2.57
Measurements  1.17            0.66            NA             NA              1.25                     1.54
Table 5–10 shows that when only one slot is populated in a dual-DIMM memory
interface, the eye width is larger as compared to a dual-DIMM memory interface with
both slots populated. This can be attributed to the loading from the DIMM located in
slot 1.
When the ODT setting is set to 150 Ω, there is no difference in the signal quality
compared to the ODT setting of 75 Ω.
For more information about the results obtained from using an ODT setting of 150 Ω,
refer to “Read From Memory in Slot 2 Using an ODT Setting of 150 Ω on Slot 1 With
Both Slots Populated” on page 5–29.
Dual-DIMM DDR2 Clock, Address, and Command Termination and Topology
The address and command signals on a DDR2 SDRAM interface are unidirectional
signals that the FPGA memory controller drives to the DIMM slots. These signals are
always Class-I terminated at the memory end of the line (Figure 5–19). Always place
DDR2 SDRAM address and command Class-I termination after the last DIMM. The
interface can have one or two DIMMs, but never more than two DIMMs total.
Figure 5–19. Multi DIMM DDR2 Address and Command Termination Topology
(Figure 5–19 shows the FPGA driver connected through board trace A to DIMM slot 1, through board trace B to slot 2, and through board trace C to the parallel termination RP = 47 Ω to VTT.)
In Figure 5–19, observe the following points (a short checking sketch follows this list):
■ Board trace A = 1.9 to 4.5 inches (48 to 115 mm)
■ Board trace B = 0.425 inches (10.795 mm)
■ Board trace C = 0.2 to 0.55 inches (5 to 13 mm)
■ Total of board trace A + B + C = 2.5 to 5 inches (63 to 127 mm)
■ RP = 36 to 56 Ω
■ Length match all address and command signals to ±250 mils (±5 mm) or ±50 ps of memory clock length at the DIMM.
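The following minimal Python sketch is purely illustrative (the function name and structure are not from this handbook); it shows one way to check a candidate address and command routing against the bounds listed above.

def check_addr_cmd_routing(trace_a_in, trace_b_in, trace_c_in, rp_ohms):
    # Limits taken from the Figure 5-19 bullets above; everything else is assumed.
    issues = []
    if not 1.9 <= trace_a_in <= 4.5:
        issues.append("board trace A outside 1.9 to 4.5 inches")
    if abs(trace_b_in - 0.425) > 0.001:
        issues.append("board trace B should be 0.425 inches")
    if not 0.2 <= trace_c_in <= 0.55:
        issues.append("board trace C outside 0.2 to 0.55 inches")
    if not 2.5 <= trace_a_in + trace_b_in + trace_c_in <= 5.0:
        issues.append("total A + B + C outside 2.5 to 5 inches")
    if not 36 <= rp_ohms <= 56:
        issues.append("RP outside 36 to 56 ohms")
    return issues

print(check_addr_cmd_routing(3.0, 0.425, 0.4, 47))  # [] means all bounds are met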
You may place a compensation capacitor directly before the first DIMM slot to
improve signal quality on the address and command signal group. If you fit a
capacitor, Altera recommends a value of 24 pF.
f For more information, refer to Micron TN47-01.
Address and Command Signals
The address and command group of signals (bank address, address, RAS#, CAS#, and
WE#) operates at a different toggle rate depending on whether you implement a full-rate or half-rate memory controller.
In full-rate designs, the address and command group of signals are 1T signals, which
means that the signals can change every memory clock cycle. Address and command
signals are also single data rate (SDR). Hence, in a full-rate PHY design, the address
and command signals operate at a maximum frequency of 0.5 × the data rate. For
example, in a 266-MHz full-rate design, the maximum address and command
frequency is 133 MHz.
In half-rate designs, the address and command group of signals are 2T signals, which
means that the signals change only every two memory clock cycles. As the signals are
also SDR, in a half-rate PHY design, the address and command signals operate at a
maximum frequency of 0.25 × the data rate. For example, in a 400-MHz half-rate
design, the maximum address and command frequency is 100 MHz.
Control Group Signals
The control group of signals (chip select CS#, clock enable CKE, and ODT) are always
1T regardless of whether you implement a full-rate or half-rate design. As the signals
are also SDR, the control group signals operate at a maximum frequency of 0.5 × the
data rate. For example, in a 400-MHz design, the maximum control group frequency
is 200 MHz.
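As a quick cross-check of the toggle-rate arithmetic in the last two subsections, the short Python sketch below reproduces the examples given in the text; the helper name and structure are illustrative only and are not part of any Altera tool.

def max_sdr_signal_freq_mhz(data_rate_mhz, t_cycles=1):
    # 1T SDR signals can toggle at up to 0.5 x the data rate; 2T signals at 0.25 x.
    return data_rate_mhz / (2.0 * t_cycles)

print(max_sdr_signal_freq_mhz(266))              # full-rate address/command: 133.0 MHz
print(max_sdr_signal_freq_mhz(400, t_cycles=2))  # half-rate address/command: 100.0 MHz
print(max_sdr_signal_freq_mhz(400))              # control group (always 1T): 200.0 MHz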
Clock Group Signals
Depending on the specific form factor, DDR2 SDRAM DIMMs have two or three
differential clock pairs, to ensure that the loading on the clock signals is not excessive.
The clock signals are always terminated on the DIMMs and hence no termination is
required on your PCB. Additionally, each DIMM slot is required to have its own
dedicated set of clock signals. Hence clock signals are always point-to-point from the
FPGA PHY to each individual DIMM slot. Individual memory clock signals should
never be shared between two DIMM slots.
A typical two slot DDR2 DIMM design therefore has six differential memory clock
pairs—three to the first DIMM and three to the second DIMM. All six memory clock
pairs must be delay matched to each other to ±25 mils (±0.635 mm) and ±10 mils
(±0.254 mm) for each CLK to CLK# signal.
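A purely illustrative way to express these matching targets as a check on extracted trace lengths is shown below; it interprets the ±25-mil requirement as each pair sitting within 25 mils of a common target length, which is one reasonable reading of the guideline. The names are hypothetical.

def ddr2_clock_pairs_matched(pair_lengths_mils, intra_pair_skews_mils, target_mils):
    # Pair-to-pair matching: each of the six clock pairs within +/-25 mils of the target.
    pairs_ok = all(abs(length - target_mils) <= 25 for length in pair_lengths_mils)
    # Intra-pair matching: CLK to CLK# mismatch within +/-10 mils for each pair.
    intra_ok = all(abs(skew) <= 10 for skew in intra_pair_skews_mils)
    return pairs_ok and intra_ok

# Example: six pairs targeted at 3000 mils.
print(ddr2_clock_pairs_matched([2990, 3005, 3010, 2998, 3002, 3015],
                               [2, -4, 6, 1, -3, 5], target_mils=3000))  # True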
You may place a compensation capacitor between each clock pair directly before the
DIMM connector to improve the clock slew rates. As FPGA devices have fully
programmable drive strength and slew rate options, this capacitor is usually not
required for FPGA designs. However, Altera advises that you simulate your specific
implementation to ascertain whether this capacitor is required. If fitted, the best value
is typically 5 pF.
DDR3 SDRAM
This section details the system implementation of a dual slot unbuffered DDR3
SDRAM interface, operating at up to 400 MHz and 800 Mbps data rates. Figure 5–20
shows a typical DQS, DQ, and DM, and address and command signal topology for a
dual-DIMM interface configuration, using the ODT feature of the DDR3 SDRAM
components combined with the dynamic OCT features available in Stratix III and
Stratix IV devices.
Figure 5–20. Multi DIMM DDR3 DQS, DQ, and DM, and Address and Command Termination Topology
In Figure 5–20, observe the following points:
■ Board trace A = 1.9 to 4.5 inches (48 to 115 mm)
■ Board trace B = 0.425 inches (10.795 mm)
■ This topology to both DIMMs is accurate for DQS, DQ, and DM, and address and command signals
■ This topology is not correct for CLK and CLK# and control group signals (CS#, CKE, and ODT), which are always point-to-point single rank only.
Comparison of DDR3 and DDR2 DQ and DQS ODT Features and Topology
DDR3 and DDR2 SDRAM systems are quite similar. The physical topology of the data
group of signals may be considered nearly identical. The FPGA end (driver) I/O
standard changes from SSTL18 for DDR2 to SSTL15 for DDR3, but all other OCT
settings are identical. DDR3 offers enhanced ODT options for termination and
drive-strength settings at the memory end of the line.
f For more information, refer to the DDR3 SDRAM ODT matrix for writes and the
DDR3 SDRAM ODT matrix for reads tables in the DDR2 and DDR3 SDRAM Board
Design Guidelines chapter.
Dual-DIMM DDR3 Clock, Address, and Command Termination and Topology
One significant difference between DDR3 and DDR2 DIMM-based interfaces is the
address, command, and clock signals. DDR3 uses a daisy-chained architecture when
using JEDEC standard modules. The address, command, and clock signals are
routed on each module in a daisy chain and feature a fly-by termination on the
module. Impedance matching is required to make the dual-DIMM topology work
effectively; target 40 to 50 Ω traces on the main board.
Address and Command Signals
Two UDIMMs result in twice the effective load on the address and command signals,
which reduces the slew rate and makes it more difficult to meet setup and hold timing
(tIS and tIH). However, address and command signals operate at half the interface rate
and are SDR. Hence a 400-Mbps data rate equates to an address and command
fundamental frequency of 100 MHz.
Control Group Signals
The control group signals (chip select CS#, clock enable CKE, and ODT) are only ever
single rank. A dual-rank capable DDR3 DIMM slot has two copies of each signal, and
a dual-DIMM slot interface has four copies of each signal. Hence the signal quality of
these signals is identical to the single-rank case. The control group of signals are always
1T regardless of whether you implement a full-rate or half-rate design. As the signals
are also SDR, the control group signals operate at a maximum frequency of 0.5 × the
data rate. For example, in a 400-MHz design, the maximum control group frequency is
200 MHz.
Clock Group Signals
Like the control group signals, the clock signals in DDR3 SDRAM are only ever single
rank loaded. A dual-rank capable DDR3 DIMM slot has two copies of the signal, and
a dual-slot interface has four copies of the mem_clk and mem_clk_n signals.
f For more information about a DDR3 two-DIMM system design, refer to Micron TN41-08: DDR3 Design Guide for Two-DIMM Systems.
1 The Altera DDR3 ALTMEMPHY megafunction does not support the 1T address and
command topology referred to in this Micron Technical Note; only 2T
implementations are supported.
Write to Memory in Slot 1 Using an ODT Setting of 75 Ω With One Slot Populated
Figure 5–21 shows the simulation and board measurement of the signal at the
memory when the FPGA is writing to the memory with an ODT setting of 75 Ω and
using a 25-Ω OCT drive strength setting on the FPGA.
Figure 5–21. HyperLynx Simulation and Board Measurement of the Signal at the Memory in Slot 1 With Slot 2
Unpopulated
Table 5–11 summarizes the comparison between the simulation and board
measurements of the signal seen at the DDR2 SDRAM of a dual-DIMM memory
interface with only slot 1 populated, using different ODT settings.
Table 5–11. Comparison of the Signal at the Memory of a Dual-DIMM Interface With Only Slot 1 Populated and a Different ODT Setting

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 75 Ω
Simulation    1.68            0.91            NA             NA              1.88                     1.88
Measurements  1.28            0.57            NA             NA              1.54                     1.38

ODT Setting of 150 Ω
Simulation    1.68            0.97            0.06           NA              2.67                     2.13
Measurements  1.30            0.63            0.22           0.20            1.74                     1.82
Write to Memory in Slot 2 Using an ODT Setting of 75 Ω With One Slot Populated
Figure 5–22 shows the simulation and measurement results of the signal seen at the
memory when the FPGA is writing to the memory with an ODT setting of 75 Ω and
using a 25-Ω OCT drive strength setting on the FPGA.
Figure 5–22. HyperLynx Simulation and Board Measurement of the Signal at the Memory in Slot 2 with Slot 1
Unpopulated
Table 5–12 summarizes the comparison of the signal at the memory of a dual-DIMM
memory interface with only slot 2 populated and a different ODT setting, using a
double parallel termination with a memory-side series resistor and a 25-Ω OCT
strength setting on the FPGA.
Table 5–12. Comparison of Signal at the Memory of a Dual-DIMM Interface With Only Slot 2 Populated and a Different ODT Setting

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 75 Ω
Simulation    1.68            0.89            NA             NA              1.82                     1.93
Measurements  1.29            0.59            NA             NA              1.60                     1.29

ODT Setting of 150 Ω
Simulation    1.69            0.94            0.07           0.02            1.88                     2.29
Measurements  1.28            0.68            0.24           0.20            1.60                     1.60
Write to Memory in Slot 1 Using an ODT Setting of 150 Ω With Both Slots Populated
Figure 5–23 shows the HyperLynx simulation and board measurement of the signal at
the memory in slot 1 of a double parallel termination using an ODT setting of 150 Ω
on slot 2 with a memory-side series resistor transmission line when the FPGA is
writing to the memory with a 25-Ω OCT drive strength setting.
Figure 5–23. HyperLynx Simulation and Board Measurement of the Signal at the Memory in Slot 1 With Both Slots
Populated
Table 5–13 summarizes the comparison between the simulation and board
measurements of the signal seen at the memory in slot 1 of a dual-DIMM memory
interface with both slots populated using a double parallel termination using a
different ODT setting on slot 2 with a memory-side series resistor with a 25-Ω OCT
strength setting on the FPGA.
Table 5–13. Comparison of Signal at the Memory of a Dual-DIMM Interface with Both Slots Populated and a Different ODT Setting on Slot 2

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 150 Ω
Simulation    1.60            1.18            0.02           NA              1.71                     1.71
Measurements  0.89            0.78            0.13           0.17            1.19                     1.32

ODT Setting of 75 Ω
Simulation    1.60            1.18            0.02           NA              1.71                     1.71
Measurements  0.97            0.77            0.05           0.04            1.25                     1.25
Write to Memory in Slot 2 Using an ODT Setting of 150 Ω With Both Slots Populated
Figure 5–24 shows the HyperLynx simulation and board measurement of the signal at
the memory in slot 2 of a double parallel termination using an ODT setting of 150 Ω
on slot 1 with a memory-side series resistor transmission line when the FPGA is
writing to the memory with a 25-Ω OCT drive strength setting.
Figure 5–24. HyperLynx Simulation and Board Measurements of the Signal at the Memory in Slot 2 with Both Slots
Populated
Table 5–14 summarizes the comparison between the simulation and board
measurements of the signal seen at the memory of a dual-DIMM memory interface
with both slots populated using a double parallel termination using a different ODT
setting on slot 1 with a memory-side series resistor with a 25-Ω OCT strength setting
on the FPGA.
Table 5–14. Comparison of the Signal at the Memory of a Dual-DIMM Interface With Both Slots Populated and a Different ODT Setting on Slot 1

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 150 Ω
Simulation    1.45            1.11            0.19           0.17            1.43                     2.21
Measurements  0.71            0.81            0.12           0.20            0.93                     1.00

ODT Setting of 75 Ω
Simulation    1.60            1.16            0.10           0.08            1.68                     1.60
Measurements  1.10            0.85            0.16           0.19            1.11                     1.25
Read from Memory in Slot 1 Using an ODT Setting of 150 Ω on Slot 2 with Both Slots Populated
Figure 5–25 shows the HyperLynx simulation and board measurement of the signal at
the FPGA of a double parallel termination using an external parallel resistor on the
FPGA side with a memory-side series resistor and an ODT setting of 150 Ω with a full
drive strength setting on the memory.
Figure 5–25. HyperLynx Simulation and Board Measurement of the Signal at the FPGA When Reading From Slot 1 With
Both Slots Populated (1)
Note to Figure 5–25:
(1) The vertical scale used for the simulation and measurement is set to 200 mV per division.
Table 5–15 summarizes the comparison between the simulation and board
measurements of the signal seen at the FPGA of a dual-DIMM memory interface with
both slots populated using a different ODT setting on Slot 2.
Table 5–15. Comparison of Signal at the FPGA of a Dual-DIMM Interface With Both Slots Populated and a Different ODT Setting on Slot 2

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 150 Ω
Simulation    1.68            0.77            NA             NA              1.88                     1.88
Measurements  0.76            0.55            NA             NA              1.11                     1.14

ODT Setting of 75 Ω
Simulation    1.74            0.87            NA             NA              1.91                     1.88
Measurements  0.86            0.59            NA             NA              1.11                     1.09
Read From Memory in Slot 2 Using an ODT Setting of 150 Ω on Slot 1 With Both Slots Populated
Figure 5–26 shows the HyperLynx simulation and board measurement of the signal seen
at the FPGA of a double parallel termination using an external parallel resistor on the
FPGA side with a memory-side series resistor and an ODT setting of 150 Ω with a full
drive strength setting on the memory.
Figure 5–26. HyperLynx Simulation and Board Measurement of the Signal at the FPGA When Reading From Slot 2 With Both
Slots Populated (1)
Note to Figure 5–26:
(1) The vertical scale used for the simulation and measurement is set to 200 mV per division.
Table 5–16 summarizes the comparison between the simulation and board
measurements of the signal seen at the FPGA of a dual-DIMM memory interface with
both slots populated using a different ODT setting on Slot 1.
Table 5–16. Comparison of Signal at the FPGA of a Dual-DIMM Interface With Both Slots Populated and a Different ODT Setting on Slot 1

Type          Eye Width (ns)  Eye Height (V)  Overshoot (V)  Undershoot (V)  Rising Edge Rate (V/ns)  Falling Edge Rate (V/ns)

ODT Setting of 150 Ω
Simulation    1.70            0.74            NA             NA              1.91                     1.64
Measurements  0.74            0.64            NA             NA              1.14                     1.14

ODT Setting of 75 Ω
Simulation    1.70            0.81            NA             NA              1.72                     1.99
Measurements  0.87            0.59            NA             NA              1.09                     1.14
FPGA OCT Features
Many FPGA devices offer OCT. Depending on the chosen device family, series
(output), parallel (input) or dynamic (bidirectional) OCT may be supported.
f For more information specific to your device family, refer to the respective I/O
features chapter in the relevant device handbook.
Use series OCT in place of the near-end series terminator typically used in both
Class I and Class II termination schemes that both DDR2 and DDR3 type interfaces use.
Use parallel OCT in place of the far-end parallel termination typically used in Class I
termination schemes on unidirectional, input-only interfaces, for example QDR II-type
interfaces where the FPGA is at the far end.
Use dynamic OCT in place of both the series and parallel termination at the FPGA end
of the line. Typically use dynamic OCT for DQ and DQS signals in both DDR2 and
DDR3 type interfaces. As the parallel termination is dynamically disabled during
writes, the FPGA driver only ever drives into a Class I transmission line. When
combined with dynamic ODT at the memory, a truly dynamic Class I termination
scheme exists where both reads and writes are always fully Class I terminated in each
direction. Hence, you can use a fully dynamic bidirectional Class I termination
scheme instead of a static discretely terminated Class II topology, which saves power,
printed circuit board (PCB) real estate, and component cost.
Arria V, Cyclone V, Stratix III, Stratix IV, and Stratix V Devices
Arria® V, Cyclone® V, Stratix III, Stratix IV, and Stratix V devices feature full dynamic
OCT termination capability. Altera advises that you use this feature combined with the
SDRAM ODT to simplify PCB layout and save power.
Arria II GX Devices
Arria II GX devices do not support dynamic OCT. Altera recommends that you use
series OCT with SDRAM ODT. Use parallel discrete termination at the FPGA end of
the line when necessary.
f For more information, refer to the DDR2 and DDR3 SDRAM Board Design Guidelines
chapter.
Document Revision History
Table 5–17 lists the revision history for this document.
Table 5–17. Document Revision History

Date            Version   Changes
November 2011   4.0       Added Arria V and Cyclone V information.
June 2011       3.0       Added Stratix V information.
December 2010   2.1       Maintenance update.
July 2010       2.0       Updated Arria II GX information.
April 2010      1.0       Initial release.
6. RLDRAM II Board Design Guidelines
This chapter provides signal integrity and layout guidelines to help you successfully
implement an RLDRAM II interface in your system.
The RLDRAM II Controller with UniPHY intellectual property (IP) enables you to
implement Common I/O (CIO) RLDRAM II interfaces with Arria® V, Stratix® III,
Stratix IV, and Stratix V devices. You can implement Separate I/O (SIO) RLDRAM II
interfaces with the ALTDQ_DQS or ALTDQ_DQS2 megafunctions.
This chapter focuses on the following key factors that affect signal integrity:
■ I/O standards
■ RLDRAM II configurations
■ Signal terminations
■ Printed circuit board (PCB) layout guidelines
I/O Standards
RLDRAM II interface signals use one of the following JEDEC I/O signalling
standards:
■ HSTL-15, which provides the advantages of lower power and lower emissions.
■ HSTL-18, which provides increased noise immunity with slightly greater output voltage swings.
f To select the most appropriate standard for your interface, refer to the Device Datasheet
for Arria II Devices chapter in the Arria II Device Handbook, the Device Datasheet for
Arria V Devices chapter in the Arria V Device Handbook, the Stratix III Device Datasheet:
DC and Switching Characteristics chapter in the Stratix III Device Handbook, the DC and
Switching Characteristics for Stratix IV Devices chapter in the Stratix IV Device Handbook,
or the DC and Switching Characteristics for Stratix V Devices chapter in the Stratix V
Device Handbook.
The RLDRAM II Controller with UniPHY IP defaults to HSTL 1.8 V Class I outputs
and HSTL 1.8 V inputs.
RLDRAM II Configurations
The RLDRAM II Controller with UniPHY IP supports interfaces for CIO RLDRAM II
with a single device, and with two devices in a width expansion configuration up to a
maximum width of 72 bits. This chapter focuses on the layout and guidelines for CIO
RLDRAM II interfaces. However, the termination and layout principles for SIO
RLDRAM II interfaces are similar to CIO RLDRAM II, except that SIO RLDRAM II
interfaces have unidirectional data buses.
Figure 6–1 shows the main signal connections between the FPGA and a single CIO
RLDRAM II component.
Figure 6–1. Configuration with a Single CIO RLDRAM II Component
Notes to Figure 6–1:
(1) Use external differential termination on DK/DK# and CK/CK#.
(2) Use FPGA parallel on-chip termination (OCT) for terminating QK/QK# and DQ on reads.
(3) Use RLDRAM II component on-die termination (ODT) for terminating DQ and DM on writes.
(4) Use external discrete termination with fly-by placement to avoid stubs.
(5) Use external discrete termination for this signal, as shown for REF.
(6) Use external discrete termination, as shown for REF, but you may require a pull-up resistor to VDD as an alternative option. Refer to the RLDRAM II
device data sheet for more information about RLDRAM II power-up sequencing.
Figure 6–2 shows the main signal connections between the FPGA and two CIO
RLDRAM II components in a width expansion configuration.
Figure 6–2. Configuration with Two CIO RLDRAM II Components in a Width Expansion Configuration
Notes to Figure 6–2:
(1) Use external differential termination on DK/DK#.
(2) Use FPGA parallel OCT for terminating QK/QK# and DQ on reads.
(3) Use RLDRAM II component ODT for terminating DQ and DM on writes.
(4) Use external dual 200-Ω differential termination.
(5) Use external discrete termination at the trace split of the balanced T or Y topology.
(6) Use external discrete termination at the trace split of the balanced T or Y topology, but you may require a pull-up resistor to VDD as an alternative
option. Refer to the RLDRAM II device data sheet for more information about RLDRAM II power-up sequencing.
Signal Terminations
Stratix III, Stratix IV, and Stratix V devices offer OCT technology.
Table 6–1 lists the extent of OCT support for each device.
Table 6–1. On-Chip Termination Schemes (termination values in Ω)

                                                             HSTL-15 and HSTL-18
Termination Scheme                                           Arria II GZ, Stratix III, and Stratix IV   Arria V and Stratix V
                                                             (Row/Column I/O)                           (Row/Column I/O)
On-Chip Series Termination without Calibration   Class I    50                                         50
On-Chip Series Termination with Calibration      Class I    50                                         50 (1)
On-Chip Parallel Termination with Calibration    Class I    50                                         50 (1)

Note to Table 6–1:
(1) Although 50 Ω is the recommended option, Stratix V devices offer a wider range of calibrated termination impedances.
On-chip series (RS) termination supports output buffers, and bidirectional buffers
only when they are driving output signals. On-chip parallel (RT) termination supports
input buffers, and bidirectional buffers only when they are receiving input signals.
RLDRAM II CIO interfaces have bidirectional data paths. The UniPHY IP uses dynamic
OCT on the datapath, which switches between series OCT for memory writes and
parallel OCT for memory reads.
For Arria II GZ, Stratix III, and Stratix IV devices, the HSTL Class I I/O calibrated
terminations are calibrated against 50-Ω 1% resistors connected to the RUP and RDN
pins in an I/O bank with the same VCCIO as the RLDRAM II interface. For Arria V and
Stratix V devices, the HSTL Class I I/O calibrated terminations are calibrated against
100-Ω 1% resistors connected to the RZQ pins in an I/O bank with the same VCCIO as
the RLDRAM II interface.
The calibration occurs at the end of the device configuration.
RLDRAM II memory components have a ZQ pin which connects through a resistor
RQ to ground. Typically the RLDRAM II output signal impedance is 0.2 × RQ. Refer
to the RLDRAM II device data sheet for more information.
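As a simple numeric illustration of that 0.2 × RQ relationship (the 250-Ω RQ value below is only an example choice, not a recommendation from this handbook), a minimal Python sketch:

def rldram2_output_impedance_ohms(rq_ohms):
    # Output signal impedance is typically 0.2 x RQ, per the paragraph above.
    return 0.2 * rq_ohms

print(rldram2_output_impedance_ohms(250))  # 50.0 ohms, which would match a 50-ohm trace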
f For information about OCT, refer to the I/O Features in Arria II Devices chapter in the
Arria II Device Handbook, I/O Features in Arria V Devices chapter in the Arria V Device
Handbook, Stratix III Device I/O Features chapter in the Stratix III Device Handbook, the
I/O Features in Stratix IV Devices chapter in the Stratix IV Device Handbook, or the I/O
Features in Stratix V Devices chapter in the Stratix V Device Handbook.
The following section shows HyperLynx simulation eye diagrams to demonstrate
signal termination options. Altera strongly recommends signal terminations to
optimize signal integrity and timing margins, and to minimize unwanted emissions,
reflections, and crosstalk.
All of the eye diagrams shown in this section are for a 50-Ω trace with a propagation
delay of 600 ps, which is approximately a 3.3-inch trace on a standard FR4 PCB. The
signal I/O standard is HSTL-18.
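The 600-ps, 3.3-inch figure implies roughly 180 ps of delay per inch of FR4 trace. The small helper below is a first-order approximation derived from those values, not a substitute for simulating your own stackup.

PS_PER_INCH_FR4 = 600.0 / 3.3  # ~182 ps per inch, assumed from the values above

def trace_delay_ps(length_inches):
    return length_inches * PS_PER_INCH_FR4

def trace_length_inches(delay_ps):
    return delay_ps / PS_PER_INCH_FR4

print(round(trace_delay_ps(3.3)))          # ~600 ps
print(round(trace_length_inches(100), 2))  # ~0.55 inch for a 100-ps delay budget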
The eye diagrams shown in this section show the best case achievable and do not take
into account PCB vias, crosstalk and other degrading effects such as variations in the
PCB structure due to manufacturing tolerances.
1 Simulate your design to ensure correct functionality.
Outputs from the FPGA to the RLDRAM II Component
The following output signals are from the FPGA to the RLDRAM II component:
■ write data (DQ on the bidirectional data signals for CIO RLDRAM II)
■ data mask (DM)
■ address, bank address
■ command (CS, WE, and REF)
■ clocks (CK/CK# and DK/DK#)
For point-to-point single-ended signals requiring external termination, Altera
recommends that you place a fly-by termination by terminating at the end of the
transmission line after the receiver to avoid unterminated stubs. The guideline is to
place the fly-by termination within 100 ps propagation delay of the receiver.
Although not recommended, you can place the termination before the receiver, which
leaves an unterminated stub. The stub delay is critical because the stub between the
termination and the receiver is effectively unterminated, causing additional ringing
and reflections. Stub delays should be less than 50 ps.
Altera recommends that the differential clocks, CK, CK# and DK, DK#, use a differential
termination at the end of the trace of the RLDRAM II component. Alternatively, you
can terminate each clock output with a parallel termination to VTT.
The HyperLynx simulation eye diagrams show simulation cases of write data,
address, and chip-select signals with termination options. All eye diagrams are shown
at the connection to the receiver device die.
Figure 6–3 shows the double data rate write data using a Stratix IV Class I HSTL-18
with a calibrated 50-Ω OCT output driver and the nominal RLDRAM II ODT of 150 Ω.
Figure 6–3. Write Data Simulation at 400 MHz with RLDRAM II ODT
Figure 6–4 shows an address signal at a frequency of 200 MHz using Stratix IV Class I
HSTL-18 with a calibrated 50-Ω OCT driver and a 100-ps fly-by 50-Ω parallel
termination to VTT.
Figure 6–4. Address Simulation Using Stratix IV Class I HSTL-18 50-Ω Calibration Driver and Fly-by 50-Ω Parallel
Termination
Figure 6–5 shows an address signal at a frequency of 200 MHz using a Stratix IV Class I
HSTL-18 12-mA driver and a 50-ps stub 50-Ω parallel termination to VTT.
Figure 6–5. Address Simulation Using Stratix IV Class I HSTL-18 50-Ω Calibration Driver and Stub 50-Ω Parallel
Termination to VTT
Figure 6–6 shows the chip-select signal at a frequency of 200 MHz using Stratix IV
Class I HSTL-18 with a calibrated 50-Ω driver and a 10-kΩ pull-up resistor to VDD. The
RLDRAM II power sequencing may require the chip selects to have a pull-up resistor.
Refer to the RLDRAM II data sheet for further details.
Figure 6–6. Chip-Select Simulation Using Stratix IV Class I HSTL-18 50-Ω Calibration Driver and 10-kΩ Pull-up Resistor to
VDD
For the RLDRAM II width expansion configuration for address and command, use
the same principles recommended for “QDR II SRAM Board Design Guidelines” on
page 7–1.
For external parallel termination recommended for a balanced T topology, refer to
Figure 7–3 on page 7–4, and for HyperLynx simulation diagrams of the width
expansion topology for address and command signals, refer to Figure 7–8 through
Figure 7–11 on page 7–13.
Input to the FPGA from the RLDRAM II Component
The RLDRAM II component drives the following input signals into the FPGA:
■ read data (DQ on the bidirectional data signals for CIO RLDRAM II)
■ read clocks (QK/QK#)
Altera recommends that you use the FPGA parallel OCT to terminate the data on
reads and read clocks.
The eye diagrams are shown at the FPGA die pin, and the RLDRAM II output driver
is Class I HSTL-18 using its ZQ calibration of 50 Ω. The RLDRAM II read data is
double data rate.
Figure 6–7 shows the ideal case of a fly-by terminated signal using 50-Ω calibrated
parallel OCT for a Stratix IV device.
Figure 6–7. Read Data Simulation at 400 MHz with 50-Ω Parallel OCT Termination
Termination Schemes
Table 6–2 lists the recommended termination schemes for major CIO RLDRAM II
memory interface signals, which include data (DQ), data mask (DM), clocks (CK, CK#, DK,
DK#, QK, and QK#), address, bank address, and command (WE#, REF#, and CS#).
Table 6–2. Termination Recommendations for Arria II GZ, Arria V, Stratix III, Stratix IV, and Stratix V Devices

Signal Type                      HSTL 15/18 Standard (1), (2), (3), (4)   Memory End Termination
DK/DK# Clocks                    Class I R50 NO CAL                       100-Ω differential
QK/QK# Clocks                    Class I P50 CAL                          ZQ50
Data (Write)                     Class I R50 CAL                          ODT
Data (Read)                      Class I P50 CAL                          ZQ50
Data Mask                        Class I R50 CAL                          ODT
CK/CK# Clocks                    Class I R50 NO CAL                       ×1 = 100-Ω differential (9); ×2 = 200-Ω differential (10)
Address/Bank Address (5), (6)    Class I Max Current                      50-Ω parallel to VTT
Command (WE#, REF#) (5), (6)     Class I Max Current                      50-Ω parallel to VTT
Command (CS#) (5), (6), (7)      Class I Max Current                      50-Ω parallel to VTT or pull-up to VDD
QVLD (8)                         Class I P50 CAL                          ZQ50

Notes to Table 6–2:
(1) R is effective series output impedance.
(2) P is effective parallel input impedance.
(3) CAL is OCT with calibration.
(4) NO CAL is OCT without calibration.
(5) For width expansion configuration, the address and control signals are routed to 2 devices. Recommended termination is 50 Ω parallel to VTT at the trace split of a balanced T or Y routing topology. Use a clamshell placement of the two RLDRAM II components to achieve minimal stub delays and optimum signal integrity. Clamshell placement is when two devices overlay each other by being placed on opposite sides of the PCB.
(6) The UniPHY default IP setting for this output is Max Current. A Class I 50-Ω output with calibration is typically optimal in single-load topologies.
(7) Altera recommends that you use a 50-Ω parallel termination to VTT if your design meets the power sequencing requirements of the RLDRAM II component. Refer to the RLDRAM II data sheet for further information.
(8) QVLD is not used in the RLDRAM II Controller with UniPHY implementations.
(9) ×1 is a single-device load.
(10) ×2 is a double-device load. An alternative option is to use a 100-Ω differential termination at the trace split.
1 Altera recommends that you simulate your specific design for your system to ensure
good signal integrity.
PCB Layout Guidelines
Table 6–3 lists the RLDRAM II general routing layout guidelines.
1 The following layout guidelines include several ± length-based rules. These
length-based guidelines are for first-order timing approximations if you cannot
simulate the actual delay characteristics of your PCB implementation. They do not
include any margin for crosstalk.
1 Altera recommends that you get accurate time base skew numbers when you simulate
your specific implementation.
Table 6–3. RLDRAM II Layout Guidelines

Impedance
■ All signal planes must be 50 Ω, single-ended, ±10%.
■ All signal planes must be 100 Ω, differential, ±10%.
■ Remove all unused via pads, because they cause unwanted capacitance.

Decoupling Parameter
■ Use 0.1 µF in 0402 size to minimize inductance.
■ Make VTT voltage decoupling close to pull-up resistors.
■ Connect decoupling caps between VTT and ground.
■ Use a 0.1 µF cap for every other VTT pin.
■ Verify your capacitive decoupling using the Altera Power Distribution Network (PDN) Design tool.

Power
■ Route GND, 1.5 V/1.8 V as planes.
■ Route VCCIO for memories in a single split plane with at least a 20-mil (0.020 inches or 0.508 mm) gap of separation.
■ Route VTT as islands or 250-mil (6.35-mm) power traces.
■ Route oscillators and PLL power as islands or 100-mil (2.54-mm) power traces.

General Routing
■ All specified delay matching requirements include PCB trace delays, different layer propagation, velocity variance, and crosstalk. To minimize PCB layer propagation variance, Altera recommends that signals from the same net group always be routed on the same layer. If you must route signals of the same net group on different layers with the same impedance characteristic, simulate your worst case PCB trace tolerances to ascertain actual propagation delay differences. Typical layer-to-layer trace delay variations are on the order of 15 ps/inch.
■ Use 45° angles (not 90° corners).
■ Avoid T-junctions for critical nets or clocks.
■ Avoid T-junctions greater than 150 ps (approximately 500 mils, 12.7 mm).
■ Disallow signals across split planes.
■ Restrict routing other signals close to system reset signals.
■ Avoid routing memory signals closer than 0.025 inch (0.635 mm) to PCI or system clocks.
■ Match all signals within a given DQ group with a maximum skew of ±10 ps or approximately ±50 mils (0.254 mm) and route on the same layer.

Clock Routing
■ Route clocks on inner layers with outer-layer run lengths held to under 150 ps (approximately 500 mils, 12.7 mm).
■ These signals should maintain a 10-mil (0.254 mm) spacing from other nets.
■ Clocks should maintain a length-matching between clock pairs of ±5 ps or approximately ±25 mils (0.635 mm).
■ Differential clocks should maintain a length-matching between P and N signals of ±2 ps or approximately ±10 mils (0.254 mm).
■ Space between different clock pairs should be at least three times the space between the traces of a differential pair.

Address and Command Routing
■ To minimize crosstalk, route address, bank address, and command signals on a different layer than the data and data mask signals.
■ Do not route the differential clock signals close to the address signals.
■ Keep the distance from the pin on the RLDRAM II component to the stub termination resistor (VTT) to less than 50 ps (approximately 250 mils, 6.35 mm) for the address/command signal group.
■ Keep the distance from the pin on the RLDRAM II component to the fly-by termination resistor (VTT) to less than 100 ps (approximately 500 mils, 12.7 mm) for the address/command signal group.

External Memory Routing Rules
■ Apply the following parallelism rules for the RLDRAM II data/address/command groups:
  ■ 4 mils for parallel runs < 0.1 inch (approximately 1× spacing relative to plane distance).
  ■ 5 mils for parallel runs < 0.5 inch (approximately 1× spacing relative to plane distance).
  ■ 10 mils for parallel runs between 0.5 and 1.0 inches (approximately 2× spacing relative to plane distance).
  ■ 15 mils for parallel runs between 1.0 and 3.3 inches (approximately 3× spacing relative to plane distance).

Maximum Trace Length
■ Keep the maximum trace length of all signals from the FPGA to the RLDRAM II components to 600 ps (approximately 3,300 mils, 83.3 mm).
Using the layout guidelines in Table 6–3, Altera recommends the following layout
approach:
1. If the RLDRAM II interface has multiple DQ groups (×18 or ×36 RLDRAM II
component or width expansion configuration), match all the DK/DK# and QK/QK#
clocks as tightly as possible to optimize the timing margins in your design.
2. Route the DK/DK# write clock and QK/QK# read clock associated with a DQ group on
the same PCB layer. Match these clock pairs to within ±5 ps.
3. Set the DK/DK# or QK/QK# clock as the target trace propagation delay for the
associated data and data mask signals.
4. Route the data and data mask signals for the DQ group ideally on the same layer
as the associated QK/QK# and DK/DK# clocks to within ±10 ps skew of the target
clock.
5. Route the CK/CK# clocks and set as the target trace propagation delays for the
address/command signal group. Match the CK/CK# clock to within ±50 ps of all the
DK/DK# clocks.
6. Route the address/control signal group (address, bank address, CS, WE, and REF)
ideally on the same layer as the CK/CK# clocks, to within ±20 ps skew of the CK/CK#
traces.
This layout approach provides a good starting point for a design requirement of the
highest clock frequency supported for the RLDRAM II interface.
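The Python sketch below expresses the skew targets from the steps above (±5 ps, ±10 ps, ±20 ps, and ±50 ps) as a simple check on extracted trace delays; the names and structure are illustrative only and not part of any Altera tool.

# Skew budgets in ps, taken from the layout approach above.
SKEW_BUDGETS_PS = {
    "dk_to_qk_pair": 5,      # step 2: DK/DK# to QK/QK# clock-pair matching
    "data_to_clock": 10,     # step 4: data and data mask to the target clock
    "addr_cmd_to_ck": 20,    # step 6: address/control group to CK/CK#
    "ck_to_dk": 50,          # step 5: CK/CK# to all DK/DK# clocks
}

def within_budget(skew_ps, budget_name):
    return abs(skew_ps) <= SKEW_BUDGETS_PS[budget_name]

print(within_budget(7, "data_to_clock"))     # True: inside the +/-10 ps target
print(within_budget(-32, "addr_cmd_to_ck"))  # False: exceeds the +/-20 ps target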
1 Altera recommends that you create your project in the Quartus® II software with a
fully implemented RLDRAM II Controller with UniPHY interface, and observe the
interface timing margins to determine the actual margins for your design.
Although the recommendations in this chapter are based on simulations, you can
apply the same general principles when determining the best termination scheme,
drive strength setting, and loading style to any board designs. Even armed with this
knowledge, it is still critical that you perform simulations, either using IBIS or
HSPICE models, to determine the quality of signal integrity on your designs.
Document Revision History
Table 6–4 lists the revision history for this document.
Table 6–4. Document Revision History
November 2011: Version 3.0. Added Arria V information.
June 2011: Version 2.0. Added Stratix V information.
December 2010: Version 1.0. Initial release.
7. QDR II SRAM Board Design Guidelines
This chapter provides signal integrity and layout guidelines to help you successfully
implement a QDR II or QDR II+ SRAM interface in your system.
The QDR II and QDR II+ SRAM Controller with UniPHY intellectual property (IP)
enables you to implement QDR II and QDR II+ interfaces with Arria® II GX, Arria V,
Stratix® III, Stratix IV, and Stratix V devices.
In this chapter, QDR II SRAM refers to both QDR II and QDR II+ SRAM unless stated
otherwise.
This chapter focuses on the following key factors that affect signal integrity:
■ I/O standards
■ QDR II SRAM configurations
■ Signal terminations
■ Printed circuit board (PCB) layout guidelines
I/O Standards
QDR II SRAM interface signals use one of the following JEDEC I/O signalling
standards:
■ HSTL-15—provides the advantages of lower power and lower emissions.
■ HSTL-18—provides increased noise immunity with slightly greater output voltage swings.
To select the most appropriate standard for your interface, refer to the Arria II GX
Devices Data Sheet: Electrical Characteristics chapter in the Arria II Device Handbook,
the Stratix III Device Datasheet: DC and Switching Characteristics chapter in the Stratix III
Device Handbook, or the Stratix IV Device Datasheet: DC and Switching Characteristics
chapter in the Stratix IV Device Handbook.
Altera® QDR II SRAM Controller with UniPHY IP defaults to HSTL 1.5 V Class I
outputs and HSTL 1.5 V inputs.
QDR II SRAM Configurations
The QDR II SRAM Controller with UniPHY IP supports interfaces with a single
device, or with two devices in a width expansion configuration, up to a maximum
width of 72 bits.
Figure 7–1 shows the main signal connections between the FPGA and a single QDR II
SRAM component.
Figure 7–1. Configuration With A Single QDR II SRAM Component
[Figure: an FPGA connected to a single QDR II SRAM device. The memory Q and CQ/CQ# outputs drive the FPGA DATA IN and echo clock inputs; the FPGA DATA OUT, BWSn, K/Kn, ADDRESS, WPSn, RPSn, and DOFFn outputs drive the memory D, BWS, K/K#, A, WPS, RPS, and DOFF pins; the memory ZQ pin connects to an external RQ resistor; VTT terminations are placed on the signals as described in the notes below.]
Notes to Figure 7–1:
(1) Use external discrete termination only for data inputs targeting Arria II GX devices that do not support parallel OCT. For Stratix III and Stratix IV
devices, use parallel OCT.
(2) Use external discrete termination only for CQ/CQ# targeting Arria II GX devices, or for any device using ×36 emulated mode.
(3) Use external discrete termination for this signal, as shown for RPS.
(4) Use external discrete termination with fly-by placement to avoid stubs.
Figure 7–2 shows the main signal connections between the FPGA and two QDR II
SRAM components in a width expansion configuration.
Figure 7–2. Configuration With Two QDR II SRAM Components In A Width Expansion Configuration
[Figure: an FPGA connected to two QDR II SRAM devices. Each device has its own read data (Q), echo clock (CQ/CQn0 and CQ/CQn1), write clock (K0/K0n and K1/K1n), and ZQ/RQ connection, while the DATA OUT, BWSn, ADDRESS, WPSn, RPSn, and DOFFn signals are shared between the two devices. VTT terminations are placed as described in the notes below.]
Notes to Figure 7–2:
(1) Use external discrete termination only for data inputs targeting Arria II GX devices that do not support parallel OCT. For Stratix III and Stratix IV
devices, use parallel OCT.
(2) Use external discrete termination only for CQ/CQ# targeting Arria II GX devices, or for any device using ×36 emulated mode.
(3) Use external discrete termination for data outputs, BWSn, and K/K# clocks with fly-by placement to avoid stubs.
(4) Use external discrete termination for this signal, as shown for RPS.
(5) Use external discrete termination at the trace split of the balanced T or Y topology.
Figure 7–3 shows the detailed balanced topology recommended for the address and
command signals in the width expansion configuration.
Figure 7–3. External Parallel Termination for Balanced Topology
[Figure: the FPGA drives trace TL1 to a split (1); from the split, two matched TL2 branches run to the two QDR II SRAM components, and a parallel termination to VTT is placed at the split.]
Note to Figure 7–3:
(1) To minimize the reflections and parallel impedance discontinuity seen by the signal, place the trace split close to the
QDR II SRAM memory components. Keep TL2 short so that the QDR II SRAM components appear as a lumped load.
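A common signal-integrity rule of thumb (not stated in this handbook, so treat it as an assumption to confirm by simulation) is that a stub behaves as a lumped load when its propagation delay is well below the driver's rise time, for example less than about one sixth of it. The following sketch applies that rule with illustrative numbers; take the rise time from your driver's IBIS model and the delay per inch from your stackup.

```python
# Minimal sketch of a lumped-load rule of thumb for sizing the TL2 stubs.
# Both constants are illustrative assumptions, not values from this handbook.
PS_PER_INCH = 180.0      # assumed propagation delay of the PCB stackup
RISE_TIME_PS = 200.0     # assumed 20%-80% rise time of the driving buffer

max_lumped_delay_ps = RISE_TIME_PS / 6.0
max_stub_mils = max_lumped_delay_ps / PS_PER_INCH * 1000.0

print(f"Keep TL2 below ~{max_lumped_delay_ps:.0f} ps "
      f"(~{max_stub_mils:.0f} mils) for the loads to appear lumped.")
```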
Signal Terminations
Arria II GX, Stratix III and Stratix IV devices offer on-chip termination (OCT)
technology.
Table 7–1 summarizes the extent of OCT support for each device.
Table 7–1. On-Chip Termination Schemes (1)

Termination Scheme (HSTL-15 and HSTL-18, Class I) | Arria II GX (Column I/O / Row I/O) | Arria II GZ, Stratix III, and Stratix IV (Column I/O / Row I/O) | Arria V and Stratix V (Column I/O / Row I/O)
On-Chip Series Termination without Calibration | 50 Ω / 50 Ω | 50 Ω / 50 Ω | — / —
On-Chip Series Termination with Calibration | 50 Ω / 50 Ω | 50 Ω / 50 Ω | — / —
On-Chip Parallel Termination with Calibration | — / — | 50 Ω / 50 Ω | 50 Ω / 50 Ω
Note to Table 7–1:
(1) This table provides information about the HSTL-15 and HSTL-18 standards because these are the I/O standards that Altera FPGAs support for QDR II SRAM memory interfaces.
On-chip series (RS) termination is supported only on output and bidirectional buffers,
while on-chip parallel (RT) termination is supported only on input and bidirectional
buffers. Because QDR II SRAM interfaces have unidirectional data paths, dynamic
OCT is not required.
For Arria II GX, Stratix III, and Stratix IV devices, the HSTL Class I I/O calibrated
terminations are calibrated against 50 Ω 1% resistors connected to the RUP and RDN
pins in an I/O bank with the same VCCIO as the QDR II SRAM interface. The
calibration occurs at the end of the device configuration.
QDR II SRAM components have a ZQ pin, which is connected through a resistor (RQ) to
ground. Typically, the QDR II SRAM output signal impedance is 0.2 × RQ. Refer to the
QDR II SRAM device data sheet for more information.
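As a worked example of the 0.2 × RQ relationship (the 250 Ω value below is only illustrative; check the memory vendor's data sheet for the supported RQ range):

```python
# Worked example of the output-impedance relationship described above.
# RQ = 250 ohms is an illustrative choice; it yields roughly a 50-ohm output
# impedance, matching the 50-ohm traces assumed throughout this chapter.
RQ_OHMS = 250.0
output_impedance_ohms = 0.2 * RQ_OHMS
print(f"RQ = {RQ_OHMS:.0f} ohms -> output impedance ~ {output_impedance_ohms:.0f} ohms")
```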
For information about OCT, refer to the I/O Features in Arria II GX Devices chapter in
the Arria II GX Device Handbook, the I/O Features in Arria V Devices chapter in the Arria V
Device Handbook, the Stratix III Device I/O Features chapter in the Stratix III Device
Handbook, the I/O Features in Stratix IV Devices chapter in the Stratix IV Device Handbook,
and the I/O Features in Stratix V Devices chapter in the Stratix V Device Handbook.
The following section shows HyperLynx simulation eye diagrams to demonstrate
signal termination options. Altera strongly recommends signal terminations to
optimize signal integrity and timing margins, and to minimize unwanted emissions,
reflections, and crosstalk.
All of the eye diagrams shown in this section are for a 50 Ω trace with a propagation
delay of 720 ps, which is approximately a 4-inch trace on a standard FR4 PCB. The
signal I/O standard is HSTL-15.
For point-to-point signals, Altera recommends that you place a fly-by termination by
terminating at the end of the transmission line after the receiver to avoid
unterminated stubs. The guideline is to place the fly-by termination within 100 ps
propagation delay of the receiver.
Although not recommended, you can place the termination before the receiver, which
leaves an unterminated stub. The stub delay is critical because the stub between the
termination and the receiver is effectively unterminated, causing additional ringing
and reflections. Stub delays should be less than 50 ps.
The eye diagrams shown in this section show the best case achievable and do not take
into account PCB vias, crosstalk and other degrading effects such as variations in the
PCB structure due to manufacturing tolerances.
Simulate your design to ensure correct functionality.
Output from the FPGA to the QDR II SRAM Component
The following output signals are from the FPGA to the QDR II SRAM component:
■ write data
■ byte write select (BWSn)
■ address
■ control (WPSn and RPSn)
■ clocks, K/K#
Altera recommends that you terminate the write clocks, K and K#, with a single-ended
fly-by 50 Ω parallel termination to VTT. However, simulations show that you can
consider a differential termination if the clock pair is well matched and routed
differentially.
The HyperLynx simulation eye diagrams show simulation cases of write data and
address signals with termination options. The QDR II SRAM write data is double data
rate. The QDR II SRAM address is either double data rate (burst length of 2) or single
data rate (burst length of 4).
Simulations show that lowering the drive strength does not make a significant
difference to the eye diagrams. All eye diagrams are shown at the QDR II SRAM
device receiver pin.
Figure 7–4 shows the fly-by terminated signal using the Stratix IV Class I HSTL-15 with
calibrated 50 Ω OCT output driver.
Figure 7–4. Write Data Simulation at 400 MHz with Fly-By 50 Ω Parallel Termination to VTT
Figure 7–5 shows an unterminated signal using the Stratix IV Class I HSTL-15 with a
calibrated 50 Ω OCT output driver. This unterminated solution is not recommended.
Figure 7–5. Write Data Simulation at 400 MHz with No Far-End Termination
Figure 7–6 shows an unterminated signal at a lower frequency of 250 MHz using the
Arria II GX Class I HSTL-15 with calibrated 50 Ω OCT output driver. This
unterminated solution may be passable for some systems, but is shown so that you
can compare against the superior quality of the terminated signal in Figure 7–4.
Figure 7–6. Write Data Simulation at 250 MHz with No Far-End Termination
Figure 7–7 shows an unterminated signal at a frequency of 175 MHz with a
point-to-point connection. QDR II SRAM interfaces using Stratix IV devices have a
maximum supported frequency of 350 MHz. For QDR II SRAM burst-length-of-four
interfaces, the address signals are effectively single data rate at 175 MHz. This
unterminated solution is not recommended but can be considered. The FPGA output
driver is Class I HSTL-15 with a calibrated 50 Ω OCT.
Figure 7–7. Address Simulation for QDR II SRAM Burst Length of 4 at 175 MHz with No Far-End Termination
Figure 7–8 shows a typical topology, which is used for two components in width
expansion mode. Altera recommends that you match the stubs TL20 and TL22, but
you can allow small differences to achieve acceptable signal integrity.
Figure 7–8. Address for QDR II SRAM Burst Length of 2 in Width Expansion Mode Topology
[Figure: a Stratix IV device output (U24.1) drives a 50 Ω, 720 ps main trace (TL19) to a split. From the split, stub TL20 (50 Ω, 105.0 ps) runs to QDR II SRAM device U25.1 and stub TL22 (50 Ω, 95.0 ps) runs to QDR II SRAM device U26.1 (both CY7C1263V18 components). A termination branch consisting of trace TL21 (50 Ω, 100 ps) and a 50 Ω resistor (R9) to VTT (0.75 V) also connects at the split.]
The eye diagrams in Figure 7–9 and Figure 7–10 use the topology shown in
Figure 7–8. The eye diagram in Figure 7–11 uses the topology shown in Figure 7–8
without the VTT termination, R9 and TL21.
Figure 7–9 shows an address signal at a frequency of 400 MHz for QDR II SRAM burst
length of 2 in width expansion, using the Stratix IV Class I HSTL-15 12 mA driver and
a fly-by 50 Ω parallel termination to VTT.
Figure 7–9. Address Simulation Using Stratix IV Class I HSTL-15 12 mA Driver and Fly-by 50 Ω Parallel Termination to VTT
Figure 7–10 shows an address signal at a frequency of 400 MHz for QDR II SRAM
burst length of 2 in width expansion, using the Stratix IV Class I HSTL-15 with 50 Ω
calibration driver and a fly-by 50 Ω parallel termination to VTT. The waveform eye is
significantly improved compared to the maximum (12 mA) drive strength case.
Figure 7–10. Address Simulation Using Stratix IV Class I HSTL-15 50 Ω Calibration Driver and Fly-by 50 Ω Parallel
Termination to VTT
Figure 7–11 shows an unterminated address signal at a frequency of 400 MHz for
QDR II SRAM burst length of 2 in width expansion, using the Stratix IV Class I HSTL-15
with 50 Ω calibration driver. This unterminated address signal has a small eye and is
not recommended.
Figure 7–11. Address Simulation Using Stratix IV Class I HSTL-15 50 Ω Calibration Driver and No Termination
Input to the FPGA from the QDR II SRAM Component
The QDR II SRAM component drives the following input signals into the FPGA:
■ read data
■ echo clocks, CQ/CQ#
For point-to-point signals, Altera recommends that you use the FPGA parallel OCT
wherever possible. For devices that do not support parallel OCT (Arria II GX), and for
×36 emulated configuration CQ/CQ# termination, Altera recommends that you use a
fly-by 50 Ω parallel termination to VTT. Although not recommended, you can use
parallel termination with a short stub of less than 50 ps propagation delay as an
alternative option. The input echo clocks, CQ and CQ#, must not use a differential
termination.
The eye diagrams are shown at the FPGA receiver pin, and the QDR II SRAM output
driver is Class I HSTL-15 using its ZQ calibration of 50 Ω. The QDR II SRAM read data
is double data rate.
Figure 7–12 shows the ideal case of a fly-by terminated signal using 50 Ω calibrated
parallel OCT with a Stratix IV device.
Figure 7–12. Read Data Simulation at 400 MHz with 50 Ω Parallel OCT Termination
Figure 7–13 shows an external discrete component fly-by terminated signal at a lower
frequency of 250 MHz using an Arria II GX device.
Figure 7–13. Read Data Simulation at 250 MHz with Fly-By Parallel 50 Ω Termination
Figure 7–14 shows an unterminated signal at a lower frequency of 250 MHz using an
Arria II GX device. This unterminated solution is not recommended but is shown so
that you can compare against the superior quality of the terminated signal in
Figure 7–13.
Figure 7–14. Read Data Simulation at 250 MHz with No Far-End Termination
Termination Schemes
Table 7–2 and Table 7–3 list the recommended termination schemes for major QDR II
SRAM memory interface signals, which include write data (D), byte write select (BWS),
read data (Q), clocks (K, K#, CQ, and CQ#), address and command (WPS and RPS).
Table 7–2. Termination Recommendations for Arria II GX Devices

Signal Type | HSTL 15/18 Standard (1), (2) | FPGA End Discrete Termination | Memory End Termination
K/K# Clocks | Class I R50 CAL | — | 50 Ω Parallel to VTT
Write Data | Class I R50 CAL | — | 50 Ω Parallel to VTT
BWS | Class I R50 CAL | — | 50 Ω Parallel to VTT
Address (3), (4) | Class I Max Current | — | 50 Ω Parallel to VTT
WPS, RPS (3), (4) | Class I Max Current | — | 50 Ω Parallel to VTT
CQ/CQ# | Class I | 50 Ω Parallel to VTT | ZQ50
CQ/CQ# ×36 emulated (5) | Class I | 50 Ω Parallel to VTT | ZQ50
Read Data (Q) | Class I | 50 Ω Parallel to VTT | ZQ50
QVLD (6) | — | — | ZQ50
Notes to Table 7–2:
(1) R is effective series output impedance.
(2) CAL is calibrated OCT.
(3) For the width expansion configuration, the address and control signals are routed to two devices. The recommended termination is 50 Ω parallel to VTT at the trace split of a balanced T or Y routing topology. For 400 MHz burst length 2 configurations, where the address signals are double data rate, Altera recommends a clamshell placement of the two QDR II SRAM components to achieve minimal stub delays and optimum signal integrity. Clamshell placement is when two devices overlay each other by being placed on opposite sides of the PCB.
(4) The UniPHY default IP setting for this output is Max Current. A Class I 50 Ω calibrated output is typically optimal in single-load topologies.
(5) For ×36 emulated mode, the recommended termination for the CQ/CQ# signals is a 50 Ω parallel termination to VTT at the trace split; refer to Figure 7–15. Altera recommends that you use this termination when ×36 DQ/DQS groups are not supported in the FPGA.
(6) QVLD is not used in the QDR II or QDR II+ SRAM Controller with UniPHY implementations.
Table 7–3. Termination Recommendations for Arria V, Stratix III, Stratix IV, and Stratix V Devices

Signal Type | HSTL 15/18 Standard (1), (2), (3) | FPGA End Discrete Termination | Memory End Termination
K/K# Clocks | Class I R50 CAL | — | 50 Ω Parallel to VTT
Write Data | Class I R50 CAL | — | 50 Ω Parallel to VTT
BWS | Class I R50 CAL | — | 50 Ω Parallel to VTT
Address (4), (5) | Class I Max Current | — | 50 Ω Parallel to VTT
WPS, RPS (4), (5) | Class I Max Current | — | 50 Ω Parallel to VTT
CQ/CQ# | Class I P50 CAL | — | ZQ50
CQ/CQ# ×36 emulated (6) | — | 50 Ω Parallel to VTT | ZQ50
Read Data (Q) | Class I P50 CAL | — | ZQ50
QVLD (7) | Class I P50 CAL | — | ZQ50
Notes to Table 7–3:
(1) R is effective series output impedance.
(2) P is effective parallel input impedance.
(3) CAL is calibrated OCT.
(4) For the width expansion configuration, the address and control signals are routed to two devices. The recommended termination is 50 Ω parallel to VTT at the trace split of a balanced T or Y routing topology. For 400 MHz burst length 2 configurations, where the address signals are double data rate, Altera recommends a clamshell placement of the two QDR II SRAM components to achieve minimal stub delays and optimum signal integrity. Clamshell placement is when two devices overlay each other by being placed on opposite sides of the PCB.
(5) The UniPHY default IP setting for this output is Max Current. A Class I 50 Ω calibrated output is typically optimal in single-load topologies.
(6) For ×36 emulated mode, the recommended termination for the CQ/CQ# signals is a 50 Ω parallel termination to VTT at the trace split; refer to Figure 7–15. Altera recommends that you use this termination when ×36 DQ/DQS groups are not supported in the FPGA.
(7) QVLD is not used in the QDR II or QDR II+ SRAM Controller with UniPHY implementations.
Altera recommends that you simulate your specific design to ensure good signal
integrity.
For a ×36 QDR II SRAM interface that uses an emulated mode of two ×18 DQS groups
in the FPGA, there are two CQ/CQ# connections at the FPGA and a single CQ/CQ#
output from the QDR II SRAM device. Altera recommends that you use a balanced T
topology with the trace split close to the FPGA and a parallel termination at the split,
as shown in Figure 7–15.
Figure 7–15. Emulated ×36 Mode CQ/CQn Termination Topology
[Figure: the single CQ output from the QDR II SRAM memory drives trace TL1 to a split (1) near the FPGA; from the split, two matched TL2 branches run to the two FPGA CQ inputs, with a parallel termination to VTT at the split. The CQn output is split and terminated in the same way to the two FPGA CQn inputs.]
Note to Figure 7–15:
(1) To minimize the reflections and parallel impedance discontinuity seen by the signal, place the trace split close to the
FPGA device. Keep TL2 short so that the FPGA inputs appear as a lumped load.
For more information about ×36 emulated modes, refer to the "Exceptions for ×36
Emulated QDR II and QDR II+ SRAM Interfaces in Arria II GX, Stratix III, and
Stratix IV Devices" section in the Planning Pin and FPGA Resources chapter.
PCB Layout Guidelines
Table 7–4 summarizes QDR II and QDR II+ SRAM general routing layout guidelines.
The following layout guidelines include several ± length-based rules. These
length-based guidelines are first-order timing approximations for cases where you
cannot simulate the actual delay characteristics of your PCB implementation. They do
not include any margin for crosstalk.
Altera recommends that you obtain accurate time-based skew numbers by simulating
your specific implementation.
Table 7–4. QDR II and QDR II+ SRAM Layout Guidelines

Impedance
■ All signal planes must be 50 Ω, single-ended, ±10%.
■ All signal planes must be 100 Ω, differential, ±10%.
■ Remove all unused via pads, because they cause unwanted capacitance.

Decoupling Parameter
■ Use 0.1 µF capacitors in 0402 size to minimize inductance.
■ Make the VTT voltage decoupling close to the pull-up resistors.
■ Connect the decoupling capacitors between VTT and ground.
■ Use a 0.1 µF capacitor for every other VTT pin.
■ Verify your capacitive decoupling using the Altera Power Distribution Network (PDN) Design tool.

Power
■ Route GND and 1.5 V/1.8 V as planes.
■ Route VCCIO for memories in a single split plane with at least a 20-mil (0.020 inch, or 0.508 mm) gap of separation.
■ Route VTT as islands or 250-mil (6.35 mm) power traces.
■ Route all oscillator and PLL power as islands or 100-mil (2.54 mm) power traces.

General Routing
■ All specified delay matching requirements include PCB trace delays, different layer propagation velocity variance, and crosstalk. To minimize PCB layer propagation variance, Altera recommends that signals from the same net group always be routed on the same layer. If signals of the same net group must be routed on different layers with the same impedance characteristic, you must simulate your worst-case PCB trace tolerances to ascertain actual propagation delay differences. Typical layer-to-layer trace delay variations are on the order of 15 ps/inch.
■ Use 45° angles (not 90° corners).
■ Avoid T-junctions for critical nets or clocks.
■ Avoid T-junctions greater than 150 ps (approximately 500 mils, 12.7 mm).
■ Disallow signals across split planes.
■ Restrict routing of other signals close to system reset signals.
■ Avoid routing memory signals closer than 0.025 inch (0.635 mm) to PCI or system clocks.

Clock Routing
■ Route clocks on inner layers with outer-layer run lengths held to under 150 ps (approximately 500 mils, 12.7 mm).
■ These signals should maintain a 10-mil (0.254 mm) spacing from other nets.
■ Clocks should maintain a length-matching between clock pairs of ±5 ps or approximately ±25 mils (0.635 mm).
■ Complementary clocks should maintain a length-matching between P and N signals of ±2 ps or approximately ±10 mils (0.254 mm).
■ Keep the distance from the pin on the QDR II SRAM component to the stub termination resistor (VTT) to less than 50 ps (approximately 250 mils, 6.35 mm) for the K, K# clocks.
■ Keep the distance from the pin on the QDR II SRAM component to the fly-by termination resistor (VTT) to less than 100 ps (approximately 500 mils, 12.7 mm) for the K, K# clocks.
■ Keep the distance from the pin on the FPGA component to the stub termination resistor (VTT) to less than 50 ps (approximately 250 mils, 6.35 mm) for the echo clocks, CQ and CQ#, if they require an external discrete termination.
■ Keep the distance from the pin on the FPGA component to the fly-by termination resistor (VTT) to less than 100 ps (approximately 500 mils, 12.7 mm) for the echo clocks, CQ and CQ#, if they require an external discrete termination.

External Memory Routing Rules
■ Keep the distance from the pin on the QDR II SRAM component to the stub termination resistor (VTT) to less than 50 ps (approximately 250 mils, 6.35 mm) for the write data, byte write select, and address/command signal groups.
■ Keep the distance from the pin on the QDR II SRAM component to the fly-by termination resistor (VTT) to less than 100 ps (approximately 500 mils, 12.7 mm) for the write data, byte write select, and address/command signal groups.
■ Keep the distance from the pin on the FPGA (Arria II GX) to the stub termination resistor (VTT) to less than 50 ps (approximately 250 mils, 6.35 mm) for the read data signal group.
■ Keep the distance from the pin on the FPGA (Arria II GX) to the fly-by termination resistor (VTT) to less than 100 ps (approximately 500 mils, 12.7 mm) for the read data signal group.
■ Parallelism rules for the QDR II SRAM data/address/command groups are as follows:
  ■ 4 mils for parallel runs < 0.1 inch (approximately 1× spacing relative to plane distance).
  ■ 5 mils for parallel runs < 0.5 inch (approximately 1× spacing relative to plane distance).
  ■ 10 mils for parallel runs between 0.5 and 1.0 inches (approximately 2× spacing relative to plane distance).
  ■ 15 mils for parallel runs between 1.0 and 6.0 inches (approximately 3× spacing relative to plane distance).

Maximum Trace Length
■ Keep the maximum trace length of all signals from the FPGA to the QDR II SRAM components to 6 inches.
Using the layout guidelines in Table 7–4, Altera recommends the following layout
approach:
1. Route the K/K# clocks and set the clocks as the target trace propagation delays for
the output signal group.
2. Route the write data output signal group (write data, byte write select),
ideally on the same layer as the K/K# clocks, to within ±10 ps skew of the K/K#
traces.
3. Route the address/control output signal group (address, RPS, WPS), ideally on the
same layer as the K/K# clocks, to within ±20 ps skew of the K/K# traces.
4. Route the CQ/CQ# clocks and set the clocks as the target trace propagation delays
for the input signal group.
5. Route the read data input signal group (read data), ideally on the same layer as
the CQ/CQ# clocks, to within ±10 ps skew of the CQ/CQ# traces.
6. The output and input groups do not need to have the same propagation delays,
but they must have all the signals matched closely within the respective groups.
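If you extract per-net propagation delays from your board layout tool, a small script can flag nets that violate the skew budgets in the steps above. The following sketch is illustrative only: the ±10 ps and ±20 ps budgets come from steps 2, 3, and 5, but the net names, delay values, and reporting format are assumptions rather than part of the UniPHY flow.

```python
# Minimal sketch: check extracted net delays (in ps) against the skew budgets
# from the layout approach above. The example delays are made up; replace them
# with numbers extracted from your PCB tool or from board-level simulation.

groups = {
    # group name: (target clock delay in ps, allowed skew in ps, {net: delay in ps})
    "write data vs. K/K#":  (650.0, 10.0, {"D0": 648.0, "D1": 655.0, "BWS0": 663.0}),
    "addr/ctrl vs. K/K#":   (650.0, 20.0, {"A0": 642.0, "WPS": 668.0, "RPS": 651.0}),
    "read data vs. CQ/CQ#": (640.0, 10.0, {"Q0": 645.0, "Q1": 639.0}),
}

for name, (target_ps, budget_ps, nets) in groups.items():
    for net, delay_ps in nets.items():
        skew_ps = delay_ps - target_ps
        status = "OK" if abs(skew_ps) <= budget_ps else "VIOLATION"
        print(f"{name:22s} {net:5s} skew {skew_ps:+6.1f} ps "
              f"(budget +/-{budget_ps:.0f} ps) {status}")
```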
Table 7–5 and Table 7–6 list the typical margins for QDR II and QDR II+ SRAM
interfaces, with the assumption that there is zero skew between the signal groups.
Table 7–5. Typical Worst Case Margins for QDR II SRAM Interfaces of Burst Length 2

Device | Speed Grade | Frequency (MHz) | Typical Margin Address/Command (ps) | Typical Margin Write Data (ps) | Typical Margin Read Data (ps)
Arria II GX | I5 | 250 | ±240 | ±80 | ±170
Arria II GX ×36 emulated | I5 | 200 | ±480 | ±340 | ±460
Stratix IV | — | 350 | — | — | —
Stratix IV ×36 emulated | C2 | 300 | ±320 | ±170 | ±340
Table 7–6. Typical Worst Case Margins for QDR II+ SRAM Interfaces of Burst Length 4

Device | Speed Grade | Frequency (MHz) | Typical Margin Address/Command (ps) (1) | Typical Margin Write Data (ps) | Typical Margin Read Data (ps)
Arria II GX | I5 | 250 | ±810 | ±150 | ±130
Arria II GX ×36 emulated | I5 | 200 | ±1260 | ±410 | ±420
Stratix IV | C2 | 400 | ±550 | ±10 | ±80
Stratix IV ×36 emulated | C2 | 300 | ±860 | ±180 | ±300

Note to Table 7–6:
(1) The QDR II+ SRAM burst length of 4 designs have greater margins on the address signals because the address signals are single data rate.
Other devices and speed grades typically show higher margins than the ones in
Table 7–5 and Table 7–6.
Altera recommends that you create your project with a fully implemented QDR II or
QDR II+ SRAM Controller with UniPHY interface, and observe the interface timing
margins to determine the actual margins for your design.
Although the recommendations in this chapter are based on simulations, you can
apply the same general principles when determining the best termination scheme,
drive strength setting, and loading style to any board designs. Even armed with this
knowledge, it is still critical that you perform simulations, either using IBIS or
HSPICE models, to determine the quality of signal integrity on your designs.
Document Revision History
Table 7–7 lists the revision history for this document.
Table 7–7. Document Revision History
November 2011: Version 4.0. Added Arria V information.
June 2011: Version 3.0. Added Stratix V information.
December 2010: Version 2.0. Maintenance update.
July 2010: Version 1.0. Initial release.
8. Implementing and Parameterizing Memory IP
This chapter provides a general overview of the Altera® IP core design flow to help
you quickly get started with any Altera IP core. The Altera IP Library is installed as
part of the Quartus® II installation process. You can select and parameterize any Altera
IP core from the library. Altera provides an integrated parameter editor that allows
you to customize IP cores to support a wide variety of applications. The parameter
editor guides you through the setting of parameter values and selection of optional
ports. The following sections describe the general design flow and use of Altera IP
cores.
Installation and Licensing
The Altera IP Library is distributed with the Quartus II software and downloadable
from the Altera website (www.altera.com).
Figure 8–1 shows the directory structure after you install the memory controller with
the memory IP, where <path> is the installation directory. The default installation
directory on Windows is c:\altera\<version>; on Linux it is /opt/altera<version>.
Figure 8–1. Directory Structure
<path>                Installation directory.
  ip                  Contains the Altera IP Library and third-party IP cores.
    altera            Contains the Altera IP Library.
      common          Contains shared components.
      ddr_high_perf   Contains the DDR SDRAM Controller with ALTMEMPHY IP files.
      ddr2_high_perf  Contains the DDR2 SDRAM Controller with ALTMEMPHY IP files.
      ddr3_high_perf  Contains the DDR3 SDRAM Controller with ALTMEMPHY IP files.
      alt_mem_if      Contains the DDR2 or DDR3 SDRAM Controller with UniPHY IP files.
You can evaluate an IP core in simulation and in hardware until you are satisfied with
its functionality and performance. Some IP cores require that you purchase a license
for the IP core when you want to take your design to production. After you purchase
a license for an Altera IP core, you can request a license file from the Altera Licensing
page of the Altera website and install the license on your computer. For additional
information, refer to Altera Software Installation and Licensing.
Free Evaluation
Altera's OpenCore Plus evaluation feature is only applicable to the DDR, DDR2 and
DDR3 SDRAM HPC. With the OpenCore Plus evaluation feature, you can perform the
following actions:
■ Simulate the behavior of a megafunction (Altera MegaCore® function or AMPP℠ megafunction) within your system.
■ Verify the functionality of your design, as well as evaluate its size and speed quickly and easily.
■ Generate time-limited device programming files for designs that include MegaCore functions.
■ Program a device and verify your design in hardware.
You need to purchase a license for the megafunction only when you are completely
satisfied with its functionality and performance, and want to take your design to
production.
OpenCore Plus Time-Out Behavior
OpenCore Plus hardware evaluation can support the following two modes of
operation:
■ Untethered—the design runs for a limited time.
■ Tethered—requires a connection between your board and the host computer. If tethered mode is supported by all megafunctions in a design, the device can operate for a longer time or indefinitely.
All megafunctions in a device time out simultaneously when the most restrictive
evaluation time is reached. If there is more than one megafunction in a design, a
specific megafunction's time-out behavior may be masked by the time-out behavior of
the other megafunctions.
For MegaCore functions, the untethered time-out is 1 hour; the tethered time-out
value is indefinite.
Your design stops working after the hardware evaluation time expires and the
local_ready output goes low.
Design Flow
You can implement the memory controllers with ALTMEMPHY IP or UniPHY IP
using any of the following flows:
■ MegaWizard™ Plug-In Manager flow
■ SOPC Builder flow
■ Qsys flow
Figure 8–2 shows the stages for creating a system in the Quartus II software using the
available flows.
Figure 8–2. Design Flows (1)
[Flowchart: choose a design flow (the MegaWizard flow, or the Qsys/SOPC Builder flow); specify parameters and, for SOPC Builder, complete the SOPC Builder system; optionally perform functional simulation and debug the design until simulation gives the expected results; then add constraints and compile the design, after which the IP is complete.]
Note to Figure 8–2:
(1) Altera IP cores may or may not support the Qsys and SOPC Builder design flows.
The MegaWizard Plug-In Manager flow offers the following advantages:
■ Allows you to parameterize an IP core variant and instantiate it into an existing design
■ For some IP cores, this flow generates a complete example design and testbench
The SOPC Builder flow offers the following advantages:
■ Generates a simulation environment
■ Allows you to integrate Altera-provided custom components
■ Uses Avalon® memory-mapped (Avalon-MM) interfaces
The Qsys flow offers the following additional advantages over SOPC Builder:
■ Provides visualization of hierarchical designs
■ Allows greater performance through interconnect elements and pipelining
■ Provides closer integration with the Quartus II software
MegaWizard Plug-In Manager Flow
The MegaWizard Plug-In Manager flow allows you to customize the memory
controller with ALTMEMPHY or UniPHY IP, and manually integrate the function into
your design.
Specifying Parameters
To specify parameters using the MegaWizard Plug-In Manager flow, perform the
following steps:
1. Create a Quartus II project using the New Project Wizard available from the File
menu.
2. In the Quartus II software, launch the MegaWizard Plug-in Manager from the
Tools menu, and follow the prompts in the MegaWizard Plug-In Manager
interface to create or edit a custom IP core variation.
3. Select a memory controller with the memory IP in the Installed Plug-Ins list in the
External Memory folder.
4. Specify the parameters on all pages in the Parameter Settings tab.
For a detailed explanation of the parameters, refer to "Parameterizing
Memory Controllers with ALTMEMPHY IP" on page 8–38 and
"Parameterizing Memory Controllers with UniPHY IP" on page 8–57.
Some IP cores provide preset parameters for specific applications. If you
wish to use preset parameters, click the arrow to expand the Presets list,
select the desired preset, and then click Apply. To modify preset settings, in
a text editor modify the <installation directory>/ip/altera/alt_mem_if/
alt_mem_if_interfaces/alt_mem_if_<memory_protocol>_emif/
alt_mem_if_<memory_protocol>_mem_model.qprs.
5. If the IP core provides a simulation model, specify appropriate options in the
wizard to generate a simulation model.
Altera IP supports a variety of simulation models, including
simulation-specific IP functional simulation models, encrypted RTL
models, and plain text RTL models. These are all cycle-accurate models. The
models allow for fast functional simulation of your IP core instance using
industry-standard VHDL or Verilog HDL simulators. For some cores, only
the plain text RTL model is generated, and you can simulate that model.
For more information about functional simulation models for Altera IP
cores, refer to the Simulating Altera Designs chapter in volume 3 of the Quartus II
Handbook.
Use the simulation models only for simulation and not for synthesis or any
other purposes. Using these models for synthesis creates a nonfunctional
design.
6. This step applies to memory controllers with ALTMEMPHY IP. If the parameter
editor includes EDA and Summary tabs, follow these steps:
a. Some third-party synthesis tools can use a netlist that contains the structure of
an IP core but no detailed logic to optimize timing and performance of the
design containing it. To use this feature if your synthesis tool and IP core
support it, turn on Generate netlist.
When targeting a VHDL simulation model, the MegaWizard Plug-In
Manager still generates the <variation_name>_alt_mem_phy.v for the
Quartus II synthesis. Do not use this file for simulation. Use the
<variation_name>.vho for simulation instead.
The ALTMEMPHY megafunction only supports functional simulation. You
cannot perform timing or gate-level simulation when using the
ALTMEMPHY megafunction.
b. On the Summary tab, if available, select the files you want to generate. A gray
checkmark indicates a file that is automatically generated. All other files are
optional.
If file selection is supported for your IP core, after you generate the core, a
generation report (<variation name>.html) appears in your project directory.
This file contains information about the generated files.
7. Click the Finish button; the parameter editor generates the top-level HDL code for
your IP core and a simulation directory that includes files for simulation.
The Finish button may be unavailable until all parameterization errors
listed in the messages window are corrected.
8. Click Yes if you are prompted to add the .qip to the current Quartus II project. You
can also turn on Automatically add Quartus II IP Files to all projects.
9. This step applies to memory controllers with ALTMEMPHY IP. If you are using
the UniPHY IP, for the high-performance controller (HPC or HPC II), set the
<variation name>_example_top.v or .vhd to be the project top-level design file.
a. On the File menu, click Open.
b. Browse to <variation name>_example_top and click Open.
c. On the Project menu, click Set as Top-Level Entity.
You can now integrate your custom IP core instance in your design, simulate, and
compile. While integrating your IP core instance into your design, you must make
appropriate pin assignments. You can create a virtual pin to avoid making specific pin
assignments for top-level signals while you are simulating and not ready to map the
design to hardware.
For some IP cores, the generation process also creates complete example designs. An
example design for hardware testing is located in the
<variation_name>_example_design/example_project/ directory. An example design
for RTL simulation is located in the <variation_name>_example_design/simulation/
directory.
For information about the Quartus II software, including virtual pins and the
MegaWizard Plug-In Manager, refer to Quartus II Help.
Constraining the Design
After you have generated the memory IP MegaCore function, you may need to set
timing constraints and perform timing analysis using the Quartus II TimeQuest
Timing Analyzer. When you generate the MegaCore function, the MegaWizard
Plug-In Manager also generates a Synopsys Design Constraints File (.sdc),
<variation_name>.sdc, and a pin assignment script,
<variation_name>_pin_assignments.tcl. Both the .sdc and the <variation
name>_pin_assignments.tcl scripts support multiple instances. These scripts iterate
through all instances of the core and apply the same constraints to all of them. You can
derive the timing constraints from the external device data sheet and tolerances from
the board layout.
For more information about timing constraints and analysis, refer to the Analyzing
Timing of Memory IP chapter.
Add Pins and DQ Group Assignments
The <variation_name>_pin_assignments.tcl script sets up the I/O standards and the
input/output termination for the memory IP. This script also relates the DQ pin
groups together so that the Quartus II Fitter places them correctly.
The pin assignment script does not create a PLL reference clock for the design. You
must create a clock for the design and provide pin assignments for the signals of both
the example driver and testbench that the MegaCore variation generates.
Run the <variation_name>_pin_assignments.tcl script to add the input and output
termination, I/O standards, and DQ group assignments to the example design. To run
the pin assignment script, follow these steps:
1. On the Processing menu, point to Start, and click Start Analysis and Synthesis.
2. On the Tools menu click Tcl Scripts.
3. Specify the pin_assignments.tcl and click Run.
If the PLL input reference clock pin does not have the same I/O standard as the
memory interface I/Os, a no-fit might occur because incompatible I/O standards
cannot be placed in the same I/O bank.
If you are upgrading your memory IP from an earlier Quartus II version, follow these
steps:
■ For UniPHY IP, rerun the pin_assignments.tcl script in the later Quartus II revision.
■ For ALTMEMPHY IP, delete all the memory non-location I/O assignments and rerun the pin_assignments.tcl script.
Compiling the Design
After constraining your design, compile your design in the Quartus II software to
generate timing reports to verify whether timing has been met.
To compile the design, on the Processing menu, click Start Compilation.
After you have compiled the top-level file, you can perform RTL simulation or
program your targeted Altera device to verify the top-level file in hardware.
For more information about simulating the memory IP, refer to the Simulating Memory
IP chapter.
SOPC Builder Flow
You can use SOPC Builder to build a system that includes your customized IP core.
You can easily add other components and quickly create an SOPC Builder system.
SOPC Builder automatically generates HDL files that include all of the specified
components and interconnections. SOPC Builder defines default connections, which
you can modify. The HDL files are ready to be compiled by the Quartus II software to
produce output files for programming an Altera device.
Figure 8–3 shows a block diagram of an example SOPC Builder system.
Figure 8–3. SOPC Builder System
[Block diagram: an SOPC Builder system in which an Altera IP core instance and several peripherals (Peripheral 1, Peripheral 2, and Peripheral 3) are connected through the system interconnect fabric.]
For more information about the system interconnect fabric, refer to the System Interconnect
Fabric for Memory-Mapped Interfaces and System Interconnect Fabric for Streaming
Interfaces chapters in the SOPC Builder User Guide and to the Avalon Interface
Specifications.
For more information about SOPC Builder and the Quartus II software, refer to the
SOPC Builder Features and Building Systems with SOPC Builder sections in the SOPC
Builder User Guide and to Quartus II Help.
Specifying Parameters
To specify IP core parameters in the SOPC Builder flow, follow these steps:
1. Create a new Quartus II project using the New Project Wizard available from the
File menu.
2. On the Tools menu, click SOPC Builder.
3. For a new system, specify the system name and language.
4. On the System Contents tab, double-click the name of your IP core to add it to
your system. The relevant parameter editor appears.
5. Specify the required parameters in the parameter editor. For detailed explanations
of these parameters, refer to “Parameterizing Memory Controllers with
ALTMEMPHY IP” on page 8–38 and “Parameterizing Memory Controllers with
UniPHY IP” on page 8–57.
Some IP cores provide preset parameters for specific applications. If you
wish to use preset parameters, click the arrow to expand the Presets list,
select the desired preset, and then click Apply. To modify preset settings, in
a text editor modify the <installation directory>/ip/altera/alt_mem_if/
alt_mem_if_interfaces/alt_mem_if_<memory_protocol>_emif/
alt_mem_if_<memory_protocol>_mem_model.qprs.
You must also turn on Generate SOPC Builder compatible resets on the
Controller Settings tab when parameterizing those cores.
6. Click Finish to complete the IP core instance and add it to the system.
The Finish button may be unavailable until all parameterization errors
listed in the messages window are corrected.
Completing the SOPC Builder System
To complete the SOPC Builder system, follow these steps:
1. Add and parameterize any additional components. Some IP cores include a
complete SOPC Builder system design example.
2. Use the Connection panel on the System Contents tab to connect the components.
3. By default, clock names are not displayed. To display clock names in the Module
Name column and the clocks in the Clock column in the System Contents tab,
click Filters to display the Filters dialog box. In the Filter list, click All.
4. Click Generate to generate the system. SOPC Builder generates the system and
produces the <system name>.qip that contains the assignments and information
required to process the IP core or system in the Quartus II Compiler.
5. In the Quartus II software, click Add/Remove Files in Project and add the .qip to
the project.
6. Compile your design in the Quartus II software.
Qsys System Integration Tool Design Flow
You can use the Qsys system integration tool to build a system that includes your
customized IP core. You can easily add other components and quickly create a Qsys
system. Qsys automatically generates HDL files that include all of the specified
components and interconnections. In Qsys, you specify the connections you want.
The HDL files are ready to be compiled by the Quartus II software to produce output
files for programming an Altera device. Qsys generates Verilog HDL simulation
models for the IP cores that comprise your system.
Figure 8–4 shows a high level block diagram of an example Qsys system.
Figure 8–4. Example Qsys System
[Block diagram: an example Qsys system implementing a PCIe-to-Ethernet bridge. A PCI Express subsystem and an Ethernet subsystem each contain memory master and CSR interfaces, an embedded controller coordinates the system, and a DDR3 SDRAM controller with its PHY presents a memory slave interface and connects the system to external DDR3 SDRAM.]
For more information about the Qsys system interconnect, refer to the Qsys
Interconnect chapter in volume 1 of the Quartus II Handbook and to the Avalon Interface
Specifications.
For more information about the Qsys tool and the Quartus II software, refer to the
System Design with Qsys section in volume 1 of the Quartus II Handbook and to Quartus
II Help.
Specify Parameters
To specify parameters for your IP core using the Qsys flow, follow these steps:
1. Create a new Quartus II project using the New Project Wizard available from the
File menu.
2. On the Tools menu, click Qsys.
3. In the Component Library window, double-click the name of your IP core to add
it to your system. The relevant parameter editor appears.
Specify the required parameters in all tabs in the Qsys tool. For detailed
explanations of these parameters, refer to "Parameterizing Memory
Controllers with ALTMEMPHY IP" on page 8–38 and "Parameterizing
Memory Controllers with UniPHY IP" on page 8–57.
If your design includes external memory interface IP cores, you must turn
on Generate power-of-2 bus widths for SOPC Builder on the Controller
Settings tab when parameterizing those cores.
Some IP cores provide preset parameters for specific applications. If you
wish to use preset parameters, click the arrow to expand the Presets list,
select the desired preset, and then click Apply. To modify preset settings, in
a text editor modify the <installation directory>/ip/altera/alt_mem_if/
alt_mem_if_interfaces/alt_mem_if_<memory_protocol>_emif/
alt_mem_if_<memory_protocol>_mem_model.qprs.
4. Click Finish to complete the IP core instance and add it to the system.
The Finish button may be unavailable until all parameterization errors
listed in the messages window are corrected.
Complete the Qsys System
To complete the Qsys system, follow these steps:
1. Add and parameterize any additional components.
2. Connect the components using the Connection panel on the System Contents tab.
3. In the Export column, enter the name of any connections that should be a top-level
Qsys system port.
4. If you intend to simulate your Qsys system, on the Generation tab, set Create
testbench Qsys system to either Standard, BFMs for standard Avalon interfaces,
to create a testbench with bus functional models (BFMs) attached to all exported
interfaces, or Simple, BFMs for clocks and resets, to create a testbench with BFMs
driving only clocks and reset interfaces.
5. To generate a simulation model for the testbench Qsys system at the same time, set
Create testbench simulation model to Verilog or VHDL. Set this option to None
to view or modify the generated testbench system before generating its simulation
model.
6. If your system is not part of a Quartus II project and you want to generate
synthesis register transfer language (RTL) or high-level hardware description
language (HDL) files, turn on Create HDL design files for synthesis.
7. Click Generate to generate the system. Qsys generates the system and produces
the <system name>.qip that contains the assignments and information required to
process the IP core or system in the Quartus II Compiler.
8. In the Quartus II software, click Add/Remove Files in Project and add the .qip to
the project.
9. Compile your project in the Quartus II software.
To ensure that the memory and oct interfaces are exported to the top-level RTL file, be
careful not to accidentally rename or delete either of these interfaces in the Export
column of the System Contents tab.
Qsys and SOPC Builder Interfaces
Table 8–1 and Table 8–2 list the DDR2 and DDR3 SDRAM with UniPHY signals
available for each interface in Qsys and SOPC Builder and provide a description and
guidance on how to connect those interfaces.
Table 8–1. DDR2 SDRAM Controller with UniPHY Interfaces
(Format: interface name, followed by its signals, the interface type, and a description of how to connect it.)

pll_ref_clk interface
  pll_ref_clk (Clock input): PLL reference clock input.

global_reset interface
  global_reset_n (Reset input): Asynchronous global reset for the PLL and all logic in the PHY.

soft_reset interface
  soft_reset_n (Reset input): Asynchronous reset input. Resets the PHY, but not the PLL that the PHY uses.

afi_reset interface
  afi_reset_n (Reset output, PLL master/no sharing): When the interface is in PLL master or no sharing modes, this interface is an asynchronous reset output of the AFI interface. The controller asserts this interface when the PLL loses lock or the PHY is reset.

afi_reset_in interface
  afi_reset_n (Reset input, PLL slave): When the interface is in PLL slave mode, this interface is a reset input that you must connect to the afi_reset output of an identically configured memory interface in PLL master mode.

afi_clk interface
  afi_clk (Clock output, PLL master/no sharing): This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_clk_in interface
  afi_clk (Clock input, PLL slave): This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL slave mode, you must connect this afi_clk input to the afi_clk output of an identically configured memory interface in PLL master mode.

afi_half_clk interface
  afi_half_clk (Clock output, PLL master/no sharing): The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_half_clk_in interface
  afi_half_clk (Clock input, PLL slave): The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL slave mode, this is a clock input that you must connect to the afi_half_clk output of an identically configured memory interface in PLL master mode.

memory interface
  mem_a, mem_ba, mem_ck, mem_ck_n, mem_cke, mem_cs_n, mem_dm, mem_ras_n, mem_cas_n, mem_we_n, mem_dq, mem_dqs, mem_dqs_n, mem_odt, mem_ac_parity, mem_err_out_n, mem_parity_error_n (Conduit): Interface signals between the PHY and the memory device.

avl interface
  avl_ready, avl_burst_begin, avl_addr, avl_rdata_valid, avl_rdata, avl_wdata, avl_be, avl_read_req, avl_write_req, avl_size (Avalon-MM Slave): Avalon-MM interface signals between the memory interface and user logic.

status interface
  local_init_done, local_cal_success, local_cal_fail (Conduit): Memory interface status signals.

oct interface
  rup (Stratix® III/IV, Arria® II GZ), rdn (Stratix III/IV, Arria II GZ), rzq (Stratix V) (Conduit): OCT reference resistor pins for rup/rdn or rzqin.

local_powerdown interface
  local_powerdn_ack (Conduit): This powerdown interface for the controller is enabled only when you turn on Enable Auto Powerdown.

pll_sharing interface
  pll_mem_clk, pll_write_clk, pll_addr_cmd_clk, pll_locked, pll_avl_clk, pll_config_clk, pll_hr_clk, pll_p2c_read_clk, pll_c2p_write_clk, pll_dr_clk (Conduit): Interface signals for PLL sharing, to connect PLL masters to PLL slaves. This interface is enabled only when you set PLL sharing mode to master or slave.

dll_sharing interface
  dll_delayctrl (Conduit): DLL sharing interface for connecting DLL masters to DLL slaves. This interface is enabled only when you set DLL sharing mode to master or slave.
External Memory Interface Handbook
Volume 2: Design Guidelines
November 2011 Altera Corporation
Chapter 8: Implementing and Parameterizing Memory IP
Qsys and SOPC Builder Interfaces
8–15
Table 8–1. DDR2 SDRAM Controller with UniPHY Interfaces (Part 4 of 5)
Signals in Interface
Interface Type
Description/How to Connect
oct_sharing interface
seriesterminationcontrol
parallelterminationcontrol
Conduit
OCT sharing interface for connecting OCT
masters to OCT slaves. This interface is
enabled only when you set OCT sharing mode
to master or slave.
hcx_dll_reconfig interface
dll_offset_ctrl_addnsub
dll_offset_ctrl_offset
dll_offset_ctrl_addnsub (1)
dll_offset_ctrl_offset (1)
Conduit
This DLL reconfiguration interface is enabled
only when you turn on HardCopy
Compatibility Mode.
You can connect this interface to user-created
custom logic to enable DLL reconfiguration.
dll_offset_ctrl_offsetctrlout (1)
dll_offset_ctrl_b_offsetctrlout (1)
hcx_pll_reconfig interface
configupdate
phasecounterselect
phasestep
phaseupdown
scanclk
Conduit
scanclkena
This PLL reconfiguration interface is enabled
only when you turn on HardCopy
Compatibility Mode.
You can connect this interface to user-created
custom logic to enable PLL reconfiguration.
scandata
phasedone
scandataout
scandone
hcx_rom_reconfig interface
hc_rom_config_clock
This ROM loader interface is enabled only
when you turn on HardCopy Compatibility
Mode.
hc_rom_conig_datain
hc_rom_config_rom_data_ready
hc_rom_config_init
Conduit
You can connect this interface to user-created
custom logic to control the loading of the
sequencer ROM.
hc_rom_config_init_busy
hc_rom_config_rom_rden
hc_rom_config_rom_address
autoprecharge_req interface
local_autopch_req
Conduit
Precharge interface for connection to a
custom control block. This interface is
enabled only when you turn on
Auto-precharge Control.
Conduit
User refresh interface for connection to a
custom control block. This interface is
enabled only when you turn on User AutoRefresh Control.
user_refresh interface
local_refresh_req
local_refresh_chip
local_refresh_ack
November 2011
Altera Corporation
External Memory Interface Handbook
Volume 2: Design Guidelines
8–16
Chapter 8: Implementing and Parameterizing Memory IP
Qsys and SOPC Builder Interfaces
Table 8–1. DDR2 SDRAM Controller with UniPHY Interfaces (Part 5 of 5)
Signals in Interface
Interface Type
Description/How to Connect
self_refresh interface
local_self_rfsh_req
local_self_rfsh_chip
Conduit
Self refresh interface for connection to a
custom control block. This interface is
enabled only when you turn on Self-refresh
Control.
Conduit
ECC interrupt signal for connection to a
custom control block. This interface is
enabled only when you turn on Error
Detection and Correction Logic.
Avalon-MM Slave
Configuration and status register signals for
the memory interface, for connection to an
Avalon_MM master. This interface is enabled
only when you turn on Configuration and
Status Register.
local_self_rfsh_ack
ecc_interrupt interface
ecc_interrupt
csr interface
csr_write_req
csr_read_req
csr_waitrequest
csr_addr
csr_be
csr_wdata
csr_rdata
csr_rdata_valid
Note to Table 8–1:
(1) Signals available only in DLL master mode.
Table 8–2. DDR3 SDRAM Controller with UniPHY Interfaces

pll_ref_clk interface (Clock input)
Signals: pll_ref_clk
PLL reference clock input.

global_reset interface (Reset input)
Signals: global_reset_n
Asynchronous global reset for PLL and all logic in PHY.

soft_reset interface (Reset input)
Signals: soft_reset_n
Asynchronous reset input. Resets the PHY, but not the PLL that the PHY uses.

afi_reset interface (Reset output, PLL master/no sharing)
Signals: afi_reset_n
When the interface is in PLL master or no sharing modes, this interface is an asynchronous reset output of the AFI interface. This interface is asserted when the PLL loses lock or the PHY is reset.

afi_reset_in interface (Reset input, PLL slave)
Signals: afi_reset_n
When the interface is in PLL slave mode, this interface is a reset input that you must connect to the afi_reset output of an identically configured memory interface in PLL master mode.

afi_clk interface (Clock output, PLL master/no sharing)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_clk_in interface (Clock input, PLL slave)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL slave mode, this is a clock input that you must connect to the afi_clk output of an identically configured memory interface in PLL master mode.

afi_half_clk interface (Clock output, PLL master/no sharing)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_half_clk_in interface (Clock input, PLL slave)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL slave mode, you must connect this afi_half_clk input to the afi_half_clk output of an identically configured memory interface in PLL master mode.

memory interface (Conduit)
Signals: mem_a, mem_ba, mem_ck, mem_ck_n, mem_cke, mem_cs_n, mem_dm, mem_ras_n, mem_cas_n, mem_we_n, mem_dq, mem_dqs, mem_dqs_n, mem_odt, mem_reset_n, mem_ac_parity, mem_err_out_n, mem_parity_error_n
Interface signals between the PHY and the memory device.

avl interface (Avalon-MM Slave)
Signals: avl_ready, avl_burst_begin, avl_addr, avl_rdata_valid, avl_rdata, avl_wdata, avl_be, avl_read_req, avl_write_req, avl_size
Avalon-MM interface signals between the memory interface and user logic.

status interface (Conduit)
Signals: local_init_done, local_cal_success, local_cal_fail
Memory interface status signals.

oct interface (Conduit)
Signals: rup (Stratix III/IV, Arria II GZ), rdn (Stratix III/IV, Arria II GZ), rzq (Stratix V)
OCT reference resistor pins for rup/rdn or rzqin.

local_powerdown interface (Conduit)
Signals: local_powerdn_ack
This powerdown interface for the controller is enabled only when you turn on Enable Auto Power Down.

pll_sharing interface (Conduit)
Signals: pll_mem_clk, pll_write_clk, pll_addr_cmd_clk, pll_locked, pll_avl_clk, pll_config_clk, pll_hr_clk, pll_p2c_read_clk, pll_c2p_write_clk, pll_dr_clk
Interface signals for PLL sharing, to connect PLL masters to PLL slaves. This interface is enabled only when you set PLL sharing mode to master or slave.

dll_sharing interface (Conduit)
Signals: dll_delayctrl
DLL sharing interface for connecting DLL masters to DLL slaves. This interface is enabled only when you set DLL sharing mode to master or slave.

oct_sharing interface (Conduit)
Signals: seriesterminationcontrol, parallelterminationcontrol
OCT sharing interface for connecting OCT masters to OCT slaves. This interface is enabled only when you set OCT sharing mode to master or slave.

hcx_dll_reconfig interface (Conduit)
Signals: dll_offset_ctrl_addnsub, dll_offset_ctrl_offset, dll_offset_ctrl_addnsub (1), dll_offset_ctrl_offset (1), dll_offset_ctrl_offsetctrlout (1), dll_offset_ctrl_b_offsetctrlout (1)
This DLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable DLL reconfiguration.

hcx_pll_reconfig interface (Conduit)
Signals: configupdate, phasecounterselect, phasestep, phaseupdown, scanclk, scanclkena, scandata, phasedone, scandataout, scandone
This PLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable PLL reconfiguration.

hcx_rom_reconfig interface (Conduit)
Signals: hc_rom_config_clock, hc_rom_config_datain, hc_rom_config_rom_data_ready, hc_rom_config_init, hc_rom_config_init_busy, hc_rom_config_rom_rden, hc_rom_config_rom_address
This ROM loader interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to control loading of the sequencer ROM.

autoprecharge_req interface (Conduit)
Signals: local_autopch_req
Precharge interface for connection to a custom control block. This interface is enabled only when you turn on Auto-precharge Control.

user_refresh interface (Conduit)
Signals: local_refresh_req, local_refresh_chip, local_refresh_ack
User refresh interface for connection to a custom control block. This interface is enabled only when you turn on User Auto-Refresh Control.

self_refresh interface (Conduit)
Signals: local_self_rfsh_req, local_self_rfsh_chip, local_self_rfsh_ack
Self refresh interface for connection to a custom control block. This interface is enabled only when you turn on Self-refresh Control.

ecc_interrupt interface (Conduit)
Signals: ecc_interrupt
ECC interrupt signal for connection to a custom control block. This interface is enabled only when you turn on Error Detection and Correction Logic.

csr interface (Avalon-MM Slave)
Signals: csr_write_req, csr_read_req, csr_waitrequest, csr_addr, csr_be, csr_wdata, csr_rdata, csr_rdata_valid
Configuration and status register signals for the memory interface, for connection to an Avalon-MM master. This interface is enabled only when you turn on Configuration and Status Register.

Note to Table 8–2:
(1) Signals available only in DLL master mode.
Table 8–3 lists the QDR II and QDR II+ SRAM signals available for each interface in
Qsys and SOPC Builder and provides a description and guidance on how to connect
those interfaces.
Table 8–3. QDR II and QDR II+ SRAM Controller with UniPHY Interfaces

pll_ref_clk interface (Clock input)
Signals: pll_ref_clk
PLL reference clock input.

global_reset interface (Reset input)
Signals: global_reset_n
Asynchronous global reset for PLL and all logic in PHY.

soft_reset interface (Reset input)
Signals: soft_reset_n
Asynchronous reset input. Resets the PHY, but not the PLL that the PHY uses.

afi_reset interface (Reset output, PLL master/no sharing)
Signals: afi_reset_n
When the interface is in PLL master or no sharing modes, this interface is an asynchronous reset output of the AFI interface. This interface is asserted when the PLL loses lock or the PHY is reset.

afi_reset_in interface (Reset input, PLL slave)
Signals: afi_reset_n
When the interface is in PLL slave mode, this interface is a reset input that you must connect to the afi_reset output of an identically configured memory interface in PLL master mode.

afi_clk interface (Clock output, PLL master/no sharing)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_clk_in interface (Clock input, PLL slave)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL slave mode, this is a clock input that you must connect to the afi_clk output of an identically configured memory interface in PLL master mode.

afi_half_clk interface (Clock output, PLL master/no sharing)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_half_clk_in interface (Clock input, PLL slave)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL slave mode, you must connect this afi_half_clk input to the afi_half_clk output of an identically configured memory interface in PLL master mode.

memory interface (Conduit)
Signals: mem_a, mem_cqn, mem_bws_n, mem_cq, mem_d, mem_k, mem_k_n, mem_q, mem_wps_n, mem_rps_n, mem_doff_n
Interface signals between the PHY and the memory device.

avl_r interface (Avalon-MM Slave)
Signals: avl_r_read_req, avl_r_ready, avl_r_addr, avl_r_size, avl_r_rdata_valid, avl_r_rdata
Avalon-MM interface between memory interface and user logic for read requests.

avl_w interface (Avalon-MM Slave)
Signals: avl_w_write_req, avl_w_ready, avl_w_addr, avl_w_size, avl_w_wdata, avl_w_be
Avalon-MM interface between memory interface and user logic for write requests.

status interface (Conduit)
Signals: local_init_done, local_cal_success, local_cal_fail
Memory interface status signals.

oct interface (Conduit)
Signals: rup (Stratix III/IV, Arria II GZ, Arria II GX), rdn (Stratix III/IV, Arria II GZ, Arria II GX), rzq (Stratix V)
OCT reference resistor pins for rup/rdn or rzqin.

pll_sharing interface (Conduit)
Signals: pll_mem_clk, pll_write_clk, pll_addr_cmd_clk, pll_locked, pll_avl_clk, pll_config_clk, pll_hr_clk, pll_p2c_read_clk, pll_c2p_write_clk, pll_dr_clk
Interface signals for PLL sharing, to connect PLL masters to PLL slaves. This interface is enabled only when you set PLL sharing mode to master or slave.

dll_sharing interface (Conduit)
Signals: dll_delayctrl
DLL sharing interface for connecting DLL masters to DLL slaves. This interface is enabled only when you set DLL sharing mode to master or slave.

oct_sharing interface (Conduit)
Signals: seriesterminationcontrol (Stratix III/IV/V, Arria II GZ), parallelterminationcontrol (Stratix III/IV/V, Arria II GZ), terminationcontrol (Arria II GX)
OCT sharing interface for connecting OCT masters to OCT slaves. This interface is enabled only when you set OCT sharing mode to master or slave.

hcx_dll_reconfig interface (Conduit)
Signals: dll_offset_ctrl_addnsub, dll_offset_ctrl_offset, dll_offset_ctrl_addnsub (1), dll_offset_ctrl_offset (1), dll_offset_ctrl_offsetctrlout (1), dll_offset_ctrl_b_offsetctrlout (1)
This DLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable DLL reconfiguration.

hcx_pll_reconfig interface (Conduit)
Signals: configupdate, phasecounterselect, phasestep, phaseupdown, scanclk, scanclkena, scandata, phasedone, scandataout, scandone
This PLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable PLL reconfiguration.

hcx_rom_reconfig interface (Conduit)
Signals: hc_rom_config_clock, hc_rom_config_datain, hc_rom_config_rom_data_ready, hc_rom_config_init, hc_rom_config_init_busy, hc_rom_config_rom_rden, hc_rom_config_rom_address
This ROM loader interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to control loading of the sequencer ROM.

Note to Table 8–3:
(1) Signals available only in DLL master mode.
Table 8–4 lists the RLDRAM II signals available for each interface in Qsys and SOPC
Builder and provides a description and guidance on how to connect those interfaces.
Table 8–4. RLDRAM II Controller with UniPHY Interfaces

pll_ref_clk interface (Clock input)
Signals: pll_ref_clk
PLL reference clock input.

global_reset interface (Reset input)
Signals: global_reset_n
Asynchronous global reset for PLL and all logic in PHY.

soft_reset interface (Reset input)
Signals: soft_reset_n
Asynchronous reset input. Resets the PHY, but not the PLL that the PHY uses.

afi_reset interface (Reset output, PLL master/no sharing)
Signals: afi_reset_n
When the interface is in PLL master or no sharing modes, this interface is an asynchronous reset output of the AFI interface. This interface is asserted when the PLL loses lock or the PHY is reset.

afi_reset_in interface (Reset input, PLL slave)
Signals: afi_reset_n
When the interface is in PLL slave mode, this interface is a reset input that you must connect to the afi_reset output of an identically configured memory interface in PLL master mode.

afi_clk interface (Clock output, PLL master/no sharing)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_clk_in interface (Clock input, PLL slave)
Signals: afi_clk
This AFI interface clock can be a full-rate or half-rate memory clock frequency based on the memory interface parameterization. When the interface is in PLL slave mode, you must connect this afi_clk input to the afi_clk output of an identically configured memory interface in PLL master mode.

afi_half_clk interface (Clock output, PLL master/no sharing)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL master or no sharing modes, this interface is a clock output.

afi_half_clk_in interface (Clock input, PLL slave)
Signals: afi_half_clk
The AFI half clock that is half the frequency of afi_clk. When the interface is in PLL slave mode, you must connect this afi_half_clk input to the afi_half_clk output of an identically configured memory interface in PLL master mode.

memory interface (Conduit)
Signals: mem_a, mem_ba, mem_ck, mem_ck_n, mem_cs_n, mem_dk, mem_dk_n, mem_dm, mem_dq, mem_qk, mem_qk_n, mem_ref_n, mem_we_n
Interface signals between the PHY and the memory device.

avl interface (Avalon-MM Slave)
Signals: avl_size, avl_wdata, avl_rdata_valid, avl_rdata, avl_ready, avl_write_req, avl_read_req, avl_addr
Avalon-MM interface between memory interface and user logic.

status interface (Conduit)
Signals: local_init_done, local_cal_success, local_cal_fail
Memory interface status signals.

oct interface (Conduit)
Signals: rup (Stratix III/IV, Arria II GZ), rdn (Stratix III/IV, Arria II GZ), rzq (Stratix V)
OCT reference resistor pins for rup/rdn or rzqin.

pll_sharing interface (Conduit)
Signals: pll_mem_clk, pll_write_clk, pll_addr_cmd_clk, pll_locked, pll_avl_clk, pll_config_clk, pll_hr_clk, pll_p2c_read_clk, pll_c2p_write_clk, pll_dr_clk
Interface signals for PLL sharing, to connect PLL masters to PLL slaves. This interface is enabled only when you set PLL sharing mode to master or slave.

dll_sharing interface (Conduit)
Signals: dll_delayctrl
DLL sharing interface for connecting DLL masters to DLL slaves. This interface is enabled only when you set DLL sharing mode to master or slave.

oct_sharing interface (Conduit)
Signals: seriesterminationcontrol, parallelterminationcontrol
OCT sharing interface for connecting OCT masters to OCT slaves. This interface is enabled only when you set OCT sharing mode to master or slave.

hcx_dll_reconfig interface (Conduit)
Signals: dll_offset_ctrl_addnsub, dll_offset_ctrl_offset, dll_offset_ctrl_addnsub (1), dll_offset_ctrl_offset (1), dll_offset_ctrl_offsetctrlout (1), dll_offset_ctrl_b_offsetctrlout (1)
This DLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable DLL reconfiguration.

hcx_pll_reconfig interface (Conduit)
Signals: configupdate, phasecounterselect, phasestep, phaseupdown, scanclk, scanclkena, scandata, phasedone, scandataout, scandone
This PLL reconfiguration interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to enable PLL reconfiguration.

hcx_rom_reconfig interface (Conduit)
Signals: hc_rom_config_clock, hc_rom_config_datain, hc_rom_config_rom_data_ready, hc_rom_config_init, hc_rom_config_init_busy, hc_rom_config_rom_rden, hc_rom_config_rom_address
This ROM loader interface is enabled only when you turn on HardCopy Compatibility Mode. You can connect this interface to user-created custom logic to control loading of the sequencer ROM.

parity_error_interrupt interface (Conduit)
Signals: parity_error
Parity error interrupt conduit for connection to a custom control block. This interface is enabled only if you turn on Enable Error Detection Parity.

user_refresh interface (Conduit)
Signals: ref_req, ref_ba, ref_ack
User refresh interface for connection to a custom control block. This interface is enabled only if you turn on Enable User Refresh.

reserved interface (Conduit)
Signals: reserved
Reserved interface required for certain pin configurations when you select the Nios® II-based sequencer.

Note to Table 8–4:
(1) Signals available only in DLL master mode.
Generated Files
When you complete the IP generation flow, the generated files are placed in your
project directory. The directory structure varies somewhat, depending on the
tool you use to parameterize and generate the IP.
The PLL parameters are statically defined in <variation_name>_parameters.tcl at
generation time. To ensure that timing constraints and timing reports remain correct,
whenever you edit the PLL parameters, apply the same changes to the PLL parameters in
this file.
The following sections list the generated files for the ALTMEMPHY and UniPHY IP.
Generated Files for Memory Controllers with the ALTMEMPHY IP
Table 8–5 lists the ALTMEMPHY generated directory and key files using the
MegaWizard Plug-In Manager.
Table 8–5. ALTMEMPHY Generated Files
File Name
Description
alt_mem_phy_defines.v
Contains constants used in the interface. This file is
always in Verilog HDL regardless of the language you
chose in the MegaWizard Plug-In Manager.
<variation_name>.ppf
Pin planner file for your ALTMEMPHY variation.
<variation_name>.qip
Quartus II IP file for your ALTMEMPHY variation,
containing the files associated with this megafunction.
<variation_name>.v/.vhd
Top-level file of your ALTMEMPHY variation, generated
based on the language you chose in the MegaWizard
Plug-In Manager.
<variation_name>.vho
Contains functional simulation model for VHDL only.
<variation_name>_alt_mem_phy_seq_wrapper.vo/.vho
A wrapper file, for simulation only, that calls the
sequencer file, created based on the language you
chose in the MegaWizard Plug-In Manager.
<variation_name>.html
Lists the top-level files created and ports used in the
megafunction.
<variation_name>_alt_mem_phy_seq_wrapper.v/.vhd
A wrapper file, for compilation only, that calls the
sequencer file, created based on the language you
chose in the MegaWizard Plug-In Manager.
<variation_name>_alt_mem_phy_seq.vhd
Contains the sequencer used during calibration. This
file is always in VHDL language regardless of the
language you chose in the MegaWizard Plug-In
Manager.
<variation_name>_alt_mem_phy.v
Contains all modules of the ALTMEMPHY variation
except for the sequencer. This file is always in Verilog
HDL language regardless of the language you chose in
the MegaWizard Plug-In Manager. The
<variation_name>_alt_mem_phy_seq.vhd includes
the DDR3 SDRAM sequencer.
<variation_name>_alt_mem_phy_pll_<device>.ppf
This XML file describes the MegaCore pin attributes to
the Quartus II Pin Planner.
<variation_name>_alt_mem_phy_pll.v/.vhd
The PLL megafunction file for your ALTMEMPHY
variation, generated based on the language you chose
in the MegaWizard Plug-In Manager.
<variation_name>_alt_mem_phy_delay.vhd
Includes a delay module for simulation. This file is only
generated if you choose VHDL as the language of your
MegaWizard Plug-In Manager output files.
<variation_name>_alt_mem_phy_dq_dqs.vhd or .v
Generated file that contains DQ/DQS I/O atoms
interconnects and instance. Only generated when
targeting Arria II GX devices.
<variation_name>_alt_mem_phy_dq_dqs_clearbox.txt
Specification file that generates the
<variation_name>_alt_mem_phy_dq_dqs file using
the clearbox flow. Only generated when targeting
Arria II GX devices.
<variation_name>_alt_mem_phy_pll.qip
Quartus II IP file for the PLL that your ALTMEMPHY
variation uses that contains the files associated with
this megafunction.
<variation_name>_alt_mem_phy_pll_bb.v/.cmp
Black box file for the PLL used in your ALTMEMPHY
variation. Typically unused.
<variation_name>_alt_mem_phy_reconfig.qip
Quartus II IP file for the PLL reconfiguration block.
Only generated when targeting Arria GX, HardCopy® II,
Stratix II, and Stratix II GX devices.
<variation_name>_alt_mem_phy_reconfig.v/.vhd
PLL reconfiguration block module. Only generated
when targeting Arria GX, HardCopy II, Stratix II, and
Stratix II GX devices.
<variation_name>_alt_mem_phy_reconfig_bb.v/cmp
Black box file for the PLL reconfiguration block. Only
generated when targeting Arria GX, HardCopy II,
Stratix II, and Stratix II GX devices.
<variation_name>_bb.v/.cmp
Black box file for your ALTMEMPHY variation,
depending whether you are using Verilog HDL or VHDL
language.
<variation_name>_ddr_pins.tcl
Contains procedures used in the
<variation_name>_ddr_timing.sdc and
<variation_name>_report_timing.tcl files.
<variation_name>_pin_assignments.tcl
Contains I/O standard, drive strength, output enable
grouping, DQ/DQS grouping, and termination
assignments for your ALTMEMPHY variation. If your
top-level design pin names do not match the default
pin names or a prefixed version, edit the assignments
in this file.
<variation_name>_ddr_timing.sdc
Contains timing constraints for your ALTMEMPHY
variation.
<variation_name>_report_timing.tcl
Script that reports timing for your ALTMEMPHY
variation during compilation.
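To illustrate the <variation_name>_pin_assignments.tcl entry above: the generated script applies per-pin Quartus II assignments of the kind sketched below, and you can run it, for example, by sourcing it from the Quartus II Tcl console with your project open. The sketch is an assumption for a DDR2 interface; the pin name mem_dq[0], the SSTL-18 Class I standard, and the termination value are illustrative only, and the script generated for your variation is the authoritative source.

  # Illustrative per-pin assignments of the kind the generated script applies
  set_instance_assignment -name IO_STANDARD "SSTL-18 CLASS I" -to {mem_dq[0]}
  set_instance_assignment -name OUTPUT_TERMINATION "SERIES 50 OHM WITH CALIBRATION" -to {mem_dq[0]}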
Table 8–6 lists the modules that are instantiated in the
<variation_name>_alt_mem_phy.v/.vhd file. A particular ALTMEMPHY variation
might not use all of the modules, depending on the memory standard that you
specify.
Table 8–6. Modules in <variation_name>_alt_mem_phy.v File

<variation_name>_alt_mem_phy_addr_cmd (All ALTMEMPHY variations)
Generates the address and command structures.

<variation_name>_alt_mem_phy_clk_reset (All ALTMEMPHY variations)
Instantiates PLL, DLL, and reset logic.

<variation_name>_alt_mem_phy_dp_io (All ALTMEMPHY variations)
Generates the DQ, DQS, DM, and QVLD I/O pins.

<variation_name>_alt_mem_phy_mimic (DDR2/DDR SDRAM ALTMEMPHY variations)
Creates the VT tracking mechanism for DDR and DDR2 SDRAM PHY IPs.

<variation_name>_alt_mem_phy_oct_delay (DDR2/DDR SDRAM ALTMEMPHY variations when dynamic OCT is enabled)
Generates the proper delay and duration for the OCT signals.

<variation_name>_alt_mem_phy_postamble (DDR2/DDR SDRAM ALTMEMPHY variations)
Generates the postamble enable and disable scheme for DDR and DDR2 SDRAM PHY IPs.

<variation_name>_alt_mem_phy_read_dp (All ALTMEMPHY variations; unused for Stratix III or Stratix IV devices)
Takes read data from the I/O through a read path FIFO buffer, to transition from the resynchronization clock to the PHY clock.

<variation_name>_alt_mem_phy_read_dp_group (DDR2/DDR SDRAM ALTMEMPHY variations; Stratix III and Stratix IV devices only)
A per-DQS-group version of <variation_name>_alt_mem_phy_read_dp.

<variation_name>_alt_mem_phy_rdata_valid (DDR2/DDR SDRAM ALTMEMPHY variations)
Generates the read data valid signal to the sequencer and controller.

<variation_name>_alt_mem_phy_seq_wrapper (All ALTMEMPHY variations)
Generates the sequencer for DDR and DDR2 SDRAM.

<variation_name>_alt_mem_phy_write_dp (All ALTMEMPHY variations)
Generates the demultiplexing of data from half-rate to full-rate DDR data.

<variation_name>_alt_mem_phy_write_dp_fr (DDR2/DDR SDRAM ALTMEMPHY variations)
A full-rate version of <variation_name>_alt_mem_phy_write_dp.
Table 8–7 lists the additional files generated by the HPC II controller that may be
present in your project directory.
Table 8–7. Controller-Generated Files
Filename
Description
alt_mem_ddrx_addr_cmd.v
Decodes internal protocol-related signals into memory address and
command signals.
alt_mem_ddrx_addr_cmd_wrap.v
A wrapper that instantiates the alt_mem_ddrx_addr_cmd.v file.
alt_mem_ddrx_ddr2_odt_gen.v
Generates the on-die termination (ODT) control signal for DDR2
memory interfaces.
alt_mem_ddrx_ddr3_odt_gen.v
Generates the ODT control signal for DDR3 memory interfaces.
alt_mem_ddrx_odt_gen.v
Wrapper that instantiates alt_mem_ddrx_ddr2_odt_gen.v and
alt_mem_ddrx_ddr3_odt_gen.v. This file also controls the ODT
addressing scheme.
alt_mem_ddrx_rdwr_data_tmg.v
Decodes internal data burst related signals to memory data signals.
alt_mem_ddrx_arbiter.v
Contains logic that determines which command to execute based on
certain schemes.
alt_mem_ddrx_burst_gen.v
Converts internal DRAM-aware commands to AFI signals.
alt_mem_ddrx_cmd_gen.v
Converts user requests to DRAM-aware commands.
alt_mem_ddrx_csr.v
Contains configuration registers.
alt_mem_ddrx_buffer.v
Contains buffer for local data.
alt_mem_ddrx_buffer_manager.v
Manages the allocation of buffers.
alt_mem_ddrx_burst_tracking.v
Tracks data received per local burst command.
alt_mem_ddrx_dataid_manager.v
Manages the IDs associated with data stored in buffer.
alt_mem_ddrx_fifo.v
Contains the FIFO buffer to store local data to create a link; is also used
in rdata_path to store the read address and error address.
alt_mem_ddrx_list.v
Tracks the DRAM commands associated with the data stored
internally.
alt_mem_ddrx_rdata_path.v
Contains read data path logic.
alt_mem_ddrx_wdata_path.v
Contains write data path logic.
alt_mem_ddrx_define.v
Defines common parameters used in the RTL files.
alt_mem_ddrx_ecc_decoder.v
Instantiates appropriate width of the ECC decoder logic.
alt_mem_ddrx_ecc_decoder_32_syn.v
Contains synthesizable 32-bit version of the ECC decoder.
alt_mem_ddrx_ecc_decoder_64_syn.v
Contains synthesizable 64-bit version of the ECC decoder.
alt_mem_ddrx_ecc_encoder.v
Instantiates appropriate width of the ECC encoder logic.
alt_mem_ddrx_ecc_encoder_32_syn.v
Contains synthesizable 32-bit version of the ECC encoder.
alt_mem_ddrx_ecc_encoder_64_syn.v
Contains synthesizable 64-bit version of the ECC encoder.
alt_mem_ddrx_ecc_encoder_decoder_wrapper.v
Wrapper that instantiates all ECC logic.
alt_mem_ddrx_input_if.v
Contains local input interface logic.
alt_mem_ddrx_mm_st_converter.v
Contains supporting logic for Avalon-MM interface.
alt_mem_ddrx_rank_timer.v
Contains a timer associated with rank timing.
alt_mem_ddrx_sideband.v
Contains supporting logic for user-controlled refresh and precharge
signals.
alt_mem_ddrx_tbp.v
Contains command queue and associated logic for reordering
features.
alt_mem_ddrx_timing_param.v
Contains timer logic associated with non-rank timing.
alt_mem_ddrx_controller_st_top.v
Wrapper that instantiates all submodules and configuration registers.
alt_mem_ddrx_controller_top.v
Wrapper that contains memory controller with Avalon-MM interface.
alt_mem_ddrx_controller.v
Wrapper that instantiates all submodules.
Generated Files for Memory Controllers with the UniPHY IP
Table 8–8 lists the generated directory structure and key files created with the
MegaWizard Plug-In Manager, SOPC Builder, and Qsys.
Table 8–8. Generated Directory Structure and Key Files

MegaWizard Plug-In Manager

Synthesis files:
In <working_dir>/:
<variation_name>.qip: Quartus II IP file which refers to all generated files in the synthesis fileset. Include this file in your Quartus II project.
<variation_name>.v or <variation_name>.vhd: Top-level wrapper synthesis files. .v is IEEE Encrypted Verilog; .vhd is generated VHDL.
In <working_dir>/<variation_name>/:
<variation_name>_0002.v: UniPHY top-level wrapper.
*.v, *.sv, *.tcl, *.sdc, *.ppf: RTL and constraints files for synthesis.
<variation_name>_p0_pin_assignments.tcl: Pin constraints script to be run after synthesis.

Simulation files:
In <working_dir>/<variation_name>_sim/:
<variation_name>.v: Top-level wrapper simulation files for both Verilog and VHDL.
In <working_dir>/<variation_name>_sim/<subcomponent_module>/:
*.v, *.sv, *.vhd, *.vho, *.hex, *.mif: RTL and constraints files for simulation. .v and .sv files are IEEE Encrypted Verilog; .vhd and .vho are generated VHDL.

MegaWizard Plug-In Manager—Example Design Fileset

Synthesis files:
In <variation_name>_example_design/example_project/:
<variation_name>_example.qip: Quartus II IP file that refers to all generated files in the synthesizable project.
<variation_name>_example.qpf: Quartus II project for synthesis flow.
<variation_name>_example.qsf: Quartus II project for synthesis flow.
In <variation_name>_example_design/example_project/<variation_name>_example/:
<variation_name>_example.v: Top-level wrapper.
In <variation_name>_example_design/example_project/<variation_name>_example/submodules/:
*.v, *.sv, *.tcl, *.sdc, *.ppf: RTL and constraints files.
<variation_name>_example_if0_p0_pin_assignments.tcl: Pin constraints script to be run after synthesis. _if0 and _p0 are instance names; for more information, refer to Table 8–9.

Simulation files:
In <variation_name>_example_design/simulation/:
generate_sim_verilog_example_design.tcl: Run this file to generate the Verilog simulation example design.
generate_sim_vhdl_example_design.tcl: Run this file to generate the VHDL simulation example design.
README.txt: A text file with instructions about how to generate and run the simulation example design.
In <variation_name>_example_design/simulation/verilog/mentor/:
run.do: ModelSim script to simulate the generated Verilog example design.
In <variation_name>_example_design/simulation/vhdl/mentor/:
run.do: ModelSim script to simulate the generated VHDL example design.
In <variation_name>_example_design/simulation/verilog/<variation_name>_sim/:
<variation_name>_example_sim.v: Top-level wrapper (testbench) for Verilog.
In <variation_name>_example_design/simulation/vhdl/<variation_name>_sim/:
<variation_name>_example_sim.vhd: Top-level wrapper (testbench) for VHDL.
In <variation_name>_example_design/simulation/<variation_name>_sim/verilog/submodules/:
*.v, *.sv, *.hex, *.mif: RTL and ROM data for Verilog.
In <variation_name>_example_design/simulation/<variation_name>_sim/vhdl/submodules/:
*.vhd, *.vho, *.hex, *.mif: RTL and ROM data for VHDL.

SOPC Builder

In <working_dir>/:
<system_name>.qip: Quartus II IP file that refers to all the generated files in the SOPC Builder project.
<system_name>.v: System top-level RTL.
<module_name>.v: Module wrapper RTL.
In <working_dir>/<module_name>/:
*.v, *.sv, *.tcl, *.sdc, *.ppf: Subdirectory of RTL and constraints for each system module.

Qsys

In <working_dir>/<system_name>/synthesis/:
<system_name>.qip: Quartus II IP file that refers to all the generated files in the synthesis fileset.
<system_name>.v: System top-level RTL for synthesis.
In <working_dir>/<system_name>/simulation/:
<system_name>.v or <variation_name>.vhd: System top-level RTL for simulation. The .v file is IEEE Encrypted Verilog; the .vhd file is generated VHDL.
In <working_dir>/<system_name>/synthesis/submodules/:
*.v, *.sv, *.tcl, *.sdc, *.ppf: RTL and constraints files for synthesis.
In <working_dir>/<system_name>/simulation/submodules/:
*.v, *.sv, *.hex, *.mif: RTL and ROM data for simulation.
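As a usage note for the simulation files listed in Table 8–8: the README.txt in the simulation directory describes the exact invocation, but a typical flow (assumed here, not quoted from that file) is to generate the example design with the provided Tcl script and then run the ModelSim script from the mentor subdirectory:

  quartus_sh -t generate_sim_verilog_example_design.tcl
  vsim -do run.do

Run the first command from the <variation_name>_example_design/simulation/ directory and the second from the generated verilog/mentor/ directory.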
Table 8–9 lists the prefixes or instance names of submodule files within the memory
interface IP. These instances are concatenated to form unique synthesis and
simulation filenames; for example, the <variation_name>_example_if0_p0_pin_assignments.tcl
file in Table 8–8 combines the memory interface (_if0) and PHY (_p0) instance names.
Table 8–9. Prefixes of Submodule Files
Prefixes
Description
_c0
Specifies the controller.
_d0
Specifies the driver or traffic generator.
_dll0
Specifies the DLL.
_e0
Specifies the example design.
_if0
Specifies the memory interface.
_m0
Specifies the AFI mux.
_oct0
Specifies the OCT.
_p0
Specifies the PHY.
_pll0
Specifies the PLL.
_s0
Specifies the sequencer.
_t0
Specifies the traffic generator status checker.
Parameterizing Memory Controllers with ALTMEMPHY IP
This section describes the parameters you can set for the DDR, DDR2, and DDR3
SDRAM memory controllers with the ALTMEMPHY IP.
The Parameter Settings page in the ALTMEMPHY parameter editor allows you to
parameterize the following settings:
■ Memory Settings
■ PHY Settings
■ Board Settings
The text window at the bottom of the MegaWizard Plug-In Manager displays
information about the memory interface, warnings, and errors if you are trying to
create something that is not supported. The Finish button is disabled until you correct
all the errors indicated in this window.
The following sections describe the four tabs of the Parameter Settings page in more
detail.
Memory Settings
Use this tab to apply the memory parameters from your memory manufacturer’s data
sheet.
Table 8–10 describes the General Settings available on the Memory Settings page of
the ALTMEMPHY parameter editor.
Table 8–10. General Settings

Device family
Targets device family (for example, Arria II GX). The device family selected here must match the device family selected on page 2a of the parameter editor. For more information about selecting a device family, refer to the "Device Family Selection" section in the Selecting your FPGA Device chapter of the External Memory Interface Handbook.

Speed grade
Selects a particular speed grade of the device (for example, 2, 3, or 4 for the Arria II GX device family).

PLL reference clock frequency
Determines the clock frequency of the external input clock to the PLL. Ensure that you use three decimal points if the frequency is not a round number (for example, 166.667 MHz or 100 MHz) to avoid a functional simulation or a PLL locking problem.

Memory clock frequency
Determines the memory interface clock frequency. If you are operating a memory device below its maximum achievable frequency, ensure that you enter the actual frequency of operation rather than the maximum frequency achievable by the memory device. Also, ensure that you use three decimal points if the frequency is not a round number (for example, 333.333 MHz or 400 MHz) to avoid a functional simulation or a PLL locking issue.

Controller data rate
Selects the data rate for the memory controller. Sets the frequency of the controller equal to either the memory interface frequency (full-rate) or half of the memory interface frequency (half-rate). The full-rate option is not available for DDR3 SDRAM devices.

Enable half rate bridge
This option is only available for the HPC II full-rate controller. Turn on to keep the controller in the full-rate memory clock domain while allowing the local side to run at half the memory clock speed, so that latency can be reduced.

Local interface clock frequency
Value that depends on the memory clock frequency and controller data rate.

Local interface width
Value that depends on the memory clock frequency and controller data rate.
When targeting a HardCopy device migration with performance improvement, the
ALTMEMPHY IP should target the mid speed grade to ensure that the PLL and the
PHY sequencer settings match. The compilation of the design can be executed in the
faster speed grade.
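As a worked illustration of how the derived Local interface clock frequency and Local interface width values in Table 8–10 follow from the Memory clock frequency and Controller data rate settings (the relationships below are stated as an assumption to confirm in the parameter editor for your configuration):

  Half rate:  local clock = memory clock / 2;  local data width = 4 × memory DQ width
  Full rate:  local clock = memory clock;      local data width = 2 × memory DQ width

For example, a 64-bit DDR2 interface running at 333.333 MHz with a half-rate controller would show a 166.667 MHz local interface clock and a 256-bit local interface width.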
Show in ‘Memory Preset’ List
Table 8–11 describes the options available to filter the Memory Presets that are
displayed. This set of options is where you indicate whether you are creating a
datapath for DDR3 SDRAM.
Table 8–11. Show in ‘Memory Presets’ List
Parameter Name
Description
Memory type
You can filter the type of memory to display, for example, DDR3 SDRAM.
Memory vendor
You can filter the memory types by vendor. JEDEC is also one of the options, allowing you to
choose the JEDEC specifications. If your chosen vendor is not listed, you can choose JEDEC for the
DDR3 SDRAM interfaces. Then, pick a device that has similar specifications to your chosen device
and check the values of each parameter. Make sure you change each parameter value to match
your device specifications.
Memory format
You can filter the type of memory by format, for example, discrete devices or DIMM packages.
Maximum frequency
You can filter the type of memory by the maximum operating frequency.
Memory Presets
Pick a device in the Memory Presets list that is closest or the same as the actual
memory device that you are using. Then, click the Modify Parameters button to
parameterize the following settings in the Preset Editor dialog box:
■ Memory attributes—These are the settings that determine your system's number of DQ, DQ strobe (DQS), address, and memory clock pins.
■ Memory initialization options—These settings are stored in the memory mode registers as part of the initialization process.
■ Memory timing parameters—These are the parameters that create and time-constrain the PHY.
Even though the device you are using is listed in Memory Presets, ensure that the settings in the Preset Editor dialog box are accurate, as some parameters may have been updated in the memory device datasheets.
You can change the parameters with a white background to reflect your system. You
can also change the parameters with a gray background so the device parameters
match the device you are using. These parameters in gray background are
characteristics of the chosen memory device and changing them creates a new custom
memory preset. If you click Save As (at the bottom left of the page) and save the new
settings in the <quartus_install_dir>\quartus\common\ip\altera\altmemphy\lib\
directory, you can use this new memory preset in other Quartus II projects created in
the same version of the software.
When you click Save, the new memory preset appears at the bottom of the Memory
Presets list in the Memory Settings tab.
If you save the new settings in a directory other than the default directory, click Load
Preset in the Memory Settings tab to load the settings into the Memory Presets list.
The Advanced option shows the percentage of memory specification that is calibrated
by the FPGA. The percentage values are estimated by Altera based on the process
variation.
Preset Editor Settings for DDR and DDR2 SDRAM
Table 8–12 through Table 8–14 describe the DDR2 SDRAM parameters available for
memory attributes, initialization options, and timing parameters. DDR SDRAM has
the same parameters, but their value ranges differ from those of DDR2 SDRAM.
Table 8–12. DDR2 SDRAM Attributes Settings

Output clock pairs from FPGA (Range (1): 1–6; Units: pairs)
Defines the number of differential clock pairs driven from the FPGA to the memory. More clock pairs reduce the loading of each output when interfacing with multiple devices. Memory clock pins use the signal splitter feature in Arria II GX, Stratix III, and Stratix IV devices for differential signaling.

Total Memory chip selects (Range (1): 1, 2, 4, or 8; Units: bits)
Sets the number of chip selects in your memory interface. The number of chip selects defines the depth of your memory. You are limited to the range shown as the local side binary encodes the chip select address. You can set this value to the next higher number if the range does not meet your specifications. However, the highest address space of the ALTMEMPHY megafunction is not mapped to any of the actual memory address. The ALTMEMPHY megafunction works with multiple chip selects and calibrates against all chip select (mem_cs_n) signals.

Memory interface DQ width (Range (1): 4–288; Units: bits)
Defines the total number of DQ pins on the memory interface. If you are interfacing with multiple devices, multiply the number of devices with the number of DQ pins per device. Even though the GUI allows you to choose 288-bit DQ width, the interface data width is limited by the number of pins on the device. For best performance, have the whole interface on one side of the device.

Memory vendor (Range (1): JEDEC, Micron, Qimonda, Samsung, Hynix, Elpida, Nanya, other)
Lists the name of the memory vendor for all supported memory standards.

Memory format (Range (1): Discrete Device, Unbuffered DIMM, Registered DIMM)
Specifies whether you are interfacing with devices or modules. SODIMM is supported under unbuffered or registered DIMMs.

Maximum memory frequency (Range (1): see the memory device datasheet; Units: MHz)
Sets the maximum frequency supported by the memory.

Column address width (Range (1): 9–11; Units: bits)
Defines the number of column address bits for your interface.

Row address width (Range (1): 13–16; Units: bits)
Defines the number of row address bits for your interface.

Bank address width (Range (1): 2 or 3; Units: bits)
Defines the number of bank address bits for your interface.

Chip selects per DIMM (Range (1): 1 or 2; Units: bits)
Defines the number of chip selects on each DIMM in your interface.

DQ bits per DQS bit (Range (1): 4 or 8; Units: bits)
Defines the number of data (DQ) bits for each data strobe (DQS) pin.

Precharge address bit (Range (1): 8 or 10; Units: bits)
Selects the bit of the address bus to use as the precharge address bit.

Drive DM pins from FPGA (Range (1): Yes or No)
Specifies whether you are using DM pins for write operation. Altera devices do not support DM pins in ×4 mode.

Maximum memory frequency for CAS latency 3.0, 4.0, 5.0, and 6.0 (Range (1): 80–533; Units: MHz)
Specifies the frequency limits from the memory data sheet per given CAS latency. The ALTMEMPHY parameter editor generates a warning if the operating frequency with your chosen CAS latency exceeds this number.

Note to Table 8–12:
(1) The range values depend on the actual memory device used.
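As a worked example for the Memory interface DQ width parameter above (the device choice is an assumption, not a value from the table): interfacing with four ×16 DDR2 SDRAM components in parallel gives a DQ width of 4 × 16 = 64 bits.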
Table 8–13. DDR2 SDRAM Initialization Options
Parameter Name
Range
Units
Description
Sets the number of words read or written per transaction.
Memory burst length
Memory burst ordering
Memory burst length of four equates to local burst length
of one in half-rate designs and to local burst length of two
in full-rate designs.
4 or 8
beats
Sequential or
Interleaved
—
Controls the order in which data is transferred between
memory and the FPGA during a read transaction. For
more information, refer to the memory device datasheet.
Enable the DLL in the
memory devices
Yes or No
—
Enables the DLL in the memory device when set to Yes.
You must always enable the DLL in the memory device as
Altera does not guarantee any ALTMEMPHY operation
when the DLL is turned off. All timings from the memory
devices are invalid when the DLL is turned off.
Memory drive strength
setting
Normal or
Reduced
—
Controls the drive strength of the memory device’s output
buffers. Reduced drive strength is not supported on all
memory devices. The default option is normal.
Disabled, 50, 75,
150
W
Sets the memory ODT value. Not available in DDR
SDRAM interfaces.
3, 4, 5, 6
cycles
Memory ODT setting
Memory CAS latency setting
Table 8–14. DDR2 SDRAM Timing Parameter Settings
Parameter Name | Range (1) | Units | Description
tINIT | 0.001–1000 | µs | Minimum memory initialization time. After reset, the controller does not issue any commands to the memory during this period.
tMRD | 2–39 | ns | Minimum load mode register command period. The controller waits for this period of time after issuing a load mode register command before issuing any other commands. tMRD is specified in ns in the DDR2 SDRAM high-performance controller and in terms of tCK cycles in Micron's device datasheet. Convert tMRD to ns by multiplying the number of cycles specified in the datasheet by tCK, where tCK is the clock period of your memory interface and not the memory device's tCK.
tRAS | 8–200 | ns | Minimum active to precharge time. The controller waits for this period of time after issuing an active command before issuing a precharge command to the same bank.
tRCD | 4–65 | ns | Minimum active to read-write time. The controller does not issue read or write commands to a bank during this period of time after issuing an active command.
tRP | 4–65 | ns | Minimum precharge command period. The controller does not access the bank for this period of time after issuing a precharge command.
tREFI | 1–65534 | µs | Maximum interval between refresh commands. The controller performs regular refresh at this interval unless user-controlled refresh is turned on.
tRFC | 14–1651 | ns | Minimum autorefresh command period. The length of time the controller waits before doing anything else after issuing an auto-refresh command.
tWR | 4–65 | ns | Minimum write recovery time. The controller waits for this period of time after the end of a write transaction before issuing a precharge command.
tWTR | 1–3 | tCK | Minimum write-to-read command delay. The controller waits for this period of time after the end of a write command before issuing a subsequent read command to the same bank. This timing parameter is specified in clock cycles and the value is rounded off to the next integer.
tAC | 300–750 | ps | DQ output access time from CK/CK# signals.
tDQSCK | 100–750 | ps | DQS output access time from CK/CK# signals.
tDQSQ | 100–500 | ps | The maximum DQS to DQ skew; DQS to last DQ valid, per group, per access.
tDQSS | 0–0.3 | tCK | Positive DQS latching edge to associated clock edge.
tDS | 10–600 | ps | DQ and DM input setup time relative to DQS, which has a derated value depending on the slew rate of the DQS (for both DDR and DDR2 SDRAM interfaces) and whether DQS is single-ended or differential (for DDR2 SDRAM interfaces). Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(ac) min or VIL(ac) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tDH | 10–600 | ps | DQ and DM input hold time relative to DQS, which has a derated value depending on the slew rate of the DQS (for both DDR and DDR2 SDRAM interfaces) and whether DQS is single-ended or differential (for DDR2 SDRAM interfaces). Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(dc) min or VIL(dc) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tDSH | 0.1–0.5 | tCK | DQS falling edge hold time from CK.
tDSS | 0.1–0.5 | tCK | DQS falling edge to CK setup.
tIH | 100–1000 | ps | Address and control input hold time, which has a derated value depending on the slew rate of the CK and CK# clocks and the address and command signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(dc) min or VIL(dc) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tIS | 100–1000 | ps | Address and control input setup time, which has a derated value depending on the slew rate of the CK and CK# clocks and the address and command signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(ac) min or VIL(ac) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tQHS | 100–700 | ps | The maximum data hold skew factor.
tRRD | 2.06–64 | ns | The activate command to activate command time, per device; the RAS to RAS delay timing parameter.
tFAW | 7.69–256 | ns | The four-activate window time, per device.
tRTP | 2.06–64 | ns | Read to precharge time.
Note to Table 8–14:
(1) Refer to the memory device data sheet for the parameter range. Some of the parameters are listed in a clock cycle (tCK) unit. If the MegaWizard Plug-In Manager requires you to enter the value in a time unit (ps or ns), convert the number by multiplying it with the clock period of your interface (and not the maximum clock period listed in the memory data sheet).
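As a concrete illustration of note (1), the following is a minimal sketch of the cycle-to-time conversion; the 266.67-MHz interface frequency is an assumed example value, not a figure from this handbook.

```python
# Minimal sketch of the cycle-to-time conversion described in note (1):
# multiply the number of tCK cycles from the memory data sheet by the clock
# period of YOUR interface. The 266.67 MHz figure is an assumed example only.

def cycles_to_ns(num_cycles, interface_freq_mhz):
    """Convert a data-sheet value given in tCK cycles to nanoseconds."""
    tck_ns = 1000.0 / interface_freq_mhz  # interface clock period in ns
    return num_cycles * tck_ns

# Example: tMRD = 2 tCK on a 266.67 MHz interface is entered as 7.5 ns.
print(round(cycles_to_ns(2, 266.67), 2))
```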
Preset Editor Settings for DDR3 SDRAM
Table 8–15 through Table 8–17 describe the DDR3 SDRAM parameters available for
memory attributes, initialization options, and timing parameters.
Table 8–15. DDR3 SDRAM Attributes Settings
Parameter Name | Range (1) | Units | Description
Output clock pairs from FPGA | 1–6 | pairs | Defines the number of differential clock pairs driven from the FPGA to the memory. Memory clock pins use the signal splitter feature in Arria II GX devices for differential signaling. The ALTMEMPHY parameter editor displays an error on the bottom of the window if you choose more than one for DDR3 SDRAM interfaces.
Total memory chip selects | 1, 2, 4, or 8 | bits | Sets the number of chip selects in your memory interface. The number of chip selects defines the depth of your memory. You are limited to the range shown as the local side binary encodes the chip select address.
Memory interface DQ width | 4–288 | bits | Defines the total number of DQ pins on the memory interface. If you are interfacing with multiple devices, multiply the number of devices with the number of DQ pins per device. Even though the GUI allows you to choose 288-bit DQ width, DDR3 SDRAM variations are only supported up to 80-bit width due to restrictions in the board layout, which affect timing at higher data widths. Furthermore, the interface data width is limited by the number of pins on the device. For best performance, have the whole interface on one side of the device.
Mirror addressing | — | — | On multiple-rank DDR3 SDRAM DIMMs, address signals are routed differently to each rank; this is referred to in the JEDEC specification as address mirroring. Enter ranks with mirrored addresses in this field. There is one bit per chip select. For example, for four chip selects, enter 1011 to mirror the address on chip select #3, #1, and #0.
Memory vendor | Elpida, JEDEC, Micron, Samsung, Hynix, Nanya, other | — | Lists the name of the memory vendor for all supported memory standards.
Memory format | Discrete Device | — | Arria II GX devices only support DDR3 SDRAM components without leveling, for example, the Discrete Device memory format.
Maximum memory frequency | See the memory device datasheet | MHz | Sets the maximum frequency supported by the memory.
Column address width | 10–12 | bits | Defines the number of column address bits for your interface.
Row address width | 12–16 | bits | Defines the number of row address bits for your interface. If your DDR3 SDRAM device's row address bus is 12 bits wide, set the row address width to 13 and set the 13th bit to logic-level low (or leave the 13th bit unconnected to the memory device) in the top-level file.
Bank address width | 3 | bits | Defines the number of bank address bits for your interface.
Chip selects per device | 1 or 2 | bits | Defines the number of chip selects on each device in your interface. Currently, calibration is done with all ranks, but you can only perform timing analysis with one.
DQ bits per DQS bit | 4 or 8 | bits | Defines the number of data (DQ) bits for each data strobe (DQS) pin.
Drive DM pins from FPGA | Yes or No | — | Specifies whether you are using DM pins for write operation. Altera devices do not support DM pins with ×4 mode.
Maximum memory frequency for CAS latency 5.0, 6.0, 7.0, 8.0, 9.0, and 10.0 | 80–700 | MHz | Specifies the frequency limits from the memory data sheet per given CAS latency. The ALTMEMPHY MegaWizard Plug-In Manager generates a warning if the operating frequency with your chosen CAS latency exceeds this number. The lowest frequency supported by DDR3 SDRAM devices is 300 MHz.
Note to Table 8–15:
(1) The range values depend on the actual memory device used.
Table 8–16. DDR3 SDRAM Initialization Options
Parameter Name | Range | Units | Description
Memory burst length | 4, 8, on-the-fly | beats | Sets the number of words read or written per transaction.
Memory burst ordering | Sequential or Interleaved | — | Controls the order in which data is transferred between memory and the FPGA during a read transaction. For more information, refer to the memory device datasheet.
DLL precharge power down | Fast exit or Slow exit | — | Sets the mode register setting to disable (Slow exit) or enable (Fast exit) the memory DLL when CKE is disabled.
Enable the DLL in the memory devices | Yes or No | — | Enables the DLL in the memory device when set to Yes. You must always enable the DLL in the memory device, as Altera does not guarantee any ALTMEMPHY operation when the DLL is turned off. All timings from the memory devices are invalid when the DLL is turned off.
ODT Rtt nominal value | ODT disable, RZQ/4, RZQ/2, RZQ/6 | Ω | RZQ in DDR3 SDRAM interfaces is set to 240 Ω. Sets the on-die termination (ODT) value to either 60 Ω (RZQ/4), 120 Ω (RZQ/2), or 40 Ω (RZQ/6). Set this to ODT disable if you are not planning to use ODT. For a single-ranked DIMM, set this to RZQ/4.
Dynamic ODT (Rtt_WR) value | Dynamic ODT off, RZQ/4, RZQ/2 | Ω | RZQ in DDR3 SDRAM interfaces is set to 240 Ω. Sets the memory ODT value during write operations to 60 Ω (RZQ/4) or 120 Ω (RZQ/2). As ALTMEMPHY only supports single-rank DIMMs, you do not need this option (set to Dynamic ODT off).
Output driver impedance | RZQ/6 (Reserved) or RZQ/7 | Ω | RZQ in DDR3 SDRAM interfaces is set to 240 Ω. Sets the output driver impedance from the memory device. Some devices may not have RZQ/6 available as an option. Be sure to check the memory device datasheet before choosing this option.
Memory CAS latency setting | 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 | cycles | Sets the delay in clock cycles from the read command to the first output data from the memory.
Memory additive CAS latency setting | Disable, CL – 1, CL – 2 | cycles | Allows you to add extra latency in addition to the CAS latency setting.
Memory write CAS latency setting (CWL) | 5.0, 6.0, 7.0, 8.0 | cycles | Sets the delay in clock cycles from the write command to the first expected data to the memory.
Memory partial array self refresh | Full array; Half array {BA[2:0]=000,001,010,011}; Quarter array {BA[2:0]=000,001}; Eighth array {BA[2:0]=000}; Three Quarters array {BA[2:0]=010,011,100,101,110,111}; Half array {BA[2:0]=100,101,110,111}; Quarter array {BA[2:0]=110,111}; Eighth array {BA[2:0]=111} | — | Determines whether you want to self-refresh only certain arrays instead of the full array. According to the DDR3 SDRAM specification, data located in the array beyond the specified address range are lost if self refresh is entered when you use this option. This option is not supported by the DDR3 SDRAM Controller with ALTMEMPHY IP, so set it to Full Array if you are using the Altera controller.
Memory auto self refresh method | Manual SR reference (SRT) or ASR enable (Optional) | — | Sets the auto self-refresh method for the memory device. The DDR3 SDRAM Controller with ALTMEMPHY IP currently does not support the ASR option that you need for extended temperature memory self-refresh.
Memory self refresh range | Normal or Extended | — | Determines the temperature range for self refresh. You also need to use the optional auto self refresh option when using this option. The Altera controller currently does not support the extended temperature self-refresh operation.
Table 8–17. DDR3 SDRAM Timing Parameter Settings
Parameter Name | Range (1) | Units | Description
Time to hold memory reset before beginning calibration | 0–1000000 | µs | Minimum time to hold the reset after a power cycle before issuing the MRS commands during the DDR3 SDRAM device initialization process.
tINIT | 0.001–1000 | µs | Minimum memory initialization time. After reset, the controller does not issue any commands to the memory during this period.
tMRD | 2–39 | ns | Minimum load mode register command period. The controller waits for this period of time after issuing a load mode register command before issuing any other commands. tMRD is specified in ns in the DDR3 SDRAM high-performance controller and in terms of tCK cycles in Micron's device datasheet. Convert tMRD to ns by multiplying the number of cycles specified in the datasheet by tCK, where tCK is the clock period of your memory interface and not the memory device's tCK.
tRAS | 8–200 | ns | Minimum active to precharge time. The controller waits for this period of time after issuing an active command before issuing a precharge command to the same bank.
tRCD | 4–65 | ns | Minimum active to read-write time. The controller does not issue read or write commands to a bank during this period of time after issuing an active command.
tRP | 4–65 | ns | Minimum precharge command period. The controller does not access the bank for this period of time after issuing a precharge command.
tREFI | 1–65534 | µs | Maximum interval between refresh commands. The controller performs regular refresh at this interval unless user-controlled refresh is turned on.
tRFC | 14–1651 | ns | Minimum autorefresh command period. The length of time the controller waits before doing anything else after issuing an auto-refresh command.
tWR | 4–65 | ns | Minimum write recovery time. The controller waits for this period of time after the end of a write transaction before issuing a precharge command.
tWTR | 1–6 | tCK | Minimum write-to-read command delay. The controller waits for this period of time after the end of a write command before issuing a subsequent read command to the same bank. This timing parameter is specified in clock cycles and the value is rounded off to the next integer.
tAC | 0–750 | ps | DQ output access time.
tDQSCK | 50–750 | ps | DQS output access time from CK/CK# signals.
tDQSQ | 50–500 | ps | The maximum DQS to DQ skew; DQS to last DQ valid, per group, per access.
tDQSS | 0–0.3 | tCK | Positive DQS latching edge to associated clock edge.
tDS | 10–600 | ps | DQ and DM input setup time relative to DQS, which has a derated value depending on the slew rate of the differential DQS signals and DQ/DM signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(ac) min or VIL(ac) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tDH | 10–600 | ps | DQ and DM input hold time relative to DQS, which has a derated value depending on the slew rate of the differential DQS and DQ/DM signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(dc) min or VIL(dc) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tDSH | 0.1–0.5 | tCK | DQS falling edge hold time from CK.
tDSS | 0.1–0.5 | tCK | DQS falling edge to CK setup.
tIH | 50–1000 | ps | Address and control input hold time, which has a derated value depending on the slew rate of the CK and CK# clocks and the address and command signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(dc) min or VIL(dc) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tIS | 65–1000 | ps | Address and control input setup time, which has a derated value depending on the slew rate of the CK and CK# clocks and the address and command signals. Ensure that you are using the correct number and that the value entered is referenced to VREF(dc), not VIH(ac) min or VIL(ac) max. Refer to "Derating Memory Setup and Hold Timing" on page 8–50 for more information about how to derate this specification.
tQHS | 0–700 | ps | The maximum data hold skew factor.
tQH | 0.1–0.6 | tCK | DQ output hold time.
tRRD | 2.06–64 | ns | The activate to activate time, per device; the RAS to RAS delay timing parameter.
tFAW | 7.69–256 | ns | The four-activate window time, per device.
tRTP | 2.06–64 | ns | Read to precharge time.
Note to Table 8–17:
(1) Refer to the memory device data sheet for the parameter range. Some of the parameters are listed in a clock cycle (tCK) unit. If the MegaWizard Plug-In Manager requires you to enter the value in a time unit (ps or ns), convert the number by multiplying it with the clock period of your interface (and not the maximum clock period listed in the memory data sheet).
Derating Memory Setup and Hold Timing
Because the base setup and hold time specifications from the memory device datasheet assume input slew rates that may not be true for Altera devices, derate and update the following memory device specifications in the Preset Editor dialog box:
■ tDS
■ tDH
■ tIH
■ tIS
For Arria II GX and Stratix IV devices (excluding DDR SDRAM), you need not derate using the Preset Editor. You only need to enter the parameters referenced to VREF, and the derating is done automatically when you enter the slew rate information on the Board Settings tab.
After derating the values, you then need to normalize the derated values because Altera input and output timing specifications are referenced to VREF. However, JEDEC base setup time specifications are referenced to VIH/VIL AC levels, and JEDEC base hold time specifications are referenced to VIH/VIL DC levels.
When the memory device setup and hold time numbers are derated and normalized to VREF, update these values in the Preset Editor dialog box to ensure that your timing constraints are correct.
Example 8–1. Derating DDR2 SDRAM
For example, according to JEDEC, 400-MHz DDR2 SDRAM has the following specifications, assuming a 1V/ns DQ slew rate rising signal and a 2V/ns differential slew rate:
■ Base tDS = 50
■ Base tDH = 125
■ VIH(ac) = VREF + 0.2 V
■ VIH(dc) = VREF + 0.125 V
■ VIL(ac) = VREF – 0.2 V
■ VIL(dc) = VREF – 0.125 V
JEDEC lists two different sets of base and derating numbers for the tDS and tDH specifications, depending on whether you are using single-ended or differential DQS signaling, for any DDR2 SDRAM components with a maximum frequency up to 267 MHz. In addition, the VIL(ac) and VIH(ac) values may also be different for those devices.
The VREF referenced setup and hold values for a rising edge are:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 50 + 0 + 200 = 250 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 125 + 0 + 67.5 = 192.5 ps
If the output slew rate of the write data is different from 1V/ns, you have to first derate the tDS and tDH values, then translate these AC/DC level specifications to the VREF specification.
For a 2V/ns DQ slew rate rising signal and 2V/ns DQS-DQSn slew rate:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 25 + 100 + 100 = 225 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 100 + 45 + 62.5 = 207.5 ps
For a 0.5V/ns DQ slew rate rising signal and 1V/ns DQS-DQSn slew rate:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 25 + 0 + 400 = 425 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 100 – 65 + 250 = 285 ps
A similar approach can be taken for address/command slew rate derating. For tIS/tIH, the slew rate used in the derating equations is the address/command slew rate; for tDS/tDH, the DQ slew rate is used.
Example 8–2. Derating DDR3 SDRAM
For example, according to JEDEC, 533-MHz DDR3 SDRAM has the following specifications, assuming a 1V/ns DQ slew rate rising signal and a 2V/ns DQS-DQSn slew rate:
■ Base tDS = 25
■ Base tDH = 100
■ VIH(ac) = VREF + 0.175 V
■ VIH(dc) = VREF + 0.100 V
■ VIL(ac) = VREF – 0.175 V
■ VIL(dc) = VREF – 0.100 V
The VREF referenced setup and hold values for a rising edge are:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 25 + 0 + 175 = 200 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 100 + 0 + 100 = 200 ps
If the output slew rate of the write data is different from 1V/ns, you have to first derate the tDS and tDH values, then translate these AC/DC level specifications to the VREF specification.
For a 2V/ns DQ slew rate rising signal and 2V/ns DQS-DQSn slew rate:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 25 + 88 + 87.5 = 200.5 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 100 + 50 + 50 = 200 ps
For a 0.5V/ns DQ slew rate rising signal and 1V/ns DQS-DQSn slew rate:
tDS (VREF) = Base tDS + delta tDS + (VIH(ac) – VREF)/slew_rate = 25 + 5 + 350 = 380 ps
tDH (VREF) = Base tDH + delta tDH + (VIH(dc) – VREF)/slew_rate = 100 + 10 + 200 = 310 ps
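The arithmetic in Example 8–1 and Example 8–2 can be captured in a few lines. The following is a minimal sketch, not part of the ALTMEMPHY flow: it simply evaluates Base + delta + (VIH – VREF)/slew_rate, and the delta terms must still be looked up in the JEDEC derating tables for your actual slew rates.

```python
# Minimal sketch of the VREF-referenced derating formula used in Examples 8-1
# and 8-2. The delta terms come from the JEDEC derating tables (not computed
# here); the numbers below reproduce the nominal DDR3 case of Example 8-2.

def derate_to_vref(base_ps, delta_ps, level_offset_v, slew_rate_v_per_ns):
    """Base + delta + (VIH(ac/dc) - VREF) / slew_rate, returned in ps."""
    return base_ps + delta_ps + (level_offset_v / slew_rate_v_per_ns) * 1000.0

# DDR3, 1 V/ns DQ slew rate, zero delta: tDS(VREF) = 200 ps, tDH(VREF) = 200 ps
tds = derate_to_vref(base_ps=25, delta_ps=0, level_offset_v=0.175,
                     slew_rate_v_per_ns=1.0)
tdh = derate_to_vref(base_ps=100, delta_ps=0, level_offset_v=0.100,
                     slew_rate_v_per_ns=1.0)
print(tds, tdh)
```

Enter the resulting VREF-referenced values (here 200 ps each) in the Preset Editor dialog box.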
PHY Settings
Click Next or the PHY Settings tab to set the options described in Table 8–18. The
options are available if they apply to the target Altera device.
Table 8–18. ALTMEMPHY PHY Settings
Parameter Name | Applicable Device Families (DDR/DDR2 SDRAM) | Applicable Device Families (DDR3 SDRAM) | Description
Use dedicated PLL outputs to drive memory clocks | HardCopy II and Stratix II (prototyping for HardCopy II) | Not supported | Turn on to use dedicated PLL outputs to generate the external memory clocks, which is required for HardCopy II ASICs and their Stratix II FPGA prototypes. When turned off, the DDIO output registers generate the clock outputs.
Dedicated memory clock phase | HardCopy II and Stratix II (prototyping for HardCopy II) | Not supported | The required phase shift to align the CK/CK# signals with the DQS/DQS# signals when using dedicated PLL outputs to drive memory clocks. When you use the DDIO output registers for the memory clock, both the memory clock and the DQS signals are well aligned and easily meet the tDQSS specification. However, when the dedicated clock outputs are used for the memory clock, the memory clock and the DQS signals are not aligned properly and require a positive phase offset from the PLL to align the signals.
Use differential DQS | Arria II GX, Stratix III, and Stratix IV | Not supported | Enable this feature for better signal integrity. Recommended for operation at 333 MHz or higher. An option for DDR2 SDRAM only, as DDR SDRAM does not support differential DQS.
Enable external access to reconfigure PLL prior to calibration | HardCopy II, Stratix II, Stratix III, and Stratix IV (prototyping for HardCopy II) | HardCopy II | This option allows you to reconfigure the PLL before calibration to adjust, if necessary, the phase of the memory clock (mem_clk_2x) before the start of the calibration of the resynchronization clock on the read side. The calibration of the resynchronization clock on the read side depends on the phase of the memory clock on the write side. When enabling this option for HardCopy II, Stratix II, Stratix III, and Stratix IV devices, the inputs to the ALTPLL_RECONFIG megafunction are brought to the top level for debugging purposes.
Instantiate DLL externally | All supported device families, except for Cyclone® III devices | All supported device families | Use this option with Stratix III, Stratix IV, HardCopy III, or HardCopy IV devices if you want to apply a non-standard phase shift to the DQS capture clock. The ALTMEMPHY DLL offsetting I/O can then be connected to the external DLL and the Offset Control Block. As Cyclone III devices do not have DLLs, this feature is not supported.
Enable dynamic parallel on-chip termination | Stratix III and Stratix IV | Not supported | This option provides I/O impedance matching and termination capabilities. The ALTMEMPHY megafunction enables parallel termination during reads and series termination during writes with this option checked. Only applicable for DDR and DDR2 SDRAM interfaces where DQ and DQS are bidirectional. Using the dynamic termination requires that you use the OCT calibration block, which may impose a restriction on your DQS/DQ pin placements depending on your RUP/RDN pin locations. Although DDR SDRAM does not support ODT, dynamic OCT is still supported in Altera FPGAs. For more information, refer to the External Memory Interfaces in Stratix III Devices chapter in volume 1 of the Stratix III Device Handbook or the External Memory Interfaces in Stratix IV Devices chapter in volume 1 of the Stratix IV Device Handbook.
Clock phase | Arria II GX, Arria GX, Cyclone III, HardCopy II, Stratix II, and Stratix II GX | Arria II GX | Adjusting the address and command phase can improve the address and command setup and hold margins at the memory device to compensate for the propagation delays that vary with different loadings. You have a choice of 0°, 90°, 180°, and 270°, based on the rising and falling edges of the phy_clk and write_clk signals. In Stratix IV and Stratix III devices, the clock phase is set to dedicated.
Dedicated clock phase | Stratix III and Stratix IV | Not supported | When you use a dedicated PLL output for address and command, you can choose any legal PLL phase shift to improve setup and hold for the address and command signals. You can set this value to between 180° and 359°; the default is 240°. However, PHY timing generally requires a value of greater than 240° for half-rate designs and 270° for full-rate designs.
Board skew | All supported device families except Arria II GX and Stratix IV devices | Not supported | Maximum skew across any two memory interface signals for the whole interface from the FPGA to the memory (either a discrete memory device or a DIMM). This parameter includes all types of signals (data, strobe, clock, address, and command signals). You need to input the worst-case skew, whether it is within a DQS/DQ group, across all groups, or across the address, command, and clock signals. This parameter generates the timing constraints in the .sdc.
Autocalibration simulation options | All supported device families | All supported device families | Choose between Full Calibration (long simulation time), Quick Calibration, or Skip Calibration. For more information, refer to the "Simulation Options" section in the Simulating Memory IP chapter.
Board Settings
Click Next or the Board Settings tab to set the options described in Table 8–19. The board settings parameters model the board-level effects in the timing analysis. The options are available if you choose an Arria II GX or Stratix IV device for your interface; otherwise, the options are disabled. The options are also disabled for all devices using DDR SDRAM.
Table 8–19. ALTMEMPHY Board Settings
Parameter Name | Units | Description
Number of slots/discrete devices | — | Sets the single-rank or multi-rank configuration.
CK/CK# slew rate (differential) | V/ns | Sets the differential slew rate for the CK and CK# signals.
Addr/command slew rate | V/ns | Sets the slew rate for the address and command signals.
DQ/DQS# slew rate (differential) | V/ns | Sets the differential slew rate for the DQ and DQS# signals.
DQ slew rate | V/ns | Sets the slew rate for the DQ signals.
Addr/command eye reduction (setup) | ns | Sets the reduction in the eye diagram on the setup side due to the ISI on the address and command signals.
Addr/command eye reduction (hold) | ns | Sets the reduction in the eye diagram on the hold side due to the ISI on the address and command signals.
DQ eye reduction | ns | Sets the total reduction in the eye diagram on the setup side due to the ISI on the DQ signals.
Delta DQS arrival time | ns | Sets the increase of variation on the range of arrival times of DQS due to ISI.
Max skew between DIMMs/devices | ns | Sets the largest skew or propagation delay on the DQ signals between ranks, especially true for DIMMs in different slots. This value affects the Resynchronization margin for the DDR2 interfaces in multi-rank configurations for both DIMMs and devices.
Max skew within DQS group | ns | Sets the largest skew between the DQ pins in a DQS group. This value affects the Read Capture and Write margins for the DDR2 interfaces in all configurations (single- or multi-rank, DIMM or device).
Max skew between DQS groups | ns | Sets the largest skew between DQS signals in different DQS groups. This value affects the Resynchronization margin for the DDR2 interfaces in both single- and multi-rank configurations.
Addr/command to CK skew | ns | Sets the skew or propagation delay between the CK signal and the address and command signals. Positive values represent address and command signals that are longer than the CK signals; negative values represent address and command signals that are shorter than the CK signals. The Quartus II software uses this skew to optimize the delay of the address/command signals so that they have appropriate setup and hold margins for the DDR2 interfaces.
Controller Settings
This section describes parameters for the High Performance Controller II (HPC II) with the advanced features introduced in version 11.0, for designs generated in version 11.0. Designs created in earlier versions and regenerated in version 11.0 do not inherit the new advanced features; for information on parameters for HPC II without the version 11.0 advanced features, refer to the External Memory Interface Handbook for Quartus II version 10.1, available on the Literature: External Memory Interfaces page of the Altera website.
Table 8–20 lists the options provided in the Controller Settings tab.
Table 8–20. Controller Settings
Parameter | Description
Controller architecture | Specifies the controller architecture.
Enable self-refresh controls | Turn on to enable the controller to allow you to control when to place the external memory device in self-refresh mode. Refer to the "User-Controlled Self-Refresh" section in the Functional Description—HPC II Controller chapter of the External Memory Interface Handbook.
Enable power down controls | Turn on to enable the controller to allow you to control when to place the external memory device in power-down mode.
Enable auto power down | Turn on to enable the controller to automatically place the external memory device in power-down mode after a specified number of idle controller clock cycles is observed in the controller. You can specify the number of idle cycles after which the controller powers down the memory in the Auto Power Down Cycles field. Refer to the "Automatic Power-Down with Programmable Time-Out" section in the Functional Description—HPC II Controller chapter of the External Memory Interface Handbook.
Auto power down cycles | Determines the desired number of idle controller clock cycles before the controller places the external memory device in power-down mode. The legal range is 1 to 65,535. The auto power-down mode is disabled if you set the value to 0 clock cycles.
Enable user auto-refresh controls | Turn on to enable the controller to allow you to issue a single refresh.
Enable auto-precharge control | Turn on to enable the auto-precharge control on the controller top level. Asserting the auto-precharge control signal while requesting a read or write burst allows you to specify whether or not the controller should close (auto-precharge) the currently open page at the end of the read or write burst.
Enable reordering | Turn on to allow the controller to perform command and data reordering to achieve the highest efficiency.
Starvation limit for each command | Specifies the number of commands that can be served before a waiting command is served. The legal range is from 1 to 63.
Local-to-memory address mapping | Allows you to control the mapping between the address bits on the Avalon interface and the chip, row, bank, and column bits on the memory interface (see the sketch following this table). If your application issues bursts that are greater than the column size of the memory device, choose the Chip-Row-Bank-Column option. This option allows the controller to use its look-ahead bank management feature to hide the effect of changing the currently open row when the burst reaches the end of the column. On the other hand, if your application has several masters that each use separate areas of memory, choose the Chip-Bank-Row-Column option. This option allows you to use the top address bits to allocate a physical bank in the memory to each master. The physical bank allocation avoids different masters accessing the same bank, which is likely to cause inefficiency, as the controller must then open and close rows in the same bank.
Command queue look-ahead depth | Specifies a command queue look-ahead depth value to control the number of read or write requests the look-ahead bank management logic examines.
Local maximum burst count | Specifies a burst count to configure the maximum Avalon burst count that the controller slave port accepts.
Reduce controller latency by | Specifies, in controller clock cycles, a value by which to reduce the controller latency. The default value is 0, but you have the option to choose 1 to enhance the latency performance of your design at the expense of timing closure.
Enable configuration and status register interface | Turn on to enable run-time configuration and status retrieval of the memory controller. Enabling this option adds an additional Avalon-MM slave port to the memory controller top level that allows run-time reconfiguration and status retrieval for memory timing parameters, memory address size and mode register settings, and controller features. If the Error Detection and Correction Logic option is enabled, the same slave port also allows you to control and retrieve the status of this logic. For more information, refer to the "Configuration and Status Register (CSR) Interface" section in the Functional Description—HPC II Controller chapter of the External Memory Interface Handbook.
Enable error detection and correction logic | Turn on to enable error correction coding (ECC) for single-bit error correction and double-bit error detection.
Enable auto error correction | Turn on to allow the controller to perform auto correction when the ECC logic detects a single-bit error. Alternatively, you can turn off this option and schedule the error correction at a desired time for better system efficiency.
Multiple controller clock sharing | This option is only available in the SOPC Builder flow. Turn on to allow one controller to use the Avalon clock from another controller in the system that has a compatible PLL. This option allows you to create SOPC Builder systems that have two or more memory controllers that are synchronous to your master logic. This option is not for use with Cyclone III or Cyclone IV family devices.
Local interface protocol | Specifies the local side interface between the user logic and the memory controller. The Avalon-MM interface allows you to easily connect to other Avalon-MM peripherals. The HPC II architecture supports only the Avalon-MM interface.
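To make the two Local-to-memory address mapping options above more concrete, the following is a minimal sketch of how a flat local address could be split into chip, row, bank, and column fields under each ordering. The field widths are illustrative assumptions only; your actual widths come from the memory parameters you enter.

```python
# Minimal sketch (not controller RTL) of the two address-mapping orders.
# Field widths below are assumed example values, not fixed by the IP.

CHIP_W, ROW_W, BANK_W, COL_W = 1, 14, 3, 10

def split_address(addr, order):
    """Split a flat local address into named fields, MSB field first."""
    fields = {}
    remaining = sum(width for _, width in order)
    for name, width in order:
        remaining -= width
        fields[name] = (addr >> remaining) & ((1 << width) - 1)
    return fields

chip_row_bank_col = [("chip", CHIP_W), ("row", ROW_W), ("bank", BANK_W), ("col", COL_W)]
chip_bank_row_col = [("chip", CHIP_W), ("bank", BANK_W), ("row", ROW_W), ("col", COL_W)]

addr = 0x0123456
print(split_address(addr, chip_row_bank_col))
print(split_address(addr, chip_bank_row_col))
```

With Chip-Bank-Row-Column, the bank bits sit just below the chip bits, so a master that owns a fixed high-address window stays within its own physical bank, which is the behavior described above.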
Parameterizing Memory Controllers with UniPHY IP
This section describes the parameters you can set for the DDR2, DDR3 SDRAM,
QDR II, QDR II+ SRAM, and RLDRAM II memory controllers with the UniPHY IP.
The Parameter Settings page in the UniPHY parameter editor allows you to
parameterize the following settings:
■ PHY Settings
■ Memory Parameters
■ Memory Timing
■ Board Settings
■ Controller Settings
■ Diagnostics
The text window at the bottom of the MegaWizard Plug-In Manager displays
information about the memory interface, warnings, and errors if you are trying to
create something that is not supported. The Finish button is disabled until you correct
all the errors indicated in this window.
The following sections describe the tabs of the Parameter Settings page in more
detail.
PHY Settings
Table 8–21 lists the PHY parameters.
Table 8–21. Clock Parameters
Parameter | Description

General Settings
Speed Grade | Specifies the speed grade of the targeted FPGA device that affects the generated timing constraints and timing reporting.
Generate PHY only | Turn on this option to generate the UniPHY core without a memory controller. When you turn on this option, the AFI interface is exported so that you can easily connect your own memory controller.

Clocks
Memory clock frequency | The frequency of the clock that drives the memory device. Use up to 4 decimal places of precision. To obtain the maximum supported frequency for your target memory configuration, refer to the External Memory Interface Spec Estimator page on the Altera website.
Achieved memory clock frequency | The actual frequency the PLL generates to drive the external memory interface (memory clock).
PLL reference clock frequency | The frequency of the input clock that feeds the PLL. Use up to 4 decimal places of precision.
Rate on Avalon-MM interface | The width of the data bus on the Avalon-MM interface. Full results in a width of 2× the memory data width. Half results in a width of 4× the memory data width. Quarter results in a width of 8× the memory data width and is only supported for DDR3 SDRAM using Stratix V devices. Use Quarter for memory frequencies of 533 MHz and above (see the sketch after this table). To determine the Avalon-MM interface rate selection for other memories, refer to the local interface clock rate for your target device in the External Memory Interface Spec Estimator page on the Altera website.
Achieved local clock frequency | The actual frequency the PLL generates to drive the local interface for the memory controller (AFI clock).

Advanced PHY Settings
Advanced clock phase control | Enables access to clock phases. The default value should suffice for most DIMMs and board layouts, but can be modified if necessary to compensate for larger address and command versus clock skews. This option is available for DDR3 SDRAM only.
Additional address and command clock phase | Allows you to increase or decrease the amount of phase shift on the address and command clock. The base phase shift center-aligns the address and command clock at the memory device, which may not be the optimal setting under all circumstances. Increasing or decreasing the amount of phase shift can improve timing. The default value is 0 degrees. To achieve the optimum setting, adjust the value based on the address and command timing analysis results. This option is not available for Stratix V devices.
Additional phase for core-to-periphery transfer | Allows you to phase shift the latching clock of the core-to-periphery transfers. By delaying the latch clock, a positive phase shift value improves setup timing for transfers between registers in the core and the half-rate DDIO_OUT blocks in the periphery. Adjust this setting according to the core timing analysis. This option is available for Stratix V devices only.
Additional phase for periphery-to-core transfer | Allows you to phase shift the latching clock of the periphery-to-core transfers. By advancing the latch clock, a negative phase shift value improves setup timing for transfers between the read FIFO in the periphery and the core. Adjust this setting according to the core timing analysis. This option is available for Stratix V devices only.
Additional CK/CK# phase | Allows you to increase or decrease the amount of phase shift on the CK/CK# clock. The base phase shift center-aligns the address and command clock at the memory device, which may not be the optimal setting under all circumstances. Increasing or decreasing the amount of phase shift can improve timing. Increasing or decreasing the phase shift on CK/CK# also impacts the read, write, and leveling transfers, which increasing or decreasing the phase shift on the address and command clocks does not. To achieve the optimum setting, adjust the value based on the address and command timing analysis results. Ensure that the read, write, and write leveling timings are met after adjusting the clock phase. Adjust this value when there is a core timing failure after adjusting Additional address and command clock phase. This option is available for DDR3 SDRAM only. However, this option is not available for Stratix V devices.
Enable read DQS tracking | Improves timing margins by continuously compensating for temperature variations. When you turn on this option, you will observe increased design area, and the refresh command times are longer due to tracking accesses. Altera recommends that you turn on this option for designs running at 533 MHz and above. This option is available for DDR3 SDRAM only.
Supply voltage | The supply voltage and sub-family type of memory. This option is available for DDR3 SDRAM only. DDR3L is currently supported only on Stratix V.
I/O standard | The I/O standard voltage. Set the I/O standard according to your design's memory standard.
PLL sharing mode | When you select No sharing, the parameter editor instantiates a PLL block without exporting the PLL signals. When you select Master, the parameter editor instantiates a PLL block and exports the signals. When you select Slave, the parameter editor exposes a PLL interface and you must connect an external PLL master to drive the PLL slave interface signals. Select No sharing if you are not sharing PLLs; otherwise, select Master or Slave. For more information about resource sharing, refer to "The DLL and PLL Sharing Interface" section in the Functional Description—UniPHY chapter of the External Memory Interface Handbook. You must modify the timing script file to reflect the resource sharing during timing analysis. For more information, refer to the UniPHY tutorials on the List of designs using Altera External Memory IP page of the Altera Wiki website.
DLL sharing mode | When you select No sharing, the parameter editor instantiates a DLL block without exporting the DLL signals. When you select Master, the parameter editor instantiates a DLL block and exports the signals. When you select Slave, the parameter editor exposes a DLL interface and you must connect an external DLL master to drive the DLL slave signals. Select No sharing if you are not sharing DLLs; otherwise, select Master or Slave. For more information about resource sharing, refer to "The DLL and PLL Sharing Interface" section in the Functional Description—UniPHY chapter of the External Memory Interface Handbook.
OCT sharing mode | When you select No sharing, the parameter editor instantiates an OCT block without exporting the OCT signals. When you select Master, the parameter editor instantiates an OCT block and exports the signals. When you select Slave, the parameter editor exposes an OCT interface and you must connect an external OCT control block to drive the OCT slave signals. Select No sharing if you are not sharing OCT blocks; otherwise, select Master or Slave. For more information about resource sharing, refer to "The OCT Sharing Interface" section in the Functional Description—UniPHY chapter of the External Memory Interface Handbook.
HardCopy compatibility | Enables all required HardCopy compatibility options for the generated IP core. For some parameterizations, a pipeline stage is added to the write datapath to help the more challenging timing closure for designs using HardCopy devices; the pipeline stage does not affect the overall read and write latency. Turn on this option if you are migrating your design to a HardCopy device. For more information, refer to the HardCopy Design Migration Guidelines chapter.
Reconfigurable PLL location | When you set the PLL used in the UniPHY memory interface to be reconfigurable at run time, you must specify the location of the PLL. This assignment generates a PLL that can only be placed in the given sides. This option is enabled when you turn on HardCopy compatibility. In HardCopy designs, you must specify the PLL location according to the location of the interface.
Sequencer optimization | Select Performance to enable the Nios II-based sequencer, or Area to enable the RTL-based sequencer. Altera recommends that you enable the Nios II-based sequencer for memory clock frequencies greater than 400 MHz and the RTL-based sequencer if you want to reduce resource utilization. This option is available for QDR II and QDR II+ SRAM, and RLDRAM II only.
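As a quick illustration of the Rate on Avalon-MM interface multipliers above, the following is a minimal sketch; the 64-bit memory DQ width used in the example is an assumed value, not a setting from this handbook.

```python
# Minimal sketch of the local (Avalon-MM) data width implied by the rate
# setting: Full = 2x, Half = 4x, Quarter = 8x the memory DQ width.
# The 64-bit DQ width below is an assumed example value.

RATE_MULTIPLIER = {"Full": 2, "Half": 4, "Quarter": 8}

def avalon_data_width(memory_dq_width, rate):
    """Return the Avalon-MM data width for a given rate setting."""
    return memory_dq_width * RATE_MULTIPLIER[rate]

print(avalon_data_width(64, "Half"))  # 256-bit local data bus
```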
Memory Parameters
Use this tab to apply the memory parameters from your memory manufacturer’s data
sheet.
DDR2 and DDR3 SDRAM
Table 8–22 lists the memory parameters for DDR2 and DDR3 SDRAM.
Table 8–22. Memory Parameters
Parameter | Description
Memory vendor | The vendor of the memory device. Select the memory vendor according to the memory vendor you use. For memory vendors that are not listed in the setting, select JEDEC with the nearest memory parameters and edit the parameter values according to the values of the memory vendor that you use. However, if you select a configuration from the list of memory presets, the default memory vendor for that preset setting is automatically selected.
Memory format | The format of the memory device. Select Discrete if you are using just the memory device. Select Unbuffered or Registered for DIMM format. Use the DIMM format to turn on leveling circuitry for DDR3 SDRAM.
Memory device speed grade | The maximum frequency at which the memory device can run.
Total interface width | The total number of DQ pins of the memory device. Limited to 144 bits for DDR2 and DDR3 SDRAM (with or without leveling).
DQ/DQS group size | The number of DQ bits per DQS group.
Number of DQS groups | The number of DQS groups is calculated automatically from the Total interface width and the DQ/DQS group size parameters.
Number of chip selects | The number of chip selects the IP core uses for the current device configuration. Specify the total number of chip selects according to the number of DIMM slots and the number of ranks for each slot. For example, select 4 chip selects if there are two DIMM slots with two ranks in each DIMM slot.
Number of clocks | The width of the clock bus on the memory interface.
Row address width | The width of the row address on the memory interface.
Column address width | The width of the column address on the memory interface.
Bank-address width | The width of the bank address bus on the memory interface.
Enable DM pins | Specifies whether the DM pins of the memory device are driven by the FPGA. You can turn off this option to avoid overusing FPGA device pins when using ×4 mode memory devices. When you are using ×4 mode memory devices, turn off this option for DDR3 SDRAM. You must turn on this option if you are using Avalon byte enables.
DQS# Enable (DDR2) | Turn on differential DQS signaling to improve signal integrity and system performance. This option is available for DDR2 SDRAM only.

Memory Initialization Options—DDR2
Address and command parity | Enables address/command parity checking.
Mode Register 0: Burst length | Specifies the burst length.
Mode Register 0: READ burst type | Determines whether the controller performs accesses within a given burst in sequential or interleaved order.
Mode Register 0: DLL precharge power down | Determines whether the DLL in the memory device is in slow exit mode or in fast exit mode during precharge power down.
Mode Register 0: Memory CAS latency setting | Determines the number of clock cycles between the READ command and the availability of the first bit of output data at the memory device. Set this parameter according to the target memory speed grade.
Mode Register 1: Output drive strength setting | Determines the output driver impedance setting at the memory device. To obtain the optimum signal integrity performance, select the optimum setting based on the board simulation results.
Mode Register 1: Memory additive CAS latency setting | Determines the posted CAS additive latency of the memory device. Enable this feature to improve command and bus efficiency, and increase system bandwidth. For more information, refer to the Optimizing the Controller chapter.
Mode Register 1: Memory on-die termination (ODT) setting | Determines the on-die termination resistance at the memory device. To obtain the optimum signal integrity performance, select the optimum setting based on the board simulation results.
Mode Register 2: SRT Enable | Determines the self-refresh temperature (SRT). Select 1x refresh rate for normal temperature (0–85°C) or select 2x refresh rate for high temperature (>85°C).

Memory Initialization Options—DDR3
Mirror Addressing: 1 per chip select | Specifies the mirror addressing. Enter ranks with mirrored addresses in this field. For example, for four chip selects, enter 1101 to mirror the address on chip select #3, #2, and #0.
Address and command parity | Enables address/command parity checking to detect errors in data transmission.
Mode Register 0: READ burst type | Specifies whether accesses within a given burst are in sequential or interleaved order.
Mode Register 0: DLL precharge power down | Specifies whether the DLL in the memory device is off or on during precharge power-down.
Mode Register 0: Memory CAS latency setting | The number of clock cycles between the read command and the availability of the first bit of output data at the memory device. Set this parameter according to the target memory speed grade.
Mode Register 1: Output drive strength setting | The output driver impedance setting at the memory device. To obtain the optimum signal integrity performance, select the optimum setting based on the board simulation results.
Mode Register 1: Memory additive CAS latency setting | The posted CAS additive latency of the memory device. Enable this feature to improve command and bus efficiency, and increase system bandwidth. For more information, refer to the Optimizing the Controller chapter.
Mode Register 1: ODT Rtt nominal value | The on-die termination resistance at the memory device. To obtain the optimum signal integrity performance, select the optimum setting based on the board simulation results.
Mode Register 2: Auto self-refresh method | Disable or enable auto self-refresh.
Mode Register 2: Self-refresh temperature | Specifies the self-refresh temperature as Normal or Extended.
Mode Register 2: Memory write CAS latency setting | The number of clock cycles from the releasing of the internal write to the latching of the first data in, at the memory device.
Mode Register 2: Dynamic ODT (Rtt_WR) value | The mode of the dynamic ODT feature of the memory device. To obtain the optimum signal integrity performance, select the optimum setting based on the board simulation results.
QDR II and QDR II+ SRAM
Table 8–23 describes the memory parameters for QDR II and QDR II+ SRAM.
Table 8–23. Memory Parameters
Parameter | Description
Address width | The width of the address bus on the memory device.
Data width | The width of the data bus on the memory device.
Data-mask width | The width of the data mask on the memory device.
CQ width | The width of the CQ (read strobe) bus on the memory device.
K width | The width of the K (write strobe) bus on the memory device.
Burst length | The burst length supported by the memory device.
Topology: x36 emulated mode | Emulates a larger memory-width interface using smaller memory-width interfaces on the FPGA. Turn on this option when the target FPGA does not support x36 DQ/DQS groups. This option allows two x18 DQ/DQS groups to emulate one x36 read data group.
Topology: Emulated write groups | The number of write groups to use to form the x36 memory interface on the FPGA. Select 2 to use two x18 DQ/DQS groups to form the x36 write data group. Select 4 to use four x9 DQ/DQS groups to form the x36 write data group.
Topology: Device width | Specifies the number of memory devices used for width expansion.
RLDRAM II
Table 8–24 describes the memory parameters for RLDRAM II.
Table 8–24. Memory Parameters
Parameter
Description
Address width
The width of the address bus on the memory device.
Data width
The width of the data bus on the memory device.
Bank-address width
The width of the bank-address bus on the memory device.
Data-mask width
The width of the data-mask bus on the memory device.
QK width
The width of the QK (read strobe) bus on the memory device. Select 1 when data width is set to 9. Select 2 when data width is set to 18 or 36.
DK width
The width of the DK (write strobe) bus on the memory device. Select 1 when data width is set to 9 or 18. Select 2 when data width is set to 36.
Burst length
The burst length supported by the memory device.
Memory mode register configuration
Configuration bits that set the memory mode.
Topology
Device width
Specifies the number of memory devices used for width expansion.
Memory Timing
Use this tab to apply the memory timings from your memory manufacturer’s data
sheet. Table 8–25 shows the memory timing parameters.
Table 8–25. Memory Timing Parameters
Parameter
Description
DDR2/DDR3 SDRAM
tIS (base)
Address and control setup to CK clock rise.
tIH (base)
Address and control hold after CK clock rise.
tDS (base)
Data setup to clock (DQS) rise.
tDH (base)
Data hold after clock (DQS) rise.
tDQSQ
DQS, DQS# to DQ skew, per access.
tQHS (DDR2)
DQ output hold time from DQS, DQS# (absolute time value)
tQH (DDR3)
DQ output hold time from DQS, DQS# (percentage of tCK).
tDQSCK
DQS output access time from CK/CK#.
tDQSS
First latching edge of DQS to associated clock edge (percentage of tCK).
tQSH (DDR3)
tDQSH (DDR2)
DQS Differential High Pulse Width (percentage of tCK). Specifies the
minimum high time of the DQS signal received by the memory.
tDSH
DQS falling edge hold time from CK (percentage of tCK).
tDSS
DQS falling edge to CK setup time (percentage of tCK).
tINIT
Memory initialization time at power-up.
tMRD
Load mode register command period.
tRAS
Active to precharge time.
tRCD
Active to read or write time.
tRP
Precharge command period.
tREFI
Refresh command interval.
tRFC
Auto-refresh command interval.
tWR
Write recovery time.
tWTR
Write to read period.
tFAW
Four active window time.
tRRD
RAS to RAS delay time.
tRTP
Read to precharge time.
QDR II and QDR II+
tWL (cycles)
The write latency.
tRL (cycles)
The read latency.
tSA
The address and control setup to K clock rise.
tHA
The address and control hold after K clock rise.
tSD
The data setup to clock (K/K#) rise.
tHD
The data hold after clock (K/K#) rise.
tCQD
Echo clock high to data valid.
tCQDOH
Echo clock high to data invalid.
Internal jitter
The QDRII/II+ internal jitter.
TCQHCQnH
The CQ clock rise to CQn clock rise (rising edge to rising edge).
TKHKnH
The K clock rise to Kn clock rise (rising edge to rising edge).
RLDRAM II
Maximum memory clock frequency
The maximum frequency at which the memory device can run.
Refresh interval
The refresh interval.
tCKH (%)
The input clock (CK/CK#) high expressed as a percentage of the full clock
period.
tQKH (%)
The read clock (QK/QK#) high expressed as a percentage of tCKH.
tAS
Address and control setup to CK clock rise.
tAH
Address and control hold after CK clock rise.
tDS
Data setup to clock (CK/CK#) rise.
tDH
Data hold after clock (CK/CK#) rise.
tQKQ_max
QK clock edge to DQ data edge (in same group).
tQKQ_min
QK clock edge to DQ data edge (in same group).
tCKDK_max
Clock to input data clock (max).
tCKDK_min
Clock to input data clock (min).
Board Settings
Use the Board Settings tab to model board-level effects in the timing analysis. The Board Settings tab allows you to enter the following settings:
■ Setup and hold derating (available for DDR2/DDR3 SDRAM and RLDRAM II only)
■ Intersymbol interference
■ Board skews
For accurate timing results, you must enter board settings parameters that are correct
for your PCB.
The IP core supports single and multiple chip-select configurations. Altera has determined the effects of these configurations on the output signaling for certain Altera boards, and has stored the effects on the output slew rate and the intersymbol interference (ISI) within the wizard.
These stored values are representative of specific Altera boards. You must change the values to account for the board-level effects of your board. You can use HyperLynx or similar simulators to obtain values that are representative of your board.
External Memory Interface Handbook
Volume 2: Design Guidelines
November 2011 Altera Corporation
Chapter 8: Implementing and Parameterizing Memory IP
Parameterizing Memory Controllers with UniPHY IP
8–67
For more information about how to include your board simulation results in the Quartus II software and how to assign pins using pin planners, refer to the design flow tutorials and design examples on the List of designs using Altera External Memory IP page of the Altera Wiki website.
For information about timing deration methodology, refer to the "Timing Deration Methodology for Multiple Chip Select DDR2 and DDR3 SDRAM Designs" section in the Analyzing Timing of Memory IP chapter.
Setup and Hold Derating
The slew rate of the output signals affects the setup and hold times of the memory device. You can specify the slew rate of the output signals to see their effect on the setup and hold times of both the address and command signals and the DQ signals, or you can specify the setup and hold times directly.
You should enter information derived from the prelayout (line) and postlayout (board) simulations performed during your PCB development process.
Table 8–26 lists the setup and hold derating parameters.
Table 8–26. Setup and Hold Derating Parameters
Parameter
Description
DDR2/DDR3 SDRAM
Derating method
The default settings are based on Altera internal board simulation data. To obtain accurate timing analysis for the conditions of your board, Altera recommends that you perform board simulation and either enter the slew rates in the Quartus II software so that the derated setup and hold times are calculated automatically, or enter the derated setup and hold times directly.
For more information, refer to the “Timing
Deration Methodology for Multiple Chip Select
DDR2 and DDR3 SDRAM Designs” section in
the Analyzing Timing of Memory IP chapter.
CK/CK# slew rate (differential)
CK/CK# slew rate (differential).
Address/Command slew rate
Address and command slew rate.
DQS/DQS# slew rate (Differential)
DQS and DQS# slew rate (differential).
DQ slew rate
DQ slew rate.
tIS
Address/command setup time to CK.
tIH
Address/command hold time from CK.
tDS
Data setup time to DQS.
tDH
Data hold time from DQS.
RLDRAM II
tAS Vref to CK/CK# Crossing
For a given address/command and CK/CK# slew
rate, the memory device data sheet provides a
corresponding "tAS Vref to CK/CK# Crossing"
value that can be used to determine the derated
address/command setup time.
tAS VIH MIN to CK/CK# Crossing
For a given address/command and CK/CK# slew
rate, the memory device data sheet provides a
corresponding "tAS VIH MIN to CK/CK#
Crossing" value that can be used to determine
the derated address/command setup time.
tAH CK/CK# Crossing to Vref
For a given address/command and CK/CK# slew
rate, the memory device data sheet provides a
corresponding "tAH CK/CK# Crossing to Vref"
value that can be used to determine the derated
address/command hold time.
tAH CK/CK# Crossing to VIH MIN
For a given address/command and CK/CK# slew
rate, the memory device data sheet provides a
corresponding "tAH CK/CK# Crossing to VIH
MIN" value that can be used to determine the
derated address/command hold time.
tDS Vref to CK/CK# Crossing
For a given data and DK/DK# slew rate, the
memory device data sheet provides a
corresponding "tDS Vref to CK/CK# Crossing"
value that can be used to determine the derated
data setup time.
tDS VIH MIN to CK/CK# Crossing
For a given data and DK/DK# slew rate, the
memory device data sheet provides a
corresponding "tDS VIH MIN to CK/CK#
Crossing" value that can be used to determine
the derated data setup time.
tDH CK/CK# Crossing to Vref
For a given data and DK/DK# slew rate, the
memory device data sheet provides a
corresponding "tDH CK/CK# Crossing to Vref"
value that can be used to determine the derated
data hold time.
tDH CK/CK# Crossing to VIH MIN
For a given data and DK/DK# slew rate, the
memory device data sheet provides a
corresponding "tDH CK/CK# Crossing to VIH
MIN" value that can be used to determine the
derated data hold time.
Derated tAS
The derated address/command setup time is
calculated automatically from the "tAS", the "tAS
Vref to CK/CK# Crossing", and the "tAS VIH MIN
to CK/CK# Crossing" parameters.
Derated tAH
The derated address/command hold time is
calculated automatically from the "tAH", the "tAH
CK/CK# Crossing to Vref", and the "tAH CK/CK#
Crossing to VIH MIN" parameters.
Derated tDS
The derated data setup time is calculated
automatically from the "tDS", the "tDS Vref to
CK/CK# Crossing", and the "tDS VIH MIN to
CK/CK# Crossing" parameters.
Derated tDH
The derated data hold time is calculated
automatically from the "tDH", the "tDH CK/CK#
Crossing to Vref", and the "tDH CK/CK# Crossing
to VIH MIN" parameters.
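As a rough mental model only, the following Python sketch assumes that a derated value is simply the sum of the base value and the two data-sheet adjustment terms listed for it; this is an assumption based on the parameter descriptions above, not the documented algorithm, and the exact arithmetic is defined in the Analyzing Timing of Memory IP chapter. The wizard performs this calculation automatically, and the numbers below are placeholders:

# Illustrative only: assumes derated time = base value + the two data-sheet
# adjustment terms described above. All values are hypothetical, in nanoseconds.
def derated_time(base, vref_to_crossing, vih_min_to_crossing):
    return base + vref_to_crossing + vih_min_to_crossing

# Hypothetical example for an RLDRAM II address/command setup time (tAS).
derated_tAS = derated_time(base=0.30,
                           vref_to_crossing=0.10,
                           vih_min_to_crossing=0.05)
print(f"Derated tAS = {derated_tAS:.2f} ns")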
Intersymbol Interference
Intersymbol interference is the distortion of a signal in which one symbol interferes
with subsequent symbols. Typically, when going from a single chip-select
configuration to a multiple chip-select configuration there is an increase in
intersymbol interference because there are multiple stubs causing reflections.
Table 8–27 lists the intersymbol interference parameters.
Table 8–27. ISI Parameters
Parameter
Description
Derating method
Choose between default Altera settings (with specific Altera boards) or manually enter board simulation numbers obtained for your specific board. This option is supported in DDR2/DDR3 SDRAM only.
Address and command eye reduction (setup)
The reduction in the eye diagram on the setup side (or left side of the eye) due to ISI on the address and command signals compared to a case when there is no ISI. (For single-rank designs, ISI can be zero; in multirank designs, ISI is necessary for accurate timing analysis.)
For more information about how to measure the ISI value for the address and command signals, refer to the "Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time" section in the Analyzing Timing of Memory IP chapter.
Address and command eye reduction (hold)
The reduction in the eye diagram on the hold side (or right side of the eye) due to ISI on the address and command signals compared to a case when there is no ISI.
For more information about how to measure the ISI value for the address and command signals, refer to the "Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time" section in the Analyzing Timing of Memory IP chapter.
DQ/D eye reduction
The total reduction in the eye diagram due to ISI on DQ signals compared to a case when there is no ISI. Altera assumes that the ISI reduces the eye width symmetrically on the left and right sides of the eye.
For more information about how to measure the ISI value, refer to the "Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time" section in the Analyzing Timing of Memory IP chapter.
Delta DQS/Delta K/Delta DK arrival time
The increase in variation on the range of arrival times of DQS compared to a case when there is no ISI. Altera assumes that the ISI causes DQS to further vary symmetrically to the left and to the right.
For more information about how to measure the ISI value, refer to the "Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time" section in the Analyzing Timing of Memory IP chapter.
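As a conceptual illustration only (the variable names below mirror the parameter names in the table, but the margin arithmetic is a simplification and not the timing model used by the Quartus II software), the following Python sketch shows how ISI eye-reduction values eat into nominal setup and hold margins:

# Simplified illustration: subtract ISI eye reductions from nominal margins.
# All values are hypothetical and in nanoseconds.
nominal_ac_setup_margin = 0.50   # address/command setup margin without ISI
nominal_ac_hold_margin  = 0.45   # address/command hold margin without ISI
ac_eye_reduction_setup  = 0.10   # "Address and command eye reduction (setup)"
ac_eye_reduction_hold   = 0.08   # "Address and command eye reduction (hold)"

dq_eye_width            = 0.80   # nominal DQ eye width without ISI
dq_eye_reduction        = 0.12   # "DQ/D eye reduction", split evenly per side

effective_ac_setup = nominal_ac_setup_margin - ac_eye_reduction_setup
effective_ac_hold  = nominal_ac_hold_margin - ac_eye_reduction_hold
effective_dq_eye   = dq_eye_width - dq_eye_reduction   # 0.06 ns lost per side

print(effective_ac_setup, effective_ac_hold, effective_dq_eye)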
Board Skews
PCB traces can have skews between them that can reduce timing margins.
Furthermore, skews between different chip selects can further reduce the timing
margin in multiple chip-select topologies. The Board Skews section of the parameter
editor allows you to enter parameters to compensate for these variations. Very large
board trace skews should be specified in your board trace model.
Table 8–28 lists the board skew parameters.
Table 8–28. Board Skew Parameters
Parameter
Description
DDR2/DDR3 SDRAM
Maximum CK delay to DIMM/device
The delay of the longest CK trace from the FPGA to the memory device, whether on a DIMM or on the same PCB as the FPGA, expressed by the following equation:
max_n (CK_n path delay)
where n is the number of memory clocks. For example, the maximum CK delay for two pairs of memory clocks is expressed by the following equation:
max (CK_1 path delay, CK_2 path delay)
Maximum DQS delay to DIMM/device
The delay of the longest DQS trace from the FPGA to the memory device, whether on a DIMM or on the same PCB as the FPGA, expressed by the following equation:
max_n (DQS_n path delay)
where n is the number of DQS signals. For example, the maximum DQS delay for two DQS signals is expressed by the following equation:
max (DQS_1 path delay, DQS_2 path delay)
Minimum delay difference between CK and DQS
The minimum skew (or largest negative skew) between the CK signal and any DQS signal when arriving at the same DIMM, over all DIMMs, expressed by the following equation:
min_{n,m} (CK_n path delay – DQS_m path delay)
where n is the number of memory clocks and m is the number of DQS. For example, the minimum delay difference between CK and DQS for two pairs of memory clocks and four DQS signals (two DQS signals for each clock) is expressed by the following equation:
min {(CK1 delay – DQS1 delay), (CK1 delay – DQS2 delay), (CK2 delay – DQS3 delay), (CK2 delay – DQS4 delay)}
This parameter value affects the write leveling margin for DDR3 interfaces with leveling in multirank configurations.
Maximum delay difference between CK and DQS
The maximum skew (or largest positive skew) between the CK signal and any DQS signal when arriving at the same DIMM, over all DIMMs, expressed by the following equation:
max_{n,m} (CK_n path delay – DQS_m path delay)
where n is the number of memory clocks and m is the number of DQS. For example, the maximum delay difference between CK and DQS for two pairs of memory clocks and four DQS signals (two DQS signals for each clock) is expressed by the following equation:
max {(CK1 delay – DQS1 delay), (CK1 delay – DQS2 delay), (CK2 delay – DQS3 delay), (CK2 delay – DQS4 delay)}
This value affects the write leveling margin for DDR3 interfaces with leveling in multirank configurations.
Maximum skew within DQS group
The largest skew between DQ and DM signals in a DQS group. This value affects the read capture and write margins for DDR2 and DDR3 SDRAM interfaces in all configurations (single or multiple chip-select, DIMM or component).
Maximum skew between DQS groups
The largest skew between DQS signals in different DQS groups. This value affects the resynchronization margin in memory interfaces without leveling, such as DDR2 SDRAM and discrete-device DDR3 SDRAM, in both single and multiple chip-select configurations.
Average delay difference between DQ and DQS
The average delay difference between each DQ signal and the DQS signal, calculated by averaging the longest and smallest DQ signal delay values minus the delay of DQS. The average delay difference between DQ and DQS is expressed by the following equation:
[ Σ_{n=1..N} ( (longest DQ path delay in DQS_n group + shortest DQ path delay in DQS_n group) / 2 – DQS_n path delay ) ] / N
where N is the number of DQS groups.
Maximum skew within address and command bus
The largest skew between the address and command signals.
Average delay difference between address and command and CK
A value equal to the average of the longest and smallest address and command signal delay values, minus the delay of the CK signal. The value can be positive or negative. Positive values represent address and command signals that are longer than CK signals; negative values represent address and command signals that are shorter than CK signals. The average delay difference between address and command and CK is expressed by the following equation:
[ Σ_{n=1..N} ( (longest AC path delay + shortest AC path delay) / 2 – CK_n path delay ) ] / N
where N is the number of memory clocks.
The Quartus II software uses this skew to optimize the delay of the address and command signals to have appropriate setup and hold margins for DDR2 and DDR3 SDRAM interfaces. You should derive this value through board simulation.
QDR II and QDR II+
Maximum delay difference between devices
The maximum delay difference of data signals between devices is expressed by the following equation:
Abs( (longest device 1 delay – shortest device 2 delay) / 2 – (longest device 2 delay – shortest device 1 delay) / 2 )
For example, in a two-device configuration there is greater propagation delay for data signals going to and returning from the furthest device relative to the nearest device. This parameter is applicable for depth expansion. Set the value to 0 for non-depth-expansion designs.
Maximum skew within write data group (i.e., K group)
The maximum skew between D and BWS signals referenced by a common K signal.
Maximum skew within read data group (i.e., CQ group)
The maximum skew between Q signals referenced by a common CQ signal.
Maximum skew between CQ groups
The maximum skew between CQ signals of different read data groups.
Maximum skew within address/command bus
The maximum skew between the address/command signals.
Average delay difference between address/command and K
A value equal to the average of the longest and smallest address/command signal delay values, minus the delay of the K signal. The value can be positive or negative. The average delay difference between address/command and K is expressed by the following equation:
[ Σ_{n=1..N} ( (longest AC path delay + shortest AC path delay) / 2 – K_n path delay ) ] / N
where N is the number of K clocks.
Average delay difference between write data signals and K
A value equal to the average of the longest and smallest write data signal delay values, minus the delay of the K signal. Write data signals include the D and BWS signals. The value can be positive or negative. The average delay difference between D and K is expressed by the following equation:
[ Σ_{n=1..N} ( (longest D path delay in K_n group + shortest D path delay in K_n group) / 2 – K_n path delay ) ] / N
where N is the number of K groups.
Average delay difference between read data signals and CQ
A value equal to the average of the longest and smallest read data signal delay values, minus the delay of the CQ signal. The value can be positive or negative. The average delay difference between Q and CQ is expressed by the following equation:
[ Σ_{n=1..N} ( (longest Q path delay in CQ_n group + shortest Q path delay in CQ_n group) / 2 – CQ_n path delay ) ] / N
where N is the number of CQ groups.
RLDRAM II
Maximum CK delay to device
The delay of the longest CK trace from the FPGA to any device/DIMM is expressed by the following equation:
max_n (CK_n path delay)
where n is the number of memory clocks. For example, the maximum CK delay for two pairs of memory clocks is expressed by the following equation:
max (CK_1 path delay, CK_2 path delay)
Maximum DK delay to device
The delay of the longest DK trace from the FPGA to any device/DIMM is expressed by the following equation:
max_n (DK_n path delay)
where n is the number of DK signals. For example, the maximum DK delay for two DK signals is expressed by the following equation:
max (DK_1 path delay, DK_2 path delay)
Minimum delay difference between CK and DK
The minimum delay difference between the CK signal and any DK signal when arriving at the memory device(s). The value is equal to the minimum delay of the CK signal minus the maximum delay of the DK signal. The value can be positive or negative. The minimum delay difference between CK and DK is expressed by the following equation:
min_{n,m} (CK_n path delay – DK_m path delay)
where n is the number of memory clocks and m is the number of DK. For example, the minimum delay difference between CK and DK for two pairs of memory clocks and four DK signals (two DK signals for each clock) is expressed by the following equation:
min {(CK1 delay – DK1 delay), (CK1 delay – DK2 delay), (CK2 delay – DK3 delay), (CK2 delay – DK4 delay)}
Maximum delay difference between CK and DK
The maximum delay difference between the CK signal and any DK signal when arriving at the memory device(s). The value is equal to the maximum delay of the CK signal minus the minimum delay of the DK signal. The value can be positive or negative. The maximum delay difference between CK and DK is expressed by the following equation:
max_{n,m} (CK_n path delay – DK_m path delay)
where n is the number of memory clocks and m is the number of DK. For example, the maximum delay difference between CK and DK for two pairs of memory clocks and four DK signals (two DK signals for each clock) is expressed by the following equation:
max {(CK1 delay – DK1 delay), (CK1 delay – DK2 delay), (CK2 delay – DK3 delay), (CK2 delay – DK4 delay)}
Maximum delay difference between devices
The maximum delay difference of data signals between devices is expressed by the following equation:
Abs( (longest device 1 delay – shortest device 1 delay) / 2 – (longest device 2 delay – shortest device 2 delay) / 2 )
For example, in a two-device configuration there is greater propagation delay for data signals going to and returning from the furthest device relative to the nearest device. This parameter is applicable for depth expansion. Set the value to 0 for non-depth-expansion designs.
Maximum skew within QK group
The maximum skew between the DQ signals referenced by a common QK signal.
Maximum skew between QK groups
The maximum skew between QK signals of different data groups.
Maximum skew within address/command bus
The maximum skew between the address/command signals.
Average delay difference between address/command and CK
A value equal to the average of the longest and smallest address/command signal delay values, minus the delay of the CK signal. The value can be positive or negative. The average delay difference between address/command and CK is expressed by the following equation:
[ Σ_{n=1..N} ( (longest AC path delay + shortest AC path delay) / 2 – CK_n path delay ) ] / N
where N is the number of memory clocks.
Average delay difference between write data signals and DK
A value equal to the average of the longest and smallest write data signal delay values, minus the delay of the DK signal. Write data signals include the DQ and DM signals. The value can be positive or negative. The average delay difference between DQ and DK is expressed by the following equation:
[ Σ_{n=1..N} ( (longest DQ path delay in DK_n group + shortest DQ path delay in DK_n group) / 2 – DK_n path delay ) ] / N
where N is the number of DK groups.
Average delay difference between read data signals and QK
A value equal to the average of the longest and smallest read data signal delay values, minus the delay of the QK signal. The value can be positive or negative. The average delay difference between DQ and QK is expressed by the following equation:
[ Σ_{n=1..N} ( (longest DQ path delay in QK_n group + shortest DQ path delay in QK_n group) / 2 – QK_n path delay ) ] / N
where N is the number of QK groups.
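To make the DDR2/DDR3 SDRAM board skew equations above concrete, the following Python sketch computes the main board skew parameters from per-signal trace delays. It is a minimal illustration only; the delay values are placeholders that you would normally extract from board simulation (for example, HyperLynx), and the variable names are not part of any Altera tool.

# Minimal sketch of the DDR2/DDR3 board skew equations above.
# All delays are hypothetical trace delays in nanoseconds.
ck_delays  = [0.520, 0.535]                  # one entry per memory clock pair
dqs_delays = [0.500, 0.515, 0.540, 0.525]    # one entry per DQS signal
dqs_to_ck  = [0, 0, 1, 1]                    # memory clock associated with each DQS
dq_groups  = [[0.495, 0.505, 0.510],         # DQ/DM trace delays per DQS group
              [0.520, 0.512, 0.518],
              [0.535, 0.545, 0.538],
              [0.522, 0.530, 0.527]]

max_ck_delay  = max(ck_delays)               # "Maximum CK delay to DIMM/device"
max_dqs_delay = max(dqs_delays)              # "Maximum DQS delay to DIMM/device"

# Minimum/maximum delay difference between CK and DQS (each DQS compared
# against its associated memory clock, as in the examples above).
ck_dqs_diffs = [ck_delays[c] - dqs for dqs, c in zip(dqs_delays, dqs_to_ck)]
min_ck_dqs_diff = min(ck_dqs_diffs)
max_ck_dqs_diff = max(ck_dqs_diffs)

# "Maximum skew within DQS group": largest spread of DQ/DM delays in any group.
max_skew_within_dqs_group = max(max(g) - min(g) for g in dq_groups)

# "Average delay difference between DQ and DQS", per the equation above.
avg_dq_dqs_diff = sum((max(g) + min(g)) / 2 - dqs
                      for g, dqs in zip(dq_groups, dqs_delays)) / len(dq_groups)

print(max_ck_delay, max_dqs_delay, min_ck_dqs_diff,
      max_ck_dqs_diff, max_skew_within_dqs_group, avg_dq_dqs_diff)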
Controller Settings
Use this tab to apply the controller settings suitable for your design.
This section describes parameters for the High Performance Controller II (HPC II)
with advanced features introduced in version 11.0 for designs generated in version
11.0. Designs created in earlier versions and regenerated in version 11.0 do not inherit
the new advanced features; for information on parameters for HPC II without the
version 11.0 advanced features, refer to the External Memory Interface Handbook for
Quartus II version 10.1, available on the Literature: External Memory Interfaces page
of the Altera website.
Table 8–29 lists the controller settings.
Table 8–29. Controller Settings
Parameter
Description
DDR2/DDR3 SDRAM
Generate power-of-2 bus widths for SOPC Builder
Rounds down the Avalon-MM side data bus to the nearest power of 2. You must enable this option for both Qsys and SOPC Builder systems.
Generate SOPC Builder compatible resets
You must enable this option if the IP core is to be used in an SOPC Builder system. When turned on, the reset inputs become associated with the PLL reference clock and the paths must be cut. This option must be enabled for SOPC Builder, but is not required when using the MegaWizard Plug-in Manager or Qsys.
Maximum Avalon-MM burst length
Specifies the maximum burst length on the Avalon-MM bus. Affects the AVL_SIZE_WIDTH parameter.
Avalon Interface
Enable Avalon-MM byte-enable signal
When you turn on this option, the controller adds the byte-enable signal (avl_be) for the Avalon-MM bus to control the data mask (mem_dm) pins going to the memory interface. You must also turn on Enable DM pins if you are turning on this option.
When you turn off this option, the byte-enable signal (avl_be) is not enabled for the Avalon-MM bus, and by default all bytes are enabled. However, if you turn on Enable DM pins with this option turned off, all write words are written.
Avalon interface address width
The address width on the Avalon-MM interface.
Avalon interface data width
The data width on the Avalon-MM interface.
Low Power Mode
Enable self-refresh controls
Enables the self-refresh signals on the controller top-level design. These controls allow you to control when the memory is placed into self-refresh mode.
Enable auto-power down
Allows the controller to automatically place the memory into power-down mode after a specified number of idle cycles. Specify the number of idle cycles in the Auto power-down cycles parameter.
Auto power-down cycles
The number of idle controller clock cycles after which the controller automatically powers down the memory. The legal range is from 1 to 65,535 controller clock cycles.
Enable user auto-refresh controls
Enables the user auto-refresh control signals on the controller top level. These controller signals allow you to control when the controller issues memory auto-refresh commands.
Enable auto-precharge control
Enables the auto-precharge control on the controller top level. Asserting the auto-precharge control signal while requesting a read or write burst allows you to specify whether the controller should close (auto-precharge) the currently open page at the end of the read or write burst.
Efficiency
Local-to-memory address mapping
Allows you to control the mapping between the address bits on the Avalon-MM interface and the chip, row, bank, and column bits on the memory (see the sketch following this table).
Select Chip-Row-Bank-Col to improve efficiency with sequential traffic.
Select Chip-Bank-Row-Col to improve efficiency with random traffic.
Select Row-Chip-Bank-Col to improve efficiency with multiple chip select and sequential traffic.
Configuration, Status, and Error Handling
Command queue look-ahead depth
Selects a look-ahead depth value to control how many read or write requests the look-ahead bank management logic examines. Larger numbers are likely to increase the efficiency of the bank management, but at the cost of higher resource usage. Smaller values may be less efficient, but also use fewer resources. The valid range is from 1 to 16.
Enable reordering
Allows the controller to perform command and data reordering that reduces bus turnaround time and row/bank switching time to improve controller efficiency.
Starvation limit for each command
Specifies the number of commands that can be served before a waiting command is served. The valid range is from 1 to 63.
Enable Configuration and Status Register Interface
Enables the run-time configuration and status interface for the memory controller. This option adds an additional Avalon-MM slave port to the memory controller top level, which you can use to change or read out the memory timing parameters, memory address sizes, mode register settings, and controller status. If Error Detection and Correction Logic is enabled, the same slave port also allows you to control and retrieve the status of this logic.
CSR port host interface
Specifies the type of connection to the CSR port. The port can be exported, internally connected to a JTAG Avalon Master, or both.
Select Internal (JTAG) to connect the CSR port to a JTAG Avalon Master.
Select Avalon-MM Slave to export the CSR port.
Select Shared to export and connect the CSR port to a JTAG Avalon Master.
Advanced Controller Features
Enable error detection and correction logic
Enables ECC for single-bit error correction and double-bit error detection. Your memory interface must be a multiple of 40 or 72 bits wide to use ECC.
Enable auto error correction
Allows the controller to perform auto correction when a single-bit error is detected by the ECC logic.
Enable half rate bridge
Turn on this option to enable the half-rate bridge block.
Enable hard memory controller
Turn on this option to enable the hard memory controller.
Multiple Port Front End
Export bonding port
Turn on this option to export the bonding interface for a wider Avalon data width using two controllers. Bonding ports are exported to the top level.
Number of ports
Specifies the number of Avalon-MM slave ports to be exported. The number of ports depends on the width and the type of port you selected. There are four 64-bit read FIFOs and four 64-bit write FIFOs in the multi-port front-end (MPFE) component. For example, if you select a 256-bit width and a bidirectional slave port, all the FIFOs are fully utilized, therefore you can only select one port.
Width
Specifies the local data width for each Avalon-MM slave port. The width depends on the type of slave port and also on the number of ports selected, due to the limited FIFO count in the MPFE. There are four 64-bit read FIFOs and four 64-bit write FIFOs in the MPFE. For example, if you select one bidirectional slave port, you can select up to 256 bits to utilize all the read and write FIFOs.
Priority
Specifies the absolute priority for each Avalon-MM slave port. Any transaction from a port with a higher priority number is served before transactions from a port with a lower priority number.
Weight
Specifies the relative priority for each Avalon-MM slave port. When two or more ports have the same absolute priority, the transaction from the port with the higher (bigger number) relative weight is served first. You can set the weight within a range of 0 to 32.
Type
Specifies the type of Avalon-MM slave port as a bidirectional port, read-only port, or write-only port.
QDR II/QDR II+ SRAM and RLDRAM II
Generate power-of-2 data bus widths for
SOPC Builder
This option must be enabled if this core is to be used in an SOPC Builder
system. When turned on, the Avalon-MM side data bus width is rounded
down to the nearest power of 2.
Generate SOPC Builder compatible resets
This option must be enabled if this core is to be used in an SOPC Builder
system.
Maximum Avalon-MM burst length
Specifies the maximum burst length on the Avalon-MM bus.
Enable Avalon-MM byte-enable signal
When you turn on this option, the controller adds a byte-enable signal
(avl_be_w) for the Avalon-MM bus, which controls the bws_n signal on
the memory side to mask bytes during write operations.
When you turn off this option, the avl_be_w signal is not available and the
controller will always drive the memory bws_n signal so as to not mask any
bytes during write operations.
Avalon interface address width
Specifies the address width on the Avalon-MM interface.
Avalon interface data width
Specifies the data width on the Avalon-MM interface.
Reduce controller latency by
Specifies the number of clock cycles by which to reduce controller latency. Lower controller latency results in lower resource usage and fMAX, while higher latency results in higher resource usage and fMAX.
Enable user refresh
Enables user-controlled refresh. Refresh signals have priority over read/write requests. This option is available for RLDRAM II only.
Enable error detection parity
Enables per-byte parity protection. This option is available for RLDRAM II only.
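To illustrate the Local-to-memory address mapping options in the table above, the following Python sketch decomposes an Avalon-MM address into chip-select, row, bank, and column fields in the Chip-Row-Bank-Col order. It is an illustration only; the field widths are hypothetical examples, and the controller derives the actual widths and ordering from your memory parameters.

# Illustration of Chip-Row-Bank-Col address mapping.
# Field widths below are hypothetical; the controller derives them from the
# memory parameters (chip selects, row/bank/column address widths).
COL_BITS, BANK_BITS, ROW_BITS, CS_BITS = 10, 3, 14, 1

def chip_row_bank_col(avalon_address):
    col = avalon_address & ((1 << COL_BITS) - 1)
    avalon_address >>= COL_BITS
    bank = avalon_address & ((1 << BANK_BITS) - 1)
    avalon_address >>= BANK_BITS
    row = avalon_address & ((1 << ROW_BITS) - 1)
    avalon_address >>= ROW_BITS
    chip = avalon_address & ((1 << CS_BITS) - 1)
    return chip, row, bank, col

# Sequential Avalon addresses stay within one row and bank for as long as
# possible with the column in the low bits, which is why this ordering
# favors sequential traffic.
print(chip_row_bank_col(0x0001234))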
Diagnostics
The Diagnostics tab allows you to set parameters for certain diagnostic functions.
Table 8–30 describes parameters for simulation.
Table 8–30. Simulation Options
Parameter
Description
Simulation Options
Auto-calibration mode
Specifies whether you want to improve simulation performance by reducing calibration. There is no change to the generated RTL. The following auto-calibration modes are available:
■ Skip calibration—provides the fastest simulation. It loads the settings calculated from the memory configuration and enters user mode.
■ Quick calibration—calibrates (without centering) one bit per group before entering user mode.
■ Full calibration—calibrates the same as in hardware, and includes all phases, delay sweeps, and centering on every data bit. You can use timing-annotated memory models. Be aware that full calibration can take hours or days to complete.
To perform proper PHY simulation, select Quick calibration or Full calibration. For more information, refer to the "Simulation Options" section in the Simulating Memory IP chapter.
For QDR II, QDR II+ SRAM, and RLDRAM II, the Nios II-based sequencer must be selected to enable the auto-calibration mode selection.
Skip memory initialization delays
When you turn on this option, required delays between
specific memory initialization commands are skipped to
speed up simulation.
Enable verbose memory model
output
Turn on this option to display more detailed information
about each memory access during simulation.
Enable support for Nios II ModelSim®
flow in Eclipse
Initializes the memory interface for use with the Run as
Nios II ModelSim flow with Eclipse.
This option is not available for QDR II and QDR II+ SRAM.
Debug Options
Debug level
Specifies the debug level of the memory interface.
Efficiency Monitor and Protocol Checker Settings
Enable the Efficiency Monitor and Protocol Checker on the Controller Avalon Interface
Enables the efficiency monitor and protocol checker block on the controller Avalon interface. This option is not available for QDR II and QDR II+ SRAM.
Document Revision History
Document Revision History
Table 8–31 lists the revision history for this document.
Table 8–31. Document Revision History
November 2011 (version 4.0) and June 2011 (version 3.0):
■ Updated Installation and Licensing section.
■ Combined Qsys and SOPC Builder Interfaces sections.
■ Combined parameter settings for DDR, DDR2, DDR3 SDRAM, QDRII SRAM, and RLDRAM II for both ALTMEMPHY and UniPHY IP.
■ Added parameter usage details to Parameterizing Memory Controllers with UniPHY IP section.
■ Moved "Functional Description" section for DDR, DDR2, DDR3 SDRAM, QDRII SRAM, and RLDRAM II to volume 3 of the External Memory Interface Handbook.
■ Removed references to High-Performance Controller.
■ Updated High-Performance Controller II information.
■ Removed HardCopy III, HardCopy IV E, HardCopy IV GX, Stratix III, and Stratix IV support.
■ Updated Generated Files lists.
■ Added Qsys and SOPC Builder Interfaces section.
December 2010 (version 2.1): Updated the following items for 10.1:
■ Updated Design Flows and Generated Files information.
■ Updated Parameterizing Memory Controllers with UniPHY IP chapter.
July 2010 (version 2.0):
■ Added information for new GUI parameters: Controller latency, Enable reduced bank tracking for area optimization, and Number of banks to track.
■ Removed information about IP Advisor. This feature is removed from the DDR/DDR2 SDRAM IP support for version 10.0.
February 2010 (version 1.3): Corrected typos.
February 2010 (version 1.2):
■ Full support for Stratix IV devices.
■ Added timing diagrams for initialization and calibration stages for HPC.
November 2009 (version 1.1): Minor corrections.
November 2009 (version 1.0): First published.
9. Simulating Memory IP
November 2011
EMI_DG_009-4.0
This chapter describes the simulation basics so that you are aware of the supported
simulators and options available to you when you perform functional simulation with
Altera® external memory interface IP.
You need the following components to simulate your design:
■ A simulator—the simulator must be any Altera-supported VHDL or Verilog HDL simulator
■ A design using one of Altera's external memory IP
■ An example driver (to initiate read and write transactions)
■ A testbench and a suitable memory simulation model
Memory Simulation Models
There are two types of memory simulation models. You can use one of the following memory models:
■ Altera-provided generic memory model. The Quartus® II software generates this model together with the example design, and this model adheres to all the memory protocol specifications. You can parameterize the generic memory model.
■ Vendor-specific memory model. Memory vendors such as Micron and Samsung provide simulation models for specific memory components that you can download from their websites. Although Denali models are also available, Altera does not currently provide support for Denali models. All memory vendor simulation models that you use to simulate Altera memory IP must be JEDEC compliant.
Simulation Options
With the example testbench, the following simulation options are available to improve simulation speed:
■ Full calibration—Calibrates the same way as in hardware, and includes all phase, delay sweeps, and centering on every data bit.
■ Quick calibration—Calibrates the read and write latency only, skipping per-bit deskew.
■ Skip calibration—Provides the fastest simulation. It loads the settings calculated from the memory configuration and enters user mode.
By default, the UniPHY IP generates an abstract PHY, which uses skip calibration regardless of the simulation option that you choose in the MegaWizard™ Plug-In Manager.
Table 9–1 lists typical simulation times for designs implemented using UniPHY IP.
These simulation times are estimates based on average run times of a few example
designs. The simulation times for your design may vary depending on the memory
interface specifications, simulator, or the system you are using.
Table 9–1. Typical Simulation Times Using UniPHY IP
Calibration Mode/Run Time (1)
Full (full calibration; includes all phase/delay sweeps and centering): 10 minutes for a small interface; approximately 1 day for a large interface (×72 quad rank).
Quick (scaled-down calibration; calibrates one pin): 3 minutes for a small interface; 4 hours for a large interface (×72 quad rank).
Skip (skips all calibration and jumps to user mode; preloads calculated settings): 3 minutes for a small interface; 20 minutes for a large interface (×72 quad rank).
Note to Table 9–1:
(1) Uses one loop of driver test. One loop of driver is approximately 600 read or write requests, with burst length up to 64.
For more information about steps to follow before simulating, modifying the vendor
memory model, and simulation flow for both ALTMEMPHY and UniPHY IPs, refer to
the “Simulation Walkthrough with UniPHY IP” on page 9–3 and “Simulation
Walkthrough with ALTMEMPHY IP” on page 9–15.
Simulation Walkthrough with UniPHY IP
Simulating the whole memory interface is a good way to determine the latency of
your system. However, the latency found in simulation may be different than the
latency found on the board because functional simulation does not take into account
board trace delays and different process, voltage, and temperature scenarios. For a
given design on a given board, the latency found may differ by one clock cycle (for
full-rate designs) or two clock cycles (for half-rate designs) upon resetting the board.
Different boards can also show different latencies even with the same design.
The UniPHY IP only supports functional simulation. Functional simulation is
supported at the RTL level and after generating a post-fit functional simulation netlist.
The post-fit netlist for designs that contain UniPHY IP is a hybrid of the gate level (for
FPGA core) and RTL level (for the external memory interface IP).
Altera recommends that you validate the functional operation of your design via RTL
simulation, and the timing of your design using TimeQuest Timing Analysis.
For high-performance memory controllers with UniPHY IP, you can simulate a
functional simulation example design generated by the MegaWizard Plug-In
Manager. The MegaWizard Plug-In Manager generates the relevant files to the
\<variation_name>_example_design directory.
You can use the IP functional simulation model with any Altera-supported VHDL or
Verilog HDL simulator.
After you have generated the memory IP, view the README.txt file located in the \<variation_name>_example_design\simulation directory for instructions on how to generate the simulation example design for Verilog HDL or VHDL. The README.txt file also contains instructions on how to run simulation using the ModelSim-Altera software. Altera provides simulation scripts for the Mentor, Cadence, and Synopsys simulators; however, detailed instructions on how to run simulation using these third-party simulators are not provided.
Simulation Scripts
The Quartus II software generates three simulation scripts during project generation for three different third-party simulation tools—Cadence, Synopsys, and Mentor. These scripts reduce the number of files that you need to compile separately before simulating a design. The scripts are located in three separate folders under the <project directory>\<variation_name>_sim directory, each named after the corresponding simulation tool. The example designs also provide equivalent scripts after you run the .tcl script from the project located in the <name>_example_design\simulation directory.
November 2011
Altera Corporation
External Memory Interface Handbook
Volume 2: Design Guidelines
9–4
Chapter 9: Simulating Memory IP
Simulation Walkthrough with UniPHY IP
Preparing the Vendor Memory Model
You can replace the Altera-supplied memory model with a vendor-specific memory
model. In general, you may find vendor-specific models to be standardized, thorough,
and well supported, but sometimes more complex to set up and use. Note that
Altera does not provide support for vendor-specific memory models. If you do want
to replace the Altera-supplied memory model with a vendor-supplied memory
model, observe the following guidelines:
■ Ensure that you have the correct vendor-supplied memory model for your memory device.
■ Disconnect all signals from the default memory model and reconnect them to the vendor-supplied memory model.
■ If you intend to run simulation from the Quartus II software, ensure that the .qip file points to the vendor-supplied memory model.
When you are using a vendor memory model, instead of the MegaWizard-generated
functional simulation model, you need to make some modifications to the vendor
memory model and the testbench files by following these steps:
1. Obtain and copy the vendor memory model to the
\<variation_name>_example_design\simulation\<variation_name>_sim\
submodules directory. For example, obtain the ddr2.v and ddr2_parameters.vh
simulation model files from the Micron website and save them in the directory.
The auto-generated generic SDRAM model may be used as a placeholder for a specific vendor memory model.
Some vendor DIMM memory models do not use data mask (DM) pin operation, which can cause calibration failures. In these cases, use the vendor's component simulation models directly.
2. Open the vendor memory model file in a text editor and specify the speed grade
and device width at the top of the file. For example, you can add the following
statements for a DDR2 SDRAM model file:
`define sg25
`define x8
The first statement specifies the memory device speed grade as –25 (for 400 MHz
operation). The second statement specifies the memory device width per DQS.
3. Check that the following statement is included in the vendor memory model file.
If not, include it at the top of the file. This example is for a DDR2 SDRAM model
file:
`include "ddr2_parameters.vh"
4. Save the vendor memory model file.
5. Open the simulation example project file <variation_name>_example_sim.qpf,
located in the <variation_name>_example_design\simulation directory.
6. On the Tools menu, select Tcl Scripts and run the generate_sim_verilog_example_design.tcl file, which generates the simulation example design.
7. To enable vendor memory model simulation, you must include and compile the vendor memory model by adding it to the simulation script. Open the msim_setup.tcl script, located in the <variation_name>_example_design\simulation\verilog\mentor directory, in a text editor and add the following line in the '# Compile the design files in correct order' section:
vlog +incdir+$QSYS_SIMDIR/submodules/ "$QSYS_SIMDIR/submodules/<vendor_memory>.v" -<variation_name>_example_sim_work
8. Open the simulation example design, <variation_name>_example_sim.v, located in the <variation_name>_example_design\simulation\verilog directory, in a text editor and delete the following module:
alt_mem_if_<memory_type>_mem_model_top_<memory_type>_mem_if_dm_pins_en_mem_if_dqsn_en
The actual module name may differ slightly depending on the memory controller you are using.
9. Instantiate the downloaded memory model and connect its signals to the rest of
the design.
10. Ensure that the port names and capitalization in the memory model match the port names and capitalization in the testbench.
The vendor memory model may use different pin names and capitalization than the MegaWizard-generated functional model.
11. Save the testbench file.
The original instantiation may be similar to the following code:
alt_mem_if_ddr2_mem_model_top_mem_if_dm_pins_en_mem_if_dqsn_en #(
    .MEM_IF_ADDR_WIDTH            (13),
    .MEM_IF_ROW_ADDR_WIDTH        (12),
    .MEM_IF_COL_ADDR_WIDTH        (8),
    .MEM_IF_CS_PER_RANK           (1),
    .MEM_IF_CONTROL_WIDTH         (1),
    .MEM_IF_DQS_WIDTH             (1),
    .MEM_IF_CS_WIDTH              (1),
    .MEM_IF_BANKADDR_WIDTH        (3),
    .MEM_IF_DQ_WIDTH              (8),
    .MEM_IF_CK_WIDTH              (1),
    .MEM_IF_CLK_EN_WIDTH          (1),
    .DEVICE_WIDTH                 (1),
    .MEM_TRCD                     (6),
    .MEM_TRTP                     (3),
    .MEM_DQS_TO_CLK_CAPTURE_DELAY (100),
    .MEM_IF_ODT_WIDTH             (1),
    .MEM_MIRROR_ADDRESSING_DEC    (0),
    .MEM_REGDIMM_ENABLED          (0),
    .DEVICE_DEPTH                 (1),
    .MEM_INIT_EN                  (0),
    .MEM_INIT_FILE                (""),
    .DAT_DATA_WIDTH               (32)
) m0 (
    .mem_a     (e0_memory_mem_a),     // memory.mem_a
    .mem_ba    (e0_memory_mem_ba),    //       .mem_ba
    .mem_ck    (e0_memory_mem_ck),    //       .mem_ck
    .mem_ck_n  (e0_memory_mem_ck_n),  //       .mem_ck_n
    .mem_cke   (e0_memory_mem_cke),   //       .mem_cke
    .mem_cs_n  (e0_memory_mem_cs_n),  //       .mem_cs_n
    .mem_dm    (e0_memory_mem_dm),    //       .mem_dm
    .mem_ras_n (e0_memory_mem_ras_n), //       .mem_ras_n
    .mem_cas_n (e0_memory_mem_cas_n), //       .mem_cas_n
    .mem_we_n  (e0_memory_mem_we_n),  //       .mem_we_n
    .mem_dq    (e0_memory_mem_dq),    //       .mem_dq
    .mem_dqs   (e0_memory_mem_dqs),   //       .mem_dqs
    .mem_dqs_n (e0_memory_mem_dqs_n), //       .mem_dqs_n
    .mem_odt   (e0_memory_mem_odt)    //       .mem_odt
);
Replace the original code with the following code:
ddr2 memory_0 (
    .addr    (e0_memory_mem_a),     // memory.mem_a
    .ba      (e0_memory_mem_ba),    //       .mem_ba
    .clk     (e0_memory_mem_ck),    //       .mem_ck
    .clk_n   (e0_memory_mem_ck_n),  //       .mem_ck_n
    .cke     (e0_memory_mem_cke),   //       .mem_cke
    .cs_n    (e0_memory_mem_cs_n),  //       .mem_cs_n
    .dm_rdqs (e0_memory_mem_dm),    //       .mem_dm
    .ras_n   (e0_memory_mem_ras_n), //       .mem_ras_n
    .cas_n   (e0_memory_mem_cas_n), //       .mem_cas_n
    .we_n    (e0_memory_mem_we_n),  //       .mem_we_n
    .dq      (e0_memory_mem_dq),    //       .mem_dq
    .dqs     (e0_memory_mem_dqs),   //       .mem_dqs
    .rdqs_n  (),                    //       unconnected
    .dqs_n   (e0_memory_mem_dqs_n), //       .mem_dqs_n
    .odt     (e0_memory_mem_odt)    //       .mem_odt
);
Chapter 9: Simulating Memory IP
Simulation Walkthrough with UniPHY IP
9–7
If you are interfacing with a DIMM or multiple memory components, you need to
instantiate all the memory components in the simulation file.
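For example, for a hypothetical 16-bit interface built from two x8 components, you would connect memory_0 to the lower byte lane (e0_memory_mem_dq[7:0], e0_memory_mem_dqs[0], e0_memory_mem_dm[0]) and add a second instance for the upper lane. The following minimal Verilog sketch shows only the added second instance; the bit slices and lane indices are illustrative and must match your memory configuration:

ddr2 memory_1 (
    .addr    (e0_memory_mem_a),        // address, command, and clock pins
    .ba      (e0_memory_mem_ba),       // are shared with memory_0
    .clk     (e0_memory_mem_ck),
    .clk_n   (e0_memory_mem_ck_n),
    .cke     (e0_memory_mem_cke),
    .cs_n    (e0_memory_mem_cs_n),
    .dm_rdqs (e0_memory_mem_dm[1]),    // second DM lane
    .ras_n   (e0_memory_mem_ras_n),
    .cas_n   (e0_memory_mem_cas_n),
    .we_n    (e0_memory_mem_we_n),
    .dq      (e0_memory_mem_dq[15:8]), // upper byte lane
    .dqs     (e0_memory_mem_dqs[1]),
    .rdqs_n  (),
    .dqs_n   (e0_memory_mem_dqs_n[1]),
    .odt     (e0_memory_mem_odt)
);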
Functional Simulations
This topic discusses VHDL and Verilog HDL simulations with the UniPHY IP example design.
For more information about simulating Verilog HDL or VHDL designs using command lines, refer to the Mentor Graphics ModelSim® and QuestaSim Support chapter in volume 3 of the Quartus II Software Handbook.
Verilog HDL
Altera provides simulation scripts for you to run the example design. The simulation scripts are for the Synopsys, Cadence, and Mentor simulators, and are located in the following folders.
Simulation scripts in the simulation folders are located as follows:
■ <variation_name>_example_design\simulation\verilog\mentor\msim_setup.tcl
■ <variation_name>_example_design\simulation\verilog\synopsys\vcs\vcs_setup.sh
■ <variation_name>_example_design\simulation\verilog\synopsys\vcsmx\vcsmx_setup.sh
■ <variation_name>_example_design\simulation\verilog\cadence\ncsim_setup.sh
Simulation scripts in the <variation_name>_sim folder are located as follows:
■ <variation_name>_sim\mentor\msim_setup.tcl
■ <variation_name>_sim\cadence\ncsim_setup.sh
■ <variation_name>_sim\synopsys\vcs\vcs_setup.sh
■ <variation_name>_sim\synopsys\vcsmx\vcsmx_setup.sh
VHDL
The UniPHY VHDL fileset is specifically for ModelSim customers who use VHDL
exclusively and who do not have a mixed-language (VHDL and Verilog) simulation
license. All other customers should either select the Verilog language option during
generation, or simulate using the synthesis fileset.
The UniPHY IP VHDL simulation fileset consists of the following types of files:
■ IPFS-generated VHDL files
■ IEEE Encrypted Verilog HDL files (for Mentor, and in addition the equivalent plain-text Verilog files for all simulators that support mixed-language simulations)
■ Plain-text VHDL files
Although the IEEE Encrypted files are written in Verilog, you can simulate these files
in combination with VHDL without violating the single-language restrictions in
ModelSim because they are encrypted.
Because the VHDL fileset consists of both VHDL and Verilog files, you must follow
certain mixed-language simulation guidelines. The general guideline for
mixed-language simulation is that you must always link the Verilog files (whether
encrypted or not) against the Verilog version of the Altera libraries, and the VHDL
files (whether simgen-generated or pure VHDL) against the VHDL libraries.
Altera provides simulation scripts for you to run the example design. The simulation scripts are for the Synopsys, Cadence, and Mentor simulators, and are located in the following folders.
Simulation scripts in the simulation folders are located as follows:
■ <variation_name>_example_design\simulation\vhdl\mentor\msim_setup.tcl
■ <variation_name>_example_design\simulation\vhdl\synopsys\vcsmx\vcsmx_setup.sh
■ <variation_name>_example_design\simulation\vhdl\cadence\ncsim_setup.sh
Simulation scripts in the <variation_name>_sim folder are located as follows:
■ <variation_name>_sim\mentor\msim_setup.tcl
■ <variation_name>_sim\cadence\ncsim_setup.sh
■ <variation_name>_sim\synopsys\vcsmx\vcsmx_setup.sh
Simulating the Example Design
The following section describes how to simulate the example design in Cadence,
Synopsys, and Mentor simulators.
To simulate the example design in the Quartus II software using the Cadence
simulator, follow these steps:
1. At the Linux shell command prompt, change directory to
<name>_example_design\simulation\<verilog/vhdl>\cadence
2. Run the simulation by typing the following command at the command prompt:
sh ncsim_setup.sh
To simulate the example design in the Quartus II software using the Synopsys
simulator, follow these steps:
1. At the Linux shell command prompt, change directory to
<name>_example_design\simulation\<verilog/vhdl>\synopsys\vcsmx
2. Run the simulation by typing the following command at the command prompt:
sh vcsmx_setup.sh
To simulate the example design in the Quartus II software using the Mentor simulator,
follow these steps:
1. At the Linux or Windows shell command prompt, change directory to
<name>_example_design\simulation\<verilog/vhdl>\mentor
2. Execute the msim_setup.tcl script that automatically compiles and runs the
simulation by typing the following command at the Linux or Windows command
prompt:
vsim -do run.do
or
Type the following command at the ModelSim command prompt:
do run.do
This simulation method is only applicable to UniPHY.
For more information about simulation, refer to the Simulating Altera Designs chapter in volume 3 of the Quartus II Handbook.
If your Quartus II project appears to be configured correctly but the example testbench still fails, check the known issues on the Knowledge Database page of the Altera website before filing a service request.
Abstract PHY
In the Quartus II software version 11.1, UniPHY IP generates both synthesizable and abstract models for simulation, with the abstract model as the default. The UniPHY abstract model replaces the PLL with a simple fixed-delay model, and replaces the detailed models of the hard blocks with simple cycle-accurate functional models.
The UniPHY abstract model always runs in skip calibration mode, regardless of the auto-calibration mode you select in the parameter editor, because the abstract model does not support calibration. For VHDL, the UniPHY abstract model is the only option, because you cannot switch to the regular simulation model. The PLL frequencies in simulation may differ slightly from the real PLL frequencies because of picosecond timing rounding.
However, for Verilog HDL you can switch to the regular simulation models. The full and quick calibration modes are available for the regular simulation models.
To switch to the regular simulation models for Verilog HDL, follow these steps:
1. In a text editor, create a Verilog HDL header file containing the following line:
`define ALTERA_ALT_MEM_IF_PHY_FAST_SIM_MODEL 0
2. Name the header file uniphy_fast_sim_parameter.vh and save it in the <project directory>\<variation name>_example_design\simulation\<variation name>_example_sim\ directory.
3. Add the following line to
<project_directory>\<variation name>_example_design\simulation\verilog\<name>_example_sim_e0_if0_p0.sv and
<project_directory>\<variation name>_example_design\simulation\verilog\<name>_example_sim_e0_if0_pll0.sv:
`include "uniphy_fast_sim_parameter.vh"
If you use the UniPHY abstract model, simulation is about two times faster than with the regular simulation model. Instantiating a standalone UniPHY IP in your design further improves the simulation time if you use a half-rate controller with UniPHY or a larger memory DQ width.
PHY-Only Simulation
The PHY-only simulation option is a new feature in the Quartus II software version 11.1. To enable this feature in the parameter editor, on the PHY Settings tab, in the FPGA section, turn on Generate PHY only. This setting also applies to designs using Qsys. This option allows you to replace the Altera high-performance memory controllers with your own custom controller.
For more information about using a custom controller, refer to the “Using a Custom Controller” section in the Functional Description—ALTMEMPHY chapter of the External Memory Interface Handbook.
When you are using a standard UniPHY memory interface, by default, the parameter
editor generates an external memory interface with a controller and a PHY. The
controller and PHY are connected internally with the Altera PHY interface (AFI). The
memory interface has an Avalon slave port that connects to the controller to allow
communication from the user logic. When you turn on the PHY-only option, the
parameter editor generates the PHY without the controller. In this case, the PHY is
accessed via the AFI port, which can be externally connected to a custom controller. In
the example design, a controller is instantiated externally to the memory interface.
This provides a fully functional example design and demonstrates how to connect the
controller to manage the transactions from the traffic generator.
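As a rough illustration of the PHY-only connection, the following Verilog fragment shows a user controller driving the PHY over AFI. This is a hedged sketch only: the module name custom_controller is a placeholder, only a representative subset of AFI signals is shown, and the exact port names, widths, and net declarations must be taken from the PHY that the parameter editor generates for your configuration.

// Hypothetical fragment: a custom controller (your design) drives the
// generated PHY-only variation through the AFI port.  The same nets connect
// to the corresponding afi_* ports of the generated PHY instance, which also
// carries the mem_* ports to the memory device or memory model.
custom_controller u_ctrl (
    .afi_clk         (afi_clk),
    .afi_reset_n     (afi_reset_n),
    .afi_addr        (afi_addr),
    .afi_ba          (afi_ba),
    .afi_cs_n        (afi_cs_n),
    .afi_ras_n       (afi_ras_n),
    .afi_cas_n       (afi_cas_n),
    .afi_we_n        (afi_we_n),
    .afi_dm          (afi_dm),
    .afi_wdata       (afi_wdata),
    .afi_wdata_valid (afi_wdata_valid),
    .afi_dqs_burst   (afi_dqs_burst),
    .afi_rdata       (afi_rdata),
    .afi_rdata_en    (afi_rdata_en),
    .afi_rdata_valid (afi_rdata_valid),
    .afi_cal_success (afi_cal_success)
    // Avalon port toward the traffic generator or user logic not shown
);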
Figure 9–1 shows the difference in the UniPHY memory interface when the PHY-only option is enabled.
Figure 9–1. PHY-Only Option (block diagram: in the standard UniPHY configuration, the traffic generator connects through an Avalon interface to the controller inside the UniPHY memory interface, and the controller drives the PHY over AFI; with the PHY-only option, the controller sits outside the generated memory interface and drives the PHY over the exposed AFI port. In both cases the PHY connects to the memory model.)
Post-fit Functional Simulation
The post-fit functional simulation does not work for the UniPHY IP core because of the following inherent problems:
■ The UniPHY IP samples 'X' (unknown) values during calibration, which causes issues during timing simulation.
■ Some 0-cycle internal transfers require delays to function properly in a post-fit netlist.
To enable timing simulation for a design that uses the UniPHY IP core, a quasi-post-fit scheme is implemented. This scheme allows gate-level simulation of the full design (excluding the UniPHY IP), while you use RTL simulation for the UniPHY IP. The quasi-post-fit scheme involves partitioning blocks in the EMIF and swapping them with simulation RTL. With this workaround, the memory interface is partially post-fit netlist and partially pre-map RTL, so the simulation flow is not impeded.
Assuming that the UniPHY IP has been generated and inserted into a larger design, follow these steps to run post-fit simulation:
1. In the Quartus II software, set up a project that contains a UniPHY IP core.
2. On the Assignments menu, click Assignment Editor.
3. In the assignment editor, add the global assignment “VERILOG_MACRO” and set
the value to “SYNTH_FOR_SIM=1”.
4. On the Assignments menu, click Settings.
5. In the Category list, under EDA Tools Settings, select Simulation.
6. On the Simulation page, select a tool name (for example, ModelSim-Altera).
7. In the Format for output netlist list, select an HDL language.
8. In the Output directory box, type or browse to the location where you want
output files saved.
9. Click More EDA Netlist Writer Settings to choose from a list of other options.
10. Set the value for Maintain hierarchy to PARTITION_ONLY, and click OK.
11. Elaborate the project. On the Processing menu, select Start and click Start
Hierarchy Elaboration.
12. In the Project Navigator window, click the Hierarchy tab. In the Entity box, locate the instances for the following devices:
a. For instances in Stratix III, Stratix IV, Arria II GX, and Arria II GZ devices, click the + icon to expand the following top-level design entities, right-click the lower-level entities, select Design Partition, and click Set as Design Partition:
■ <hierarchy path to UniPHY top-level>\<name>_if0:if0\<name>_if0_p0:p0
■ <hierarchy path to UniPHY top-level>\<name>_if0:if0\<name>_if0_s0:s0
b. For instances in Stratix V devices, click the + icon to expand the following top-level design entity, right-click the lower-level entities, select Design Partition, and click Set as Design Partition:
■ <hierarchy path to UniPHY top-level>\<name>_if0:if0\<name>_if0_s0:s0
13. In the Design Partitions window, ensure that the netlist type of the design partitions listed in steps 12a and 12b is set to Post-synthesis.
14. On the Processing menu, select Start and click Start Analysis and Synthesis.
15. Run the pin assignment script. To run the script, follow these steps:
a. On the Tools menu, click Tcl Scripts.
b. In the Libraries list, locate <name>_pin_assignment.tcl.
c. Click Run.
16. On the Processing menu, select Start and click Partition Merge.
17. On the Processing menu, select Start and click Start Fitter.
18. On the Processing menu, select Start and click Start EDA Netlist Writer.
19. The output post-fit netlist is located in the directory you chose in step 8.
20. Assume that the netlist filename is dut.vo (or dut.vho for VHDL). Replace the instances of the partitioned modules (specified in step 12) in dut.vo with instantiations of the original RTL. As a result, the RTL of those modules simulates correctly instead of the post-fit netlist. For example, you can delete the definition of the <name>_if0_s0 (and <name>_if0_p0, if appropriate) modules in the post-fit netlist, and ensure that your simulator compiles the post-fit netlist and all the UniPHY RTL so that these modules are properly linked for simulation.
21. To match the post-fit netlist instantiation of s0 (and p0, if appropriate) with the original RTL module definition (specified in step 12), you must also account for three device input ports that are added to the post-fit netlist. The easiest way to do this is to delete the following three connections from the s0 (and p0, if appropriate) instances in the post-fit netlist:
■ .devpor(devpor)
■ .devclrn(devclrn)
■ .devoe(devpoe)
22. For Stratix V devices, the <name>_if0_s0 instance in the post-fit netlist also has a connection .QIC_GND_PORT(<wire name>) that you must delete because it does not match the original RTL module.
23. Set up and run your simulator.
Simulation Issues
When you simulate an example design in ModelSim, you might see the following warnings, which are expected and are not harmful:
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (1 or 1) does not match connection size (7) for port
'local_size'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (9 or 9) does not match connection size (1) for port
'ctl_cal_byte_lane_sel_n'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (18 or 18) does not match connection size (1) for port
'afi_doing_read'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (2 or 2) does not match connection size (1) for port
'afi_rdata_valid'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (112 or 112) does not match connection size (1) for port
'bank_information'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_controller_phy.sv(402
): [PCDPC] - Port size (8 or 8) does not match connection size (1) for port
'bank_open'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst
# ** Warning: (vsim-3017)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_alt_ddrx_bank_timer_w
rapper.v(1191): [TFMPC] - Too few port connections. Expected 127, found 126.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst/
bank_timer_wrapper_inst/bank_timer_inst
# ** Warning: (vsim-3722)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_alt_ddrx_bank_timer_w
rapper.v(1191): [TFMPC] - Missing connection for port 'wr_to_rd_to_pch_all'.
# ** Warning: (vsim-3015)
D:/design_folder/iptest10/simulation/uniphy_s4/rtl/uniphy_s4_alt_ddrx_bank_timer_w
rapper.v(1344): [PCDPC] - Port size (5 or 5) does not match connection size (1) for
port 'wr_to_rd_to_pch_all'.
#
Region:
/uniphy_s4_example_top_tb/dut/mem_if/controller_phy_inst/alt_ddrx_controller_inst/
bank_timer_wrapper_inst/rank_monitor_inst
# ** Warning: (vsim-8598) Non-positive replication multiplier inside concat.
Replication will be ignored
Warning-[OSPA-N] Overriding same parameter again
/p/eda/acd/altera/quartusII/10.1/quartus/eda/sim_lib/synopsys/stratixv_atoms_ncryp
t.v, 8499
Warning-[ZONMCM] Zero or negative multiconcat multiplier
../quartus_stratix5/ddr3_ctlr_sim/ddr3_ctlr_sequencer.sv, 916
Zero or negative multiconcat multiplier is found in design. It will be replaced by 1'b0.
Source info: {INIT_COUNT_WIDTH {1'b0}}
Warning-[PCWM-W] Port connection width mismatch
../quartus_stratix5/ddr3_ctlr_sim/ddr3_ctlr_sequencer_cpu.v, 2830
"the_sequencer_cpu_nios2_oci_itrace"
The following 38-bit expression is connected to 16-bit port "jdo" of module
"ddr3_ctlr_sequencer_cpu_nios2_oci_itrace", instance
"the_sequencer_cpu_nios2_oci_itrace".
Expression: jdo
use +lint=PCWM for more details
Simulation Walkthrough with ALTMEMPHY IP
For high-performance memory controllers with ALTMEMPHY IP, you can simulate
the example top-level file with the MegaWizard-generated IP functional simulation
models. The MegaWizard™ Plug-In Manager generates a VHDL or Verilog HDL
testbench for the example top-level file, which is in the \testbench directory of your
project directory.
You can use the IP functional simulation model with any Altera-supported VHDL or
Verilog HDL simulator. You can perform a simulation in a third-party simulation tool
from within the Quartus II software, using NativeLink.
The ALTMEMPHY megafunction cannot be simulated alone. To simulate the
ALTMEMPHY megafunction, you must use all of the following blocks:
■
Memory controller
■
Example driver (to initiate read and write transactions)
■
Testbench and a suitable vendor memory model
Simulating the whole memory interface is a good way to determine the latency of
your system. However, the latency found in simulation may be different than the
latency found on the board because functional simulation does not take into account
board trace delays and different process, voltage, and temperature scenarios. For a
given design on a given board, the latency found may differ by one clock cycle (for
full-rate designs) or two clock cycles (for half-rate designs) upon resetting the board.
Different boards can also show different latencies even with the same design.
The ALTMEMPHY megafunction only supports functional simulation; it does not
support gate-level simulation, for the following reasons:
■ The ALTMEMPHY is a calibrated interface; gate-level simulation can therefore be very slow and take up to several hours to complete.
■ Gate-level timing annotations, together with the phase sweeping that calibration uses, determine setup and hold violations. Because of the effect of X (unknown value) propagation within the atom simulation models, this causes gate-level simulation failures in some cases.
■ The memory interface timing methodology does not match the timing annotation that gate-level simulations use, so the gate-level simulation does not accurately match real device behavior.
Altera recommends that you validate the functional operation of your design with RTL simulation, and the timing of your design with the TimeQuest Timing Analyzer.
Before Simulating
In general, you need the following files to simulate:
■ Library files from the <Quartus II install path>\quartus\eda\sim_lib\ directory:
  ■ 220model
  ■ altera_primitives
  ■ altera_mf
  ■ sgate
  ■ arriaii_atoms, stratixiv_atoms, stratixiii_atoms, cycloneiii_atoms, stratixii_atoms, stratixiigx_atoms (device dependent)
  If you are targeting Stratix IV devices, you need both the Stratix IV and Stratix III files (stratixiv_atoms and stratixiii_atoms) to simulate, unless you are using NativeLink.
■ Sequencer wrapper file (in .vo or .vho format)
■ PLL file (for example, <variation_name>_alt_mem_phy_pll.v or .vhd)
■ ALTMEMPHY modules (in the <variation_name>_alt_mem_phy.v file)
■ Top-level file
■ User logic, or a driver for the PHY
■ Testbench
■ Vendor memory model
Preparing the Vendor Memory Model
If you are using a vendor memory model, instead of the MegaWizard-generated
functional simulation model, you need to make some modifications to the vendor
memory model and the testbench files by following these steps:
1. Make sure the IP functional simulation model is generated by turning on Generate
Simulation Model during the instantiate PHY and controller design flow step.
2. Obtain and copy the vendor memory model to the \testbench directory. For
example, obtain the ddr2.v and ddr2_parameters.vh simulation model files from
the Micron website and save them in the testbench directory.
The auto-generated generic SDRAM model may be used as a placeholder for a specific vendor memory model.
Some vendor DIMM memory models do not use data mask (DM) pin operation, which can cause calibration failures. In these cases, use the vendor’s component simulation models directly.
3. Open the vendor memory model file in a text editor and specify the speed grade
and device width at the top of the file. For example, you can add the following
statements for a DDR2 SDRAM model file:
`define sg25
`define x8
The first statement specifies the memory device speed grade as –25 (for 400 MHz
operation). The second statement specifies the memory device width per DQS.
4. Check that the following statement is included in the vendor memory model file.
If not, include it at the top of the file. This example is for a DDR2 SDRAM model
file:
`include "ddr2_parameters.vh"
5. Save the vendor memory model file.
6. Open the testbench in a text editor, delete the whole section between the START
MEGAWIZARD INSERT MEMORY_ARRAY and END MEGAWIZARD INSERT MEMORY_ARRAY
comments, instantiate the downloaded memory model, and connect its signals to
the rest of the design.
7. Delete the START MEGAWIZARD INSERT MEMORY_ARRAY and END MEGAWIZARD INSERT
MEMORY_ARRAY lines so that the wizard does not overwrite your changes if you use
the wizard to regenerate the design.
8. Ensure that port names and capitalization in the memory model match the port names and capitalization in the testbench.
The vendor memory model may use different pin names and capitalization
than the MegaWizard-generated functional model.
9. Save the testbench file.
Steps 6 through 9 are valid only for ALTMEMPHY designs using Verilog HDL.
The original instantiation (from step 3 to step 9) may be similar to the following code:
// << START MEGAWIZARD INSERT MEMORY_ARRAY
// This will need updating to match the memory models you are using.
// Instantiate a generated DDR memory model to match the datawidth & chipselect requirements
ddr2_mem_model mem (
    .mem_dq    (mem_dq),
    .mem_dqs   (mem_dqs),
    .mem_dqs_n (mem_dqs_n),
    .mem_addr  (a_delayed),
    .mem_ba    (ba_delayed),
    .mem_clk   (clk_to_ram),
    .mem_clk_n (clk_to_ram_n),
    .mem_cke   (cke_delayed),
    .mem_cs_n  (cs_n_delayed),
    .mem_ras_n (ras_n_delayed),
    .mem_cas_n (cas_n_delayed),
    .mem_we_n  (we_n_delayed),
    .mem_dm    (dm_delayed),
    .mem_odt   (odt_delayed)
);
// << END MEGAWIZARD INSERT MEMORY_ARRAY
Replace the original code with the following code:
// << START MEGAWIZARD INSERT MEMORY_ARRAY
// This will need updating to match the memory models you are using.
// Instantiate a generated DDR memory model to match the datawidth & chipselect requirements
ddr2 memory_0 (
    .clk     (clk_to_ram),
    .clk_n   (clk_to_ram_n),
    .cke     (cke_delayed),
    .cs_n    (cs_n_delayed),
    .ras_n   (ras_n_delayed),
    .cas_n   (cas_n_delayed),
    .we_n    (we_n_delayed),
    .dm_rdqs (dm_delayed[0]),
    .ba      (ba_delayed),
    .addr    (a_delayed),
    .dq      (mem_dq[7:0]),
    .dqs     (mem_dqs[0]),
    .dqs_n   (mem_dqs_n[0]),
    .rdqs_n  (),
    .odt     (odt_delayed)
);
// << END MEGAWIZARD INSERT MEMORY_ARRAY
If you are interfacing with a DIMM or multiple memory components, you need to
instantiate all the memory components in the testbench file.
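For example, for a hypothetical 16-bit interface built from two x8 components, you might add a second instance alongside memory_0, sharing the address, command, and clock signals and taking the upper byte lane. The following minimal Verilog sketch is illustrative only; the bit slices and lane indices must match your configuration:

ddr2 memory_1 (
    .clk     (clk_to_ram),      // address, command, and clock signals
    .clk_n   (clk_to_ram_n),    // are shared with memory_0
    .cke     (cke_delayed),
    .cs_n    (cs_n_delayed),
    .ras_n   (ras_n_delayed),
    .cas_n   (cas_n_delayed),
    .we_n    (we_n_delayed),
    .dm_rdqs (dm_delayed[1]),   // second DM lane
    .ba      (ba_delayed),
    .addr    (a_delayed),
    .dq      (mem_dq[15:8]),    // upper byte lane
    .dqs     (mem_dqs[1]),
    .dqs_n   (mem_dqs_n[1]),
    .rdqs_n  (),
    .odt     (odt_delayed)
);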
Simulating Using NativeLink
To set up simulation in the Quartus® II software using NativeLink, follow these steps:
1. Create a custom variation with an IP functional simulation model.
2. Set the top-level entity to the example project.
a. On the File menu, click Open.
b. Browse to <variation name>_example_top and click Open.
c. On the Project menu, click Set as Top-Level Entity.
3. Ensure that the Quartus II EDA Tool Options are configured correctly for your
simulation environment.
a. On the Tools menu, click Options.
b. In the Category list, click EDA Tool Options and verify the locations of the
executable files.
4. Set up the Quartus II NativeLink.
a. On the Assignments menu, click Settings. In the Category list, expand EDA
Tool Settings and click Simulation.
b. From the Tool name list, click on your preferred simulator.
c. In NativeLink settings, select Compile test bench and click Test Benches.
d. Click New at the Test Benches page to create a testbench.
5. On the New Test Bench Settings dialog box:
a. Type a name for the Test bench name, for example
<variation name>_example_top_tb.
b. In Top level module in test bench, type the name of the automatically
generated testbench, <variation name>_example_top_tb.
c. In Design instance in test bench, type the name of the top-level instance, dut.
d. Under Simulation period, set End simulation at to 600 µs.
e. Add the testbench files and automatically-generated memory model files. In
the File name field, browse to the location of the memory model and the
testbench, click Open and then click Add. The testbench is
<variation name>_example_top_tb.v; memory model is
<variation name>_mem_model.v.
f. Select the files and click OK.
6. On the Processing menu, point to Start and click Start Analysis & Elaboration to
start analysis.
7. On the Tools menu, point to Run Simulation Tool and click RTL Simulation.
If your Quartus II project appears to be configured correctly but the example testbench still fails, check the known issues on the Knowledge Database page of the Altera website before filing a service request.
For a complete MegaWizard Plug-In Manager system design example containing the
DDR and DDR2 SDRAM Controller with ALTMEMPHY IP, refer to the design
tutorials and design examples on the List of designs using Altera External Memory IP
page of the Altera Wiki website.
IP Functional Simulations
This topic discusses VHDL and Verilog HDL simulations with IP functional
simulation models.
VHDL
For VHDL simulations with IP functional simulation models, perform the following
steps:
1. Create a directory in the <project directory>\testbench directory.
2. Launch your simulation tool from this directory and create the following libraries:
■ altera_mf
■ lpm
■ sgate
■ <device name>
■ altera
■ ALTGXB
■ <device name>_hssi
■ auk_ddr_hp_user_lib
3. Compile the files into the appropriate library (Table 9–2). The files are in VHDL93
format.
Table 9–2. Files to Compile—VHDL IP Functional Simulation Models
Library: altera_mf
  <QUARTUS ROOTDIR>\eda\sim_lib\altera_mf_components.vhd
  <QUARTUS ROOTDIR>\eda\sim_lib\altera_mf.vhd
Library: lpm
  \eda\sim_lib\220pack.vhd
  \eda\sim_lib\220model.vhd
Library: sgate
  eda\sim_lib\sgate_pack.vhd
  eda\sim_lib\sgate.vhd
Library: <device name>
  eda\sim_lib\<device name>_atoms.vhd
  eda\sim_lib\<device name>_components.vhd
  eda\sim_lib\<device name>_hssi_atoms.vhd (1)
Library: altera
  eda\sim_lib\altera_primitives_components.vhd
  eda\sim_lib\altera_syn_attributes.vhd
  eda\sim_lib\altera_primitives.vhd
Library: ALTGXB (1)
  <device name>_mf.vhd
  <device name>_mf_components.vhd
Library: <device name>_hssi (1)
  <device name>_hssi_components.vhd
  <device name>_hssi_atoms.vhd
Library: auk_ddr_hp_user_lib
  <QUARTUS ROOTDIR>\libraries\vhdl\altera\altera_europa_support_lib.vhd
  <project directory>\<variation name>_phy_alt_mem_phy_seq_wrapper.vho
  <project directory>\<variation name>_phy.vho
  <project directory>\<variation name>.vhd
  <project directory>\<variation name>_example_top.vhd
  <project directory>\<variation name>_controller_phy.vhd
  <project directory>\<variation name>_phy_alt_mem_phy_pll.vhd
  <project directory>\<variation name>_phy_alt_mem_phy_seq.vhd
  <project directory>\<variation name>_example_driver.vhd
  <project directory>\<variation name>_ex_lfsr8.vhd
  testbench\<variation name>_example_top_tb.vhd
  testbench\<variation name>_mem_model.vhd
Note for Table 9–2:
(1) Applicable only for Arria II GX and Stratix IV devices.
If you are targeting a Stratix IV device, you need both the Stratix IV and Stratix III files (stratixiv_atoms and stratixiii_atoms) to simulate in your simulator, unless you are using NativeLink.
4. Load the testbench in your simulator with the timestep set to picoseconds.
5. Compile the testbench file.
Verilog HDL
For Verilog HDL simulations with IP functional simulation models, follow these
steps:
1. Create a directory in the <project directory>\testbench directory.
2. Launch your simulation tool from this directory and create the following libraries:
■ altera_mf_ver
■ lpm_ver
■ sgate_ver
■ <device name>_ver
■ altera_ver
■ ALTGXB_ver
■ <device name>_hssi_ver
■ auk_ddr_hp_user_lib
3. Compile the files into the appropriate library as shown in Table 9–3 on page 9–22.
Table 9–3. Files to Compile—Verilog HDL IP Functional Simulation Models
Library: altera_mf_ver
  <QUARTUS ROOTDIR>\eda\sim_lib\altera_mf.v
Library: lpm_ver
  \eda\sim_lib\220model.v
Library: sgate_ver
  eda\sim_lib\sgate.v
Library: <device name>_ver
  eda\sim_lib\<device name>_atoms.v
  eda\sim_lib\<device name>_hssi_atoms.v (1)
Library: altera_ver
  eda\sim_lib\altera_primitives.v
Library: ALTGXB_ver (1)
  <device name>_mf.v
Library: <device name>_hssi_ver (1)
  <device name>_hssi_atoms.v
Library: auk_ddr_hp_user_lib
  <QUARTUS ROOTDIR>\libraries\vhdl\altera\altera_europa_support_lib.v
  alt_mem_phy_defines.v
  <project directory>\<variation name>_phy_alt_mem_phy_seq_wrapper.vo
  <project directory>\<variation name>.v
  <project directory>\<variation name>_example_top.v
  <project directory>\<variation name>_phy.v
  <project directory>\<variation name>_controller_phy.v
  <project directory>\<variation name>_phy_alt_mem_phy_pll.v
  <project directory>\<variation name>_phy_alt_mem_phy.v
  <project directory>\<variation name>_example_driver.v
  <project directory>\<variation name>_ex_lfsr8.v
  testbench\<variation name>_example_top_tb.v
  testbench\<variation name>_mem_model.v
Note for Table 9–3:
(1) Applicable only for Arria II GX and Stratix IV devices.
If you are targeting a Stratix IV device, you need both the Stratix IV and Stratix III files (stratixiv_atoms and stratixiii_atoms) to simulate in your simulator, unless you are using NativeLink.
4. Configure your simulator to use transport delays, a timestep of picoseconds, and
to include all the libraries in Table 9–3.
5. Compile the testbench file.
Simulation Tips and Issues
This topic discusses simulation tips and issues.
Tips
The ALTMEMPHY datapath is in Verilog HDL; the sequencer is in VHDL. For
ALTMEMPHY designs with the AFI, to allow the Verilog HDL simulator to simulate
the design after modifying the VHDL sequencer, follow these steps:
1. On the View menu, point to Utility Windows, and click TCL console.
2. Enter the following command in the console:
quartus_map --read_settings_file=on --write_settings_file=off --source=<variation_name>_phy_alt_mem_phy_seq.vhd --source=<variation_name>_phy_alt_mem_phy_seq_wrapper.v --simgen --simgen_parameter=CBX_HDL_LANGUAGE=verilog <variation_name>_phy_alt_mem_phy_seq_wrapper -c <variation_name>_phy_alt_mem_phy_seq_wrapper
The Quartus II software regenerates
<variation_name>_phy_alt_mem_phy_seq_wrapper.vo and uses this file when the
simulation runs.
DDR3 SDRAM (without Leveling) Warnings and Errors
You may see the following warning and error messages with skip calibration and
quick calibration simulation:
■ WARNING: 200 us is required before RST_N goes inactive
■ WARNING: 500 us is required after RST_N goes inactive before CKE goes active
If these warning messages appear, change the values of the two parameters
(tinit_tck and tinit_rst) in the following files to match the parameters in
<variation_name>_phy_alt_mem_phy_seq_wrapper.v:
■ <variation_name>_phy_alt_mem_phy_seq_wrapper.vo or
■ <variation_name>_phy_alt_mem_phy_seq_wrapper.vho files
You may see the following warning and error messages with full calibration
simulation during write leveling, which you can ignore:
■ Warning: tWLS violation on DQS bit 0 positive edge. Indeterminate CK capture is possible
■ Warning: tWLH violation on DQS bit 0 positive edge. Indeterminate CK capture is possible.
■ ERROR: tDQSH violation on DQS bit 0
You may see the following warning messages at time 0 (before reset) of simulation,
which you can ignore:
■ Warning: There is an 'U'|'X'|'W'|'Z'|'-' in an arithmetic operand, the result will be 'X'(es).
■ Warning: NUMERIC_STD.TO_INTEGER: metavalue detected, returning 0
You may see the following warning and error messages during reset, which you can
ignore:
■ Error: clock switches from 0/1 to X (Unknown value) on DLL instance
■ Warning: Duty Cycle violation DLL instance Warning: Input clock duty cycle violation.
Document Revision History
Table 9–4 lists the revision history for this document.
Table 9–4. Document Revision History
Date          | Version | Changes
November 2011 | 4.0     | Added the PHY-Only Simulation section; added the Post-fit Functional Simulation section; updated the Simulation Walkthrough with UniPHY IP section.
June 2011     | 3.0     | Added an overview about memory simulation; added the Simulation Walkthrough with UniPHY IP section.
December 2010 | 2.1     | Updated for 10.1 release.
July 2010     | 2.0     | Updated for 10.0 release.
January 2010  | 1.1     | Corrected minor typos.
November 2009 | 1.0     | First published.
10. Analyzing Timing of Memory IP
November 2011
EMI_DG_010-4.0
Ensuring that your external memory interface meets the various timing requirements
of today’s high-speed memory devices can be a challenge. Altera addresses this
challenge by offering external memory physical layer (PHY) interface IPs—
ALTMEMPHY and UniPHY, which employ a combination of source-synchronous and
self-calibrating circuits to maximize system timing margins. This PHY interface is a
plug-and-play solution that the Quartus® II TimeQuest Timing Analyzer timing
constrains and analyzes. The ALTMEMPHY and UniPHY IP, and the numerous
device features offered by Arria® II, Arria V, Cyclone® III, Cyclone IV, Cyclone V,
Stratix® III, Stratix IV, and Stratix V FPGAs, greatly simplifies the implementation of
an external memory interface. All the information presented in this document for
Stratix III and Stratix IV devices is applicable to HardCopy® III and HardCopy IV
devices, respectively.
This chapter details the various timing paths that determine overall external memory
interface performance, and describes the timing constraints and assumptions that the
PHY IP uses to analyze these paths.
This chapter focuses on timing constraints for external memory interfaces based on the ALTMEMPHY and UniPHY IP. For information about timing constraints and analysis of external memory interfaces and other source-synchronous interfaces based on the ALTDQ_DQS and ALTDQ_DQS2 megafunctions, refer to AN 433: Constraining and Analyzing Source-Synchronous Interfaces and the Quartus II TimeQuest Timing Analyzer chapter in volume 3 of the Quartus II Handbook.
External memory interface timing analysis is supported only by the TimeQuest
Timing Analyzer, for the following reasons:
■ The wizard-generated timing constraint scripts support only the TimeQuest analyzer.
■ The Classic Timing Analyzer does not offer analysis of source-synchronous outputs, such as write data, address, and command outputs.
■ The Classic Timing Analyzer does not support detailed rise and fall delay analysis.
The performance of an FPGA interface to an external memory device is dependent on
the following items:
■ Read datapath timing
■ Write datapath timing
■ Address and command path timing
■ Clock to strobe timing (tDQSS in DDR and DDR2 SDRAM, and tKHK#H in QDR II and QDR II+ SRAM)
■ Read resynchronization path timing (applicable for DDR, DDR2, and DDR3 SDRAM in Arria II, Arria V, Stratix III, Stratix IV, and Stratix V devices)
■ Read postamble path timing (applicable for DDR and DDR2 SDRAM in Stratix II devices)
■ Write leveling path timing (applicable for DDR3 SDRAM with ALTMEMPHY, and DDR2 and DDR3 SDRAM with UniPHY)
■ PHY timing paths between I/O element and core registers
■ PHY and controller internal timing paths (core fMAX and reset recovery/removal)
■ I/O toggle rate
■ Output clock specifications
■ Bus turnaround timing (applicable for RLDRAM II, and DDR2 and DDR3 SDRAM with UniPHY)
External memory interface performance depends on various timing components, and
overall system level performance is limited by performance of the slowest link (that is,
the path with the smallest timing margins).
Memory Interface Timing Components
There are several categories of memory interface timing components, including
source-synchronous timing paths, calibrated timing paths, internal FPGA timing
paths, and other FPGA timing parameters.
Understanding the nature of timing paths enables you to use an appropriate timing
analysis methodology and constraints. The following section examines these aspects
of memory interface timing paths.
Source-Synchronous Paths
These are timing paths where clock and data signals pass from the transmitting device
to the receiving device.
An example of such a path is the FPGA-to-memory write datapath. The FPGA device
transmits DQ output data signals to the memory along with a center-aligned DQS
output strobe signal. The memory device uses the DQS signal to clock the data on the
DQ pins into its internal registers.
For brevity, the remainder of this chapter refers to data signals and strobe and clock
signals as DQ signals and DQS signals, respectively. While the terminology is
formally correct only for DDR-type interfaces and does not match QDR II, QDR II+
and RLDRAM II pin names, the behavior is similar enough that most timing
properties and concepts apply to both. The clock that captures address and command
signals is always referred to as CK/CK# too.
Calibrated Paths
These are timing paths where the clock used to capture data is dynamically positioned
within the data valid window (DVW) to maximize timing margin.
For Arria II FPGAs interfacing with a DDR2 and DDR3 SDRAM controller with
ALTMEMPHY IP, the resynchronization of read data from the DQS-based capture
registers to the FPGA system clock domain is implemented using a self-calibrating
circuit. On initialization, the sequencer block analyzes all path delays between the
read capture and resynchronization registers to set up the resynchronization clock
phase for optimal timing margin.
In Cyclone III and Cyclone IV FPGAs, the ALTMEMPHY IP performs the initial data
capture from the memory device using a self-calibrating circuit. The ALTMEMPHY IP
does not use the DQS strobes from the memory for capture; instead, it uses a dynamic
PLL clock signal to capture DQ data signals into core LE registers.
For UniPHY-based controllers, the sequencer block analyzes all path delays between
the read capture registers and the read FIFO buffer to set up the FIFO write clock
phase for optimal timing margin. The read postamble calibration process is implemented in a similar manner to the read resynchronization calibration. In addition, the sequencer block calibrates a read data valid signal to account for the delay between the controller issuing a read command and read data returning to the controller.
In DDR2 and DDR3 SDRAM and RLDRAM II interfaces with UniPHY, the UniPHY IP calibrates the write-leveling chains and the programmable output delay chain to align the DQS edge with the CK edge at the memory, to meet the tDQSS, tDSS, and tDSH specifications.
The UniPHY IP enables dynamic deskew calibration with the Nios sequencer for the read and write paths. The dynamic deskew process uses the programmable delay chains within the read and write datapaths to adjust the delay of each DQ and DQS pin, removing the skew between different DQ signals and center-aligning the DQS strobe in the DVW of the DQ signals. This process occurs at power up for the read and the write paths.
Internal FPGA Timing Paths
Other timing paths that have an impact on memory interface timing include FPGA
internal fMAX paths for PHY and controller logic. This timing analysis is common to all
FPGA designs. With appropriate timing constraints on the design (such as clock
settings), the TimeQuest Timing Analyzer reports the corresponding timing margins.
For more information about the TimeQuest Timing Analyzer, refer to the Quartus II TimeQuest Timing Analyzer chapter in volume 3 of the Quartus II Handbook.
Other FPGA Timing Parameters
Some FPGA data sheet parameters, such as I/O toggle rate and output clock
specifications, can limit memory interface performance.
I/O toggle rates vary based on speed grade, loading, and I/O bank location—
top/bottom versus left/right. This toggle rate is also a function of the termination
used (OCT or external termination) and other settings such as drive strength and slew
rate.
Ensure you check the I/O performance in the overall system performance calculation.
Altera recommends that you perform signal integrity analysis for the specified drive
strength and output pin load combination.
For information about signal integrity, refer to the board design guidelines chapters and AN 476: Impact of I/O Settings on Signal Integrity in Stratix III Devices.
Output clock specifications include clock period jitter, half-period jitter, cycle-to-cycle jitter, and skew between FPGA clock outputs. You can obtain these specifications from the FPGA data sheet; they must meet the memory device requirements. You can use these specifications to determine the overall data valid window for signals transmitted between the memory and FPGA device.
FPGA Timing Paths
This topic describes the FPGA timing paths, the timing constraints examples, and the
timing assumptions that the constraint scripts use.
In Arria II, Arria V, Stratix III, Stratix IV, and Stratix V devices, the interface margin is
reported based on a combination of the TimeQuest Timing Analyzer and further steps
to account for calibration that occurs at runtime. First the TimeQuest analyzer returns
the base setup and hold slacks, and then further processing adjusts the slacks to
account for effects which cannot be modeled in TimeQuest.
Arria II Device PHY Timing Paths
Table 10–1 lists all Arria II device external memory interface timing paths.
Table 10–1. Arria II Devices External Memory Interface Timing Paths (1)
Timing Path | Circuit Category | Source | Destination
Read Data (2), (7) | Source-Synchronous | Memory DQ, DQS Pins | DQ Capture Registers in IOE
Write Data (2), (7) | Source-Synchronous | FPGA DQ, DQS Pins | Memory DQ, DM, and DQS Pins
Address and command (2) | Source-Synchronous | FPGA CK/CK# and Addr/Cmd Pins | Memory Input Pins
Clock-to-Strobe (2) | Source-Synchronous | FPGA CK/CK# and DQS Output Pins | Memory Input Pins
Read Resynchronization (2), (3) | Calibrated | IOE Capture Registers | IOE Resynchronization Registers
Read Resynchronization (2), (6) | Calibrated | IOE Capture Registers | Read FIFO in FPGA Core
PHY IOE-Core Paths (2), (3) | Source-Synchronous | IOE Resynchronization Registers | FIFO in FPGA Core
PHY and Controller Internal Paths (2) | Internal Clock fMAX | Core Registers | Core Registers
I/O Toggle Rate (4) | I/O | FPGA Output Pin | Memory Input Pins
Output Clock Specifications (Jitter, DCD) (5) | I/O | FPGA Output Pin | Memory Input Pins
Notes to Table 10–1:
(1) Timing paths applicable for an interface between Arria II devices and an SDRAM component.
(2) Timing margins for this path are reported by the TimeQuest Timing Analyzer Report DDR function.
(3) Only for ALTMEMPHY megafunctions.
(4) Altera recommends that you perform signal integrity simulations to verify I/O toggle rate.
(5) For output clock specifications, refer to the Arria II Device Data Sheet chapter of the Arria II Handbook.
(6) Only for UniPHY IP.
(7) Arria II GX devices use a source-synchronous and calibrated path.
Figure 10–1 shows the Arria II GX device input datapath registers and circuit types. UniPHY IP interfaces bypass the synchronization registers.
Figure 10–1. Arria II GX Devices Input Data Path Registers and Circuit Types in SDRAM Interface (block diagram: SDRAM DQ and DQS feed the DDR input registers (Input Reg AI, BI, and CI) in the IOE; data then passes through synchronization registers, clocked by the resynchronization clock, into a FIFO; the stages are labeled I/O source-synchronous, calibrated, and internal source-synchronous)
Figure 10–2 shows the Arria II GZ device input datapath registers and circuit types.
Figure 10–2. Arria II GZ Devices Input Data Path Registers and Circuit Types in SDRAM Interface (block diagram: SDRAM DQ and DQS feed the DDR input registers (Input Reg AI, BI, and CI) in the IOE; data passes directly into a FIFO clocked by the half-rate resynchronization clock; the path is labeled I/O source-synchronous and calibrated)
Stratix III and Stratix IV PHY Timing Paths
A closer look at all the register transfers occurring in the Stratix III and Stratix IV input datapath reveals many source-synchronous and calibrated circuits. The information in Figure 10–3 and Table 10–2 is based on Stratix IV devices, but it is also applicable to Stratix III devices.
Figure 10–3 shows a block diagram of this input path with some of these paths identified for Stratix IV devices. The output datapath contains a similar set of circuits. UniPHY IP interfaces bypass the alignment and synchronization registers.
Figure 10–3. Stratix IV Input Path Registers and Circuit Types in SDRAM Interface (block diagram: SDRAM DQ and DQS feed the DDR input registers (Input Reg AI, BI, and CI) in the IOE; data then passes through alignment and synchronization registers, half-rate data registers, and into a FIFO in the core; the resynchronization clock, the half-rate resynchronization clock, and the I/O clock divider clock these stages; the circuit types are labeled I/O source-synchronous and calibrated, calibrated, and internal source-synchronous)
Table 10–2 lists the timing paths applicable for an interface between Stratix IV devices and half-rate SDRAM components. The timing paths are also applicable to Stratix III devices, but Stratix III devices use only the source-synchronous path for the read and write datapaths.
Table 10–2. Stratix IV External Memory Interface Timing Paths
Timing Path | Circuit Category | Source | Destination
Read Data (1) | Source-Synchronous and Calibrated | Memory DQ, DQS Pins | DQ Capture Registers in IOE
Write Data (1) | Source-Synchronous and Calibrated | FPGA DQ, DQS Pins | Memory DQ, DM, and DQS Pins
Address and command (1) | Source-Synchronous | FPGA CK/CK# and Addr/Cmd Pins | Memory Input Pins
Clock-to-Strobe (1) | Source-Synchronous | FPGA CK/CK# and DQS Output Pins | Memory Input Pins
Read Resynchronization (1), (2) | Calibrated | IOE Capture Registers | IOE Alignment and Resynchronization Registers
Read Resynchronization (1), (5) | Calibrated | IOE Capture Registers | Read FIFO in FPGA Core
PHY IOE-Core Paths (1), (2) | Source-Synchronous | IOE Half Data Rate Registers and Half-Rate Resynchronization Clock | FIFO in FPGA Core
PHY & Controller Internal Paths (1) | Internal Clock fMAX | Core registers | Core registers
I/O Toggle Rate (3) | I/O – Data sheet | FPGA Output Pin | Memory Input Pins
Output Clock Specifications (Jitter, DCD) (4) | I/O – Data sheet | FPGA Output Pin | Memory Input Pins
Notes to Table 10–2:
(1) Timing margins for this path are reported by the TimeQuest Timing Analyzer Report DDR function.
(2) Only for ALTMEMPHY megafunctions.
(3) Altera recommends that you perform signal integrity simulations to verify I/O toggle rate.
(4) For output clock specifications, refer to the DC and Switching Characteristics chapter of the Stratix IV Device Handbook.
(5) Only for UniPHY IP.
Arria V, Cyclone V, and Stratix V Timing Paths
Figure 10–4 shows a block diagram of the Stratix V input datapath.
Figure 10–4. Arria V, Cyclone V, and Stratix V Input Data Path (block diagram: SDRAM DQ and DQS feed the DDR input registers (Input Reg AI, BI, and CI) in the IOE, which pass data directly into a FIFO; the path is labeled I/O source-synchronous and calibrated)
Table 10–3 lists all Stratix V device external memory interface timing paths.
Table 10–3. Stratix V External Memory Interface Timing Paths (1)
Timing Path | Circuit Category | Source | Destination
Read Data (2) | Source-Synchronous and Calibrated | Memory DQ, DQS Pins | DQ Capture Registers in IOE
Write Data (2) | Source-Synchronous and Calibrated | FPGA DQ, DM, DQS Pins | Memory DQ, DM, and DQS Pins
Address and command (2) | Source-Synchronous | FPGA CK/CK# and Addr/Cmd Pins | Memory Input Pins
Clock-to-Strobe (2) | Source-Synchronous | FPGA CK/CK# and DQS Output Pins | Memory Input Pins
Read Resynchronization (2) | Source-Synchronous | IOE Capture Registers | Read FIFO in IOE
PHY & Controller Internal Paths (2) | Internal Clock fMAX | Core Registers | Core Registers
I/O Toggle Rate (3) | I/O – Data sheet | FPGA Output Pin | Memory Input Pins
Output Clock Specifications (Jitter, DCD) (4) | I/O – Data sheet | FPGA Output Pin | Memory Input Pins
Notes to Table 10–3:
(1) This table lists the timing paths applicable for an interface between Arria V, Cyclone V, and Stratix V devices and half-rate SDRAM components.
(2) Timing margins for this path are reported by the TimeQuest Timing Analyzer Report DDR function.
(3) Altera recommends that you perform signal integrity simulations to verify I/O toggle rate.
(4) For output clock specifications, refer to the DC and Switching Characteristics chapter of the Stratix V Device Handbook.
Cyclone III and Cyclone IV PHY Timing Paths
Table 10–4 lists the various timing paths in a Cyclone III and Cyclone IV memory
interface. Cyclone III and Cyclone IV devices use a calibrated PLL output clock for
data capture and ignore the DQS strobe from the memory. Therefore, read
resynchronization and postamble timing paths do not apply to Cyclone III and
Cyclone IV designs. The read capture is implemented in LE registers specially placed
next to the data pin with fixed routing, and data is transferred from the capture clock
domain to the system clock domain using a FIFO block. Figure 10–5 shows the
Cyclone III and Cyclone IV input datapath registers and circuit types.
Table 10–4. Cyclone III and Cyclone IV SDRAM External Memory Interface Timing Paths
Timing Path | Circuit Category (1) | Source | Destination
Read Data (2) | Calibrated | Memory DQ, DQS Pins | FPGA DQ Capture Registers in LEs
Write Data (2) | Source-Synchronous | FPGA DQ, DQS Pins | Memory DQ, DM, and DQS Pins
Address and command (2) | Source-Synchronous | FPGA CK/CK# and Addr/Cmd Pins | Memory Input Pins
Clock-to-Strobe (2) | Source-Synchronous | FPGA CK/CK# and DQS Output Pins | Memory Input Pins
PHY Internal Timing (2) | Internal Clock fMAX | LE Half Data Rate Registers | FIFO in FPGA Core
I/O Toggle Rate (3) | I/O – Data sheet, I/O Timing section | FPGA Output Pin | Memory Input Pins
Output Clock Specifications (Jitter, DCD) (4) | I/O – Data sheet, Switching Characteristics section | FPGA Output Pin | Memory Input Pins
Notes to Table 10–4:
(1) Table 10–4 lists the timing paths applicable for an interface between Cyclone III and Cyclone IV devices and SDRAM.
(2) Timing margins for this path are reported by the TimeQuest Timing Analyzer Report DDR function.
(3) Altera recommends that you perform signal integrity simulations to verify I/O toggle rate.
(4) For output clock specifications, refer to the DC and Switching Characteristics chapter of the Cyclone III Device Handbook and the Cyclone IV Device Handbook.
Figure 10–5. Cyclone III or Cyclone IV Input Data Path Registers and Circuit Types in SDRAM Interface
(Block diagram: DQ from the SDRAM feeds DDR input LE registers clocked by a PLL-generated capture and resynchronization clock; the captured data is transferred to the core through a FIFO. The circuit types shown are internal, source-synchronous, and calibrated.)
Timing Constraint and Report Files
The timing constraints differ for the ALTMEMPHY megafunction and the UniPHY IP.
ALTMEMPHY Megafunction
To ensure a successful external memory interface operation, the ALTMEMPHY
MegaWizard™ Plug-In Manager generates the following files for timing constraints
and reporting scripts:
■ <variation_name>_phy_ddr_timing.sdc
■ <variation_name>_phy_ddr_timing.tcl (except Cyclone III devices)
■ <variation_name>_phy_report_timing.tcl
■ <variation_name>_phy_report_timing_core.tcl (except Cyclone III devices)
■ <variation_name>_phy_ddr_pins.tcl
<variation_name>_ddr_timing.sdc
The Synopsys Design Constraints File (.sdc) has the name
<controller_variation_name>_phy_ddr_timing.sdc when you instantiate the
ALTMEMPHY megafunction in the Altera® memory controller, and has the name
<phy_variation_name>_ddr_timing.sdc when you instantiate the ALTMEMPHY
megafunction as a stand-alone design.
To analyze the timing margins for all ALTMEMPHY megafunction timing paths, execute the Report DDR function in the TimeQuest Timing Analyzer; refer to “Timing Analysis Description” on page 10–13. No timing constraints are necessary (or specified in the .sdc) for the read capture and write datapaths of Arria II GX devices, because all DQ and DQS pins are predefined, the capture and output registers are built into the IOE, and the signals use dedicated routing connections. Timing constraints have no impact on the read and write timing margins. However, the timing margins for these paths are analyzed using FPGA data sheet specifications and the user-specified memory data sheet parameters.
The ALTMEMPHY megafunction uses the following .sdc constraints for internal FPGA timing paths, address and command paths, and clock-to-strobe timing paths (an illustrative sketch follows the list):
■ Creating clocks on PLL inputs
■ Creating generated clocks using derive_pll_clocks, which includes all full-rate and half-rate PLL outputs, the PLL reconfiguration clock, and I/O scan clocks
■ Calling derive_clock_uncertainty
■ Cutting timing paths for DDR I/O, calibrated paths, and most reset paths
■ Setting output delays on address and command outputs (versus CK/CK# outputs)
■ Setting 2T or two clock-period multicycle setup for all half-rate address and command outputs, except nCS and on-die termination (ODT) (versus CK/CK# outputs)
■ Setting output delays on DQS strobe outputs (versus CK/CK# outputs for DDR2 and DDR SDRAM)
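The following fragment is a minimal sketch of what constraints of these kinds look like in an .sdc. The clock and port names, periods, and delay values are hypothetical placeholders; they are not the exact constraints that the ALTMEMPHY megafunction generates.

    # Hypothetical reference clock (name and period are placeholders)
    create_clock -name pll_ref_clk -period 4.0 [get_ports pll_ref_clk]
    # Derive all PLL output clocks and apply clock uncertainty
    derive_pll_clocks
    derive_clock_uncertainty
    # Address and command output delays, referenced to a CK output clock
    set_output_delay -clock ck_out -max 0.8 [get_ports {mem_addr[*] mem_ba[*]}]
    set_output_delay -clock ck_out -min -0.8 [get_ports {mem_addr[*] mem_ba[*]}]
    # 2T (two clock-period) multicycle setup on half-rate address/command outputs
    set_multicycle_path -setup -to [get_ports {mem_addr[*] mem_ba[*]}] 2
    # Cut calibrated and reset paths that are not statically analyzed
    set_false_path -from [get_registers *phy_reset*] -to [get_registers *]

In the generated files, the actual clock, register, and pin names are discovered with helper functions (see “<variation_name>_ddr_pins.tcl” below) rather than hard-coded as in this sketch.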
The high-performance controller MegaWizard Plug-In Manager generates an extra
<variation_name>_example_top.sdc for the example driver design. This file contains
the timing constraints for the non-DDR specific parts of the project.
<variation_name>_ddr_timing.tcl
This script includes the memory interface and FPGA device timing parameters for
your variation. It is included within <variation_name>_report_timing.tcl and
<variation_name>_ddr_timing.sdc and runs automatically during compilation. This
script is run for every instance of the same variation. Cyclone III devices do not have
this .tcl file. All the parameters are in the .sdc.
<variation_name>_report_timing.tcl
This script reports the timing slacks for your variation. It runs automatically during
compilation. You can also run this script with the Report DDR task in the TimeQuest
Timing Analyzer window. This script is run for every instance of the same variation.
<variation_name>_report_timing_core.tcl
This script contains high-level procedures that <variation_name>_report_timing.tcl
uses to compute the timing slacks for your variation. It runs automatically during
compilation. Cyclone III devices do not have this .tcl file.
<variation_name>_ddr_pins.tcl
This script includes all the functions and procedures required by the
<variation_name>_report_timing.tcl and <variation_name>_ddr_timing.sdc scripts. It
is a library of useful functions to include at the top of an .sdc. It finds all the variation
instances in the design and the associated clock, register, and pin names of each
instance. The results are saved in the same directory as the .sdc and
<variation_name>_report_timing.tcl, as <variation_name>_autodetectedpins.tcl.
Because this .tcl file traverses the design for the project pin names, you do not need to
keep the same port names on the top level of the design.
UniPHY IP
To ensure successful external memory interface operation, the UniPHY IP generates two sets of files for timing constraints, in different folders and with slightly different filenames. One set is used for the synthesis project and is available under the <variation_name> folder in the main project folder; the other set is for the example design, located in the <variation_name>_example_design\example_project folder.
The project folders contain the following files for timing constraints and reporting
scripts:
■ <variation_name>.sdc
■ <variation_name>_timing.tcl
■ <variation_name>_report_timing.tcl
■ <variation_name>_report_timing_core.tcl
■ <variation_name>_pin_map.tcl
■ <variation_name>_parameters.tcl
<variation_name>.sdc
The <variation_name>.sdc is listed in the wizard-generated Quartus II IP File (.qip).
Including this file in the project allows the Quartus II synthesis and Fitter to use timing-driven compilation to optimize the timing margins.
To analyze the timing margins for all UniPHY timing paths, execute the Report DDR
function in the TimeQuest Timing Analyzer.
The UniPHY IP uses the .sdc to constrain internal FPGA timing paths, address and command paths, and clock-to-strobe timing paths, and more specifically (see the sketch after this list):
■ Creating clocks on PLL inputs
■ Creating generated clocks
■ Calling derive_clock_uncertainty
■ Cutting timing paths for specific reset paths
■ Setting input and output delays on DQ inputs and outputs
■ Setting output delays on address and command outputs (versus CK/CK# outputs)
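As an illustration only, the DQ and address/command delay constraints might look like the following fragment. The clock and port names and the delay values are hypothetical assumptions and do not reproduce the generated <variation_name>.sdc:

    # Read capture: DQ inputs referenced to a DQS-derived capture clock (placeholder names)
    set_input_delay -clock dqs_in_clk -max 0.4 [get_ports {mem_dq[*]}]
    set_input_delay -clock dqs_in_clk -min -0.4 [get_ports {mem_dq[*]}]
    # Write: DQ and DM outputs referenced to the DQS output clock
    set_output_delay -clock dqs_out_clk -max 0.5 [get_ports {mem_dq[*] mem_dm[*]}]
    set_output_delay -clock dqs_out_clk -min -0.5 [get_ports {mem_dq[*] mem_dm[*]}]
    # Address and command outputs referenced to the CK output clock
    set_output_delay -clock ck_out_clk -max 0.9 [get_ports {mem_addr[*]}]
    set_output_delay -clock ck_out_clk -min -0.9 [get_ports {mem_addr[*]}]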
<variation_name>_timing.tcl
This script includes the memory, FPGA, and board timing parameters for your
variation. It is included within <variation_name>_report_timing.tcl and
<variation_name>.sdc. In multiple interface designs with PLL and DLL sharing, you
must change the master core name and instance name in this file for the slave
controller.
<variation_name>_report_timing.tcl
This script reports the timing slack for your variation. It runs automatically during
compilation (during static timing analysis). You can also run this script with the
Report DDR task in the TimeQuest Timing Analyzer. This script is run for every
instance of the same variation.
<variation_name>_report_timing_core.tcl
This script contains high-level procedures that the
<variation_name>_report_timing.tcl script uses to compute the timing slack for your
variation. This script runs automatically during compilation.
<variation_name>_pin_map.tcl
This script is a library of functions and procedures that the
<variation_name>_report_timing.tcl and <variation_name>.sdc scripts use. The
<variation_name>_pin_assignments.tcl script, which is not relevant to timing
constraints, also uses this library.
<variation_name>_parameters.tcl
This script defines some of the parameters that describe the geometry of the core and
the PLL configuration. Do not change this file, except when you modify the PLL
through the MegaWizard Plug-In Manager. In this case, the changes to the PLL
parameters do not automatically propagate to this file and you must manually apply
those changes in this file.
Timing Analysis Description
The following sections describe the timing analysis using the respective FPGA data
sheet specifications and the user-specified memory data sheet parameters.
For detailed timing analysis description, refer to the scripts listed in “Timing
Constraint and Report Files” on page 10–10.
To account for the effects of calibration, the ALTMEMPHY and UniPHY IP include
additional scripts that are part of the <phy_variation_name>_report_timing.tcl and
<phy_variation_name>_report_timing_core.tcl files that determine the timing margin
after calibration. These scripts use the setup and hold slacks of individual pins to
emulate what is occurring during calibration to obtain timing margins that are
representative of calibrated PHYs. The effects considered as part of the calibrated
timing analysis include improvements in margin because of calibration, and
quantization error and calibration uncertainty because of voltage and temperature
changes after calibration. The calibration effects do not apply to Stratix III and
Cyclone III devices.
Address and Command
Address and command signals are single data rate signals latched by the memory
device using the FPGA output clock. Some of the address and command signals are
half-rate data signals, while others, such as the chip select, are full-rate signals. The
TimeQuest Timing Analyzer analyzes the address and command timing paths using
the set_output_delay (max and min) constraints.
PHY or Core
Timing analysis of the PHY or core path includes the path of soft registers in the
device and the register in the I/O element. However, the analysis does not include the
paths through the pin or the calibrated path. The PHY or core analyzes this path by
calling the report_timing command in <variation_name>_report_timing.tcl and
<variation_name>_report_timing_core.tcl.
PHY or Core Reset
The PHY or core reset is the internal timing of the asynchronous reset signals to the
ALTMEMPHY or UniPHY IPs. The PHY or core analyzes this path by calling the
report_timing command in <variation_name>_report_timing.tcl and
<variation_name>_report_timing_core.tcl.
Read Capture and Write
Cyclone III and Stratix III memory interface designs perform read capture and write
timing analysis using the TCCS and SW timing specification. Read capture and write
timing analysis for Arria II, Cyclone IV, Stratix IV, and Stratix V memory interface
designs are based on the timing slacks obtained from the TimeQuest Timing Analyzer
and all the effects included with the Quartus II timing model such as die-to-die and
within-die variations, aging, systematic skew, and operating condition variations.
Because the PHY IP adjusts the timing slacks to account for the calibration effects,
there are two sets of read capture and write timing analysis numbers—Before
Calibration and After Calibration.
Cyclone III and Stratix III
This section details the timing margins, such as the read data and write data timing
paths, which the TimeQuest Timing Analyzer calculates for Cyclone III and Stratix III
designs. Timing paths internal to the FPGA are either guaranteed by design and
tested on silicon, or analyzed by the TimeQuest Timing Analyzer using corresponding
timing constraints.
For design guidelines about implementing and analyzing your external memory
interface using the PHY in Cyclone III, Stratix III, and Stratix IV devices, refer to the
design tutorials on the List of designs using Altera External Memory IP page of the
Altera Wiki website.
Timing margins for chip-to-chip data transfers can be defined as:
Margin = bit period – transmitter uncertainties – receiver requirements
where:
■ Sum of all transmitter uncertainties = transmitter channel-to-channel skew (TCCS). The timing difference between the fastest and slowest output edges on data signals, including tCO variation, clock skew, and jitter. The clock is included in the TCCS measurement and serves as the time reference.
■ Sum of all receiver requirements = receiver sampling window (SW) requirement. The period of time during which the data must be valid to capture it correctly. The setup and hold times determine the ideal strobe position within the sampling window.
■ Receiver skew margin (RSKM) = margin or slack at the receiver capture register.
For TCCS and SW specifications, refer to the DC and Switching Characteristics chapter
of the Cyclone III Device Handbook or Stratix III Device Handbook.
Figure 10–6 relates this terminology to a timing budget diagram.
Figure 10–6. Sample Timing Budget Diagram
(The bit period (TUI) comprises a ½ × TCCS region at each end, the receiver sampling window (SW: setup + hold + skew + jitter) in the middle, and the RSKM regions between them; data skew is shown with respect to the clock.)
The timing budget regions marked “½ × TCCS” represent the latest data valid time
and earliest data invalid times for the data transmitter. The region marked sampling
window is the time required by the receiver during which data must stay stable. This
sampling window comprises the following:
■ Internal register setup and hold requirements
■ Skew on the data and clock nets within the receiver device
■ Jitter and uncertainty on the internal capture clock
The sampling window is not the capture margin or slack, but instead the requirement
from the receiver. The margin available is denoted as RSKM.
The simple example illustrated in Figure 10–6 does not consider any board level
uncertainties, assumes a center-aligned capture clock at the middle of the receiver
sampling window region, and assumes an evenly distributed TCCS with respect to
the transmitter clock pin. In this example, the left end of the bit period corresponds to
time t = 0, and the right end of the bit period corresponds to time t = TUI (where TUI
stands for time unit interval). Therefore, the center-aligned capture clock at the
receiver is best placed at time t = TUI/2.
Therefore:
the total margin = 2 × RSKM = TUI – TCCS – SW.
Consider the case where the clock is not center-aligned within the bit period (clock
phase shift = P), and the transmitter uncertainties are unbalanced (TCCSLEAD and
TCCSLAG). TCCSLEAD is defined as the skew between the clock signal and latest data
valid signal. TCCSLAG is defined as the skew between the clock signal and earliest
data invalid signal. Also, the board-level skew across data and clock traces is
specified as tEXT. For this condition, you should compute independent setup and hold
margins at the receiver (RSKMSETUP and RSKMHOLD). In this example, the sampling
window requirement is split into a setup side requirement (SWSETUP) and hold side
(SWHOLD) requirement. Figure 10–7 illustrates the timing budget for this condition. A
timing budget similar to that shown in Figure 10–7 is used for Cyclone III and
Stratix III FPGA read and write data timing paths.
Figure 10–7. Sample Timing Budget with Unbalanced (TCCS and SW) Timing Parameters
(The bit period (TUI) is divided at the clock phase shift P into a setup side (TCCSLEAD, tEXT, RSKMSETUP, SWSETUP) and a hold side (SWHOLD, RSKMHOLD, tEXT, TCCSLAG) around the sampling window (SW).)
Therefore:
Setup margin = RSKMSETUP = P – TCCSLEAD – SWSETUP – tEXT
Hold margin = RSKMHOLD = (TUI – P) – TCCSLAG – SWHOLD – tEXT
The timing budget illustrated in Figure 10–6 with balanced timing parameters applies
for calibrated paths where the clock is dynamically center-aligned within the data
valid window. The timing budget illustrated in Figure 10–7 with unbalanced timing
parameters applies for circuits that employ a static phase shift using a DLL or PLL to
place the clock within the data valid window.
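A small Tcl sketch of the two equations above can help when sanity-checking a budget by hand. The numbers in the example call are purely illustrative placeholders and are not taken from any device data sheet:

    # Receiver skew margins for the unbalanced budget of Figure 10-7 (all values in ps)
    proc rskm_setup {P tccs_lead sw_setup t_ext} {
        expr {$P - $tccs_lead - $sw_setup - $t_ext}
    }
    proc rskm_hold {TUI P tccs_lag sw_hold t_ext} {
        expr {($TUI - $P) - $tccs_lag - $sw_hold - $t_ext}
    }
    # Placeholder numbers only: TUI = 2500, P = 1250, TCCS = 300/300, SW = 500/500, tEXT = 20
    puts "Setup margin: [rskm_setup 1250 300 500 20] ps"
    puts "Hold margin:  [rskm_hold 2500 1250 300 500 20] ps"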
Read Capture
Memory devices provide edge-aligned DQ and DQS outputs to the FPGA during read
operations. Stratix III FPGAs center-align the DQS strobe using static DLL-based
delays, and the Cyclone III FPGAs use a calibrated PLL clock output to capture the
read data in LE registers without using DQS. While Stratix III devices use a source
synchronous circuit for data capture and Cyclone III devices use a calibrated circuit,
the timing analysis methodology is quite similar, as shown in the following section.
When applying this methodology to read data timing, the memory device is the
transmitter and the FPGA device is the receiver.
The transmitter channel-to-channel skew on outputs from the memory device is
available from the corresponding device data sheet. Let us examine the TCCS
parameters for a DDR2 SDRAM component.
For DQS-based capture:
■ The time between DQS strobe and latest data valid is defined as tDQSQ
■ The time between earliest data invalid and next strobe is defined as tQHS
■ Based on earlier definitions, TCCSLEAD = tDQSQ and TCCSLAG = tQHS
The sampling window at the receiver, the FPGA, includes several timing parameters:
■ Capture register micro setup and micro hold time requirements
■ DQS clock uncertainties because of DLL phase shift error and phase jitter
■ Clock skew across the DQS bus feeding DQ capture registers
■ Data skew on DQ paths from pin to input register including package skew
For TCCS and SW specifications, refer to the DC and Switching Characteristics chapter
of the Cyclone III Device Handbook or the Stratix III Device Handbook.
Figure 10–8 shows the timing budget for a read data timing path.
Figure 10–8. Timing Budget for Read Data Timing Path
(The minimum half-period budget comprises tDQSQ, tEXT, the read setup margin, the FPGA sampling window (DQ skew + DQS uncertainty + µTsu + µTh, that is, tSW_SETUP and tSW_HOLD), the read hold margin, tEXT, tQHS, and duty cycle distortion (tDCD); the DQS delay shift places the capture strobe within this budget.)
Table 10–5 lists a read data timing analysis for a Stratix III –2 speed-grade device
interfacing with a 400-MHz DDR2 SDRAM component.
Table 10–5. Read Data Timing Analysis for Stratix III Device with a 400-MHz DDR2 SDRAM (1)
Parameter | Value (ps) | Description
Memory Specifications:
tHP | 1250 | Average half period as specified by the memory data sheet, tHP = 1/2 × tCK
tDCD | 50 | Duty cycle distortion = 2% × tCK = 0.02 × 2500 ps
tDQSQ | 200 | Skew between DQS and DQ from memory
tQHS | 300 | Data hold skew factor as specified by memory
FPGA Specifications:
tSW_SETUP | 181 | FPGA sampling window specifications for a given configuration (DLL mode, width, location, and so on)
tSW_HOLD | 306 | FPGA sampling window specifications for a given configuration (DLL mode, width, location, and so on)
Board Specifications:
tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter)
Timing Calculations:
tDVW | 710 | tHP – tDCD – tDQSQ – tQHS – 2 × tEXT
tDQS_PHASE_DELAY | 500 | Ideal phase shift delay on DQS capture strobe = (DLL phase resolution × number of delay stages × tCK) / 360° = (36° × 2 stages × 2500 ps) / 360° = 500 ps
Results:
Setup margin | 99 | RSKMSETUP = tDQS_PHASE_DELAY – tDQSQ – tSW_SETUP – tEXT
Hold margin | 74 | RSKMHOLD = tHP – tDCD – tDQS_PHASE_DELAY – tQHS – tSW_HOLD – tEXT
Notes to Table 10–5:
(1) This sample calculation uses memory timing parameters from a 72-bit wide 256-MB Micron MT9HTF3272AY-80E 400-MHz DDR2 SDRAM DIMM.
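For reference, the two results in Table 10–5 can be reproduced with a few lines of Tcl; the variable values below are simply the Table 10–5 entries in ps:

    set t_dqs_phase_delay 500
    set t_dqsq            200
    set t_sw_setup        181
    set t_sw_hold         306
    set t_hp              1250
    set t_dcd             50
    set t_qhs             300
    set t_ext             20
    # RSKM_SETUP = tDQS_PHASE_DELAY - tDQSQ - tSW_SETUP - tEXT = 99 ps
    puts [expr {$t_dqs_phase_delay - $t_dqsq - $t_sw_setup - $t_ext}]
    # RSKM_HOLD = tHP - tDCD - tDQS_PHASE_DELAY - tQHS - tSW_HOLD - tEXT = 74 ps
    puts [expr {$t_hp - $t_dcd - $t_dqs_phase_delay - $t_qhs - $t_sw_hold - $t_ext}]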
Table 10–6 lists a read data timing analysis for a DDR2 SDRAM component at 200 MHz on a Cyclone III device, using the SSTL-18 Class I I/O standard and termination. A 267-MHz DDR2 SDRAM component is required to ensure positive timing margins at the 200-MHz memory interface clock frequency.
Table 10–6. Read Data Timing Analysis for a 200-MHz DDR2 SDRAM on a Cyclone III Device (1)
Parameter | Value (ps) | Description
Memory Specifications:
tHP | 2500 | Average half period as specified by the memory data sheet
tDCD_TOTAL | 250 | Duty cycle distortion = 2% × tCK = 0.02 × 5000 ps
tAC | ±500 | Data (DQ) output access time for a 267-MHz DDR2 SDRAM component
FPGA Specifications:
tSW_SETUP | 580 | FPGA sampling window specification for a given configuration (interface width, location, and so on)
tSW_HOLD | 550 | FPGA sampling window specification for a given configuration (interface width, location, and so on)
Board Specifications:
tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter)
Timing Calculations:
tDVW | 1230 | tHP – tDCD – 2 × tAC – 2 × tEXT
Results:
Total margin | 100 | tDVW – tSW_SETUP – tSW_HOLD
Notes to Table 10–6:
(1) For this sample calculation, total duty cycle distortion and board skew are split over both setup and hold margin. For more information on
Cyclone III –6 speed-grade device read capture and timing analysis, refer to “Cyclone III and Cyclone IV PHY Timing Paths” on page 10–9.
Write Capture
During write operations, the FPGA generates a DQS strobe and a center-aligned DQ
data bus using multiple PLL-driven clock outputs. The memory device receives these
signals and captures them internally. The Stratix III family contains dedicated DDIO
(double data rate I/O) blocks inside the IOEs.
For write operations, the FPGA device is the transmitter and the memory device is the
receiver. The memory device’s data sheet specifies data setup and data hold time
requirements based on the input slew rate on the DQ/DQS pins. These requirements
make up the memory sampling window, and include all timing uncertainties internal
to the memory.
Output skew across the DQ and DQS output pins on the FPGA makes up the TCCS
specification. TCCS includes contributions from numerous internal FPGA circuits,
including:
■ Location of the DQ and DQS output pins
■ Width of the DQ group
■ PLL clock uncertainties, including phase jitter between different output taps used to center-align DQS with respect to DQ
■ Clock skew across the DQ output pins, and between DQ and DQS output pins
■ Package skew on DQ and DQS output pins
Refer to the DC and Switching Characteristics chapter of the Cyclone III Device Handbook
or the Stratix III Device Handbook for TCCS and SW specifications.
Figure 10–9 illustrates the timing budget for a write data timing path.
Figure 10–9. Timing Budget for Write Data Timing Path
(The DQ-DQS output clock offset defines the transmitter data valid windows TX_DVWLEAD and TX_DVWLAG; the budget comprises TCCSLEAD (DQS to late DQ), tEXT, the write setup margin, the memory sampling window (tDS + tDH), the write hold margin, tEXT, and TCCSLAG (early DQ to late DQS), with tCO/clock skew contributing to TCCS.)
Table 10–7 lists a write data timing analysis for a Stratix III –2 speed-grade device
interfacing with a DDR2 SDRAM component at 400 MHz. This timing analysis
assumes the use of a differential DQS strobe with 2.0-V/ns edge rates on DQS, and
1.0-V/ns edge rate on DQ output pins. Consult your memory device’s data sheet for
derated setup and hold requirements based on the DQ/DQS output edge rates from
your FPGA.
Table 10–7. Write Data Timing Analysis for 400-MHz DDR2 SDRAM Stratix III Device (1)
Parameter | Value (ps) | Description
Memory Specifications:
tHP | 1250 | Average half period as specified by the memory data sheet
tDSA | 250 | Memory setup requirement (derated for DQ/DQS edge rates and VREF reference voltage)
tDHA | 250 | Memory hold requirement (derated for DQ/DQS edge rates and VREF reference voltage)
FPGA Specifications:
TCCSLEAD | 229 | FPGA transmitter channel-to-channel skew for a given configuration (PLL setting, location, and width)
TCCSLAG | 246 | FPGA transmitter channel-to-channel skew for a given configuration (PLL setting, location, and width)
Board Specifications:
tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter)
Timing Calculations:
tOUTPUT_CLOCK_OFFSET | 625 | Output clock phase offset between DQ and DQS output clocks = 90°; tOUTPUT_CLOCK_OFFSET = (output clock phase DQ and DQS offset × tCK) / 360° = (90° × 2500) / 360° = 625
TX_DVWLEAD | 396 | Transmitter data valid window = tOUTPUT_CLOCK_OFFSET – TCCSLEAD
TX_DVWLAG | 379 | Transmitter data valid window = tHP – tOUTPUT_CLOCK_OFFSET – TCCSLAG
Results:
Setup margin | 126 | TX_DVWLEAD – tEXT – tDSA
Hold margin | 109 | TX_DVWLAG – tEXT – tDHA
Notes to Table 10–7:
(1) This sample calculation uses memory timing parameters from a 72-bit wide 256-MB Micron MT9HTF3272AY-80E 400-MHz DDR2 SDRAM DIMM.
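As with the read analysis, the Table 10–7 results follow directly from the listed values; the short Tcl fragment below simply repeats that arithmetic (all values in ps, taken from Table 10–7):

    set t_hp               1250
    set t_out_clock_offset 625   ;# 90 degrees of a 2500 ps clock
    set tccs_lead          229
    set tccs_lag           246
    set t_ext              20
    set t_dsa              250
    set t_dha              250
    set tx_dvw_lead [expr {$t_out_clock_offset - $tccs_lead}]          ;# 396
    set tx_dvw_lag  [expr {$t_hp - $t_out_clock_offset - $tccs_lag}]   ;# 379
    puts "Write setup margin: [expr {$tx_dvw_lead - $t_ext - $t_dsa}] ps"   ;# 126
    puts "Write hold margin:  [expr {$tx_dvw_lag - $t_ext - $t_dha}] ps"    ;# 109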
Table 10–8 lists a write timing analysis for a Cyclone III –6 speed-grade device
interfacing with a DDR2 SDRAM component at 200 MHz. A 267-MHz DDR2 SDRAM
component is used for this analysis.
Table 10–8. Write Data Timing Analysis for a 200-MHz DDR2 SDRAM Interface on a Cyclone III Device (1)
Parameter | Value (ps) | Description
Memory Specifications:
tHP | 2500 | Average half period as specified by the memory data sheet
tDCD_TOTAL | 250 | Total duty cycle distortion = 5% × tCK = 0.05 × 5000
tDS (derated) | 395 | Memory setup requirement from a 267-MHz DDR2 SDRAM component (derated for single-ended DQS and 1 V/ns slew rate)
tDH (derated) | 335 | Memory hold from a 267-MHz DDR2 component (derated for single-ended DQS and 1 V/ns slew rate)
FPGA Specifications:
TCCSLEAD | 790 | FPGA TCCS for a given configuration (PLL setting, location, width)
TCCSLAG | 380 | FPGA TCCS for a given configuration (PLL setting, location, width)
Board Specifications:
tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter)
Timing Calculations:
tOUTPUT_CLOCK_OFFSET | 1250 | Output clock phase offset between DQ/DQS output clocks = 90°; tOUTPUT_CLOCK_OFFSET = (output clock phase DQ and DQS offset × tCK) / 360° = (90° × 5000) / 360° = 1250
TX_DVWLEAD | 460 | Transmitter data valid window = tOUTPUT_CLOCK_OFFSET – TCCSLEAD
TX_DVWLAG | 870 | Transmitter data valid window = tHP – tOUTPUT_CLOCK_OFFSET – TCCSLAG
Results:
Setup margin | 45 | TX_DVWLEAD – tEXT – tDS
Hold margin | 265 | TX_DVWLAG – tEXT – tDH – tDCD_TOTAL
Note to Table 10–8:
(1) For more information on Cyclone III –6 speed-grade device read capture and timing analysis, refer to “Read Capture” on page 10–17.
Arria II, Arria V, Cyclone IV, Cyclone V, Stratix IV, and Stratix V
Read Capture
Read capture timing analysis indicates the amount of slack on the DDR DQ signals
that are latched by the FPGA using the DQS strobe output of the memory device. The
read capture timing paths are analyzed by a combination of the TimeQuest Timing
Analyzer using the set_input_delay (max and min), set_max_delay, and
set_min_delay constraints, and further steps to account for calibration that occurs at
runtime. The ALTMEMPHY and UniPHY IP include timing constraints in the
<phy_variation_name>_ddr_timing.sdc (ALTMEMPHY) or <phy_variation_name>.sdc
(UniPHY), and further slack analysis in <phy_variation_name>_report_timing.tcl and
<phy_variation_name>_report_timing_core.tcl files.
The PHY IP captures read data on Cyclone IV devices using a PLL phase that is calibrated and tracked by the sequencer. The equations in <phy_variation_name>_report_timing_core.tcl ensure optimum read capture timing margin.
In Arria II, Cyclone IV, and Stratix IV devices, the margin is reported based on a
combination of the TimeQuest Timing Analyzer calculation results and further
processing steps that account for the calibration that occurs at runtime. First, the
TimeQuest analyzer returns the base setup and hold slacks, and further processing
steps adjust the slacks to account for effects which the TimeQuest analyzer cannot
model.
Write
Write timing analysis indicates the amount of slack on the DDR DQ signals that are
latched by the memory device using the DQS strobe output from the FPGA device.
The write timing paths are analyzed by a combination of the TimeQuest Timing
Analyzer using the set_output_delay (max and min) constraints and further steps to account for
calibration that occurs at runtime. The ALTMEMPHY and UniPHY IP include timing
constraints in the <phy_variation_name>_ddr_timing.sdc (ALTMEMPHY) or
<phy_variation_name>.sdc (UniPHY), and further slack analysis in
<phy_variation_name>_report_timing.tcl and
<phy_variation_name>_report_timing_core.tcl files.
Read Resynchronization
In the DDR3, DDR2, and DDR SDRAM interfaces with Arria II GX FPGAs, the
resynchronization timing analysis concerns transferring read data that is captured
with a DQS strobe to a clock domain under the control of the ALTMEMPHY. After
calibration by a sequencer, a dedicated PLL phase tracks any movements in the data
valid window of the captured data. The exact length of the DQS and CK traces does
not affect the timing analysis. The calibration process centers the resynchronization
clock phase in the middle of the captured data valid window to maximize the
resynchronization setup and hold margin, and removes any static offset from other timing paths. With the static offset removed, the remaining uncertainties are voltage and temperature variation, jitter, and skew.
In a UniPHY interface, a FIFO buffer synchronizes the data transfer from the data
capture to the core. The calibration process sets the depth of the FIFO buffer and no
dedicated synchronization clock is required. Refer to
<phy_variation_name>_report_timing_core.tcl for more information about the
resynchronization timing margin equation.
Mimic Path
The mimic path mimics the FPGA portion of the elements of the round-trip delay,
which enables the calibration sequencer to track delay variations because of voltage
and temperature changes during the memory read and write transactions without
interrupting the operation of the ALTMEMPHY megafunction.
As the timing path register is integrated in the IOE, there is no timing constraint
required for the Arria II GX device families.
For Cyclone III and Cyclone IV devices, the mimic register is a register in the core and
it is placed closer to the IOE by the fitter.
The UniPHY IP does not use any mimic path.
DQS versus CK—Arria II GX, Cyclone III, and Cyclone IV Devices
The DQS versus CK timing path indicates the skew requirement for the arrival time of
the DQS strobe at the memory with respect to the arrival time of CK/CK# at the
memory. Arria II GX, Cyclone III, and Cyclone IV devices require the DQS strobes
and CK clocks to arrive edge aligned.
There are two timing constraints for DQS versus CK timing path to account for duty
cycle distortion. The DQS/DQS# rising edge to CK/CK# rising edge (tDQSS) requires
the rising edge of DQS to align with the rising edge of CK to within 25% of a clock
cycle, while the DQS/DQS# falling edge setup/hold time from CK/CK# rising edge
(tDSS/tDSH) requires the falling edge of DQS to be more than 20% of a clock cycle away
from the rising edge of CK.
The TimeQuest Timing Analyzer analyzes the DQS vs CK timing paths using the
set_output_delay (max and min) constraints. For more information, refer to
<phy_variation_name>_phy_ddr_timing.sdc.
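As a purely numeric illustration of the two requirements above, the fragment below assumes an example 400-MHz memory clock (tCK = 2500 ps); the period is an assumption for illustration, not a value taken from the generated .sdc:

    # Assumed tCK of 2500 ps (400 MHz)
    set t_ck 2500.0
    # tDQSS: rising DQS must align with rising CK to within 25% of a clock cycle
    puts "tDQSS window : +/- [expr {0.25 * $t_ck}] ps"
    # tDSS/tDSH: falling DQS must be more than 20% of a clock cycle from rising CK
    puts "tDSS / tDSH  : >  [expr {0.20 * $t_ck}] ps"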
Write Leveling tDQSS
In DDR2 SDRAM (with UniPHY) and DDR3 SDRAM (with ALTMEMPHY and
UniPHY) interfaces, write leveling tDQSS timing is a calibrated path that details skew
margin for the arrival time of the DQS strobe with respect to the arrival time of
CK/CK# at the memory side. For proper write leveling configuration, the DLL delay chain must be equal to 8. The PHY IP reports the margin through an equation. For more information, refer to <phy_variation_name>_report_timing_core.tcl.
Write Leveling tDSH/tDSS
In DDR2 SDRAM (with UniPHY) and DDR3 SDRAM (with ALTMEMPHY and
UniPHY) interfaces, write leveling tDSH/tDSS timing details the setup and hold margin
for the DQS falling edge with respect to the CK clock at the memory. The PHY IP
reports the margin through an equation. For more information, refer to
<phy_variation_name>_report_timing_core.tcl.
DK versus CK (RLDRAM II with UniPHY)
In RLDRAM II with UniPHY designs using the Nios-based sequencer, DK versus CK
timing is a calibrated path that details skew margin for the arrival time of the DK
clock versus the arrival time of CK/CK# on the memory side. The PHY IP reports the
margin through an equation. For more information, refer to
<phy_variation_name>_report_timing_core.tcl.
Bus Turnaround Time
In DDR2 and DDR3 SDRAM, and RLDRAM II (CIO) with UniPHY designs that use a bidirectional data bus, you may encounter data bus contention failure when a write command follows a read command. The bus-turnaround time analysis determines how much margin there is on the switchover time and prevents bus contention. If the timing is violated, you can either increase the controller's bus turnaround time, which may reduce efficiency, or reduce the board trace delays. Refer to
<variation>_report_timing_core.tcl for the equation. You can find this analysis in the
timing report. This analysis is only available for DDR2/3 SDRAM and RLDRAM II
UniPHY IPs in Arria II GZ, Arria V, Cyclone V, Stratix IV, and Stratix V devices.
The RTL simulation for ALTMEMPHY IP is unable to detect timing violations because
ALTMEMPHY IP is not enhanced with the bus turnaround analysis feature.
Therefore, Altera recommends that you verify the design on board by manually
changing the default values of MEM_IF_WR_TO_RD_TURNAROUND_OCT and
MEM_IF_RD_TO_WR_TURNAROUND_OCT parameters in the controller wrapper
file.
To determine whether the bus turnaround time issue is the cause of your design
failure and to overcome this timing violation, follow these steps:
1. When the design fails, change the default values of
MEM_IF_WR_TO_RD_TURNAROUND_OCT and
MEM_IF_RD_TO_WR_TURNAROUND_OCT parameters in the controller
wrapper file to a maximum value of 5. If the design passes after the change, it is a
bus turnaround issue.
2. To solve the bus turnaround time issue, reduce the values of the
MEM_IF_WR_TO_RD_TURNAROUND_OCT and
MEM_IF_RD_TO_WR_TURNAROUND_OCT parameters gradually until you
reach the minimum value needed for the design to pass on board.
Timing Report DDR
The Report DDR task in the TimeQuest Timing Analyzer generates custom timing
margin reports for all ALTMEMPHY and UniPHY instances in your design. The
TimeQuest Timing Analyzer generates this custom report by sourcing the wizard-generated <variation>_report_timing.tcl script.
This <variation>_report_timing.tcl script reports the following timing slacks on
specific paths of the DDR SDRAM:
■ Read capture
■ Read resynchronization
■ Mimic, address and command
■ Core
■ Core reset and removal
■ Half-rate address and command
■ DQS versus CK
■ Write
■ Write leveling (tDQSS)
■ Write leveling (tDSS/tDSH)
In Stratix III and Cyclone III designs, the <variation_name>_report_timing.tcl script
checks the design rules and assumptions as listed in “Timing Model Assumptions and
Design Rules” on page 10–29. If you do not adhere to these assumptions and rules,
you receive critical warnings when the TimeQuest Timing Analyzer runs during
compilation or when you run the Report DDR task.
To generate a timing margin report, follow these steps (a scripted equivalent follows the steps):
1. Compile your design in the Quartus II software.
2. Launch the TimeQuest Timing Analyzer.
3. Double-click Report DDR from the Tasks pane. This action automatically executes
the Create Timing Netlist, Read SDC File, and Update Timing Netlist tasks for
your project.
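If you prefer to drive this from the quartus_sta Tcl console instead of the GUI, a rough scripted equivalent is sketched below; the project and variation names are hypothetical placeholders:

    # Run inside quartus_sta (for example: quartus_sta -t run_report_ddr.tcl)
    project_open my_project                      ;# hypothetical project name
    create_timing_netlist
    read_sdc
    update_timing_netlist
    # The Report DDR task works by sourcing the wizard-generated report script
    source my_ddr_variation_report_timing.tcl    ;# hypothetical variation name
    project_close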
The .sdc may not be applied correctly if the variation top-level file is the top-level file
of the project. You must have the top-level file of the project instantiate the variation
top-level file.
The Report DDR feature creates a new DDR folder in the TimeQuest Timing
Analyzer Report pane.
Expanding the DDR folder reveals the detailed timing information for each PHY
timing path, in addition to an overall timing margin summary for the ALTMEMPHY
or UniPHY instance, as shown in Figure 10–10.
Figure 10–10. Timing Margin Summary Window Generated by Report DDR Task
Bus turnaround time shown in Figure 10–10 is available in all UniPHY IPs and
devices except in QDR II and QDR II+ SRAM memory protocols and Stratix III
devices.
Figure 10–11 shows the timing analysis results calculated using the FPGA timing model
before adjustment in the Before Calibration panel.
Figure 10–11. Read and Write Before Calibration
Figure 10–12 and Figure 10–13 show the read capture and write margin summary
window generated by the Report DDR Task for a DDR3 core. It first shows the timing
results calculated using the FPGA timing model. The
<variation_name>_report_timing_core.tcl then adjusts these numbers to account for
effects that are not modeled by either the timing model or by TimeQuest Timing
Analyzer. The read and write timing margin analysis for Stratix III and Cyclone III
devices does not need any adjustments.
Figure 10–12. Read Capture Margin Summary Window
Figure 10–13. Write Capture Margin Summary Window
Report SDC
The Report SDC task in the TimeQuest Timing Analyzer generates the SDC
assignment reports for your design. The TimeQuest Timing Analyzer generates this
constraint report by sourcing the .sdc. The SDC assignment reports show the
constraints applied in the design.
For example, the reports may include the following constraints:
■ Create Clock
■ Create Generated Clock
■ Set Clock Uncertainty
■ Set Input Delay
■ Set Output Delay
■ Set False Path
■ Set Multicycle Path
■ Set Maximum Delay
■ Set Minimum Delay
Figure 10–14 shows the SDC assignments generated by the Report SDC task for a
DDR3 SDRAM core design. The timing analyzer uses these constraint values in its analysis to calculate the timing margins. Refer to the .sdc files for the source of each constraint.
Figure 10–14. SDC Assignments Report Window
Calibration Effect in Timing Analysis
Timing analysis for Arria II, Cyclone IV, Stratix IV, and Stratix V devices takes into
account the calibration effects to improve the timing margin. This section discusses
ways to include the calibration effects in timing analysis.
Calibration Emulation for Calibrated Path
In conventional static timing analysis, calibration paths do not include calibration
effects. To account for the calibration effects, the timing analyzer emulates the
calibration process and integrates it into the timing analysis. Normally the calibration
process involves adding or subtracting delays to a path. The analyzer uses the delay
obtained through static timing analysis in the emulation algorithm to estimate the
extra delay added during calibration. With these estimated delays, the timing analysis
emulates hardware calibration and obtains a better estimate of the timing margin.
Refer to <phy_variation_name>_report_timing.tcl and <phy_variation_name>_
report_timing_core.tcl for the files that determine the timing margin after calibration.
Calibration Error or Quantization Error
Hardware devices use calibration algorithms when delay information is unknown or
incomplete. If the delay information is unknown, the timing analysis of the calibrated
paths has to work with incomplete data. This unknown information may cause the
timing analysis calibration operations to pick topologies that are different than what
would actually occur in hardware. The differences between what can occur in
hardware and what occurs in the timing analysis are quantified and included in the
timing analysis of the calibrated paths as quantization error or calibration error.
Calibration Uncertainties
Calibration results may change, or the resulting margin may be reduced, because of one or more of the following uncertainties:
■ Jitter and DCD effects
■ Voltage and temperature variations
■ Board trace delays changing due to noise on terminated supply voltages
These calibration uncertainties are accounted for in the timing analysis.
Memory Calibration
All the timing paths reported include one or more memory parameters, such as tDQSS
and tDQSQ. These specifications indicate the amount of variation that occurs in various
timing paths in the memory and abstracts them into singular values so that they can
be used by others when interfacing with the memory device.
JEDEC defines these parameters in their specification for memory standards, and
every memory vendor must meet this specification or improve it. However, there is
no proportion of each specification due to different types of variations. Variations that
are of interest are typically grouped into three different types: process variations (P),
voltage variations (V), and temperature variations (T). These together compose PVT
External Memory Interface Handbook
Volume 2: Design Guidelines
November 2011 Altera Corporation
Chapter 10: Analyzing Timing of Memory IP
Timing Model Assumptions and Design Rules
10–29
variations that typically define the JEDEC specification. You can determine the
maximum P variation by comparing different dies, and you can determine the
maximum V and T variations by operating a design at the endpoints of the range of
voltage and temperature. P variations do not change once the chip has been
fabricated, while V and T variations change over time.
A comprehensive timing analysis of various paths for Stratix V FPGAs at 667 MHz, including all the sources of noise, indicates that there is no timing margin available. However, such designs do work in practice with a reasonable amount of margin. The reason for this behavior is that memory devices typically have specifications that easily beat the JEDEC specification, and that the Altera calibration algorithms calibrate out the process portion of the JEDEC specification, leaving only the V and T portions of the variations.
The memory calibration figure is determined by noting what percentage of the JEDEC specification for various memory parameters is caused by process variations, which the Altera IP (ALTMEMPHY and UniPHY) calibration algorithms can calibrate out, and applying that percentage to the full JEDEC specification. The remaining portion of the variation is caused by voltage and temperature variations, which cannot be calibrated out.
The percentage of the JEDEC specification that is due to process variation is set in <variation>_report_timing.tcl.
Timing Model Assumptions and Design Rules
External memory interfaces using Altera IP are optimized for highest performance,
and use a high-performance timing model to analyze calibrated and
source-synchronous, double-data rate I/O timing paths. This timing model applies to
designs that adhere to a set of predefined assumptions. These timing model
assumptions include memory interface pin-placement requirements, PLL and clock
network usage, I/O assignments (including I/O standard, termination, and slew
rate), and many others.
For example, the read and write datapath timing analysis is based on the FPGA
pin-level tTCCS and tSW specifications, respectively. While calculating the read and
write timing margins, the Quartus II software analyzes the design to ensure that all
read and write timing model assumptions are valid for your variation instance.
Timing model assumptions only apply to Stratix III and Cyclone III devices.
When the Report DDR task or report_timing.tcl script is executed, the timing analysis
assumptions checker is invoked with specific variation configuration information. If a
particular design rule is not met, the Quartus II software reports the failing
assumption as a Critical Warning message. Figure 10–15 shows a sample set of
messages generated when the memory interface DQ, DQS, and CK/CK# pins are not
placed on the same edge of the device.
Figure 10–15. Read and Write Timing Analysis Assumption Verification
Memory Clock Output Assumptions
To verify the quality of the FPGA clock output to the memory device (CK/CK# or
K/K#), which affects FPGA performance and quality of the read clock/strobe used to
read data from the memory device, the following assumptions are necessary:
■ The slew rate setting must be Fast or an on-chip termination (OCT) setting must be used.
■ The output delay chains must all be 0 (the default value applied by the Quartus II software). These delay chains include the Cyclone III output register to pin delay chain and the Stratix III D5 and D6 output delay chains.
■ The output open-drain parameter on the memory clock pin IO_OBUF atom must be Off. The Output Open Drain logic option must not be enabled.
■ The weak pull-up on the CK and CK# pads must be Off. The Weak Pull-Up Resistor logic option must not be enabled.
■ The bus hold on the CK and CK# pads must be Off. The Enable Bus-Hold Circuitry logic option must not be enabled.
■ All CK and CK# pins must be declared as output-only pins or bidirectional pins with the output enable set to VCC.
Cyclone III Devices
For Cyclone III devices the following additional memory clock assumptions are
necessary:
■ The memory clock output pins must be fed by DDIO output registers and placed on DIFFIO p- and n-pin pairs.
■ The memory output clock signals must be generated using the DDIO configuration shown in Figure 10–16. In this configuration, the high register connects to VCC and the low register connects to GND.
Figure 10–16. DDIO Configuration
(The PLL reference clock drives a PLL whose mem_clk_2x output clocks two DDIO blocks, one driving the CK or K pin and one driving the CK# or K# pin, with the DDIO data inputs tied to VCC and GND.)
■ CK and CK# pins must be fed by a DDIO_OUT WYSIWYG with datainhi connected to GND and datainlo connected to VCC.
■ CK or K pins must be fed by a DDIO_OUT with its clock input from the PLL inverted.
■ CK# or K# pins must be fed by a DDIO_OUT with its clock input from the PLL uninverted.
■ The I/O standard and current strength settings on the memory clock output pins must be as follows:
  ■ SSTL-2 Class I and 12 mA, or SSTL-2 Class II and 16 mA for DDR SDRAM interfaces
  ■ SSTL-18 Class I and 12 mA, or SSTL-18 Class II and 16 mA for DDR2 SDRAM interfaces
For more information about placing memory clock output pins, refer to “Additional
Placement Rules for Cyclone III and Cyclone IV Devices” in the Planning Pin and
Resource chapter in volume 2 of the External Memory Interface Handbook.
Stratix III Devices
For Stratix III devices the following additional memory clock assumptions are
necessary:
■ All memory clock output pins must be placed on DIFFOUT pin pairs on the same edge of the device.
■ For DDR3 SDRAM interfaces:
  ■ The CK pins must be placed on FPGA output pins marked DQ, DQS, or DQSn.
  ■ The CK pin must be fed by an OUTPUT_PHASE_ALIGNMENT WYSIWYG with a 0° phase shift.
  ■ The PLL clock driving CK pins must be the same as the clock driving the DQS pins.
  ■ The T4 (DDIO_MUX) delay chains setting for the memory clock pins must be the same as the settings for the DQS pins.
■ For non-DDR3 interfaces, the T4 (DDIO_MUX) delay chains setting for the memory clock pins must be greater than 0.
■ The programmable rise and fall delay chain settings for all memory clock pins must be set to 0.
■ The memory output clock signals must be generated with the DDIO configuration shown in Figure 10–17, with a signal splitter to generate the n-pin pair and a regional clock network to clock the output DDIO block.
Figure 10–17. DDIO Configuration with Signal Splitter
(Registers in the FPGA LEs feed DDIO elements in the I/O block; the system clock (2) clocks the output DDIO that drives mem_clk (1) and, through the signal splitter, mem_clk_n (1).)
Notes to Figure 10–17:
(1) The mem_clk[0] and mem_clk_n[0] pins for DDR3, DDR2, and DDR SDRAM interfaces use the I/O input buffer
for feedback, therefore bidirectional I/O buffers are used for these pins. For memory interfaces using a differential
DQS input, the input feedback buffer is configured as differential input; for memory interfaces using a single-ended
DQS input, the input buffer is configured as a single-ended input. Using a single-ended input feedback buffer
requires that the I/O standard’s VREF voltage is provided to that I/O bank’s VREF pins.
(2) Regional QCLK (quadrant) networks are required for memory output clock generation to minimize jitter.
Write Data Assumptions
To verify the memory interface using the FPGA TCCS output timing specifications,
the following assumptions are necessary:
■ For QDRII, QDRII+, and RLDRAM II SIO memory interfaces, the write clock output pins (such as K/K# or DK/DK#) must be placed in DQS/DQSn pin pairs.
■ The PLL clock used to generate the write-clock signals and the PLL clock used to generate the write-data signals must come from the same PLL.
■ The slew rate for all write clocks and write data pins must be set to Fast or OCT must be used.
■ When auto deskew is not enabled (or not supported by the ALTMEMPHY configuration), the output delay chains and output enable delay chains must all be set to the default values applied by Quartus II. These delay chains include the Cyclone III output register and output enable register-to-pin delay chains, and the Stratix III D5 and D6 delay chains.
■ The output open drain for all write clocks and write data pins' IO_OBUF atom must be set to Off. The Output Open Drain logic option must not be enabled.
■ The weak pull-up for all write clocks and write data pins must be set to Off. The Weak Pull-Up Resistor logic option must not be enabled.
■ The bus hold for all write clocks and write data pins must be set to Off. The Enable Bus-Hold Circuitry logic option must not be enabled.
Cyclone III Devices
For Cyclone III devices the following additional write data assumptions are
necessary:
■ Write data pins (including the DM pins) must be placed on DQ pins related to the selected DQS pins.
■ All write clock pins (DQS/DQS#) must be fed by DDIO output registers.
■ All write data pins must be fed by DDIO output registers, VCC, or GND.
■ The phase shift of the PLL clock used to generate the write clocks must be 72° to 108° more than the PLL clock used to generate the write data (nominally 90° offset).
■ The I/O standard and current strength settings on the write data and clock output pins must be as follows:
  ■ SSTL-2 Class I and 12 mA, or SSTL-2 Class II and 16 mA for DDR SDRAM interfaces
  ■ SSTL-18 Class I and 8/12 mA, or SSTL-18 Class II and 16 mA for DDR2 SDRAM interfaces
Stratix III Devices
For Stratix III devices the following additional write data assumptions are necessary:
■ Differential write clock signals (DQS/DQSn) must be generated using the signal splitter.
■ The write data pins (including the DM pins) must be placed in related DQ pins associated with the chosen DQS pin. The only exception to this rule is for QDRII and QDRII+ ×36 interfaces emulated using two ×18 DQ groups. For such interfaces, all of the write data pins must be placed on the same edge of the device (left, right, top, or bottom). Also, the write clock K/K# pin pair should be placed on one of the DQS/DQSn pin pairs on the same edge.
■ All write clock pins must have similar circuit structure.
  ■ For DDR2 SDRAM interfaces and DDR3 SDRAM with leveling interfaces, all DQS/DQS# write strobes must be fed by DDIO output registers clocked by the write-leveling delay chain in the OUTPUT_PHASE_ALIGNMENT block.
  ■ For DDR and DDR2 SDRAM interfaces, all write clock pins must be fed by DDIO output registers clocked by a global or regional clock network.
■ All write data pins must have similar circuit structure.
  ■ For DDR3 SDRAM interfaces, all write data pins must be fed by either DDIO output registers clocked by the OUTPUT_PHASE_ALIGNMENT block, VCC, or GND.
  ■ For DDR and DDR2 SDRAM interfaces, all write data pins must be fed by either DDIO output registers clocked by a global or regional clock network, VCC, or GND.
■ The write clock output must be 72°, 90°, or 108° more than the write data output.
  ■ For DDR2 SDRAM and DDR3 SDRAM with leveling interfaces, the write-leveling delay chain in the OUTPUT_PHASE_ALIGNMENT block must implement a phase shift of 72°, 90°, or 108° to center-align the write clock with the write data.
  ■ For DDR and DDR2 SDRAM interfaces, the phase shift of the PLL clock used to clock the write clocks must be 72° to 108° more than the PLL clock used to clock the write data clocks, to generate center-aligned clock and data.
■ The T4 (DDIO_MUX) delay chains must all be set to 3. When differential DQS (using the splitter) is used, T4 must be set to 2.
■ The programmable rise and fall delay chain settings for all memory clock pins must be set to 0.
Table 10–9 lists I/O standards supported for the write clock and write data signals for
each memory type and pin location.
Table 10–9. I/O Standards
Memory Type | Placement | Legal I/O Standards for DQS | Legal I/O Standards for DQ
DDR3 SDRAM | Row I/O | Differential 1.5-V SSTL Class I | 1.5-V SSTL Class I
DDR3 SDRAM | Column I/O | Differential 1.5-V SSTL Class I; Differential 1.5-V SSTL Class II | 1.5-V SSTL Class I; 1.5-V SSTL Class II
DDR2 SDRAM | Any | SSTL-18 Class I; SSTL-18 Class II; Differential 1.8-V SSTL Class I; Differential 1.8-V SSTL Class II | SSTL-18 Class I; SSTL-18 Class II
DDR SDRAM | Any | SSTL-2 Class I; SSTL-2 Class II | SSTL-2 Class I; SSTL-2 Class II
QDR II and QDR II+ SRAM | Any | HSTL-1.5 Class I; HSTL-1.8 Class I | HSTL-1.5 Class I; HSTL-1.8 Class I
RLDRAM II | Any | HSTL-1.5 Class I; HSTL-1.8 Class I | HSTL-1.5 Class I; HSTL-1.8 Class I
Read Data Assumptions
To verify that the external memory interface can use the FPGA Sampling Window
(SW) input timing specifications, the following assumptions are necessary:
■ The read clock input pins must be placed on DQS pins. DQS/DQS# inputs must be placed on differential DQS/DQSn pins on the FPGA.
■ Read data pins (DQ) must be placed on the DQ pins related to the selected DQS pins.
■ For QDR II and QDR II+ SRAM interfaces, the complementary read clocks must have a single-ended I/O standard setting of HSTL-18 Class I or HSTL-15 Class I.
■ For RLDRAM II interfaces, the differential read clocks must have a single-ended I/O standard setting of HSTL-18 Class I or HSTL-15 Class I.
Cyclone III Devices
For Cyclone III devices, the following additional read data and mimic pin assumptions are necessary:
■ The I/O standard setting on read data and clock input pins must be as follows:
  ■ SSTL-2 Class I and Class II for DDR SDRAM interfaces
  ■ SSTL-18 Class I and Class II for DDR2 SDRAM interfaces
■ The read data and mimic input registers (flip-flops fed by the read data pin's input buffers) must be placed in the LAB adjacent to the read data pin. A read data pin can have 0 input registers.
■ Specific routing lines from the IOE to core read data/mimic registers must be used. The Quartus II Fitter ensures proper routing unless user-defined placement constraints or LogicLock™ assignments force non-optimal routing. User assignments that prevent input registers from being placed in the LAB adjacent to the IOE must be removed.
■ The read data and mimic input pin input pad to core/register delay chain must be set to 0.
■ If all read data pins are on row I/Os or column I/Os, the mimic pin must be placed in the same type of I/O (row I/O for read-data row I/Os, column I/O for read-data column I/Os). For wraparound cases, the mimic pin can be placed anywhere.
Stratix III Devices
For Stratix III devices, the following additional read data and mimic pin assumptions are necessary:
■ For DDR3, DDR2, and DDR SDRAM interfaces, the read clock pin can only drive a DQS bus clocking a ×4 or ×9 DQ group.
■ For QDR II, QDR II+ SRAM, and RLDRAM II interfaces, the read clock pin can only drive a DQS bus clocking a ×9, ×18, or ×36 DQ group.
■ For non-wraparound DDR, DDR2, and DDR3 interfaces, the mimic pin, all read clock pins, and all read data pins must be placed on the same edge of the device (top, bottom, left, or right). For wraparound interfaces, these pins can be placed on adjacent row I/O and column I/O edges and operate at reduced frequencies.
■ All read data pins and the mimic pin must feed DDIO_IN registers and have their input delay chains D1, D2, and D3 set to the Quartus II default.
■ The DQS phase-shift setting must be either 72° or 90° (only one phase shift is supported for each operating band and memory standard).
■ All read clock pins must have the dqs_ctrl_latches_enable parameter of their DQS_DELAY_CHAIN WYSIWYG set to false.
■ The read clock pins must have their D4 delay chain set to the Quartus II default value of 0.
■ The read data pins must have their T8 delay chain set to the Quartus II default value of 0.
■ When differential DQS strobes are used (DDR3 and DDR2 SDRAM), the mimic pin must feed a true differential input buffer. Placing the memory clock pin on a DIFFIO_RX pin pair allows the mimic path to track timing variations on the DQS input path.
■ When single-ended DQS strobes are used, the mimic pin must feed a single-ended input buffer.
Mimic Path Assumptions
To verify that the ALTMEMPHY-based DDR, DDR2, or DDR3 SDRAM interface’s
mimic path is configured correctly, the mimic path input must be placed on the
mem_clk[0] pin.
DLL Assumptions
The following DLL assumptions are necessary (these assumptions do not apply to Cyclone III devices):
■ The DLL must directly feed its delayctrlout[] outputs to all DQS pins without intervening logic or inversions.
■ The DLL must be in a valid frequency band of operation as defined in the corresponding device data sheet.
■ The DLL must have jitter reduction mode and dual-phase comparators enabled.
PLL and Clock Network Assumptions
The PLL and clock network assumptions vary for each device family.
Stratix III Devices
■ The PLL that generates the memory output clock signals and write data and clock signals must be set to No compensation mode to minimize output clock jitter.
■ The reference input clock signal to the PLL must be driven by the dedicated clock input pin located adjacent to the PLL, or from the clock output signal from the adjacent PLL. To minimize output clock jitter, the reference input clock pin to the ALTMEMPHY PLL must not be routed through the core using global or regional clock networks. If the reference clock cascades from another PLL, that upstream PLL must be in No compensation mode and Low bandwidth mode.
■ For DDR3 and DDR2 SDRAM interfaces, use only regional or dual-regional clock networks to route PLL outputs that generate the write data, write clock, and memory output clock signals. This requirement ensures that the memory output clocks (CK/CK#) meet the memory device input clock jitter specifications, and that output timing variations or skews are minimized.
■ For other memory types, the same clock tree type (global, regional, or dual-regional) is recommended for the PLL clocks generating the write clock, write data, and memory clock signals, to minimize timing variations or skew between these outputs.
Cyclone III Devices
To verify that the memory interface's PLL is configured correctly, the following assumptions are necessary:
■ The PLL that generates the memory output clock signals and write data/clock signals must be set to Normal compensation mode in Cyclone III devices.
■ PLL cascading is not supported.
■ The reference input clock signal to the PLL must be driven by the dedicated clock input pin located adjacent to the PLL. The reference input clock pin must not be routed through the core using global or regional clock networks, to minimize output clock jitter.
Timing Closure
This section describes common issues and how to optimize timing.
Common Issues
This topic describes potential timing closure issues that can occur when using the
ALTMEMPHY or UniPHY IP. For possible timing closure issues with ALTMEMPHY
or UniPHY variations, refer to the Quartus II Software Release Notes for the software
version that you are using. You can solve some timing issues by moving registers or
changing the project fitting setting to Standard (from Auto).
f The Quartus II Software Release Notes list common timing issues that can be
encountered in a particular version of the Quartus II software.
Missing Timing Margin Report
The ALTMEMPHY and UniPHY timing margin reports may not be generated during
compilation if the .sdc does not appear in the Quartus II project settings.
Timing margin reports are not generated if you specify the ALTMEMPHY or UniPHY
variation as the top-level project entity. Instantiate the ALTMEMPHY or UniPHY
variation as a lower level module in your user design or memory controller.
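If the missing margin reports are caused by the .sdc not being listed in the project settings, you can also add the constraint file from a script. This is a minimal sketch; the file name is a placeholder for the .sdc generated with your ALTMEMPHY or UniPHY variation.

    # Add the generated .sdc to the project so that the timing margin reports are produced.
    # Run from the Quartus II Tcl console with the project open; the file name is a placeholder.
    set_global_assignment -name SDC_FILE my_mem_if_phy_ddr_timing.sdc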
Incomplete Timing Margin Report
The timing report may not include margin information for certain timing paths if
certain memory interface pins are optimized away during synthesis. Verify that all
memory interface pins appear in the <variation>_autodetectedpins.tcl
(ALTMEMPHY) or <variation>_all_pins.txt (UniPHY) file generated during
compilation, and ensure that they connect to the I/O pins of the top-level FPGA
design.
Read Capture Timing
In Stratix III and Stratix IV devices, read capture timing may fail if the DQS phase shift selected is not optimal or if the board skew specified is large.
■ You can adjust the effective DQS phase shift implemented by the DLL to balance setup and hold margins on the read timing path. The DQS phase shift can be adjusted in coarse PVT-compensated steps of 22.5°, 30°, 36°, or 45° by changing the number of delay buffers (range 1 to 4), or in fine steps using the DQS phase offset feature that supports uncompensated delay addition and subtraction in approximately 12-ps steps.
■ To adjust the coarse phase shift selection, determine the supported DLL modes for your chosen memory interface frequency by referencing the DLL and DQS Logic Block Specifications tables in the Switching Characteristics section of the device data sheet. For example, a 400-MHz DDR2 interface on a -2 speed grade device can use DLL mode 5 (resolution 36°, range 290–450 MHz) to implement a 72° phase shift, or DLL mode 6 (resolution 45°, range 360–560 MHz) to implement a 90° phase shift.
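The relationship between a DLL mode and the achievable coarse phase shift is simply the mode's resolution multiplied by the number of delay buffers selected. The following is a minimal sketch of that arithmetic using the example values above; the procedure name is illustrative only.

    # Coarse DQS phase shift = DLL resolution (degrees per delay buffer) x number of delay buffers (1 to 4).
    proc dqs_coarse_phase {resolution_deg num_buffers} {
        return [expr {$resolution_deg * $num_buffers}]
    }

    puts [dqs_coarse_phase 36 2]   ;# DLL mode 5, two delay buffers: 72 degrees
    puts [dqs_coarse_phase 45 2]   ;# DLL mode 6, two delay buffers: 90 degrees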
In Cyclone III devices, the read capture is implemented using a calibrated clock, and
therefore no clock phase-shift adjustment is possible. Additionally, the capture
registers are routed to specific LE registers in the logic array blocks (LABs) adjacent to
the IOE using predefined routing. Therefore, no timing optimization is possible for
this path. Ensure that you select the correct memory device speed grade for the FPGA
speed grade and interface frequency.
Ensure that you specify the appropriate board-skew parameter when you
parameterize the controllers in the parameter editor. The default board trace length
mismatch used is 20 ps.
Write Timing
Negative timing margins may be reported for write timing paths if the PLL phase
shift used to generate the write data signals is not optimal. Adjust the PLL phase shift
selection on the write clock PLL output using the PLL MegaWizard Plug-In Manager.
Regenerating the ALTMEMPHY- or UniPHY-based controller overwrites changes
made using the PLL MegaWizard Plug-In Manager.
Address and Command Timing
You can optimize the timing margins on the address and command timing path by
changing the PLL phase shift used to generate these signals. Modify the Dedicated
Clock Phase parameter in the PHY Settings page of the ALTMEMPHY parameter
editor. In the DDR2 or DDR3 SDRAM Controllers with UniPHY IP cores, modify the
Additional CK/CK# phase and Additional Address and Command clock phase
parameters.
The DDR2 and DDR3 SDRAM memory controllers use 1T memory timing even in half-rate mode, which may affect the address and command margins for DDR2 or DDR3 SDRAM designs that use memory DIMMs. DDR2 SDRAM designs are affected more because the address and command bus fans out to all the memory devices on a DIMM, increasing the loading effect on the bus. Make sure your board designs are robust enough to have the memory clock rising edge within the 1T address and command window. You can also use the Additional Address and Command clock phase parameter to adjust the phase of the address and command signals if needed.
The far-end load value and board trace delay differences between address and
command and memory clock pins can result in timing failures if they are not
accounted for during timing analysis.
The Quartus II Fitter may not optimally set output delay chains on the address and
command pins. To ensure that any PLL phase-shift adjustments are not negated by
delay chain adjustments, create logic assignments using the Assignment Editor to set
all address and command output pin D5 delay chains to 0.
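A minimal scripted sketch of such assignments follows, assuming the D5_DELAY instance assignment name applies to your device family; the pin names are examples only, and the actual address and command pin names come from your design.

    # Set the D5 output delay chain to 0 on address and command pins so that PLL
    # phase-shift adjustments are not negated by Fitter-chosen delay chain settings.
    # Run from the Quartus II Tcl console with the project open; pin names are placeholders.
    foreach pin {mem_addr[0] mem_addr[1] mem_ba[0] mem_ras_n mem_cas_n mem_we_n} {
        set_instance_assignment -name D5_DELAY 0 -to $pin
    }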
For HardCopy III, HardCopy IV, Stratix III, and Stratix IV devices, some corner cases of device family and memory frequency may require an increase to the address and command clock phase to meet core timing. You can identify this situation if the DDR timing report shows a PHY setup violation between the phy_clk launch clock, which is clock 0 (half-rate phy_clk) or clock 2 (full-rate phy_clk), and the address and command latch clock, clock 6.
If you see this timing violation, you may fix it by advancing the address and command clock phase as required. For example, a 200-ps violation for a 400-MHz interface represents 8% of the clock period, or 28.8°. Therefore, advance the address and command phase from 270° to 300°. However, this action reduces the setup and hold margin at the DDR device.
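The required phase advance can be computed directly from the reported violation and the interface clock period. The following is a minimal sketch of the arithmetic behind the example above, assuming 30° phase steps; the values are the example's, not measured data.

    # Convert a timing violation into the equivalent clock phase, then round up to
    # the next supported phase step (30-degree steps assumed here).
    set violation_ps  200.0
    set period_ps     2500.0                                          ;# 400-MHz interface
    set violation_deg [expr {360.0 * $violation_ps / $period_ps}]     ;# 28.8 degrees
    set phase_step    30.0
    set advance_deg   [expr {ceil($violation_deg / $phase_step) * $phase_step}]
    puts "Advance the address and command phase by $advance_deg degrees"   ;# 30 degrees (270 to 300)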
PHY Reset Recovery and Removal
A common cause for reset timing violations in ALTMEMPHY or UniPHY designs is
the selection of a global or regional clock network for a reset signal. The
ALTMEMPHY or UniPHY IP does not require any dedicated clock networks for reset
signals. Only ALTMEMPHY or UniPHY PLL outputs require clock networks, and any
other PHY signal using clock networks may result in timing violations.
You can correct such timing violations by:
■ Setting the Global Signal logic assignment to Off for the problem path (using the Assignment Editor), as sketched below, or
■ Adjusting the logic placement (using the Assignment Editor or Chip Planner)
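A minimal scripted equivalent of the first option follows; the target is a placeholder for the net or path reported in your timing violation, not a name generated by the IP.

    # Remove the global or regional clock network from a reset signal that does not need one.
    # Run from the Quartus II Tcl console with the project open; the -to value is a placeholder.
    set_instance_assignment -name GLOBAL_SIGNAL OFF -to "*phy_reset_n"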
Clock-to-Strobe (for DDR and DDR2 SDRAM Only)
Memory output clock signals and DQS strobes are generated using the same PLL
output clock. Therefore, no timing optimization is possible for this path and positive
timing margins are expected for interfaces running at or below the FPGA data sheet
specifications.
For DDR3 interfaces, the timing margin for this path is reported as Write Leveling.
Read Resynchronization and Write Leveling Timing (for SDRAM Only)
These timing paths apply only to Arria II GX, Stratix III, and Stratix IV devices, and
are implemented using calibrated clock signals driving dedicated IOE registers.
Therefore, no timing optimization is possible for these paths, and positive timing
margin is expected for interfaces running at or below the FPGA data sheet
specifications.
Ensure that you specify the correct memory device timing parameters (tDQSCK, tDSS,
tDSH) and board skew (tEXT) in the ALTMEMPHY, DDR, DDR2, and DDR3 SDRAM
Controllers with ALTMEMPHY, or DDR2 and DDR3 SDRAM Controllers with
UniPHY parameter editor.
Optimizing Timing
For full-rate designs you may need to use some of the Quartus II advanced features to meet core timing, by following these steps:
1. On the Assignments menu, click Settings. In the Category list, click Analysis & Synthesis Settings. For Optimization Technique, select Speed (see Figure 10–18).
Figure 10–18. Optimization Technique
2. In the Category list, click Physical Synthesis Optimizations. Specify the following options:
■ Turn on Perform physical synthesis for combinational logic.
f For more information about physical synthesis, refer to the Netlist Optimizations and Physical Synthesis chapter in the Quartus II Software Handbook.
■ Turn on Perform register retiming.
■ Turn on Perform automatic asynchronous signal pipelining.
■ Turn on Perform register duplication.
You can initially select Normal for Effort level; if core timing is still not met, select Extra (see Figure 10–19). A scripted equivalent of these settings is sketched after the figure.
Figure 10–19. Physical Synthesis Optimizations
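The same settings can be applied from a script instead of the GUI. This is a minimal sketch, assuming the standard QSF assignment names for these options in your Quartus II software version.

    # Speed-focused synthesis plus physical synthesis options for full-rate designs.
    # Run from the Quartus II Tcl console with the project open.
    set_global_assignment -name OPTIMIZATION_TECHNIQUE SPEED
    set_global_assignment -name PHYSICAL_SYNTHESIS_COMBO_LOGIC ON
    set_global_assignment -name PHYSICAL_SYNTHESIS_REGISTER_RETIMING ON
    set_global_assignment -name PHYSICAL_SYNTHESIS_ASYNCHRONOUS_SIGNAL_PIPELINING ON
    set_global_assignment -name PHYSICAL_SYNTHESIS_REGISTER_DUPLICATION ON
    set_global_assignment -name PHYSICAL_SYNTHESIS_EFFORT NORMAL  ;# change to EXTRA if timing is still not met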
Timing Deration Methodology for Multiple Chip Select DDR2 and DDR3
SDRAM Designs
In a multiple chip select system, each individual rank has its own chip select signal.
Consequently, you must change the Total Memory chip selects, Number of chip
select (for discrete components) or Number of chip select per slot (DIMMs) in the
Preset Editor of the ALTMEMPHY- or UniPHY-based parameter editors.
In the Preset Editor, you must leave the baseline non-derated tDS, tDH, tIS, tIH values,
because the settings on the Board Settings page account for multiple chip select slew
rate deration.
This section explains the following two timing deration methodologies for multiple
chip-select DDR2 and DDR3 SDRAM designs:
■ Timing Deration using the Board Settings
■ Timing Deration Using the Excel-Based Calculator
For Arria II GX, Arria II GZ, Stratix IV, and Stratix V devices, the ALTMEMPHY and the ALTMEMPHY- and UniPHY-based controller parameter editors have an option to select multiple chip-select deration.
To perform multiple chip-select timing deration for other Altera devices (for example
Cyclone III and Stratix III devices), Altera provides an Excel-based calculator
available from the Altera website.
Timing deration in this section applies to either discrete components or DIMMs.
You can derate DDR SDRAM multiple chip select designs by using the DDR2 SDRAM
section of the Excel-based calculator, but Altera does not guarantee the results.
This section assumes you know how to obtain data on PCB simulations for timing
deration from HyperLynx or any other PCB simulator.
Multiple Chip Select Configuration Effects
A DIMM contains one or several RAM chips on a small PCB with pins that connect it
to another system such as a motherboard or router.
Nonregistered (unbuffered) DIMMs (or UDIMMs) connect address and control buses
directly from the module interface to the DRAM on the module.
Registered DIMMs (RDIMMs) improve signal integrity (and hence potentially clock
rates and overall memory size) by electrically buffering the signals with a register, at a
cost of an extra clock of increased latency. In addition, most RDIMMs come with error
correction coding (ECC) as standard.
Multiple chip select configurations allow for one set of data pins (and address pins for
UDIMMs) to be connected to two or more memory ranks. Multiple chip select
configurations have a number of effects on the timing analysis including the
intersymbol interference (ISI) effects, board effects, and calibration effects.
ISI Effects
With multiple chip selects and possible slots loading the far end of the pin, there may
be ISI effects on a signal causing the eye openings for DQ, DQS, and address and
command signals to be smaller than for single-rank designs (Figure 10–20).
Figure 10–20 shows the eye shrinkage for the DQ signal of a single-rank system (top) and a multiple chip select system (bottom). The ISI eye reductions reduce the timing window available for both the write path and the address and command path analysis. You must specify them as output delay constraints in the .sdc.
Extra loading from the additional ranks causes the slew rate of signals from the FPGA
to be reduced. This reduction in slew rate affects some of the memory parameters
including data, address, command and control setup and hold times (tDS, tDH, tIS,
and tIH).
Figure 10–20. Eye Shrinkage for DQ Signal
Calibration Effects
In addition to the SI effects, multiple chip select topologies change the way that the
FPGA calibrates to the memories. In single-rank situations with leveling, the
calibration algorithms set delay chains in the FPGA such that specific DQ and DQS
pin delays from the memory are equalized (only for ALTMEMPHY-based designs at
401 MHz and above) so that the write-leveling and resynchronization timing
requirements are met. In single rank without leveling situations, the calibration
algorithm centers the resynchronization or capture phase such that it is optimum for
the single rank. When there are two or more ranks in a system, the calibration
algorithms must calibrate to the average point of the ranks.
Board Effects
Unequal length PCB traces result in delays reducing timing margins. Furthermore,
skews between different memory ranks can further reduce the timing margins in
multiple chip select topologies. Board skews can also affect the extent to which the
FPGA can calibrate to the different ranks. If the skew between various signals for
different ranks is large enough, the timing margin on the fully calibrated paths such as
write leveling and resynchronization changes.
To account for all these board effects for Arria II GX, Arria II GZ, Arria V, Cyclone V,
Stratix IV, and Stratix V devices, refer to the Board Settings page in the
ALTMEMPHY- or UniPHY-based controller parameter editors.
To perform multiple chip select timing deration for other Altera devices (for example
Cyclone III and Stratix III devices), use the Excel-based calculator available from the
Altera website.
Timing Deration using the Board Settings
When you target Arria II GX, Arria II GZ, Arria V, Cyclone V, Stratix IV, or Stratix V
devices, the ALTMEMPHY- or UniPHY-based parameter editors include the Board
Settings page, to automatically account for the timing deration caused by the
multiple chip selects in your design.
When you target Cyclone III or Stratix III devices, you can derate single chip-select
designs using the parameter editors to account for the skews, ISI, and slew rates in the
Board Settings page.
If you are targeting Cyclone III or Stratix III devices you see the following warning:
"Warning: Calibration performed on all chip selects, timing analysis
only performed on first chip select. Manual timing derating is
required"
You must perform manual timing deration using the Excel-based calculator.
The Board Settings page allows you to enter the parameters related to the board
design including skews, signal integrity, and slew rates. The Board Settings page also
includes the board skew setting parameter, Addr/Command to CK skew, (previously
on the PHY Settings tab).
Slew Rates
You can obtain the slew rates in one of the following ways:
■ Altera performs PCB simulations on internal Altera boards to compute the output slew rate and ISI effects of various multiple chip select configurations. These simulation numbers are prepopulated in the Slew Rates based on the number of ranks selected. The address and command setup and hold times (tDS, tDH, tIS, tIH) are then computed from the slew rate values and the baseline nonderated tDS, tDH, tIS, tIH numbers entered in the Preset Editor. The parameter editor shows the computed values in Slew Rates. If you do not have access to a simulator to obtain accurate slew rates for your specific system, Altera recommends that you use these prepopulated numbers for an approximate estimate of your actual board parameters.
■ Alternatively, you can update these default values if dedicated board simulation results are available for the slew rates. Custom slew rates cause the tDS, tDH, tIS, tIH values to be updated. Altera recommends performing board-level simulations to calculate slew rate numbers that account for accurate board-level effects for your board.
■ You can modify the auto-calculated tDS, tDH, tIS, tIH values with more accurate dedicated results direct from the vendor data sheets, if available.
Slew Rate Setup, Hold, and Derating Calculation
Slew rate is calculated based on the nominal slew rate for setup and hold times. The
total tIS (setup time) and tIH (hold time) required is calculated by adding the Micron
data sheet tIS (base) and tIH (base) values to the delta tIS and delta tIH derating
values, respectively.
For more information about slew rate calculation, setup, hold, and derating values,
download the data sheet specifications from the following vendor websites:
■
Micron (www.micron.com)
For example, refer to Command and Address Setup, Hold, and Derating section in the
Micron DDR3 data sheet.
■
JEDEC (www.jedec.org)
For example, refer to the DDR2 SDRAM Standard data sheet.
The following section describes the timing derating algorithms and shows you where to obtain the setup, hold, and derating values in the Micron data sheet.
The slew rate derating process uses the following timing derating algorithms, which are similar to the JEDEC specification:
tDS = tDS(base) + delta tDS + (VIHAC - VREF)/(DQ slew rate)
tDH = tDH(base) + delta tDH + (VIHDC - VREF)/(DQ slew rate)
tIS = tIS(base) + delta tIS + (VIHAC - VREF)/(Address/Command slew rate)
tIH = tIH(base) + delta tIH + (VIHDC - VREF)/(Address/Command slew rate)
where:
a. The setup and hold values for tDS(base), tDH(base), tIS(base), and tIH(base) are obtained from the Micron data sheet. Figure 10–21 shows an example of these values from the Micron data sheet.
Figure 10–21. Setup and Hold Values from Micron Datasheet
b. The JEDEC-defined logic trip points for the DDR3 SDRAM memory standard are as follows:
■ VIHAC = VREF + 0.175 V
■ VIHDC = VREF + 0.1 V
■ VILAC = VREF - 0.175 V
■ VILDC = VREF - 0.1 V
c. The derating values for delta tIS, tIH, tDH, and tDS are obtained from the Micron data sheet. Figure 10–22 shows the derating values from the Micron data sheet; a scripted example of the full calculation follows the figure.
Figure 10–22. Derating Values from Micron Datasheet
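The derating calculation above can also be scripted. The following is a minimal sketch, assuming example base, delta, trip-point, and slew-rate values; substitute the numbers from your memory vendor's data sheet and your board simulation.

    # Derated setup time per the formula above:
    #   tIS = tIS(base) + delta tIS + (VIHAC - VREF) / (Address/Command slew rate)
    # Times in ps, voltages in V, slew rate in V/ns (the voltage/slew term yields ns, so scale to ps).
    proc derate_tis {tis_base_ps delta_tis_ps vihac_minus_vref_v slew_v_per_ns} {
        return [expr {$tis_base_ps + $delta_tis_ps
                      + 1000.0 * $vihac_minus_vref_v / $slew_v_per_ns}]
    }

    # Example values only: 125 ps base, 45 ps derating delta, DDR3 AC trip point 0.175 V, 1.0 V/ns slew rate.
    puts [derate_tis 125.0 45.0 0.175 1.0]   ;# 345.0 ps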
Intersymbol Interference
ISI parameters are similarly auto-populated based on the number of ranks you select
with Altera's PCB simulations. You can update these autopopulated typical values, if
more accurate dedicated simulation results are available.
Altera recommends performing board-level simulations to calculate the slew rate and
ISI deltas that account for accurate board level effects for your board. You can use
HyperLynx or similar simulators to obtain these simulation numbers. The default
values have been computed using HyperLynx simulations for Altera boards with
multiple DDR2 and DDR3 SDRAM slots.
External Memory Interface Handbook
Volume 2: Design Guidelines
November 2011 Altera Corporation
Chapter 10: Analyzing Timing of Memory IP
Timing Deration Methodology for Multiple Chip Select DDR2 and DDR3 SDRAM Designs
1
10–49
For DQ and DQS ISI there is one textbox for the total ISI, which assumes symmetric
setup and hold. For address and command, there are two textboxes: one for ISI on the
leading edge, and one for the lagging edge, to allow for asymmetric ISI.
The wizard writes these parameters for the slew rates and the ISI into the .sdc and
they are used during timing analysis.
Board Skews
Table 10–10 lists the types of board skew; each entry gives the ALTMEMPHY and UniPHY parameter names followed by a description.

Table 10–10. Board Skews

■ Minimum CK/DQS skew to DIMM (ALTMEMPHY only): The largest negative skew that exists between the CK signal and any DQS signal when arriving at any rank. This value affects the write leveling margin for DDR3 SDRAM DIMM interfaces in multiple chip select configurations only.
■ Maximum CK/DQS skew to DIMM (ALTMEMPHY only): The maximum skew (or largest positive skew) between the CK signal and any DQS signal when arriving at any rank. This value affects the write leveling margin for DDR3 SDRAM DIMM interfaces in multiple chip select configurations.
■ Maximum skew between DIMMs (ALTMEMPHY); Maximum delay difference between DIMMs/devices (UniPHY): The largest skew or propagation delay between ranks (especially for different ranks in different slots). This value affects the resynchronization margin for DDR2 and DDR3 SDRAM interfaces in multiple chip select configurations.
■ Maximum skew within DQS group (ALTMEMPHY and UniPHY): The largest skew between DQ pins in a DQS group. This value affects the read capture and write margins for DDR2 and DDR3 SDRAM interfaces.
■ Maximum skew between DQS groups (ALTMEMPHY and UniPHY): The largest skew between DQS signals in different DQS groups. This value affects the resynchronization margin in non-leveled memory interfaces such as DDR2 and DDR3 SDRAM.
■ Address and command to CK skew (ALTMEMPHY); Maximum delay difference between Address/Command and CK (UniPHY): The skew (or propagation delay) between the CK signal and the address and command signals. Positive values represent address and command signals that are longer than CK signals; negative values represent address and command signals that are shorter than CK signals. The Quartus II software uses this skew to optimize the delay of the address and command signals to have appropriate setup and hold margins for DDR2 and DDR3 SDRAM interfaces.
■ Maximum skew within Address/Command bus (UniPHY only): The largest skew between the Address/Command signals.
Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time
This section describes how to measure eye reduction for address/command, DQ, and
DQS.
Address/Command
The setup and hold times for address/command eye reduction are measured by comparing the multirank and single-rank address/command timing windows, as shown in Figure 10–24. Relative to the single-rank address/command timing window, the reduction of the eye opening on the left side of the window denotes the setup time, while the reduction of the eye opening on the right side denotes the hold time.
To obtain the address/command eye reduction (setup time), measure the VIL(AC) or
VIH(AC) difference between the single rank and multirank timing window, denoted
by A in Figure 10–23 and Figure 10–24.
To obtain the address/command eye reduction (hold time), measure the VIL(DC) or
VIH(DC) difference between the single rank and multirank timing window, denoted
by B in Figure 10–23 and Figure 10–24.
Figure 10–23. Difference between Single Rank and Multirank Timing Window for
Address/Command Eye Reduction (Setup)
Figure 10–24. Difference between Single Rank and Multirank Timing Window for
Address/Command Eye Reduction (Hold)
For signals with multiple loads, look at the measurements for all the target
locations and pick the worst case eye width in all cases. For example, if pin A7 is
the worst case eye width from pins A0 to A14, then the A7 measurement is used
for the address signal. In general, look for eye reduction for the worst-pin in
single-rank as compared to the worst-pin in multirank regardless of whether the
pin is the same pin or a different pin.
DQ
The method to measure the DQ eye reduction is similar to the method you use to
measure the command/address eye reduction. To measure the DQ eye reduction
hold time, compare VIH(DC) or VIL(DC) between the single rank and multirank
timing window. To measure the DQ eye reduction setup time, compare VIH(AC) or
VIL(AC) between the single rank and multirank timing window.
DQS
DQS arrival time is the jitter before and after the single-rank timing window that you must enter in the GUI. The DQS arrival time is indicated by the DQS signal eye reduction between the single-rank system and multiple chip selects. Therefore, the method to measure the DQS arrival time is similar to the method you use to measure the command/address and DQ eye reduction.
ISI and Board Skew
Skews are systematic effects that are not time-varying, while ISI values are time- and pattern-dependent margin reductions.
In the .sdc, the address/command eye reduction is applied as an output delay constraint on the setup side and on the hold side, respectively.
For the write analysis, the eye reduction in DQ is applied as an output delay constraint, with half on the setup side and half on the hold side. Similarly, the extra variation in DQS is also applied as an output delay constraint, with half removed from the setup side and half removed from the hold side.
The board skews are included in the timing margin on the fully calibrated paths, such as write leveling and resynchronization. Both the ISI and board skew values are entered to ensure that the interfaces are not overconstrained.
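As an illustration of how this deration enters the .sdc, the following is a minimal hand-written sketch, not the constraints that the wizard actually generates. The clock and port names and the baseline delays are placeholders, and the total DQ eye reduction is split evenly between the max (setup) and min (hold) output delays.

    # Illustrative only: apply half of the DQ eye reduction to each side of the write window.
    set dq_eye_reduction_ns 0.048      ;# from board simulation (single-rank versus multirank)
    set base_out_max_ns     0.250      ;# placeholder baseline output-delay requirements
    set base_out_min_ns    -0.250

    set_output_delay -clock mem_clk -max \
        [expr {$base_out_max_ns + $dq_eye_reduction_ns / 2.0}] [get_ports mem_dq*]
    set_output_delay -clock mem_clk -min \
        [expr {$base_out_min_ns - $dq_eye_reduction_ns / 2.0}] [get_ports mem_dq*]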
Timing Deration Using the Excel-Based Calculator
To perform multiple chip select timing deration for other Altera devices (for example, Stratix III and Cyclone III devices), use the Excel-based calculator, which is available from the Altera website. You can also derate single chip-select cases using the Excel-based calculator for devices that do not have the Board Settings panel, provided you have the board simulation data to account for the ISI, skew, and slew rate information.
The Excel-based calculator requires data such as the slew rate, eye reduction, and board skews of your multiple chip select system as inputs, and outputs the final result based on built-in formulas.
The calculator requires the Quartus II timing results (report DDR section) from the
single rank version of your system. Two simulations are also required for the slew rate
and ISI information required by the calculator: a baseline single rank version of your
system and your multiple chip select system. The calculator uses the timing deltas of
these simulation results for the signals of interest (DQ, DQS, CK/CK#, address and
command, and CS). You must enter board skews for your specific board. The
calculator outputs the final derated timing margins, which provides a full analysis of
your multiple chip select system's performance.
The main assumption for this flow is that you have board trace models available for
the single rank version of your system. If you have these board trace models, the
Quartus II software automatically derates the effects for the single rank case correctly.
Hence the Excel-based calculator provides the deration of the supported single-rank
timing analysis, assuming that the single rank version has provided an accurate
baseline.
You must ensure that the single rank board trace models are included in your
Quartus II project so that the baseline timing analysis is accurate. If you do not have
board trace models, follow the process described at the end of this section.
Before You Use the Excel-based Calculator for Timing Deration
Ensure you have the following items before you use the Excel-based calculator for
timing deration:
1. A Quartus II project with your instantiated memory IP. Always use the latest
version of the Quartus II software, for the most accurate timing models.
2. The board trace models for the single rank version of your system.
If you do not have board trace models, refer to “Using the Excel-based
Calculator for Timing Deration (Without Board Trace Models)” on
page 10–55.
Using the Excel-Based Calculator
To obtain derated timing margins for multiple chip select designs using the Excel-based calculator, follow these steps:
1. Create a memory interface design in the Quartus II software.
2. Ensure board trace models are specified for your single rank system. Extract
Quartus II reported timing margins into the Excel-based calculator.
3. Use the slow 85C model timing results (Figure 10–25).
Use the worst-case values from all corners, which means that some values
may come from different corners. For example, a setup value may come
from the slow 85C model, while the hold value for the same metric may
come from the fast 0C model. In most cases, using the slow 85C model
should be accurate enough.
Figure 10–25. Quartus II Report DDR Section Timing Results for the Slow 85C Model
4. Enter the Report DDR results from Quartus II timing analysis into the Excel-based
calculator (Figure 10–26).
Figure 10–26. Calculator
5. Perform PCB SI simulations to get the following values:
■ Single rank eye width and topology eye width for data, strobe, and address and command signals.
■ Multiple chip select topology slew rates for clock, address and command, and data and strobe signals.
Table 10–11 lists the data rates and recommended stimulus patterns for various
signals and memory interface types.
Use a simulation tool (for example, HyperLynx), if you have access to the
relevant input files that the tool requires, or use prelayout line simulations
if the more accurate layout information is not available.
Table 10–11. Data Rates and Stimulus Patterns

■ DDR2 SDRAM (with a half-rate controller): CLK and DQS toggling pattern 400 MHz; DQ PRBS pattern 400 MHz; address and command PRBS pattern 100 MHz
■ DDR2 SDRAM (with a full-rate controller): CLK and DQS toggling pattern 300 MHz; DQ PRBS pattern 300 MHz; address and command PRBS pattern 150 MHz
■ DDR3 SDRAM (with a half-rate controller): CLK and DQS toggling pattern 533 MHz; DQ PRBS pattern 533 MHz; address and command PRBS pattern 133 MHz
6. Calculate the deltas to enter into the Excel-based calculator. For example, if DQ for
the single rank case is 853.682 ps and DQ for the dual rank case is 805.137 ps, enter
48 ps into the calculator (Figure 10–27).
For signals with multiple loads, look at the measurements at all the target
locations and pick the worst case eye width in all cases. For example, for the
address bus, if A7 is the worst case eye width from pins A0 to A14, use that
measurement for the address signal.
Figure 10–27. ISI and Slew Rate Values
7. Enter the topology slew rates into the slew rate deration section. The calculator
calculates the extra tDS, tDH, tIS, tIH values.
8. Obtain the board skew numbers for your PCB from either your board simulation
or from your PCB vendor and enter them into the calculator (Figure 10–28).
Figure 10–28. Board Skew Values
The Excel-based calculator then outputs the final derated numbers for your multiple
chip select design.
Figure 10–29. Derated Setup and Hold Values
These values are an accurate calculation of your multiple chip select design timing,
assuming the simulation data you provided is correct. In this example, there is
negative slack on some of the paths, so this design does not pass timing. You have the
following four options available:
■ Try to optimize margins and see if timing improves (for example, modify the address and command phase setting)
■ Lower the frequency of your design
■ Lower the loading (change the topology of your interface to lower the loading and skew)
■ Use a faster DIMM
Using the Excel-based Calculator for Timing Deration (Without Board Trace
Models)
If board trace models are not available for any of the signals of the single rank system,
follow these steps:
1. Create a new Quartus II Project with the Stratix III or Cyclone III device that you
are targeting and instantiate a High-Performance SDRAM Controller for your
memory interface.
2. Do not enter the board trace models (assumes a 0-pf load) and compile the
Quartus II design.
3. Enter the Report DDR setup and hold slack numbers into the Excel-based
calculator.
4. Perform a prelayout line simulation of a 0-pf load simulation and obtain eye width
and slew rate numbers. Perform multiple chip select simulations of your topology
and use the Excel-based calculator.
Performing I/O Timing Analysis
For accurate I/O timing analysis, the Quartus II software must be made aware of the
board trace and loading information. This information must be derived and refined
during your PCB development process of pre-layout (line) and post-layout (board)
simulations.
For external memory interfaces that use memory modules (DIMMs), the board trace
and loading information must include the trace and loading information of the
module in addition to the main and host platform, which you can obtain from your
memory vendor.
You can use the following I/O timing analysis methods for your memory interface:
■
Perform I/O Timing Analysis with 3rd Party Simulation Tools
■
Perform Advanced I/O Timing Analysis with Board Trace Delay Model
Perform I/O Timing Analysis with 3rd Party Simulation Tools
Altera recommends that you perform I/O timing analysis using the 3rd party
simulation tool flow because this flow involves post layout simulation that can
capture more accurate I/O timing. This method is also easier because it only requires
you to enter the slew rates, board skews, and ISI values in the ALTMEMPHY or
UniPHY IP parameter editor.
To perform I/O timing analysis using 3rd party simulation tools, follow these steps:
1. Use a 3rd party simulation tool such as HyperLynx to simulate the full path for
DQ, DQS, CK, Address, and Command signals.
2. Under the Board Settings tab of the ALTMEMPHY or UniPHY parameter editor,
enter the slowest slew rate, ISI, and board skew values.
Perform Advanced I/O Timing Analysis with Board Trace Delay Model
You should use this method only if you are unable to perform post-layout simulation
on the memory interface signals to obtain the slew rate parameters, and/or when no
simulation tool is available.
To perform I/O timing analysis using board trace delay model, follow these steps:
1. After the instantiation is complete, analyze and synthesize your design.
2. Add pin and DQ group assignment by running the
<variation_name>_p0_pin_assignments.tcl script.
The naming of the pin assignment file may vary depending on the
Quartus II software version that you are using.
3. Enter the pin location assignments.
4. Assign the virtual pins, if necessary.
5. Enter the board trace model information. To enter board trace model information,
follow these steps:
a. In the Pin Planner, select the pin or group of pins for which you want to enter
board trace parameters.
b. Right-click and select Board Trace Model.
6. Compile your design. To compile the design, on the Processing menu, click Start
Compilation.
7. After successfully compiling the design, perform timing analysis in the TimeQuest
timing analyzer. To perform timing analysis, follow these steps:
a. In the Quartus II software, on the Tools menu, click TimeQuest Timing
Analyzer.
b. On the Tasks pane, click Report DDR.
c. On the Report pane, select Advanced I/O Timing>Signal Integrity Metrics.
d. In the Signal Integrity Metrics window, right-click and select Regenerate to
regenerate the signal integrity metrics.
e. In the Signal Integrity Metrics window, note the 10–90% rise time (or fall time
if fall time is worse) at the far end for CK/CK#, address, and command,
DQS/DQS#, and DQ signals.
f. In the DDR3 SDRAM controller parameter editor, in the Board Settings tab,
type the values you obtained from the signal integrity metrics.
g. For the board skew parameters, set the maximum skew within DQS groups of
your design. Set the other board parameters to 0 ns.
h. Compile your design.
Document Revision History
Table 10–12 lists the revision history for this document.
Table 10–12. Document Revision History

November 2011, version 4.0:
■ Added Arria V and Cyclone V information.
■ Added Performing I/O Timing Analysis section.
■ Added Measuring Eye Reduction for Address/Command, DQ, and DQS Setup and Hold Time section.
June 2011, version 3.0: Updated for 11.0 release.
December 2010, version 2.1: Added Arria II GZ and Stratix V, updated board skews table.
July 2010, version 2.0: Added information about UniPHY-based IP and controllers.
January 2010, version 1.2: Corrected minor typos.
December 2009, version 1.1: Added Timing Deration section.
November 2009, version 1.0: First published.
11. Debugging Memory IP
November 2011
EMI_DG_011-4.0
This chapter describes the process of debugging hardware and the tools to debug any
external memory interface. The concepts discussed can be applied to any IP but focus
on the debug of issues using the Altera® DDR, DDR2, DDR3, QDRII, QDRII+, and
RLDRAM II IP.
Increases in external memory interface frequency result in the following issues that increase the challenges of debugging interfaces:
■ More complex memory protocols
■ Increased features and functionality
■ More critical timing
■ Increased complexity of the calibration algorithm
Before the in-depth debugging of any issue, gather and confirm all information
regarding the issue.
Memory IP Debugging Issues
Debug issues may not be directly related to interface operation. Issues can also arise at the Quartus® II Fitter stage, or complex designs may have timing analysis issues. Memory debugging issues can be categorized as follows:
■ Resource and Planning Issues
■ Interface Configuration Issues
■ Functional Issues
■ Timing Issues
Resource and Planning Issues
Typically, single stand-alone interfaces should not present any Quartus II Fitter or timing issues. You may find that fitter, timing, and hardware operation can sometimes become a challenge as multiple interfaces are combined into a single project, or as the device utilization increases. In such cases, the interface configuration is not the issue; rather, the placement and total device resource requirements create problems.
Resource Issue Evaluation
External memory interfaces typically require the following resource types, which you must consider when manually placing external memory interface IP, or when using additional constraints to force its placement or location:
■ Dedicated IOE DQS group resources and pins
■ Dedicated DLL resources
■ Specific PLL resources
■ Specific global, regional, and dual-regional clock net resources
Dedicated IOE DQS Group Resources and Pins
Fitter issues can occur with even a single interface if you do not size the interface to fit within the specified constraints and requirements. A typical requirement includes containing assignments for the interface within a single bank, or possibly a single side, of the chosen device.
Such a constraint requires that the chosen device meets the following conditions:
■ Sufficient DQS groups and sizes to support the required number of common I/O (CIO) or separate I/O (SIO) data groups.
■ Sufficient remaining pins to support the required number of address, command, and control pins.
Failure to evaluate these fundamental requirements can result in suboptimal interface
design, if the chosen device cannot be modified. The resulting wraparound interfaces
or suboptimal pseudo read and write data groups artificially limit the maximum
operating frequency.
Multiple blocks of IP further complicate the issue, if other IP has either no specified
location constraints or incompatible location constraints.
The Quartus II fitter may first place other components in a location required by your
memory IP, then error at a later stage because of an I/O assignment conflict between
the unconstrained IP and the constrained memory IP.
Your design may require that one instance of IP is placed anywhere on one side of the
device, and that another instance of IP is placed at a specific location on the same side.
While the two individual instances may compile in isolation, and the physical number
of pins may appear sufficient for both instances, issues can occur if the instance
without placement constraints is placed before the instance with placement
constraints.
In such circumstances, Altera recommends manually placing each individual pin, or at least trying more granular placement constraints.
f For more information about the pin number and DQS group capabilities of your
chosen device, refer to device data sheets or the Quartus II pin planner.
Dedicated DLL Resources
Altera devices typically use DLLs to enhance data capture at the FPGA.
While multiple external memory interfaces can usually share DLL resources, fitter
issues can occur when there is insufficient planning before HDL coding. If DLL
sharing is required, Altera gives the following recommendations for each instance of
the IP that shares the DLL resources:
■ Must have compatible DLL requirements—same frequency and mode.
■ Exports its autogenerated DLL instance out of its own dedicated PHY hierarchy and into the top-level design file. This procedure allows easy comparison of the generated DLL's mode, and allows you to explicitly show the required DLL sharing between two IP blocks in the HDL.
The Quartus II fitter does not dynamically merge DLL instances.
Specific PLL Resources
When only a single interface resides on one side or one quadrant of a device, PLL resources are typically not an issue. However, if multiple interfaces or IP are required on a single side or quadrant, consider the specific PLL used by each IP, and the sharing of any PLL resources.
The Quartus II software automerges PLL resources, but not for any dynamically controlled PLL components. Use the following PLL resource rules:
■ Ensure that the PLL located in the same bank or side of the device is available for your memory controller.
■ If multiple PLLs are required for multiple controllers that cannot be shared, ensure that enough PLL resources are available within each quadrant to support your interface number requirements.
■ Try to limit multiple interfaces to a single quadrant. For example, if two complete same-size interfaces can fit on a single side of the device, constrain one interface entirely in one bank of that side, and the other controller in the other bank.
f For more information about using multiple PHYs or controllers, refer to the design
tutorials on the List of designs using Altera External Memory IP page of the Altera
Wiki website.
Specific Global, Regional and Dual-Regional Clock Net Resources
Memory PHYs typically have specific clock resource requirements for each PLL clock output. For example, because of characterization data, the PHY may require that the phy_clk is routed on a global clock net. The remaining clocks may all be routed on a global or a regional clock net; however, they must all be routed on the same type. Otherwise, the operating frequency of the interface is lowered because of the increased uncertainty between two different types of clock nets. The design may still fit, but not meet timing.
Planning Issue Evaluation
Plan the total number of DQS groups and total number of other pins required in your
shared area. Use the Pin Planner to assist with this activity.
Decide which PLLs or clock networks can be shared between IP blocks, then ensure
that sufficient resources are available. For example, if an IP core requires a regional
clock network, a PLL located on the opposite side of the device cannot be used.
Calculate the number of total clock networks and types required when trying to
combine multiple instances of IP.
You must understand the number of quadrants that the IP uses and if this number can
be reduced. For example, an interface may be autoplaced across an entire side of the
device, but may actually be constrained to fit in a single bank.
Optimizing the physical placement ensures that when possible, regional clock
networks are used as opposed to dual-regional clock networks, hence clock net
resources are maintained and routing is simplified.
As device utilization increases, the Quartus II software may have difficulty placing the core. To optimize design utilization, follow these steps:
■ Review any fitter warning messages in multiple IP designs to ensure that clock networks or PLL modes are not modified to achieve the desired fit.
■ Use the Quartus II Fitter resource section to compare the types of resources used in a successful standalone IP implementation to those used in an unreliable multiple IP implementation.
■ Use this information to better constrain the project to achieve the same results as the standalone project.
■ Use the Chip Planner (Floorplan and Chip Editor) to compare the placement of the working stand-alone design to the multiple IP project. Then use LogicLock™ or Design Partitions to better guide the Quartus II software to the required results.
■ When creating LogicLock regions, ensure that they encompass all required resources. For example, if constraining the read and write datapath hierarchy, ensure that your LogicLock region includes the IOE blocks used for your datapath pin out.
Interface Configuration Issues
This topic describes the performance (fMAX), efficiency (latency and transaction
efficiency) and bottleneck (the limiting factor) issues that you can encounter in any
design.
Performance Issues
There are a large number of interface combinations and configurations possible in an Altera design; therefore, it is impractical for Altera to explicitly state the achievable fMAX for every combination. Altera seeks to provide guidance on typical performance, but this data is subject to memory component timing characteristics, interface widths, depths directly affecting timing deration requirements, and the achieved skew and timing numbers for a specific PCB.
FPGA timing issues should generally not be affected by interface loading or layout
characteristics. In general, the Altera performance figures for any given device family
and speed-grade combination should usually be achievable.
f To resolve FPGA (PHY and PHY reset) timing issues, refer to the Analyzing Timing of
Memory IP chapter.
Achievable interface timing (address and command, half-rate address and command,
read and write capture) is directly affected by any layout issues (skew), loading issues
(deration), signal integrity issues (crosstalk timing deration), and component speed
grades (memory timing size and tolerance). Altera performance figures are typically
stated for the default (single rank, unbuffered DIMM) case. Altera provides additional
expected performance data where possible, but the fMAX is not achievable in all
configurations. Altera recommends that you optimize the following items whenever
interface timing issues occur:
■
Improve PCB layout tolerances
■
Use a faster speed grade of memory component
■
Ensure that the interface is fully and correctly terminated
■
Reduce the loading (reduce the deration factor)
Bottleneck and Efficiency Issues
Depending on the transaction types, efficiency issues can exist where the achieved
data rate is lower than expected. Ideally, these issues should be assessed and resolved
during the simulation stage because they are sometimes impossible to solve later
without rearchitecting the product.
Any interface has a maximum theoretical data rate derived from the clock frequency;
however, in practice this theoretical data rate can never be achieved continuously
because of protocol overhead and bus turnaround times.
Simulate your desired configuration to ensure that you have specified a suitable
external memory family and that your chosen controller configuration can achieve
your required bandwidth.
Efficiency can be assessed in several different ways, and the primary requirement is
an achievable continuous data rate. The local interface signals combined with the
memory interface signals and a command decode trace should provide adequate
visibility of the operation of the IP to understand whether your required data rate is
sufficient and the cause of the efficiency issue.
To show if under ideal conditions the required data rate is possible in the chosen
technology, follow these steps:
1. Use the memory vendor's own testbench and your own transaction engine.
2. Use either your own driver, or modify the provided example driver, to replicate
the transaction types typical of your system.
3. Simulate this performance using your chosen memory controller and decide if the
achieved performance is still acceptable.
Observe the following points that may cause efficiency or bottleneck issues at this
stage:
■
Identify the memory controller rate (full, half, or quarter) and commands, which
may take two or four times longer than necessary
■
Determine whether the memory controller is starved for data by observing the
appropriate request signals.
■
Determine whether the memory controller processes transactions at a rate
sufficient to meet throughput requirements by observing appropriate signals,
including the local ready signal (a monitoring sketch follows this list).
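As a starting point, the following is a minimal monitoring sketch, assuming the local interface signal names used elsewhere in this chapter (local_ready, local_read_req, local_write_req, local_rdata_valid); the module name, counter widths, and clock name are illustrative only. It counts cycles in which a command is pending but not accepted, and cycles in which read data returns, giving a rough split between stalling and useful throughput:

// Hypothetical efficiency monitor to attach alongside the example driver.
module local_if_monitor (
  input  wire        phy_clk,           // local interface clock
  input  wire        reset_n,
  input  wire        local_ready,       // controller can accept a command
  input  wire        local_read_req,
  input  wire        local_write_req,
  input  wire        local_rdata_valid,
  output reg  [31:0] stall_cycles,      // command pending but not accepted
  output reg  [31:0] rdata_cycles,      // cycles returning read data
  output reg  [31:0] total_cycles
);
  always @(posedge phy_clk or negedge reset_n) begin
    if (!reset_n) begin
      stall_cycles <= 32'd0;
      rdata_cycles <= 32'd0;
      total_cycles <= 32'd0;
    end else begin
      total_cycles <= total_cycles + 32'd1;
      if ((local_read_req || local_write_req) && !local_ready)
        stall_cycles <= stall_cycles + 32'd1;
      if (local_rdata_valid)
        rdata_cycles <= rdata_cycles + 32'd1;
    end
  end
endmodule

Comparing rdata_cycles and stall_cycles against total_cycles over a fixed test window indicates whether the bottleneck is the controller (high stall count) or the driver (low request rate).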
Altera has several versions and types of memory controller, and where possible you
can evaluate different configurations based on the results of the first tests.
Consider using either a faster interface, or a different memory type to better align
your data rate requirements to the IP available directly from Altera.
Altera also provides stand-alone PHY configurations so that you may develop custom
controllers or use third-party controllers designed specifically for your requirements.
Functional Issues
This topic discusses functional issues that occur at all frequencies (using the same
conditions) and are not altered by speed grade, temperature, or PCB changes.
Functional Issue Evaluation
Functional issues should be evaluated using functional simulation.
The Altera IP includes the option to autogenerate a testbench specific to your IP
configuration, which provides an easy route to functional verification.
The following issues should be considered when trying to debug functional issues in a
simulation environment.
Correct Combination of the Quartus II Software and ModelSim-Altera Device
Models
When running any simulations, ensure that you are using the correct combination of
the Quartus II software and device models. Altera only tests each release of software
and IP with the aligned release of device models. Failure to use the correct RTL and
model combination may result in unstable simulation environments.
The ModelSim®-Altera edition of the ModelSim simulator comes precompiled with
the Altera device family libraries included. You must always install the correct release
of ModelSim-Altera to align with your Quartus II software and IP release.
If you are using a full version of ModelSim-SE or PE, or any other supported
simulation environment, ensure that you are compiling the current Quartus II
supplied libraries. These libraries are located in the <Quartus II install
path>/quartus/eda/sim_lib/ directory.
Altera IP Memory Model
Altera memory IP autogenerates a generic simplified memory model that works in all
cases. This simple read and write model is not designed or intended to verify all
entered IP parameters or transaction requirements.
The Altera-generated memory model may be suitable to evaluate some limited
functional issues, but it does not provide comprehensive functional simulation.
Vendor Memory Model
Contact the memory vendor directly, as many additional models are available from
the vendor's support system.
When using memory vendor models, ensure that the model is correctly defined for
the following characteristics:
■
Speed grade
■
Organization
■
Memory allocation
■
Maximum memory usage
■
Number of ranks on a DIMM
■
Buffering on the DIMM
■
ECC
f Refer to the readme.txt file supplied with the memory vendor model, for more
information about how to define this information for your configuration.
During simulation, vendor models output a wealth of information regarding any
device violations that may occur because of incorrectly parameterized IP.
f Refer to Transcript Window Messages, for more information.
Out of PC Memory Issues
If you are running the ModelSim-Altera simulator, the limitation on memory size
may mean that you have insufficient memory to run your simulation. Or, if you are
using a 32-bit operating system, your PC may have insufficient memory.
Typical simulation tool errors include: "Iteration limit reached" or "Error out of
memory".
When using either the Altera generic memory model or a vendor-specific model, quite
large memory depths can be required if you do not specify your simulation carefully.
For example, if you simulate an entire 4-GB DIMM interface, the hardware platform
that performs that simulation requires at least this amount of memory just for the
simulation contents storage.
f Refer to Memory Allocation and Max Memory Usage in the vendor's readme.txt files
for more information.
Transcript Window Messages
When debugging a functional issue in simulation, vendor models typically provide
much more detailed checks and feedback regarding the interface and its operational
requirements than the Altera generic model.
In general, you should use a vendor-supplied model whenever one is available.
Consider using second-source vendor models in preference to the Altera generic
model.
Many issues can be traced to incorrectly configured IP for the specified memory
components. Component data sheets usually contain settings information for several
different speed grades of memory. Be aware that data sheets specify parameters in fixed
units of time, frequency, or clock cycles.
The Altera generic memory model always matches the parameters specified in the IP,
as it is generated using the same engine. Because vendor models are independent of
the IP generation process, they offer a more robust IP parameterization check.
During simulation, review the transcript window messages and do not rely on the
Simulation Passed message at the end of simulation. This message only indicates that
the example driver successfully wrote and then read the correct data for a single test
cycle.
Even if the interface functionally passes in simulation, the vendor model may report
operational violations in the transcript window. These reported violations often
specifically explain why an interface appears to pass in simulation, but fails in
hardware.
Vendor models typically perform checks to ensure that the following types of
parameters are correct:
■
Burst length
■
Burst order
■
tMRD
■
tMOD
■
tRFC
■
tREFPDEN
■
tRP
■
tRAS
■
tRC
■
tACTPDEN
■
tWR
■
tWRPDEN
■
tRTP
■
tRDPDEN
■
tINIT
■
tXPDLL
■
tCKE
■
tRRD
■
tCCD
■
tWTR
■
tXPR
■
PRECHARGE
■
CAS length
■
Drive strength
■
AL
■
tDQS
■
CAS_WL
■
Refresh
■
Initialization
■
tIH
■
tIS
■
tDH
■
tDS
If a vendor model can verify all these parameters are compatible with your chosen
component values and transactions, it provides a specific insight into hardware
interface failures.
Simulation
Passing simulation means that the interface calibrates and successfully completes a
single test cycle without the pass-not-fail (pnf) signal reporting an error. It does not take into
account any warning messages that you may receive during simulation. If you are
debugging an interface issue, review and, if necessary, correct any warning messages
from the transcript window before continuing.
Modifying the Example Driver to Replicate the Failure
Often during debugging, you may discover that the example driver design works
successfully, but that your custom logic is observing data errors. This information
indicates that the issue is either:
■
Related to the way that the local interface transactions are occurring. Altera
recommends you probe and compare using the SignalTap™ II analyzer.
■
Related to the types or format of transactions on the external memory interface.
Altera recommends you modify the example design to replicate the issue.
Typical issues on the local interface side include:
■
Incorrect local address to memory address translation causing the word order to
be different than expected. Refer to Burst Definition in your memory vendor data
sheet.
■
Incorrect timing on the local interface. When your design requests a transaction,
the local side must be ready to service that transaction as soon as it is accepted
without any pause.
f For more information, refer to the Avalon® Interface Specification.
The default example driver only performs a limited set of transaction types,
consequently potential bus contention or preamble and postamble issues can often be
masked in its default operation. For successful debugging, isolate the custom logic
transaction types that are causing the read and write failures and modify the example
driver to demonstrate the same issue. Then, you can try to replicate the failure in RTL
simulation with the modified driver.
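As an illustration of such a driver modification, the sketch below hammers one address with an alternating data pattern and immediately reads it back. The signal names follow the local interface list later in this chapter, while the module name, handshake assumptions, test address, and data pattern are placeholders to be replaced with your own failing transaction types:

// Hypothetical pattern generator to graft into the example driver. Hammers
// one address with alternating data, then reads it back, so that a suspect
// transaction sequence can be replayed in RTL simulation.
module hammer_driver #(
  parameter ADDR_W = 25,
  parameter DATA_W = 64
)(
  input  wire              phy_clk,
  input  wire              reset_n,
  input  wire              local_ready,
  output reg               local_write_req,
  output reg               local_read_req,
  output reg  [ADDR_W-1:0] local_address,
  output reg  [DATA_W-1:0] local_wdata
);
  localparam [ADDR_W-1:0] TEST_ADDR = 'h100;     // assumed test address

  reg              rd_phase;                     // 0 = write phase, 1 = read phase
  reg [DATA_W-1:0] pattern;

  always @(posedge phy_clk or negedge reset_n) begin
    if (!reset_n) begin
      rd_phase        <= 1'b0;
      pattern         <= {DATA_W/2{2'b01}};      // 0101... hammer seed
      local_write_req <= 1'b0;
      local_read_req  <= 1'b0;
      local_address   <= TEST_ADDR;
      local_wdata     <= {DATA_W{1'b0}};
    end else if (!rd_phase) begin                // write phase
      local_wdata <= pattern;
      if (local_ready && local_write_req) begin  // write accepted
        local_write_req <= 1'b0;
        pattern         <= ~pattern;             // alternate the hammer pattern
        rd_phase        <= 1'b1;
      end else begin
        local_write_req <= 1'b1;
      end
    end else begin                               // read phase
      if (local_ready && local_read_req) begin   // read accepted
        local_read_req <= 1'b0;
        rd_phase       <= 1'b0;
      end else begin
        local_read_req <= 1'b1;
      end
    end
  end
endmodule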
An issue that you can replicate in RTL simulation indicates a potential bug in the IP.
You should recheck the IP parameters. An issue that you cannot replicate in RTL
simulation indicates a timing issue on the PCB. You can try to replicate the issue on an
Altera development platform to rule out a board issue.
1
Ensure that all PCB timing, loading, skew, and deration information is correctly
defined in the Quartus II software, as the timing report is inaccurate if this initial data
is not correct.
Functional simulation allows you to identify any issues with the configuration of
either the Altera memory controller or the PHY. You can then easily check the
operation against both the memory vendor data sheet and the respective JEDEC
specification. After you resolve functional issues, you can start testing hardware.
f For more information about simulation, refer to the Simulating Memory IP chapter.
Timing Issues
The Altera PHY and controller combinations autogenerate timing constraint files to
ensure that the PHY and external interface are fully constrained and that timing is
analyzed during compilation. However, timing issues can still occur. This topic
discusses how to identify and resolve any timing issues that you may encounter.
Timing Issue Characteristics
Timing issues typically fall into two distinct categories:
■
FPGA core timing reported issues
■
External memory interface timing issues in a specific mode of operation or on a
specific PCB
TimeQuest reports timing issues in two categories: core to core and core to IOE
transfers. These timing issues include the PHY and PHY reset sections in the
TimeQuest Report DDR subsection of timing analysis. External memory interface
timing issues are specifically reported in the TimeQuest Report DDR subsection,
excluding the PHY and PHY reset. The Report DDR PHY and PHY reset sections only
include the PHY, and specifically exclude the controller, core, PHY-to-controller and
local interface. Quartus II timing issues should always be evaluated and corrected
before proceeding to any hardware testing.
PCB timing issues are usually Quartus II timing issues that are not reported in the
Quartus II software because incorrect or insufficient PCB topology and layout information
has been supplied. PCB timing issues are typically characterized by calibration failure, or by
failures during user mode when the hardware is heated or cooled. Furthermore, PCB timing
issues are typically hidden if the interface frequency is lowered.
Timing Issue Evaluation
Try to identify and fix timing issues in the Quartus II software. Consider the following
issues when resolving timing issues.
FPGA Timing Issues
In general, you should not have any timing issues with Altera-provided IP unless you
are running the IP outside of Altera's published performance range or are using a version
of the Quartus II software with preliminary timing model support for new devices.
However, timing issues can occur in the following circumstances:
■
The .sdc files are incorrectly added to the Quartus II project
■
Quartus II analysis and synthesis settings are not correct
■
Quartus II Fitter settings are not correct
For all of these issues, refer to the appropriate user guide for more information about
recommended settings, and follow these steps:
1. Ensure that the IP generated .sdc files are listed in the Quartus II TimeQuest
Timing Analyzer files to include in the project window.
2. Ensure that Analysis and Synthesis Settings are set to Optimization Technique
Speed.
3. Ensure that Fitter Settings are set to Fitter Effort Standard Fit.
4. Use TimeQuest Report Ignored Constraints, to ensure that .sdc files are
successfully applied.
5. Use TimeQuest Report Unconstrained Paths, to ensure that all critical paths are
correctly constrained.
More complex timing issues can occur if any of the following conditions are true:
■
The design includes multiple PHY or core projects
■
Devices where the resources are heavily used
■
The design includes wide, distributed, maximum performance interfaces in large
die sizes
Any of these conditions can lead to suboptimal placement results when the PHY or
controller are distributed around the FPGA. To evaluate such issues, simplify the
design to just the autogenerated example top-level file and determine if the core meets
timing and you see a working interface. Failure implies that a more fundamental
timing issue exists. If the standalone design passes core timing, evaluate how this
placement and fit is different than your complete design.
Use LogicLock regions, or design partitions to better define the placement of your
memory controllers. When you have your interface standalone placement, repeat for
additional interfaces, combine, and finally add the rest of your design.
Additionally, use fitter seeds and increase the placement and router effort multiplier.
External Memory Interface Timing Issues
External memory interface timing issues are not directly related to the FPGA timing
but are actually derived from the FPGA input and output characteristics, PCB timing,
and the memory component characteristics.
The FPGA input and output characteristics tend to be predominantly fixed values, as
the IOE structure of the devices is fixed. Optimal PLL characteristics and clock routing
characteristics do have an effect. Assuming the IP is correctly constrained with the
autogenerated assignments, and you follow implementation rules, the design should
reach the stated performance figures.
The memory component characteristics are fixed for any given component or DIMM.
However, consider using faster components or DIMMs in marginal cases when PCB
skew may be suboptimal, or your design includes multiple ranks when deration may
be causing read capture or write timing challenges. Using faster memory components
typically reduces the memory data output skew and uncertainty, easing read capture,
and lowers the memory's input setup and hold requirement, which eases write
timing.
Increased PCB skew reduces margins on address, command, read capture and write
timing. If you are narrowly failing timing on these paths, consider reducing the board
skew (if possible), or using faster memory. Address and command timing typically
requires you to manually balance the reported setup and hold values with the
dedicated address and command phase in the IP.
f Refer to the respective IP user guide for more information.
Multiple-slot multiple-rank UDIMM interfaces can place considerable loading on the
FPGA driver. Typically, a quad-rank interface can have thirty-six loads. In multiple-rank
configurations, Altera's stated maximum data rates are not likely to be achievable because
of loading deration. Consider using different topologies, for example registered DIMMs,
so that the loading is reduced.
Deration because of increased loading or suboptimal layout may result in the design
meeting timing only at a lower-than-desired operating frequency. You should close timing
in the Quartus II software using your expected loading and layout rules before committing
to PCB fabrication.
Ensure that any design with an Altera PHY is correctly constrained and meets timing
in the Quartus II software. You must address any constraint or timing failures before
testing hardware.
f For more information about timing constraints, refer to the Analyzing Timing of
Memory IP chapter.
Verifying Memory IP Using the SignalTap II Logic Analyzer
The SignalTap II logic analyzer shows read and write activity in the system.
f For more information about using the SignalTap II logic analyzer, refer to Design
Debugging Using the SignalTap II Embedded Logic Analyzer chapter in volume 3 of the
Quartus II Software Handbook.
To add the SignalTap II logic analyzer, follow these steps:
1. On the Tools menu click SignalTap II Logic Analyzer.
2. In the Signal Configuration window next to the Clock box, click … (Browse Node
Finder).
3. Type the memory interface system clock (typically *phy_clk) in the Named box,
for Filter select SignalTap II: pre-synthesis and click List.
4. Select the memory interface system clock (<variation
name>_example_top|<variation name>:<variation name>_inst|<variation
name>_controller_phy:<variation name>_controller_phy_inst|phy_clk|phy_clk)
in Nodes Found and click > to add the signal to Selected Nodes.
5. Click OK.
6. Under Signal Configuration, specify the following settings:
■
For Sample depth, select 512
■
For RAM type, select Auto
■
For Trigger flow control, select Sequential
■
For Trigger position, select Center trigger position
■
For Trigger conditions, select 1
7. On the Edit menu, click Add Nodes.
8. Search for specific nodes by typing *local* in the Named box, for Filter select
SignalTap II: pre-synthesis and click List.
9. Select the following nodes in Nodes Found and click > to add to Selected Nodes:
■
local_address
■
local_rdata
■
local_rdata_valid
■
local_read_req
■
local_ready
■
local_wdata
■
local_wdata_req
■
local_write_req
■
pnf
■
pnf_per_byte
■
test_complete (trigger)
■
ctl_cal_success
■
ctl_cal_fail
■
ctl_wlat
■
ctl_rlat
1
Do not add any memory interface signals to the SignalTap II logic analyzer.
The load on these signals increases and adversely affects the timing
analysis.
10. Click OK.
11. To reduce the SignalTap II logic size, turn off Trigger Enable on the following bus
signals:
■
local_address
■
local_rdata
■
local_wdata
■
pnf_per_byte
■
ctl_wlat
■
ctl_rlat
12. Right-click Trigger Conditions for the test_complete signal and select Rising
Edge.
13. On the File menu, click Save, to save the SignalTap II .stp file to your project.
1
If you see the message Do you want to enable SignalTap II file “stp1.stp”
for the current project, click Yes.
14. After you add signals to the SignalTap II logic analyzer, recompile your design: on
the Processing menu, click Start Compilation.
15. When the design compiles, ensure that TimeQuest timing analysis passes
successfully. In addition to this FPGA timing analysis, check your PCB or system
SDRAM timing. To run timing analysis, run the *_phy_report_timing.tcl script.
a. On the Tools menu, click Tcl Scripts.
b. Select <variation name>_phy_report_timing.tcl and click Run.
16. Connect the development board to your computer.
17. On the Tools menu, click SignalTap II Logic Analyzer.
18. Add the correct <your project name>.sof file to the SOF Manager:
a. Click ... to open the Select Program Files dialog box.
b. Select <your project name>.sof.
c. Click Open.
d. To download the file, click the Program Device button.
19. When the example design including SignalTap II successfully downloads to your
development board, click Run Analysis to run once, or click Autorun Analysis to
run continuously.
Monitoring Signals with the SignalTap II Logic Analyzer
The following sections detail the memory controller signals you should consider
analyzing for different memory interfaces. The list is not exhaustive, but is a starting
point.
f For a description of each signal, refer to Volume 3: Reference Material of the External
Memory Interface Handbook.
DDR, DDR2, and DDR3 ALTMEMPHY Designs
Monitor the following signals for DDR, DDR2, and DDR3 SDRAM ALTMEMPHY
designs:
■
Local_* -example_driver (all the local interface signals)
■
Pnf -example_driver
■
Pnf_per_byte -example_driver
■
Test_complete -example_driver
■
Test_status -example_driver
■
Ctl_cal_req -phy_inst
■
Ctl_init_fail -phy_inst
■
Ctl_init_success -phy_inst
■
Ctl_cal_fail -phy_inst
■
Ctl_cal_success -phy_inst
■
Cal_codvw_phase * -phy_inst
■
Cal_codvw_size * -phy_inst
■
Codvw_trk_shift * -phy_inst
■
Ctl_rlat * -phy_inst
■
Ctl_wlat * -phy_inst
■
Locked -altpll_component
■
Phasecounterselect * -altpll_component
■
Phaseupdown -altpll_component
■
Phasestep -altpll_component
■
Phase_done -altpll_component
■
Flag_done_timeout -seq_inst:ctrl
■
Flag_ack_timeout -seq_inst:ctrl
■
Proc_ctrl.command_err -seq_inst:ctrl
■
Proc_ctrl.command_result * -seq_inst:ctrl
■
dgrb_ctrl.command_err -seq_inst:ctrl
■
dgrb_ctrl.command_result * -seq_inst:ctrl
■
dgwb_ctrl.command_err -seq_inst:ctrl
■
dgwb_ctrl.command_result * -seq_inst:ctrl
■
admin_ctrl.command_err -seq_inst:ctrl
■
admin_ctrl.command_result * -seq_inst:ctrl
■
setup_ctrl.command_err -seq_inst:ctrl
■
setup_ctrl.command_result * -seq_inst:ctrl
■
state.s_phy_initialise -seq_inst:ctrl
■
state.s_init_dram -seq_inst:ctrl
■
state.s_write_ihi -seq_inst:ctrl
■
state.s_cal -seq_inst:ctrl
■
state.s_write_btp -seq_inst:ctrl
■
state.s_write_mtp -seq_inst:ctrl
■
state.s_read_mtp -seq_inst:ctrl
■
state.s_rrp_reset -seq_inst:ctrl
■
state.s_rrp_sweep -seq_inst:ctrl
■
state.s_rrp_seek -seq_inst:ctrl
■
state.s_rdv -seq_inst:ctrl
■
state.s_poa -seq_inst:ctrl
■
state.s_was -seq_inst:ctrl
■
state.s_adv_rd_lat -seq_inst:ctrl
■
state.s_adv_wr_lat -seq_inst:ctrl
■
state.s_prep_customer_mr_setup -seq_inst:ctrl
■
state.s_tracking_setup -seq_inst:ctrl
■
state.s_tracking -seq_inst:ctrl
■
state.s_reset -seq_inst:ctrl
■
state.s_non_operational -seq_inst:ctrl
■
state.s_operational -seq_inst:ctrl
■
dqs_delay_ctrl_export * -phy_inst
■
* = Disable Trigger Enable
UniPHY Designs
Monitor the following signals for UniPHY designs:
■
avl_addr
■
avl_rdata
■
avl_rdata_valid
■
avl_read_req
■
avl_ready
■
avl_wdata
■
avl_write_req
■
fail
■
pass
■
afi_cal_fail
■
afi_cal_success
■
test_complete
■
be_reg (QDRII only)
■
pnf_per_bit
■
rdata_reg
■
rdata_valid_reg
■
data_out
■
data_in
■
written_data_fifo|data_out
■
usequencer|state*
■
usequencer|phy_seq_rdata_valid
■
usequencer|phy_seq_read_fifo_q
■
usequencer|phy_read_increment_vfifo*
■
usequencer|phy_read_latency_counter
■
uread_datapath|afi_rdata_en
■
uread_datapath|afi_rdata_valid
■
uread_datapath|ddio_phy_dq
■
qvld_wr_address*
■
qvld_rd_address*
Hardware Debugging Guidelines
Before starting to debug, confirm the design followed the Altera recommended
design flow.
f Refer to the Design Flow chapter in volume 1 of the External Memory Interface Handbook.
Always keep a record of tests, to avoid repeating the same tests later. To start
debugging the design, perform the following initial steps.
Create a Simplified Design that Demonstrates the Same Issue
To help debugging, create a simple design that replicates the issue. A simple design
compiles faster and is much easier to understand. Altera's external memory interface
IP generates an example top-level file that is ideal for debugging. The example
top-level file uses all the same parameters, pin-outs, and so on.
Measure Power Distribution Network
Ensure that you take measurements of the various power supplies on your hardware
development platform over a suitable time base and with a suitable trigger, using an
appropriate probe and grounding scheme. In addition, take the measurements
directly on the pins or vias of the devices in question, and with the hardware
operational.
Measure Signal Integrity and Setup and Hold Margin
Measure the signals on your PCB to ensure that everything looks correct. This
information can be vital. When measuring any signal, consider the edge rate of the
signal, not just its frequency. Modern FPGA devices have very fast edge rates,
therefore you must use a suitable oscilloscope, probe, and grounding scheme when
you measure the signals.
You can take measurements to capture the setup and hold time of key signal classes
with respect to their clock or strobe. Ensure that the measured setup and hold margin
is at least better than that reported in the Quartus II software. A worse margin
indicates that a timing discrepancy exists somewhere in the project. However, this
timing issue may not be the cause of your problem.
Vary Voltage
Try and vary the voltage of your system, if you suspect a marginality issue. Increasing
the voltage typically causes devices to operate faster and also usually provides
increased noise margin.
Use Freezer Spray and Heat Gun
If you have an intermittent marginal issue, cool or heat the interface to try and stress
the issue. Cooling down ICs causes them to run faster, which makes timing easier.
Conversely heating up ICs causes them to run slower, which makes timing more
difficult.
If cooling or heating fixes the issue, you are probably looking at a timing issue.
Operate at a Lower Speed
Test the interface at a lower speed. If the interface works, the interface is correctly
pinned out and functional. However, if the interface fails at a lower speed, determine
if the test is valid. Many high-speed memory components have a minimum operating
frequency, or require subtly different configurations when operating at lower speeds.
For example, DDR, DDR2, or DDR3 SDRAM typically requires modification to the
following parameters if you want to operate the interface at lower speeds:
■
tMRD
■
tWTR
■
CAS latency and CAS write latency
Find Out if the Issue Exists in Previous Versions of Software
Hardware that works before an update to either the Quartus II software or the
memory IP indicates that the development platform is not the issue. However, the
previous generation IP may be less susceptible to a PCB issue, masking the issue.
Find out if the Issue Exists in the Current Version of Software
Designs are often tested using previous generations of Altera software or IP. Projects
do not always get upgraded for the following reasons:
■
Multiple engineers are on the same project. To ensure compatibility, a common
release of Altera software is used by all engineers for the duration of the product
development. The design may be several releases behind the current Quartus II
software version.
■
Many companies delay before adopting a new release of software so that they can
first monitor Internet forums to get a feel for how successful other users say the
software is.
■
Many companies never use the latest version of any software, preferring to wait
until the first service pack is released that fixes the primary issues.
■
Some users may only have a license for the older version of the software and can
only use that version until their company makes the financial decision to upgrade.
■
The local interface specification from Altera IP to the customer's logic sometimes
changes from software release to software release. If you have already spent
resources designing interface logic, you may be reluctant to repeat this exercise. If
a block of code is already signed off, you may be reluctant to modify it to upgrade
to newer IP from Altera.
In all of these scenarios, you must determine if the issue still exists in the latest version
of the Altera software. Bugs are fixed and enhancements are added to the Altera IP
every release. Depending on the nature of the bug or enhancement, it may not always
be clearly documented in the release notes.
Finally, if the latest version of the software resolves the issue, it may be easier to debug
the version of software that you are using.
Try A Different PCB
If you are using the same Altera IP on a number of different hardware platforms, find
out if the issue occurs on all of these hardware platforms, or just one. Multiple
instances of the same PCB, or multiple instances of the same interface, on physically
different hardware platforms may exhibit different behavior. You can determine if the
configuration is fundamentally not working, or if some form of marginality is
involved in the issue.
Issues are often reported on the alpha build of a development platform. These are
produced in very limited numbers and often have received limited BBT (bare board
testing), or FT (functional testing). Hence, these early boards are often more unreliable
than production quality PCBs.
Additionally, if the IP is from a previous project to help save development resources,
find out if this specific IP configuration works on a previous platform.
Try Other Configurations
Designs are typically quite large, using multiple blocks of IP in many different
combinations. Find out if any other configurations work correctly on the development
platform. The full project may have multiple external memory controllers in the same
device, or may have configurations where only half the memory width or frequency is
required. Find out what does and does not work to help the debugging of the issue.
Debugging Checklist
The following checklist is a good starting point when debugging an external memory
interface. This chapter discusses all of the items in the checklist.
■ Simulate the design. If it fails in simulation, it will fail in hardware.
■ Check IP parameters at the operating frequency (tMRD, tWTR for example).
■ Ensure you have constrained your design with proper timing deration and have closed timing.
■ Analyze timing.
■ Place and assign RUP and RDN (OCT).
■ Try a different fit.
■ Measure the power distribution network (PDN).
■ Measure signal integrity.
■ Measure setup and hold timing.
■ Measure FPGA voltages.
■ Vary voltages.
■ Heat and cool the PCB.
■ Operate at a lower or higher frequency.
■ Check board timing and trace information.
■ Check LVDS and clock sources, I/O voltages and termination.
■ Check PLL clock source, specification, and jitter.
■ Ensure the correct number of PLL phase steps take place during calibration. If the number stated in the IP does not match the number, you may have manually altered the PLL.
■ Retarget to a smaller interface width or a single bank.
Categorizing Hardware Issues
The following topics categorize hardware issues. Identifying the category, or categories,
to which an issue belongs allows you to focus on the cause of the issue.
Signal Integrity Issues
Many design issues, even ones that you find at the protocol layer, can often be traced
back to signal integrity issues. Hence, you must check circuit board construction,
power systems, command, and data signaling to determine if they meet
specifications. If infrequent, random errors exist in the memory subsystem, product
reliability suffers. As such, electrical verification is vital. Check the bare circuit board
or PCB design file. Circuit board errors can cause poor signal integrity, signal loss,
signal timing skew, and trace impedance mismatches. Differential traces with
unbalanced lengths or signals that are routed too closely together can cause crosstalk.
Characteristics
Signal integrity issues often appear when the performance of the hardware design is
marginal. The design may not always initialize and calibrate correctly, or may exhibit
occasional bit errors in user mode. Severe signal integrity issues can result in total
failure of an interface at certain data rates, and sporadic component failure because of
electrical stress. PCB component variance and signal integrity issues often show up as
failures on one PCB, but not on another identical board. Timing issues can have a
similar characteristic. Multiple calibration windows or significant differences in the
calibration results from one calibration to another can also indicate signal integrity
issues.
Evaluating Signal Integrity Issues
Signal integrity issues can only really be evaluated in two ways: direct measurement
using suitable test equipment like an oscilloscope and probe, or simulation using a
tool like HyperLynx or Allegro PCB SI. Signals should be compared against the
respective electrical specification. You should look for overshoot and undershoot,
non-monotonicity, eye height and width, and crosstalk.
Skew
Ensure that all clocked signals, commands, addresses, and control signals arrive at the
memory inputs at the same time. Trace length variations cause data valid window
variations between the signals reducing margin. For example, DDR2-800 at 400 MHz
has a data valid window that is smaller than 1,250 ps. Trace length skew or crosstalk
can reduce this data valid window further, making it difficult to design a reliably
operating memory interface. Ensure that the skew figure previously entered into the
Altera IP matches that actually achieved on the PCB; otherwise, the Quartus II timing
analysis of the interface is inaccurate.
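For reference, the 1,250 ps figure is simply the bit period of a double data rate interface at 400 MHz: 1 / (2 × 400 MHz) = 1.25 ns = 1,250 ps. Every picosecond of trace-length skew, crosstalk, or jitter is subtracted directly from this window, which is why the skew entered in the IP must reflect the real PCB.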
Crosstalk
Crosstalk is another issue that is best evaluated early in the memory design phase.
Check the clock-to-data-strobe crosstalk; as the strobes are bidirectional, measure the
crosstalk at both ends of the line. Check the data-strobe-to-clock crosstalk; as the clocks
are unidirectional, this only needs checking at the memory end of the line.
Power System
Some memory interfaces tend to draw current in spikes from their power delivery
system, because SDRAMs are based on capacitive memory cells. Rows are read and
refreshed one at a time, which causes dynamic currents that can stress any power
distribution network (PDN). The various power rails should be checked either at, or as
close as possible to, the SDRAM power pins. Ideally, a real-time oscilloscope set to
fast-glitch triggering should be used for this activity.
Clock Signals
The clock signal quality is important for any external memory system. Measurements
include frequency, duty cycle distortion (DCD), high width, low width, amplitude, jitter,
and rise and fall times.
Read Data Valid Window and Eye Diagram
The memory generates the read signals. Take measurements at the FPGA end of the
line. To ease read diagram capture, modify the example driver to mask writes or
modify the PHY to include a signal that you can trigger on when performing reads.
Write Data Valid Window and Eye Diagram
The FPGA generates the write signals. Take measurements at the memory device end
of the line. To ease write diagram capture, modify the example driver to mask reads, or
modify the PHY to export a signal that is asserted when performing writes.
OCT and ODT Usage
Modern external memory interface designs typically use OCT for the FPGA end of the
line, and ODT for the memory component end of the line. If either the OCT or ODT
are incorrectly configured or enabled, signal integrity issues exist. If the design is
using OCT, RUP or RDN pins must be placed correctly for the OCT to work. If you do
not place these pins, the Quartus II software allocates them automatically with the
following warning:
Warning: No exact pin location assignment(s) for 2 pins of 110 total pins
Info: Pin termination_blk0~_rup_pad not assigned to an exact location
on the device
Info: Pin termination_blk0~_rdn_pad not assigned to an exact location
on the device
If you see these warnings, the RUP and RDN pins may have been allocated to pins that
do not have the required external resistors present on the board. This allocation
renders the OCT circuit faulty, resulting in unreliable UniPHY and ALTMEMPHY
calibration and/or interface behavior. The pins with the required external resistors must
be specified in the Quartus II software.
For the FPGA, ensure that you follow these actions:
■
Specify the RUP and RDN pins in either the project's HDL port list or in the
Assignment Editor (termination_blk0~_rup_pad / termination_blk0~_rdn_pad).
■
Connect the RUP and RDN pins to the correct resistors and pull-up and pull-down
voltage in the schematic or PCB.
■
Contain the RUP and RDN pins within a bank of the device that is operating at the
same VCCIO voltage as the interface that is terminated.
■
Check that only the expected number of RUP and RDN pins exists in the project pinout file. Look for Info: Created on-chip termination messages at the fitter stage
for any calibration blocks not expected in your design.
■
Review the Fitter Pin-Out file for RUP and RDN pins to ensure that they are on the
correct pins, and that only the correct number of calibration blocks exists in your
design.
■
Check in the fitter report that the input, output, and bidirectional signals with
calibrated OCT all have the termination control block applicable to the associated
RUP and RDN pins.
For the memory components, ensure that you follow these actions:
■
Connect the required resistor to the correct pin on each and every component, and
ensure that it is pulled to the correct voltage.
■
Place the required resistor close to the memory component.
■
Correctly configure the IP to enable the desired termination at initialization time.
■
Check that the speed grade of memory component supports the selected ODT
setting.
■
Check that any second-source part that may have been fitted to the PCB supports
the same ODT settings as the original part.
Hardware and Calibration Issues
After you resolve functional, timing, and signal integrity issues, assess the operation
of the PHY and its interface calibration.
Hardware and Calibration Issue Characteristics
Hardware and calibration issues have the following definitions:
■
Calibration issues result in calibration failing, which typically results in the design
asserting the ctl_cal_fail signal.
■
Hardware issues result in read and write failures, which typically results in the
design asserting the pass not fail (pnf) signal.
1
Ensure that functional, timing, and signal integrity issues are not the direct cause of
your hardware issue, as functional, timing, or signal integrity issues are usually the
cause of any hardware issue.
Evaluating Hardware and Calibration Issues
Use the following methods to evaluate hardware and calibration issues:
■
Evaluate hardware issues using the SignalTap II logic analyzer to monitor the local
side read and write interface with the pass or fail or error signals as triggers
■
Evaluate calibration issues using the SignalTap II logic analyzer to monitor the
various calibration and configuration signals, with the pass, fail, or error signals as
triggers; also use the debug toolkit and system console when available
f For more information about debug toolkits and the type of signals for
debugging external memory interfaces, refer to the ALTMEMPHY External
Memory Interface Debug Toolkit and UniPHY External Memory Interface Debug
Toolkit chapters in volume 3 of the External Memory Interface Handbook.
Consider adding core noise to your design to aggravate marginal timing and signal
integrity issues. Steadily increase the stress on the interface in the following order:
1. Increase the interface utilization by modifying the example driver to focus on the
types of transactions that exhibit the issue.
2. Increase the SSN (simultaneous switching noise) or the aggressiveness of the data
pattern by modifying the example driver to output synchronized PRBS data patterns or
hammer patterns (a sketch follows this list).
3. Increase the stress on the PDN by adding more and more core noise to your
system. Try sweeping the fundamental frequency of the core noise to help identify
resonances in your power system.
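A minimal sketch of such a data-pattern stress source follows, assuming a 64-bit local write data bus; the module name, PRBS polynomial, and seed are illustrative. A mode input selects between a PRBS sequence and a solid hammer pattern:

// Hypothetical stress-pattern source for the example driver's write data.
// mode = 0: PRBS-31 style sequence; mode = 1: alternating hammer pattern.
module stress_pattern #(
  parameter DATA_W = 64
)(
  input  wire              phy_clk,
  input  wire              reset_n,
  input  wire              advance,    // pulse when a write beat is accepted
  input  wire              mode,
  output wire [DATA_W-1:0] wdata
);
  reg [30:0]       lfsr;               // x^31 + x^28 + 1 feedback taps
  reg [DATA_W-1:0] hammer;

  always @(posedge phy_clk or negedge reset_n) begin
    if (!reset_n) begin
      lfsr   <= 31'h7FFF_FFFF;         // any non-zero seed
      hammer <= {DATA_W/2{2'b01}};     // 0101... pattern
    end else if (advance) begin
      lfsr   <= {lfsr[29:0], lfsr[30] ^ lfsr[27]};
      hammer <= ~hammer;               // 0101... <-> 1010...
    end
  end

  // Replicate the LFSR to fill the bus width (crude, but adequate for stress).
  wire [92:0] prbs_wide = {3{lfsr}};
  assign wdata = mode ? hammer : prbs_wide[DATA_W-1:0];
endmodule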
Steadily increasing the stress on the external memory interface is an ideal way to
assess and understand the cause of any previously intermittent failures that you may
observe in your system. Using the SignalTap II probe tool can provide insights into
the source or cause of operational failure in the system.
Additionally, steadily increasing stress on the external memory interface allows you
to assess and understand the impact that such factors have on the amount of timing
margin and resynchronization window. Take measurements with and without the
additional stress factor to allow evaluation of the overall effect.
Write Timing Margin
Determine the write timing margin by phase sweeping the write clock from the PLL.
Use sources and probes to dynamically control the PLL phase offset control, to
increase and decrease the write clock phase adjustment so that the write window size
may be ascertained.
Remember that when sweeping PLL clock phases, the following two factors may
cause operational failure:
■
The available write margin.
■
The PLL phase in a multi-clock system.
The following code achieves this adjustment. You should use sources and probes to
modify the respective output of the PLL. Ensure that the example driver is writing
and reading from the memory while observing the pnf_per_byte signals to see when
write failures occur:
/////////////////
wire [7:0] Probe_sig;
wire [5:0] Source_sig;

// Phase-step counter driven from the in-system sources, tracking the
// cumulative offset applied to the PLL output.
PhaseCount PhaseCounter (
    .resetn (1'b1),
    .clock  (pll_ref_clk),
    .step   (Source_sig[5]),
    .updown (Source_sig[4]),
    .offset (Probe_sig)
);

// In-system sources and probes: the sources drive the phase-step controls,
// the probes observe the current offset.
CheckoutPandS freq_PandS (
    .probe  (Probe_sig),
    .source (Source_sig)
);

ddr2_dimm_phy_alt_mem_phy_pll_siii pll (
    .inclk0 (pll_ref_clk),
    .areset (pll_reset),
    .c0 (phy_clk_1x),     // hR
    .c1 (mem_clk_2x),     // FR
    .c2 (aux_clk),        // FR
    .c3 (write_clk_2x),   // FR
    .c4 (resync_clk_2x),  // FR
    .c5 (measure_clk_1x), // hR
    .c6 (ac_clk_1x),      // hR
    .phasecounterselect (Source_sig[3:0]),
    .phasestep          (Source_sig[5]),
    .phaseupdown        (Source_sig[4]),
    .scanclk            (scan_clk),
    .locked             (pll_locked_src),
    .phasedone          (pll_phase_done)
);
Read Timing Margin
Similarly, assess the read timing margin by using sources and probes to manually
control the DLL phase offset feature. Open the autogenerated DLL using ALT_DLL
and add the additionally required offset control ports. This action allows control and
observation of the following signals:
dll_delayctrlout[5:0],          // Phase output control from DLL to DQS pins (Gray coded)
dll_offset_ctrl_a_addnsub,      // Input: add or subtract the phase offset value
dll_offset_ctrl_a_offset[5:0],  // User input-controlled DLL offset value (Gray coded)
dll_aload,                      // User input DLL load command
dll_dqsupdate,                  // DLL output update-required signal
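The offset value listed above is Gray coded; if the in-system sources you add hold a binary count, a small conversion avoids stepping the DLL by more than one code at a time. This is a sketch only; the module and signal names are assumptions:

// Hypothetical helper: convert a binary offset from sources/probes into the
// Gray-coded value expected by dll_offset_ctrl_a_offset[5:0].
module dll_offset_gray (
  input  wire [5:0] offset_binary,             // e.g. from an in-system source
  output wire [5:0] dll_offset_ctrl_a_offset   // Gray-coded DLL offset value
);
  assign dll_offset_ctrl_a_offset = offset_binary ^ (offset_binary >> 1);
endmodule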
In examples where the applied offset results in the maximum or minimum
dll_delayctrlout[5:0] setting without reaching the end of the read capture window,
regenerate the DLL in the next available phase setting, so that the full capture window
is assessed.
Modify the example driver to constantly perform reads (mask writes). Observe the
pnf_per_byte signals while the DLL capture phase is manually modified to see when
failures begin, which indicates the edge of the window.
A resynchronization timing failure can indicate failure at that capture phase, and not a
capture failure. You should recalibrate the PHY with the calculated phase offset to
ensure that you are using the true read-capture margin.
Address and Command Timing Margin
You set the address and command clock phase directly in the IP. Assuming you enter
the correct board trace model information into the Quartus II software, the timing
analysis should be correct. However, if you want to evaluate the address and
command timing margin, use the same process as in “Write Timing Margin”, only
phase step the address and command PLL output (c6 ac_clk_1x). You can achieve this
effect using the debug toolkit or system console.
f Refer to the ALTMEMPHY External Memory Interface Debug Toolkit and UniPHY
External Memory Interface Debug Toolkit chapters in volume 3 of the External Memory
Interface Handbook.
Resynchronization Timing Margin
Observe the size and margins, available for resynchronization using the debug toolkit
or system console.
f Refer to the ALTMEMPHY External Memory Interface Debug Toolkit and UniPHY
External Memory Interface Debug Toolkit chapters in volume 3 of the External Memory
Interface Handbook.
Additionally for PHY configurations that use a dedicated PLL clock phase (as
opposed to a resynchronization FIFO buffer), use the same process as described in
“Write Timing Margin”, to dynamically sweep resynchronization margin (c4
resynch_clk_2x).
Postamble Timing Issues and Margin
The postamble timing is set by the PHY during calibration. You can diagnose
postamble issues by viewing the pnf_per_byte signal from the example driver.
Postamble timing issues mean that read data is corrupted only during the last beat of any
read request.
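Because postamble problems corrupt only the final beat of a read, it can help to tag that beat explicitly and capture the tagged error with the SignalTap II logic analyzer. The sketch below is illustrative only; it assumes the example driver's local_rdata_valid and pnf_per_byte signals (pnf_per_byte high per byte indicating a pass) and flags errors seen on the beat immediately before local_rdata_valid deasserts:

// Hypothetical last-beat error flag for isolating postamble corruption.
module postamble_flag #(
  parameter PNF_W = 8              // width of pnf_per_byte in your driver
)(
  input  wire             phy_clk,
  input  wire             local_rdata_valid,
  input  wire [PNF_W-1:0] pnf_per_byte,      // high = byte passed
  output reg              last_beat_error
);
  reg             rdata_valid_d;
  reg [PNF_W-1:0] pnf_per_byte_d;

  always @(posedge phy_clk) begin
    rdata_valid_d  <= local_rdata_valid;
    pnf_per_byte_d <= pnf_per_byte;
    // Falling edge of rdata_valid: the previous cycle carried the last beat.
    last_beat_error <= rdata_valid_d && !local_rdata_valid && ~(&pnf_per_byte_d);
  end
endmodule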
Intermittent Issues
Intermittent issues are typically the hardest type of issue to debug—they appear
randomly and are hard to replicate.
Intermittent Issue Evaluation
Errors that occur during run-time indicate a data related issue, which you can identify
by the following actions:
■
Add the SignalTap II logic analyzer and trigger on the post-trigger pnf
■
Use a stress pattern of data or transactions, to increase the probability of the issue
■
Heat up or cool down the system
■
Run the system at a slightly faster frequency
If adding the SignalTap II logic analyzer or modifying the project causes the issue to
go away, the issue is likely to be placement or timing related.
Errors that occur at start-up indicate that the issue is related to calibration, which you
can identify by the following actions:
■
Modify the design to continually calibrate and reset in a loop until the error is
observed (a sketch follows this list)
■
Where possible, evaluate the calibration margin either from the debug toolkit or
system console.
1
Refer to the ALTMEMPHY External Memory Interface Debug Toolkit and
UniPHY External Memory Interface Debug Toolkit chapters in volume 3 of the
External Memory Interface Handbook.
■
Capture the calibration error stage or error code, and use this information with
whatever specifically occurs at that stage of calibration to assist with your debug
of the issue.
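A minimal sketch of such a recalibration loop follows, assuming an active-low soft reset into the PHY/controller and the ctl_cal_success and ctl_cal_fail status signals described earlier; the reset pulse length, port names, and clock-domain handling are simplifications you would adapt to your design:

// Hypothetical calibrate-in-a-loop controller. Re-resets the interface after
// each successful calibration and freezes when a calibration fails, so the
// failing iteration can be trapped with the SignalTap II logic analyzer.
module recal_loop (
  input  wire        clk,              // clock that keeps running during reset
  input  wire        ctl_cal_success,
  input  wire        ctl_cal_fail,
  output reg         soft_reset_n,     // gate into the PHY/controller reset
  output reg  [15:0] cal_attempts,
  output reg         stop              // asserted when a calibration fails
);
  reg [7:0] reset_cnt;

  initial begin                        // power-up values for FPGA targets
    soft_reset_n = 1'b1;
    cal_attempts = 16'd0;
    stop         = 1'b0;
    reset_cnt    = 8'd0;
  end

  always @(posedge clk) begin
    if (stop) begin
      soft_reset_n <= 1'b1;            // leave the failing state untouched
    end else if (ctl_cal_fail) begin
      stop <= 1'b1;
    end else if (ctl_cal_success && soft_reset_n) begin
      soft_reset_n <= 1'b0;            // start another calibration attempt
      reset_cnt    <= 8'd0;
      cal_attempts <= cal_attempts + 16'd1;
    end else if (!soft_reset_n) begin
      reset_cnt <= reset_cnt + 8'd1;
      if (&reset_cnt)
        soft_reset_n <= 1'b1;          // release reset after ~256 cycles
    end
  end
endmodule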
Debug Toolkit
The debug toolkit is an interface that runs on your PC and enables you to debug your
external memory interface design on the circuit board, retrieve calibration status, and
perform margining activities. The debug toolkit uses a JTAG connection to a Windows PC.
Altera provides the following types of debug toolkits:
■
ALTMEMPHY Debug Toolkit
■
UniPHY EMIF Debug Toolkit
ALTMEMPHY Debug Toolkit Overview and Usage Flow
The ALTMEMPHY Debug Toolkit supports the following Altera AFI-based IP:
■
ALTMEMPHY megafunction
■
DDR2 and DDR3 SDRAM High-Performance Controller and High-Performance
Controller II
1
The debug toolkit does not support the QDR II and QDR II+ SRAM or RLDRAM II
controllers with UniPHY.
The ALTMEMPHY Debug Toolkit lists the calibration stages and indicates whether each is
successful, states errors specific to a calibration failure, and provides possible causes.
However, the debug toolkit does not fix a failing design. You can run the debug
toolkit and the SignalTap II logic analyzer at the same time.
The ALTMEMPHY Debug Toolkit usage flow involves the following steps:
1. Before using the debug toolkit, modify the design example top-level file by
regenerating the IP with the JTAG Avalon-MM port enabled.
2. Add additional debug and sequencer signals to indicate the location of calibration
failure, resynchronization margin, read and write latency, and PLL status.
3. Recompile the design.
4. Connect your PC's download cable (for example, the ByteBlaster™ II download cable)
to the JTAG port on the development board.
5. Program the device with the debug enabled in your design using the SignalTap II
logic analyzer.
6. Run analysis and interpret calibration results using the ALTMEMPHY Debug
Toolkit with the SignalTap II logic analyzer.
f For more information about the ALTMEMPHY debug toolkit and calibration stages,
refer to the ALTMEMPHY External Memory Interface Debug Toolkit chapter in volume 3
of the External Memory Interface Handbook.
UniPHY EMIF Debug Toolkit Overview and Usage Flow
The UniPHY EMIF Debug Toolkit is a Tcl-based interface and consists of the following
parts:
■
DDR2 and DDR3 SDRAM Controllers with UniPHY
■
Avalon Memory-Mapped (Avalon-MM) slave interface
■
JTAG Avalon master
The EMIF toolkit allows you to display information about your external memory
interface and generate calibration and margining reports. The toolkit can aid in
diagnosing the type of failure that may be occurring in your external memory
interface, and help identify areas of reduced margin that might be potential failure
points.
The UniPHY Debug Toolkit usage flow involves the following steps:
1. (Optional) Generate your IP core with the CSR port enabled and with the CSR
communication interface type properly set.
2. Recompile the design.
3. Connect your PC's download cable (for example, the ByteBlaster II download cable) to
the JTAG port on the development board.
4. Program the device.
5. Specify project settings using the UniPHY EMIF Debug Toolkit.
6. Generate a calibration report and interpret the calibration results using the UniPHY
EMIF Debug Toolkit.
f For more information about the UniPHY EMIF debug toolkit and calibration stages,
refer to the UniPHY External Memory Interface Debug Toolkit chapter in volume 3 of the
External Memory Interface Handbook.
Document Revision History
Table 11–1 lists the revision history for this document.
Table 11–1. Document Revision History

Date             Version   Changes
November 2011    4.0       Added Debug Toolkit section.
June 2011        3.0       Removed leveling information from the ALTMEMPHY Calibration Stages and UniPHY Calibration Stages chapters.
December 2010    2.1       Added new chapter: UniPHY Calibration Stages. Added new chapter: DDR2 and DDR3 SDRAM Controllers with UniPHY EMIF Toolkit.
July 2010        2.0       Updated for 10.0 release.
January 2010     1.2       Corrected minor typos.
December 2009    1.1       Added Debug Toolkit for DDR2 and DDR3 SDRAM High-Performance Controllers chapter and ALTMEMPHY Calibration Stages chapter.
November 2009    1.0       First published.
Section II. Miscellaneous Guidelines
This section provides information about HardCopy migration for UniPHY-based
designs and ways to increase the efficiency of the controller and the PHY, and
describes power estimation methods.
This section includes the following chapters:
■
Chapter 12, HardCopy Design Migration Guidelines
■
Chapter 13, Optimizing the Controller
■
Chapter 14, PHY Considerations
■
Chapter 15, Power Estimation Methods for External Memory Interface Designs
f For information about the revision history for chapters in this section, refer to
“Document Revision History” in each individual chapter.
12. HardCopy Design Migration Guidelines
November 2011
EMI_DG_012-2.0
This chapter discusses HardCopy® migration guidelines for UniPHY-based designs. If
you want to migrate your ALTMEMPHY-based designs to HardCopy, Altera
recommends that you first upgrade your design to UniPHY.
f For information about upgrading an ALTMEMPHY-based design to UniPHY, refer to
the Upgrading to UniPHY-based Controllers from ALTMEMPHY-based Controllers chapter
in volume 3 of the External Memory Interface Handbook.
HardCopy Migration Design Guidelines
If you intend to target your design to a HardCopy device, you should select both your
prototyping FPGA device and target HardCopy device at the start of your project, to
avoid any late difficulties that might occur due to differences in the UniPHY IP
between FPGA and HardCopy implementations.
1
You must migrate your design from an FPGA to a HardCopy companion device; you
cannot target HardCopy directly.
Ensure you use the following design guidelines when migrating your design:
■
On the PHY Settings page of the parameter editor, turn on HardCopy
Compatibility Mode, and then specify whether the Reconfigurable PLL Location
is Top_Bottom or Left_Right.
1
Altera recommends that you set the Reconfigurable PLL Location to the
same side as your memory interface.
When turned on, the HardCopy Compatibility Mode option enables a ROM
loader and run-time reconfiguration for all phase-locked loops (PLLs) and
delay-locked loops (DLLs) instantiated in memory interfaces. In this mode, all the
necessary PLL and DLL reconfiguration and ROM loader signals are brought to
the top level of the design.
■
Enable run-time reconfiguration mode for all PLLs and DLLs instantiated in interfaces that are configured as PLL and DLL slaves.
f For information about PLL megafunctions, refer to the Phase-Locked Loop
(ALTPLL) Megafunction User Guide and the Phase-Locked Loops
Reconfiguration (ALTPLL_RECONFIG) Megafunctions User Guide. For
information about DLL megafunctions, refer to the ALTDLL and
ALTDQ_DQS Megafunctions User Guide.
■
Ensure that you place all memory interface pins close together. If, for example, address pins are located far away from data pins, closing timing might be difficult.
■
For DDR2 and DDR3 (and RLDRAM II when using the Nios® II -based sequencer)
UniPHY-based designs, ensure that you have external nonvolatile ROM or flash
memory on your circuit board for storing the Nios II instruction code. (QDR II and
QDR II+ SRAM with UniPHY designs support only the RTL-based sequencer in
HardCopy migration.) The UniPHY IP instantiates a ROM loader to load Nios II
instruction code from external ROM.
For ROM loader connection guidelines, refer to “ROM Loader for Designs Using
Nios II Sequencer” on page 12–3.
■
In the wraparound interface design, open the
<variation_name>_p0_new_io_pads.v file in an editor and locate the following
line:
.dll_offsetdelay_in((i < 0) ?
hc_dll_config_dll_offset_ctrl_offsetctrlout :
hc_dll_config_dll_offset_ctrl_offsetctrlout),
In the preceding line, first change the second
hc_dll_config_dll_offset_ctrl_offsetctrlout to
hc_dll_config_dll_offset_ctrl_b_offsetctrlout, and then change the numeral
0 to the number of DQS groups located in the top or bottom I/O edge. For
example, changing 0 to 3 would mean that DQS groups 0 to 2 are connected to the
output of the first DLL offset control block. DQS group 3 and above are connected
to the output of the second DLL offset control block.
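For example, applying both changes with three DQS groups on the top or bottom I/O edge yields the following edited line:
  .dll_offsetdelay_in((i < 3) ?
  hc_dll_config_dll_offset_ctrl_offsetctrlout :
  hc_dll_config_dll_offset_ctrl_b_offsetctrlout),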
You can use the example top-level project that is generated when you turn on
HardCopy Migration as a guide to help you connect the necessary signals in your
design.
Differences in UniPHY IP Generated with HardCopy Migration Support
When you generate a UniPHY memory interface for HardCopy device support,
certain features in the IP are enabled that do not exist when you generate the IP core
for only the FPGA. This section discusses those additional enabled features.
Figure 12–1 shows the additional blocks enabled when HardCopy Compatibility
Mode is turned on.
Figure 12–1. HardCopy UniPHY Example Design
(The block diagram shows the HardCopy example design: a driver with an Avalon-MM master connected to the Avalon-MM slave of the Controller+PHY wrapper, in which the controller drives the PHY through the AFI and the PHY connects to the external memory; the additional blocks enabled for HardCopy are the ROM loader interface and the PLL/DLL reconfiguration interface, and the driver outputs a pass/fail indication.)
ROM Loader for Designs Using Nios II Sequencer
An additional ROM loader is instantiated in the design for UniPHY designs that use
the Nios II sequencer. The Nios II sequencer instruction code resides in RAM on either
the HardCopy or FPGA device.
When you target only an FPGA device, the RAM is initialized when the device is
programmed; however, HardCopy devices are not programmable and therefore the
RAM cannot be initialized in this fashion. Instead, the Nios II sequencer instruction
code must be stored in an external, non-volatile memory and must be loaded to the
Nios II sequencer RAM through a ROM loader. You must create a subsystem to load
the Nios II sequencer code from the external nonvolatile memory to the ROM loader.
The instruction code varies according to memory protocols and memory
parameterization in UniPHY. You can share the same sequencer instruction code
content for multiple interface designs if the memory protocol and settings are the
same for each interface. You can also store different Nios II code in the same external,
nonvolatile memory and use a single subsystem to load the Nios II codes to the
corresponding Nios II sequencer RAMs.
f For more information about the ROM loader, refer to the RAM Initializer
(ALTMEM_INIT) Megafunction User Guide.
Table 12–1 lists the ports exposed at the top level of the Controller+PHY wrapper for the ROM loader used by the Nios II-based sequencer within the RLDRAM II, DDR2, or DDR3 PHY.
Table 12–1. Top-level Ports that Connect to External ROM for Loading Nios II Code Memory

hc_rom_config_clock (Input, 1 bit): Write clock for the ROM loader. This clock is the write clock for the Nios II code memory.
hc_rom_config_datain (Input, 32 bits): Data input from external ROM.
hc_rom_config_rom_data_ready (Input, 1 bit): Asserts to the code memory loader that the word of memory is ready to be loaded.
hc_rom_config_init (Input, 1 bit): Signals that the Nios II code memory is being loaded from the external ROM.
hc_rom_config_init_busy (Output, 1 bit): Remains asserted throughout initialization and becomes inactive when initialization is complete. soft_reset_n can be issued after hc_rom_config_init_busy is deasserted.
hc_rom_config_rom_rden (Output, 1 bit): Read-enable signal that connects to the external ROM.
hc_rom_config_rom_address (Output, 12 bits): ROM address that connects to the external ROM.
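The following Verilog fragment is a minimal sketch of how these ports might be wired at the top level; the wrapper instance (mem_if) and the flash-reading subsystem (rom_loader_bridge) are placeholder names for modules in your own design, not generated names.

  // Placeholder wiring for the ROM loader ports listed in Table 12-1.
  wire        rom_clk;         // write clock for the Nios II code memory
  wire [31:0] rom_data;        // word read from the external ROM
  wire        rom_data_ready;  // external word is ready to be loaded
  wire        rom_init;        // start loading the code memory
  wire        rom_init_busy;   // high while initialization is in progress
  wire        rom_rden;        // read enable to the external ROM
  wire [11:0] rom_addr;        // read address to the external ROM

  mem_if u_mem_if (                              // Controller+PHY wrapper (placeholder instance)
    .hc_rom_config_clock          (rom_clk),
    .hc_rom_config_datain         (rom_data),
    .hc_rom_config_rom_data_ready (rom_data_ready),
    .hc_rom_config_init           (rom_init),
    .hc_rom_config_init_busy      (rom_init_busy),
    .hc_rom_config_rom_rden       (rom_rden),
    .hc_rom_config_rom_address    (rom_addr)
    // ... AFI, memory, clock, and reset ports not shown
  );

  // rom_loader_bridge is the user-created subsystem that reads the external
  // nonvolatile memory and presents one 32-bit word at a time to the ROM loader.
  rom_loader_bridge u_bridge (
    .clk        (rom_clk),
    .rden       (rom_rden),
    .address    (rom_addr),
    .data       (rom_data),
    .data_ready (rom_data_ready),
    .init       (rom_init)
  );

As noted in the table, soft_reset_n to the memory interface should be released only after hc_rom_config_init_busy deasserts.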
You can load the Nios II instruction code in several ways. You can connect the ROM
loader directly to the dedicated external, nonvolatile memory; alternatively, you may
reuse the FPGA configuration interface with flash memory to load the Nios II
instruction code using the ROM loader. The configuration flash memory is used to
configure the FPGA design, but device configuration is not required in HardCopy
designs, conserving resources and board space. You can reuse existing configuration
pins and flash memory for interfacing with the ROM loader without any extra I/O
pins or flash memory.
Three FPGA configuration schemes are available with flash memory: passive serial
(PS) configuration, active serial (AS) configuration, and fast passive parallel (FPP)
configuration. Only active serial (AS) and fast passive parallel (FPP) configuration
interfaces are suitable for interfacing with the ROM loader when the device is in user
mode.
f For more information about FPGA configuration schemes, refer to the Configuration
Handbook.
Passive Serial (PS) Configuration Scheme
In the passive serial configuration scheme, CONF_DONE, nSTATUS, nCE, nCONFIG, DATA[0],
and DCLK are used for FPGA configuration. Only the DATA[0] signal is a dual-purpose
configuration pin which is used as a normal I/O pin in user mode. Thus, only a single
DATA[0] pin is available in the HardCopy device to use as an I/O pin. Interfacing
with the ROM loader requires more than one I/O pin; therefore this configuration
scheme cannot be used for interfacing with the ROM loader. Altera recommends
using other configuration schemes in UniPHY-based designs that you intend to
migrate to HardCopy devices.
Active Serial (AS) Configuration Scheme
The active serial configuration scheme uses four configuration pins (DATA[0], DCLK,
ASDO, and nCSO) to configure the FPGA. You can directly access the content of the flash
memory through these four configuration pins in FPGA user mode, or in the
HardCopy device using the active serial memory interface (ASMI) controller. You
must add the ASMI controller to your HardCopy design to interface with the ROM
loader. Figure 12–2 shows an example connection between a ROM loader with ASMI
controller and EPCS flash memory.
Figure 12–2. ROM Loader Connection in Active Serial Configuration Scheme
(The figure shows a serial configuration device/EPCS device connected to the HardCopy device: the EPCS DATA, DCLK, nCS, and ASDI pins connect to the HardCopy device's DATA[0], DCLK, nCSO, and ASDO pins, and an ASMI controller inside the HardCopy device passes the flash content to the ROM loader in the UniPHY IP.)
f For more information about the ASMI controller, refer to the Active Serial Memory
Interface (ALTASMI_PARALLEL) Megafunction User Guide.
Fast Passive Parallel (FPP) Configuration Scheme
In the fast passive parallel configuration scheme, only the DATA[0...7] configuration
pins can be used as normal I/Os in a HardCopy device; eight DATA pins are available
with this scheme.
The data pins can be used as clock, data, address, and command pins to interface with
the external host for loading the flash memory content to Nios II through the ROM
loader. You must ensure the flash memory data pins are connected to the external host
or controller before connecting to the FPGA during FPGA configuration, in order to
reuse these data pins.
FPGA configuration uses an external host which you must configure to interface with
the ROM loader and to read the content of the flash memory. Due to limited FPP
interface I/O count, you must create two controllers to serialize and deserialize the
address (12 bits) and data (23 bits) signals of the ROM loader. One controller resides in
the HardCopy device and one resides in the external host device. Figure 12–3 shows
an example of a ROM loader configured in the FPP configuration scheme.
Figure 12–3. ROM Loader Connection in Fast Passive Parallel Configuration Scheme
(The figure shows the flash memory's DATA and ADDR buses connected to an external host, which can be a MAX II device or a microprocessor; the host uses the HardCopy device's DATA[0] through DATA[7] pins to carry the serialized address, command, and data signals to a ROM loader controller inside the HardCopy device, which in turn feeds the ROM loader in the UniPHY IP.)
PLL/DLL Run-time Reconfiguration
The PLLs and DLLs in the HardCopy design have run-time reconfiguration enabled, provided that they are not in PLL/DLL slave mode.
When the PLLs and DLLs are generated with reconfiguration enabled, extra signals
must be connected and driven by user logic. In the example design generated during
IP core generation, the PLL/DLL reconfiguration signals are brought to the top level
and connected to constants.
f For information about PLL megafunctions and reconfiguration, refer to the Phase-Locked Loop (ALTPLL) Megafunction User Guide and the Phase-Locked Loops
Reconfiguration (ALTPLL_RECONFIG) Megafunctions User Guide.
Table 12–2 lists the DLL reconfiguration ports exposed at the top level of the
Controller and PHY wrapper.
Table 12–2. DLL Reconfiguration Ports Exposed at the Top Level of the Controller+PHY Wrapper

hc_dll_config_dll_offset_ctrl_addnsub (1) (Input, 1 bit): Addition/subtraction control port for the DLL. This port controls whether the delay-offset setting on hc_dll_config_dll_offset_ctrl_offset is added or subtracted.
hc_dll_config_dll_offset_ctrl_offset (1) (Input, 6 bits): Offset input setting for the DLL. This is a Gray-coded offset that is added to or subtracted from the current value of the DLL's delay chain.
hc_dll_config_dll_offset_ctrl_offsetctrlout (Output (2), 6 bits): The registered, Gray-coded value of the current delay-offset setting for the DLL offset control block that controls DQS pins on the top or bottom I/O edge.
hc_dll_config_dll_offset_ctrl_b_offsetctrlout (Output (2), 6 bits): The registered, Gray-coded value of the current delay-offset setting for the DLL offset control block that controls DQS pins on the left or right I/O edge.

Notes:
(1) Available only in DLL nonsharing mode and DLL master sharing mode.
(2) Functions as an output in DLL nonsharing mode and DLL master sharing mode, and as an input in DLL slave mode.
Table 12–3 lists the PLL reconfiguration ports exposed at the top level of the
Controller and PHY wrapper.
Table 12–3. PLL Reconfiguration Ports Exposed at the Top Level of the Controller+PHY Wrapper (1)

hc_pll_config_configupdate (Input, 1 bit): Control signal to enable PLL reconfiguration. (Applies to RLDRAM II and QDR II only; the phase reconfiguration feature for DDR2/3 is included in the CSR port.)
hc_pll_config_phasecounterselect (Input, 4 bits): Specifies the counter select for dynamic phase adjustment. (Applies to RLDRAM II and QDR II only.)
hc_pll_config_phasestep (Input, 1 bit): Specifies the phase step for dynamic phase shifting. (Applies to RLDRAM II and QDR II only.)
hc_pll_config_phaseupdown (Input, 1 bit): Specifies if the phase adjustment should be up or down. (Applies to RLDRAM II and QDR II only.)
hc_pll_config_scanclk (Input, 1 bit): PLL reconfiguration scan chain clock.
hc_pll_config_scanclkena (Input, 1 bit): Clock enable port of the hc_pll_config_scanclk clock.
hc_pll_config_scandata (Input, 1 bit): Serial input data for the PLL reconfiguration scan chain.
hc_pll_config_phasedone (Output, 1 bit): When asserted, this signal indicates to core logic that phase adjustment is completed and that the PLL is ready to act on a possible second adjustment pulse.
hc_pll_config_scandataout (Output, 1 bit): The data output of the serial scan chain.
hc_pll_config_scandone (Output, 1 bit): This signal is asserted when the scan chain write operation is in progress and is deasserted when the write operation is complete.

Note:
(1) Inputs and outputs are available only in PLL nonsharing mode and PLL master sharing mode. No inputs or outputs are available in PLL slave mode.
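As noted earlier in this section, the generated example design simply brings these reconfiguration signals to the top level and connects them to constants. The following is a minimal sketch of that tie-off; the instance name u_mem_if is a placeholder, and only the reconfiguration-related ports from Table 12–2 and Table 12–3 are shown.

  // Tie the PLL/DLL reconfiguration inputs off when reconfiguration is not exercised.
  mem_if u_mem_if (
    // DLL offset control inputs (Table 12-2) held at zero offset
    .hc_dll_config_dll_offset_ctrl_addnsub (1'b0),
    .hc_dll_config_dll_offset_ctrl_offset  (6'd0),
    // PLL reconfiguration scan chain inputs (Table 12-3) held inactive
    .hc_pll_config_configupdate            (1'b0),
    .hc_pll_config_phasecounterselect      (4'd0),
    .hc_pll_config_phasestep               (1'b0),
    .hc_pll_config_phaseupdown             (1'b0),
    .hc_pll_config_scanclk                 (1'b0),
    .hc_pll_config_scanclkena              (1'b0),
    .hc_pll_config_scandata                (1'b0)
    // Outputs such as hc_dll_config_dll_offset_ctrl_offsetctrlout,
    // hc_pll_config_phasedone, hc_pll_config_scandataout, and
    // hc_pll_config_scandone can be left unconnected or monitored by user logic.
    // ... remaining interface ports not shown
  );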
To facilitate placement and timing closure and to help compensate for PLLs adjacent
to I/Os and vertical I/O overhang issues that can occur when targeting HardCopy III
and HardCopy IV devices, an additional pipeline stage is added to the write path in
the RTL when you turn on HardCopy Compatibility. The additional pipeline stage is
added in all cases, except when CAS write latency equals 2 (for DDR3) or CAS latency
equals 3 (for DDR2), where the additional pipeline stage is not required to meet
timing requirements. The additional pipeline stage does not affect the overall latency
of the controller, because the controller’s internal latency is reduced by 1 to
compensate for the extra pipeline stage.
In DDR2 and DDR3 designs, at certain frequencies the DLL length changes when you generate the IP with HardCopy Compatibility enabled; this allows the DLL to work in both FPGA and HardCopy devices at the requested frequency. In addition, because the memory clock uses the global clock network, the write clock changes from a regional clock network to a global clock network for reduced skew between the write clock and the memory clock, resulting in improved leveling timing.
f For information about HardCopy issues such as vertical I/O overhang, PLLs adjacent
to I/Os, and timing closure, refer to the I/O Features for HardCopy III Devices chapter in
volume 1 of the HardCopy III Device Handbook and I/O Features for HardCopy IV Devices
chapter in volume 1 of the HardCopy IV Device Handbook.
Document Revision History
Table 12–4 lists the revision history for this document.
Table 12–4. Document Revision History

Date            Version   Changes
November 2011   2.0       ■ Reorganized HardCopy design migration information into an individual chapter.
                          ■ Updated the HardCopy Migration Design Guidelines section.
June 2011       1.0       Initial release.
13. Optimizing the Controller
November 2011
EMI_DG_013-2.0
Understanding how to increase the efficiency and bandwidth of the memory
controller is important when you design any external memory interface. This section
discusses factors that affect controller efficiency and ways to increase the efficiency of
the controller.
Controller Efficiency
Controller efficiency varies depending on the data transactions. The best way to determine the efficiency of the controller is to simulate the memory controller for your specific design.
You express controller efficiency as:
Efficiency = number of active cycles of data transfer/total number of cycles
The total number of cycles includes the number of cycles required to issue commands
or other requests.
1
You calculate the number of active cycles of data transfer in terms of local clock cycles. For example, if the number of active cycles of data transfer is 2 memory clock cycles, that converts to 1 local clock cycle in a half-rate design.
The following cases are based on a DDR2 SDRAM high-performance controller
design targeting a Stratix® IV device that has a CAS latency of 3, and burst length of 4
on the memory side (2 cycles of data transfer), with accessed bank and row in the
memory device already open. The Stratix IV device has a command latency of 9 cycles
in half-rate mode. The local_ready signal is high.
■
Case 1: The controller performs individual reads.
Efficiency = 1/(1 + CAS + command latency) = 1/(1+1.5+9) = 1/11.5 = 8.6%
■
Case 2: The controller performs 4 back to back reads.
In this case, the number of data transfer active cycles is 8. The CAS latency is only
counted once because the data coming back after the first read is continuous. Only
the CAS latency for the first read has an impact on efficiency. The command
latency is also counted once because the back to back read commands use the same
bank and row.
Efficiency = 4/(4 + CAS + command latency) = 4/(4 + 1.5 + 9) = 4/14.5 = approximately 27.6%
Factors Affecting Efficiency
The two main factors that affect controller efficiency are the interface standard
specified by the memory vendor, and the way you transfer data.
The following sections discuss these two factors in detail.
Interface Standard
Complying with certain interface standard specifications affects controller efficiency.
When interfacing the memory with the DDR2 or DDR3 SDRAM controllers, you must
follow certain timing specifications and perform the following bank management
operations:
■
Activate
Before you issue any read (RD) or write (WR) commands to a bank within a DDR2
SDRAM device, you must open a row in that bank using the activate (ACT)
command. After you open a row, you can issue a read or write command to that
row based on the tRCD specification. Reading or writing to a closed row has a negative impact on efficiency because the controller must first activate that row and then wait tRCD time before it can perform a read or write.
■
Precharge
To open a different row in the same bank, you must issue a precharge (PCH)
command. The precharge command deactivates the open row in a particular bank
or the open row in all banks. Switching a row has a negative impact on the
efficiency as you must first precharge the open row, then activate the next row and
wait tRCD time to perform any read or write operation to the row.
■
Device CAS latency
The higher the CAS latency, the less efficient an individual access. The memory
device has its own read latency, which is about 12 ns to 20 ns regardless of the
actual frequency of the operation. The higher the operating frequency, the longer
the CAS latency is in number of cycles.
■
Refresh
A refresh, in terms of cycles, consists of the precharge command and the waiting
period for the auto refresh. Based on the memory datasheet, these components
require the following values:
■
tRP = 12 ns, 3 clock cycles for a 200-MHz operation (5 ns period for 200 MHz)
■
tRFC = 75 ns, 15 clock cycles for a 200-MHz operation.
Based on this calculation, a refresh pauses read or write operations for 18 clock cycles. So, at 200 MHz, you lose 1.15% (18 × 5 ns / 7.8 µs) of the total efficiency.
Figure 13–1 and Figure 13–2 show some examples of how the bank management
operations affect controller efficiency. Figure 13–1 shows a read operation in which
you have to change a row in a bank. This figure shows how CAS latency and
precharge and activate commands affect efficiency.
Figure 13–1 illustrates a read-after-write operation in which the controller changes the row address, so the read is performed from a different row than the preceding write.
Figure 13–1. Read Operation—Changing A Row in A Bank
The following sequence of events describes Figure 13–1:
1. The local_read_req signal goes high, and when the local_ready signal goes high,
the controller accepts the read request along with the address.
2. After the memory receives the last write data, the row changes for read. Now you
require a precharge command to close the row opened for write. The controller
waits for tWR time (3 memory clock cycles) to give the precharge command after
the memory receives the last write data.
3. After the controller issues the precharge command, it must wait for tRP time to
issue an activate command to open a row.
4. After the controller gives the activate command to activate the row, it needs to
wait tRCD time to issue a read command.
5. After the memory receives the read command, it takes the memory some time to
provide the data on the pin. This time is known as CAS latency, which is 3 memory
clock cycles in this case.
For this particular case, you need approximately 17 local clock cycles to issue a read
command to the memory. Because the row in the bank changes, the read operation
takes a longer time, as the controller has to issue the precharge and activate
commands first. You do not have to take into account tWTR for this case because the
precharge and activate operations already exceeded tWTR time.
Figure 13–2 shows the case where you use the same row and bank address when
the controller switches from write to read. In this case, the read command latency is
reduced.
Figure 13–2. Changing From Write to Read—Same Row and Bank Address
The following sequence of events describes Figure 13–2:
1. The local_read_req signal goes high and the local_ready signal is high already.
The controller accepts the read request along with the address.
2. When switching from write to read, the controller has to wait tWTR time before it
gives a read command to the memory.
3. The SDRAM device receives the read command.
4. After the SDRAM device receives the read command, it takes some time to give
the data on the pin. This time is called CAS latency, which is 3 memory clock
cycles in this case.
For the case illustrated in Figure 13–2, you need approximately 11 local clock cycles to
issue a read command to the memory. Because the row in the bank remains the same,
the controller does not have to issue the precharge and activate commands, which
speeds up the read operation and in turn results in a better efficiency compared to the
case in Figure 13–1.
Similarly, if you do not switch between read and write often, the efficiency of your
controller improves significantly.
Data Transfer
The following methods of data transfer reduce the efficiency of your controller:
■
Performing individual read or write accesses is less efficient.
■
Switching between read and write operation has a negative impact on the
efficiency of the controller.
■
Performing read or write operations from different rows within a bank or in a different bank—if the bank and row you are accessing are not already open—also affects the efficiency of your controller.
Figure 13–3 shows an example of changing the row in the same bank.
Figure 13–3. Changing Row in the Same Bank
The following sequence of events describes Figure 13–3:
1. You have to wait tWR time before giving the precharge command.
2. You then wait tRP time to give the activate command.
Ways to Improve Efficiency
To improve the efficiency of your controller, you can use the following methods:
■ DDR2 SDRAM Controller
■ Auto-Precharge Commands
■ Additive Latency
■ Bank Interleaving
■ Additive Latency and Bank Interleaving
■ User-Controlled Refresh
■ Frequency of Operation
■ Burst Length
■ Series of Reads or Writes
The following sections discuss these methods in detail.
DDR2 SDRAM Controller
The DDR2 SDRAM controller maintains up to eight open banks; one row in each bank
is open at a time. Maintaining more banks at one time helps avoid bank management
commands. Ensure that you do not change a row in a bank frequently, because
changing the row in a bank causes the bank to close and reopen to open another row
in that bank.
Auto-Precharge Commands
The auto-precharge read and write commands allow you to indicate to the memory
device that this read or write command is the last access to the currently opened row.
The memory device automatically closes or auto-precharges the page it is currently
accessing, so that the next access to the same bank is quicker. This command is useful
when performing fast random memory accesses.
Figure 13–4 shows how you can improve controller efficiency using the auto-precharge command.
Figure 13–4. Improving Efficiency Using Auto-Precharge Command
The following sequence of events describes Figure 13–4:
1. The controller accepts a read request from the local side as soon as the
local_ready signal goes high.
2. The controller issues the activate command and then the read command. The read command latency is approximately 14 clock cycles for this case, compared to approximately 17 clock cycles for the similar case with no auto-precharge (described in Figure 13–3).
When using the auto-precharge option, note the following guidelines:
■
Use the auto-precharge command if you know the controller is issuing the next
read or write to a particular bank and a different row.
■
Auto-precharge does not improve efficiency if you auto-precharge a row and
immediately reopen it.
Additive Latency
Additive latency increases the efficiency of the command and data bus for sustainable
bandwidths. You may issue the commands externally but the device holds the
commands internally for the duration of additive latency before executing, to improve
the system scheduling. The delay helps to avoid collision on the command bus and
gaps in data input or output bursts. Additive latency allows the controller to issue the row and column address commands—activate, and read or write—in consecutive clock cycles, so that the controller need not hold the column address for several (tRCD) cycles. Without additive latency, the gap between the activate and the read or write command can cause bubbles in the data stream.
Figure 13–5 shows an example of additive latency.
Figure 13–5. Additive Latency—Read
(The timing diagram shows the command bus issuing ACT n at T0 and RD n at T1, followed by NOP commands through T8; with tRCD (MIN), additive latency AL = 2, and CAS latency CL = 3, the read latency is RL = 5, and data n, n+1, n+2, n+3 appears on DQ relative to DQS/DQS#. Callouts 1 and 2 are described below.)
The following sequence of events describes Figure 13–5:
1. The controller issues a read or write command before the tRCD (MIN) requirement is met (the additive latency must be less than or equal to tRCD (MIN)).
2. The controller holds the read or write command for the time defined by additive
latency before issuing it internally to the SDRAM device.
Read latency = additive latency + CAS latency
Write latency = additive latency + CAS latency – tCK
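For example, with the additive latency of 2 and CAS latency of 3 shown in Figure 13–5, the read latency is 2 + 3 = 5 memory clock cycles (RL = 5).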
Bank Interleaving
You can use bank interleaving to sustain bus efficiency when the controller misses a
page, and that page is in a different bank.
1
Page size refers to the minimum number of column locations on any row that you
access with a single activate command.
Without interleaving, the controller sends the address to the SDRAM device, receives
the data requested, and then waits for the SDRAM device to refresh before initiating
the next data transaction, thus wasting several clock cycles.
Interleaving allows banks of the SDRAM device to alternate their refresh and access
cycles. One bank undergoes its refresh cycle while another is being accessed. By
alternating banks, the controller improves its performance by masking the refresh
time of each bank. If there are four banks in the system, the controller can ideally send
one data request to each of the banks in consecutive clock cycles.
For example, in the first clock cycle, the CPU sends an address to Bank 0, and then
sends the next address to Bank 1 in the second clock cycle, before sending the third
and fourth addresses to Banks 2 and 3 in the third and fourth clock cycles respectively.
The sequence is as follows:
1. Controller sends address 0 to Bank 0.
2. Controller sends address 1 to Bank 1 and receives data 0 from Bank 0.
3. Controller sends address 2 to Bank 2 and receives data 1 from Bank 1.
4. Controller sends address 3 to Bank 3 and receives data 2 from Bank 2.
5. Controller receives data 3 from Bank 3.
Figure 13–6 shows how you can use interleaving to increase bandwidth.
Figure 13–6. Using Interleaving to Increase Bandwidth
(The figure contrasts the access pattern without interleaving, in which the CPU starts the access for D1, waits until D1 is available from memory, and only then starts the access for D2, with a 4-way interleaved pattern, in which the CPU accesses memory banks 0, 1, 2, and 3 in consecutive cycles before accessing bank 0 again.)
Additive Latency and Bank Interleaving
Using additive latency together with bank interleaving increases the bandwidth of the
controller.
Figure 13–7 shows an example of bank interleaving in a read operation without
additive latency.
Figure 13–7. Bank Interleaving—Without Additive Latency
(The timing diagram spans T0 through T14 and shows activate commands to bank x, bank y, and bank z (row n) interleaved with read commands to bank x, bank y, and bank z (column n); the A10, DQS, and DQ waveforms show that the delayed activate to bank z produces a gap in the read data. Callouts 1 through 5 are described below.)
Figure 13–7 illustrates an example of DDR2 SDRAM bank interleave reads with CAS
latency of 4, and burst length of 4.
The following sequence of events describes Figure 13–7:
1. The controller issues an activate command to open the bank, which activates bank
x and the row in it.
2. After tRCD time, the controller issues a read with auto-precharge command to the
specified bank.
3. Bank y receives an activate command after tRRD time.
4. The controller cannot issue an activate command to bank z at its optimal location
because it must wait for bank x to receive the read with auto-precharge command,
thus delaying the activate command for one clock cycle.
5. The delay in activate command causes a gap in the output data from the DDR2
SDRAM device.
1
If you use additive latency of 1, the latency affects only read commands and not the timing for write commands.
Figure 13–8 shows an example of bank interleaving in a read operation with additive
latency. In this configuration, the controller issues back-to-back activate and read with
auto-precharge commands.
Figure 13–8. Bank Interleaving—With Additive Latency
(The timing diagram spans T0 through T14 and shows back-to-back activate and read with auto-precharge command pairs issued to bank x, bank y, and bank z in consecutive clock cycles from T0 through T5; the DQS and DQ waveforms show a continuous flow of read data. Callouts 1 through 5 are described below.)
Figure 13–8 illustrates an example of a DDR2 SDRAM bank interleave reads with
additive latency of 3, CAS latency of 4, and burst length of 4.
The following sequence of events describes Figure 13–8:
1. The controller issues an activate command to bank x.
2. The controller issues a read with auto-precharge command to bank x immediately after the activate command, without waiting for the tRCD time.
3. The controller executes the read with auto-precharge command tRCD time later, on the rising edge at T4.
4. Four cycles of CAS latency later, the SDRAM device returns the data on the data bus.
5. For a burst length of 4, each read requires 2 cycles of data transfer. Because a new activate and read with auto-precharge command pair is issued every 2 clock cycles, you get a continuous flow of output data.
Compare the following efficiency results in Figure 13–7 and Figure 13–8:
■
DDR2 SDRAM bank interleave reads with no additive latency, CAS latency of 4,
and burst length of 4 (Figure 13–7),
Number of active cycles of data transfer = 6.
Total number of cycles = 15
Efficiency = 40%
■
DDR2 SDRAM bank interleave reads with additive latency of 3, CAS latency of 4,
and burst length of 4 (Figure 13–8),
Number of active cycles of data transfer = 6.
Total number of cycles = 14
Efficiency = approximately 43%
Using interleaved reads with additive latency increases efficiency by approximately 3%.
1
Additive latency improves the efficiency of back-to-back interleaved reads or writes,
but not individual random reads or writes.
User-Controlled Refresh
The requirement to periodically refresh memory contents is normally handled by the
memory controller; however, the User Controlled Refresh option allows you to
determine when memory refresh occurs. With specific knowledge of traffic patterns,
you can time the refresh operations so that they do not interrupt read or write
operations, thus improving efficiency.
Frequency of Operation
Certain frequencies of operation give you the best possible latency based on the
memory parameters. The memory parameters you specify through the parameter
editor in the MegaWizard™ Plug-In Manager are converted to clock cycles and
rounded up.
If you are using a memory device that has tRCD = 20 ns and running the interface at
100 MHz, you get the following results:
■ For full-rate implementation (tCK = 10 ns):
tRCD converted to clock cycles = 20/10 = 2
■ For half-rate implementation (tCK = 20 ns):
tRCD converted to clock cycles = 20/20 = 1
This frequency and parameter combination is not easy to find because there are many
memory parameters and frequencies for the memory device and the controller to run.
Memory device parameters are optimal for the speed at which the device is designed
to run, so you should run the device at that speed.
In most cases, the frequency and parameter combination is not optimal. If you are
using a memory device that has tRCD = 20 ns and running the interface at 133 MHz,
you get the following results:
■ For full-rate implementation (tCK = 7.5 ns):
tRCD converted to clock cycles = 20/7.5 = 2.66, rounded up to 3 clock cycles or 22.5 ns.
■ For half-rate implementation (tCK = 15 ns):
tRCD converted to clock cycles = 20/15 = 1.33, rounded up to 2 clock cycles or 30 ns.
There is no latency difference for this frequency and parameter combination.
Burst Length
Burst length affects the efficiency of the controller. A burst length of 8 provides more
cycles of data transfer, compared to a burst length of 4.
For a half-rate design that has a command latency of 9 half-rate clock cycles, and a
CAS latency of 3 memory clock cycles or 1.5 half rate local clock cycles, the efficiency
is 9% for burst length of 4, and 16% for burst length of 8.
■
Burst length of 4 (2 memory clock cycles of data transfer or 1 half-rate local clock
cycle)
Efficiency = number of active cycles of data transfer/total number of cycles
Efficiency = 1/(1 + CAS + command latency) = 1/(1 + 1.5 + 9) = 1/11.5 = 8.6% or
approximately 9%
■
Burst length of 8 (4 memory clock cycles of data transfer or 2 half-rate local clock
cycles)
Efficiency = number of active cycles of data transfer/total number of cycles
Efficiency = 2/(2 + CAS + command latency) = 2/(2 + 1.5 + 9) = 2/12.5 = 16%
Series of Reads or Writes
Performing a series of reads or writes from the same bank and row increases
controller efficiency.
The case shown in Figure 13–2 on page 13–4 demonstrates that a read performed from
the same row takes only 14.5 clock cycles to transfer data, making the controller 27%
efficient.
Do not perform random reads or random writes. When you perform reads and writes
to random locations, the operations require row and bank changes. To change banks,
the controller must precharge the previous bank and activate the row in the new bank.
Even if you change the row in the same bank, the controller has to close the bank
(precharge) and reopen it again just to open a new row (activate). Because of the
precharge and activate commands, efficiency decreases as the controller needs more
time to issue a read or write.
If you must perform a random read or write, use additive latency and bank
interleaving to increase efficiency.
Controller efficiency depends on the method of data transfer between the memory
device and the FPGA, the memory standards specified by the memory device vendor,
and the type of memory controller.
Bandwidth
Bandwidth depends on the efficiency of the memory controller controlling the data
transfer to and from the memory device.
You can express bandwidth as follows:
Bandwidth = data width (bits) × data transfer rate (1/s) × efficiency
Data transfer rate (1/s) = 2 × frequency of operation (4× for QDR SRAM interfaces)
The following example shows the bandwidth calculation for a 16-bit interface that has
70% efficiency and runs at 200 MHz frequency:
Bandwidth = 16 bits × 2 clock edges × 200 MHz × 70% = 4.48 Gbps.
DRAM typically has an efficiency of around 70%, but when you use the Altera® memory controller, efficiency can vary from 10% to 92%.
In QDR II+ or QDR II SRAM, the IP implements two separate unidirectional write and read data buses, so the data transfer rate is four times the clock rate. The data transfer rate for a 400-MHz interface is 1,600 Mbps. The efficiency is the percentage of time the data bus is transferring data, and it depends on the type of memory. For example, in a QDR II+ or QDR II SRAM interface with separate write and read ports, the efficiency is 100% when there is an equal number of read and write operations on these memory interfaces.
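As an illustration only, for a hypothetical 18-bit wide QDR II+ SRAM interface running at 400 MHz with the 100% efficiency described above, the combined read and write bandwidth from the formula above would be:
Bandwidth = 18 bits × 4 × 400 MHz × 100% = 28.8 Gbps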
Document Revision History
Table 13–1 lists the revision history for this document.
Table 13–1. Document Revision History

Date            Version   Changes
November 2011   2.0       Reorganized optimizing the controller information into an individual chapter.
June 2011       1.0       Initial release.
14. PHY Considerations
November 2011
EMI_DG_014-1.0
This chapter describes the design considerations that affect the external memory
interface performance and the device resource usage when you use UniPHY IP in
your design.
Core Logic and User Interface Data Rate
The clocking operation in the PHY is categorized into the following two domains:
■
PHY-memory domain—the PHY interfaces with the external memory device and
is always at full-rate.
■
PHY-AFI domain—the PHY interfaces with the memory controller and can either
be at full, half or quarter rate of the memory clock depending on your choice of
controller and PHY.
To allow the memory controller to operate at full, half, or quarter data rate, the UniPHY IP supports full-, half-, and quarter-rate operation. The data rate defines the ratio between the frequency of the Altera® PHY Interface (AFI) clock and the frequency of the memory device clock.
Table 14–1 compares the clock cycles, data bus width, and address/command bus width between the full-, half-, and quarter-rate designs.
Table 14–1. Ratio between Clock Cycles, Data Bus Width, and Address/Command Bus Width

Data Rate   Controller Clock Cycles   AFI Data Bus Width   AFI Address/Command Bus Width
Full        1                         2                    1
Half        2                         4                    2
Quarter     4                         8                    4
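For example, assuming the ratios in Table 14–1 are relative to the memory-side data bus width, a 32-bit DDR3 SDRAM interface implemented as a half-rate design would present a 32 × 4 = 128-bit AFI data bus to the controller, and the same interface at quarter rate would present a 32 × 8 = 256-bit AFI data bus.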
In general, full-rate designs require smaller data and address/command bus widths. However, because the core logic runs at a higher frequency, full-rate designs might have difficulty closing timing. For high-frequency memory interface designs, Altera therefore recommends that you use half-rate or quarter-rate UniPHY IP and controllers.
DDR3 SDRAM interfaces are capable of running at much higher frequencies than the DDR and DDR2 SDRAM, QDR II and QDR II+ SRAM, and RLDRAM II interfaces. For this reason, the Altera High-Performance Controller II and UniPHY IP do not support full-rate designs using the DDR3 SDRAM interface. However, the DDR3 hard controller in Arria® V devices supports only full rate. Quarter-rate support is intended for DDR3 SDRAM interfaces targeting frequencies higher than 667 MHz.
While it is easier to close timing for half-rate and quarter-rate designs because of the lower frequency required of the core logic, full-rate interfaces offer better efficiency for low burst-length designs because of the 1T addressing mode, in which the address and command signals are asserted for one memory clock cycle. Typically, half-rate and quarter-rate designs operate in 2T and 4T mode, respectively, in which the address and command signals must be asserted for two and four memory clock cycles, respectively. To improve efficiency, the controller can operate in Quasi-1T half-rate and Quasi-2T quarter-rate modes. In Quasi-1T half-rate mode, two commands are issued to the memory on two memory clock cycles. In Quasi-2T quarter-rate mode, two commands are issued to the memory on four memory clock cycles. The controller is constrained to issue a row command on the first clock phase and a column command on the second clock phase, or vice versa. Row commands include activate and precharge commands; column commands include read and write commands.
Hard and Soft Memory PHY
The Arria V and Cyclone® V device families support hard and soft memory interfaces.
Hard memory interfaces use the hard memory controllers and hard memory PHY
blocks in the devices.
Currently the hard memory PHY is instantiated together with the hard memory
controller. In addition to the PHY data path that uses the hard IP blocks in the devices
(similar to how the soft PHY is implemented for device families supported by
UniPHY), the hard memory PHY also uses the dedicated hardware circuitries in the
devices for certain component managers in the sequencer, including the read write
(RW) and PHY managers.
1
Standalone hard memory PHY instantiation will be supported in future versions of
the Quartus® II software.
In soft memory PHY, the UniPHY sequencer implements the Nios® II processor and
all the component managers in the core logic. The hard memory PHY uses dedicated
hard IP blocks in the Arria V and Cyclone V devices to implement the RW and PHY
managers to save LE resources, and to allow better performance and lower latency.
Each Arria V and Cyclone V device has a fixed number of hard PHYs. Dedicated I/O
pins with specific functions for data, strobe, address, command, control, and clock
must be used together with each hard PHY.
f For the list of hard PHY dedicated pins, refer to the device pin-out files for your target
device on the Pin-Out Files for Altera Devices page of the Altera website.
Using the soft memory PHY gives you the flexibility to choose the pins to be used for
the memory interface. Soft memory PHY also supports wider interfaces as compared
to hard memory PHY.
Sequencer
Starting from Quartus II software version 11.0, the UniPHY IP soft memory PHY
supports the following two types of sequencer used for QDRII and QDRII+ SRAM,
and RLDRAM II calibration:
■
RTL-based sequencer
■
Nios II-based sequencer
The RTL-based sequencer performs FIFO calibration that includes adjusting the
valid-prediction FIFO (VFIFO) and latency FIFO (LFIFO) length. On top of the FIFO
calibration, the Nios II-based sequencer also performs I/O calibration that includes
adjusting delay chains and phase settings to center-align the data pins with respect to
the strobes that sample them. I/O calibration is required for memory interfaces
running at higher frequencies to increase the read and write margin.
Because the RTL-based sequencer performs a relatively simpler calibration process, it does not require a Nios II processor. For this reason, its resource utilization, such as LE and RAM usage, is lower than that of the Nios II-based sequencer.
f For more information about the RTL-based sequencer and Nios II-based sequencer,
refer to the Functional Description—UniPHY chapter in volume 3 of the External
Memory Interface Handbook.
f For more information about the calibration process, refer to the “UniPHY Calibration
Stages” section in the Functional Description—UniPHY chapter of the External Memory
Interface Handbook.
PLL, DLL and OCT Resource Sharing
By default, each external memory interface in a device needs one PLL, one DLL, and one OCT control block. Because a device has a fixed number of PLL, DLL, and OCT resources, these resources can be shared by two or more memory interfaces when certain criteria are met. This method allows more memory interfaces to fit into a device and frees the remaining resources for other purposes.
By sharing PLLs, you reduce not only the number of PLLs used but also the number of clock networks and clock input pins required. To share PLLs, the memory interfaces must meet the following criteria:
■
Run the same memory protocol (for example, DDR3 SDRAM)
■
Run at the same frequency
■
The controllers or PHYs run at the same rate (for example, half rate)
■
Use the same phase requirements (for example, additional core-to-periphery clock
phase of 90°)
■
The memory interfaces are located on the same side of the device, or adjacent sides
of the device if the PLL is able to drive both sides.
Altera devices have up to four DLLs available to perform phase shift on the DQS
signal for capturing the read data. The DLLs are located at the device corners and
some of the DLLs can access two adjacent sides of the device. To share DLLs, the
memory interfaces must meet the following criteria:
■
Run at the same frequency
■
The memory interfaces are located on the same side of the device, or adjacent sides
of the device accessible by the DLL.
Memory interface pins that use OCT calibration require the OCT control block to calibrate the OCT resistance value. Depending on the device family, the OCT control block uses either the RUP and RDN pins, or the RZQ pin, for OCT calibration. Each OCT control block can be shared only by pins powered by the same VCCIO level. Sharing an OCT control block among interfaces operating at the same VCCIO level frees other OCT control blocks in the device to support other VCCIO levels. The unused RUP/RDN or RZQ pins can also be used for other purposes; for example, the RUP/RDN pins can be used as DQ or DQS pins. To share an OCT control block, the memory interfaces must operate at the same VCCIO level.
f For more information about the resources required for memory interfaces in various
device families, refer to the Planning Pin and FPGA Resources chapter.
f For more information about how to share PLL, DLL and OCT control block, refer to
the Functional Description—UniPHY chapter in volume 3 of the External Memory
Interface Handbook.
f For more information about the DLL, refer to the external memory interface chapters
in the respective device handbooks.
f For more information about the OCT control block, refer to the I/O features chapters
in the respective device handbooks.
Pin Placement Consideration
The Stratix® V, Arria V, and Cyclone V device families use the PHY clock (PHYCLK)
networks to clock the external memory interface pins for better performance. Each
PHYCLK network is driven by a PLL. In Cyclone V and Stratix V devices, the
PHYCLK network spans across two I/O banks on the same side of the device,
whereas for Arria V devices, each PHYCLK network spans across one I/O bank. As
such, all pins for a memory interface must be placed on the same side of the device.
f For more information about pin placement guidelines related to the PHYCLK
network, refer to the External Memory Interfaces in Stratix V Devices chapter in volume
2 of the Stratix V Device Handbook, External Memory Interfaces in Arria V Devices chapter
in volume 2 of the Arria V Device Handbook, or the External Memory Interfaces in
Cyclone V Devices chapter in volume 2 of the Cyclone V Device Handbook.
Wraparound interfaces, in which data pins from a memory interface are placed on two adjacent sides of a device, and split interfaces, in which data pins are placed on two opposite I/O banks, are supported in certain device families that do not use the PHY clock network, to allow more flexibility in pin placement.
The x36 emulated mode is supported in certain device families that do not use the
PHY clock network for QDRII and QDRII+ SRAM x36 interfaces. In x36 emulated
mode, two x18 DQS groups or four x9 DQS groups can be combined to form a 36-bit
wide write data bus, while two x18 DQS groups can be combined to form a 36-bit
wide read data bus. This method allows a device to support x36 QDRII and QDRII+
SRAM interfaces even if the device does not have the required number of x36 DQS
groups.
Some device families might support wraparound or x36 emulated mode interfaces at
slightly lower frequencies.
f For information about the devices that support wraparound and x36 emulated mode
interfaces, and the supported frequency for your design, refer to the External Memory
Interface Spec Estimator page on the Altera website.
f For more information about x36 emulated mode support for QDRII and QDRII+
SRAM interfaces, refer to the Planning Pin and FPGA Resources chapter.
Document Revision History
Table 14–2 lists the revision history for this document.
Table 14–2. Document Revision History

Date            Version   Changes
November 2011   1.0       Initial release.
15. Power Estimation Methods for
External Memory Interface Designs
November 2011
EMI_DG_015-2.0
Table 15–1 lists the Altera®-supported power estimation methods for external
memory interfaces.
Table 15–1. Power Estimation Methods for External Memory Interfaces

Method                                         Vector Source           ALTMEMPHY Support   UniPHY Support
Early power estimator (EPE)                    Not applicable          Yes                 Yes
Vector-less PowerPlay power analysis (PPPA)    Not applicable          Yes                 Yes
Vector-based PPPA                              RTL simulation          Yes                 Yes
Vector-based PPPA                              Zero-delay simulation   Yes                 Yes
Vector-based PPPA                              Timing simulation       (2)                 (2)

Accuracy increases and estimation time (1) lengthens from the top of the table (EPE, lowest accuracy and fastest estimation) to the bottom (timing simulation, highest accuracy and slowest estimation).

Notes to Table 15–1:
(1) To decrease the estimation time, you can skip power estimation during calibration. Power consumption during calibration is typically equivalent to power consumption during user mode.
(2) Power analysis using timing simulation vectors is not supported.
When using Altera IP, you can use the zero-delay simulation method to analyze the power required for the external memory interface. Zero-delay simulation is as accurate as timing simulation for 95% of designs (designs with no glitching). For a design with glitching, power may be underestimated.
f For more information about zero-delay simulation, refer to the Power Estimation and
Analysis section in the Quartus® II Handbook.
1
The size of the vector file (.vcd) generated by zero-delay simulation of an Altera
DDR3 SDRAM High-Performance Controller Example Design is 400 GB. The .vcd
includes calibration and user mode activities. When vector generation for the calibration phase is skipped, the vector size decreases to 1 GB.
To perform vector-based PPPA using zero-delay simulation, follow these steps (a
scripted sketch of this flow appears after the steps):
1. Compile your design in the Quartus II software to generate the design netlist,
   <project_name>.vo.
1  The <project_name>.vo file is generated by the EDA Netlist Writer, in the last
   stage of a compilation.
2. In <project_name>.vo, search for the include statement for <project_name>.sdo,
   comment the statement out, and save the file.
3. Create a simulation script containing the device model files and libraries, and the
   design-specific files:
   ■ Netlist file for the design, <project_name>.vo
   ■ RTL or netlist file for the memory device
   ■ Testbench RTL file
4. Compile all the files.
5. Invoke the simulator with commands to generate .vcd files.
6. Generate .vcd files for the parts of the design that contribute the most to power
   dissipation.
7. Run the simulation.
8. Use the generated .vcd files as the signal activity input files for the PPPA tool.
9. Run PPPA.
f For more information about estimating power, refer to the Power Estimation and
Analysis section in the Quartus II Handbook.
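The following sketch shows one way steps 2 through 7 might be scripted; it is not part
of the handbook or of the Quartus II software. It assumes a ModelSim-style simulator and
uses placeholder names (my_project, tb, memory_model.v, and the dut hierarchy) that you
would replace with your own; the Quartus II compilation (step 1) and the PPPA runs
(steps 8 and 9) are performed separately in the Quartus II software.

#!/usr/bin/env python3
# Hypothetical helper script for the zero-delay flow above (steps 2-7).
# All file and module names below are placeholders, not Altera-defined names.
import re
import subprocess
from pathlib import Path

PROJECT = "my_project"   # placeholder Quartus II project (netlist) name
TESTBENCH = "tb"         # placeholder testbench top-level module

def comment_out_sdo_include(vo_file: Path) -> None:
    """Step 2: comment out the reference to <project_name>.sdo so the netlist
    simulates with zero delays (no SDF back-annotation)."""
    text = vo_file.read_text()
    text = re.sub(r"^(.*\.sdo.*)$", r"// \1", text, flags=re.MULTILINE)
    vo_file.write_text(text)

def write_do_file(do_file: Path) -> None:
    """Steps 3-6: build a simulation script that compiles the design netlist,
    the memory-device model, and the testbench, then dumps a .vcd for the
    hierarchy assumed to contribute most to power dissipation."""
    do_file.write_text(f"""\
vlib work
vlog {PROJECT}.vo
vlog memory_model.v
vlog {TESTBENCH}.v
# Add -L options for the Altera device simulation libraries your netlist needs.
vsim work.{TESTBENCH}
vcd file {PROJECT}_zero_delay.vcd
vcd add -r /{TESTBENCH}/dut/*
run -all
quit -f
""")

if __name__ == "__main__":
    comment_out_sdo_include(Path(f"{PROJECT}.vo"))    # step 2
    write_do_file(Path("zero_delay_sim.do"))          # steps 3-6
    # Steps 5 and 7: invoke the simulator in command-line mode and run it.
    subprocess.run(["vsim", "-c", "-do", "zero_delay_sim.do"], check=True)

To shrink the generated .vcd (see the earlier note about skipping the calibration
phase), you could start dumping only after calibration completes, for example by
issuing the vcd commands later in the simulation rather than before run -all.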
Document Revision History
Table 15–2 lists the revision history for this document.
Table 15–2. Document Revision History
Date            Version   Changes
November 2011   2.0       Reorganized power estimation methods section into an individual chapter.
April 2010      1.0       Initial release.
External Memory Interface Handbook
Volume 3: Reference Material

101 Innovation Drive
San Jose, CA 95134
www.altera.com

EMI_RM-1.0

Document last updated for Altera Complete Design Suite version: 11.1
Document publication date: November 2011
Contents
Chapter Revision Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Section I. Functional Descriptions
Chapter 1. Functional Description—UniPHY
Block Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–1
I/O Pads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–2
Reset and Clock Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–2
Dedicated Clock Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
Address and Command Datapath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–3
Write Datapath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–4
Leveling Circuitry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–5
Read Datapath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–7
Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–8
Nios II-Based Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–8
RTL-based Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–12
DLL Offset Control Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–13
Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–14
AFI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–14
The Memory Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–15
The DLL and PLL Sharing Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–15
About PLL Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–17
The OCT Sharing Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–17
UniPHY Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–18
PHY-to-Controller Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–21
Using a Custom Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–25
Using a Vendor-Specific Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–25
AFI 3.0 Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–26
Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–26
Bus Width and AFI Ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–26
AFI Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–27
Parameters Affecting Bus Width . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–27
AFI Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–28
Clock and Reset Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–28
Address and Command Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–29
Write Data Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–30
Read Data Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–30
Calibration Status Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–31
Tracking Management Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–32
Register Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–33
UniPHY Register Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–33
Controller Register Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–35
Efficiency Monitor and Protocol Checker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–38
Efficiency Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–38
Protocol Checker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–38
Read Latency Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–38
Using the Efficiency Monitor and Protocol Checker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–38
Avalon CSR Slave and JTAG Memory Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–39
UniPHY Calibration Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–40
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–41
Calibration Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–41
Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–41
Memory Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–42
Stage 1: Read Calibration Part One—DQS Enable Calibration and DQ/DQS Centering . . . . . . . . 1–42
Guaranteed Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–43
DQS Enable Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–44
Centering DQ/DQS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–46
Stage 2: Write Calibration Part One . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–47
Stage 3: Write Calibration Part Two—DQ/DQS Centering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–48
Stage 4: Read Calibration Part Two—Read Latency Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . 1–48
Read Latency Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–48
Calibration Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–49
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1–49
Chapter 2. Functional Description—ALTMEMPHY
Block Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–2
Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Address and Command Datapath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Arria II GX Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–3
Clock and Reset Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
Clock Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–5
Reset Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–7
Read Datapath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–8
Arria II GX Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–8
ALTMEMPHY Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–10
PHY-to-Controller Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–16
Using a Custom Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–23
Preliminary Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–23
Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–23
Clocks and Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–23
Calibration Process Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–24
Other Local Interface Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–24
Address and Command Interfacing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–24
Handshake Mechanism Between Read Commands and Read Data . . . . . . . . . . . . . . . . . . . . . . . 2–24
Handshake Mechanism Between Write Commands and Write Data . . . . . . . . . . . . . . . . . . . . . . 2–25
Partial Writes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–26
ALTMEMPHY Calibration Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–27
Enter Calibration (s_reset) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–29
Initialize PHY (s_phy_initialize) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–29
Initialize DRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–29
Initialize DRAM Power Up Sequence (s_int_dram) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–29
Program Mode Registers for Calibration (s_prog_mr) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–29
Write Header Information in the internal RAM (s_write_ihi) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–30
Load Training Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–30
Write Block Training Pattern (s_write_btp) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–30
Write More Training Patterns (s_write_mtp) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–31
Test More Pattern Writes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–31
Calibrate Read Resynchronization Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–33
Initialize Read Resynchronization Phase Calibration (s_rrp_reset) . . . . . . . . . . . . . . . . . . . . 2–34
Calibrate Read Resynchronization Phase (s_rrp_sweep) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–34
Calculate Read Resynchronization Phase (s_rrp_seek) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–34
Calculate Read Data Valid Window (s_rdv) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–34
Advertize Write Latency (s_was) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–35
Calculate Read Latency (s_adv_rlat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–35
Output Write Latency (s_adv_wlat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–36
Calibrate Postamble (s_poa) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–36
Set Up Address and Command Clock Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–37
Write User Mode Register Settings (s_prep_customer_mr_setup) . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–37
Voltage and Temperature Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–37
Setup the Mimic Window (s_tracking_setup) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–38
Perform Tracking (s_tracking) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–38
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2–38
Chapter 3. Functional Description—Hard Memory Interface
Hard Memory Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–1
High-Level Feature Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–1
Multi-Port Front End (MPFE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–2
Fabric Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–2
Operation Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–3
Multi-port Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–3
Port Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–3
DRAM Burst Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–4
DRAM Power Saving Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–4
Hard Memory Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
Clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
DRAM Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
ECC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–5
Controller ECC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–6
Bonding of Memory Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–6
Data Return Bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–6
FIFO Ready . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
Bonding Latency Impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
Bonding Controller Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
Hard PHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
Interconnections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–7
Clock Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–8
Hard Sequencer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–8
Document Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3–8
Chapter 4. Functional Description—HPC II Controller
Memory Controller Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–1
Avalon-ST Input Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–2
AXI to Avalon-ST Converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–2
Handshaking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–3
Command Channel Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–3
Data Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–3
Burst Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–3
Backpressure Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–4
Command Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–4
Timing Bank Pool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–4
Arbiter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–4
Arbitration Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–4
Rank Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Read Data Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Write Data Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
ECC Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
AFI Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
CSR Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Controller Features Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Data Reordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–5
Pre-emptive Bank Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–6
Quasi-1T and Quasi-2T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–6
User Autoprecharge Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–6
Half-Rate Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4–6
Address and Command Decoding L