Altera Stratix V Network Reference Platform User Guide

OCL008-14.0.0
2014.07.25
101 Innovation Drive
San Jose, CA 95134
www.altera.com
Contents
Altera Stratix V Network Reference Platform User Guide
  About the Altera Stratix V Network Reference Platform User Guide
  Stratix V Network Reference Platform: Prerequisites
    Legacy Board Support
  Features of the Stratix V Network Reference Platform
  Contents of the Stratix V Network Reference Platform
  Developing Your Custom Platform
    Initializing Your Custom Platform
    Removing Unused Hardware
    Integrating Your Custom Platform with the AOCL
    Setting up the Software Development Environment
    Branding Your Custom Platform
    Establishing Host Communication
    Connecting the Memory
    Integrating an OpenCL Kernel
    Programming Your FPGA Quickly Using CvP
    Guaranteeing Timing Closure
  Stratix V Network Reference Platform Design Architecture
    Host-FPGA Communication over PCIe
    DDR3 as Global Memory for OpenCL Applications
    QDRII as Heterogeneous Memory for OpenCL Applications
    Host Connection to OpenCL Kernels
    Implementation of UDP Cores as OpenCL Channels
    FPGA System Design
    Guaranteed Timing Closure
    Addition of Timing Constraints
    Connection to the Altera SDK for OpenCL
    FPGA Programming Flow
    Host-to-Device MMD Software Implementation
    OpenCL Utilities Implementation
  Stratix V Network Reference Platform Implementation Considerations
  Troubleshooting
  Document Revision History
About the Altera Stratix V Network Reference Platform User Guide
The Altera Stratix V Network Reference Platform User Guide describes the procedures and design
considerations you can implement to modify the Altera® Stratix® V Network Reference Platform (s5_net)
into your own Custom Platform for use with the Altera Software Development Kit (SDK) for OpenCL™ (1)(2)
(AOCL). This document also contains reference information on the design decisions for the s5_net
Reference Platform, which makes use of features such as heterogeneous memory buffers and I/O channels
to maximize hardware usage on a computing card designed for networking.
Important: You must use the tools available in the s5_net Reference Platform and the AOCL Custom
Platform Toolkit together to create your own Custom Platform. As such, Altera recommends
that you familiarize yourself with the contents of both this document and the Altera SDK for
OpenCL Custom Platform Toolkit User Guide.
Stratix V Network Reference Platform: Prerequisites
The Altera Stratix V Network Reference Platform User Guide assumes that you are an experienced FPGA
designer deriving an Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform
from the Stratix V Network Reference Platform (s5_net).
This document also assumes that you are familiar with the following Altera FPGA design tools and
concepts:
• FPGA architecture, including clocking, global routing and I/Os
• High-speed design
• Timing analysis
• Quartus® II software
• Qsys design and Avalon® interfaces
• Tcl scripting
• Designing with LogicLock regions
• PCI Express® (PCIe®)
• DDR3 external memory
In addition, you should be familiar with the contents of the following documents:
• Altera SDK for OpenCL Custom Platform Toolkit User Guide
• Altera SDK for OpenCL Getting Started Guide
• Altera SDK for OpenCL Programming Guide
Related Information
• Altera SDK for OpenCL Custom Platform Toolkit User Guide
• Altera SDK for OpenCL Getting Started Guide
• Altera SDK for OpenCL Programming Guide
(1) OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission of the Khronos Group™.
(2) The Altera SDK for OpenCL is based on a published Khronos Specification, and has passed the Khronos
Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance.
© 2014 Altera Corporation. All rights reserved. ALTERA, ARRIA, CYCLONE, ENPIRION, MAX, MEGACORE, NIOS, QUARTUS and STRATIX words and logos are
trademarks of Altera Corporation and registered in the U.S. Patent and Trademark Office and in other countries. All other words and logos identified as
trademarks or service marks are the property of their respective holders as described at www.altera.com/common/legal.html. Altera warrants performance
of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any
products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information,
product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest version of device
specifications before relying on any published information and before placing orders for products or services.
Legacy Board Support
The Stratix V Network Reference Platform (s5_net) and the Altera Stratix V Network Reference Platform
User Guide are not compatible with platforms created for previous versions (that is, prior to 14.0) of the
Altera Software Development Kit (SDK) for OpenCL (AOCL).
Follow the instructions in the Migration.txt file available with the AOCL Custom Platform Toolkit to
migrate a platform to the current version without redesigning the system.
The AOCL Custom Platform Toolkit is downloadable from the OpenCL Reference Platforms page on
the Altera website.
Features of the Stratix V Network Reference Platform
Prior to designing an Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform,
you must decide on design considerations that allow you to fully utilize the available hardware on your
computing card.
The following figure depicts the hardware features on a hypothetical computer card that the Stratix V
Network Reference Platform (s5_net) targets.
[Figure: block diagram of the computing card. A Stratix V D8 FPGA connects to two DDR3-1600 banks,
four QDRII+ 500 MHz banks, two 10GE ports, and a PCIe Gen2 x8 link.]
Below is a list of features for the s5_net Reference Platform:
1. OpenCL Host
This Reference Platform uses a PCI Express (PCIe)-based host that connects to the Stratix V PCIe
Gen2 x8 Hard intellectual property (HIP).
2. OpenCL Global Memory
The hardware provides two separate DDR3 memory buffers, each with 4 gigabytes (GB) of storage.
This Reference Platform uses both banks together to create 8 GB of global memory.
3. Heterogeneous Memory
This Reference Platform uses the four on-board quad data rate II (QDRII) memory buffers to
implement a total of 64 megabytes (MB) of heterogeneous memory for the Altera Offline Compiler
(AOC). By default, the host application allocates memory into the OpenCL global memory (that is,
DDR3) when an AOCL kernel program loads into the OpenCL runtime. However, based on the kernel
arguments, the host might relocate memory to other buffers available on the computing card (that is,
the four QDRII buffers). Accesses to heterogeneous memory buffers are advantageous for network
applications because they require the fast random access bandwidth that QDR provides.
4. OpenCL I/O Channels
The two 10-Gbps Ethernet (10GbE) I/Os connect to a full user datagram protocol (UDP) stack that
provides an Avalon Streaming (Avalon-ST) interface for direct connection to OpenCL kernels.
5. FPGA Programming
The computing card uses the Configuration via Protocol (CvP)-capable PCIe HIP. This Reference
Platform uses Altera's CvP feature for implementing fast reprogramming over PCIe.
6. Guaranteed Timing
Timing guarantee is achievable via the Altera Quartus II compilation flow for CvP. This Reference
Platform delivers a precompiled netlist in a .personax file that the AOC imports into each kernel
compilation.
Contents of the Stratix V Network Reference Platform
The Stratix V Network Reference Platform (s5_net) is available for download on the OpenCL Reference
Platforms page of the Altera website.
The following table highlights the contents of the s5_net Reference Platform:
Windows File or Folder | Linux File or Directory | Description
board_env.xml | board_env.xml | Extensible Markup Language (XML) file that describes the Reference Platform to the Altera Software Development Kit (SDK) for OpenCL (AOCL).
\windows64 | /linux64 | Contains the memory mapped devices (MMD) library, kernel mode driver, and executables for the AOCL utilities (that is, install, flash, program, diagnose) for your 64-bit operating system.
\source | /source | Contains the source code for the MMD library and AOCL utilities in the /linux64 and \windows64 directories.
\include | /include | Contains header files necessary for compiling an OpenCL host application and accessing board-specific application programming interface (API) calls. For the s5_net Reference Platform, these files are necessary for user datagram protocol (UDP) initialization.
Related Information
OpenCL Reference Platforms page
Developing Your Custom Platform
You can develop your Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform by
modifying the Stratix V Network Reference Platform (s5_net).
Before you begin
Developing your Custom Platform requires in-depth knowledge of the contents in the following
documents and tools:
1. Altera SDK for OpenCL Custom Platform Toolkit User Guide
2. Contents of the Altera SDK for OpenCL Custom Platform Toolkit
3. Altera Stratix V Network Reference Platform User Guide
4. Altera documentation for all the intellectual properties (IPs) in your Custom Platform
In addition, you must independently verify all the Hard IPs (HIPs) on your computing card (for example,
PCI Express (PCIe) controllers, DDR3 external memory, and Ethernet).
1. Initializing Your Custom Platform on page 1-5
To initialize your Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform,
copy the Stratix V Network Reference Platform (s5_net) to another directory and rename it.
2. Removing Unused Hardware on page 1-5
After you store the Stratix V Network Reference Platform (s5_net) to your own directory and perform
some preliminary modifications, the next step is to modify the Quartus II files.
3. Integrating Your Custom Platform with the AOCL on page 1-6
After you modify your Quartus II files, integrate your Altera Software Development Kit (SDK) for
OpenCL (AOCL) Custom Platform with the AOCL by performing the following tasks.
4. Setting up the Software Development Environment on page 1-6
Prior to building the software layer for your Altera Software Development Kit (SDK) for OpenCL
(AOCL) Custom Platform, you must set up the software development environment.
5. Branding Your Custom Platform on page 1-7
Modify the library, driver and source files in the Stratix V Network Reference Platform (s5_net) to
reference your Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform.
6. Establishing Host Communication on page 1-8
After you modify and rebrand the Stratix V Network Reference Platform (s5_net) to your own Custom
Platform, use the tools and utilities in the Custom Platform to establish communication between your
FPGA accelerator board and your host application.
7. Connecting the Memory on page 1-8
You must calibrate the external memory intellectual properties (IPs) and controllers in your Custom
Platform, and connect them to the host.
8. Integrating an OpenCL Kernel on page 1-9
After you establish host communication and connect the external memory, you can test the FPGA
programming process from kernel creation to program execution.
9. Programming Your FPGA Quickly Using CvP on page 1-9
After you verify that the host can program your FPGA device successfully, you can establish the
Configuration via Protocol (CvP) programming capability of your Custom Platform.
10. Guaranteeing Timing Closure on page 1-10
When you modify the Stratix V Network Reference Platform (s5_net) into your own Custom Platform,
you must ensure that guaranteed timing closure holds true for your Custom Platform.
Initializing Your Custom Platform
To initialize your Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform, copy
the Stratix V Network Reference Platform (s5_net) to another directory and rename it.
1. Download the s5_net Reference Platform from the OpenCL Reference Platforms page on the Altera
website.
2. Store the s5_net directory into a directory that you own (that is, not a system directory) and then
rename it (<your_custom_platform_name>).
3. Remove the <your_custom_platform_name>/hardware/s5_net/persona directory.
4. Rename the <your_custom_platform_name>/hardware/s5_net directory to match the name of your FPGA
board (<board_name>), and then modify the board_spec.xml file with this name.
5. Modify the board_env.xml file so that the name and default fields match the changes you made in Step
4.
6. In the AOCL, invoke the command aoc --list-boards to confirm that the Altera Offline
Compiler (AOC) displays the board name in your Custom Platform.
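The file operations in Steps 2 to 4 lend themselves to scripting. The following is a minimal shell sketch, not something shipped with the Reference Platform: the function name init_custom_platform and its three arguments are hypothetical, and the edits to board_spec.xml and board_env.xml in Steps 4 and 5 still have to be made by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of s5_net): performs the copy, removal and
# rename described in Steps 2 to 4.
#   $1 = path to the downloaded s5_net directory
#   $2 = destination directory, that is, <your_custom_platform_name>
#   $3 = <board_name>
init_custom_platform() {
    src=$1; dest=$2; board=$3
    [ -d "$src" ] || { echo "missing source directory: $src" >&2; return 1; }
    [ ! -e "$dest" ] || { echo "destination already exists: $dest" >&2; return 1; }
    cp -R "$src" "$dest" || return 1                     # Step 2: copy and rename
    rm -rf "$dest/hardware/s5_net/persona"               # Step 3: drop the persona directory
    mv "$dest/hardware/s5_net" "$dest/hardware/$board"   # Step 4: directory rename only
}

# Example invocation (paths and names are placeholders):
#   init_custom_platform ./s5_net "$HOME/my_platform" my_board
#   aoc --list-boards    # Step 6, after editing board_spec.xml and board_env.xml
```

The helper refuses to overwrite an existing destination so that a second run cannot silently mix two copies of the platform.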
Removing Unused Hardware
After you store the Stratix V Network Reference Platform (s5_net) to your own directory and perform
some preliminary modifications, the next step is to modify the Quartus II files.
1. Instantiate your PCI Express (PCIe) controller.
For detailed instructions on instantiating your PCIe controller, refer to the Stratix V Avalon-MM
Interface for PCIe Solutions User Guide.
For information on the design parameters for instantiating the PCIe controller for the s5_net
Reference Platform, refer to the Host-FPGA Communication over PCIe section.
2. In Qsys, open the <your_custom_platform_name>/hardware/<board_name>/board.qsys Qsys System File.
a. Remove the cpld_bridge component.
b. Remove the qdr_0 component.
c. Remove the DDR3 memory controllers.
Because several components use the clock that ddr3a generates, it might be easier to remove only
the second DDR3 controller (ddr3b) and reparameterize ddr3a to match your memory.
3. Remove the cpld.sdc file from the <your_custom_platform_name>/hardware/<board_name> directory.
4. In Qsys, open the <your_custom_platform_name>/hardware/<board_name>/system.qsys file.
a. Remove the udp_0 component.
5. In the Qsys System menu, click Remove Dangling Connections to remove invalid connection points
between system.qsys and board.qsys.
6. Modify both Quartus II Settings Files (.qsf) to use only the pinouts and settings for your system.
Ensure that the only differences between these files are in the settings in the Revision Specific
Settings section of the files.
Related Information
• Stratix V Avalon-MM Interface for PCIe Solutions User Guide
• Host-FPGA Communication over PCIe on page 1-11
Integrating Your Custom Platform with the AOCL
After you modify your Quartus II files, integrate your Altera Software Development Kit (SDK) for
OpenCL (AOCL) Custom Platform with the AOCL by performing the following tasks.
1. Update the <your_custom_platform_name>/hardware/<board_name>/board_spec.xml file by removing the
quad data rate (QDR) and Ethernet channels from it. Ensure that there is at least one global memory
interface, and all the global memory interfaces correspond to the exported interfaces from the
board.qsys Qsys System File.
2. In the <your_custom_platform_name>/hardware/<board_name>/scripts directory, modify the post_flow.tcl
file to not call the create_fpga_bin.tcl file. You can do so by commenting out the line of code containing
the command call_script_as_function scripts/create_fpga_bin.tcl.
3. Set the environment variable ACL_QSH_COMPILE_CMD to quartus_sh --flow compile top -c base.
Setting this environment variable instructs the AOCL to compile the base revision corresponding to
the base.qsf Quartus II Settings File in the <your_custom_platform_name>/hardware/<board_name>
directory of your custom platform.
4. Perform the steps outlined in the custom_platform_toolkit/tests/README.txt file to compile the
custom_platform_toolkit/tests/boardtest/boardtest.cl OpenCL kernel source file.
The hardware compilation stage will fail because of the absence of the fpga.bin file. However, the
Quartus II compilation should complete successfully and produce a boardtest.aoco Altera Offline
Compiler Object File.
5. If compilation fails because of timing failures, fix the errors, or compile custom_platform_toolkit/tests/
boardtest.cl with different seeds by including the --seed <N> option of the aoc command (for
example, aoc --seed 2 boardtest.cl).
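Steps 3 and 5 can be combined into a small seed sweep. The sketch below is an illustration under stated assumptions, not an Altera-provided script: sweep_seeds and the AOC override variable are hypothetical names, the per-seed directory layout is a choice made here, and the kernel is assumed not to depend on files located next to it. Only the ACL_QSH_COMPILE_CMD value and the aoc --seed option come from the steps above.

```shell
#!/bin/sh
# Step 3: have the AOCL compile the base revision (value taken from the text).
export ACL_QSH_COMPILE_CMD="quartus_sh --flow compile top -c base"

# Hypothetical helper: compiles one kernel once per seed, each in its own
# directory (seed_<N>), and keeps the compiler output in seed_<N>/aoc.log.
# AOC is an assumed override hook; it defaults to the real aoc command.
sweep_seeds() {
    kernel=$1; shift
    for seed in "$@"; do
        dir="seed_$seed"
        mkdir -p "$dir" && cp "$kernel" "$dir/" || return 1
        # The exit status is ignored on purpose: at this stage the flow is
        # expected to stop at the missing fpga.bin even when the Quartus II
        # compilation itself succeeds (see Step 4).
        ( cd "$dir" && ${AOC:-aoc} --seed "$seed" "$(basename "$kernel")" ) \
            > "$dir/aoc.log" 2>&1 || true
        echo "seed $seed: output in $dir/aoc.log"
    done
}

# Example (seed values are arbitrary):
#   sweep_seeds custom_platform_toolkit/tests/boardtest/boardtest.cl 1 2 3
```

Because the exit status cannot distinguish a timing failure from the expected fpga.bin failure here, inspect the timing results of each seed directory before picking one.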
Setting up the Software Development Environment
Prior to building the software layer for your Altera Software Development Kit (SDK) for OpenCL (AOCL)
Custom Platform, you must set up the software development environment.
Setting Up Software Development Environment for Windows on page 1-6
Setting Up the Software Development Environment for Linux on page 1-7
Setting Up Software Development Environment for Windows
1. Install the GNU make utility on your Windows development machine.
Note: Altera used the GNU make utility version 3.81a to build the software in the Stratix V Network
Reference Platform (s5_net).
2. Install Microsoft Visual Studio.
Note: Microsoft Visual Studio 2008 (9.0) was used to build the software in the s5_net Reference
Platform.
3. Set up the software development environment so that the AOCL user can invoke AOCL commands
and utilities at a normal command prompt.
4. Modify the <your_custom_platform_name>/source/Makefile.common file so that TOP_DEST_DIR points to
the top level directory of your Custom Platform.
5. Set the JUNGO_LICENSE variable to your Jungo WinDriver license in the Makefile.common file.
For information on how to acquire a Jungo WinDriver license, visit the Jungo Connectivity Ltd.
website.
6. To check that you set up the software development environment properly, invoke the gmake or
gmake clean command.
Related Information
Jungo Connectivity Ltd. website
Setting Up the Software Development Environment for Linux
1. Ensure that you use a Linux distribution that Altera supports.
Note: Altera used the GNU Compiler Collection (GCC) version 4.2.3 to build the software in the
Stratix V Network Reference Platform (s5_net).
2. Modify the <your_custom_platform_name>/source/Makefile.common file so that TOP_DEST_DIR points to the top
level directory of your Custom Platform.
3. To check that you set up the software environment properly, invoke the make or make clean
command.
Branding Your Custom Platform
Modify the library, driver and source files in the Stratix V Network Reference Platform (s5_net) to
reference your Altera Software Development Kit (SDK) for OpenCL (AOCL) Custom Platform.
1. In the software available with the s5_net Reference Platform, replace all references to the s5_net
Reference Platform with references to your Custom Platform.
2. Modify the linklib element in <your_custom_platform_name>/board_env.xml Extensible Markup
Language (XML) file to your custom memory mapped devices (MMD) library name.
3. Modify the PACKAGE_NAME and MMD_LIB_NAME fields in the <your_custom_platform_name>/source/
Makefile.common file.
4. In your <your_custom_platform_name> directory, modify the information in the
linux64/driver/hw_pcie_constants.h file for Linux and the source/include/hw_pcie_constants.h file for Windows.
Update the following lines of code with information of your Custom Platform:
#define ACL_PCI_SUBSYSTEM_VENDOR_ID 0x1172
#define ACL_PCI_SUBSYSTEM_DEVICE_ID 0x0005
#define ACL_BOARD_PKG_NAME "s5_net"
#define ACL_VENDOR_NAME "Altera Corporation"
#define ACL_BOARD_NAME "Network Reference Platform"
Note: The IDs must match the parameters in the PCI Express (PCIe) controller hardware.
5. For Windows systems, update the DeviceList field in the
<your_custom_platform_name>/windows64/driver/acl_boards.inf Setup Information file.
6. Run make in the <your_custom_platform_name>/source directory to generate the driver.
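Leftover references to s5_net are easy to miss when rebranding. The shell sketch below shows one way to look for them and to print the five branding macros from a copy of hw_pcie_constants.h. Both function names are hypothetical, and the sketch assumes a grep that supports the -r, -I and -l options (GNU grep does).

```shell
#!/bin/sh
# Hypothetical helper: lists text files under a directory that still mention
# the Reference Platform name (Step 1).
#   $1 = directory to search, for example <your_custom_platform_name>/source
find_s5_net_refs() {
    grep -rIl "s5_net" "$1"
}

# Hypothetical helper: prints the five branding macros from one copy of
# hw_pcie_constants.h (Step 4), so the Linux and Windows copies can be compared.
#   $1 = path to a hw_pcie_constants.h file
show_branding() {
    grep -E '^[[:space:]]*#define[[:space:]]+ACL_(PCI_SUBSYSTEM_(VENDOR|DEVICE)_ID|BOARD_PKG_NAME|VENDOR_NAME|BOARD_NAME)[[:space:]]' "$1"
}

# Example, run from the <your_custom_platform_name> directory:
#   find_s5_net_refs source
#   show_branding linux64/driver/hw_pcie_constants.h
#   show_branding source/include/hw_pcie_constants.h
```

A file reported by find_s5_net_refs is not necessarily wrong, so review each hit before changing it.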
Establishing Host Communication
After you modify and rebrand the Stratix V Network Reference Platform (s5_net) to your own Custom
Platform, use the tools and utilities in the Custom Platform to establish communication between your
FPGA accelerator board and your host application.
1. Program your FPGA device with the base hardware configuration file and reboot your system.
2. Invoke lspci on Linux or open the Device Manager on Windows, and confirm that a PCI Express
(PCIe) device with your vendor and device IDs exists.
This step confirms that your operating system recognizes the PCIe device.
3. Set the environment variable AOCL_BOARD_PACKAGE_ROOT to point to the location of the current
Custom Platform. Then run the aocl install utility command to install the kernel driver on your
machine.
4. Ensure that you properly set the LD_LIBRARY_PATH environment variable on Linux or the PATH
environment variable on Windows.
For more information on the settings for LD_LIBRARY_PATH or PATH, refer to the Altera SDK for
OpenCL Getting Started Guide.
5. Perform one of the following tasks to instruct the memory mapped devices (MMD) software not to use
Configuration via Protocol (CvP) or flash memory to program the FPGA:
• To force the MMD to program via the quartus_pgm executable, set the environment variable
ACL_PCIE_FORCE_USB_PROGRAMMING to a value of 1.
• To force the MMD to program via your custom programming method, modify the
<your_custom_platform_name>/source/host/mmd/acl_pcie_device.cpp file. Trace the appearance of the
environment variable ACL_PCIE_FORCE_USB_PROGRAMMING in the source code, and replace
the existing instruction with your custom programming method.
6. Modify the version_id_test function in the MMD source code in the
<your_custom_platform_name>/source/host/mmd/acl_pcie_device.cpp file to exit after reading from the version ID register.
7. Remake the MMD software.
8. Run the aocl diagnose utility command and confirm the version ID register reads back the ID
successfully. You may set the environment variables ACL_HAL_DEBUG and ACL_PCIE_DEBUG to a
value of 1 to visualize the result of the diagnostic test on your terminal.
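The environment variables named in Steps 3, 5 and 8 can be collected in one place. The following shell sketch is illustrative only: the function name setup_host_bringup_env is hypothetical, the platform path is a placeholder, and the lspci filter assumes that your board keeps the PCIe vendor ID 0x1172.

```shell
#!/bin/sh
# Hypothetical helper: exports the environment variables named in
# Steps 3, 5 and 8.
#   $1 = root directory of the Custom Platform
setup_host_bringup_env() {
    export AOCL_BOARD_PACKAGE_ROOT="$1"        # Step 3: location of the Custom Platform
    export ACL_PCIE_FORCE_USB_PROGRAMMING=1    # Step 5: program via quartus_pgm
    export ACL_HAL_DEBUG=1                     # Step 8: print diagnostic details
    export ACL_PCIE_DEBUG=1
}

# Example session on Linux (run on the machine that hosts the board):
#   setup_host_bringup_env /path/to/your_custom_platform
#   lspci -d 1172:     # Step 2; assumes the PCIe vendor ID is 0x1172
#   aocl install       # Step 3
#   aocl diagnose      # Step 8
```

LD_LIBRARY_PATH (Step 4) is left out on purpose because its value depends on your installation.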
Related Information
• Host-FPGA Communication over PCIe on page 1-11
• Altera SDK for OpenCL Getting Started Guide
Connecting the Memory
You must calibrate the external memory intellectual properties (IPs) and controllers in your Custom
Platform, and connect them to the host.
1. In your Custom Platform, instantiate your external memory IP based on the information in the DDR3
as Global Memory for OpenCL Applications section.
2. Update the <your_custom_platform_name>/hardware/<board_name>/board_spec.xml file to reflect the
modifications.
3. Remove the boardtest hardware configuration file that you created previously, and recompile the
custom_platform_toolkit/tests/boardtest/boardtest.cl kernel source file.
4. Reprogram the FPGA with the new boardtest hardware configuration file and then reboot your
machine.
5. Modify the memory mapped devices (MMD) source code to exit after checking the Uniphy status
register in the function wait_for_uniphy. Rebuild the MMD software.
6. Run the aocl diagnose utility command and confirm that the host reads back both the version ID
and the value 0 from the uniphy_status component.
The utility should return the message Uniphy are calibrated.
7. Consider using the SignalTap II Logic Analyzer to confirm the successful calibration of all memory
controllers.
Related Information
DDR3 as Global Memory for OpenCL Applications on page 1-16
Integrating an OpenCL Kernel
After you establish host communication and connect the external memory, you can test the FPGA
programming process from kernel creation to program execution.
1. Perform the steps outlined in the custom_platform_toolkit/tests/README.txt file to build the hardware
configuration file from the custom_platform_toolkit/tests/boardtest/boardtest.cl kernel source file.
2. Program your FPGA device with the boardtest.aocx Altera Offline Compiler Executable file and reboot
your machine.
3. Remove the early-exit modifications that you added to the MMD source code earlier, and then invoke the
aocl diagnose <device_name> command, where <device_name> is the string you define in your
Custom Platform to identify each board.
By default, <device_name> is the acl number (for example, acl0 to acl15) that corresponds to your
FPGA device. In this case, invoke the aocl diagnose acl0 command.
4. Build the boardtest host application. The .sln file for Windows and the Makefile for Linux are available
in the custom_platform_toolkit/tests/boardtest directory.
Attention: You must modify the .sln file to link it against the memory mapped devices (MMD) library
in your Custom Platform.
5. Set the environment variable CL_CONTEXT_COMPILER_MODE_ALTERA to a value of 3.
For more information on this environment variable, refer to the Troubleshooting section.
Related Information
Troubleshooting on page 1-39
Programming Your FPGA Quickly Using CvP
After you verify that the host can program your FPGA device successfully, you can establish the
Configuration via Protocol (CvP) programming capability of your Custom Platform.
1. Invoke the following command to generate the CvP files:
quartus_cpf -c --cvp <revision_name>.sof <revision_name>.rbf
You may include this command in the <your_custom_platform_name>/hardware/<board_name>/scripts/post_flow.tcl
file so that it generates the CvP files automatically after each compilation.
Your Quartus II compilation directory should contain the <revision_name>.sof,
<revision_name>.periph.rbf, and <revision_name>.core.rbf files.
2. Program the base.sof file and then reboot your machine.
3. (Optional) You may use the Quartus II Programmer to verify basic CvP functionality. Invoke the
quartus_cvp command to program the base.core.rbf file.
4. Define the contents of your fpga.bin file by adding Tcl code to the
<your_custom_platform_name>/hardware/<board_name>/scripts/post_flow.tcl file that generates the file. Then, modify the memory
mapped devices (MMD) source code and the reprogram utility so that you can use the file.
You may use the existing format if you remove the proprietary host-to-flash programming over the
cpld_bridge component from both the hardware and software.
5. If you set the environment variable ACL_PCIE_FORCE_USB_PROGRAMMING earlier, unset it. Then, set the
environment variable ACL_PCIE_FORCE_PERIPH_REPLACE_USB to a value of 1. Alternatively,
modify the <your_custom_platform_name>/source/host/mmd/acl_pcie_device.cpp file to use CvP but not
flash memory for reprogramming periphery changes. Flash programming is unavailable because an
earlier modification step has removed the necessary hardware.
6. Invoke the command aocl program <device_name> boardtest.aocx to reprogram the
device. Confirm that the message Program succeed appears.
Note: By default, <device_name> is the acl number. If you have retained the default naming
convention, invoke the aocl program command using acl0 as <device_name>. Alternatively,
if you use another naming convention for <device_name>, use that in your aocl utility
command.
Note: Ensure that you are in the directory containing the boardtest.aocx Altera Offline Compiler
Executable file when you invoke the aocl program command.
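The command from Step 1 and the environment change from Step 5 can be wrapped in small shell helpers. This is a sketch under stated assumptions: the three function names are hypothetical, and the helpers only build or print what the steps above already state, without invoking any Quartus II tool themselves.

```shell
#!/bin/sh
# Hypothetical helper: prints the conversion command from Step 1.
#   $1 = <revision_name>
cvp_convert_cmd() {
    printf 'quartus_cpf -c --cvp %s.sof %s.rbf\n' "$1" "$1"
}

# Hypothetical helper: lists the files that Step 1 says the compilation
# directory should contain after the conversion.
cvp_expected_files() {
    printf '%s\n' "$1.sof" "$1.periph.rbf" "$1.core.rbf"
}

# Hypothetical helper: Step 5, switch from quartus_pgm programming to CvP.
use_cvp_programming() {
    unset ACL_PCIE_FORCE_USB_PROGRAMMING
    export ACL_PCIE_FORCE_PERIPH_REPLACE_USB=1
}

# Example:
#   cvp_convert_cmd base     # prints: quartus_cpf -c --cvp base.sof base.rbf
#   for f in $(cvp_expected_files base); do [ -f "$f" ] || echo "missing: $f"; done
#   use_cvp_programming
#   aocl program acl0 boardtest.aocx     # Step 6, default device name
```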
Guaranteeing Timing Closure
When you modify the Stratix V Network Reference Platform (s5_net) into your own Custom Platform,
you must ensure that guaranteed timing closure holds true for your Custom Platform.
1. Establish the floorplan of your design.
Important: Consider all design criteria outlined in the FPGA System Design section and the Altera
SDK for OpenCL Custom Platform Toolkit User Guide.
2. Compile several seeds of boardtest.cl until you generate a compiled design that achieves timing closure
cleanly.
3. Copy the <path_to_s5_net>/hardware/s5_net/persona/base.root_partition.personax file into your Custom
Platform.
4. Copy the boardtest.aocx Altera Offline Compiler Executable file from the timing-closed compilation in
Step 2 into your Custom Platform. Rename the file base.aocx.
5. Derive the top revision top.qsf file from your base.qsf file by including the changes described in the CvP
section.
6. Remove the ACL_QSH_COMPILE_CMD environment variable.
7. Recompile boardtest.cl. In the Fitter Preservation section of the report, confirm that the Top partition is
imported.
The Incremental Compilation Placement Preservation section should show 100% placement for Top.
Similarly, the Incremental Compilation Routing Preservation section should show 100% routing for
Top.
8. Confirm that you can use the .aocx file to reprogram over CvP by invoking the aocl program
acl0 boardtest.aocx command.
9. Ensure that the environment variable CL_CONTEXT_COMPILER_MODE_ALTERA is not set. Run
the boardtest_host executable.
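The two environment-variable conditions in the procedure above (ACL_QSH_COMPILE_CMD removed before the recompilation, CL_CONTEXT_COMPILER_MODE_ALTERA unset before running boardtest_host) are easy to overlook. A minimal sketch of a preflight check follows; the function name preflight_problems is hypothetical and is not part of the AOCL.

```python
import os

# Variables that the procedure says must not be set at the final steps.
MUST_BE_UNSET = ("ACL_QSH_COMPILE_CMD", "CL_CONTEXT_COMPILER_MODE_ALTERA")

def preflight_problems(env):
    """Return a list of human-readable problems found in the environment."""
    return [
        f"{name} is still set; unset it before continuing"
        for name in MUST_BE_UNSET
        if name in env
    ]

if __name__ == "__main__":
    for problem in preflight_problems(os.environ):
        print(problem)
```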
Related Information
• FPGA System Design on page 1-21
• Altera SDK for OpenCL Custom Platform Toolkit User Guide
• CvP on page 1-30
Stratix V Network Reference Platform Design Architecture
Altera created the Stratix V Network Reference Platform (s5_net) based on various design considerations.
Familiarize yourself with these design considerations. Having a thorough understanding of the design
decision-making process might help in the design of your own Altera Software Development Kit (SDK) for
OpenCL (AOCL) Custom Platform.
Host-FPGA Communication over PCIe
To set up the PCI Express (PCIe) Hard intellectual property (HIP) that enables communication between
the host and the FPGA board, you must configure the IP settings and set various IDs, constants, and
parameters.
Parameter Settings for PCIe Instantiation on page 1-11
Values for PCIe Device Identification Registers on page 1-12
Version ID on page 1-13
Definitions of Hardware Constants in Software Header Files on page 1-13
PCIe Kernel Driver on page 1-14
SG-DMA on page 1-16
Parameter Settings for PCIe Instantiation
The Stratix V Network Reference Platform (s5_net) instantiates the Stratix V PCI Express (PCIe) Hard
intellectual property (HIP) to implement a host-to-device connection over PCIe.
Dependencies
• Altera Stratix V PCIe HIP core
• For Windows systems, Jungo WinDriver
HIP Configuration Settings
The table below highlights some of the HIP configuration settings:
Parameter                                 Setting
Lanes                                     Lane Rate: Gen2 (5.0 gigabits per second (Gbps))
                                          Number of Lanes: x8
                                          Note: This is the fastest configuration that can support
                                          Configuration via Protocol (CvP).
Enable Configuration via Protocol (CvP)   On
Base Address Registers (BARs)             The design uses only a single BAR (BAR 0).
Rx Buffer Credit Allocation               Low
                                          Note: This setting is derived experimentally.
Address Translation Tables                Number of address pages: 256
                                          Note: This setting is derived experimentally.
                                          Size of address pages: 12 bits
                                          Important: The number and size of the address pages must
                                          match the values in the memory mapped devices (MMD) layer.
Values for PCIe Device Identification Registers
When building the PCI Express (PCIe) hardware, you must set the following PCIe IDs related to the
device hardware.
ID Register Name      Parameter Name in PCIe IP Core   Description
Vendor ID             vendor_id_hwtcl                  Identifies the manufacturer of the FPGA device.
                                                       Always set this register to Altera's vendor ID:
                                                       0x1172
Device ID             device_id_hwtcl                  Identifies the FPGA device.
                                                       Set the device ID to the device code of the FPGA
                                                       device on your accelerator board.
                                                       For the Stratix V Network Reference Platform
                                                       (s5_net), this register is set to 0xD800 for the
                                                       Stratix V D8 FPGA.
Subsystem Vendor ID   subsystem_vendor_id_hwtcl        Identifies the manufacturer of the accelerator
                                                       board.
                                                       Set this register to the vendor ID of the
                                                       manufacturer of your accelerator board.
                                                       If you are a board vendor, set this register to
                                                       your vendor ID.
Subsystem Device ID   subsystem_device_id_hwtcl        Identifies the accelerator board.
                                                       The Altera Software Development Kit (SDK) for
                                                       OpenCL (AOCL) uses this ID to identify the board
                                                       because the software might perform differently on
                                                       different boards. If you create a Custom Platform
                                                       that supports multiple boards, use this ID to
                                                       distinguish between the boards. Alternatively, if
                                                       you have multiple Custom Platforms, each
                                                       supporting a single board, you can use this ID to
                                                       distinguish between the Custom Platforms.
                                                       Note: Make this ID unique to your Custom Platform.
You can find these PCIe ID definitions in the PCIe controller instantiated in the board.qsys system. These
IDs are necessary in the driver and the AOCL programming flow. The kernel driver uses the Vendor ID,
Subsystem Vendor ID and the Subsystem Device ID to identify the boards it supports. The AOCL
programming flow refers to the Device ID to ensure that it programs a device with an Altera Offline
Compiler Executable file (.aocx) targeting that specific device.
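The way these IDs are used can be illustrated with a short sketch. It is a model of the matching logic described above, not the driver's actual code: the PcieIds type, the function names, and the subsystem ID values 0x1234 and 0x0001 are hypothetical. Only the Vendor ID 0x1172 and the Device ID 0xD800 come from the table.

```python
from dataclasses import dataclass

ALTERA_VENDOR_ID = 0x1172   # Vendor ID from the table above
S5_NET_DEVICE_ID = 0xD800   # Device ID for the Stratix V D8 FPGA, from the table above

@dataclass(frozen=True)
class PcieIds:
    vendor_id: int
    device_id: int
    subsystem_vendor_id: int
    subsystem_device_id: int

def driver_supports(board, supported):
    """Driver-style match on Vendor ID, Subsystem Vendor ID, and Subsystem Device ID."""
    key = (board.vendor_id, board.subsystem_vendor_id, board.subsystem_device_id)
    return key in supported

def aocx_targets_board(board, aocx_device_id):
    """Programming-flow check: the .aocx file must target this board's Device ID."""
    return board.device_id == aocx_device_id

# Hypothetical subsystem IDs for a board vendor's Custom Platform.
board = PcieIds(ALTERA_VENDOR_ID, S5_NET_DEVICE_ID, 0x1234, 0x0001)
supported_boards = {(ALTERA_VENDOR_ID, 0x1234, 0x0001)}
```

With these definitions, driver_supports(board, supported_boards) is true, and aocx_targets_board(board, 0xD800) is true only for an .aocx file compiled for the D8 device.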
Version ID
The Stratix V Network Reference Platform (s5_net) instantiates a version_id component that connects to
the PCI Express (PCIe) Avalon master.
Before communicating with any part of the FPGA system, the PCIe first reads from this version_id
register to confirm the following:
• The PCIe can access the FPGA fabric successfully.
• The address map matches the map in the memory mapped devices (MMD) software.
Update the VERSION_ID parameter in the version_id component to a new value with every slave addition
or removal from the PCIe Base Address Registers (BAR) 0 bus, or whenever the address map changes.
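The version check can be pictured as a guard that runs before any other access. The sketch below is an assumption-laden illustration: EXPECTED_VERSION_ID is a made-up value, and read_version_register stands in for whatever BAR 0 read the MMD software performs.

```python
# Hypothetical value; the real constant must match the VERSION_ID parameter
# of the version_id component in the hardware design.
EXPECTED_VERSION_ID = 0x0000A0C7

def check_version_id(read_version_register, expected=EXPECTED_VERSION_ID):
    """Raise if the value read from the version_id register is not the expected one.

    A mismatch means either that the fabric is not reachable over PCIe or that
    the hardware address map differs from the one the software was built for.
    """
    value = read_version_register()
    if value != expected:
        raise RuntimeError(
            f"version_id mismatch: read 0x{value:08X}, expected 0x{expected:08X}"
        )
    return value
```

Changing the expected value in software each time the hardware VERSION_ID changes keeps an outdated MMD build from talking to a board with a different address map.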
Definitions of Hardware Constants in Software Header Files
After you build the PCI Express (PCIe) component in your hardware design, you need a software layer to
communicate with the board via PCIe. To enable this communication, you must define the hardware
constants for the software in the form of header files.
The Stratix V Network Reference Platform (s5_net) includes three header files that describe the hardware
design to the software. The locations of these header files are as follows:
• For Linux systems, the location is <path_to_s5_net>/linux64/driver
• For Windows systems, the location is <path_to_s5_net>\source\include
The following table describes the hardware constants header files:
Header File Name          Description
hw_pcie_constants.h       Header file that defines most of the hardware constants for the board
                          design.
                          Example constants in this file include the IDs described in the Values
                          for PCIe Device Identification Registers section, the Base Address
                          Registers (BAR) number, and the offsets of different components in your
                          design. In addition, this header file defines the name strings
                          ACL_BOARD_PKG_NAME, ACL_VENDOR_NAME, and ACL_BOARD_NAME.
                          Keep the information in this file in sync with any changes to the board
                          design.
hw_pcie_dma.h             Header file that defines direct memory access (DMA)-related hardware
                          constants.
                          Refer to the SG-DMA section for more information.
hw_pcie_cvp_constants.h   Header file that defines Configuration via Protocol (CvP)-related
                          hardware constants.
                          Refer to the CvP section for more information.
Related Information
• SG-DMA on page 1-16
• CvP on page 1-30
• Values for PCIe Device Identification Registers on page 1-12
PCIe Kernel Driver
A PCI Express (PCIe) kernel driver is necessary for the OpenCL runtime library to access your board
design via a PCIe bus.
The Stratix V Network Reference Platform (s5_net) provides the PCIe kernel driver for the following
operating systems:
• For Windows systems, the location of the driver is <path_to_s5_net>\windows64\driver
• For Linux systems, the location of the driver is <path_to_s5_net>/linux64/driver
Use the Altera Software Development Kit (SDK) for OpenCL (AOCL) install utility to install the
kernel driver. Refer to the aocl install section for more information.
For Windows systems, the kernel driver, the WinDriver application programming interface (API), is a
third-party driver from Jungo Connectivity Ltd. For more information about the WinDriver, refer to the
Jungo Connectivity Ltd. website or contact a Jungo Connectivity representative.
For Linux systems, an open-source, memory mapped devices (MMD)-compatible kernel driver is
available with the s5_net Reference Platform. The following table highlights some of the files included in
the Linux kernel driver:
File Name                     Description
pcie_linux_driver_exports.h   Header file that defines the special commands the kernel driver
                              supports. It defines the interface of the kernel driver. The MMD
                              layer uses this header file to communicate with the device.
                              After you install the kernel driver, it works as a character device.
                              The basic operations to the driver are open(), close(), read(), and
                              write(). To support more complex commands, an acl_cmd struct is
                              necessary to pass the command of interest to the kernel driver
                              through the read() or write() operation.
                              To execute a command, perform the following tasks:
                              1. Create a variable as type acl_cmd_struct.
                              2. Specify the command you want to execute with the appropriate
                                 parameters.
                              3. Send the command through a read() or write() operation.
aclpci.c                      File that implements the basic structures and functions that a Linux
                              kernel driver requires (for example, the init and remove functions,
                              probe function, and functions that handle interrupts).
aclpci_fileio.c               File that implements the file I/O operations of the kernel driver.
                              The Linux kernel driver available with the s5_net Reference Platform
                              supports four file I/O operations, namely open(), close(), read(),
                              and write(). Implementation of these file I/O operations allows the
                              user application to access the kernel driver via file I/O system
                              calls (open/read/write/close).
aclpci_cmd.c                  File that implements the special commands defined in the
                              pcie_linux_driver_exports.h file. Examples of these special commands
                              include SAVE_PCI_CONTROL_REGS, LOAD_PCI_CONTROL_REGS, DO_CVP, and
                              GET_PCI_SLOT_INFO.
aclpci_dma.c                  File that implements direct memory access (DMA)-related routines in
                              the kernel driver.
                              Refer to the SG-DMA section for more information.
aclpci_cvp.c                  File that implements Configuration via Protocol (CvP)-related
                              routines in the kernel driver.
                              Refer to the CvP section for more information.
aclpci_queue.c                File that implements a queue structure for use in the kernel driver.
                              Such a queue structure eases programming.
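The three-step command pattern described for pcie_linux_driver_exports.h (create the command structure, fill it in, send it through read() or write()) can be modeled as follows. This is a simulation only. The two-field layout in ACL_CMD_FORMAT, the numeric command codes, and the FakeDriver class are all hypothetical; the real structure layout and command values are defined in the header file.

```python
import struct

# Hypothetical numeric codes; only the command names come from the driver sources.
CMD_GET_PCI_SLOT_INFO = 1
CMD_DO_CVP = 2

# Hypothetical layout: command code (uint32) followed by payload length (uint32).
ACL_CMD_FORMAT = "<II"

def pack_cmd(command, payload_length):
    """Steps 1 and 2: create the command structure and fill in its fields."""
    return struct.pack(ACL_CMD_FORMAT, command, payload_length)

class FakeDriver:
    """Stand-in for the character device; it decodes and records commands."""

    def __init__(self):
        self.log = []

    def write(self, cmd_bytes, payload=b""):
        """Step 3: the command travels through an ordinary write() call."""
        command, length = struct.unpack(ACL_CMD_FORMAT, cmd_bytes)
        if length != len(payload):
            raise ValueError("payload length does not match the command header")
        self.log.append((command, payload))
        return length

driver = FakeDriver()
image = b"\x00" * 16                      # placeholder bytes, not a real bitstream
driver.write(pack_cmd(CMD_DO_CVP, len(image)), image)
```

The point of the pattern is that a single pair of file operations can carry many distinct commands, because the structure, not the system call, selects the operation.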
Related Information
• aocl install on page 1-37
• SG-DMA on page 1-16
• CvP on page 1-30
• Jungo Connectivity Ltd. website
SG-DMA
For more information on scatter-gather direct memory access (SG-DMA), visit the Modular SG-DMA
page on the Altera Wiki website.
Hardware
The acl_dma_core.qsys file encapsulates and parameterizes the Modular SG-DMA hardware. It presents
slave ports for the control and status registers (dma_csr) and the descriptors (dma_descriptors). It also
provides separate masters for read and write operations. The acl_dma.qsys file is the component
instantiated in the board.qsys system. It adds the following features:
• An address span extender for non-DMA memory accesses
• A merged read/write master
The merged read/write master issues constant bursts of size 16. As a result, it suffers 1/16 efficiency
degradation from sharing the time interface. However, because the bandwidth of this unit exceeds the
bandwidth of the PCI Express (PCIe) connection by more than this amount, there is no observable
host-to-memory bandwidth degradation.
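A rough calculation shows why the 1/16 loss is not visible to the host. The figures below rest on assumptions: the merged master is taken to be 512 bits wide at 200 MHz (the DMA width and clock quoted in the Pipelining section), and the PCIe figure is the Gen2 x8 line rate after 8b/10b encoding, before any packet overhead.

```python
def gbytes_per_s(width_bits, clock_mhz, efficiency=1.0):
    """Peak throughput of a bus in gigabytes per second (1 GB = 1e9 bytes)."""
    return width_bits / 8 * clock_mhz * 1e6 * efficiency / 1e9

# Assumed DMA master: 512 bits wide at 200 MHz.
dma_peak = gbytes_per_s(512, 200)               # about 12.8 GB/s
dma_merged = gbytes_per_s(512, 200, 15 / 16)    # about 12.0 GB/s after the 1/16 loss

# PCIe Gen2 x8: 5.0 Gbps per lane, 8 lanes, 8b/10b encoding, 8 bits per byte.
pcie_gen2_x8 = 5.0 * 8 * (8 / 10) / 8           # about 4.0 GB/s per direction

print(f"DMA after merge: {dma_merged:.1f} GB/s, PCIe limit: {pcie_gen2_x8:.1f} GB/s")
```

Under these assumptions the merged master still offers roughly three times the bandwidth of the PCIe link, so the link remains the bottleneck.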
Software
When the memory mapped devices (MMD) layer receives a request for data transfer, it uses DMA when both
of the following conditions are true:
1. The transfer size is greater than 1024 bytes.
2. The starting addresses of both the host buffer and the device offset are 64-byte aligned.
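The two conditions translate directly into a predicate. The sketch below is illustrative; the function and constant names are hypothetical and do not appear in the MMD source.

```python
DMA_MIN_BYTES = 1024   # transfers must be larger than this to use DMA
DMA_ALIGNMENT = 64     # required alignment of both starting addresses, in bytes

def should_use_dma(size, host_addr, device_offset):
    """Return True when a transfer meets both DMA conditions."""
    aligned = host_addr % DMA_ALIGNMENT == 0 and device_offset % DMA_ALIGNMENT == 0
    return size > DMA_MIN_BYTES and aligned
```

For example, a 4096-byte transfer from host address 0x1000 to device offset 0x40 qualifies, while the same transfer to device offset 0x44 does not.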
Perform the following tasks to carry out a DMA transfer:
1. Check if there are remaining bytes to be sent.
2. Unpin the memory from the previous transfer.
3. Pin the memory for the new transfer.
4. Set up the Address Translation Tables on the PCIe.
5. Create and send the DMA descriptor.
6. Wait until the DMA finishes and then repeat Step 1.
Attention: For the Stratix V Network Reference Platform (s5_net), this implementation is inside the
Linux kernel driver at <path_to_s5_net>/linux64/driver/aclpci_dma.c. For Windows systems, the
implementation is inside the MMD at <path_to_s5_net>\source\host\mmd\acl_pcie_dma_windows.cpp.
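The six DMA tasks form a loop over chunks of the transfer. The following sketch models only the control flow; it is not taken from aclpci_dma.c. The RecordingBackend class and its method names are hypothetical, the chunk size is a free parameter, and the final unpin after the loop is an assumption about cleanup.

```python
class RecordingBackend:
    """Stand-in for the driver internals; it only records the order of calls."""

    def __init__(self):
        self.calls = []

    def pin(self, offset, length):
        self.calls.append("pin")
        return (offset, length)

    def unpin(self, region):
        self.calls.append("unpin")

    def setup_att(self, region):
        self.calls.append("setup_att")

    def send_descriptor(self, region):
        self.calls.append("send_descriptor")

    def wait_done(self):
        self.calls.append("wait_done")

def dma_transfer(total_bytes, max_chunk, backend):
    """Move total_bytes in chunks of at most max_chunk bytes."""
    sent = 0
    pinned = None
    while sent < total_bytes:                  # Step 1: bytes remaining?
        if pinned is not None:
            backend.unpin(pinned)              # Step 2: unpin the previous chunk
        length = min(max_chunk, total_bytes - sent)
        pinned = backend.pin(sent, length)     # Step 3: pin the new chunk
        backend.setup_att(pinned)              # Step 4: address translation tables
        backend.send_descriptor(pinned)        # Step 5: create and send descriptor
        backend.wait_done()                    # Step 6: wait, then loop again
        sent += length
    if pinned is not None:
        backend.unpin(pinned)                  # assumed cleanup of the last chunk
    return sent
```

Calling dma_transfer(10, 4, RecordingBackend()) walks through three chunks of 4, 4, and 2 bytes, pinning and unpinning each one in turn.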
Related Information
Modular SG-DMA page on Altera Wiki
DDR3 as Global Memory for OpenCL Applications
The Stratix V Network Reference Platform (s5_net) targets a computing card that has two banks of 4
gigabytes (GB) x72 DDR3-160 SDRAM.
You must complete the tasks below to access these banks as global memory for OpenCL
applications.
For more information on the Altera DDR3 Uniphy intellectual property (IP), refer to the External
Memory Interface Handbook.
DDR3 IP Instantiation on page 1-17
DDR3 Connection to PCIe Host on page 1-17
DDR3 Connection to OpenCL Kernel on page 1-18
Related Information
External Memory Interface Handbook
DDR3 IP Instantiation
The Stratix V Network Reference Platform (s5_net) uses two Altera DDR3 controllers with Uniphy
intellectual property (IP) to communicate with the physical memories.
The table below describes the IP configuration settings:
IP Parameter                       Configuration Setting
Timing Parameters                  As per the computing card's data specifications.
Phase-locked loop (PLL)/           The s5_net Reference Platform is configured such that both memory
delay-locked loop (DLL) Sharing    controllers can share the same PLL and DLL.
Avalon Width Power of 2            Currently, OpenCL does not support non-power-of-2 bus widths. As a
                                   result, the s5_net Reference Platform uses the option that forces
                                   the DDR3 controller to power of 2. Use the additional pins of this
                                   x72 core for error checking between the memory controller and the
                                   physical module.
Byte Enable Support                OpenCL requires byte-level granularity to all memories; therefore,
                                   byte enable support is necessary in the core.
Performance                        Enabling reordering and a deeper lookahead might provide increased
                                   bandwidth for some OpenCL kernels. Adjust these and other
                                   parameters as needed for a target application.
Debug                              Debug is disabled for production.
After you instantiate the Uniphy IP, you typically need to run the <variation_name>_pin_assignments.tcl Tcl
script to add additional constraints to the Quartus II project. For more information on this process, refer
to the External Memory Interface Handbook.
Related Information
External Memory Interface Handbook
DDR3 Connection to PCIe Host
Connect all global memory systems to the host via the OpenCL Memory Bank Divider component.
The Altera DDR3 Uniphy intellectual property (IP) core has two banks whose width and address
configurations match those of the DDR3 SDRAM. Altera tunes the other parameters such as burst size,
pending reads, and pipelining. These parameters are customizable for an end application or board design.
The Avalon master interfaces from the bank divider connect to their respective memory controllers. The
Avalon slave connects to the PCI Express (PCIe) and direct memory access (DMA) cores. Implementations
of appropriate clock crossing and pipelining are based on the design floorplan and clock domains
specific to the computing card. The Altera SDK for OpenCL Custom Platform Toolkit User Guide specifies
the connection details of the snoop and memorg ports.
Important: Instruct the host to check for the successful calibration of the memory controller.
The board.qsys system uses a custom IP component named Uniphy Status to AVS to aggregate different
Uniphy status conduits into a single Avalon slave port named s. This slave connects to the
pipe_stage_host_ctrl component so that the PCIe host can access it.
Related Information
Altera SDK for OpenCL Custom Platform Toolkit User Guide
DDR3 Connection to OpenCL Kernel
The OpenCL kernel needs to connect directly to the memory controller via a FIFO-based clock crosser.
A clock crosser is necessary because the kernel interface for the compiler must be clocked in the kernel
clock domain. In addition, the width, address width, and burst size characteristics of the kernel interface
must match those specified in the bank divider connecting to the host. Appropriate pipelining also exists
between the clock crosser and the memory controller.
QDRII as Heterogeneous Memory for OpenCL Applications
The OpenCL heterogeneous memory feature allows Altera Software Development Kit (SDK) for OpenCL
(AOCL) users to take advantage of the nonuniform memory architecture in a Custom Platform.
An AOCL Custom Platform groups memories with similar characteristics into a single global memory
system. Each Custom Platform has a designated default global memory system. In the case of the Stratix V
Network Reference Platform (s5_net), the default global memory system consists of the two DDR3
memory banks. The default global memory system must start at base address 0 from the host's
perspective. Both the hardware design and the board_spec.xml file in the Custom Platform reflect this
address assignment. In the s5_net Reference Platform, the DDR global memory system is named DDR.
In addition to the DDR global memory system, the computing card that the s5_net Reference Platform
targets includes four banks of quad data rate (QDR) memory. These four banks belong to a global
memory system named QDR. AOCL users can only allocate memory in the QDR global memory system
using an attribute on their global memory buffers. All addressable global memory must be contiguous
from the host's perspective; therefore, the QDR memory base address must start where the DDR memory
ends.
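The contiguity rule fixes the QDR base address once the DDR size is known. The sketch below computes base addresses for global memory systems laid out back to back. The two 4 GB DDR3 banks come from the text (treated here as 4 GiB each); the QDR bank size of 16 MiB is a placeholder assumption, and the function name contiguous_map is hypothetical.

```python
GIB = 1 << 30
MIB = 1 << 20

def contiguous_map(systems):
    """Assign base addresses so that each system starts where the previous one ends.

    systems is a list of (name, [bank sizes in bytes]) in address order, with the
    default global memory system first so that it starts at base address 0.
    """
    base = 0
    layout = {}
    for name, bank_sizes in systems:
        layout[name] = base
        base += sum(bank_sizes)
    return layout

layout = contiguous_map([
    ("DDR", [4 * GIB, 4 * GIB]),   # two DDR3 banks, from the text
    ("QDR", [16 * MIB] * 4),       # four QDR banks; the bank size is a placeholder
])
print({name: hex(base) for name, base in layout.items()})
```

With these inputs DDR starts at address 0 and QDR starts at 0x200000000, which is the first address after the 8 GiB of DDR memory.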
For more information on the Altera QDR Uniphy intellectual property (IP), refer to the External Memory
Interface Handbook.
The procedure for implementing the QDR subsystem is similar to the one outlined in the Developing Your
Custom Platform section. Below is a list of high-level tasks:
1. Instantiate and parameterize the Uniphy memory controllers.
2. Connect the Uniphy memory controllers to the host via a new OpenCL Memory Bank Divider
instance.
3. Connect the Uniphy memory controller to the Uniphy Status to AVS component.
4. Export the Uniphy memory controller to the OpenCL kernel via clock-crossing bridges.
Below are special QDR subsystem design considerations for the s5_net Reference Platform:
• QDR provides separate read and write ports.
By default, the OpenCL Memory Bank Divider produces a single bidirectional master for each memory
controller. In Qsys, select the Separate read/write ports option to support separate read and write
masters. With respect to the kernel, instantiate clock crossers and separate read and write interfaces.
• The 275 MHz QDR afi clock and 4-to-1 multiplexing in the bank divider make it difficult to meet
timing.
To achieve timing closure more robustly, open the qdr.qsys file in Qsys, and select the option to
pipeline the outputs in the OpenCL Memory Bank Divider memory_bank_divider_1. Doing so adds a
pipeline stage for each master that the bank divider creates.
Related Information
• Developing Your Custom Platform on page 1-4
• External Memory Interface Handbook
Host Connection to OpenCL Kernels
The PCI Express (PCIe) host needs to pass commands and arguments to the OpenCL kernels via the
control register access (CRA) Avalon slave port that each OpenCL kernel generates. The OpenCL Kernel
Interface component exports an Avalon master interface (kernel_cra) that connects to this slave port.
The OpenCL Kernel Interface component also generates the kernel reset (kernel_reset) that resets all
logic in the kernel clock domain.
The Stratix V Network Reference Platform (s5_net) instantiates the OpenCL Kernel Interface component
and sets the number of global memory systems parameter to 2. The parameter setting is 2 because the
s5_net Reference Platform has DDR and QDR memory. Below is a list of connection settings in the s5_net
Reference Platform:
• For the default DDR memory, the generated memorg_host0x018 conduit must connect to the DDR
bank divider (memory_bank_divider_0).
• For the default DDR memory, the config_addr attribute in the board_spec.xml file must be set to
0x018.
• For the QDR memory, the memorg_host0x100 conduit must connect to the QDR bank divider
(memory_bank_divider_1).
• For the QDR memory, the config_addr attribute in the board_spec.xml file must be set to 0x100.
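The four connection settings pair each global memory system with a config_addr value, a conduit, and a bank divider. A small consistency check can make the pairing explicit. The dictionary structure and the conduit_name helper are hypothetical; the naming pattern is inferred from the two conduit names listed above and is not a documented rule.

```python
# Values taken from the connection settings listed above.
MEMORY_SYSTEMS = {
    "DDR": {
        "config_addr": 0x018,
        "conduit": "memorg_host0x018",
        "bank_divider": "memory_bank_divider_0",
    },
    "QDR": {
        "config_addr": 0x100,
        "conduit": "memorg_host0x100",
        "bank_divider": "memory_bank_divider_1",
    },
}

def conduit_name(config_addr):
    """Build the conduit name that appears to correspond to a config_addr value."""
    return f"memorg_host0x{config_addr:03x}"

def inconsistent_systems(systems):
    """Return the names of systems whose conduit does not match their config_addr."""
    return [
        name
        for name, entry in systems.items()
        if entry["conduit"] != conduit_name(entry["config_addr"])
    ]
```

An empty result from inconsistent_systems(MEMORY_SYSTEMS) indicates that the board_spec.xml values and the Qsys connections agree.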
Implementation of UDP Cores as OpenCL Channels
OpenCL kernels can communicate directly with I/O using the Altera Software Development Kit (SDK) for
OpenCL (AOCL) channels extension.
For the Stratix V Network Reference Platform (s5_net), Altera uses the PLDA quick user datagram
protocol (QuickUDP) intellectual property (IP) core to implement a full UDP stack on top of the available
10 gigabits per second Ethernet (GbE) channels on the card. QuickUDP provides an Avalon Streaming
(Avalon-ST) interface that can connect directly to the OpenCL kernel, allowing it to send and receive
UDP network traffic without concern for UDP or lower level protocols.
QuickUDP IP Instantiation on page 1-20
QuickUDP Configuration via PCIe-Based Host on page 1-20
QuickUDP Connection to OpenCL Kernel on page 1-20
QuickUDP IP Instantiation
The Stratix V Network Reference Platform (s5_net) instantiates two quick user datagram protocol
(QuickUDP) intellectual property (IP) cores because the computing card it targets has two 10 gigabits per
second Ethernet (GbE) channels.
The two 10 Gigabit Media Independent Interface (XGMII) interfaces from these cores connect to a single
10GBASE-R PHY with two channels. Some parameters such as the maximum transmission unit (MTU) and the
number of sessions supported are passed in the Verilog instantiation of this core in the quickudp_wrapper.v
file, but most parameters are accessible via QuickUDP's Avalon Memory-Mapped (Avalon-MM) slave
interface.
QuickUDP Configuration via PCIe-Based Host
Altera Software Development Kit (SDK) for OpenCL (AOCL) users need to set their own parameters such
as media access control (MAC), intellectual property (IP) address, ports, and destinations. As a result,
Altera provides access to the quick user datagram protocol (QuickUDP) configuration space to the host
over PCI Express (PCIe) by connecting pipe_stage_host_ctrl to the config_udp0 and config_udp1
interfaces of the udp.qsys subsystem. AOCL users can then configure the QuickUDP settings in their host
software using the application programming interface (API) in the aocl net.h header file available in the
Stratix V Network Reference Platform (s5_net).
QuickUDP Connection to OpenCL Kernel
Each quick user datagram protocol (QuickUDP) core produces a read and write stream for a total of four
Altera Software Development Kit (SDK) for OpenCL (AOCL) channels available to the kernel. These
streams cross into the kernel clk domain and are listed in the board_spec.xml file.
Attention: AOCL supports only basic Avalon Streaming (Avalon-ST) with no packet support.
QuickUDP provides an Avalon-ST interface with full packet support along with additional metadata
about the payload. Because OpenCL does not support the packet extensions, the packetization signals are
converted to data, and the AOCL user's application must handle all packetization.
QuickUDP also provides additional metadata that the application can use. For a full explanation of these
signals, refer to the QuickUDP documentation on the PLDA website. In the Stratix V Network Reference
Platform (s5_net), the payload, packetization signals, and metadata are concatenated into a single
256-bit-wide vector exported as an AOCL channel.
Use the information in the following table to access the desired components of the channel's data:
Table 1-1: Bit Mapping for the 256-Bit AOCL Channel to QuickUDP
Bit Range    Name           Description
[0:127]      payload        Packet payload
[128]        sop            Start of packet signal
[129]        eop            End of packet signal
[130:133]    empty          On EOP, this field indicates how many bytes are unused
[134:149]    payload_size   Size of the packet
                            Set to 0 for outbound packets
[150:181]    rem_ip         Indicates the remote IP for incoming packets
[182:197]    rem_port       Indicates the remote port for incoming packets
[198:205]    channel        Avalon channel
[206]        error          Avalon error signal
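The bit mapping in Table 1-1 can be exercised with a small pack and unpack routine. The sketch treats the 256-bit channel word as a Python integer and assumes that bit 0 is the least significant bit; that ordering, and the function names, are assumptions rather than documented behavior.

```python
# Field name -> (lowest bit, highest bit), copied from Table 1-1.
FIELDS = {
    "payload": (0, 127),
    "sop": (128, 128),
    "eop": (129, 129),
    "empty": (130, 133),
    "payload_size": (134, 149),
    "rem_ip": (150, 181),
    "rem_port": (182, 197),
    "channel": (198, 205),
    "error": (206, 206),
}

def unpack_word(word):
    """Split a 256-bit channel word into its named fields."""
    out = {}
    for name, (lo, hi) in FIELDS.items():
        width = hi - lo + 1
        out[name] = (word >> lo) & ((1 << width) - 1)
    return out

def pack_word(**values):
    """Assemble a channel word from named field values; unnamed fields stay 0."""
    word = 0
    for name, value in values.items():
        lo, hi = FIELDS[name]
        width = hi - lo + 1
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lo
    return word

# Example: first beat of an incoming packet from 192.168.0.1, port 5000.
word = pack_word(payload=0xABCD, sop=1, rem_ip=0xC0A80001, rem_port=5000)
fields = unpack_word(word)
```

In an OpenCL kernel the same extraction would be done with shifts and masks on the vector type read from the channel; the Python version only illustrates the field boundaries.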
Related Information
PLDA website
FPGA System Design
To integrate all components, close timing, and deliver a post-fit netlist that functions in the hardware, you
must first address several additional FPGA design complexities. These design complexities include a
robust reset sequence, establishment of a design floorplan, global routing management, pipelining, and
intellectual property (IP) encryption. Optimizations of these design complexities occur in tandem with
one another in order to meet timing and board hardware optimization requirements.
Clocks on page 1-21
Resets on page 1-22
Floorplan on page 1-22
Global Routing on page 1-24
Pipelining on page 1-24
Encrypted IPs on page 1-25
Clocks
The following clock domains affect the Qsys hardware system:
• 250 MHz PCI Express (PCIe) clock
• 200 MHz DDR3 clock
• 275 MHz quad data rate (QDR) clock
• 156.25 MHz Ethernet clock
• 100 MHz general clock (config_clk)
• Kernel clock that can take on any clock frequency
With the exception of the kernel clock, the Stratix V Network Reference Platform (s5_net) is responsible
for closing timing of these clocks. However, because the board design must clock cross all interfaces in the
kernel clock domain, the board design also has logic in this clock domain. It is crucial that this logic is
minimal and achieves an Fmax higher than typical kernel performance.
Related Information
Guaranteed Timing Closure on page 1-25
Resets
The FPGA system design includes the implementation of the following reset drivers:
1. The por_reset_counter in the board.qsys system implements the power-on-reset. This reset issues a
reset for a number of cycles after the FPGA completes configuration. It resets all the hardware on the
device.
2. The PCI Express (PCIe) bus issues a PERST reset that resets all hardware on the device.
3. The OpenCL Kernel Interface component issues the kernel_reset that resets all logic in the kernel
clock domain.
The first two resets are combined into a single global_reset; therefore, there are only two reset sources
in the system. However, these resets are explicitly synchronized across the various clock domains, which
results in several reset interfaces.
Below are several important notes regarding resets:
1. Synchronizing resets to different clock domains might cause several high fan-out resets.
Qsys automatically synchronizes resets to the clock domain of each connected component. In doing
so, Qsys instantiates new reset controllers with derived names that might change when the design
changes. This name change makes it difficult to make and maintain global clock assignments to some
of the resets. As a result, for each clock domain, there are explicit reset controllers. For example,
global_reset drives reset_controller_pcie and reset_controller_ddr3a; however, they are
synchronized to the PCIe and DDR3 clock domains, respectively. Because both of these resets have
high fan-out signals, they are assigned to global routing in the Quartus II Settings File (.qsf).
2. Resets and clocks must work together to propagate reset to all logic.
Resetting a circuit in a given clock domain involves asserting the reset over a number of clock cycles.
However, your design may apply resets to the phase-locked loops (PLLs) that generate the clocks for a
given clock domain. This means a clock domain can hold in reset without receiving the clock edge
necessary for synchronous resets. In addition, a clock holding in reset might prevent propagation of a
reset signal because it is synchronized to and from that clock domain. Avoid such situations by
ensuring that your design satisfies the following criteria:
• Generate the global_reset signal off the free-running config_clk.
• Never reset the Uniphy controllers.
• Clock the reset controller for the Ethernet PHYs by its free-running reference clock.
3. Apply resets to both reset interfaces of a clock-crossing bridge or FIFO.
FIFO content corruption might occur if only part of a clock-crossing bridge or dual clock FIFO is
reset. These components typically provide a reset input for each clock domain; therefore, reset both
interfaces or none at all. For example, in the Stratix V Network Reference Platform (s5_net),
kernel_reset resets all the kernel clock-crossing bridges between DDR, quad data rate (QDR), and
user datagram protocol (UDP) on both the m0_reset and s0_reset interfaces.
Floorplan
The Altera Software Development Kit (SDK) for OpenCL (AOCL) requires all board logic to be
constrained along the edges of the FPGA device. This constraint provides a large contiguous space for
OpenCL kernel implementation, which generally leads to better circuit performance (Fmax).
The figure below shows the floorplan that the Stratix V Network Reference Platform (s5_net) uses:
This floorplan shows that all board interface logic is along the edges of the device. The logic in the center
is the OpenCL kernel. At the bottom of the device are the PCI Express (PCIe) and the two DDR3 cores.
The quad data rate (QDR) controllers are along the top of the device, and the two user datagram protocol
(UDP) stacks are on the right. The Stratix V global clock buffers are all around the middle of the device.
This floorplan accommodates access to the global clock buffers by extending the bottom region edges up the
left and right sides. This allows the placement of reset and other global route drivers in the bottom region
to be near the global clock buffer.
You can derive a floorplan for any board by following these steps:
1. Compile the design without any region constraints.
2. Examine the placement location of each of the intellectual property (IP) cores in the Chip Planner.
3. Apply LogicLock regions to push the IP cores to the edges of the device.
Global Routing
FPGAs have dedicated clock trees that distribute high fan-out signals to various sections of the device.
In the FPGA system that the Stratix V Network Reference Platform (s5_net) targets, a global route can
distribute high fan-out signals in the following ways:
1. Regional—Across any quadrant of the device
2. Dual-regional—Across any half of the device
3. Global—Across the entire device
Because there is no restriction on the placement location of the OpenCL kernel on the device, the kernel
clocks and kernel reset must perform Global distribution.
The DDR3 clock clocks all direct memory access (DMA) logic and carries data into the quad data rate
(QDR) region at the top of the device. As a result, this clock and the reset synchronized to this clock
domain also perform Global distribution.
Pipelining
For information on implementing pipelining in Qsys, refer to the Qsys Interconnect chapter of the
Quartus II Handbook.
Below are some specific examples of pipelining:
• Signals that traverse long distances because of the floorplan require additional pipelining.
The direct memory access (DMA) at the bottom of the FPGA must connect to the quad data rate
(QDR) memory at the top of the FPGA. QDR provides a 64-bit-wide interface at 275 MHz. The DMA
is 512 bits wide at 200 MHz. The DMA connection is converted to a 128-bit-wide, 200-MHz
interface in the pipe_stage_qdr_host_0 module, which is pipelined in both command and response.
This narrower bus is used to cross from the bottom region to the QDR region at the top, where it goes
directly into pipe_stage_qdr_host_1. The pipe_stage_qdr_host_1 module is configured in the
same way to ensure no logic insertion between the two regions. This effectively implements pipelined
routing in the 200-MHz clock domain, which then crosses into the 275-MHz clock domain. Finally, the
bus width is adapted to ensure that this entire connection can still fully saturate the QDR
bandwidth.
• The OpenCL kernel might need to connect to both the DDR interfaces at the bottom of the device and the
QDR kernel interfaces at the top of the device.
The kernel interfaces for QDR memory are located in the top region of the FPGA; however, the host
and DDR connections originate from the bottom region of the FPGA. This can force the kernel to
stretch across the vertical span of the device, resulting in slower Fmax. To alleviate this, enable
additional pipelining in the kernel when connecting to QDR memory by adding an addpipe attribute
to each QDR interface element in the board_spec.xml file and assigning it a value of 1.
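The width adaptation in the DMA-to-QDR example can be checked with simple arithmetic. The following Python sketch compares the peak bandwidth (bus width times clock frequency) of each stage, using only the widths and frequencies quoted above. The function name bandwidth_gbps is hypothetical, and the calculation ignores protocol overhead and any separate read and write ports, so treat it as a rough sanity check rather than a throughput guarantee.

```python
def bandwidth_gbps(width_bits, clock_mhz):
    """Peak bandwidth in Gbit/s for a bus of the given width and clock rate."""
    return width_bits * clock_mhz / 1000.0


# Widths and clock rates quoted in the pipelining example.
dma_gbps = bandwidth_gbps(512, 200)        # DMA at the bottom of the device
crossing_gbps = bandwidth_gbps(128, 200)   # pipe_stage_qdr_host_0 / _1 crossing
qdr_gbps = bandwidth_gbps(64, 275)         # QDR interface at the top

for name, gbps in [("DMA", dma_gbps), ("crossing", crossing_gbps), ("QDR", qdr_gbps)]:
    print(f"{name}: {gbps:.1f} Gbit/s")

# The narrowed crossing can saturate QDR only if it is at least as fast as QDR.
crossing_saturates_qdr = crossing_gbps >= qdr_gbps
print("Crossing can saturate QDR:", crossing_saturates_qdr)
```

Under these assumptions, the 128-bit crossing offers 25.6 Gbit/s against 17.6 Gbit/s for QDR, so the narrower bus does not become the bottleneck.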
Related Information
Quartus II Handbook: Qsys Interconnect
Encrypted IPs
The Stratix V Network Reference Platform (s5_net) incorporates two encrypted intellectual property (IP)
cores. They are the PLDA quick user datagram protocol (QuickUDP) IP and the CPLD_bridge IP. The
CPLD_bridge IP is for communication between the FPGA and external CPLD.
Incorporation of these IP cores in the s5_net Reference Platform demonstrates that it is feasible to use the
Altera encryption infrastructure to encrypt IPs within a Custom Platform.
Contact your field application engineer for more information on how to encrypt IP for use with the
Quartus II software.
Guaranteed Timing Closure
One of the key features of the Altera Software Development Kit (SDK) for OpenCL (AOCL) is that it
abstracts away hardware details, such as timing closure, for software developers. Implementation of the
AOCL guaranteed timing closure feature is a joint responsibility between the AOCL and you, the
Custom Platform designer. The AOCL provides intellectual property (IP) to generate the kernel clock,
along with a post-flow script that ensures this clock is configured with a safe operating frequency
confirmed by timing analysis. On the other hand, you provide a Custom Platform that imports a post-fit
netlist that has already achieved timing closure on all nonkernel clocks.
Supply the Kernel Clock on page 1-25
Guarantee Kernel Clock Timing on page 1-27
Provide a Timing-Closed Post-Fit Netlist on page 1-27
Supply the Kernel Clock
The OpenCL Kernel Clock Generator component provides the kernel clock and its 2x variant. For more
information on the OpenCL Kernel Clock Generator, refer to the Altera SDK for OpenCL Custom
Platform Toolkit User Guide.
The figure below shows the OpenCL Kernel Clock Generator parameter editor GUI. It shows where the
REF_CLK_RATE parameter specifies the frequency of the reference clock that connects to the
pll_refclk. In this case, the frequency is 100 MHz.
The KERNEL_TARGET_CLOCK_RATE parameter specifies the frequency that the Quartus II software
attempts to achieve during compilation. The board hardware contains some logic that the kernel clock
clocks; at a minimum, it includes the clock crossing hardware. To prevent this logic from limiting a
kernel's Fmax, the KERNEL_TARGET_CLOCK_RATE must be higher than the frequency that a simple
kernel can achieve on your device. For the Stratix V C2 device that the Stratix V Network Reference
Platform (s5_net) targets, the KERNEL_TARGET_CLOCK_RATE is 380 MHz.
Caution: When developing a Custom Platform, a high target Fmax might cause difficulty in achieving
timing closure.
When developing your Custom Platform and attempting to close timing, add an overriding Synopsys
Design Constraints (SDC) definition to relax the timing of the kernel. The following code example from
the top_post.sdc file applies a 5 ns (200 MHz) maximum delay constraint on the OpenCL kernel during
base revision compilations:
# Skip this override during synthesis (quartus_map).
if {![string equal $::TimeQuestInfo(nameofexecutable) "quartus_map"]} {
    # Relax the kernel timing only for base revision compilations.
    if {[get_current_revision] eq "base"} {
        post_message -type critical_warning "Compiling with slowed OpenCL Kernel clock."
        # Apply the 5 ns override everywhere except in timing analysis (quartus_sta).
        if {![string equal $::TimeQuestInfo(nameofexecutable) "quartus_sta"]} {
            set kernel_keepers [get_keepers system_inst\|kernel_system\|*]
            set_max_delay 5 -from $kernel_keepers -to $kernel_keepers
        }
    }
}
Caution: Applying this 5 ns SDC definition constrains both the kernel clock and the 2x clock to 5 ns,
resulting in significantly slower kernel speeds.
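To put the two numbers in this section side by side, the following Python sketch converts between clock frequency and period. It uses only the 380 MHz target and the 5 ns override quoted above; the helper names period_ns and freq_mhz are hypothetical and exist only for this illustration.

```python
def period_ns(frequency_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / frequency_mhz


def freq_mhz(period_in_ns):
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1000.0 / period_in_ns


target_period = period_ns(380.0)   # KERNEL_TARGET_CLOCK_RATE for s5_net
relaxed_freq = freq_mhz(5.0)       # set_max_delay 5 in top_post.sdc
relaxation = 5.0 / target_period   # how much looser the override is

print(f"380 MHz target period: {target_period:.2f} ns")
print(f"5 ns maximum delay:    {relaxed_freq:.0f} MHz")
print(f"Relaxation factor:     {relaxation:.2f}x")
```

The override therefore allows paths roughly 1.9 times longer than the 380 MHz target period of about 2.63 ns, which is why it eases timing closure at the cost of kernel speed.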
Related Information
Altera SDK for OpenCL Custom Platform Toolkit User Guide
Guarantee Kernel Clock Timing
The OpenCL Kernel Clock Generator works together with a script that the Quartus II database interface
executable (quartus_cdb) runs after every Quartus II software compilation as a post-flow script.
The following Quartus II Settings File (.qsf) setting invokes the scripts/post_flow.tcl Tcl script in the
Stratix V Network Reference Platform (s5_net) after every Quartus II software compilation using quartus_cdb:
set_global_assignment -name POST_FLOW_SCRIPT_FILE "quartus_cdb:scripts/post_flow.tcl"
Within this script, the following statement calls the OpenCL script that determines a functional frequency
and configures the kernel clock to it:
source $::env(ALTERAOCLSDKROOT)/ip/board/bsp/adjust_plls.tcl
where ALTERAOCLSDKROOT points to the path to the Altera Software Development Kit (SDK) for
OpenCL (AOCL) installation.
Note: Ensure that this flow executes during every Quartus II software compilation of an OpenCL kernel.
Provide a Timing-Closed Post-Fit Netlist
All nodes clocked by nonkernel clocks must have their placement and routing imported from a post-fit
netlist that has closed timing already.
Altera® provides several mechanisms for preserving the placement and routing of some previously
compiled logic and for importing it into a new compilation. For the Stratix V Network Reference Platform
(s5_net), the following features are desirable from such a flow:
1. Timing preservation
2. Version compatibility to allow the import of the netlist into a newer Quartus II software version
3. Strict preservation of the FPGA periphery to guarantee successful Configuration via Protocol (CvP)
programming
The Altera CvP compilation flow for Stratix V provides all of these features through an exported .personax
file for the top-level partition. This means the s5_net Reference Platform is configured with the
project revisions and partitions necessary for implementing this flow. By default, the Altera Software
Development Kit (SDK) for OpenCL (AOCL) invokes the Quartus II software on revision top. This
revision is configured to import the persona/base.root_partition.personax file, which has been precompiled
and exported from a base revision compilation.
For more information, refer to the CvP section.
Related Information
CvP on page 1-30
Addition of Timing Constraints
A Custom Platform must apply the correct timing constraints to the Quartus II project. In the Stratix V
Network Reference Platform (s5_net), the top.sdc file contains all timing constraints applicable before IP
instantiation in Qsys. The top_post.sdc file contains timing constraints applicable after Qsys. The order of
the application is based on the order of appearance of the top.sdc and top_post.sdc in the top.qsf file.
One noteworthy constraint in the s5_net Reference Platform is the multicycle constraint for the kernel
reset in the top_post.sdc file. Using global routing saves routing resources and provides more balanced
skew. However, the delay across the global route might cause recovery timing issues that limit kernel clock
speed. Although Altera requires all logic to exit reset mode in the same clock cycle, it is not necessary for
that to happen in the same clock cycle as reset deassertion. Therefore, Altera adds a multicycle setup
constraint of 2 and multicycle hold of 1 to the kernel reset. Without these additions, even with reset
drivers located directly adjacent to global clock buffers, the highest kernel Fmax Altera achieves is around
320 MHz.
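To see why the multicycle constraint helps, consider a simplified model in which the recovery check gives the reset N kernel clock periods to arrive, where N is the multicycle setup value. The Python sketch below is an illustration under stated assumptions: it ignores clock skew, clock uncertainty, and register recovery time, and the 3.125 ns reset delay is a hypothetical figure chosen to equal one period at 320 MHz, not a measured value. The function names are also hypothetical.

```python
def recovery_budget_ns(kernel_mhz, multicycle_setup=1):
    """Time available for reset to propagate, in this simplified model."""
    return multicycle_setup * 1000.0 / kernel_mhz


def max_kernel_mhz(reset_delay_ns, multicycle_setup=1):
    """Highest kernel clock at which the reset delay still fits the budget."""
    return multicycle_setup * 1000.0 / reset_delay_ns


reset_delay_ns = 3.125  # hypothetical global-route reset delay

single_cycle_limit = max_kernel_mhz(reset_delay_ns, multicycle_setup=1)
two_cycle_limit = max_kernel_mhz(reset_delay_ns, multicycle_setup=2)
budget_at_target = recovery_budget_ns(380.0, multicycle_setup=2)

print(f"Limit with single-cycle reset: {single_cycle_limit:.0f} MHz")
print(f"Limit with multicycle setup 2: {two_cycle_limit:.0f} MHz")
print(f"Budget at 380 MHz, setup 2:    {budget_at_target:.2f} ns")
```

In this model, a multicycle setup of 2 doubles the time available for reset propagation, so a reset path that would cap the kernel clock near 320 MHz no longer limits a 380 MHz target.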
Connection to the Altera SDK for OpenCL
A Custom Platform requires a board_env.xml file to describe its general contents to the Altera Offline
Compiler (AOC). For each hardware design, your Custom Platform also requires a board_spec.xml file that
describes the hardware.
The following sections describe the implementation of these files for the Stratix V Network Reference
Platform (s5_net).
Describe s5_net to the AOCL
The board_env.xml file describes a Custom Platform to the Altera Software Development Kit (SDK) for
OpenCL (AOCL). Details of each field in the board_env.xml file are available in the Altera SDK for OpenCL
Custom Platform Toolkit User Guide.
In the Stratix V Network Reference Platform (s5_net), Altera uses the \bin directory for Windows
dynamic link libraries (DLLs), \lib directory for delivering libraries, and \libexec directory for delivering
the AOCL utility executables. This directory structure allows the PATH environment variable to point to
the location of the DLLs (that is, \bin) in isolation from the AOCL utility executables.
The s5_net Reference Platform also supplies an end-user application programming interface (API) for
user datagram protocol (UDP) initialization. The header in the include/aocl_net.h file provides this API.
The compileflags element in the board_env.xml file points the compiler to this directory when the AOCL
user invokes the aocl compile-config utility command to derive compiler arguments.
Related Information
Altera SDK for OpenCL Custom Platform Toolkit User Guide
Describe the s5_net Hardware to the AOCL
The Stratix V Network Reference Platform (s5_net) includes a board_spec.xml file that describes the
hardware to the Altera Software Development Kit (SDK) for OpenCL (AOCL) in the contexts described
below.
Device
The device section contains the name of the device model file available in the ALTERAOCLSDKROOT/share/models/dm
directory of the AOCL and in the board_spec.xml file. The used_resources element accounts
for all logic outside of the kernel. From the Partition Statistics section of the Fitter report, this is [A] - [c]
for the Top partition. In other words, the value of used_resources equals the number of adaptive
logic modules (ALMs) used in final placement minus the number of ALMs used for registers.
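The used_resources calculation described above can be written as a one-line subtraction. The Python sketch below is only an illustration: the function name is hypothetical, and the two ALM counts are made-up placeholder values, not figures from an actual Fitter report.

```python
def used_resources(alms_in_final_placement, alms_for_registers):
    """ALMs used in final placement minus ALMs used for registers."""
    if alms_for_registers > alms_in_final_placement:
        raise ValueError("register-only ALMs cannot exceed ALMs in final placement")
    return alms_in_final_placement - alms_for_registers


# Placeholder values standing in for the Top partition entries of a Fitter report.
example_value = used_resources(alms_in_final_placement=45000, alms_for_registers=3000)
print("used_resources =", example_value)
```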
Global Memory
In the board_spec.xml file, there are separate global_mem sections for DDR and quad data rate (QDR)
memory, namely DDR and QDR, respectively. Assign DDR and QDR to the name attribute of the global_mem
element. The board instance in Qsys provides all of these interfaces; therefore, board is specified in the
name attribute of all the interface elements within global_mem.
• DDR
Because DDR memory serves as the default memory for the board that the s5_net Reference Platform
targets, its address attribute begins at zero. Its config_addr is 0x018 to match the memorg conduit
used to connect to the corresponding Memory Bank Divider for DDR.
Attention: The width and burst sizes must match the parameters in the Memory Bank Divider for
DDR (memory_bank_divider_0).
• QDR
The QDR section begins its address attribute directly after the DDR address space stops, and its
config_addr is 0x100, as indicated in the name of its memorg conduit. Because QDR provides separate
read and write ports, each of these ports is described to the Altera Offline Compiler (AOC) in
separate port attributes.
As discussed in the Pipelining section, the addpipe option is necessary because the QDR kernel
interfaces are at the top of the FPGA and the rest of the kernel interface signals are along the bottom of
the device.
Attention: The width and burst sizes must match the parameters in the Memory Bank Divider for
QDR (memory_bank_divider_1).
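The address rule described above (DDR starts at zero, and QDR starts directly after the DDR address space ends) can be sketched as a running sum. In the Python example below, the function name and both memory sizes are hypothetical values chosen for illustration; take the real sizes from your board and its Memory Bank Divider settings.

```python
def layout_global_memory(regions):
    """Assign contiguous address ranges to an ordered list of (name, size_bytes).

    The first region starts at address zero and each following region starts
    directly after the previous one ends. Returns {name: (base, last_address)}.
    """
    layout = {}
    next_base = 0
    for name, size_bytes in regions:
        layout[name] = (next_base, next_base + size_bytes - 1)
        next_base += size_bytes
    return layout


GIB = 1 << 30
MIB = 1 << 20

# Hypothetical sizes: the default memory (DDR) is listed first so it begins at zero.
layout = layout_global_memory([("DDR", 4 * GIB), ("QDR", 64 * MIB)])

for name, (base, last) in layout.items():
    print(f"{name}: 0x{base:09x} to 0x{last:09x}")
```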
Channels
The channels section describes the send and receive Avalon Streaming (Avalon-ST) channels for each of
the user datagram protocol (UDP) cores, for a total of four 256-bit AOCL channels. These channel
interfaces originate in the hardware from the udp_0 instance. Therefore, udp_0 is specified in all the name
attributes. The port attribute identifies the name of the Qsys interface to which a channel connects. The
chan_id attribute is the identifier with which the AOCL user declares the channel.
Interfaces
The interfaces section describes kernel clocks, reset, control register access (CRA), and snoop
interfaces. The Memory Bank Divider for the default memory (in this case, memory_bank_divider_0)
exports the snoop interface described in the interfaces section. The width of the snoop interface should
match the width of the corresponding streaming interface.
Related Information
Pipelining on page 1-24
FPGA Programming Flow
There are three ways to program the FPGA: Configuration via Protocol (CvP), Flash (if available), and
Quartus II Programmer command-line executable (quartus_pgm). To replace only the FPGA core, use
CvP programming. To replace both the FPGA periphery and the core, use either Flash programming, or
quartus_pgm programming via cables such as USB-Blaster™.
The default FPGA programming flow is to compare the periphery currently programmed on the FPGA
with the periphery of a new design. If they match, programming through CvP replaces the existing FPGA
core with the new core. If they differ, programming through external flash memory replaces the existing
FPGA periphery with the new design. The quartus_pgm programming via USB-Blaster is an old
approach. You can only use this programming method if you use a cable to connect the board and the
host computer. Cabling is a point of potential failure, and it does not scale well to large deployments. The
quartus_pgm approach remains for development and testing purposes, and for use on boards that do
not have an alternative method (such as Flash) for periphery replacement.
CvP on page 1-30
Flash on page 1-32
Defining the Contents of the fpga.bin File on page 1-35
CvP
The Configuration via Protocol (CvP) feature enables core logic update over the PCI Express (PCIe) Hard
intellectual property (HIP) for Stratix V and Arria® V GZ devices. Refer to the Configuration via Protocol
(CvP) Implementation in Altera FPGAs User Guide for more information.
To successfully program the FPGA core logic, the Fitter must ensure that all FPGA periphery programming
bits remain unchanged. The CvP revision flow expresses this hard constraint to the Fitter. Use this
CvP revision flow to achieve reliable CvP programming of the core logic. Specifically, the flow involves
the following steps:
1. Create a base revision. In the Stratix V Network Reference Platform (s5_net), the base revision is
base.qsf.
2. Create a CvP update revision.
This update revision is derived from the base revision and includes an imported .personax file.
The .personax file is created during a base revision compilation. It includes the root partition imported
from the base revision compilation. In the s5_net Reference Platform, this CvP update revision is the
top.qsf file, which becomes the project revision the Altera Offline Compiler (AOC) compiles by default.
3. Create a kernel partition in both base and update revisions (marked as having multiple personas).
4. Flash the base revision compilation programming file output as power-up configuration.
5. Use the CvP update revision compilation programming file output for all subsequent FPGA
reconfigurations.
To enable the CvP feature, perform the following tasks in Qsys:
1. In the Stratix V HIP for PCI Express GUI, under System Settings, select Enable configuration via
the PCIe link to enable CvP on the PCIe IP.
2. Include the following INI settings in the quartus.ini file:
skip_hssi_gen3_pcie_hip_cvp_enable_rule = on
skip_hssi_gen3_pcie_hip_hip_hard_reset_rule = on
skip_hssi_gen3_pcie_hip_hrdrstctrl_en_rule = on
3. In the Quartus II software, click Assignments > Device > Device and Pin Options to open the Device
and Pin Options dialog box.
   a. Under General, select Enable autonomous PCIe HIP mode.
   b. Under CvP Settings, perform the following tasks:
      i. Set Configuration via Protocol to Core update.
      ii. Select Enable CvP_CONFDONE pin.
      iii. Select Enable open drain on CvP_CONFDONE pin.
The PCIe core must have the force_hrc parameter set to a value of 1 in the board.qsys file. Because
you cannot set this parameter using the Qsys GUI, you must save and exit Qsys, and then edit the
setting in the board.qsys file.
Attention: Depending on whether the PCIe core is used in the base or CvP revision, additional modifications
to the PCIe controller behavior might be necessary. An additional multipersona partition
named cvp_update_reset_partition is implemented to work in conjunction with file edits
to the PCIe core. These file edits are as follows:
• Replacement of altpcie_sv_hip_ast_hwtcl.v and altpcie_hip_256_pipen1b.v in the system/
synthesis/submodules/ directory within the OpenCL kernel folder.
• Addition of cvp_update_reset.v (for base revision) and cvp_update_reset_zero.v (for CvP
update revision) in the system/synthesis/submodules/ directory within the OpenCL kernel
folder.
These file edits are performed automatically in the scripts/pre_flow.tcl Tcl script after Qsys
Verilog generation and before launching the Quartus II compilation. The Verilog source files
that are being copied over reside in the <path_s5_net>/hardware/s5_net/scripts/cvpupdatefix/
directory.
The base revision and the CvP update revision have separate but almost identical Quartus II Settings Files named
base.qsf and top.qsf, respectively. The top.qsf file, which defines the CvP update revision, includes the
following parameter settings:
set_global_assignment -name VERILOG_FILE system/synthesis/submodules/cvp_update_reset_zero.v
set_global_assignment -name REVISION_TYPE CVP
set_global_assignment -name BASE_REVISION base
set_global_assignment -name INPUT_PERSONA persona/base.root_partition.personax -section_id Top
The base.qsf file, which defines the base revision, includes the following parameter settings:
set_global_assignment -name VERILOG_FILE system/synthesis/submodules/cvp_update_reset.v
set_global_assignment -name CVP_REVISION top
set_global_assignment -name ROUTING_BACK_ANNOTATION_FILE super_kernel_clock.rcf
When you open the top-level Quartus II Project File top.qpf in the Quartus II software, the Project
Navigator shows the base revision (base) and the CvP revision (top), as shown below:
When you click Assignments > Design Partitions Window, the Design Partitions window lists two
defined design partitions for the base revision, as shown below:
• The acl_kernel_partition contains all kernel-related logic.
• The cvp_update_reset_partition contains the fix for the CvP issue described in the Attention note
above.
For both partitions, the Allow Multiple Personas parameter is set to On, indicating that the board
interface logic remains unchanged, but these partitions might change across different compilations. The
acl_kernel_partition comprises different kernel logic resulting from the compilation of the OpenCL
kernel source code. The Top partition, which contains everything other than the two defined partitions,
preserves placement-and-routing by importing the input persona persona/base.root_partition.personax file,
as shown below:
In the case of a base revision compilation, the persona/base.root_partition.personax file is exported from the
scripts/post_flow.tcl file via the export_persona -overwrite -partition Top Tcl command.
This .personax file is imported into all subsequent CvP update revision compilations. It ensures the
preservation of placement and routing of all nonkernel-related logic so that CvP updates only change the
logic residing in the kernel partition. Within the kernel partition, the kernel_system.qsys system is
automatically generated at compilation time. It is a Qsys subhierarchy of system.qsys. It contains one or
more kernel IPs and wrapper logic to connect the kernel partition to the OpenCL board logic in board.qsys
and the user datagram protocol (UDP) logic in udp.qsys.
Related Information
Configuration via Protocol (CvP) Implementation in Altera FPGAs User Guide
Flash
Configuration via Protocol (CvP) programming reprograms the FPGA core quickly, but it cannot replace
the periphery configuration. When a new design uses a different board variant within a Custom Platform,
changes to the periphery are necessary to reflect differences in hardware resources such as the number of
memory controllers. Periphery changes require full-device reprogramming using hardware external to the
FPGA. Full-device programming through the Flash memory is commonly used to store power-on FPGA
configuration images. With this technique, the host system programs the Flash memory across PCI
Express (PCIe) using custom bridge intellectual property (IP) on the FPGA. An FPGA reprogramming
operation from Flash is then carried out, followed by PCIe link restoration with the newly programmed
FPGA.
The information below describes the implementation of Flash programming for the Stratix V Network
Reference Platform (s5_net). If your board offers alternative means of FPGA periphery reconfiguration,
Flash programming is unnecessary.
Remember: Flash memory is one of many possible techniques to program the FPGA periphery. It is a
board-specific choice. Flash programming depends on a board-specific communication link
between the host and Flash memory. It also relies on the ability to command an FPGA
reprogramming operation from Flash in a live system.
Alternative FPGA periphery programming methods, preferably accessible from the PCIe bus, can be built
into a board. You can use external cables to program the FPGA periphery with an external device such as
the USB-Blaster (either separate or integrated onto the board). However, cables are points of failure that
do not scale well to large deployments.
Attention: Altera does not recommend external cabling as a solution for periphery programming.
Periphery Hashing and Hash ROM
At runtime, infrastructure is necessary to enable decision making on whether FPGA reconfiguration
happens through CvP (core replacement only) or a slower method such as Flash (periphery and core
replacement). It is unsafe to first attempt CvP programming and then Flash programming (if CvP
programming fails) because Flash programming requires PCIe communication with the FPGA. A CvP
programming failure because of mismatched peripheries renders the PCIe link unusable and eliminates
the communication link necessary to program the Flash device. In that failure mode, a system power cycle
is necessary to restore the FPGA and the PCIe link.
The s5_net Reference Platform includes two infrastructure components to enable runtime decision
making on the FPGA configuration method:
1. ROM storage in the locked-down portion of the FPGA, which contains a hash of the currently
programmed periphery configuration. The memory mapped devices (MMD) software layer can read
this ROM via PCIe.
2. Hash of the FPGA periphery bitstream, which is created at the end of the Quartus II compilation flow.
This hash is embedded in two locations:
a. The fpga.bin file, embedded in the Altera Offline Compiler Executable file (.aocx).
b. The FPGA configuration bitstream, so that the correct hash populates the ROM.
By comparing the hash of the periphery currently programmed to the FPGA against that of the new
design, the following function in the acl_pcie_flash.cpp MMD file decides whether CvP programming is
sufficient, or if a fall-back method such as Flash programming is necessary to reprogram the device
periphery:
ACL_PCIE_FLASH::does_programmed_periphery_differ_from_fpga_bin()
Note: The s5_net Reference Platform uses a SHA-1 hash function.
Attention: Populating the periphery hash within the FPGA configuration bitstream changes the
periphery hash value derived from the bitstream.
To avoid this update loop, the Quartus II compilation uses a ROM initialization value of all
zeros. The output from this compilation goes into the computation of the periphery hash.
Then, when you run quartus_cdb --update_mif, it replaces the ROM value with the
output hash from the original compilation.
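The periphery hash comparison and the zero-initialized ROM can be sketched in a few lines of Python. This is an illustration only: the function names, the byte strings standing in for bitstreams, and the 20-byte ROM size are assumptions, and the sketch does not reproduce the logic of ACL_PCIE_FLASH::does_programmed_periphery_differ_from_fpga_bin(), whose implementation is in the MMD source. It uses SHA-1 from the standard hashlib module because the s5_net Reference Platform uses a SHA-1 hash function.

```python
import hashlib

ROM_BYTES = 20  # a SHA-1 digest is 20 bytes long


def periphery_hash(periphery_bits, rom_contents=bytes(ROM_BYTES)):
    """Hash the periphery together with the ROM contents (all zeros by default)."""
    return hashlib.sha1(periphery_bits + rom_contents).digest()


def choose_programming_method(hash_in_rom, hash_in_fpga_bin):
    """CvP is sufficient only when the programmed and new peripheries match."""
    return "cvp" if hash_in_rom == hash_in_fpga_bin else "flash"


periphery = b"hypothetical periphery bitstream"

# Hash computed from the compilation that uses an all-zero ROM.
stored_hash = periphery_hash(periphery)

# Hashing again after the ROM holds the hash gives a different value,
# which is the update loop that the all-zero initialization avoids.
hash_after_population = periphery_hash(periphery, stored_hash)
print("Hash changes once ROM is populated:", hash_after_population != stored_hash)

same_periphery = choose_programming_method(stored_hash, periphery_hash(periphery))
new_periphery = choose_programming_method(stored_hash, periphery_hash(b"another periphery"))
print("Same periphery ->", same_periphery)
print("New periphery  ->", new_periphery)
```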
Communicating with Flash Memory over PCIe
In the s5_net Reference Platform, the host communicates with Flash memory through a CPLD that
connects to a set of FPGA pins. The CPLD is located between the FPGA and Flash, and it masters
communication between the FPGA and off-chip peripherals. A CPLD_bridge IP block is instantiated in
the locked-down interface portion of the FPGA design. It provides memory mapped communication
between the PCIe controller on the FPGA and the external CPLD communication bus.
The CPLD uses a custom packet-based communication protocol for communication with the FPGA. The
MMD host code creates the necessary packets to be sent to the CPLD, and transmits them over PCIe to
the CPLD_bridge on the FPGA. The bridge in turn communicates the packets to the CPLD for further
processing and routing. Flash programming commands are embedded within these packets.
Flash Memory Programming
A full programming bitstream is stored in the Flash memory. The operations necessary for programming
are specific to the Flash chip. For more information on the configuration protocol, refer to the source
code of the acl_pcie_flash.cpp MMD file and the Flash device datasheet.
The high-level Flash memory programming tasks are as follows:
1. Erase the Flash lines that you want to program.
2. Program the data lines.
3. Read back the data to verify that the programming bitstream is correct.
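The erase, program, and verify sequence above can be modeled with an in-memory stand-in for the Flash device. The Python sketch below is a simulation under stated assumptions: the SimulatedFlash class and its methods are hypothetical, erased cells read as 0xFF and programming can only clear bits (typical of NOR Flash, but check your device datasheet), and real hardware erases in device-specific block sizes and is reached through the CPLD packet protocol rather than direct memory access.

```python
class SimulatedFlash:
    """Hypothetical in-memory model of a Flash device."""

    def __init__(self, size_bytes):
        self.cells = bytearray(b"\xff" * size_bytes)

    def erase(self, start, length):
        """Erased cells read back as 0xFF."""
        self.cells[start:start + length] = b"\xff" * length

    def program(self, start, data):
        """Programming can only clear bits, so unerased cells corrupt the data."""
        for offset, value in enumerate(data):
            self.cells[start + offset] &= value

    def read(self, start, length):
        return bytes(self.cells[start:start + length])


def program_and_verify(flash, start, data):
    """Erase the target range, program it, then read it back to verify."""
    flash.erase(start, len(data))
    flash.program(start, data)
    return flash.read(start, len(data)) == data


flash = SimulatedFlash(64)
bitstream = bytes([0x12, 0x34, 0x56, 0x78])  # placeholder for an RBF bitstream
print("Verified after erase and program:", program_and_verify(flash, 8, bitstream))

# Skipping the erase step leaves stale bits behind, which read-back detects.
flash.program(8, bytes([0xFF, 0x00, 0xFF, 0xFF]))
print("Verified without erase:", flash.read(8, 4) == bytes([0xFF, 0x00, 0xFF, 0xFF]))
```

The second check fails by design: it shows why the read-back step is useful even though it lengthens programming time.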
In the s5_net Reference Platform, raw binary file (RBF) bitstreams are programmed to the Flash memory
because the configuration hardware on the board expects the RBF file format. Alternative file formats
might be necessary on boards with different configuration methods. You can use Quartus II software
utilities, such as quartus_cdb, to perform file format conversion using the post-flow scripts (scripts/
post_flow.tcl and subscripts).
Note: For simplicity, the FPGA bitstreams are not compressed in the s5_net Reference Platform.
The s5_net Reference Platform verifies the successful programming of the Flash memory (that is,
no bit errors). In a production environment where programming speed is of concern, you can take
multiple steps to reduce the Flash programming time. For example, you can use compressed
bitstreams, reduce or eliminate the verification of Flash contents, and remove multiple busy wait
loops in the Flash programming code. Because the s5_net Reference Platform is intended as an
instructional proof of concept, it is not optimized for programming time.
If the device is configured with a compressed bitstream, then CvP must also use a compressed
bitstream.
Base and CvP Revisions for Flash Programming
The Quartus II compilation of an OpenCL kernel can produce two different compilation revisions: base
and CvP. For more information on these revisions, refer to the CvP section.
The s5_net Reference Platform uses Flash programming for two purposes:
1. Modification of the FPGA periphery configuration.
2. Replacement of the power-on Flash configuration image (using the Altera Software Development Kit
(SDK) for OpenCL (AOCL) flash utility).
Modifying the FPGA periphery requires a bitstream from a CvP revision compilation. Replacing the
power-on image requires an RBF from a base revision compilation to guarantee CvP reliability. RBF
bitstreams are large. To avoid storing both the base and CvP revision compilation RBF files for every
design, only include the base revision RBF in the fpga.bin file. As a result, you can use the AOCL flash
utility to replace the power-on bitstream with the RBF in the fpga.bin file. However, periphery replacement
in a live system becomes more complicated. The base revision compilation contains the correct
periphery for an AOCL user's design, but it does not contain the design itself. The design is only available
in a CvP revision compilation. The solution is to replace the periphery through Flash programming using
the base revision compilation, and then to immediately CvP program the user's design on top of that
periphery. The result is identical to programming a full RBF bitstream from the user's CvP revision
compilation, but without storing that RBF.
FPGA Reprogramming from Flash
After you program the Flash memory with the new configuration bitstream from a base revision compilation,
you must reconfigure the FPGA in the live system by performing the following tasks:
1. Reprogram the FPGA from the bitstream in Flash memory.
2. Wait for device programming to complete.
3. Restore the PCIe link and verify communication with the FPGA.
4. Program the AOCL user design core onto the FPGA via CvP. Refer to the CvP section for more
information.
The s5_net Reference Platform performs FPGA reprogramming from Flash via a control command to the
CPLD by the host, through the bridge on the FPGA. Details of the command are specific to the board
hardware and are different across manufacturers.
The PCIe link is restored in the same way as after a quartus_pgm FPGA programming process. The
PCIe configuration space is saved on the host before reprogramming. After reprogramming, the registers
are restored by copying the original configuration data from the host to the device configuration space
across PCIe. PCIe advanced error reporting (AER) is disabled during the programming operation because
the FPGA effectively disappears from the PCIe bus during programming, which is typically a fatal PCIe
event (Basic Input/Output System (BIOS) often halts the CPU). By restoring the PCIe configuration space
registers after reprogramming, the device begins to communicate with the same configuration as the
original power-on PCIe enumeration.
From the host computer's perspective, the FPGA PCIe endpoint remains unchanged. After PCIe
communication is re-established with the FPGA, which now has the new and verified periphery configuration,
the FPGA core is populated with the user's design via CvP programming. Refer to the CvP section for more details.
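The save-and-restore step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the s5_net driver code: it assumes the configuration space is reachable through a seekable file-like object (on Linux, sysfs exposes one per device as a config file), and the function names and the 256-byte size are choices made for this example only.

```python
import io

CONFIG_SPACE_BYTES = 256  # assumption: only the PCI-compatible region is saved


def save_config_space(dev, size=CONFIG_SPACE_BYTES):
    """Read and return a snapshot of the first `size` configuration bytes."""
    dev.seek(0)
    snapshot = dev.read(size)
    if len(snapshot) != size:
        raise IOError("short read from configuration space")
    return snapshot


def restore_config_space(dev, snapshot):
    """Write a previously saved snapshot back to the device."""
    dev.seek(0)
    dev.write(snapshot)


# Stand-in for a real device: an in-memory buffer with recognizable contents.
device = io.BytesIO(bytes(range(256)))
saved = save_config_space(device)

# Simulate reprogramming, after which the host-assigned registers are lost.
device.seek(0)
device.write(bytes(256))

# Copy the original configuration data back, as the text describes.
restore_config_space(device, saved)
```

In a real flow, the snapshot would be taken before the reprogramming command is issued and written back only after device programming completes.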
Note: The s5_net Reference Platform targets a board with Flash memory that stores the power-on FPGA
configuration bitstream.
When changing the periphery through Flash programming at runtime, to avoid overwriting the
power-on bitstream, you may use a different region of Flash memory as the intermediate storage
location. However, this technique requires a means to specify the Flash memory address from
which the FPGA will be reprogrammed. For boards without the ability to load from multiple Flash
regions dynamically, you might need to overwrite the power-on programming bitstream.
Related Information
CvP on page 1-30
Defining the Contents of the fpga.bin File
You may arbitrarily define the contents of the fpga.bin file in a Custom Platform because it passes from
the Altera Software Development Kit (SDK) for OpenCL (AOCL) to the Custom Platform as a black box.
The contents of the fpga.bin file in the Stratix V Network Reference Platform (s5_net) are defined as an
Executable and Linkable Format (ELF) library that organizes the various fields. The following table
describes the contents of the s5_net Reference Platform fpga.bin file:
| Field | Description |
|---|---|
| .acl.sof | The full programming bits for the compiled design. |
| .acl.core.rbf | The Configuration via Protocol (CvP) programming bits for the compiled design. |
| .acl.periph.hash | The hash of the periph.rbf file that the current compilation generates. This hash is also embedded in the on-chip Hash ROM. The Hash ROM is compared against this hash to determine, ahead of time, whether CvP programming will succeed. |
| .acl.compile_revision | The name of the compiled Quartus II project revision. |
| .acl.pcie.dev | The device ID of the PCI Express (PCIe) controller. The PCIe device ID is set to match the FPGA part number (for example, D8) so that this field can be compared to the FPGA part number as a sanity check. The check ensures that the programming files correspond to the device undergoing programming. |
| .acl.base_revision.rbf | The full-FPGA raw binary file (.rbf) of the base revision compilation used to generate the post-fit netlist. This .rbf must be the power-on image of the FPGA. All other designs can be programmed via CvP on top of this image. |
| .acl.base_revision.periph.hash | The hash of the periph.rbf of the base revision compilation from which the post-fit netlist is derived. This field is retrieved from the base.aocx file and should match the .acl.periph.hash field for any AOCL user compilation. |
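How a Custom Platform might use these fields to choose a programming path can be sketched as follows. This is an illustrative sketch only: it assumes the fields have already been extracted from fpga.bin into a dictionary, and the function name, the step labels, and the hash values are hypothetical rather than taken from the s5_net source.

```python
def plan_programming(fields, hash_rom, fpga_part):
    """Return the ordered programming steps for one fpga.bin image.

    fields    -- dict mapping fpga.bin field names to their contents
    hash_rom  -- periphery hash read back from the on-chip Hash ROM
    fpga_part -- part identifier of the FPGA on the board (for example "D8")
    """
    # Sanity check: the programming files must target this device.
    if fields[".acl.pcie.dev"] != fpga_part:
        raise ValueError("programming files do not match the target device")

    if fields[".acl.periph.hash"] == hash_rom:
        # The periphery already matches, so CvP alone is expected to succeed.
        return ["cvp"]

    # The periphery differs: load the base revision image from Flash first,
    # then program the core on top of it via CvP.
    return ["flash_base_revision", "cvp"]


# Made-up field contents, for illustration only.
image = {
    ".acl.pcie.dev": "D8",
    ".acl.periph.hash": "abc123",
}
```

With these values, plan_programming(image, "abc123", "D8") selects CvP only, while a different Hash ROM value selects Flash programming followed by CvP.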
Host-to-Device MMD Software Implementation
The memory mapped devices (MMD) layer is a thin software layer for communicating with the board. Every
Custom Platform needs a full implementation of the MMD library so that the OpenCL host applications and
board utilities function properly. Details of the application programming interface (API) functions, their
arguments, and return values for the MMD layer are specified in the
TOP_DEST_DIR/source/include/aocl_mmd.h file, where TOP_DEST_DIR points to the top-level directory of
your Custom Platform.
The source code of an MMD library that demonstrates good performance is available in the
TOP_DEST_DIR/source/host/mmd directory.
acl_pcie.cpp
The acl_pcie.cpp file implements the MMD API and supports multiple devices. This file also
handles the PCI Express (PCIe) interrupt. For Linux, the kernel driver uses a signal to notify the MMD
about an interrupt from PCIe. For Windows, the MMD uses the Jungo WinDriver API to handle the
interrupt from PCIe.
In addition, this file includes a signal handler for the Ctrl-C event. The MMD needs to capture the Ctrl-C
event to ensure that the program does not terminate during an unsafe operation, such as programming the
device or running quartus_pgm.
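The idea of capturing Ctrl-C during an unsafe operation can be illustrated with a short sketch. This is not the acl_pcie.cpp implementation, which is written in C++; it is a Python analogue, and the class name DeferCtrlC is hypothetical. It records the interrupt instead of letting it terminate the process, and restores the previous handler when the unsafe operation ends.

```python
import signal


class DeferCtrlC:
    """Hold back Ctrl-C (SIGINT) while an unsafe operation is in progress."""

    def __enter__(self):
        self.pending = False
        # Install a handler that only records the request.
        self._previous = signal.signal(signal.SIGINT, self._remember)
        return self

    def _remember(self, signum, frame):
        self.pending = True

    def __exit__(self, exc_type, exc, tb):
        # Put the original handler back once the operation has finished.
        signal.signal(signal.SIGINT, self._previous)
        return False


with DeferCtrlC() as guard:
    # Stand-in for an unsafe operation such as programming the device.
    signal.raise_signal(signal.SIGINT)  # simulate the user pressing Ctrl-C
    finished = True
```

After the block exits, the caller can inspect guard.pending and shut down cleanly if the user asked to stop. The sketch must run in the main thread, because Python only allows signal handlers to be installed there.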
acl_pcie_device.cpp
The acl_pcie_device.cpp file implements a class that represents a device and abstracts details to allow easier
handling of multiple devices. Operations that this class supports include
write_block, read_block, reprogram, and flash.
When an instance of the device class is created, the following verifications take place:
1. Ensures that the kernel driver is installed and that its version matches the MMD version.
2. Ensures that the device with the given name can be found.
3. Ensures that the Version ID of the device matches the supported ID in the software.
4. Waits for Uniphy intellectual property (IP) calibration to complete.
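The order of these start-up checks can be sketched as follows. All names here (open_device, DeviceOpenError, FakeBoard, and the methods on the board object) are hypothetical and exist only for this illustration. The version strings, the device name, and the Version ID value are made up, and the real class in acl_pcie_device.cpp may structure the checks differently.

```python
class DeviceOpenError(Exception):
    """Raised when one of the start-up checks fails."""


def open_device(board, name, mmd_version, supported_version_id, max_polls=100):
    """Run the four start-up checks in order on a hypothetical board object."""
    # 1. The kernel driver is installed and matches the MMD version.
    driver_version = board.driver_version()
    if driver_version is None:
        raise DeviceOpenError("kernel driver is not installed")
    if driver_version != mmd_version:
        raise DeviceOpenError("kernel driver and MMD versions differ")

    # 2. A device with the requested name exists.
    if name not in board.device_names():
        raise DeviceOpenError("device %r not found" % name)

    # 3. The Version ID reported by the device is the supported one.
    if board.version_id(name) != supported_version_id:
        raise DeviceOpenError("unsupported Version ID")

    # 4. Poll until the memory interface reports that calibration finished.
    for _ in range(max_polls):
        if board.uniphy_calibrated(name):
            return name
    raise DeviceOpenError("Uniphy calibration did not complete")


class FakeBoard:
    """In-memory stand-in used only to exercise the checks."""

    def __init__(self, polls_until_calibrated=3):
        self._polls_left = polls_until_calibrated

    def driver_version(self):
        return "14.0"

    def device_names(self):
        return ["dev0"]

    def version_id(self, name):
        return 0x1234

    def uniphy_calibrated(self, name):
        self._polls_left -= 1
        return self._polls_left <= 0
```

Running the checks in this order means the cheapest failures (a missing or mismatched driver) are reported before any device access is attempted.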
acl_pcie_mm_io.cpp
The acl_pcie_mm_io.cpp file implements a class that allows access to the device as memory-mapped I/O. It
provides access to regions such as GLOBAL-MEM, PCIE-CRA, DMA-CSR, DMA-DESCRIPTOR, and KERNEL.
acl_pcie_dma_linux.cpp or acl_pcie_dma_windows.cpp
The acl_pcie_dma_linux.cpp or acl_pcie_dma_windows.cpp file implements direct memory access (DMA)-related
functions. For more information, refer to the SG-DMA section.
acl_pcie_config.cpp
The acl_pcie_config.cpp file implements functions for configuring the device. For more information, refer
to the FPGA Programming Flow section.
acl_pcie_flash.cpp
The acl_pcie_flash.cpp file implements Flash-related functions. For more information, refer to the Flash
section.
acl_pcie_quickudp.cpp
The acl_pcie_quickudp.cpp file implements user datagram protocol (UDP)-related functions. For more
information, refer to the Implementation of UDP Cores as OpenCL Channels section.
acl_pcie_debug.cpp
The acl_pcie_debug.cpp file defines the commonly used debug functions and parameters. It is included by
most of the other files.
acl_pcie_timer.cpp
The acl_pcie_timer.cpp file implements a timer module to measure performance.
Related Information
• SG-DMA on page 1-16
• FPGA Programming Flow on page 1-29
• Flash on page 1-32
• Implementation of UDP Cores as OpenCL Channels on page 1-19
OpenCL Utilities Implementation
A Custom Platform requires four board utilities.
aocl install on page 1-37
aocl program on page 1-38
aocl flash on page 1-38
aocl diagnose on page 1-38
aocl install
The install utility installs the kernel driver on the host computer. Users of the Altera Software
Development Kit (SDK) for OpenCL (AOCL) only need to install the driver once, after which the driver
should be loaded automatically each time the machine reboots.
Windows
The install.bat script is located in the TOP_DEST_DIR\windows64\libexec directory, where TOP_DEST_DIR
points to the top-level directory of your Custom Platform. This install.bat script triggers the install
executable from Jungo Connectivity Ltd. to install the WinDriver on the host machine.
Linux
The install script is located in the TOP_DEST_DIR/linux64/libexec directory. This install script first compiles
the kernel module in a temporary location and then performs the necessary setup to enable automatic
driver loading after reboot.
aocl program
The program utility programs the board with the specified Altera Offline Compiler Executable (.aocx)
file. The utility is implemented by calling the aocl_mmd_reprogram() memory mapped devices (MMD)
application programming interface (API) function.
aocl flash
The flash utility configures the power-on image for the FPGA using the specified Altera Offline
Compiler Executable (.aocx) file. The utility is implemented by calling into the memory mapped devices
(MMD) library.
aocl diagnose
The diagnose utility reports device information and identifies issues. First, it verifies that the kernel driver
is installed, and then it performs different tasks based on extra arguments, if provided. Without an
argument, the utility reports overall information about all the devices installed in the host machine. If a
specific device name is provided (that is, aocl diagnose <device_name>), the diagnose utility
runs a memory transfer test and then reports the host-device transfer performance.
Stratix V Network Reference Platform Implementation Considerations
The implementation of the Stratix V Network Reference Platform (s5_net) includes some workarounds
that address certain Quartus II software known issues.
1. The quartus_map executable reads the Synopsys Design Constraints (SDC) files. However, it does
not support the Tcl command get_current_revision. Therefore, in the top_post.sdc file, a
check is in place to determine whether quartus_map has read the file before checking the current
revision.
2. Configuration via PCI Express (PCIe) requires the setting of the force_hrc parameter to a value of 1,
and the inclusion of the three PCIe INI settings described in the CvP section.
3. The kernel clock requires a lot of connectivity. Therefore, Altera recommends compiling the base
revision using the super_kernel_clock.rcf Routing Constraints File.
4. Use the INI setting bpm_hard_block_partition=off to improve version compatibility.
5. Use the INI setting qic_pf_no_input_rotation=on to prevent certain failures to route.
6. The CvP revision (that is, top), which imports the .personax file, must include auto global clock
promotion for clocks, resets, and enablements.
7. To avoid certain routing failures, set the Fitter Preservation Level for the Top partition to Netlist
Only. You may assign the setting via the Design Partitions Window or the Tcl Console.
In addition to these workarounds, take into account the following considerations:
1. The pll_rom.hex file exists before compilation.
2. Quartus II compilation is only ever performed after the Altera Offline Compiler (AOC) has embedded
an OpenCL kernel inside the system.
3. Perform Quartus II compilation after you install the Altera Software Development Kit (SDK) for
OpenCL (AOCL) and set the ALTERAOCLSDKROOT environment variable to point to the AOCL
installation.
4. The name of the directory where the Quartus II project resides must match the name field in its
board_spec.xml file. The match is case sensitive.
5. The PATH or LD_LIBRARY_PATH environment variable must point to the memory mapped devices
(MMD) library in the s5_net Reference Platform.
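The directory-name consideration above can be checked before compilation with a small script. The sketch below is only an illustration: it assumes that the name field is an attribute called name on the root element of board_spec.xml, and the function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def board_name_matches_directory(project_dir):
    """Return True if the directory name equals the board_spec.xml name field.

    Assumption: the name field is the `name` attribute of the root element.
    The comparison is case sensitive.
    """
    spec_path = os.path.join(project_dir, "board_spec.xml")
    declared = ET.parse(spec_path).getroot().get("name")
    actual = os.path.basename(os.path.abspath(project_dir))
    return declared == actual
```

A mismatch in letter case alone (for example, a directory named my_board with a name field of My_Board) makes the function return False.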
Troubleshooting
You can use the following environment variables to help diagnose problems:
| Environment Variable | Description |
|---|---|
| ACL_HAL_DEBUG | You can set this variable to a value of 1 through 5 to enable increasing debug output from the Altera Hardware Abstraction Layer (HAL), which interfaces directly with the memory mapped devices (MMD) layer. |
| ACL_PCIE_DEBUG | You can set this variable to a value of 1 to 10000 to enable increasing debug output from the MMD. This is useful for confirming that the version ID register was read correctly and the Uniphy intellectual property (IP) cores are calibrated. |
| ACL_PCIE_JTAG_CABLE | You can set this variable to override the default quartus_pgm argument that specifies the cable number. By default, this is cable 1. If there are multiple USB-Blaster cables, you can specify a particular one here. |
| ACL_PCIE_JTAG_DEVICE_INDEX | You can set this variable to override the default quartus_pgm argument that specifies the FPGA device index. By default, this variable has a value of 1. If the FPGA is not the first device in the JTAG chain, you can customize the value. |
| CL_CONTEXT_COMPILER_MODE_ALTERA | You can unset this variable or set it to a value of 3. The OpenCL host runtime reprograms the FPGA as needed, which it does at least once during initialization. To prevent the host application from programming the FPGA, set this variable to a value of 3. Important: When setting CL_CONTEXT_COMPILER_MODE_ALTERA, only use a value of 3. |
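When a host application is launched from a script, these variables can be prepared programmatically. The following sketch builds an environment dictionary and enforces the value ranges listed in the table. The function name and its parameters are hypothetical. Only the variable names and ranges come from the table.

```python
import os


def debug_environment(hal_level=None, pcie_level=None,
                      skip_programming=False, base=None):
    """Return a copy of the environment with the debug variables applied."""
    env = dict(os.environ if base is None else base)

    if hal_level is not None:
        if not 1 <= hal_level <= 5:
            raise ValueError("ACL_HAL_DEBUG accepts 1 through 5")
        env["ACL_HAL_DEBUG"] = str(hal_level)

    if pcie_level is not None:
        if not 1 <= pcie_level <= 10000:
            raise ValueError("ACL_PCIE_DEBUG accepts 1 to 10000")
        env["ACL_PCIE_DEBUG"] = str(pcie_level)

    if skip_programming:
        # 3 is the only value the table allows for this variable.
        env["CL_CONTEXT_COMPILER_MODE_ALTERA"] = "3"
    else:
        # Otherwise the variable must be unset.
        env.pop("CL_CONTEXT_COMPILER_MODE_ALTERA", None)

    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the host application, which leaves the environment of the calling shell untouched.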
Document Revision History
| Date | Version | Changes |
|---|---|---|
| July 2014 | 14.0.0 | Initial Release. |