Introducing Innovations at 28 nm to Move Beyond Moore’s Law

WP-01125-1.2
White Paper
In addition to processing techniques, FPGA innovations allow Altera to move beyond
Moore’s Law to meet higher bandwidth requirements while meeting cost and power
budgets. Altera’s Stratix® V FPGAs provide breakthrough bandwidth via 28-Gbps
power-efficient transceivers, and allow users to integrate more of their design on a
single FPGA by using Embedded HardCopy® Blocks while increasing flexibility
through partial reconfiguration. This white paper explains how Stratix V FPGAs
allow customers to maximize bandwidth while staying within their cost and power
budgets.
Introduction
Bandwidth requirements are growing at a compound annual growth rate of 40%, driven
by bandwidth-intensive applications, as estimated by Cisco Systems (1) (2). The
increase comes from audio/video streaming to computers, televisions, and mobile
phones, and from Internet applications such as email, games, and file sharing. Global
Internet protocol traffic is expected to quintuple from 10 exabytes per month
(1 exabyte = 10^18 bytes) in 2008 to more than 56 exabytes per month in 2013. Figure 1
shows the bandwidth requirements in 2013: mobile traffic is expected to approach 2.2
exabytes per month, business traffic to approach 13 exabytes per month, and
consumer traffic to exceed 40 exabytes per month.
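As a quick cross-check of the growth figures above, the following minimal sketch computes the compound annual growth rate implied by the two traffic endpoints and, separately, what a flat 40% annual rate would yield over the same period. The function and variable names are illustrative and are not taken from the Cisco reports.

```python
# Traffic figures quoted in the text, in exabytes per month.
traffic_2008_eb = 10.0
traffic_2013_eb = 56.0
years = 2013 - 2008


def implied_cagr(start, end, periods):
    """Compound annual growth rate that takes `start` to `end` in `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


def project(start, rate, periods):
    """Value of `start` after compounding at `rate` for `periods` years."""
    return start * (1.0 + rate) ** periods


cagr = implied_cagr(traffic_2008_eb, traffic_2013_eb, years)
projected_2013_eb = project(traffic_2008_eb, 0.40, years)

print(f"Implied CAGR 2008-2013: {cagr:.1%}")
print(f"10 EB/month grown at 40% for {years} years: {projected_2013_eb:.1f} EB/month")
```

Under these inputs the implied rate comes out slightly above 40%, which is consistent with the rounded figure quoted in the text.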
Figure 1. Global Internet Protocol Traffic Growth, 2008–2013
[Stacked chart. Vertical axis: bandwidth (total Internet protocol traffic) in exabytes
per month, 0 to 60. Horizontal axis: 2008 to 2013. Annotation: 40% CAGR, 2008–2013.
Categories:
■ Mobile: Internet traffic due to handsets, notebook cards, and mobile broadband gateways
■ Business: business IP WAN traffic and business Internet traffic
■ Consumer: web/email, file sharing, Internet gaming, Internet voice, Internet video
communications, Internet video to PC, Internet video to television, ambient video, and
non-Internet IP]
Source: Cisco Visual Networking Index: Forecast and Methodology, 2008–2013, 2009
101 Innovation Drive
San Jose, CA 95134
www.altera.com
June 2012
© 2012 Altera Corporation. All rights reserved. ALTERA, ARRIA, CYCLONE, HARDCOPY, MAX, MEGACORE, NIOS,
QUARTUS and STRATIX are Reg. U.S. Pat. & Tm. Off. and/or trademarks of Altera Corporation in the U.S. and other countries.
All other trademarks and service marks are the property of their respective holders as described at
www.altera.com/common/legal.html. Altera warrants performance of its semiconductor products to current specifications in
accordance with Altera’s standard warranty, but reserves the right to make changes to any products and services at any time
without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or
service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest
version of device specifications before relying on any published information and before placing orders for products or services.
Altera Corporation
To meet these ever-increasing bandwidth requirements, service providers must
upgrade existing network infrastructure. To remain competitive despite fixed
footprints, they must also stay within tight cost and power budgets. As a result,
service providers and enterprises are looking to their vendors not only to increase
bandwidth but also to reduce cost and power.
FPGA vendors have progressed from previous silicon processing technologies to
28 nm to provide the benefits of Moore’s Law—doubling the FPGA capacity and
performance every 18 months. For years, this has enabled FPGA vendors to provide
increased functionality, customizable capabilities, reprogrammability, and higher
processing performance while reducing cost. However, at every generation, smaller
silicon geometries increase leakage currents and therefore static power, which in
turn raises the FPGA’s total power.
Riding the train of Moore’s Law will not mitigate the problem of increased power
because processing techniques only go so far. FPGA vendors must find innovative
ways to go beyond Moore’s Law on the 28-nm process to meet ever-increasing
bandwidth demand while reducing cost and power.
Addressing the Bandwidth Challenge While Reducing Cost and Power
Altera® Stratix V FPGAs address bandwidth, cost, and power challenges through
processing techniques and unique architectural innovations that take a design beyond
the benefits of Moore’s Law. Stratix V FPGAs allow a designer to improve:
■ Bandwidth and performance: Get breakthrough bandwidth with integrated
power-efficient transceivers capable of 28 Gbps, and increase system performance
by 50%
■ Highest system integration: Integrate more and get twice the density, without the
cost or power penalty, with Altera’s Embedded HardCopy Blocks and integrated
hard intellectual property (IP) in transceivers and core
■ Ultimate flexibility: Achieve ultimate flexibility while reducing cost and power
with easy-to-use fine-grained partial reconfiguration (core) and dynamic
reconfiguration (transceivers) for multiprotocol client support, with additional cost
reduction gained using Configuration via Protocol (CvP)
■ Power: Reduce total power by 30% compared to previous-generation devices
Each Stratix V variant offers a distinct set of features optimized for diverse
applications:
■ Stratix V GT FPGA: Optimized for designs with 28-Gbps transceivers requiring
ultra-high bandwidth and performance, such as 40G/100G/400G applications
■ Stratix V GX FPGA: Optimized for high-performance high-bandwidth
applications with integrated 14.1-Gbps transceivers supporting backplanes and
optical modules
■ Stratix V GS FPGA: Optimized for high-performance, variable-precision digital
signal processing (DSP) applications with integrated 14.1-Gbps transceivers
supporting backplanes and optical modules
■ Stratix V E FPGA: Optimized for ASIC prototyping with over 1 million logic
elements (LEs) on the highest performance logic fabric
Providing Industry’s Highest Bandwidth
Stratix V FPGAs integrate power-efficient transceivers with data rates of 14.1 Gbps
and 28 Gbps. Stratix V GX FPGAs integrate 14.1-Gbps transceivers to support a data
range from 600 Mbps (or 150 Mbps with oversampling) to 14.1 Gbps with best-in-class
signal integrity and lowest jitter. Stratix V GX FPGAs offer up to 66 identical
power-efficient 14.1-Gbps transceivers that provide up to 44 independent data rates through
independent clock sources. As shown in Figure 2, each transceiver channel comes
with a hardened physical coding sublayer (PCS) for protocols like PCI Express®
(PCIe®) Gen1, Gen2, and Gen3, 10G Ethernet, XAUI, and Interlaken.
Figure 2. Stratix V FPGA Transceivers
[Block diagram: transceiver channels, each pairing a hard PCS with a transceiver PMA,
alongside the clock networks and LC transmit PLLs.]
Stratix V FPGAs with transceivers are designed to drive a 30" backplane (with two
connectors) at up to 14.1 Gbps and to support 10GBASE-KR multiboard
applications, as shown in Figure 3. To mitigate the losses and crosstalk
that exist across backplanes and other mediums, an extensive level of advanced signal
conditioning has been added as dedicated circuitry in Stratix V FPGAs. In addition to
the improvements in the adaptive linear equalization, a 5-tap adaptive decision
feedback equalizer (DFE) is added to mitigate crosstalk effects. When combined with
the low-jitter transmitter and high-jitter rejecting receiver, the Stratix V FPGA’s
equalization offerings provide a complete link solution that helps achieve a low bit
error rate (BER).
Figure 3. Stratix V FPGAs Designed for 10GBASE-KR Backplane Applications up to 14.1 Gbps
[Two eye diagrams for a TX-to-RX link across a 30" backplane: the eye at the near end of
the backplane with pre-emphasis and the eye at the far end of the backplane with
pre-emphasis. Axes: amplitude (Vpp) versus time (s, x10^-10).]
In addition to the backplane support, Stratix V FPGAs with transceivers are designed
to support optical modules directly by including optical electrical dispersion
compensation (EDC) features, thus removing the need for an external EDC chip when
interoperating with all types of optical modules, including SFP+.
Stratix V GT devices provide breakthrough transceiver performance of 28 Gbps per
channel and are optimized for ultra-high bandwidth applications. These devices have
four transceivers that cover data rates in the range 20 Gbps to 28 Gbps, plus an
additional 32 backplane-capable transceivers that cover data rates from 600 Mbps to
14.1 Gbps. Support circuits for the transceivers include hard IP for PCIe Gen1, Gen2,
and Gen3, 10G Ethernet, and Interlaken.
Figure 4 shows how the 28-Gbps channels allow Stratix V GT FPGAs to interface
directly with next-generation 100G optical modules via four 28-Gbps channels, while
eliminating the need for a 10:4 multiplexer/demultiplexer serializer in the optical
module.
Figure 4. Stratix V FPGAs Interfacing to Next-Generation 100G Optical Modules
[Block diagram of a Stratix V GT OTU-4 muxponder: multi-client interfaces, port
mux/de-mux, OTU-4 framer and mapper, and OTU-4 FEC, connected to a 100G optical
module over 10 x 11.3-Gbps or 4 x 28-Gbps links, with a VCXO reference clock.]
The transceivers in Stratix V FPGAs, including the 28-Gbps transceiver, are power
efficient. Each 28-Gbps channel consumes 200 mW of physical medium attachment
(PMA) power, which is about 7 mW per gigabit. Moving from 10 x 10-Gbps
transceivers to 4 x 25-Gbps transceivers allows designers to achieve the same
bandwidth at half the power. Figure 5 shows the transceiver power per channel
(yellow bars) and the transceiver power per gigabit (green line) for varying data rates
on Stratix V FPGAs.
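The per-gigabit figure above follows directly from the quoted channel power. The sketch below reproduces that arithmetic and shows what the "half the power" claim would imply for a 10 x 10-Gbps link. Two points are assumptions rather than data from this paper: that the 200-mW PMA figure also applies to a channel run at 25 Gbps, and the per-channel 10-Gbps power, which is only back-calculated from the claim.

```python
# Figures quoted in the text.
PMA_POWER_28G_MW = 200.0   # PMA power per 28-Gbps channel
RATE_28G_GBPS = 28.0


def mw_per_gigabit(channel_power_mw, rate_gbps):
    """PMA power normalized to the channel data rate."""
    return channel_power_mw / rate_gbps


def total_power_mw(channels, channel_power_mw):
    """Aggregate PMA power for a multi-lane link."""
    return channels * channel_power_mw


per_gbit_mw = mw_per_gigabit(PMA_POWER_28G_MW, RATE_28G_GBPS)

# Assumption: a channel run at 25 Gbps draws the same 200 mW as at 28 Gbps.
four_lane_mw = total_power_mw(4, PMA_POWER_28G_MW)

# If 4 x 25 Gbps uses half the power of 10 x 10 Gbps, the ten-lane link would
# draw about twice as much. This is implied by the claim, not a measured value.
implied_ten_lane_mw = 2.0 * four_lane_mw
implied_10g_channel_mw = implied_ten_lane_mw / 10

print(f"28-Gbps channel: {per_gbit_mw:.2f} mW per gigabit")
print(f"4 x 25 Gbps link: {four_lane_mw:.0f} mW of PMA power")
print(f"Implied 10-Gbps channel power: {implied_10g_channel_mw:.0f} mW")
```

For actual per-rate power, the values plotted in Figure 5 or the device power estimator should be used instead of the back-calculated number.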
Figure 5. Stratix V Transceiver Power per Channel and per Gigabit
[Chart. Horizontal axis: transceiver data rate (Gbps) at 6.5, 8.5, 10.3, 11.3, 12.5, and 28.
Left vertical axis: power (mW) per channel (PMA), 0 to 250. Right vertical axis: power
(mW) per gigabit, 0 to 25.]
Stratix V FPGAs allow designers to increase the effective bandwidth of their
application for next-generation chip-to-chip and chip-to-optical module interfaces
while reducing power and costs by requiring fewer high-bandwidth power-efficient
transceivers.
Enabling High System Performance
Stratix V FPGAs provide high-performance ubiquitous I/Os and power-efficient
transceivers complemented with a high-performance core to increase system
performance by 50%. The Stratix V FPGA features (see Figure 6) that enable high
system performance include:
■ Enhanced adaptive logic module (ALM) with four registers to provide higher
performance and easier timing closure for register-rich and heavily pipelined designs.
In addition, the four registers per ALM allow designers to pack more of their
design in the logic array block (LAB).
■ Enhanced MultiTrack routing architecture with more routing resources to enable
less routing congestion, higher logic utilization, and reduced compile times for
tightly packed designs
■ New high-performance, variable-precision digital signal processing (DSP)
blocks that enable 1,755 GMACS of DSP performance and 1 TFLOPS of
single-precision floating-point operations
■ New 20-Kbit internal memory block to enable higher performance, up to 600 MHz,
in various memory modes, with built-in error correcting code (ECC) protection
■ Enhanced distributed memory (MLAB) blocks with additional built-in registers to
deliver higher performance, up to 600 MHz, for optimized implementation of
wide shallow FIFOs
■ Embedded HardCopy Blocks and integrated hard IP to eliminate system
bottlenecks
Figure 6. Stratix V FPGAs Features Enable Higher System Performance
[Device diagram with callouts: 1-GHz DDR3 DIMM, enhanced ALM and routing, new
variable-precision DSP blocks, 600-MHz memory blocks, Embedded HardCopy Block, and
14.1-Gbps/28-Gbps serial transceivers.]
In addition, significant circuit enhancements are implemented for Stratix V FPGAs to
achieve higher system performance on memory interfaces. Stratix V FPGAs are
targeted to support up to six x72 DDR3 multirank DIMM interfaces, each running up to
1 GHz. To support these interfaces, all of the critical circuits in the read/write paths
are hardened to guarantee timing closure at higher frequencies. Stratix V FPGAs are
supported by the new UniPHY, shown in Figure 7, in Altera’s Quartus® II design
software.
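To put the memory-interface target in perspective, the following sketch estimates the theoretical peak payload bandwidth of six such interfaces. It assumes that each 72-bit interface carries 64 data bits plus 8 ECC bits and that the 1-GHz figure is the memory clock with two transfers per cycle. Both are common DDR3 DIMM conventions but are not stated in this paper, and sustained bandwidth in a real system is lower.

```python
# Targets quoted in the text.
CLOCK_HZ = 1.0e9           # memory clock per interface
NUM_INTERFACES = 6
INTERFACE_WIDTH_BITS = 72

# Assumptions, not from the paper: double data rate and 8 of the 72 bits used for ECC.
TRANSFERS_PER_CLOCK = 2
ECC_BITS = 8


def peak_bytes_per_second(width_bits, clock_hz, transfers_per_clock):
    """Theoretical peak throughput of a parallel memory interface."""
    return width_bits * clock_hz * transfers_per_clock / 8


payload_bits = INTERFACE_WIDTH_BITS - ECC_BITS
per_interface_gbytes = peak_bytes_per_second(
    payload_bits, CLOCK_HZ, TRANSFERS_PER_CLOCK
) / 1e9
aggregate_gbytes = per_interface_gbytes * NUM_INTERFACES

print(f"Peak payload bandwidth per interface: {per_interface_gbytes:.1f} GB/s")
print(f"Peak payload bandwidth, {NUM_INTERFACES} interfaces: {aggregate_gbytes:.1f} GB/s")
```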
Figure 7. Stratix V FPGA UniPHY
[Block diagram with hard and soft blocks marked. Labels include: memory IP controller,
UniPHY, write path, read path, address/cmd path, calibration sequencer, reconfig,
I/O structure, I/O block, DQ I/O, FIFO, DQS path, clock gen, DLL, PLL, and memory.]
The hard FIFO in the I/O blocks enables the new UniPHY to halve the PHY latency,
and features like duty-cycle correction, advanced calibration algorithms, and
voltage- and temperature-compensated deskew delays increase the operating margin for
high data rates and high system reliability. The new UniPHY allows the sharing of
phase-locked loops (PLLs) and delay-locked loops (DLLs) across multiple interfaces for
easier memory-interface implementation, and it will be made available to customers
as clear text for easier debug and customization capabilities. Table 1 lists the
performance targets of LVDS and memory interfaces supported on Stratix V FPGAs.
Table 1. Stratix V FPGA I/O Performance Targets

Interconnect    Performance
DDR3            1 GHz
DDR2            400 MHz
QDR II          350 MHz
QDR II+         550 MHz
RLDRAM III      800 MHz
RLDRAM II       533 MHz
LVDS            1.4 Gbps
Moving Beyond Moore’s Law
To achieve higher integration on a single chip at 28 nm while reducing cost and
power, Stratix V FPGAs leverage Altera’s new Embedded HardCopy Blocks as well as
integrate hard IP in the core and transceivers. In addition, designers can achieve
ultimate flexibility while reducing cost and power with easy-to-use fine-grained
partial reconfiguration (core) and dynamic reconfiguration (transceivers) for
multiprotocol client support, with additional flexibility gained using CvP.
Highest System Integration Through Embedded HardCopy Blocks
The Embedded HardCopy Blocks, shown in Figure 8, are customizable hard IP blocks
that utilize Altera’s unique HardCopy ASIC capabilities. This innovation
substantially increases FPGA capabilities by dramatically increasing density per area
and offers up to 14.3 million ASIC gates or up to 1.19 million LEs while increasing
performance and lowering power. The Embedded HardCopy Blocks are used to
harden standard or logic-intensive functions such as interface protocols,
application-specific functions, and proprietary custom IP.
Figure 8. Customizable Embedded HardCopy Block
[Device floorplan: columns of transceiver PMA and hard PCS channels with LC PLLs,
fractional PLLs, and the clock network; core logic fabric with variable-precision DSP
blocks and M20K internal memory blocks; and a customizable Embedded HardCopy Block
implementing PCI Express Gen1/Gen2/Gen3 or other variants or custom solutions.]
This innovation creates a new class of application-targeted Stratix V FPGAs that are
optimized for:
■ Bandwidth-centric applications and protocols including PCIe Gen1, Gen2, and
Gen3
■ Data-intensive applications for 40G, 100G, and beyond
Stratix V FPGAs harden specific digital functionality in the PCS per transceiver
channel for a number of key protocols used in backplane, line card, and chip-to-chip
applications (Table 2). In addition, the core of the FPGA also includes hard IP blocks
like the variable precision DSP and memory blocks for high performance applications
(Table 3).
Table 2. Hard IP in the PCS per Transceiver Channel

IP                            Features
Interlaken                    Gearbox, block sync, 64B/67B, frame sync, scrambler/descrambler, CRC-32, asynchronous buffer/deskew
10G (10GBASE-R)               Gearbox, block sync, scrambler/descrambler, 64B/66B, rate matcher
PCIe Gen1, Gen2, and Gen3     Word aligner, lane sync state machine, deskew, rate matcher, 8B/10B, gearbox, 128B/130B, PIPE-8/16/32
Serial RapidIO® 2.0           Word aligner, lane sync state machine, deskew, rate matcher, 8B/10B
CPRI/OBSAI                    Word aligner, bit slip (deterministic latency), 8B/10B
Table 3. Core Hard IP

IP                 Features
DSP                Up to 3,926 new variable-precision 18x18 DSP blocks in the core
Embedded memory    Up to 50 Mb or 2,660 M20K embedded memory blocks
Analysis of a real design shows that by implementing 24 channels of Interlaken and
two PCIe Gen3 x8 cores, a 240K-LE Stratix V FPGA is equivalent to a 610K-LE FPGA.
This is because the hardened PCS in 24 channels of Interlaken provide a savings of
120K LEs, and two PCIe Gen3 x8 hard IP save approximately 250K LEs and associated
memories, for a total savings of 370K LEs (Table 4). This savings allows customers to
implement their application on a smaller FPGA, thereby reducing cost and power.
Table 4. Total LE Savings

Hardened IP for Protocol      LE Savings
24 channels of Interlaken     120K
2 PCIe Gen3 x8 cores          250K
Total LE savings              370K
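The equivalence stated above can be reproduced from the Table 4 figures. The sketch below sums the LE savings and adds them to the 240K-LE device size. The per-channel and per-core values are simple averages derived here for illustration and are not quoted in this paper.

```python
# LE savings from Table 4, in thousands of LEs (K LEs).
LE_SAVINGS_K = {
    "24 channels of Interlaken": 120,
    "2 PCIe Gen3 x8 cores": 250,
}
DEVICE_SIZE_K = 240  # Stratix V FPGA used in the analyzed design

total_savings_k = sum(LE_SAVINGS_K.values())
equivalent_size_k = DEVICE_SIZE_K + total_savings_k

# Derived averages (illustrative only).
k_les_per_interlaken_channel = LE_SAVINGS_K["24 channels of Interlaken"] / 24
k_les_per_pcie_core = LE_SAVINGS_K["2 PCIe Gen3 x8 cores"] / 2

print(f"Total LE savings: {total_savings_k}K")
print(f"Equivalent soft-logic FPGA size: {equivalent_size_k}K LEs")
print(f"Average per Interlaken channel: {k_les_per_interlaken_channel:.0f}K LEs")
print(f"Average per PCIe Gen3 x8 core: {k_les_per_pcie_core:.0f}K LEs")
```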
Another immediate benefit of the Embedded HardCopy Blocks is that they allow
customers to integrate more functionality on a single chip without the penalty of
increased power and costs. If the density of the design is doubled on an FPGA with no
Embedded HardCopy Block (Figure 9), then a designer must use a larger FPGA that
not only increases costs but also consumes twice the static power.
Figure 9. Doubling the Density on an FPGA with No Hard IP Increases Static Power and Costs
[Diagram: two 700K-LE FPGAs with no Embedded HardCopy Block combine for 2X density,
and the relative static power of an FPGA with no Embedded HardCopy Block rises to 2X.]
Due to the Embedded HardCopy Blocks in Stratix V FPGAs (Figure 10), designers can
double the size of their design on the same FPGA with minimal impact—only 35%—
to static power. The Embedded HardCopy Blocks provide a capacity up to 700K LEs
and provide a power saving of 65% compared to soft logic implementation.
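The relationship between the 35% and 65% figures can be seen with a short calculation in relative units. The sketch below is one reading of the text: it assumes the added logic would cost a further 1.0 unit of static power in soft logic but only 0.35 units in the Embedded HardCopy Block. The function names are illustrative.

```python
BASE_STATIC_POWER = 1.0  # relative static power of the original design


def doubled_in_soft_logic(base):
    """Doubling density on an FPGA with no Embedded HardCopy Block (2X static power)."""
    return 2.0 * base


def doubled_with_hardcopy_block(base, overhead=0.35):
    """Doubling density using an Embedded HardCopy Block (+35% static power)."""
    return base * (1.0 + overhead)


soft_total = doubled_in_soft_logic(BASE_STATIC_POWER)
hard_total = doubled_with_hardcopy_block(BASE_STATIC_POWER)

# Static power attributable to the added logic alone.
added_soft = soft_total - BASE_STATIC_POWER
added_hard = hard_total - BASE_STATIC_POWER
saving_on_added_logic = 1.0 - added_hard / added_soft

print(f"Soft-logic total: {soft_total:.2f}x, HardCopy-block total: {hard_total:.2f}x")
print(f"Saving on the added logic: {saving_on_added_logic:.0%}")
```

Under this reading, the saving on the added logic matches the 65% figure quoted above.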
Figure 10. Doubling the Density on a Stratix V FPGA Using an Embedded HardCopy Block Has
Minimal Impact on Power and Cost
[Diagram: a Stratix V FPGA plus an Embedded HardCopy Block gives 2X density (1.19M LEs
or 14.3M ASIC gates) with +35% relative static power.]
Ultimate Flexibility
Ultimate flexibility reduces system downtime, power, and cost because more
functionality fits in a smaller FPGA. Designers can easily change core functionality
through partial reconfiguration of the FPGA core and transceiver functionality
through dynamic reconfiguration of the transceivers.
Partial Reconfiguration and Dynamic Reconfiguration
Stratix V FPGAs are designed to allow users to easily change the core and transceiver
functionality on the fly while other portions of the design are still running. As shown
in Figure 11, this flexibility is enabled by:
■ Easy-to-use fine-grain partial reconfiguration in the core, which requires less
development time and effort than competing solutions
■ Dynamically reconfigurable transceivers that let the design easily support
multiple protocols, data rates, and PMA settings
Figure 11. Partial and Dynamic Reconfiguration in Stratix V FPGAs
[Diagram: an FPGA core with regions A1, B1, C1, D1, E1, and F1, shown again with A1
replaced by A2 and C1 replaced by C2 while B1, D1, E1, and F1 are unchanged, next to
transceivers labeled for dynamic reconfiguration.]
Having this level of flexibility is imperative for high-bandwidth applications that
support multistandard client interfaces from 600 Mbps to 14.1 Gbps. Such
applications require service providers to make updates or adjust functionality of the
FPGA on the fly without disrupting services to other clients. This significantly
reduces system downtime.
In addition, to increase their competitive edge, customers are constantly incorporating
more functionality and system performance in their FPGA-based designs. These changes
often require a larger FPGA, which increases both cost and power.
Partial reconfiguration improves effective logic density by removing the necessity to
place functions that do not operate simultaneously in the FPGA. Instead, these
functions are stored in external memory and loaded as needed. This reduces the size
of the FPGA required by allowing multiple applications to share a single FPGA, thus
saving board space and cost and reducing power.
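The density argument above can be illustrated with a small sizing sketch. Without partial reconfiguration, every function occupies logic at the same time; with it, functions that never run simultaneously can share one region sized for the largest of them. All block names and LE counts below are hypothetical and chosen only to illustrate the calculation.

```python
# Hypothetical design, sizes in thousands of LEs (K LEs).
ALWAYS_ON_K = 150
MUTUALLY_EXCLUSIVE_K = {
    "protocol_a": 60,
    "protocol_b": 80,
    "protocol_c": 45,
}


def footprint_without_pr(always_on_k, personas_k):
    """All functions are placed in the fabric simultaneously."""
    return always_on_k + sum(personas_k.values())


def footprint_with_pr(always_on_k, personas_k):
    """Mutually exclusive functions share one region sized for the largest one."""
    return always_on_k + max(personas_k.values())


static_k = footprint_without_pr(ALWAYS_ON_K, MUTUALLY_EXCLUSIVE_K)
shared_k = footprint_with_pr(ALWAYS_ON_K, MUTUALLY_EXCLUSIVE_K)
saved_k = static_k - shared_k

print(f"Without partial reconfiguration: {static_k}K LEs")
print(f"With partial reconfiguration:    {shared_k}K LEs")
print(f"Logic saved:                     {saved_k}K LEs")
```

The sketch ignores any overhead for the reconfiguration region boundaries and control logic, so real savings would be somewhat smaller.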
Traditionally, partial reconfiguration capabilities required much longer engineering
cycles and greater design-flow complexity, which meant that designers had to know
all of the intricate FPGA architecture details. Altera has simplified the partial
reconfiguration process with a new, state-of-the-art, reconfigurable fabric in Stratix V
FPGAs and a design flow based on the proven incremental compile and
LogicLock™ flows in Quartus II design software. The benefits of Quartus II design
software include:
■ No need for intricate detailed knowledge of the FPGA
■ Unlimited number of regions (partitions)
■ Unlimited number of programming files
■ No restrictions to the order of loading the partitioned region in the FPGA
CvP and Autonomous PCIe Cores
PCIe is one of the most widely used interfaces between FPGAs and processors, ASIC,
or ASSP devices. The PCIe hard IP block embeds the PCIe protocol stack in Stratix V
FPGAs. Stratix V GX FPGAs embed up to four hard IP blocks that target the PCIe Base
Specification 3.0.
As shown in Figure 12, the FPGA fabric is initially programmed through the PCIe
link, and the FPGA fabric image can be later updated through the same link. In
addition, CvP is fully supported in all of the following PCIe link operating modes:
■ Gen1: x1, x2, x4, x8
■ Gen2: x1, x2, x4, x8
■ Gen3: x1, x2, x4, x8
Figure 12. Stratix V FPGA CvP
[Diagram: the FPGA image is loaded via the PCIe link (Gen1/Gen2/Gen3; x1, x2, x4, x8)
through the PCIe hard IP.]
The time it takes to configure large FPGAs increases as FPGAs continue to pack more
logic into smaller geometries. With the proliferation of PCIe as a control plane
interface between processors and the devices they monitor, it becomes imperative that
the FPGA is quickly and fully programmed to act as a PCIe port. Otherwise, the host
CPU may fail to recognize the FPGA as an endpoint and then operate without it.
To circumvent this possible failure of the discovery mechanism, Altera developed
autonomous PCIe cores that are operational before or while the FPGA fabric is being
programmed. According to the PCIe power-up timing sequence described in the PCIe
Base and PCIe Card Electromechanical (CEM) specifications, the time allocated for
device initialization is less than 100 ms. The autonomous PCIe core innovation allows
Stratix V FPGAs to always meet the PCIe wake-up time specification.
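The timing pressure described above can be sketched as a simple budget check. The 100-ms budget comes from the text; every other number below (image sizes, configuration bus width, and configuration clock) is a hypothetical placeholder and not a Stratix V specification. The split into a small initial image and a full fabric image is likewise an assumption made only to illustrate why bringing up the PCIe core ahead of the fabric helps.

```python
PCIE_INIT_BUDGET_S = 0.100  # initialization time quoted in the text

# Hypothetical values, not device specifications.
FULL_IMAGE_BITS = 200e6     # complete fabric image
INITIAL_IMAGE_BITS = 5e6    # small image that brings up only the PCIe core
CONFIG_WIDTH_BITS = 8       # width of the configuration interface
CONFIG_CLOCK_HZ = 100e6     # configuration clock


def config_time_s(image_bits, width_bits, clock_hz):
    """Time to load an image over a parallel configuration interface."""
    return image_bits / (width_bits * clock_hz)


full_time_s = config_time_s(FULL_IMAGE_BITS, CONFIG_WIDTH_BITS, CONFIG_CLOCK_HZ)
initial_time_s = config_time_s(INITIAL_IMAGE_BITS, CONFIG_WIDTH_BITS, CONFIG_CLOCK_HZ)

for label, t in (("Full image", full_time_s), ("Initial image", initial_time_s)):
    verdict = "within" if t < PCIE_INIT_BUDGET_S else "exceeds"
    print(f"{label}: {t * 1e3:.2f} ms ({verdict} the 100-ms budget)")
```

With these placeholder numbers, loading the whole fabric first would miss the budget, while a small initial image leaves ample margin. Actual figures depend on the device and configuration scheme.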
CvP and autonomous PCIe cores in Stratix V FPGAs allow for higher user flexibility
and provide the following benefits:
■ Reduced system costs by reducing the number of required external components
(flash and programming controllers) because the programming files are stored in
CPU memory
■ Simpler board design in less board space
■ Protection of the user application image, because image copies are accessible only
to the host CPU and are encrypted and/or compressed
■ No host-CPU stall or reboot following fabric image updates when the FPGA
operates in user mode. CvP is just another software application that the CPU can
execute.
High-Performance Process Optimized for Low Power
Migrating to smaller geometries has always provided higher integration and greater
performance than the previous node, and 28 nm is no exception. The 28-nm process
delivers clear performance benefits, but to realize the full potential of these benefits,
the proper “flavor” of the 28-nm process must be selected. Altera chose TSMC’s
28HP (high performance) high-K metal gate (HKMG) process and leveraged its
decade-long relationship with TSMC to optimize the process for low power on Stratix
V FPGAs. This process also allows Stratix V FPGAs to provide 28-Gbps
power-efficient transceivers for ultra-high bandwidth applications.
The exceptional performance at 28 nm is driven not only by the introduction of
HKMG, but also by the second generation of advanced strain technology, including
embedded silicon germanium (SiGe) in source-drain regions of transistors for faster
circuit designs. Altera produces tensile strain in NMOS transistors through a cap
layer, and compressive strain for PMOS transistors through embedded SiGe in the
source and drain (see Figure 13). These strained silicon techniques increase electron
and hole mobility by up to 30%, and the resulting transistor performance by up to
40%. Since better performance at the same level of leakage is achieved with strained
silicon, part of this performance gain is traded off for reduced leakage, leading to a
superior process that has faster performance and lower leakage compared to
processes without strained silicon. No other 28-nm process flavor has this potent
combination of HKMG and advanced strain available for maximum performance.
Figure 13. Strained Silicon Techniques at 28 nm Enable Higher Performance Transistors
[Cross-sections of an NMOS transistor and a PMOS transistor.]
Although the increased density and performance are valuable benefits, another
pressing design consideration for today’s system developers is power consumption.
Power is composed of static and dynamic (or active) power. Static power is the power
consumed by the FPGA when it is programmed but no clocks are operating. Both
digital and analog logic consume static power and, as shown in Figure 14, the static
power increases as the channel length decreases when process geometries shrink.
Figure 14. Transistor with Sources of Leakage Current
[Transistor cross-section showing the gate, gate oxide, n+ source, n+ drain, and channel
length, with leakage paths (1) and (2) marked.]
Notes:
(1) Drain-to-source leakage
(2) Gate-oxide leakage
Dynamic power is the additional power consumed through the operation of the
device caused by signals toggling and capacitive loads charging and discharging. As
shown in this equation, the main variables affecting dynamic power are capacitance
charging, the supply voltage, and the clock frequency:
$$P_{\text{dynamic}} = \frac{1}{2} C V^{2} f \cdot \text{activity}$$
The challenge of increasing power with small process geometries is felt
industry-wide, and many widely used technologies at the 28-nm process node help
maintain or increase performance while managing leakage power. Stratix V
FPGAs use the techniques shown in Table 5 to lower power while delivering the
highest performance.
Table 5. Key Process and Design Techniques Used in Stratix V FPGAs to Lower Power

Process or Design Technology                            Lower Static Power    Lower Dynamic Power
28-nm HKMG process optimized for lower power            ✔                     ✔
Lower core voltage                                      ✔                     ✔
Programmable Power Technology                           ✔                     NA
Extensive hardening of IP, Embedded HardCopy Blocks     ✔                     ✔
Hard power-down of functional blocks                    ✔                     ✔
Clock gating                                            NA                    ✔
Customized extra-low leakage devices                    ✔                     NA
Partial reconfiguration                                 ✔                     ✔
DDR3 and dynamic on-chip termination                    ✔                     ✔
Quartus II software PowerPlay power optimization        ✔                     ✔
Since dynamic power is proportional to the voltage squared, the much lower Vcc level
for the 28HP process (0.85 V to 0.9 V) is indispensable for allowing high-performance
FPGAs to attain the maximum performance possible while still keeping total power in
check. When static power is also controlled with customized low-leakage devices and
Altera’s third generation of Programmable Power Technology for select circuit blocks
that do not need high performance, the 28HP process is ideally suited for those
designs that require high-performance, high-density FPGAs with reduced power
consumption.
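The quadratic dependence on voltage can be made concrete with the dynamic power equation given earlier. The sketch below compares the two ends of the quoted 28HP Vcc range, 0.85 V and 0.9 V. The capacitance, frequency, and activity values are arbitrary placeholders that cancel out in the ratio.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity):
    """P_dynamic = 1/2 * C * V^2 * f * activity."""
    return 0.5 * capacitance_f * voltage_v ** 2 * frequency_hz * activity


# Placeholder operating point; only the voltage differs between the two cases.
CAPACITANCE_F = 1.0e-9
FREQUENCY_HZ = 500.0e6
ACTIVITY = 0.125

p_at_0v90 = dynamic_power(CAPACITANCE_F, 0.90, FREQUENCY_HZ, ACTIVITY)
p_at_0v85 = dynamic_power(CAPACITANCE_F, 0.85, FREQUENCY_HZ, ACTIVITY)

ratio = p_at_0v85 / p_at_0v90
reduction = 1.0 - ratio

print(f"Dynamic power at 0.85 V relative to 0.9 V: {ratio:.3f}")
print(f"Reduction from the lower voltage alone: {reduction:.1%}")
```

A drop of only 50 mV in supply voltage therefore removes roughly a tenth of the dynamic power at the same capacitance, frequency, and activity.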
Conclusion
Migrating to smaller geometries delivers the expected Moore’s Law benefits of
increased density and performance, but smaller geometries also mean higher static
power if nothing is done to control it. FPGA innovations allow Altera to move beyond
Moore’s Law to meet higher bandwidth requirements while meeting cost and power
budgets.
TSMC’s 28HP (HKMG high performance) process optimized for lower power and
unique architectural technologies enable Stratix V FPGAs to:
■ Lower total power by 30% compared to previous-generation devices
■ Integrate power-efficient transceivers capable of 14.1 Gbps and 28 Gbps
■ Provide lower power and higher performance when compared to competing
28-nm processes
Stratix V FPGAs provide breakthrough bandwidth via 28-Gbps power-efficient
transceivers and allow users to integrate more on a single FPGA by using Embedded
HardCopy Blocks. Combined with the added benefits of increased flexibility through
partial reconfiguration, CvP, and autonomous PCIe cores, Stratix V FPGAs allow
users to increase their system bandwidth and reduce power while staying
within their cost budgets.
Further Information
1. White paper: Hyperconnectivity and the Approaching Zettabyte Era:
www.cisco.com
2. White paper: Cisco Visual Networking Index: Forecast and Methodology, 2008–2013:
www.cisco.com
3. Stratix V FPGAs: Built for Bandwidth:
www.altera.com/products/devices/stratix-fpgas/stratix-v/stxv-index.jsp
4. Webcast: “Introducing 28-nm Stratix V FPGAs and HardCopy V ASICs: Built for
Bandwidth”:
www.altera.com/education/webcasts/all/wc-2010-introducing-stratix-v.html
5. Literature: Stratix V Devices:
www.altera.com/products/devices/stratix-fpgas/stratix-v/literature/stvliterature.jsp
Acknowledgements
■ Seyi Verma, Product Marketing Manager, High-End FPGAs, Altera Corporation
■ Peter McElheny, Director, Process Technology Development, Altera Corporation
Document Revision History
Table 6 shows the revision history for this document.
Table 6. Document Revision History

Date          Version   Changes
June 2012     1.2       ■ Updated Figure 3, Figure 6, Table 1, Figure 8, Table 3.
                        ■ Removed Table 2.
                        ■ Minor text edits.
July 2010     1.1       ■ Updated Table 1, Table 4, Figure 10, and Figure 14.
                        ■ Minor text edits.
April 2010    1.0       Initial release.