Intel® 82576 Gigabit Ethernet Controller Datasheet

LAN Access Division (LAD)
PRODUCT FEATURES
External Interfaces
• PCIe* v2.0 (2.5 GT/s) x4/x2/x1; called PCIe in this document
• MDI (Copper) standard IEEE 802.3 Ethernet interface for 1000BASE-T, 100BASE-TX, and 10BASE-T applications (802.3, 802.3u, and 802.3ab)
• Serializer-Deserializer (SERDES) to support 1000BaseSX/X/LX (optical fiber) for Gigabit backplane applications.
• SGMII for SFP/external PHY connections
• NC-SI (Type C) or SMBus for Manageability connection to BMC.
• IEEE 1149.1 JTAG
Intel® I/O Acceleration Technology
• Stateless offloads (Header split, RSS)
• Intel® QuickData (DCA - Direct Cache Access)
Virtualization Ready
• Next Generation VMDq support (8 VMs)
• PCI-SIG Single Root I/O Virtualization (Direct assignment)
• Queues per port: 16 TX queues and 16 RX queues
Full-Spectrum Security
• IPsec (256 SA’s) in 82576EB; IPsec not present in 82576NS [Non-Security]
• MACSec
Additional Product Details
• 25mm x 25mm Package
• Power 2.8W (max)
• Support for PCI 3.0 Vital Product Data
• Memories Parity or ECC Protection
• IPMI MC Pass-thru; Multi-drop NC-SI
• 802.1AS draft standard implementation
• Layout Compatible with 82575
320961-015EN
Revision: 2.61
December 2010
Legal
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,
BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED
IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL
DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR
WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life
sustaining, critical control or safety systems, or in nuclear facility applications.
Intel may make changes to specifications and product descriptions at any time, without notice.
Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that
relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any
license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property
rights.
Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel
reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future
changes to them.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature may be obtained by
calling 1-800-548-4725 or by visiting Intel's website at http://www.intel.com.
Intel and Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other
countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2007, 2008, 2009, 2010; Intel Corporation. All Rights Reserved.
Revisions
Revision  Date      Comments
0.5       6/2007    Initial availability.
1.0       11/2007   Updates and corrections.
1.9       5/2008    PRQ release.
2.0       6/2008    SRA release.
2.1       7/2008    Maintenance update. Added checklist chapter.
2.2       11/2008   Maintenance update.
2.3       12/2008
2.4       4/1/2009
• Corrected device ID reference to 0x10C9.
• Section 3.3.1.7; Section 12.3.2.2.1 - EEPROM-less information updated; stronger
  statements about EEPROM-less design.
• Table 3-17 - Device ID corrected.
• GIO_PWR_GOOD updated to PERST# throughout.
• Section 6.1 - More PXE information documented. Entire section updated. See PXE
  listings on EEPROM map. Also, links added for entire EEPROM reference map.
• Section 7.10.3.5.1, Section 7.10.3.5.2 - Notes added after VFRE filtering
  paragraphs in numbered list.
• Section 8.8.7, Section 8.8.8, Section 8.8.9, Section 8.8.10 - The ICR, ICS, IMS,
  IMC registers were corrected. See bit 3 in each.
• Chapter 10.0, System Manageability updated; organization changed; some
  additional information provided.
• Section 10.6.2.12 - Bit description in table updated (to 0x21).
• Table 10-10 - IPV4 and IPV6 filter parameter information corrected.
• Table 10-33 - List of supported commands has been updated.
• Table 11.4.2.1 - Current consumption data updated. See bold text in table. Also,
  see power data in summary on title page.
• Table 12-2 - Additional magnetics recommendation added.
• Section 6.2.18 - Bit 15 information updated; Enable WAKE# Assertion.
• Jumbo frame size consistently indicated at 9500 bytes (max).
• SKU 82576NS documented. The IPsec function is present in the 82576EB SKU.
  IPsec is not present in the 82576NS SKU. This is indicated throughout the
  document.
• Section 3.3.4.2, Flash Write Control - Typing correction. Note that writes to the
  Flash device should not be attempted when writes are disabled (EEC.FWE=01b).
• Section 3.4.2, Software Watchdog - Updated. Edited to describe the software
  interrupt (ICR[26]) and to reduce confusion.
• Section 3.5.6.5.1, Setting the 82576 to External PHY loopback Mode - Text added
  at the end of the section for clarity: The above procedure puts the device in PHY
  loopback mode. After using the procedure, wait for the link to come up. Once PHY
  register 1 bit 2 is set (this can take up to 750ms), transmit and receive normally. If
  you are unable to get link after 750ms, reset the PHY using CTRL.PHY_RST and
  then repeat the above procedure. When exiting External PHY loopback mode, a full
  PHY reset must be done. Use CTRL.PHY_RST.
• Section 4.4, Device Disable - The following phrase in the section has been changed:
  The EEPROM "Power Down Enable" bit (Section 6.2.7) enables device disable mode
  (hardware default is that the mode is disabled).
• Table 4-5, 82576 Reset Effects - Per Function Resets - Table updated. See the
  entries on PCI Configuration registers and the associated footnotes.
• Section 4.2.1.6.3, VF Software Reset - Replaced VFCTRL with VTCTRL (corrects a
  typo). Added information that indicates what happens when VTCTRL.RST is set.
  Setting VTCTRL.RST resets interrupts and queue enable bits. Other VF registers are
  not reset.
• Section 5.0, Power Management updated for clarity.
• Section 6.10.7.1, iSCSI Module Structure - Description of structure updated.
  Multiple errors were corrected.
• Section 7.1.3.1, Host Buffers - Text added. For advanced descriptor usage, the
  SRRCTL.BSIZEHEADER field is used to define the size of the buffers allocated to
  headers. The maximum buffer size supported is 960 bytes.
• Section 8.2.4, MDI Control Register - MDIC (0x00020; R/W) - Description of bit 31
  corrected.
• Section 8.10.2, Split and Replication Receive Control - SRRCTL (0x0C00C + 0x40*n
  [n=0...15]; R/W). Maximum 960 bytes now indicated for SRRCTL.BSIZEHEADER.
• Section 10.4.4.3, RMCP Filtering - Title of section updated.
• Section 10.5.10.1.4, Force TCO Command and Section 10.6.2.13.1, Perform Intel
  TCO Reset Command (Intel Command 0x22) - Added description of RESET_MGMT
  bit.
• Section 10.5.12, Example Configuration Steps - Added pseudocode describing the
  setup of common filtering configurations.
• Table 10-35, Command Summary - Commands added, see:
  0x02 0x67/68 Set EtherType Filter/Packet Add. Ext. Filter
  0x03 0x67/68 Get EtherType Filter/Packet Add. Ext. Filter
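The External PHY loopback text quoted for Section 3.5.6.5.1 above describes a wait-and-retry sequence: poll PHY register 1 bit 2 for up to 750 ms and, if link does not come up, reset the PHY and repeat the procedure. The following is a minimal Python sketch of that sequence only; it is not driver code from this datasheet. The callables configure_loopback, read_phy_reg, and reset_phy are hypothetical hooks a driver would supply (reset_phy stands in for setting CTRL.PHY_RST), and the retry limit and polling interval are assumptions.

```python
import time

PHY_STATUS_REG = 1        # "PHY register 1" in the revision note
LINK_UP_BIT = 1 << 2      # "bit 2" of that register
LINK_TIMEOUT_S = 0.750    # link "can take up to 750ms"


def wait_for_link(read_phy_reg, timeout_s=LINK_TIMEOUT_S, poll_s=0.010,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll PHY register 1 until bit 2 is set or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if read_phy_reg(PHY_STATUS_REG) & LINK_UP_BIT:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def enter_external_phy_loopback(configure_loopback, read_phy_reg, reset_phy,
                                max_attempts=3, **wait_kwargs):
    """Run the loopback procedure; on timeout, reset the PHY and repeat.

    max_attempts is an assumed bound; the revision note does not give one.
    """
    for _ in range(max_attempts):
        configure_loopback()              # hypothetical: the Section 3.5.6.5.1 steps
        if wait_for_link(read_phy_reg, **wait_kwargs):
            return True                   # transmit and receive normally
        reset_phy()                       # hypothetical: sets CTRL.PHY_RST
    return False
```

The clock and sleep parameters are injectable so the timeout logic can be exercised without real hardware or real delays. Per the same note, a full PHY reset is also required when leaving loopback mode; that step is not shown.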
2.41      4/8/2009
• Section 10.5.10.2.1, Receive TCO LAN Packet Transaction. Description of packet
  structure added.
• Section 10.6.2.6.19, Set Intel Filters - Packet Addition Extended Decision Filter
  Command (Intel Command 0x02, Filter parameter 0x68). Text in section updated:
  Extended decision filter index range adjusted to 0..4.
• Table 11-5, Current Consumption Details - Added SGMII note to table. (3) To
  estimate power for SGMII mode, use the SerDes mode power numbers provided.
• Table 11-22, Package Height - Table added. Provides a summary of package height
  information.
• Section 7.1.4, Legacy Receive Descriptor Format and Section 7.2.2, Transmit
  Descriptors. Recommendation regarding legacy descriptors changed to ‘must not be
  used’ from ‘should not be used.’
5/5/2009
2.42
7/5/2009
Internal release for test and review.
2.43
10/2/2009
MACSec capability exposed. You must have a MACSec-ready switch in order to complete the ecosystem and make use of MACSec functionality.
Maintenance issues addressed:
• Section 7.2.4.7.2, TCP/IP/UDP Headers for the Subsequent Frames and Section
  7.2.4.7.3, TCP/IP/UDP Headers for the Last Frame updated to document UDP fields.
• Section 7.3.3.2, Interrupt Moderation and Section 8.8.12, Interrupt Throttle - EITR
  (0x01680 + 4*n [n = 0...24]; R/W) updated to correct minor issues; redundant
  data removed.
• Table 7-9, VLAN Tag Field Layout (for 802.1q Packet) - Note added to table that
  clarifies usage:
  • NOTE: This table is relevant only if VMVIR.VLANA = 00b (use descriptor
    command) for the queue.
2.44      10/14/2009
2.45      10/30/2009
• Section 7.10.3.2.1, Filtering Capabilities - Typo corrected. In bullet, VM changed to
  VF. Below:
  • Promiscuous multicast & enable broadcast per VF.
• Section 7.10.3.8, Offloads - Note added; text below:
  • NOTE: VLAN strip offload is determined based only on the L2 MAC address. In
    order to make sure VLAN strip offload is correctly applied, all packets should be
    initially forwarded using one of the L2 MAC address filters (RAH/RAL, UTA,
    MTA, VMOLR.BAM, VMOLR.MPE).
• Two table titles corrected; the previous titles could have caused confusion. Minor
  edits also made to field descriptions.
  • Table 7-35, TCP/IP or UDP/IP Packet Format Sent by Host
  • Table 7-36, TCP/IP or UDP/IP Packet Format Sent by 82576
• Section 8.10.7, Receive Descriptor Ring Length - RDLEN (0x0C008 + 0x40*n
  [n=0...15]; R/W) - Description updated. LEN text added: The maximum allowed
  value is 0x80000 (32K descriptors).
• Section 8.12.2, Transmit Control Extended - TCTL_EXT (0x0404; R/W) - Default
  value of COLD corrected (0x42) in text description.
• Section 10.5.10.1.4, Force TCO Command - Clarification note added to table. See
  below:
  • NOTE: Before initiating a Firmware reset command, one should disable TCO
    receive via Receive Enable Command -- setting RCV_EN to 0 -- and wait for 200
    milliseconds before initiating Firmware Reset command. In addition, the
    MC should not transmit during this period.
• Section 10.5.10.2.1, Receive TCO LAN Packet Transaction - Receive TCO packet
  format table updated; numerous changes for clarity.
• Section 10.7.10, Read Fail-Over Configuration Host Command - Both tables in
  section updated.
  • Table 10-49, Commands to Read the Fail-Over Configuration Register - Last row
    in table deleted; was incorrect.
  • Table 10-50, States Returned - Description column (byte 1) updated.
    Description was confusing.
• Section 10.5.12.3.1, Example 3 - Pseudo Code - Pseudo Code, step 5: MAC Address
  Filtering is bit 0, not bit 1. Also the MDEF value is 00000009 and not 00000040.
• Section 10.5.12.4.1, Example 4 - Pseudo Code - Step 5: Configure MDEF[0], MDEF
  value is 0000004 and not 00000040.
• Section 9.6.4.3, PCIe SR-IOV Control Register (0x168; RW); Bit 4; ARI Capable
  Hierarchy. Text updated.
• Section 10.0, System Manageability; More information on MACSec parameters
  provided. See Section 10.5.10.1.6, Update MACSec Parameters and Section 10.8,
  MACSec and Manageability in particular.
• Section 10.5.10.1.3, Receive Enable Command; Section 10.5.10.2.5, Read
  Management Receive Filter Parameters. Bit order expression corrected in two
  tables. See bold text.
• References to BMC changed to MC if the reference is not programmatic.
• Section 3.3.1.6, EEPROM Recovery. Section now exposed in the datasheet.
• Section 8.10.8, Receive Descriptor Head - RDH (0x0C010 + 0x40*n [n=0...15];
  RO) and Section 8.12.11, Transmit Descriptor Head - TDH (0x0E010 + 0x40*n
  [n=0...15]; RO). Both registers were incorrectly indicated as RW. Changed to RO.
• Table 10-33, Supported NC-SI Commands and Table 10-34, Optional NC-SI
  Features Support. List of supported commands/functions updated to correct an
  error in our support statements. See bold text in both tables.
2.46      12/1/2009
2.47      3/10/2010
2.48      6/14/2011
2.49      8/11/2010
2.50      9/14/2010
• Table 7-18, Table 7-39, Table 7-41. ‘Packet is greater than 1552 bytes; (LPE=1b).’
  updated to ‘Packet is greater than 1518/1522/1526 bytes; (LPE=1b).’
• Chapter 8.0, Receive Control Register - RCTL (0x00100; R/W). Description of LPE
  field updated.
• Chapter 10.0, System Manageability. Changes and clarifications to list of NC-SI
  commands. Added the Get Ethertype and Get Intel Filters - Packet Addition
  Extended Decision Filter commands. Added the Set/Get Unicast/Broadcast/
  Multicast Packet Reduction filters. Added a recommendation to use the Packet
  Addition Extended Decision Filter commands (0x68) instead of the Packet Addition
  Decision Filter commands (0x61).
• Chapter 5.0, Power Management. In tables where these fields occur, the following
  fields have been flipped to reflect this order. They were previously reversed in the
  tables.
  • Possible VLAN Tag
  • Possible LLC/SNAP Header
• Chapter 5.0, Power Management. Table 5-5 through Table 5-10; offset and byte
  information has been updated.
• Section 6.10.6.1, Main Setup Options PCI Function 0 (Word 0x30). Description of
  Bit 5 updated to “IBD: iSCSI Boot Disable.”
• Section 6.10.6.7, iSCSI Option ROM Version (Word 0x36). Description of Word 0x36
  added. Describes option ROM versioning.
• Section 6.2.18, PCIe Control (Word 0x1B). Description of Bit 12 updated to “Lane
  Reversal Disable”.
• Section 7.10.3.6.2, Replication Mode Disabled - The following list item was deleted:
  ‘3. Multicast or Broadcast - If the packet is a Multicast or Broadcast packet and was
  not forwarded in step 1 and 2, set the default pool bit in the pool list (from
  VT_CTL.DEF_PL).’
• Section 7.10.3.4, Size Filtering. This section added.
• Section 10.5.10.1.6, Update MACSec Parameters. Table rows in the section
  updated. See:
  • Initialize MACSec RX
  • Initialize MACSec TX
  • Set MACSec TX Key
  • Enable MACSec
• Section 11.4.2.2, Digital I/O. Table Notes have been corrected in the table that
  resides in the section. Two notes weren’t referenced in the table correctly.
• Appendix A. Changes from the 82575. Appendix added (to datasheet).
• NC-SI identified as Type C.
• Section 7.2.5.3, SCTP CRC Offloading. This note added to section: The CRC field of
  the SCTP header must be set to zero prior to requesting a CRC calculation offload.
• Section 8.17.23, Time Sync RX Configuration - TSYNCRXCFG (0x05F50; RW). The
  TRNSSPC description column was updated.
• LinkSec references corrected to MACSec.
• Table 2-8; JTAG Reset Input (AC5) described.
• Section 6.10.5, PBA Number Module (Word 0x08, 0x09). PBA format updated.
• Section 7.1.1.2, Rx Queuing in a Virtualized Environment. Corrected.
• Table 2-9, Reserved Pins and No-Connects. Table corrected.
• Section 6.10.5, PBA Number Module (Word 0x08, 0x09). Language of section
  updated to address issues.
• Section 8.8.7, Interrupt Cause Read Register - ICR (0x01500; RC/W1C). Table was
  updated. See ICR.MDDET [bit 28].
• Table 11-14, NC-SI AC Specifications. Table corrected.
2.6       11/5/2010
2.61      12/10/2010
• On Title page, in feature table, under additional product features: bullet updated to
  “Memories Parity or ECC Protection”.
• Chapter 6.0, Non-Volatile Memory Map - EEPROM. Chapter now includes example
  settings for sample EEPROM and makes hardware settings clear.
• Section 7.2.2.3.11, PAYLEN (18). Note text updated.
• Section 8.12.14, Tx Descriptor Completion Write–Back Address Low - TDWBAL
  (0x0E038 + 0x40*n [n=0...15]; R/W). Description clarified; see bits 32:2.
• Indicated hardware defaults in Chapter 6.0, Non-Volatile Memory Map - EEPROM.
  Added loaded values for 82576_dev_start_No_Mgmt_Copper_A1 image, where
  applicable.
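Several register entries in the revision notes above give per-queue addresses as a base plus a stride, for example SRRCTL (0x0C00C + 0x40*n, n=0...15) and EITR (0x01680 + 4*n, n=0...24). The following Python sketch shows how such an expression expands to concrete offsets. The helper name reg_addr and the constant names are illustrative and are not taken from this datasheet. The descriptor size is not stated in the notes; it is inferred here from the RDLEN entry, which pairs a maximum of 0x80000 with 32K descriptors.

```python
# Base offsets as quoted in the revision notes above.
SRRCTL_BASE = 0x0C00C   # + 0x40*n, n = 0...15
RDLEN_BASE = 0x0C008    # + 0x40*n, n = 0...15
RDH_BASE = 0x0C010      # + 0x40*n, n = 0...15
TDH_BASE = 0x0E010      # + 0x40*n, n = 0...15
TDWBAL_BASE = 0x0E038   # + 0x40*n, n = 0...15
EITR_BASE = 0x01680     # + 4*n,    n = 0...24


def reg_addr(base, n, stride=0x40, count=16):
    """Return base + stride*n, rejecting an index outside 0..count-1."""
    if not 0 <= n < count:
        raise ValueError(f"index {n} outside 0..{count - 1}")
    return base + stride * n


# RDLEN: maximum value 0x80000, described as 32K descriptors.
RDLEN_MAX_BYTES = 0x80000
MAX_DESCRIPTORS = 32 * 1024
# Inferred, not quoted: bytes per descriptor implied by the two figures.
DESCRIPTOR_BYTES = RDLEN_MAX_BYTES // MAX_DESCRIPTORS

if __name__ == "__main__":
    print(hex(reg_addr(SRRCTL_BASE, 1)))                    # queue 1 SRRCTL
    print(hex(reg_addr(EITR_BASE, 24, stride=4, count=25)))  # last EITR
    print(DESCRIPTOR_BYTES)
```

Keeping the index range as a parameter lets the same helper cover both the 16-entry queue registers and the 25-entry EITR array.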
Contents
1.0 Introduction
1.1 Scope
1.2 Terminology and Acronyms
1.2.1 External Specification and Documents
1.2.1.1 Network Interface Documents
1.2.1.2 Host Interface Documents
1.2.1.3 Virtualization Documents
1.2.1.4 Networking Protocol Documents
1.2.1.5 Manageability documents
1.2.1.6 Security Documents
1.2.2 Intel Application Notes
1.2.3 Reference Schematics
1.2.4 Checklists
1.3 Product Overview
1.3.1 System Configurations
1.4 External Interface
1.4.1 PCIe* Interface
1.4.2 Network interfaces
1.4.3 EEPROM Interface
1.4.4 Serial Flash Interface
1.4.5 SMBus Interface
1.4.6 NC-SI Interface
1.4.7 MDIO/2 wires Interfaces
1.4.8 Software-Definable Pins (SDP) Interface (General-Purpose I/O)
1.4.9 LEDs Interface
1.5 Comparing Product Features
1.6 Overview of New Capabilities
1.6.1 IPsec Off Load for Flows
1.6.2 Security
1.6.3 Transmit Rate Limiting (TRL)
1.6.4 Performance
1.6.4.1 Tx Descriptor Write-Back
1.6.5 Rx and Tx Queues
1.6.6 Interrupts
1.6.7 Virtualization
1.6.7.1 PCI SR IOV
1.6.7.2 Packets Classification
1.6.7.3 Hardware Virtualization
1.6.7.4 Bandwidth Allocation
1.6.8 VPD
1.6.9 64 bit BARs support
1.6.10 IEEE 1588 - Precision Time Protocol (PTP)
1.7 Device Data Flows
1.7.1 Transmit Data Flow
1.7.2 Receive Data Flow
2.0 Pin Interface
2.1 Pin Assignment
2.1.1 PCIe
2.1.2 Flash and EEPROM Ports (8)
2.1.3 System Management Bus (SMB) Interface
2.1.4 NC-SI Interface Pins
2.1.5 Miscellaneous Pins
2.1.6 SERDES/SGMII Pins
2.1.7 SFP Pins
2.1.8 Media Dependent Interface (PHY’s MDI) Pins
2.1.8.1 LED’s (8)
2.1.8.2 Analog Pins
2.1.9 Testability Pins
2.1.10 Reserved Pins and No-Connects
2.1.11 Power Supply Pins
2.2 Pull-ups/Pull-downs
2.3 Strapping
2.4 Interface Diagram
2.5 Pin List (Alphabetical)
2.6 Ball Out
3.0 Interconnects
3.1 PCIe
3.1.1 PCIe Overview
3.1.1.1 Architecture, Transaction and Link Layer Properties
3.1.1.2 Physical Interface Properties
3.1.1.3 Advanced Extensions
3.1.2 Functionality - General
3.1.2.1 Native/Legacy
3.1.2.2 Locked Transactions
3.1.2.3 End to End CRC (ECRC)
3.1.3 Host I/F
3.1.3.1 Tag IDs
3.1.3.1.1 TAG ID Allocation for Read Transactions
3.1.3.1.2 TAG ID Allocation for Write Transactions
3.1.3.1.2.1 Case 1 - DCA Disabled in the System:
3.1.3.1.2.2 Case 2 - DCA Enabled in the System, but Disabled for the Request:
3.1.3.1.2.3 Case 3 - DCA Enabled in the System, DCA Enabled for the Request:
3.1.3.2 Completion Timeout Mechanism
3.1.3.2.1 Completion Timeout Enable
3.1.3.2.2 Resend Request Enable
3.1.3.2.3 Completion Timeout Period
3.1.4 Transaction Layer
3.1.4.1 Transaction Types Accepted by the 82576
3.1.4.1.1 Configuration Request Retry Status
3.1.4.1.2 Partial Memory Read and Write Requests
3.1.4.2 Transaction Types Initiated by the 82576
3.1.4.2.1 Data Alignment
3.1.4.2.2 Multiple Tx Data Read Requests
3.1.4.3 Messages
3.1.4.3.1 Message Handling by the 82576 (as a Receiver)
3.1.4.3.2 Message Handling by the 82576 (as a Transmitter)
3.1.4.4 Ordering Rules
3.1.4.4.1 Out of Order Completion Handling
3.1.4.5 Transaction Definition and Attributes
3.1.4.5.1 Max Payload Size
3.1.4.5.2 Traffic Class (TC) and Virtual Channels (VC)
3.1.4.5.3 Relaxed Ordering
3.1.4.5.4 Snoop Not Required
3.1.4.5.5 No Snoop and Relaxed Ordering for LAN Traffic
3.1.4.5.5.1 No-Snoop Option for Payload
3.1.4.5.5.2 No Snoop Option for TSO Header
3.1.4.6 Flow Control
3.1.4.6.1 82576 Flow Control Rules
3.1.4.6.2 Upstream Flow Control Tracking
3.1.4.6.3 Flow Control Update Frequency
3.1.4.6.4 Flow Control Timeout Mechanism
3.1.4.7 Error Forwarding
3.1.5 Data Link Layer
3.1.5.1 ACK/NAK Scheme
3.1.5.2 Supported DLLPs
3.1.5.3 Transmit EDB Nullifying
3.1.6 Physical Layer
3.1.6.1
Link Width ................................................................................................................... 90
3.1.6.2
Polarity Inversion .......................................................................................................... 90
3.1.6.3
L0s Exit latency ............................................................................................................ 91
3.1.6.4
Lane-to-Lane De-Skew .................................................................................................. 91
3.1.6.5
Lane Reversal............................................................................................................... 91
3.1.6.6
Reset .......................................................................................................................... 92
3.1.6.7
Scrambler Disable ......................................................................................................... 92
3.1.7
Error Events and Error Reporting ........................................................................................... 92
3.1.7.1
Mechanism in General.................................................................................................... 92
3.1.7.2
Error Events ................................................................................................................. 93
3.1.7.3
Error Pollution .............................................................................................................. 95
3.1.7.4
Completion with Unsuccessful Completion Status............................................................... 95
3.1.7.5
Error Reporting Changes ................................................................................................ 95
3.1.8
Performance Monitoring ....................................................................................................... 96
3.1.8.1
Leaky Bucket Mode ....................................................................................................... 96
3.1.9
PCIe Power Management ...................................................................................................... 97
3.1.10
PCIe Programming Interface ................................................................................................. 97
3.2
Management Interfaces .............................................................................................................. 97
3.2.1
SMBus ............................................................................................................................... 97
3.2.1.1
Channel Behavior .......................................................................................................... 97
3.2.1.1.1
SMBus Addressing...................................................................................................... 97
3.2.1.1.2
SMBus Notification Methods......................................................................................... 98
3.2.1.1.2.1
SMBus Alert and Alert Response Method ................................................................. 98
3.2.1.1.2.2
Asynchronous Notify Method ................................................................................. 99
3.2.1.1.2.3
Direct Receive Method .......................................................................................... 99
3.2.1.1.3
Receive TCO Flow .....................................................................................................100
3.2.1.1.4
Transmit TCO Flow ....................................................................................................100
3.2.1.1.5
Transmit Errors in Sequence Handling..........................................................................101
3.2.1.1.6
TCO Command Aborted Flow ......................................................................................102
3.2.1.1.7
Concurrent SMBus Transactions ..................................................................................102
3.2.1.1.8
SMBus ARP Functionality ............................................................................................102
3.2.1.1.8.1
SMBus ARP in Dual-/Single-Address Mode ..............................................................102
3.2.1.1.8.2
SMBus ARP Flow .................................................................................................103
3.2.1.1.8.3
SMBus ARP UDID Content ................................................................................104
3.2.1.1.9
LAN Fail-Over Through SMBus ....................................................................................106
3.2.2
NC-SI .............................................................................................................................. 106
3.2.2.1
Electrical Characteristics ...............................................................................................106
3.2.2.2
NC-SI Transactions ......................................................................................................107
3.3
Flash / EEPROM....................................................................................................................... 107
3.3.1
EEPROM Interface ............................................................................................................. 107
3.3.1.1
General Overview.........................................................................................................107
3.3.1.2
EEPROM Device ...........................................................................................................108
3.3.1.3
Software Accesses .......................................................................................................108
3.3.1.4
Signature Field ............................................................................................................108
3.3.1.5
Protected EEPROM Space ..............................................................................................109
3.3.1.5.1
Initial EEPROM Programming ......................................................................................109
3.3.1.5.2
Activating the Protection Mechanism............................................................................109
3.3.1.5.3
Non-Permitted Access to Protected Areas in the EEPROM ............................................109
3.3.1.6
EEPROM Recovery ........................................................................................................110
3.3.1.7
EEPROM-Less Support ..................................................................................................110
3.3.1.7.1
Access to the EEPROM Controlled Feature.....................................................................111
3.3.2
Shared EEPROM ................................................................................................................ 112
3.3.2.1
EEPROM Deadlock Avoidance .........................................................................................112
3.3.2.2
EEPROM Map Shared Words ..........................................................................................112
3.3.3
Vital Product Data (VPD) Support ........................................................................................ 113
3.3.4
Flash Interface.................................................................................................................. 114
3.3.4.1
Flash Interface Operation ..............................................................................................114
3.3.4.2
Flash Write Control.......................................................................................................115
3.3.4.3
Flash Erase Control ......................................................................................................115
3.3.5
Shared FLASH................................................................................................................... 115
3.3.5.1
Flash Access Contention................................................................................................115
3.3.5.2
Flash Deadlock Avoidance ............................................................................................. 115
3.4
Configurable I/O Pins ............................................................................................................... 116
3.4.1
General-Purpose I/O (Software-Definable Pins) ......................................................................116
3.4.2
Software Watchdog ............................................................................................................116
3.4.2.1
Watchdog rearm ..........................................................................................................117
3.4.3
LEDs ................................................................................................................................117
3.5
Network Interfaces .................................................................................................................. 117
3.5.1
Overview ..........................................................................................................................117
3.5.2
MAC Functionality...............................................................................................................118
3.5.2.1
Internal GMII/MII Interface ........................................................................................... 118
3.5.2.2
MDIO/MDC.................................................................................................................. 119
3.5.2.2.1
MDIC Register Usage................................................................................................. 119
3.5.2.3
Duplex Operation with Copper PHY ................................................................................. 120
3.5.2.3.1
Full Duplex............................................................................................................... 120
3.5.2.3.2
Half Duplex .............................................................................................................. 121
3.5.3
SerDes, SGMII Support .......................................................................................................121
3.5.3.1
SerDes Analog Block .................................................................................................... 121
3.5.3.2
SerDes/SGMII PCS Block .............................................................................................. 121
3.5.3.3
GbE Physical Coding Sub-Layer (PCS) ............................................................................. 121
3.5.3.3.1
8B10B Encoding/Decoding ......................................................................................... 122
3.5.3.3.2
Code Groups and Ordered Sets ................................................................................... 122
3.5.4
Auto-Negotiation and Link Setup Features .............................................................................123
3.5.4.1
SerDes Link Configuration ............................................................................................. 123
3.5.4.1.1
Signal Detect Indication ............................................................................................. 123
3.5.4.1.2
MAC Link Speed........................................................................................................123
3.5.4.1.3
SerDes Mode Auto-Negotiation ................................................................................... 124
3.5.4.1.4
Forcing Link ............................................................................................................. 124
3.5.4.1.5
HW Detection of Non-Auto-Negotiation Partner ............................................................. 125
3.5.4.2
SGMII Link Configuration .............................................................................................. 125
3.5.4.2.1
SGMII Auto-Negotiation ............................................................................................. 125
3.5.4.2.2
Forcing Link ............................................................................................................. 126
3.5.4.2.3
MAC Speed Resolution ............................................................................................... 126
3.5.4.3
Copper PHY Link Configuration....................................................................................... 126
3.5.4.3.1
PHY Auto-Negotiation (Speed, Duplex, Flow Control) .....................................................126
3.5.4.3.2
MAC Speed Resolution ............................................................................................... 126
3.5.4.3.2.1
Forcing MAC Speed ............................................................................................. 127
3.5.4.3.2.2
Using Internal PHY Direct Link-Speed Indication .....................................................127
3.5.4.3.3
MAC Full-/Half- Duplex Resolution ............................................................................... 127
3.5.4.3.4
Using PHY Registers .................................................................................................. 128
3.5.4.3.5
Comments Regarding Forcing Link............................................................................... 128
3.5.4.4
Loss of Signal/Link Status Indication .............................................................................. 128
3.5.5
Ethernet Flow Control (FC) ..................................................................................................128
3.5.5.1
MAC Control Frames and Receiving Flow Control Packets ...................................................129
3.5.5.1.1
Structure of 802.3X FC Packets................................................................................... 129
3.5.5.1.2
Operation and Rules .................................................................................................. 130
3.5.5.1.3
Timing Considerations ............................................................................................... 130
3.5.5.2
PAUSE and MAC Control Frames Forwarding .................................................................... 131
3.5.5.3
Transmission of PAUSE Frames ...................................................................................... 131
3.5.5.3.1
Operation and Rules .................................................................................................. 131
3.5.5.3.2
Software Initiated PAUSE Frame Transmission .............................................................. 132
3.5.5.4
IPG Control and Pacing ................................................................................................. 132
3.5.5.4.1
Fixed IPG Extension .................................................................................................. 132
3.5.5.4.2
Limiting Payload Rate ................................................................................................ 133
3.5.6
Loopback Support ..............................................................................................................133
3.5.6.1
General ...................................................................................................................... 133
3.5.6.2
MAC Loopback ............................................................................................................. 133
3.5.6.2.1
Setting the 82576 to MAC loopback Mode..................................................................... 134
3.5.6.3
Internal PHY Loopback.................................................................................................. 134
3.5.6.3.1
Setting the 82576 to PHY loopback Mode ..................................................................... 134
3.5.6.4
SerDes Loopback .........................................................................................................135
3.5.6.4.1
Setting SerDes loopback Mode.................................................................................... 135
3.5.6.5
External PHY Loopback .................................................................................................135
3.5.6.5.1
Setting the 82576 to External PHY loopback Mode .........................................................135
3.5.7
Integrated Copper PHY Functionality .................................................................................... 136
3.5.7.1
PHY Initialization Functionality .......................................................................................136
3.5.7.1.1
Auto MDIO Register Initialization.................................................................................136
3.5.7.1.2
General Register Initialization .....................................................................................136
3.5.7.1.3
Mirror Bit Initialization ...............................................................................................137
3.5.7.2
Determining Link State .................................................................................................137
3.5.7.2.1
False Link ................................................................................................................138
3.5.7.2.2
Forced Operation ......................................................................................................139
3.5.7.2.3
Auto Negotiation .......................................................................................................139
3.5.7.2.4
Parallel Detection ......................................................................................................139
3.5.7.2.5
Auto Cross-Over .......................................................................................................139
3.5.7.2.6
10/100 Mb/s Mismatch Resolution ...............................................................................140
3.5.7.2.7
Link Criteria .............................................................................................................141
3.5.7.2.7.1
1000BASE-T ......................................................................................................141
3.5.7.2.7.2
100BASE-TX ......................................................................................................141
3.5.7.2.7.3
10BASE-T ..........................................................................................................141
3.5.7.3
Link Enhancements ......................................................................................................141
3.5.7.3.1
SmartSpeed .............................................................................................................141
3.5.7.3.1.1
Using SmartSpeed ...........................................................................................142
3.5.7.4
Flow Control ................................................................................................................142
3.5.7.5
Management Data Interface ..........................................................................................143
3.5.7.6
Low Power Operation and Power Management .................................................................143
3.5.7.6.1
Power Down via the PHY Register ................................................................................143
3.5.7.6.2
Power Management State...........................................................................................143
3.5.7.6.3
AN1000_dis .............................................................................................................143
3.5.7.6.4
Low Power Link Up - Link Speed Control.......................................................................144
3.5.7.6.4.1
D0a State ..........................................................................................................144
3.5.7.6.4.2
Non-D0a State ...................................................................................................145
3.5.7.6.5
Smart Power-Down (SPD) ..........................................................................................145
3.5.7.6.5.1
Back-to-Back Smart Power-Down .........................................................................146
3.5.7.6.6
Link Energy Detect ....................................................................................................146
3.5.7.6.7
PHY Power-Down State ..............................................................................................146
3.5.7.7
Advanced Diagnostics ...................................................................................................147
3.5.7.7.1
TDR - Time Domain Reflectometry...............................................................................147
3.5.7.7.2
Channel Frequency Response .....................................................................................147
3.5.7.8
1000 Mb/s Operation ....................................................................................................147
3.5.7.8.1
Introduction .............................................................................................................147
3.5.7.8.2
Transmit Functions....................................................................................................148
3.5.7.8.2.1
Scrambler..........................................................................................................148
3.5.7.8.2.2
Transmit FIFO ....................................................................................................149
3.5.7.8.2.3
Transmit Phase-Locked Loop (PLL) ..........................................................................149
3.5.7.8.2.4
Trellis Encoder ...................................................................................................149
3.5.7.8.2.5
4DPAM5 Encoder ................................................................................................149
3.5.7.8.2.6
Spectral Shaper..................................................................................................149
3.5.7.8.2.7
Low-Pass Filter ...................................................................................................149
3.5.7.8.2.8
Line Driver.........................................................................................................150
3.5.7.8.3
Receive Functions .....................................................................................................150
3.5.7.8.3.1
Hybrid...............................................................................................................150
3.5.7.8.3.2
Automatic Gain Control (AGC) ..............................................................................150
3.5.7.8.3.3
Timing Recovery.................................................................................................151
3.5.7.8.3.4
Analog-to-Digital Converter (ADC) ........................................................................151
3.5.7.8.3.5
Digital Signal Processor (DSP) ..............................................................................151
3.5.7.8.3.6
Descrambler .....................................................................................................151
3.5.7.8.3.7
Viterbi Decoder/Decision Feedback Equalizer (DFE) .................................................151
3.5.7.8.3.8
4DPAM5 Decoder ................................................................................................151
3.5.7.8.3.9
100 Mb/s Operation ............................................................................................152
3.5.7.8.3.10
10 Mb/s Operation ..............................................................................................152
3.5.7.8.3.11
Link Test ...........................................................................................................152
3.5.7.8.3.12
10Base-T Link Failure Criteria and Override ............................................................152
3.5.7.8.3.13
Jabber .............................................................................................................. 152
3.5.7.8.3.14
Polarity Correction .............................................................................................. 152
3.5.7.8.3.15
Dribble Bits........................................................................................................ 152
3.5.7.8.3.16
PHY Address ...................................................................................................... 153
3.5.8
Media Auto Sense...............................................................................................................153
3.5.8.1
Auto Sense Setup ........................................................................................................153
3.5.8.1.1
SerDes/SGMII Detect Mode (PHY is active) ................................................................... 153
3.5.8.1.2
PHY Detect Mode (SerDes/SGMII is active) ................................................................... 153
3.5.8.2
Switching Between Media ........................................... 154
3.5.8.2.1
Transition to SerDes/SGMII mode ............................................................................... 154
3.5.8.2.2
Transition to Internal PHY Mode .................................................................................. 154
4.0
Initialization ............................................................................................................................155
4.1
Power Up ............................................................................................................................... 155
4.1.1
Power-Up Sequence............................................................................................................155
4.1.2
Power-Up Timing Diagram ...................................................................................................156
4.1.2.1
Timing Requirements.................................................................................................... 157
4.1.2.2
Timing Guarantees .......................................................................................................157
4.2
Reset Operation ...................................................................................................................... 157
4.2.1
Reset Sources....................................................................................................................157
4.2.1.1
Internal_Power_On_Reset ............................................................................................. 158
4.2.1.2
PE_RST_N................................................................................................................... 158
4.2.1.3
In-Band PCIe Reset ......................................................................................................158
4.2.1.4
D3hot to D0 Transition ................................................................................................. 158
4.2.1.5
Function Level Reset (FLR) ............................................................................................ 158
4.2.1.5.1
PF (Physical Function) FLR or FLR in non-IOV Mode .......................................................158
4.2.1.5.2
VF (Virtual Function) FLR (Function Level Reset) ........................................................... 158
4.2.1.5.3
IOV (IO Virtualization) Disable .................................................................................... 158
4.2.1.6
Software Reset ............................................................................................................ 159
4.2.1.6.1
Full Software Reset ...................................................................................................159
4.2.1.6.2
Physical Function (PF) Software Reset.......................................................................... 159
4.2.1.6.3
VF Software Reset.....................................................................................................159
4.2.1.7
Force TCO................................................................................................................... 159
4.2.1.8
Firmware Reset ........................................................................................................... 160
4.2.1.9
EEPROM Reset ............................................................................................................. 160
4.2.1.10
PHY Reset ................................................................................................................... 160
4.2.2
Reset Effects .....................................................................................................................161
4.2.3
PHY Behavior During a Manageability Session ........................................................................166
4.3
Function Disable...................................................................................................................... 167
4.3.1
General.............................................................................................................................167
4.3.2
Overview ..........................................................................................................................167
4.3.3
Control Options ..................................................................................................................169
4.3.3.1
PCI functions Disable Options ........................................................................................ 169
4.3.4
Event Flow for Enable/Disable Functions ................................................................................170
4.3.4.1
Multi-Function Advertisement ........................................................................................ 170
4.3.4.2
Legacy Interrupts Utilization .......................................................................................... 170
4.3.4.3
Power Reporting .......................................................................................................... 171
4.4
Device Disable ........................................................................................................................ 171
4.4.1
BIOS Handling of Device Disable ..........................................................................................171
4.5
Software Initialization and Diagnostics ...................................................................................... 172
4.5.1
Introduction ......................................................................................................................172
4.5.2
Power Up State ..................................................................................................................172
4.5.3
Initialization Sequence ........................................................................................................172
4.5.4
Interrupts During Initialization .............................................................................................172
4.5.5
Global Reset and General Configuration.................................................................................173
4.5.6
Flow Control Setup .............................................................................................................173
4.5.7
Link Setup Mechanisms and Control/Status Bit Summary.........................................................173
4.5.7.1
PHY Initialization.......................................................................................................... 173
4.5.7.2
MAC/PHY Link Setup (CTRL_EXT.LINK_MODE = 00b).......................................................... 173
4.5.7.2.1
MAC Settings Automatically Based on Duplex and Speed
Resolved by PHY (CTRL.FRCDPLX = 0b, CTRL.FRCSPD = 0b) ..........................................173
4.5.7.2.2
MAC Duplex and Speed Settings Forced by Software Based on Resolution of PHY (CTRL.FRCDPLX
= 1b, CTRL.FRCSPD = 1b) ..........................................................................................174
4.5.7.2.3
MAC/PHY Duplex and Speed Settings Both Forced by Software (Fully-Forced Link Setup)
(CTRL.FRCDPLX = 1b, CTRL.FRCSPD = 1b, CTRL.SLU = 1b) ............................................174
4.5.7.3
MAC/SERDES Link Setup
(CTRL_EXT.LINK_MODE = 11b)......................................................................................175
4.5.7.3.1
Hardware Auto-Negotiation Enabled (PCS_LCTL. AN ENABLE = 1b;
CTRL.FRCSPD = 0b; CTRL.FRCDPLX = 0) ......................................................................175
4.5.7.3.2
Auto-Negotiation Skipped (PCS_LCTL. AN ENABLE = 0b;
CTRL.FRCSPD = 1b; CTRL.FRCDPLX = 1) ......................................................................175
4.5.7.4
MAC/SGMII Link Setup (CTRL_EXT.LINK_MODE = 10b).....................................................176
4.5.7.4.1
Hardware Auto-Negotiation Enabled (PCS_LCTL. AN ENABLE = 1b,
CTRL.FRCDPLX = 0b, CTRL.FRCSPD = 0b) ....................................................................176
4.5.8
Initialization of Statistics .................................................................................................... 176
4.5.9
Receive Initialization .......................................................................................................... 177
4.5.9.1
Initialize the Receive Control Register .............................................................................177
4.5.9.2
Dynamic Enabling and Disabling of Receive Queues ..........................................................177
4.5.10
Transmit Initialization ........................................................................................................ 178
4.5.10.1
Dynamic Queue Enabling and Disabling...........................................................................178
4.5.11
Virtualization Initialization Flow ........................................................................................... 179
4.5.11.1
Next Generation VMDq Mode .........................................................................................179
4.5.11.1.1
Global Filtering and Offload Capabilities........................................................................179
4.5.11.1.2
Mirroring Rules .........................................................................................................179
4.5.11.1.3
Per Pool Settings.......................................................................................................179
4.5.11.1.4
Security Features ......................................................................................................180
4.5.11.1.4.1
Anti spoofing......................................................................................................180
4.5.11.1.4.2
Storm control.....................................................................................................180
4.5.11.1.5
Allocation of Tx Bandwidth to VMs ...............................................................................180
4.5.11.1.5.1
Configuring Tx Bandwidth to VMs..........................................................................180
4.5.11.1.5.2
Link Speed Change Procedure ..............................................................................181
4.5.11.2
IOV Initialization ..........................................................................................................181
4.5.11.2.1
PF Driver Initialization ...............................................................................................181
4.5.11.2.1.1
VF Specific Reset Coordination..............................................................................182
4.5.11.2.2
VF Driver Initialization ...............................................................................................182
4.5.11.2.3
Full Reset Coordination ..........................................................................................182
4.5.11.2.4
IOV disable ..............................................................................................................183
4.5.11.2.5
VFRE/VFTE ...............................................................................................................183
4.5.12
Transmit Rate Limiting Configuration ................................................................................... 183
4.5.12.1
Link Speed Change Procedure........................................................................................183
4.5.12.2
Configuration Flow .......................................................................................................183
4.5.12.3
Configuration Rules ......................................................................................................184
4.6
Access to shared resources ....................................................................................................... 184
4.6.1
Acquiring ownership over a shared resource.......................................................................... 184
4.6.2
Releasing ownership over a shared resource ......................................................................... 185
5.0
Power Management................................................................................................................. 187
5.1
General Power State Information ............................................................................................... 187
5.1.1
PCI Device Power States .................................................................................................... 187
5.1.2
PCIe Link Power States ...................................................................................................... 188
5.1.3
PCIe Link Power States ...................................................................................................... 188
5.2
82576 Power States ................................................................................................................. 188
5.2.1
D0 Uninitialized State (D0u) ............................................................................................... 189
5.2.1.1
Entry into D0u state .....................................................................................................189
5.2.1.2
Exit from D0u state ......................................................................................................189
5.2.2
D0active State .................................................................................................................. 190
5.2.2.1
Entry to D0a state........................................................................................................190
5.2.3
D3 State (PCI-PM D3hot) ................................................................................................... 190
5.2.3.1
Entry to D3 State .........................................................................................................190
5.2.3.2
Exit from D3 State .......................................................................................................191
5.2.3.3
Master Disable Via CTRL Register ...................................................................................191
5.2.4
Dr State (D3cold) .............................................................................................................. 191
5.2.4.1
Dr Disable Mode .......................................................................................................... 192
5.2.4.2
Entry to Dr State ......................................................................................................... 192
5.2.4.3
Auxiliary Power Usage .................................................................................................. 193
5.2.5
Link Disconnect..................................................................................................................193
5.2.6
Device Power-Down State ...................................................................................................193
5.3
Power Limits by Certain Form Factors......................................................................................... 194
5.4
Interconnects Power Management ............................................................................................. 194
5.4.1
PCIe Link Power Management ..............................................................................................194
5.4.2
NC-SI Clock Control............................................................................................................196
5.4.3
PHY Power-Management .....................................................................................................196
5.4.4
SerDes/SGMII Power Management .......................................................................................196
5.5
Timing of Power-State Transitions.............................................................................................. 196
5.5.1
Power Up (Off to Dup to D0u to D0a) .....................................................................................197
5.5.2
Transition from D0a to D3 and Back Without PE_RST_N ..........................................................198
5.5.3
Transition From D0a to D3 and Back With PE_RST_N ..............................................................199
5.5.4
Transition From D0a to Dr and Back Without Transition to D3...................................................200
5.6
Wake Up ................................................................................................................................ 201
5.6.1
Advanced Power Management Wake Up ................................................................................201
5.6.2
PCIe Power Management Wake Up .......................................................................................202
5.6.3
Wake-Up Packets ...............................................................................................................203
5.6.3.1
Pre-Defined Filters ....................................................................................................... 203
5.6.3.1.1
Directed Exact Packet ................................................................................................ 203
5.6.3.1.2
Directed Multicast Packet ........................................................................................... 203
5.6.3.1.3
Broadcast ................................................................................................................ 203
5.6.3.1.4
Magic Packet ............................................................................................................ 203
5.6.3.1.5
ARP/IPv4 Request Packet ........................................................................................... 205
5.6.3.1.6
Directed IPv4 Packet ................................................................................................. 205
5.6.3.1.7
Directed IPv6 Packet ................................................................................................. 206
5.6.3.2
Flexible Filters ............................................................................................................. 207
5.6.3.2.1
IPX Diagnostic Responder Request Packet .................................................................... 207
5.6.3.2.2
Directed IPX Packet...................................................................................................208
5.6.3.2.3
IPv6 Neighbor Discovery Filter .................................................................................... 208
5.6.3.3
Wake Up Packet Storage ............................................................................................... 209
6.0
Non-Volatile Memory Map - EEPROM ........................................................................................211
6.1
EEPROM General Map............................................................................................................... 211
6.2
Hardware Accessed Words ........................................................................................................ 213
6.2.1
Ethernet Address (Words 0x00:02) .......................................................................................213
6.2.2
Initialization Control Word 1 (Word 0x0A)..............................................................................213
6.2.3
Subsystem ID (Word 0x0B) .................................................................................................214
6.2.4
Subsystem Vendor ID (Word 0x0C) ......................................................................................215
6.2.5
Device ID (Word 0x0D, 0x11) ..............................................................................................215
6.2.6
Dummy Device ID (Word 0x1D) ...........................................................................................215
6.2.7
Initialization Control Word 2 LAN1 (Word 0x0F) ......................................................................215
6.2.8
Software Defined Pins Control LAN1 (Word 0x10) ...................................................................216
6.2.9
Software Defined Pins Control LAN0 (Word 0x20) ...................................................................218
6.2.10
EEPROM Sizing and Protected Fields (Word 0x12) ...................................................................219
6.2.11
Reserved (Word 0x13) ........................................................................................................220
6.2.12
Initialization Control 3 (Word 0x14, 0x24) .............................................................................221
6.2.13
PCIe Completion Timeout Configuration (Word 0x15) ..............................................................223
6.2.14
MSI-X Configuration (Word 0x16).........................................................................................223
6.2.15
PCIe Init Configuration 1 Word (Word 0x18) ..........................................................................223
6.2.16
PCIe Init Configuration 2 Word (Word 0x19) ..........................................................................224
6.2.17
PCIe Init Configuration 3 Word (Word 0x1A) ..........................................................................224
6.2.18
PCIe Control (Word 0x1B) ...................................................................................................225
6.2.19
LED 1,3 Configuration Defaults (Word 0x1C, 0x2A) .................................................................226
6.2.20
Device Rev ID (Word 0x1E) .................................................................................................228
6.2.21
LED 0,2 Configuration Defaults (Word 0x1F, 0x2B) .................................................................228
6.2.22
Functions Control (Word 0x21).............................................................................................230
6.2.23
LAN Power Consumption (Word 0x22) ...................................................................................231
6.2.24
I/O Virtualization (IOV) Control (Word 0x25)......................................................................... 231
6.2.25
IOV Device ID (Word 0x26) ................................................................................................ 232
6.2.26
End of Read-Only (RO) Area (Word 0x2C)............................................................................. 232
6.2.27
Start of RO Area (Word 0x2D)............................................................................................. 232
6.2.28
Watchdog Configuration (Word 0x2E)................................................................................... 232
6.2.29
VPD Pointer (Word 0x2F).................................................................................................... 232
6.2.30
NC-SI Arbitration Enable (Word 0x40).................................................................................. 233
6.3
Analog Blocks Configuration Structures....................................................................................... 233
6.3.1
Analog Configuration Pointers Start Address (Offset 0x17) ...................................................... 233
6.3.2
PCIe Initialization Pointer (Offset 0, Relative to Word 0x17 Value)............................................ 233
6.3.3
PHY Initialization Pointer (Offset 1, Relative to Word 0x17 Value) ............................................ 233
6.3.4
SerDes Initialization Pointer (Offset 2, Relative to Word 0x17 Value) ........................................ 234
6.4
SerDes/PHY/PCIe/PLL/CCM Initialization Structures ...................................................................... 234
6.4.1
Block Header (Offset 0x0) .................................................................................................. 234
6.4.2
CRC8 (Offset 1) ................................................................................................................ 235
6.4.3
Next Buffer Pointer (Offset 2 - Optional) ............................................................................... 235
6.4.4
Address/Data (Offset 3:Word Count).................................................................................... 235
6.5
Firmware Pointers & Control Words ............................................................................................ 236
6.5.1
Loader Patch Pointer (Word 0x51) ....................................................................................... 236
6.5.2
No Manageability Patch Pointer (Word 0x52) ......................................................................... 236
6.5.3
Manageability Capability/Manageability Enable (Word 0x54).................................................... 236
6.5.4
PT Patch Configuration Pointer (Word 0x55).......................................................................... 237
6.5.5
PT LAN0 Configuration Pointer (Word 0x56) .......................................................................... 237
6.5.6
Sideband Configuration Pointer (Word 0x57) ......................................................................... 237
6.5.7
Flex TCO Filter Configuration Pointer (Word 0x58) ................................................................. 237
6.5.8
PT LAN1 Configuration Pointer (Word 0x59) .......................................................................... 237
6.5.9
Management HW Config Control (Word 0x23)........................................................................ 238
6.6
Patch Structure ....................................................................................................................... 238
6.6.1
Patch Data Size (Offset 0x0) ............................................................................................... 238
6.6.2
Block CRC8 (Offset 0x1)..................................................................................................... 239
6.6.3
Patch Entry Point Pointer Low Word (Offset 0x2) ................................................................... 239
6.6.4
Patch Entry Point Pointer High Word (Offset 0x3)................................................................... 239
6.6.5
Patch Version 1 Word (Offset 0x4) ....................................................................................... 239
6.6.6
Patch Version 2 Word (Offset 0x5) ....................................................................................... 239
6.6.7
Patch Version 3 Word (Offset 0x6) ....................................................................................... 239
6.6.8
Patch Version 4 Word (Offset 0x7) ....................................................................................... 240
6.6.9
Patch Data Words (Offset 0x8, Block Length) ........................................................................ 240
6.7
PT LAN Configuration Structure ................................................................................................. 240
6.7.1
Section Header (Offset 0x0)................................................................................................ 240
6.7.2
LAN0 IPv4 Address 0 LSB, MIPAF0 (Offset 0x01) ................................................................... 240
6.7.3
LAN0 IPv4 Address 0 MSB, MIPAF0 (Offset 0x02) .................................................................. 240
6.7.4
LAN0 IPv4 Address 1; MIPAF1 (Offset 0x03:0x04) ................................................................. 241
6.7.5
LAN0 IPv4 Address 2; MIPAF2 (Offset 0x05:0x06) ............................................................... 241
6.7.6
LAN0 IPv4 Address 3; MIPAF3 (Offset 0x07:0x08) ............................................................... 241
6.7.7
LAN0 MAC Address 0 LSB, MMAL0 (Offset 0x09) .................................................................... 241
6.7.8
LAN0 MAC Address 0 LSB, MMAL0 (Offset 0x0A).................................................................... 241
6.7.9
LAN0 MAC Address 0 MSB, MMAH0 (Offset 0x0B) .................................................................. 241
6.7.10
LAN0 MAC Address 1; MMAL/H1 (Offset 0x0C:0x0E) .............................................................. 241
6.7.11
LAN0 MAC Address 2; MMAL/H2 (Offset 0x0F:0x11)............................................................... 242
6.7.12
LAN0 MAC Address 3; MMAL/H3 (Offset 0x12:0x14) .............................................................. 242
6.7.13
LAN0 UDP Flex Filter Ports 0:15; MFUTP Registers (Offset 0x15:0x24)...................................... 242
6.7.14
LAN0 VLAN Filter 0:7; MAVTV Registers (Offset 0x25:0x2C) .................................................... 242
6.7.15
LAN0 Manageability Filters Valid; MFVAL LSB (Offset 0x2D) .................................................... 242
6.7.16
LAN0 Manageability Filters Valid; MFVAL MSB (Offset 0x2E) .................................................... 242
6.7.17
LAN0 MANC Value LSB (Offset 0x2F).................................................................................... 243
6.7.18
LAN0 MANC Value MSB (Offset 0x30) ................................................................................... 243
6.7.19
LAN0 Receive Enable 1 (Offset 0x31) ................................................................................... 244
6.7.20
LAN0 Receive Enable 2 (Offset 0x32) ................................................................................... 244
6.7.21
LAN0 MANC2H Value LSB (Offset 0x33) ................................................................................ 244
6.7.22
LAN0 MANC2H Value MSB (Offset 0x34) ............................................................................... 244
6.7.23
Manageability Decision Filters; MDEF0,1 (Offset 0x35) ........................................................... 245
6.7.24
Manageability Decision Filters; MDEF0,2 (Offset 0x36) ............................................................245
6.7.25
Manageability Decision Filters; MDEF0,3 (Offset 0x37) ............................................................246
6.7.26
Manageability Decision Filters; MDEF0,4 (Offset 0x38) ............................................................246
6.7.27
Manageability Decision Filters; MDEF1:6, 1:4 (Offset 0x39:0x50) .............................................246
6.7.28
Ethertype Data (Word 0x.....................................................................................................246
6.7.29
Ethertype filter; METF0, 1 (Offset 0x51) ................................................................................246
6.7.30
Ethertype filter; METF0, 1 (Offset 0x52) ................................................................................246
6.7.31
Ethertype filter; METF1:3,1:2 (Offset 0x53:0x58) ...................................................................247
6.7.32
ARP Response IPv4 Address 0 LSB (Offset 0x59) ....................................................................247
6.7.33
ARP Response IPv4 Address 0 MSB (Offset 0x5A) ...................................................................247
6.7.34
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5B)......................................................................247
6.7.35
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x5C).....................................................................247
6.7.36
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5D) .....................................................................248
6.7.37
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x5E) .....................................................................248
6.7.38
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5F) ......................................................................248
6.7.39
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x60) .....................................................................248
6.7.40
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x61)......................................................................248
6.7.41
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x62) .....................................................................249
6.7.42
LAN0 IPv6 Address 1; MIPAF (Offset 0x63:0x6A)....................................................................249
6.7.43
LAN0 IPv6 Address 2; MIPAF (Offset 0x6B:0x72)....................................................................249
6.8
Sideband Configuration Structure .............................................................................................. 249
6.8.1
Section Header (Offset 0x0) ................................................................................................249
6.8.2
SMBus Max Fragment Size (Offset 0x1) .................................................................................249
6.8.3
SMBus Notification Timeout and Flags (Offset 0x2) .................................................................250
6.8.4
SMBus Slave Address (Offset 0x3) ........................................................................................250
6.8.5
SMBus Fail-Over Register; Low Word (Offset 0x4) ..................................................................250
6.8.6
SMBus Fail-Over Register; High Word (Offset 0x5)..................................................................251
6.8.7
NC-SI Configuration (Offset 0x6)..........................................................................................251
6.8.8
NC-SI Hardware Arbitration Configuration (Offset 0x8) ............................................................251
6.8.9
Reserved (Offset 0x9 - 0xC) ................................................................................................251
6.9
Flex TCO Filter Configuration Structure ....................................................................................... 252
6.9.1
Section Header (Offset 0x0) ................................................................................................252
6.9.2
Flex Filter Length and Control (Offset 0x01) ...........................................................................252
6.9.3
Flex Filter Enable Mask (Offset 0x02:0x09) ............................................................................252
6.9.4
Flex Filter Data (Offset 0x0A - Block Length)..........................................................................252
6.10
Software Accessed Words ......................................................................................................... 252
6.10.1
Compatibility (Word 0x03)...................................................................................................254
6.10.2
OEM specific (Word 0x04) ...................................................................................................255
6.10.3
OEM Specific (Word 0x06, 0x07) ..........................................................................................255
6.10.4
EEPROM Image Revision (Word 0x05) ...................................................................................255
6.10.5
PBA Number Module (Word 0x08, 0x09)................................................................................255
6.10.6
PXE Configuration Words (Word 0x30:3B) .............................................................................256
6.10.6.1
Main Setup Options PCI Function 0 (Word 0x30) .............................................................. 257
6.10.6.2
Configuration Customization Options PCI Function 0 (Word 0x31).......................................258
6.10.6.3
PXE Version (Word 0x32) ............................................................................................ 260
6.10.6.4
IBA Capabilities (Word 0x33)......................................................................................... 260
6.10.6.5
Setup Options PCI Function 1 (Word 0x34)...................................................................... 261
6.10.6.6
Configuration Customization Options PCI Function 1 (Word 0x35).......................................261
6.10.6.7
iSCSI Option ROM Version (Word 0x36) .......................................................................... 261
6.10.6.8
Setup Options PCI Function 2 (Word 0x38)...................................................................... 261
6.10.6.9
Configuration Customization Options PCI Function 2 (Word 0x39).......................................261
6.10.6.10
Setup Options PCI Function 3 (Word 0x3A)...................................................................... 261
6.10.6.11
Configuration Customization Options PCI Function 3 (Word 0x3B).......................................261
6.10.7
iSCSI Boot Configuration Offset (Word 0x3D) .........................................................................261
6.10.7.1
iSCSI Module Structure.................................................................................................261
6.10.8
Alternate MAC Address Pointer (Word 0x37) ..........................................................................263
6.10.9
Checksum Word (Word 0x3F) ..............................................................................................263
6.10.10
Image Unique ID (Word 0x42, 0x43) ....................................................................................263
7.0
Inline Functions .......................................................................................................265
7.1
Receive Functionality ............................................................................................... 265
7.1.1
Rx Queues Assignment ...................................................................................................... 265
7.1.1.1
Queuing in a Non-Virtualized Environment.......................................................................267
7.1.1.2
Rx Queuing in a Virtualized Environment .........................................................................268
7.1.1.3
Queue Configuration Registers .......................................................................................271
7.1.1.4
L2 Ether-Type Filters ....................................................................................................271
7.1.1.5
L3/L4 5-Tuple Filters ....................................................................................................272
7.1.1.6
SYN Packet Filters ........................................................................................................273
7.1.1.7
Receive-Side Scaling (RSS) ...........................................................................................273
7.1.1.7.1
RSS Hash Function ....................................................................................................274
7.1.1.7.1.1
Hash for IPv4 with TCP ........................................................................................277
7.1.1.7.1.2
Hash for IPv4 with UDP .......................................................................................277
7.1.1.7.1.3
Hash for IPv4 without TCP ...................................................................................277
7.1.1.7.1.4
Hash for IPv6 with TCP ........................................................................................277
7.1.1.7.1.5
Hash for IPv6 with UDP .......................................................................................277
7.1.1.7.1.6
Hash for IPv6 without TCP ...................................................................................277
7.1.1.7.2
Indirection Table.......................................................................................................277
7.1.1.7.3
RSS Verification Suite ................................................................................................277
7.1.1.7.3.1
IPv4..................................................................................................................278
7.1.1.7.3.2
IPv6 ..................................................................................................................278
7.1.1.7.4
Association Through MAC Address ...............................................................................278
7.1.2
L2 Packet Filtering ............................................................................................................. 278
7.1.2.1
MAC Address Filtering ...................................................................................................280
7.1.2.1.1
Unicast Filter ............................................................................................................281
7.1.2.1.2
Multicast Filter (Partial)..............................................................................................282
7.1.2.2
VLAN Filtering..............................................................................................................282
7.1.2.3
Manageability Filtering ..................................................................................................283
7.1.3
Receive Data Storage ........................................................................................................ 285
7.1.3.1
Host Buffers ................................................................................................................285
7.1.3.2
On-Chip Rx Buffers.......................................................................................................285
7.1.3.3
On-Chip descriptor Buffers ............................................................................................285
7.1.4
Legacy Receive Descriptor Format ....................................................................................... 285
7.1.5
Advanced Receive Descriptors ............................................................................................. 289
7.1.5.1
Advanced Receive Descriptors (Read Format) ..................................................................289
7.1.5.2
Advanced Receive Descriptors — Writeback Format ..........................................................289
7.1.6
Receive Descriptor Fetching ................................................................................................ 295
7.1.7
Receive Descriptor Write-Back ............................................................................................ 295
7.1.8
Receive Descriptor Ring Structure........................................................................................ 296
7.1.8.1
Low Receive Descriptors Threshold .................................................................................297
7.1.9
Header Splitting and Replication .......................................................................................... 298
7.1.9.1
Purpose ......................................................................................................................298
7.1.9.2
Description..................................................................................................................298
7.1.10
Receive Packet Checksum Off Loading.................................................................................. 300
7.1.10.1
Filters details...............................................................................................................302
7.1.10.1.1
MAC Address Filter ....................................................................................................302
7.1.10.1.2
SNAP/VLAN Filter ......................................................................................................302
7.1.10.1.3
IPv4 Filter ................................................................................................................303
7.1.10.1.4
IPv6 Filter ................................................................................................................303
7.1.10.1.5
IPv6 Extension Headers .............................................................................................303
7.1.10.1.6
UDP/TCP Filter ..........................................................................................................304
7.1.10.2
Receive UDP Fragmentation Checksum ...........................................................................304
7.1.11
SCTP Offload .................................................................................................................... 305
7.2
Transmit Functionality .............................................................................................................. 305
7.2.1
Packet Transmission .......................................................................................................... 305
7.2.1.1
Transmit Data Storage..................................................................................................306
7.2.1.2
On-Chip Tx Buffers.......................................................................................................306
7.2.1.3
On-Chip descriptor Buffers ............................................................................................306
7.2.1.4
Transmit Contexts........................................................................................................306
7.2.2
Transmit Descriptors.......................................................................................................... 307
7.2.2.1
Legacy Transmit Descriptor Format ................................................................................308
7.2.2.1.1
Address (64) ............................................................................................................308
7.2.2.1.2
Length.....................................................................................................................308
7.2.2.1.3
Checksum Offset and Start — CSO and CSS ................................................................. 308
7.2.2.1.4
Command Byte — CMD.............................................................................................. 309
7.2.2.1.5
Status – STA ............................................................................................................ 310
7.2.2.1.6
DD (Bit 0) — Descriptor Done Status ........................................................................... 310
7.2.2.1.7
VLAN....................................................................................................................... 310
7.2.2.2
Advanced Transmit Context Descriptor............................................................................ 311
7.2.2.2.1
IPLEN (9)................................................................................................................. 311
7.2.2.2.2
MACLEN (7) ............................................................................................................. 311
7.2.2.2.3
IPsec SA IDX (8)....................................................................................................... 312
7.2.2.2.4
Reserved (24) .......................................................................................................... 312
7.2.2.2.5
IPS_ESP_LEN (9) ...................................................................................................... 312
7.2.2.2.6
TUCMD (11) ............................................................................................................. 312
7.2.2.2.7
DTYP (4).................................................................................................................. 312
7.2.2.2.8
RSV (5) ................................................................................................................... 312
7.2.2.2.9
DEXT....................................................................................................................... 312
7.2.2.2.10
RSV (6) ................................................................................................................... 312
7.2.2.2.11
IDX (3).................................................................................................................... 313
7.2.2.2.12
RSV (1) ................................................................................................................... 313
7.2.2.2.13
L4LEN (8) ................................................................................................................ 313
7.2.2.2.14
MSS (16) ................................................................................................................. 313
7.2.2.3
Advanced Transmit Data Descriptor ................................................................................ 314
7.2.2.3.1
Address (64) ............................................................................................................ 314
7.2.2.3.2
DTALEN (16) ............................................................................................................ 314
7.2.2.3.3
RSV (2) ................................................................................................................... 314
7.2.2.3.4
MAC (2)................................................................................................................... 314
7.2.2.3.5
DTYP (4).................................................................................................................. 315
7.2.2.3.6
DCMD (8) ................................................................................................................ 315
7.2.2.3.7
STA (4) ................................................................................................................... 316
7.2.2.3.8
IDX (3).................................................................................................................... 316
7.2.2.3.9
RSV (1) ................................................................................................................... 316
7.2.2.3.10
POPTS (6)................................................................................................................ 316
7.2.2.3.11
PAYLEN (18) ............................................................................................................ 316
7.2.2.4
Transmit Descriptor Ring Structure................................................................................. 317
7.2.2.5
Transmit Descriptor Fetching ......................................................................................... 318
7.2.2.6
Transmit Descriptor Write-Back ..................................................................................... 319
7.2.3
Tx Completions Head Write-Back ..........................................................................................320
7.2.3.1
Description ................................................................................................................. 320
7.2.4
TCP/UDP Segmentation .......................................................................................................321
7.2.4.1
Assumptions ............................................................................................................... 321
7.2.4.2
Transmission Process ................................................................................................... 321
7.2.4.2.1
TCP Segmentation Data Fetch Control.......................................................................... 322
7.2.4.2.2
TCP Segmentation Write-Back Modes........................................................................... 322
7.2.4.3
TCP Segmentation Performance ..................................................................................... 323
7.2.4.4
Packet Format ............................................................................................................. 323
7.2.4.5
TCP/UDP Segmentation Indication .................................................................................. 324
7.2.4.6
Transmit Checksum Offloading with TCP/UDP Segmentation ...............................................325
7.2.4.7
IP/TCP/UDP Header Updating ........................................................................................ 326
7.2.4.7.1
TCP/IP/UDP Header for the First Frames ...................................................................... 326
7.2.4.7.2
TCP/IP/UDP Headers for the Subsequent Frames........................................................... 327
7.2.4.7.3
TCP/IP/UDP Headers for the Last Frame ....................................................................... 328
7.2.4.8
IP/TCP/UDP Checksum Offloading .................................................................................. 328
7.2.4.9
Data Flow ................................................................................................................... 328
7.2.5
Checksum Offloading in Non-Segmentation Mode ...................................................................329
7.2.5.1
IP Checksum ............................................................................................................... 330
7.2.5.2
TCP Checksum............................................................................................................. 330
7.2.5.3
SCTP CRC Offloading .................................................................................................... 331
7.2.5.4
Checksum Supported Per Packet Types ........................................................................... 331
7.2.6
Multiple Transmit Queues ....................................................................................................332
7.2.6.1
Bandwidth Allocation to Virtual Machines / Transmit Queues ..............................................332
7.3
Interrupts............................................................................................................................... 333
7.3.1
Mapping of Interrupt Causes ................................................................................................333
7.3.1.1
Legacy and MSI Interrupt Modes ....................................................................................333
7.3.1.2
MSI-X Mode — Non-IOV Mode .......................................................................................334
7.3.1.3
MSI-X Interrupts in SR-IOV Mode...................................................................................336
7.3.2
Registers.......................................................................................................................... 337
7.3.2.1
Interrupt Cause Register (ICR) ......................................................................................338
7.3.2.1.1
Legacy Mode ............................................................................................................338
7.3.2.1.2
Advanced Mode ........................................................................................................338
7.3.2.2
Interrupt Cause Set Register (ICS) .................................................................................338
7.3.2.3
Interrupt Mask Set/Read Register (IMS)..........................................................................338
7.3.2.4
Interrupt Mask Clear Register (IMC) ...............................................................................338
7.3.2.5
Interrupt Acknowledge Auto-mask register (IAM) .............................................................339
7.3.2.6
Extended Interrupt Cause Registers (EICR) .....................................................................339
7.3.2.6.1
MSI/INT-A Mode .......................................................................................................339
7.3.2.6.2
MSI-X Mode .............................................................................................................339
7.3.2.7
Extended Interrupt Cause Set Register (EICS) .................................................................339
7.3.2.8
Extended Interrupt Mask Set and Read Register (EIMS) & Extended Interrupt Mask Clear Register (EIMC)........................................................................................................................339
7.3.2.9
Extended Interrupt Auto Clear Enable Register (EIAC).......................................................340
7.3.2.10
Extended Interrupt Auto Mask Enable Register (EIAM) ......................................................340
7.3.2.11
GPIE ..........................................................................................................................340
7.3.3
MSI-X and Vectors............................................................................................................. 341
7.3.3.1
Usage of Spare MSI-X Vectors by Physical Function ..........................................................341
7.3.3.2
Interrupt Moderation ....................................................................................................342
7.3.3.2.1
More on Using EITR ...................................................................................................344
7.3.4
Clearing Interrupt Causes ................................................................................................... 344
7.3.4.1
Auto-Clear ..................................................................................................................344
7.3.4.2
Write to Clear ..............................................................................................................345
7.3.4.3
Read to Clear ..............................................................................................................345
7.3.5
Rate Controlled Low Latency Interrupts (LLI) ........................................................................ 345
7.3.5.1
Rate Control Mechanism ...............................................................................................346
7.3.6
TCP Timer Interrupt........................................................................................................... 346
7.3.6.1
Introduction ................................................................................................................346
7.3.6.2
Description..................................................................................................................347
7.4
802.1q VLAN Support............................................................................................................... 347
7.4.1
802.1q VLAN Packet Format................................................................................................ 347
7.4.2
802.1q Tagged Frames ...................................................................................................... 347
7.4.3
Transmitting and Receiving 802.1q Packets .......................................................................... 348
7.4.3.1
Adding 802.1q Tags on Transmits ..................................................................................348
7.4.3.2
Stripping 802.1q Tags on Receives .................................................................................348
7.4.4
802.1q VLAN Packet Filtering .............................................................................................. 348
7.4.5
Double VLAN Support ........................................................................................................ 349
7.4.5.1
Transmit Behavior........................................................................................................349
7.4.5.2
Receive Behavior .........................................................................................................350
7.5
Configurable LED Outputs ......................................................................................................... 350
7.5.1
MODE Encoding for LED Outputs.......................................................................................... 351
7.6
Memory Error Correction and Detection ...................................................................................... 352
7.7
DCA....................................................................................................................................... 352
7.7.1
Description ....................................................................................................................... 352
7.7.2
Details of Implementation .................................................................................................. 353
7.7.2.1
PCIe Message Format for DCA .......................................................................................353
7.8
Transmit Rate Limiting (TRL)..................................................................................................... 355
7.9
Next Generation Security.......................................................................................................... 358
7.9.1
MACSec ........................................................................................................................... 358
7.9.1.1
Packet Format ............................................................................................................358
7.9.1.2
MACSec Header (SecTag) Format ...................................................................................359
7.9.1.2.1
MACSec Ethertype.....................................................................................................359
7.9.1.2.2
TCI and AN ..............................................................................................................359
7.9.1.2.3
Short Length ............................................................................................................360
7.9.1.2.4
Packet Number (PN) ..................................................................................................360
7.9.1.2.5
Secure Channel Identifier (SCI) ..................................................................................360
7.9.1.2.6
Initial Value (IV) Calculation .......................................................................................360
7.9.1.3
MACSec Management – KaY (Key Agreement Entity) ........................................................ 360
7.9.1.4
Receive Flow ............................................................................................................... 361
7.9.1.4.1
MACSec Receive Modes.............................................................................................. 362
7.9.1.4.2
Receive SA Exhausting – Re-Keying............................................................................. 362
7.9.1.4.3
Receive SA Context and Identification.......................................................................... 363
7.9.1.4.4
Receive Statistic Counters .......................................................................................... 363
7.9.1.5
Transmit Flow.............................................................................................................. 363
7.9.1.5.1
Transmit SA Exhausting – Re-keying ........................................................................... 363
7.9.1.5.2
Transmit SA Context ................................................................................................. 364
7.9.1.5.3
Transmit Statistic Counters ........................................................................................ 364
7.9.1.6
Manageability Engine/ Host Relations.............................................................................. 364
7.9.1.6.1
Key and Tamper Protection ........................................................................................ 364
7.9.1.6.2
Key Protection .......................................................................................................... 364
7.9.1.6.3
Tamper Protection.....................................................................................................365
7.9.1.6.4
MACSec Control Switch Between Firmware and Software ................................................365
7.9.1.7
Manageability Flow....................................................................................................... 365
7.9.1.7.1
Initialization ............................................................................................................. 365
7.9.1.7.2
Operation flow .......................................................................................................... 365
7.9.1.8
Switching ownership between Host and Manageability.......................................................365
7.9.2
IPSec Support....................................................................................................................365
7.9.2.1
Related RFCs and Other References ................................................................................ 366
7.9.2.2
Hardware Features List ................................................................................................. 366
7.9.2.2.1
Main Features........................................................................................................... 366
7.9.2.2.2
Cross Features ......................................................................................................... 367
7.9.2.3
Software/Hardware Demarcation.................................................................................... 368
7.9.2.4
IPsec Formats Exchanged Between Hardware and Software ...............................................369
7.9.2.4.1
Single Send.............................................................................................................. 369
7.9.2.4.2
Single Send With TCP/UDP Checksum Offload ............................................................... 369
7.9.2.4.3
Large Send TCP/UDP ................................................................................................. 369
7.9.2.5
TX SA Table ................................................................................................................ 372
7.9.2.5.1
Tx SA Table Structure................................................................................................ 372
7.9.2.5.2
Access to Tx SA Table................................................................................................ 372
7.9.2.6
TX Hardware Flow ........................................................................................................373
7.9.2.6.1
Single Send Without TCP/UDP Checksum Offload: ......................................................... 373
7.9.2.6.2
Single Send With TCP/UDP Checksum Offload: .............................................................. 373
7.9.2.6.3
Large Send TCP/UDP: ................................................................................................ 373
7.9.2.7
AES-128 Operation in Tx............................................................................................... 374
7.9.2.7.1
AES-128-GCM for ESP — Both Authenticate and Encrypt ................................................375
7.9.2.7.2
AES-128-GMAC for ESP — Authenticate Only ................................................................ 375
7.9.2.7.3
AES-128-GMAC for AH — Authenticate Only ................................................................. 376
7.9.2.8
RX Descriptors............................................................................................................. 376
7.9.2.9
Rx SA Table ................................................................................................................ 376
7.9.2.9.1
Rx SA Table Structure ............................................................................................... 376
7.9.2.9.2
Normal Access to Rx SA Table .................................................................................... 377
7.9.2.9.3
Debugging Read Access to Rx SA Table........................................................................ 377
7.9.2.10
RX Hardware Flow Without TCP/UDP Checksum Offload.....................................................378
7.9.2.11
RX Hardware Flow With TCP/UDP Checksum Offload ......................................................... 378
7.9.2.12
AES-128 Operation in Rx .............................................................................................. 379
7.9.2.13
Handling IPsec Packets in Rx ......................................................................................... 379
7.10
Virtualization .......................................................................................................................... 379
7.10.1
Overview ..........................................................................................................................379
7.10.1.1
Direct Assignment Model............................................................................................... 380
7.10.1.1.1
Rationale ................................................................................................................. 380
7.10.1.2
System Overview ......................................................................................................... 381
7.10.1.3
VMDq1 Versus Next Generation VMDq ............................................................................ 384
7.10.2
PCI Sig SR-IOV Support ......................................................................................................384
7.10.2.1
SR-IOV Concepts ......................................................................................................... 384
7.10.2.2
Config Space Replication ...............................................................................................384
7.10.2.2.1
Legacy PCI Config Space............................................................................................ 385
7.10.2.2.2
Memory BARs Assignment ......................................................................................... 385
7.10.2.2.3
PCIe Capability Structure ........................................................................................... 386
7.10.2.2.4
PCI-Express capability structure..................................................................................386
7.10.2.2.5
MSI and MSI-X Capabilities ........................................................................................386
7.10.2.2.6
VPD Capability ..........................................................................................................387
7.10.2.2.7
Power Management Capability ....................................................................................387
7.10.2.2.8
Serial ID ..................................................................................................................387
7.10.2.2.9
Error Reporting Capabilities (Advanced & Legacy).........................................................387
7.10.2.3
Function Level Reset (FLR) Capability .............................................................................387
7.10.2.4
Error Reporting ............................................................................................................387
7.10.2.5
ARI & IOV Capability Structures .....................................................................................388
7.10.2.6
Requester ID Allocation.................................................................................................388
7.10.2.6.1
Bus-Device-Function Layout .......................................................................................388
7.10.2.6.1.1
ARI Mode ..........................................................................................................388
7.10.2.6.1.2
Non ARI Mode ....................................................................................................389
7.10.2.7
Hardware Resources Assignment....................................................................................389
7.10.2.7.1
Physical Function Resources .......................................................................................389
7.10.2.7.2
Resource Summary ...................................................................................................389
7.10.2.8
CSR Organization .........................................................................................................390
7.10.2.9
IOV Control .................................................................................................................390
7.10.2.9.1
VF to PF Mailbox .......................................................................................................390
7.10.2.10
Interrupt Handling .......................................................................................................392
7.10.2.10.1 Low latency Interrupts ...............................................................................................393
7.10.2.10.2 MSI-X......................................................................................................................393
7.10.2.10.3 MSI.........................................................................................................................393
7.10.2.10.4 Legacy Interrupt (INT-x)............................................................................................393
7.10.2.11
DMA...........................................................................................................................393
7.10.2.11.1 Requester ID ............................................................................................................393
7.10.2.11.2 Sharing DMA Resources .............................................................................................394
7.10.2.11.3 DCA ........................................................................................................................394
7.10.2.12
Timers and Watchdog ...................................................................................................394
7.10.2.12.1 TCP Timer ................................................................................................................394
7.10.2.12.2 IEEE 1588................................................................................................................394
7.10.2.12.3 Watchdog ................................................................................................................394
7.10.2.12.4 Free Running Timer ...................................................................................................394
7.10.2.13
Power Management and Wakeup....................................................................................394
7.10.2.14
Link Control ................................................................................................................394
7.10.2.14.1 Special Filtering Options.............................................................................................394
7.10.2.14.2 Allocation of memory space for IOV functions ...............................................................395
7.10.3
Packet Switching ............................................................................................................... 395
7.10.3.1
Assumptions................................................................................................................395
7.10.3.2
VF Selection ................................................................................................................395
7.10.3.2.1
Filtering Capabilities ..................................................................................................395
7.10.3.3
L2 Filtering..................................................................................................................396
7.10.3.4
Size Filtering ...............................................................................................................396
7.10.3.5
RX Packets Switching ...................................................................................................396
7.10.3.5.1
Replication Mode Enabled...........................................................................................397
7.10.3.5.2
Replication Mode Disabled ..........................................................................................399
7.10.3.6
TX Packets Switching....................................................................................................401
7.10.3.6.1
Replication Mode Enabled...........................................................................................403
7.10.3.6.2
Replication Mode Disabled ..........................................................................................404
7.10.3.7
Mirroring Support.........................................................................................................405
7.10.3.8
Offloads......................................................................................................................406
7.10.3.8.1
Replication by Exact MAC Address ...............................................................................406
7.10.3.8.2
Replication by Promiscuous Modes...............................................................................406
7.10.3.8.3
Replication by Mirroring .............................................................................................406
7.10.3.8.4
VLAN Only Filtering ...................................................................................................406
7.10.3.8.5
Local Traffic Offload...................................................................................................407
7.10.3.8.6
Small Packets Padding ...............................................................................................407
7.10.3.9
Security Features .........................................................................................................407
7.10.3.9.1
Inbound Security ......................................................................................................407
7.10.3.9.2
Outbound Security ....................................................................................................407
7.10.3.9.2.1
Anti Spoofing .....................................................................................................408
7.10.3.9.2.2
VLAN Insertion From Register Instead of Descriptor ................................................408
7.10.3.9.2.3
Egress VLAN Filtering .......................................................................................... 408
7.10.3.9.3
Interrupt Misbehavior of VM ....................................................................................... 408
7.10.3.10
Congestion Control....................................................................................................... 409
7.10.3.10.1 Receive Priority ........................................................................................................ 409
7.10.3.10.2 Queue Arbitration and Rate Control ............................................................................. 409
7.10.3.10.3 Storm Control........................................................................................................... 409
7.10.3.10.3.1 Assumptions ...................................................................................................... 409
7.10.3.10.3.2 Storm Control Functionality.................................................................................. 409
7.10.3.11
External Switch Loopback Support.................................................................................. 410
7.10.3.12
Switch Control ............................................................................................................. 410
7.10.4
Virtualization of the Hardware ..............................................................................................411
7.10.4.1
Per Pool Statistics ........................................................................................................ 411
7.11
Time SYNC (IEEE1588 and 802.1AS).......................................................................................... 412
7.11.1
Overview ..........................................................................................................................412
7.11.2
Flow and Hardware/Software Responsibilities .........................................................................412
7.11.2.1
TimeSync Indications in Receive and Transmit Packet Descriptors.......................................414
7.11.3
Hardware Time Sync Elements .............................................................................................414
7.11.3.1
System Time Structure and Mode of Operation................................................................. 414
7.11.3.2
Time Stamping Mechanism............................................................................................ 415
7.11.3.3
Time Adjustment Mode of Operation ............................................................................... 416
7.11.4
Time Sync Related Auxiliary Elements ...................................................................................416
7.11.4.1
Target Time ................................................................................................................ 416
7.11.4.2
Time Stamp Events ......................................................................................................416
7.11.5
PTP Packet Structure ..........................................................................................................416
7.12
Statistics ................................................................................................................................ 419
7.12.1
IEEE 802.3 clause 30 management.......................................................................................419
7.12.2
OID_GEN_STATISTICS........................................................................................................421
7.12.3
RMON ...............................................................................................................................421
7.12.4
Linux net_device_stats........................................................................................................422
7.12.5
MACSec statistics ...............................................................................................................423
7.12.6
Rx statistics.......................................................................................................................423
7.12.7
Statistics hierarchy .............................................................................................................424
8.0
Programming Interface ............................................................................................................429
8.1
Introduction............................................................................................................................ 429
8.1.1
Memory and I/O Address Decoding .......................................................................................429
8.1.1.1
Memory-Mapped Access to Internal Registers and Memories ..............................................429
8.1.1.2
Memory-Mapped Access to Flash .................................................................................... 429
8.1.1.3
Memory-Mapped Access to MSI-X Tables......................................................................... 430
8.1.1.4
Memory-Mapped Access to Expansion ROM...................................................................... 430
8.1.1.5
I/O-Mapped Access to Internal Registers, Memories, and Flash ..........................................430
8.1.1.5.1
IOADDR (I/O offset 0x00) .......................................................................................... 430
8.1.1.5.2
IODATA (I/O offset 0x04) .......................................................................................... 431
8.1.1.5.3
Undefined I/O offsets ................................................................................................ 432
8.1.2
Register Conventions ..........................................................................................................432
8.1.2.1
Registers Byte Ordering ................................................................................................ 434
8.1.3
Register Summary..............................................................................................................435
8.1.4
MSI-X BAR Register Summary .............................................................................................453
8.2
General Register Descriptions.................................................................................................... 454
8.2.1
Device Control Register - CTRL (0x00000; R/W) .....................................................................454
8.2.2
Device Status Register - STATUS (0x00008; R) ......................................................................458
8.2.3
Extended Device Control Register - CTRL_EXT (0x00018; R/W) ................................................460
8.2.4
MDI Control Register - MDIC (0x00020; R/W) ........................................................................463
8.2.5
SerDes ANA - SERDESCTL (0x00024; R/W) ...........................................................................464
8.2.6
Copper/Fiber Switch Control - CONNSW (0x00034; R/W).........................................................464
8.2.7
VLAN Ether Type - VET (0x00038; R/W) ................................................................................465
8.2.8
LED Control - LEDCTL (0x00E00; RW) ...................................................................................465
8.3
Packet Buffers Control Register Descriptions ............................................................................... 466
8.3.1
RX PB Size - RXPBS (0x2404; RW) .......................................................................................466
8.3.2
TX PB Size - TXPBS (0x3404; RW)........................................................................................467
8.3.3
Switch PB Size - SWPBS (0x3004; RW) ................................................................................ 467
8.3.4
Tx Packet Buffer Wrap Around Counter - PBTWAC (0x34e8; RO).............................................. 467
8.3.5
Rx Packet Buffer Wrap Around Counter - PBRWAC (0x24e8; RO) ............................................. 467
8.3.6
Switch Packet Buffer Wrap Around Counter - PBSWAC (0x30e8; RO)........................................ 467
8.4
EEPROM/Flash Register Descriptions .......................................................................................... 468
8.4.1
EEPROM/Flash Control Register - EEC (0x00010; R/W) ........................................................... 468
8.4.2
EEPROM Read Register - EERD (0x00014; RW)...................................................................... 469
8.4.3
Flash Access - FLA (0x0001C; R/W) ..................................................................................... 470
8.4.4
Flash Opcode - FLASHOP (0x0103C; R/W) ............................................................................ 471
8.4.5
EEPROM Diagnostic - EEDIAG (0x01038; RO)........................................................................ 471
8.4.6
EEPROM Auto Read Bus Control - EEARBC (0x01024; R/W)..................................................... 472
8.4.7
VPD diagnostic register - VPDDIAG (0x1060; RO) ................................................................... 473
8.4.8
MNG-EEPROM CSR I/F ....................................................................................................... 473
8.4.8.1
MNG EEPROM Control Register - EEMNGCTL (0x1010; RO) ................................................474
8.4.8.2
MNG EEPROM Read/Write data - EEMNGDATA (0x1014; RO)..............................................474
8.5
Flow Control Register Descriptions ............................................................................................. 475
8.5.1
Flow Control Address Low - FCAL (0x00028; RO) ................................................................... 475
8.5.2
Flow Control Address High - FCAH (0x0002C; RO) ................................................................. 475
8.5.3
Flow Control Type - FCT (0x00030; R/W) ............................................................................. 475
8.5.4
Flow Control Transmit Timer Value - FCTTV (0x00170; R/W) ................................................... 476
8.5.5
Flow Control Receive Threshold Low - FCRTL0 (0x02160; R/W) .............................................. 476
8.5.6
Flow Control Receive Threshold High - FCRTH0 (0x02168; R/W) .............................................. 476
8.5.7
Flow Control Refresh Threshold Value - FCRTV (0x02460; R/W)............................................... 477
8.5.8
Flow Control Status - FCSTS0 (0x2464; RO) ......................................................................... 477
8.6
PCIe Register Descriptions ........................................................................................................ 478
8.6.1
PCIe Control - GCR (0x05B00; RW) ..................................................................................... 478
8.6.2
IOV control - IOVCTL (0x05BBC; RW) ................................................................................... 479
8.6.3
Function Tag - FUNCTAG (0x05B08; R/W) ............................................................................ 480
8.6.4
Function Active and Power State to MNG - FACTPS (0x05B30; RO)........................................... 480
8.6.5
SerDes/CCM/PCIe CSR - GIOANACTL0 (0x05B34; R/W).......................................................... 481
8.6.6
SerDes/CCM/PCIe CSR - GIOANACTL1 (0x05B38; R/W).......................................................... 481
8.6.7
SerDes/CCM/PCIe CSR - GIOANACTL2 (0x05B3C; R/W) ......................................................... 482
8.6.8
SerDes/CCM/PCIe CSR - GIOANACTL3 (0x05B40; R/W).......................................................... 482
8.6.9
SerDes/CCM/PCIe CSR - GIOANACTLALL (0x05B44; R/W) ...................................................... 482
8.6.10
SerDes/CCM/PCIe CSR - CCMCTL (0x05B48; R/W)................................................................. 483
8.6.11
SerDes/CCM/PCIe CSR - SCCTL (0x05B4C; R/W)................................................................... 483
8.6.12
Mirrored Revision ID - MREVID (0x05B64; R/W) .................................................................... 483
8.7
Semaphore registers ................................................................................................................ 483
8.7.1
Software Semaphore - SWSM (0x05B50; R/W)...................................................................... 484
8.7.2
Firmware Semaphore - FWSM (0x05B54; R/WS) ................................................................... 484
8.7.3
Software–Firmware Synchronization - SW_FW_SYNC (0x05B5C; RWS)..................................... 486
8.8
Interrupt Register Descriptions .................................................................................................. 487
8.8.1
Extended Interrupt Cause - EICR (0x01580; RC/W1C)............................................................ 487
8.8.2
Extended Interrupt Cause Set - EICS (0x01520; WO)............................................................. 488
8.8.3
Extended Interrupt Mask Set/Read - EIMS (0x01524; RWS).................................................... 488
8.8.4
Extended Interrupt Mask Clear - EIMC (0x01528; WO) ........................................................... 489
8.8.5
Extended Interrupt Auto Clear - EIAC (0x0152C; R/W) ........................................................... 490
8.8.6
Extended Interrupt Auto Mask Enable - EIAM (0x01530; R/W)................................................. 490
8.8.7
Interrupt Cause Read Register - ICR (0x01500; RC/W1C) ....................................................... 491
8.8.8
Interrupt Cause Set Register - ICS (0x01504; WO) ................................................................ 493
8.8.9
Interrupt Mask Set/Read Register - IMS (0x01508; R/W)........................................................ 494
8.8.10
Interrupt Mask Clear Register - IMC (0x0150C; WO) .............................................................. 496
8.8.11
Interrupt Acknowledge Auto Mask Register - IAM (0x01510; R/W) ........................................... 497
8.8.12
Interrupt Throttle - EITR (0x01680 + 4*n [n = 0...24]; R/W).................................................. 497
8.8.13
Interrupt Vector Allocation Registers - IVAR (0x1700 + 4*n [n=0...7]; RW) .............................. 498
8.8.14
Interrupt Vector Allocation Registers - MISC IVAR_MISC (0x1740; RW) .................................... 499
8.8.15
General Purpose Interrupt Enable - GPIE (0x1514; RW) ......................................................... 500
8.9
MSI-X Table Register Descriptions ............................................................................................. 500
8.9.1
MSI–X Table Entry Lower Address MSIXTADD (BAR3: 0x0000 + 0x10*n [n=0...24]; R/W).......................................................... 501
8.9.2
MSI–X Table Entry Upper Address MSIXTUADD (BAR3: 0x0004 + 0x10*n [n=0...24]; R/W) ........................................................501
8.9.3
MSI–X Table Entry Message MSIXTMSG (BAR3: 0x0008 + 0x10*n [n=0...24]; R/W) ..........................................................501
8.9.4
MSI–X Table Entry Vector Control MSIXTVCTRL (BAR3: 0x000C + 0x10*n [n=0...24]; R/W) .......................................................502
8.9.5
MSIXPBA Bit Description – MSIXPBA (BAR3: 0x02000; RO) ...........................................................................................502
8.9.6
MSI-X PBA Clear – PBACL (0x05B68; R/W1C) ........................................................................502
8.10
Receive Register Descriptions.................................................................................................... 502
8.10.1
Receive Control Register - RCTL (0x00100; R/W) ...................................................................502
8.10.2
Split and Replication Receive Control - SRRCTL (0x0C00C + 0x40*n [n=0...15]; R/W) ................506
8.10.3
Packet Split Receive Type - PSRTYPE (0x05480 + 4*n [n=0...7]; R/W) .....................................507
8.10.4
Replicated Packet Split Receive Type - RPLPSRTYPE (0x054C0; R/W) ........................................508
8.10.5
Receive Descriptor Base Address Low - RDBAL (0x0C000 + 0x40*n [n=0...15]; R/W) .................508
8.10.6
Receive Descriptor Base Address High - RDBAH (0x0C004 + 0x40*n [n=0...15]; R/W)................509
8.10.7
Receive Descriptor Ring Length - RDLEN (0x0C008 + 0x40*n [n=0...15]; R/W) .........................509
8.10.8
Receive Descriptor Head - RDH (0x0C010 + 0x40*n [n=0...15]; RO) ........................................509
8.10.9
Receive Descriptor Tail - RDT (0x0C018 + 0x40*n [n=0...15]; R/W).........................................510
8.10.10
Receive Descriptor Control - RXDCTL (0x0C028 + 0x40*n [n=0...15]; R/W) ..............................510
8.10.11
Receive Queue Drop Packet Count - RQDPC (0xC030 + 0x40*n [n=0...15]; RC) .........................512
8.10.12
DMA RX Max Outstanding Data - DRXMXOD (0x2540; RW) ......................................................512
8.10.13
Receive Checksum Control - RXCSUM (0x05000; R/W) ............................................................512
8.10.14
Receive Long Packet Maximum Length - RLPML (0x5004; R/W) ................................................514
8.10.15
Receive Filter Control Register - RFCTL (0x05008; R/W) ..........................................................514
8.10.16
Multicast Table Array - MTA (0x05200 + 4*n [n=0...127]; R/W)...............................................515
8.10.17
Receive Address Low - RAL (0x05400 + 8*n [n=0...15]; 0x054E0 + 8*n [n=0...7]; R/W) ..........516
8.10.18
Receive Address High - RAH (0x05404 + 8*n [n=0...15]; 0x054E4 + 8*n [n=0...7]; R/W) ..........516
8.10.19
VLAN Filter Table Array - VFTA (0x05600 + 4*n [n=0...127]; R/W) ..........................................517
8.10.20
Multiple Receive Queues Command Register - MRQC (0x05818; R/W) .......................................518
8.10.21
RSS Random Key Register - RSSRK (0x05C80 + 4*n [n=0...9]; R/W) .......................................519
8.10.22
Redirection Table - RETA (0x05C00 + 4*n [n=0...31]; R/W) ....................................................520
8.11
Filtering Register Descriptions ................................................................................................... 521
8.11.1
Immediate Interrupt Rx - IMIR (0x05A80 + 4*n [n=0...7]; R/W) .............................................521
8.11.2
Immediate Interrupt Rx Ext. - IMIREXT (0x05AA0 + 4*n [n=0...7]; R/W)..................................522
8.11.3
Source Address Queue Filter - SAQF (0x5980 + 4*n[n=0...7]; RW) ..........................................522
8.11.4
Destination Address Queue Filter - DAQF (0x59A0 + 4*n[n=0...7]; RW) ....................................523
8.11.5
Source Port Queue Filter - SPQF (0x59C0 + 4*n[n=0...7]; RW)................................................523
8.11.6
5-tuple Queue Filter - FTQF (0x59E0 + 4*n[n=0...7]; RW) ......................................................523
8.11.7
Immediate Interrupt Rx VLAN Priority - IMIRVP (0x05AC0; R/W) ..............................................524
8.11.8
SYN Packet Queue Filter - SYNQF (0x55FC; RW).....................................................................524
8.11.9
EType Queue Filter - ETQF (0x5CB0 + 4*n[n=0...7]; RW) .......................................................524
8.12
Transmit Register Descriptions .................................................................................................. 525
8.12.1
Transmit Control Register - TCTL (0x00400; R/W) ..................................................................525
8.12.2
Transmit Control Extended - TCTL_EXT (0x0404; R/W) ...........................................................526
8.12.3
Transmit IPG Register - TIPG (0x0410; R/W) .........................................................................526
8.12.4
DMA Tx Control - DTXCTL (0x03590; R/W) ............................................................................527
8.12.5
DMA TX TCP Flags Control Low - DTXTCPFLGL (0x359C; RW) ...................................................529
8.12.6
DMA TX TCP Flags Control High - DTXTCPFLGH (0x35A0; RW)..................................................529
8.12.7
DMA TX Max Total Allow Size Requests - DTXMXSZRQ (0x3540; RW) ........................................529
8.12.8
Transmit Descriptor Base Address Low - TDBAL (0xE000 + 0x40*n [n=0...15]; R/W)..................530
8.12.9
Transmit Descriptor Base Address High - TDBAH (0x0E004 + 0x40*n [n=0...15]; R/W)...............530
8.12.10
Transmit Descriptor Ring Length - TDLEN (0x0E008 + 0x40*n [n=0...15]; R/W) ........................530
8.12.11
Transmit Descriptor Head - TDH (0x0E010 + 0x40*n [n=0...15]; RO) .......................................531
8.12.12
Transmit Descriptor Tail - TDT (0x0E018 + 0x40*n [n=0...15]; R/W)........................................531
8.12.13
Transmit Descriptor Control - TXDCTL (0x0E028 + 0x40*n [n=0...15]; R/W) .............................531
8.12.14
Tx Descriptor Completion Write–Back Address Low - TDWBAL (0x0E038 + 0x40*n [n=0...15]; R/W) ......533
8.12.15
Tx Descriptor Completion Write–Back Address High - TDWBAH (0x0E03C + 0x40*n [n=0...15]; R/W) ......533
8.13
DCA Register Descriptions ........................................................................................................ 533
8.13.1
Rx DCA Control Registers - RXCTL (0x0C014 + 0x40*n [n=0...15]; R/W) ................................. 533
8.13.2
Tx DCA Control Registers - TXCTL (0x0E014 + 0x40*n [n=0...15]; R/W).................................. 535
8.13.3
DCA Requester ID Information - DCA_ID (0x05B70; RO) ........................................................ 536
8.13.4
DCA Control - DCA_CTRL (0x05B74; R/W) ............................................................................ 537
8.14
Virtualization Register Descriptions ............................................................................................ 537
8.14.1
Next Generation VMDq Control register – VT_CTL (0x0581C; R/W) .......................................... 538
8.14.2
Physical Function Mailbox - PFMailbox (0x0C00 + 4*n[n=0...7]; RW) ....................................... 538
8.14.3
Virtual Function Mailbox - VFMailbox (0x0C40 + 4*n [n=0...7]; RW) ........................................ 539
8.14.4
Virtualization Mailbox Memory - VMBMEM (0x0800:0x083C + 0x40*n [n=0...7]; R/W) ............... 539
8.14.5
Mailbox VF Interrupt Causes Register - MBVFICR (0x0C80; R/W1C) ......................................... 540
8.14.6
Mailbox VF Interrupt Mask Register - MBVFIMR (0x0C84; RW)................................................. 540
8.14.7
FLR Events - VFLRE (0x0C88; R/W1C).................................................................................. 540
8.14.8
VF Receive Enable- VFRE (0x0C8C; RW) ............................................................................... 541
8.14.9
VF Transmit Enable - VFTE (0x0C90; RW)............................................................................. 541
8.14.10
Wrong VM Behavior Register - WVBR (0x3554; RC) ............................................................... 541
8.14.11
VM Error Count Mask – VMECM (0x3510; RW)....................................................................... 541
8.14.12
Last VM Misbehavior Cause – LVMMC (0x3548; RC) ............................................................... 541
8.14.13
Queue drop Enable Register - QDE (0x2408;RW) ................................................................... 542
8.14.14
DMA Tx Switch control - DTXSWC (0x3500; R/W) .................................................................. 542
8.14.15
VM VLAN Insert Register – VMVIR (0x3700 + 4 *n [n=0..7]; RW)............................................ 543
8.14.16
VM Offload Register - VMOLR (0x05AD0 + 4*n [n=0...7]; RW) ................................................ 543
8.14.17
Replication Offload Register - RPLOLR (0x05AF0; RW) ............................................................ 544
8.14.18
VLAN VM Filter - VLVF (0x05D00 + 4*n [n=0...31]; RW) ........................................................ 544
8.14.19
Unicast Table Array - UTA (0xA000 + 4*n [n=0...127]; WO)................................................... 544
8.14.20
Storm Control Control Register- SCCRL (0x5DB0;RW) ............................................................ 545
8.14.21
Storm Control Status - SCSTS (0x5DB4;RO) ......................................................................... 545
8.14.22
Broadcast Storm Control Threshold - BSCTRH (0x5DB8;RW) ................................................... 545
8.14.23
Multicast Storm Control Threshold - MSCTRH (0x5DBC; RW) ................................................... 546
8.14.24
Broadcast Storm Control Current Count - BSCCNT (0x5DC0;RO).............................................. 546
8.14.25
Multicast Storm Control Current Count - MSCCNT (0x5DC4;RO)............................................... 546
8.14.26
Storm Control Time Counter - SCTC (0x5DC8; RO) ................................................................ 546
8.14.27
Storm Control Basic Interval- SCBI (0x5DCC; RW)................................................................. 546
8.14.28
Virtual Mirror Rule Control - VMRCTL (0x5D80 + 0x4*n [n= 0..3]; RW) .................................... 547
8.14.29
Virtual Mirror Rule VLAN - VMRVLAN (0x5D90 + 0x4*n [n= 0..3]; RW) .................................... 547
8.14.30
Virtual Mirror Rule VM - VMRVM (0x5DA0 + 0x4*n [n= 0..3]; RW)........................................... 547
8.14.31
Transmit Rate-er Config - RC (0x36B0; RW) ......................................................................... 548
8.14.32
Transmit Rate-er Status - (0x36B4; RO).............................................................................. 548
8.15
Tx Bandwidth Allocation to VM Register Description ...................................................................... 548
8.15.1
VM Bandwidth Allocation Control & Status - VMBACS (0x3600; RW) ......................................... 549
8.15.2
VM Bandwidth Allocation Max Memory Window - VMBAMMW (0x3670; RW) ............................... 549
8.15.3
VM Bandwidth Allocation Select - VMBASEL (0x3604; RW) ...................................................... 549
8.15.4
VM Bandwidth Allocation Config - VMBAC (0x3608; RW) ......................................................... 550
8.16
Timer Register Descriptions ...................................................................................................... 550
8.16.1
Watchdog Setup - WDSTP (0x01040; R/W)........................................................................... 550
8.16.2
Watchdog Software Device Status - WDSWSTS (0x01044; R/W).............................................. 551
8.16.3
Free Running Timer - FRTIMER (0x01048; RWS) ................................................................... 551
8.16.4
TCP Timer - TCPTIMER (0x0104C; R/W) ............................................................................... 552
8.17
Time Sync Register Descriptions ................................................................................................ 553
8.17.1
RX Time Sync Control Register - TSYNCRXCTL (0xB620;RW)................................................... 553
8.17.2
RX Timestamp Low - RXSTMPL (0x0B624; RO) ...................................................................... 553
8.17.3
RX Timestamp High - RXSTMPH (0x0B628; RO) .................................................................... 553
8.17.4
RX Timestamp Attributes Low - RXSATRL(0x0B62C; RO) ........................................................ 553
8.17.5
RX Timestamp Attributes High- RXSATRH (0x0B630; RO) ....................................................... 554
8.17.6
TX Time Sync Control Register - TSYNCTXCTL (0x0B614; RW) ................................................ 554
8.17.7
TX Timestamp Value Low - TXSTMPL (0x0B618;RO)............................................................... 554
8.17.8
TX Timestamp Value High - TXSTMPH(0x0B61C; RO) ............................................................. 554
8.17.9
System Time Register Low - SYSTIML (0x0B600; RWS) .......................................................... 554
8.17.10
System Time Register High - SYSTIMH (0x0B604; RWS) ........................................................ 554
8.17.11
Increment Attributes Register - TIMINCA (0x0B608; RW) ....................................................... 555
8.17.12
Time Adjustment Offset Register Low - TIMADJL (0x0B60C; RW) ............................................. 555
8.17.13
Time Adjustment Offset Register High - TIMADJH (0x0B610;RW)..............................................555
8.17.14
TimeSync Auxiliary Control Register - TSAUXC (0x0B640; RW).................................................555
8.17.15
Target Time Register 0 Low - TRGTTIML0 (0x0B644; RW)........................................................556
8.17.16
Target Time Register 0 High - TRGTTIMH0 (0x0B648; RW) ......................................................556
8.17.17
Target Time Register 1 Low - TRGTTIML1 (0x0B64C; RW) .......................................................556
8.17.18
Target Time Register 1 High - TRGTTIMH1 (0x0B650; RW) ......................................................556
8.17.19
Auxiliary Time Stamp 0 Register Low - AUXSTMPL0 (0x0B65C; RO) ..........................................556
8.17.20
Auxiliary Time Stamp 0 Register High - AUXSTMPH0 (0x0B660; RO) .........................................556
8.17.21
Auxiliary Time Stamp 1 Register Low - AUXSTMPL1 (0x0B664; RO)...........................................557
8.17.22
Auxiliary Time Stamp 1 Register High - AUXSTMPH1 (0x0B668; RO) .........................................557
8.17.23
Time Sync RX Configuration - TSYNCRXCFG (0x05F50; RW).....................................................557
8.17.24
Time Sync SDP Config Reg - TSSDP (0x0003C; RW) ...............................................................557
8.18
PCS Register Descriptions ......................................................................................................... 559
8.18.1
PCS Configuration - PCS_CFG (0x04200; R/W).......................................................................559
8.18.2
PCS Link Control - PCS_LCTL (0x04208; RW) .........................................................................559
8.18.3
PCS Link Status - PCS_LSTS (0x0420C; RO) ..........................................................................561
8.18.4
AN Advertisement - PCS_ANADV (0x04218; R/W) ..................................................................562
8.18.5
Link Partner Ability - PCS_LPAB (0x0421C; RO)......................................................................563
8.18.6
Next Page Transmit - PCS_NPTX (0x04220; RW) ....................................................................564
8.18.7
Link Partner Ability Next Page - PCS_LPABNP (0x04224; RO) ...................................................565
8.18.8
SFP I2C Command- I2CCMD (0x01028; R/W) ........................................................................565
8.18.9
SFP I2C Parameters - I2CPARAMS (0x0102C; R/W) ................................................................566
8.19
Statistics Register Descriptions.................................................................................................. 567
8.19.1
CRC Error Count - CRCERRS (0x04000; RC)...........................................................................567
8.19.2
Alignment Error Count - ALGNERRC (0x04004; RC) ................................................................568
8.19.3
Symbol Error Count - SYMERRS (0x04008; RC) ......................................................................568
8.19.4
RX Error Count - RXERRC (0x0400C; RC) ..............................................................................568
8.19.5
Missed Packets Count - MPC (0x04010; RC)...........................................................................568
8.19.6
Excessive Collisions Count - ECOL (0x04018; RC) ...................................................................569
8.19.7
Multiple Collision Count - MCC (0x0401C; RC) ........................................................................569
8.19.8
Late Collisions Count - LATECOL (0x04020; RC) .....................................................................569
8.19.9
Collision Count - COLC (0x04028; RC) ..................................................................................569
8.19.10
Defer Count - DC (0x04030; RC) ..........................................................................................569
8.19.11
Transmit with No CRS - TNCRS (0x04034; RC).......................................................................570
8.19.12
Host Transmit Discarded Packets by MAC Count - HTDPMC (0x0403C; RC).................................570
8.19.13
Receive Length Error Count - RLEC (0x04040; RC) .................................................................570
8.19.14
Circuit Breaker Rx dropped packet- CBRDPC (0x04044; RC).....................................................571
8.19.15
XON Received Count - XONRXC (0x04048; RC) ......................................................................571
8.19.16
XON Transmitted Count - XONTXC (0x0404C; RC) ..................................................................571
8.19.17
XOFF Received Count - XOFFRXC (0x04050; RC) ....................................................................571
8.19.18
XOFF Transmitted Count - XOFFTXC (0x04054; RC) ................................................................571
8.19.19
FC Received Unsupported Count - FCRUC (0x04058; RC).........................................................571
8.19.20
Packets Received [64 Bytes] Count - PRC64 (0x0405C; RC) .....................................................572
8.19.21
Packets Received [65—127 Bytes] Count - PRC127 (0x04060; RC) ...........................................572
8.19.22
Packets Received [128—255 Bytes] Count - PRC255 (0x04064; RC)..........................................572
8.19.23
Packets Received [256—511 Bytes] Count - PRC511 (0x04068; RC)..........................................573
8.19.24
Packets Received [512—1023 Bytes] Count - PRC1023 (0x0406C; RC) ......................................573
8.19.25
Packets Received [1024 to Max Bytes] Count - PRC1522 (0x04070; RC)....................................573
8.19.26
Good Packets Received Count - GPRC (0x04074; RC) ..............................................................573
8.19.27
Broadcast Packets Received Count - BPRC (0x04078; RC)........................................................574
8.19.28
Multicast Packets Received Count - MPRC (0x0407C; RC) ........................................................574
8.19.29
Good Packets Transmitted Count - GPTC (0x04080; RC)..........................................................574
8.19.30
Good Octets Received Count - GORCL (0x04088; RC) .............................................................574
8.19.31
Good Octets Received Count - GORCH (0x0408C; RC).............................................................575
8.19.32
Good Octets Transmitted Count - GOTCL (0x04090; RC) .........................................................575
8.19.33
Good Octets Transmitted Count - GOTCH (0x04094; RC) ..........................................................575
8.19.34
Receive No Buffers Count - RNBC (0x040A0; RC) ...................................................................575
8.19.35
Receive Undersize Count - RUC (0x040A4; RC) ......................................................................576
8.19.36
Receive Fragment Count - RFC (0x040A8; RC) .......................................................................576
8.19.37
Receive Oversize Count - ROC (0x040AC; RC)........................................................................576
8.19.38
Receive Jabber Count - RJC (0x040B0; RC) ...........................................................................576
8.19.39
Management Packets Received Count - MNGPRC (0x040B4; RC).............................................. 577
8.19.40
BMC Management Packets Received Count - BMNGPRC (0x0413C; RC)..................................... 577
8.19.41
Management Packets Dropped Count - MPDC (0x040B8; RC) .................................................. 577
8.19.42
BMC Management Packets Dropped Count - BMPDC (0x04140; RC) ......................................... 577
8.19.43
Management Packets Transmitted Count - MNGPTC (0x040BC; RC) ......................................... 578
8.19.44
BMC Management Packets Transmitted Count - BMNGPTC (0x04144; RC) ................................. 578
8.19.45
Total Octets Received - TORL (0x040C0; RC) ........................................................................ 578
8.19.46
Total Octets Received - TORH (0x040C4; RC)........................................................................ 578
8.19.47
Total Octets Transmitted - TOTL (0x040C8; RC) .................................................................... 579
8.19.48
Total Octets Transmitted - TOTH (0x040CC; RC) ................................................................... 579
8.19.49
Total Packets Received - TPR (0x040D0; RC) ........................................................................ 579
8.19.50
Total Packets Transmitted - TPT (0x040D4; RC) .................................................................... 579
8.19.51
Packets Transmitted [64 Bytes] Count - PTC64 (0x040D8; RC)................................................ 580
8.19.52
Packets Transmitted [65—127 Bytes] Count - PTC127 (0x040DC; RC)...................................... 580
8.19.53
Packets Transmitted [128—255 Bytes] Count - PTC255 (0x040E0; RC)..................................... 580
8.19.54
Packets Transmitted [256—511 Bytes] Count - PTC511 (0x040E4; RC)..................................... 580
8.19.55
Packets Transmitted [512—1023 Bytes] Count - PTC1023 (0x040E8; RC) ................................. 581
8.19.56
Packets Transmitted [1024 Bytes or Greater] Count - PTC1522 (0x040EC; RC).......................... 581
8.19.57
Multicast Packets Transmitted Count - MPTC (0x040F0; RC).................................................... 581
8.19.58
Broadcast Packets Transmitted Count - BPTC (0x040F4; RC)................................................... 581
8.19.59
TCP Segmentation Context Transmitted Count - TSCTC (0x040F8; RC)..................................... 582
8.19.60
Circuit Breaker Rx manageability packet count - CBRMPC (0x040FC; RC) .................................. 582
8.19.61
Interrupt Assertion Count - IAC (0x04100; RC) ..................................................................... 582
8.19.62
Rx Packets to Host Count - RPTHC (0x04104; RC) ................................................................. 582
8.19.63
Debug Counter 1 - DBGC1 (0x04108; RC) ............................................................................ 582
8.19.64
Debug Counter 2 - DBGC2 (0x0410C; RC) ............................................................................ 583
8.19.65
Debug Counter 3 - DBGC3 (0x04110; RC) ............................................................................ 583
8.19.66
Debug Counter 4 - DBGC4 (0x0411C; RC) ............................................................................ 584
8.19.67
Host Good Packets Transmitted Count-HGPTC (0x04118; RC) ................................................. 584
8.19.68
Receive Descriptor Minimum Threshold Count-RXDMTC (0x04120; RC)..................................... 585
8.19.69
Host TX Circuit Breaker dropped Packets Count- HTCBDPC (0x04124; RC) ................................ 585
8.19.70
Host Good Octets Received Count - HGORCL (0x04128; RC) ................................................... 585
8.19.71
Host Good Octets Received Count - HGORCH (0x0412C; RC)................................................... 585
8.19.72
Host Good Octets Transmitted Count - HGOTCL (0x04130; RC) ............................................... 586
8.19.73
Host Good Octets Transmitted Count - HGOTCH (0x04134; RC)............................................... 586
8.19.74
Length Error Count - LENERRS (0x04138; RC) ...................................................................... 586
8.19.75
SerDes/SGMII Code Violation Packet Count - SCVPC (0x04228; RW) ........................................ 586
8.19.76
Switch Security Violation Packet Count - SSVPC (0x41A0; RC) ................................................ 587
8.19.77
Switch Drop Packet Count - SDPC (0x41A4; RC).................................................................... 587
8.20
Wake Up Control Register Descriptions ....................................................................................... 587
8.20.1
Wakeup Control Register - WUC (0x05800; R/W)................................................................... 587
8.20.2
Wakeup Filter Control Register - WUFC (0x05808; R/W) ......................................................... 588
8.20.3
Wakeup Status Register - WUS (0x05810; R/W1C) ................................................................ 588
8.20.4
Wakeup Packet Length - WUPL (0x05900; RO) ...................................................................... 589
8.20.5
Wakeup Packet Memory - WUPM (0x05A00 + 4*n [n=0...31]; RO) .......................................... 589
8.20.6
IP Address Valid - IPAV (0x5838; R/W) ................................................................................ 590
8.20.7
IPv4 Address Table - IP4AT (0x05840 + 8*n [n=0...3]; R/W) ................................................. 590
8.20.8
IPv6 Address Table - IP6AT (0x05880 + 4*n [n=0...3]; R/W) ................................................. 590
8.20.9
Flexible Host Filter Table Registers - FHFT (0x09000 - 0x093FC; RW) ....................................... 591
8.20.10
Flexible Host Filter Table Extended Registers - FHFT_EXT (0x09A00 - 0x09BFC; RW).................. 592
8.21
Management Register Descriptions............................................................................................. 592
8.21.1
Management VLAN TAG Value - MAVTV (0x5010 +4*n [n=0...7]; RW) ..................................... 592
8.21.2
Management Flex UDP/TCP Ports - MFUTP (0x5030 + 4*n [n=0...7]; RW) ................................ 593
8.21.3
Management Ethernet Type Filters- METF (0x5060 + 4*n [n=0...3]; RW) ................................. 593
8.21.4
Management Control Register - MANC (0x05820; RW) ........................................................... 593
8.21.5
Manageability Filters Valid - MFVAL (0x5824; RW) ................................................................. 594
8.21.6
Management Control to Host Register - MANC2H (0x5860; RW) .............................................. 595
8.21.7
Manageability Decision Filters- MDEF (0x5890 + 4*n [n=0...7]; RW) ....................................... 595
8.21.8
Manageability Decision Filters- MDEF_EXT (0x5930 + 4*n[n=0...7]; RW) ................................. 596
8.21.9
Manageability IP Address Filter - MIPAF (0x58B0 + 4*n [n=0...15]; RW) .................................. 597
8.21.10
Manageability MAC Address Low - MMAL (0x5910 + 8*n [n= 0...3]; RW).................................. 600
8.21.11
Manageability MAC Address High - MMAH (0x5914 + 8*n [n=0...3]; RW) ..................................600
8.21.12
Flexible TCO Filter Table registers - FTFT (0x09400-0x097FC; RW) ...........................................600
8.22
MACSec Register Descriptions ................................................................................................... 602
8.22.1
MACSec TX Capabilities Register - LSECTXCAP (0xB000; RO) ...................................................602
8.22.2
MACSec RX Capabilities Register - LSECRXCAP (0xB300; RO)...................................................602
8.22.3
MACSec TX Control register - LSECTXCTRL (0xB004; RW) .......................................................603
8.22.4
MACSec RX Control register - LSECRXCTRL (0xB304; RW) .......................................................604
8.22.5
MACSec TX SCI Low - LSECTXSCL (0xB008; RW) ...................................................................604
8.22.6
MACSec TX SCI High - LSECTXSCH (0xB00C; RW) ..................................................................604
8.22.7
MACSec TX SA - LSECTXSA (0xB010; RW).............................................................................605
8.22.8
MACSec TX SA PN 0 - LSECTXPN0 (0xB018; RW) ...................................................................605
8.22.9
MACSec TX SA PN 1 - LSECTXPN1 (0xB01C; RW) ...................................................................606
8.22.10
MACSec TX Key 0 - LSECTXKEY0 (0xB020 + 4*n [n=0...3]; WO)..............................................606
8.22.11
MACSec TX Key 1 - LSECTXKEY1 (0xB030 + 4*n [n=0...3]; WO)..............................................606
8.22.12
MACSec RX SCI Low - LSECRXSCL (0xB3D0; RW)...................................................................607
8.22.13
MACSec RX SCI High - LSECRXSCH (0xB3E0; RW)..................................................................607
8.22.14
MACSec RX SA - LSECRXSA[n] (0xB310 + 4*n [n=0...1]; RW).................................................607
8.22.15
MACSec RX SA PN - LSECRXSAPN (0xB330 + 4*n [n=0...1]; RW) ............................................608
8.22.16
MACSec RX Key - LSECRXKEY (0xB350 + 16*n [n=0...1] + 4*m (m=0...3); WO).......................608
8.22.17
MACSec Software/Firmware interface- LSWFW (0x8F14; RO) ...................................................609
8.22.18
MACSec Tx Port Statistics ....................................................................................................609
8.22.18.1
Tx Untagged Packet Counter - LSECTXUT (0x4300; RC) ....................................................609
8.22.18.2
Encrypted Tx Packets Count - LSECTXPKTE (0x4304; RC)..................................................609
8.22.18.3
Protected Tx Packets Count - LSECTXPKTP (0x4308; RC) ..................................................610
8.22.18.4
Encrypted Tx Octets Count - LSECTXOCTE (0x430C; RC)...................................................610
8.22.18.5
Protected Tx Octets Count - LSECTXOCTP (0x4310; RC)....................................................610
8.22.19
MACSec Rx Port Statistic .....................................................................................................610
8.22.19.1
MACSec Untagged RX Packet Count - LSECRXUT (0x4314; RC) ..........................................610
8.22.19.2
MACSec RX Octets Decrypted count - LSECRXOCTE (0x431C; RC) ......................................611
8.22.19.3
MACSec RX Octets Validated count - LSECRXOCTP (0x4320; RC)........................................611
8.22.19.4
MACSec RX Packet with Bad Tag count - LSECRXBAD (0x4324; RC)....................................611
8.22.19.5
MACSec RX Packet No SCI count - LSECRXNOSCI (0x4328; RC) .........................................611
8.22.19.6
MACSec RX Packet Unknown SCI count - LSECRXUNSCI (0x432C; RC)................................612
8.22.20
MACSec Rx SC Statistic Register Descriptions.........................................................................612
8.22.20.1
MACSec RX Unchecked Packets Count - LSECRXUNCH (0x4330; RC)...................................612
8.22.20.2
MACSec RX Delayed Packets Count - LSECRXDELAY (0x4340; RC)......................................612
8.22.20.3
MACSec RX Late Packets Count - LSECRXLATE (0x4350; RC) .............................................612
8.22.21
MACSec Rx SA Statistic Register Descriptions.........................................................................613
8.22.21.1
MACSec RX Packet OK count - LSECRXOK[n] (0x4360+ 4*n [n=0...1]; RC) .........................613
8.22.21.2
MACSec RX Invalid count - LSECRXINV[n] (0x4380+ 4*n [n=0...1]; RC).............................613
8.22.21.3
MACSec RX Not valid count - LSECRXNV[n] (0x43A0 + 4*n [n=0...1]; RC)..........................613
8.22.21.4
MACSec RX Not using SA Count - LSECRXNUSA (0x43C0; RC) ...........................................613
8.22.21.5
MACSec RX Unused SA Count - LSECRXUNSA (0x43D0; RC) ..............................................613
8.23
IPsec Registers Description ....................................................................................................... 614
8.23.1
IPSec Control – IPSCTRL (0xB430; RW) ................................................................................614
8.23.2
IPsec Tx Index - IPSTXIDX (0xB450; RW) .............................................................................614
8.23.3
IPsec Tx Key Registers - IPSTXKEY (0xB460 + 4*n [n = 0...3]; RW) .........................................614
8.23.4
IPsec Tx Salt Register - IPSTXSALT (0xB454; RW) ..................................................................615
8.23.5
IPsec Rx Command Register - IPSRXCMD (0xB408; RW) .........................................................615
8.23.6
IPsec Rx SPI Register - IPSRXSPI (0xB40C; RW) ....................................................................616
8.23.7
IPsec Rx Key Register - IPSRXKEY (0xB410 + 4 * n [n = 0..3]; RW) .........................................616
8.23.8
IPsec Rx Salt Register - IPSRXSALT (0xB404; RW) .................................................................616
8.23.9
IPsec Rx IP address Register - IPSRXIPADDR (0xB420 + 4*n [n = 0..3]; RW) ............................617
8.23.10
IPsec Rx Index - IPSRXIDX (0xB400; RW) .............................................................................617
8.24
Diagnostic Registers Description ................................................................................................ 617
8.24.1
Receive Data FIFO Head Register - RDFH (0x02410; RWS) ......................................................617
8.24.2
Receive Data FIFO Tail Register - RDFT (0x02418; RWS) .........................................................618
8.24.3
Receive Data FIFO Head Saved Register - RDFHS (0x2420; RWS).............................................618
8.24.4
Receive Data FIFO Tail Saved Register - RDFTS (0x02428; RWS)..............................................618
8.24.5
Switch Buffer FIFO Head Register - SWBFH (0x03010; RWS) ...................................................619
8.24.6
Switch Buffer FIFO Tail Register - SWBFT (0x03018; RWS) ......................................................619
8.24.7
Switch Buffers FIFO Head Saved Register - SWBFHS (0x03020; RWS)...................................... 619
8.24.8
Switch Buffers FIFO Tail Saved Register - SWBFTS (0x03028; RWS) ........................................ 620
8.24.9
Packet Buffer Diagnostic - PBDIAG (0x02458; R/W) ............................................................... 620
8.24.10
Transmit Data FIFO Head Register - TDFH (0x03410; RWS) .................................................... 620
8.24.11
Transmit Data FIFO Tail Register - TDFT (0x03418; RWS)....................................................... 621
8.24.12
Transmit Data FIFO Head Saved Register - TDFHS (0x03420; RWS)......................................... 621
8.24.13
Transmit Data FIFO Tail Saved Register - TDFTS (0x03428; RWS) ........................................... 621
8.24.14
Transmit Data FIFO Packet Count - TDFPC (0x03430; RO) ...................................................... 622
8.24.15
Receive Data FIFO Packet Count - RDFPC (0x02430; RO) ....................................................... 622
8.24.16
Switch Data FIFO Packet Count - SWDFPC (0x03030; RO) ...................................................... 622
8.24.17
IpSec Packet Buffer ECC Status - IPPBECCSTS (0xB470; RC) .................................................. 623
8.24.18
PB Slave Access Control - PBSLAC (0x3100; RW)................................................................... 623
8.24.19
PB Slave Access Data – PBSLAD (0x3110 + 4*n [n= 0...3]; RW) ............................................. 624
8.24.20
Rx Descriptor Handler Memory - RDHM (0x06000 + 4*n [n= 0..1023]; RO) .............................. 624
8.24.21
Rx Descriptor Handler Memory Page Number - RDHMP (0x025FC; RW)..................................... 624
8.24.22
Tx Descriptor Handler Memory - TDHM (0x07000 + 4*n [n= 0..1023]; RO) .............................. 625
8.24.23
Tx Descriptor Handler Memory Page Number - TDHMP (0x035FC; R/W) .................................... 625
8.24.24
Rx Packet Buffer ECC Status - RPBECCSTS (0x0245C; RC)...................................................... 626
8.24.25
Tx Packet Buffer ECC Status - TPBECCSTS (0x0345C; RC) ...................................................... 626
8.24.26
Switch Packet Buffer ECC Status - SWPBECCSTS (0x0305C; RC) ............................................. 627
8.24.27
IPSec Packet Buffer ECC Error Inject - IPPBEEI (0xB474; RW) ................................................. 627
8.24.28
Rx Descriptor Handler ECC Status - RDHESTS (0x025C0; RC) ................................................. 628
8.24.29
Tx Descriptor Handler ECC Status - TDHESTS (0x35C0; RC).................................................... 628
8.24.30
PCIe Retry Buffer ECC Status - PRBESTS (0x05BA0; RC) ........................................................ 629
8.24.31
PCIe Write Buffer ECC Status - PWBESTS (0x05BB0; RC) ....................................................... 629
8.24.32
PCIe MSI-X ECC Status - PMSIXESTS (0x05BA8; RC) ............................................................. 629
8.24.33
Parity and ECC Error Indication - PEIND (0x1084; RC) ............................................................ 630
8.24.34
Parity and ECC Indication Mask – PEINDM (0x1088; RW) ........................................................ 631
8.24.35
Tx DMA Performance Burst and Descriptor Count - TXBDC (0x35E0; RC) .................................. 632
8.24.36
Tx DMA Performance Idle Count - TXIDLE (0x35E4; RC) ......................................................... 632
8.24.37
Rx DMA Performance Burst and Descriptor Count - RXBDC (0x25E0; RC) .................................. 633
8.24.38
Rx DMA Performance Idle Count - RXIDLE (0x25E4; RC) ........................................................ 633
8.25
PHY Software Interface (PHYREG) .............................................................................................. 633
8.25.1
PHY Control Register - PCTRL (00d; R/W) ............................................................................. 634
8.25.2
PHY Status Register - PSTATUS (01d; R) .............................................................................. 635
8.25.3
PHY Identifier Register 1 (LSB) - PHY ID 1 (02d; R) ............................................................... 636
8.25.4
PHY Identifier Register 2 (MSB) - PHY ID 2 (03d; R) .............................................................. 636
8.25.5
Auto–Negotiation Advertisement Register - ANA (04d; R/W) ................................................... 637
8.25.6
Auto–Negotiation Base Page Ability Register - (05d; R) .......................................................... 638
8.25.7
Auto–Negotiation Expansion Register - ANE (06d; R) ............................................................. 639
8.25.8
Auto–Negotiation Next Page Transmit Register - NPT (07d; R/W)............................................. 639
8.25.9
Auto–Negotiation Next Page Ability Register - LPN (08d; R) .................................................... 640
8.25.10
1000BASE–T/100BASE–T2 Control Register - GCON (09d; R/W) .............................................. 640
8.25.11
1000BASE–T/100BASE–T2 Status Register - GSTATUS (10d; R) .............................................. 641
8.25.12
Extended Status Register - ESTATUS (15d; R)....................................................................... 641
8.25.13
Port Configuration Register - PCONF (16d; R/W).................................................................... 642
8.25.14
Port Status 1 Register - PSTAT (17d; RO) ............................................................................. 643
8.25.15
Port Control Register - PCONT (18d; R/W) ............................................................................ 644
8.25.16
Link Health Register - LINK (19d; RO).................................................................................. 645
8.25.17
1000Base–T FIFO Register - PFIFO (20d; R/W)...................................................................... 646
8.25.18
Channel Quality Register - CHAN (21d; RO) .......................................................................... 647
8.25.19
PHY Power Management - (25d; R/W) .................................................................................. 647
8.25.20
Special Gigabit Disable Register - (26d; R/W) ....................................................................... 648
8.25.21
Misc. Control Register 1 - (27d; R/W) .................................................................................. 648
8.25.22
Misc. Control Register 2 - (28d; RO) .................................................................................... 648
8.25.23
Page Select Core Register - (31d; WO)................................................................................. 649
8.26
Virtual Function Device registers................................................................................................ 649
8.26.1
Queues Registers .............................................................................................................. 649
8.26.2
Non-queue Registers ......................................................................................................... 649
8.26.2.1
EITR registers..............................................................................................................650
8.26.2.2
MSI-X registers............................................................................................................650
8.26.3
Register Set - CSR BAR .......................................................................................................650
8.26.4
Register set - MSI-X BAR.....................................................................................................652
8.27
Virtual function Register Descriptions ......................................................................................... 652
8.27.1
VT control register - VTCTRL (0x0000; RW) ...........................................................................653
8.27.2
VF Status Register - STATUS (0x00008; RO)..........................................................................653
8.27.3
VT Free Running Timer - VTFRTIMER (0x01048; RO)...............................................................653
8.27.4
VT Extended Interrupt Cause - VTEICR (0x01580; RC/W1C) ....................................................653
8.27.5
VT Extended Interrupt Cause Set - VTEICS (0x01520; WO) .....................................................653
8.27.6
VT Extended Interrupt Mask Set/Read - VTEIMS (0x01524; RWS).............................................653
8.27.7
VT Extended Interrupt Mask Clear - VTEIMC (0x01528; WO) ....................................................654
8.27.8
VT Extended Interrupt Auto Clear - VTEIAC (0x0152C; R/W)....................................................654
8.27.9
VT Extended Interrupt Auto Mask Enable - VTEIAM (0x01530; R/W) .........................................654
8.27.10
VT Interrupt Throttle - VTEITR (0x01680 + 4*n[n = 0...2]; R/W) .............................................654
8.27.11
VT Interrupt Vector Allocation Registers - VTIVAR (0x01700; RW) ............................................655
8.27.12
VT Interrupt Vector Allocation Registers - VTIVAR_MISC (0x01740; RW) ...................................655
8.27.13
MSI-X Table Entry Lower Address - MSIXTADD (BAR3: 0x0000 + 16*n [n=0...2]; R/W)........................656
8.27.14
MSI-X Table Entry Upper Address - MSIXTUADD (BAR3: 0x0004 + 16*n [n=0...2]; R/W).......................656
8.27.15
MSI-X Table Entry Message - MSIXTMSG (BAR3: 0x0008 + 16*n [n=0...2]; R/W) .............................656
8.27.16
MSI-X Table Entry Vector Control - MSIXTVCTRL (BAR3: 0x000C + 16*n [n=0...2]; R/W).....................656
8.27.17
MSIXPBA - MSIXPBA (BAR3: 0x02000; RO) ...........................................................................656
8.27.18
MSI-X PBA Clear - PBACL (0x00F04; R/W1C)........................................................................656
8.27.19
Receive Descriptor Base Address Low - RDBAL (0x02800 + 256*n [n=0...1]; R/W).....................656
8.27.20
Receive Descriptor Base Address High - RDBAH (0x02804 + 256*n [n=0...1]; R/W) ...................657
8.27.21
Receive Descriptor Ring Length - RDLEN (0x02808 + 256*n [n=0...1]; R/W) .............................657
8.27.22
Receive Descriptor Head - RDH (0x02810 + 256*n [n=0...1]; RO)...........................................657
8.27.23
Receive Descriptor Tail - RDT (0x02818 + 256*n [n=0...1]; R/W) ............................................657
8.27.24
Receive Descriptor Control - RXDCTL (0x02828 + 256*n [n=0...1]; R/W)..................................657
8.27.25
Split and Replication Receive Control Register queue - SRRCTL (0x0280C + 256*n [n=0...1]; R/W).........657
8.27.26
Receive Queue drop packet count - RQDPC (0x2830 + 256*n [n=0...1]; RC)................................657
8.27.27
Replication Packet Split Receive Type - PSRTYPE (0x00F0C; R/W)........................................657
8.27.28
Transmit Descriptor Base Address Low - TDBAL (0x3800 + 256*n [n=0...1]; R/W) .........................657
8.27.29
Transmit Descriptor Base Address High - TDBAH (0x03804 + 256*n [n=0...1]; R/W)........................657
8.27.30
Transmit Descriptor Ring Length - TDLEN (0x03808 + 256*n [n=0...1]; R/W)..............................658
8.27.31
Transmit Descriptor Head - TDH (0x03810 + 256*n [n=0...1]; RO)........................................658
8.27.32
Transmit Descriptor Tail - TDT (0x03818 + 256*n [n=0...1]; R/W).......................................658
8.27.33
Transmit Descriptor Control - TXDCTL (0x03828 + 256*n [n=0...1]; R/W).................................658
8.27.34
Tx Descriptor Completion Write–Back Address Low - TDWBAL (0x03838 + 256*n [n=0...1]; R/W) .............658
8.27.35
Tx Descriptor Completion Write–Back Address High - TDWBAH (0x0383C + 256*n [n=0...1]; R/W) ............658
8.27.36
Rx DCA Control Registers - RXCTL (0x02814 + 256*n [n=0...1]; R/W).....................................658
8.27.37
Tx DCA Control Registers - TXCTL (0x03814 + 256*n [n=0...1]; R/W).....................................658
8.27.38
Good Packets Received Count - VFGPRC (0x0F10; RO) ............................................................658
8.27.39
Good Packets Transmitted Count - VFGPTC (0x0F14; RO) ........................................................659
8.27.40
Good Octets Received Count - VFGORC (0x0F18; RO) .............................................................659
8.27.41
Good Octets Transmitted Count - VFGOTC (0x0F34; RO) ........................................................ 659
8.27.42
Multicast Packets Received Count - VFMPRC (0x0F3C; RO)...................................................... 660
8.27.43
Good TX Octets loopback Count - VFGOTLBC (0x0F50; RO)..................................................... 660
8.27.44
Good TX packets loopback Count - VFGPTLBC (0x0F44; RO) ................................................... 660
8.27.45
Good RX Octets loopback Count - VFGORLBC (0x0F48; RO) .................................................... 660
8.27.46
Good RX Packets loopback Count - VFGPRLBC (0x0F40; RO) ................................................... 661
8.27.47
Virtual Function Mailbox - VFMailbox (0x0C40; RW) ............................................................... 661
8.27.48
Virtualization Mailbox memory - VMBMEM (0x0800:0x083C; R/W) ........................................... 661
8.27.49
Tx packet buffer wrap around counter - PBTWAC (0x34e8; RO) ............................................... 661
8.27.50
Rx packet buffer wrap around counter - PBRWAC (0x24e8; RO)............................................... 661
8.27.51
Switch packet buffer wrap around counter - PBSWAC (0x30e8; RO) ......................................... 662
9.0
PCIe Programming Interface................................................................................................... 663
9.1
PCIe Compatibility ................................................................................................................... 663
9.2
Configuration Sharing Among PCI Functions ................................................................................ 664
9.3
Register Map........................................................................................................................... 664
9.3.1
Register Attributes ............................................................................................................ 664
9.3.2
PCIe Configuration Space Summary ..................................................................................... 666
9.4
Mandatory PCI Configuration Registers ....................................................................................... 668
9.4.1
Vendor ID Register (0x0; RO) ............................................................................................. 668
9.4.2
Device ID Register (0x2; RO).............................................................................................. 668
9.4.3
Command Register (0x4; R/W) ........................................................................................... 669
9.4.4
Status Register (0x6; RO) .................................................................................................. 670
9.4.5
Revision Register (0x8; RO)................................................................................................ 670
9.4.6
Class Code Register (0x9; RO) ............................................................................................ 671
9.4.7
Cache Line Size Register (0xC; R/W).................................................................................... 671
9.4.8
Latency Timer Register (0xD; RO) ....................................................................................... 671
9.4.9
Header Type Register (0xE; RO).......................................................................................... 671
9.4.10
BIST Register (0xF; RO)..................................................................................................... 671
9.4.11
Base Address Registers (0x10:0x27; R/W)............................................................................ 671
9.4.11.1
32-bit Mapping ............................................................................................................672
9.4.11.2
64-bit Mapping without I/O BAR.....................................................................................672
9.4.11.3
64-bit Mapping Without Flash BAR..................................................................................673
9.4.12
CardBus CIS Register (0x28; RO) ........................................................................................ 675
9.4.13
Subsystem Vendor ID Register (0x2C; RO) ........................................................................... 675
9.4.14
Subsystem ID Register (0x2E; RO) ...................................................................................... 675
9.4.15
Expansion ROM Base Address Register (0x30; RO)................................................................. 675
9.4.16
Cap_Ptr Register (0x34; RO)............................................................................................... 675
9.4.17
Interrupt Line Register (0x3C; RW)...................................................................................... 675
9.4.18
Interrupt Pin Register (0x3D; RO) ....................................................................................... 675
9.4.19
Max_Lat/Min_Gnt (0x3E; RO) ............................................................................................. 676
9.5
PCI Capabilities ....................................................................................................................... 676
9.5.1
PCI Power Management Registers........................................................................................ 676
9.5.1.1
Capability ID Register (0x40; RO) ..................................................................................676
9.5.1.2
Next Pointer (0x41; RO) ...............................................................................................676
9.5.1.3
Power Management Capabilities - PMC (0x42; RO) ...........................................................677
9.5.1.4
Power Management Control / Status Register - PMCSR (0x44; R/W) ...................................677
9.5.1.5
Bridge Support Extensions - PMCSR_BSE (0x46; RO)........................................................678
9.5.1.6
Data Register (0x47; RO)..............................................................................................678
9.5.2
MSI Configuration ............................................................................................................. 678
9.5.2.1
Capability ID Register (0x50; RO) ..................................................................................678
9.5.2.2
Next Pointer Register (0x51; RO) ...................................................................................679
9.5.2.3
Message Control Register (0x52; R/W)............................................................................679
9.5.2.4
Message Address Low Register (0x54; R/W) ....................................................................679
9.5.2.5
Message Address High Register (0x58; R/W) ...................................................................679
9.5.2.6
Message Data Register (0x5C; R/W) ...............................................................................679
9.5.2.7
Mask Bits Register (0x60; R/W) .....................................................................................680
9.5.2.8
Pending Bits Register (0x64; R/W) ................................................................................680
9.5.3
MSI-X Configuration .......................................................................................................... 680
9.5.3.1
Capability ID Register (0x70; RO) ..................................................................................680
9.5.3.2
Next Pointer Register (0x71; RO) ...................................................................................680
9.5.3.3
Message Control Register (0x72; R/W) ........................................................................... 681
9.5.3.4
Table Offset Register (0x74; R/W).................................................................................. 682
9.5.3.5
PBA Offset Register (0x78; R/W) ................................................................................... 682
9.5.4
Vital Product Data Registers.................................................................................................683
9.5.4.1
Capability ID Register (0xE0; RO) .................................................................................. 683
9.5.4.2
Next Pointer Register (0xE1; RO) ................................................................................... 683
9.5.4.3
VPD Address Register (0xE2; RW) .................................................................................. 683
9.5.4.4
VPD Data Register (0xE4; RW) ...................................................................................... 683
9.5.5
PCIe Configuration Registers................................................................................................683
9.5.5.1
Capability ID Register (0xA0; RO) .................................................................................. 684
9.5.5.2
Next Pointer Register (0xA1; RO) ................................................................................... 684
9.5.5.3
PCIe CAP Register (0xA2; RO) ....................................................................................... 684
9.5.5.4
Device Capability Register (0xA4; RW)............................................................................ 684
9.5.5.5
Device Control Register (0xA8; RW) ............................................................................... 685
9.5.5.6
Device Status Register (0xAA; RW1C)............................................................................. 686
9.5.5.7
Link CAP Register (0xAC; RO)........................................................................................ 687
9.5.5.8
Link Control Register (0xB0; RO) ................................................................................... 688
9.5.5.9
Link Status Register (0xB2; RO) .................................................................................... 689
9.5.5.10
Reserved Registers (0xB4-0xC0; RO) ............................................................................. 690
9.5.5.11
Device CAP 2 Register (0xC4; RO).................................................................................. 690
9.5.5.12
Device Control 2 Register (0xC8; RW) ............................................................................ 691
9.6
PCIe Extended Configuration Space ........................................................................................... 692
9.6.1
Advanced Error Reporting (AER) Capability ............................................................................693
9.6.1.1
PCIe CAP ID Register (0x100; RO) ................................................................................. 694
9.6.1.2
Uncorrectable Error Status Register (0x104; R/W1CS) ......................................................694
9.6.1.3
Uncorrectable Error Mask Register (0x108; RWS) ............................................................. 695
9.6.1.4
Uncorrectable Error Severity Register (0x10C; RWS) ........................................................ 695
9.6.1.5
Correctable Error Status Register (0x110; R/W1CS) ......................................................... 696
9.6.1.6
Correctable Error Mask Register (0x114; RWS) ................................................................ 696
9.6.1.7
Advanced Error Capabilities and Control Register (0x118; RO) ...........................................697
9.6.1.8
Header Log Register (0x11C:0x128; RO)......................................................................... 697
9.6.2
Serial Number ...................................................................................................................697
9.6.2.1
Device Serial Number Enhanced Capability Header Register (0x140; RO).............................697
9.6.2.2
Serial Number Register (0x144:0x148; RO)..................................................................... 698
9.6.3
ARI Capability Structure ......................................................................................................699
9.6.3.1
PCIe ARI Header Register (0x150; RO) ........................................................................... 700
9.6.3.2
PCIe ARI Capabilities & Control Register (0x154; RO) .......................................................700
9.6.4
IOV Capability Structure......................................................................................................700
9.6.4.1
PCIe SR-IOV Header Register (0x160; RO) ...................................................................... 702
9.6.4.2
PCIe SR-IOV Capabilities Register (0x164; RO) ................................................................ 702
9.6.4.3
PCIe SR-IOV Control Register (0x168; RW) ..................................................................... 702
9.6.4.4
PCIe SR-IOV Max/Total VFs Register (0x16C) .................................................................. 703
9.6.4.5
PCIe SR-IOV Num VFs Register (0x170; R/W).................................................................. 704
9.6.4.6
PCIe SR-IOV VF RID Mapping Register (0x174; RO).......................................................... 704
9.6.4.7
PCIe SR-IOV VF Device ID Register (0x178; RO) .............................................................. 705
9.6.4.8
PCIe SR-IOV Supported Page Size Register (0x17C; RO) ...................................................705
9.6.4.9
PCIe SR-IOV System Page Size Register (0x180; R/W) .....................................................706
9.6.4.10
PCIe SR-IOV BAR 0 - Low Register (0x184; R/W) ............................................................. 706
9.6.4.11
PCIe SR-IOV BAR 0 - High Register (0x188; R/W) ............................................................ 706
9.6.4.12
PCIe SR-IOV BAR 2 Register (0x18C; RO) ....................................................................... 707
9.6.4.13
PCIe SR-IOV BAR 3 - Low Register (0x190; R/W) ............................................................. 707
9.6.4.14
PCIe SR-IOV BAR 3 - High Register (0x194; R/W) ............................................................ 707
9.6.4.15
PCIe SR-IOV BAR 5 Register (0x198; RO) ....................................................................... 707
9.6.4.16
PCIe SR-IOV VF Migration State Array Offset Register (0x19C; RO) ....................................707
9.7
Virtual Functions (VF) Configuration Space.................................................................................. 708
9.7.1
Legacy Header Details ........................................................................................................710
9.7.1.1
VF Command Register (0x4; RW) ................................................................................... 710
9.7.1.2
VF Status Register (0x6; RW) ........................................................................................ 711
9.7.2
VF Legacy Capabilities.........................................................................................................711
9.7.2.1
VF MSI-X Capability .....................................................................................................711
9.7.2.1.1
VF MSI-X Control Register (0x72; RW)......................................................................... 711
9.7.2.2
VF PCIe Capability Registers ..........................................................................................712
9.7.2.2.1
VF Device Control Register (0xA8; RW) ........................................................................712
9.7.2.2.2
VF Device Status Register (0xAA; RW1C) .....................................................................712
9.7.2.3
VF Advanced Error Reporting Registers ...........................................................................713
9.7.2.3.1
VF Uncorrectable Error Status Register (0x104; R/W1CS) ...............................................713
9.7.2.3.2
VF Correctable Error Status Register (0x110; R/W1CS) ..................................................714
10.0
System Manageability ............................................................................................................. 715
10.1
Pass-Through (PT) Functionality ................................................................................................ 715
10.2
Sideband Packet Routing .......................................................................................................... 716
10.3
Components of the Sideband Interface ....................................................................................... 716
10.3.1
Physical Layer................................................................................................................... 716
10.3.1.1
SMBus ........................................................................................................................716
10.3.1.2
NC-SI .........................................................................................................................716
10.3.2
Logical Layer .................................................................................................................... 717
10.3.2.1
SMBus ........................................................................................................................717
10.3.2.2
NC-SI .........................................................................................................................717
10.4
Packet Filtering ....................................................................................................................... 717
10.4.1
Manageability Receive Filtering............................................................................................ 717
10.4.2
EtherType Filters ............................................................................................................... 719
10.4.3
L2 Layer Filtering .............................................................................................................. 719
10.4.4
L3/L4 Filtering .................................................................................................................. 719
10.4.4.1
ARP Filtering ...............................................................................................................719
10.4.4.2
Neighbor Discovery Filtering ..........................................................................................720
10.4.4.3
RMCP Filtering .............................................................................................................720
10.4.4.4
Flexible Port Filtering ....................................................................................................720
10.4.4.5
Flexible 128 Byte Filter .................................................................................................720
10.4.4.5.1
Flexible Filter Structure..............................................................................................720
10.4.4.5.2
TCO Filter Programming.............................................................................................720
10.4.4.6
IP Address Filtering ......................................................................................................721
10.4.4.7
Checksum Filtering.......................................................................................................721
10.4.5
Configuring Manageability Filters ......................................................................................... 721
10.4.5.1
Manageability Decision Filters (MDEF) and Extended Manageability Decision Filters (MDEF_EXT) ......722
10.4.5.2
Management to Host Filter ............................................................................................724
10.4.6
Possible Configurations ...................................................................................................... 725
10.4.6.1
Dedicated MAC Packet Filtering ......................................................................................725
10.4.6.2
Broadcast Packet Filtering .............................................................................................725
10.4.6.3
VLAN Packet Filtering....................................................................................................726
10.4.6.4
Receive Filtering with Shared IP .....................................................................................726
10.4.7
Determining Manageability MAC address............................................................................... 727
10.5
SMBus Pass-Through Interface .................................................................................................. 727
10.5.1
General............................................................................................................................ 727
10.5.2
Pass-Through Capabilities................................................................................................... 727
10.5.3
Pass-Through Multi-Port Modes ........................................................................................... 728
10.5.4
Automatic Ethernet ARP Operation....................................................................................... 728
10.5.4.1
ARP Packet Formats .....................................................................................................728
10.5.5
SMBus Transactions........................................................................................................... 730
10.5.5.1
SMBus Addressing........................................................................................................731
10.5.5.2
SMBus ARP Functionality ...............................................................................................731
10.5.5.3
SMBus ARP Flow ..........................................................................................................731
10.5.5.4
SMBus ARP UDID Content .............................................................................................734
10.5.5.5
SMBus ARP in Dual/Single Mode.....................................................................................735
10.5.5.6
Concurrent SMBus Transactions .....................................................................................735
10.5.6
SMBus Notification Methods ................................................................................................ 736
10.5.6.1
SMBus Alert and Alert Response Method .........................................................................736
10.5.6.2
Asynchronous Notify Method..........................................................................................737
10.5.6.3
Direct Receive Method ..................................................................................................737
10.5.7
Receive TCO Flow.............................................................................................................. 738
10.5.8
Transmit TCO Flow ............................................................................................................ 738
10.5.8.1
Transmit Errors in Sequence Handling.............................................................................739
10.5.8.2
TCO Command Aborted Flow ......................................................................................... 739
10.5.9
SMBus ARP Transactions .....................................................................................................739
10.5.9.1
Prepare to ARP ............................................................................................................ 739
10.5.9.2
Reset Device (General) .................................................................................................740
10.5.9.3
Reset Device (Directed) ................................................................................................740
10.5.9.4
Assign Address ............................................................................................................ 740
10.5.9.5
Get UDID (General and Directed) ................................................................................... 741
10.5.10
SMBus Pass-Through Transactions ........................................................................................742
10.5.10.1
Write SMBus Transactions ............................................................................................. 743
10.5.10.1.1 Transmit Packet Command......................................................................................... 743
10.5.10.1.2 Request Status Command .......................................................................................... 743
10.5.10.1.3 Receive Enable Command .......................................................................................... 744
10.5.10.1.3.1 Management MAC Address (Data Bytes 7:2) .......................................................... 745
10.5.10.1.3.2 Management IP Address (Data Bytes 11:8) ............................................................ 745
10.5.10.1.3.3 Asynchronous Notification SMBus Address (Data Byte 12) ........................................745
10.5.10.1.3.4 Interface Data (Data Byte 13) .............................................................................. 745
10.5.10.1.3.5 Alert Value Data (Data Byte 14) ........................................................................... 745
10.5.10.1.4 Force TCO Command................................................................................................. 746
10.5.10.1.5 Management Control ................................................................................................. 746
10.5.10.1.5.1 Update Management Receive Filter Parameters.......................................................747
10.5.10.1.6 Update MACSec Parameters ....................................................................................... 749
10.5.10.2
Read SMBus Transactions ............................................................................................. 751
10.5.10.2.1 Receive TCO LAN Packet Transaction ........................................................................... 752
10.5.10.2.1.1 Receive TCO LAN Status Payload Transaction ......................................................... 753
10.5.10.2.2 Read Status Command .............................................................................................. 755
10.5.10.2.3 Get System MAC Address ........................................................................................... 757
10.5.10.2.4 Read Management Parameters.................................................................................... 758
10.5.10.2.5 Read Management Receive Filter Parameters ................................................................ 759
10.5.10.2.6 Read Receive Enable Configuration.............................................................................. 761
10.5.10.2.7 Read MACSec Parameters ...................................................................................... 761
10.5.11
LAN Fail-Over in LAN Teaming Mode .....................................................................................764
10.5.11.1
Fail-Over Functionality .................................................................................................. 764
10.5.11.1.1 Transmit Functionality ............................................................................................... 764
10.5.11.1.2 Receive Functionality................................................................................................. 764
10.5.11.1.3 Port Switching (Fail-Over) ..........................................................................................764
10.5.11.1.4 Device Driver Interactions ..........................................................................................764
10.5.11.2
Fail-Over Configuration ................................................................................................. 765
10.5.11.2.1 Preferred Primary Port ............................................................................................... 765
10.5.11.2.2 Gratuitous ARPs........................................................................................................ 765
10.5.11.2.3 Link Down Timeout ...................................................................................................765
10.5.11.3
Fail-Over Register ........................................................................................................ 765
10.5.12
Example Configuration Steps ...............................................................................................767
10.5.12.1
Example 1 - Shared MAC, RMCP only ports ...................................................................... 767
10.5.12.1.1 Example 1 Pseudo Code............................................................................................. 767
10.5.12.2
Example 2 - Dedicated MAC, Auto ARP Response
and RMCP port filtering ................................................................................................. 768
10.5.12.2.1 Example 2 - Pseudo Code........................................................................................... 768
10.5.12.3
Example 3 - Dedicated MAC & IP Address ........................................................................ 770
10.5.12.3.1 Example 3 - Pseudo Code........................................................................................... 770
10.5.12.4
Example 4 - Dedicated MAC and VLAN Tag ..................................................................... 772
10.5.12.4.1 Example 4 - Pseudo Code........................................................................................... 773
10.5.13
SMBus Troubleshooting .......................................................................................................774
10.5.13.1
TCO Alert Line Stays Asserted After a Power Cycle ........................................................... 774
10.5.13.2
When SMBus Commands Are Always NACK'd ................................................................... 775
10.5.13.3
SMBus Clock Speed Is 16.6666 KHz ............................................................................... 775
10.5.13.4
A Network Based Host Application Is Not Receiving
Any Network Packets .................................................................................................... 775
10.5.13.5
Unable to Transmit Packets from the MC ......................................................................... 776
10.5.13.6
SMBus Fragment Size ...................................................................................................776
10.5.13.7
Losing Link.................................................................................................................. 777
10.5.13.8
Enable XSum Filtering .................................................................................................. 777
10.5.13.9
Still Having Problems? ..................................................................................................777
10.6
NC-SI Pass Through Interface ................................................................................................... 777
10.6.1
Overview ......................................................................................................................... 778
10.6.1.1
Terminology ................................................................................................................778
10.6.1.2
System Topology .........................................................................................................779
10.6.1.3
Data Transport ............................................................................................................780
10.6.1.3.1
Control Frames .........................................................................................................780
10.6.1.3.2
NC-SI Frames Receive Flow ........................................................................................781
10.6.2
NC-SI Support .................................................................................................................. 782
10.6.2.1
Supported Features ......................................................................................................782
10.6.2.2
NC-SI Mode — Intel Specific Commands .........................................................................783
10.6.2.2.1
Overview .................................................................................................................783
10.6.2.2.2
OEM Command (0x50) ..............................................................................................784
10.6.2.2.3
OEM Response (0xD0) ...............................................................................................784
10.6.2.2.4
OEM Specific Command Response Reason Codes ...........................................................785
10.6.2.3
Proprietary Commands Format.......................................................................................786
10.6.2.3.1
Set Intel Filters Control Command
(Intel Command 0x00) ...............................................................................................786
10.6.2.3.2
Set Intel Filters Control Response Format
(Intel Command 0x00) ...............................................................................................787
10.6.2.4
Set Intel Filters Control — IP Filters Control Command
(Intel Command 0x00, Filter Control Index 0x00) .............................................................787
10.6.2.4.1
Set Intel Filters Control — IP Filters Control Response
(Intel Command 0x00, Filter Control Index 0x00) ..........................................................788
10.6.2.5
Get Intel Filters Control Commands
(Intel Command 0x01)..................................................................................................788
10.6.2.5.1
Get Intel Filters Control — IP Filters Control Command
(Intel Command 0x01, Filter Control Index 0x00) ..........................................................788
10.6.2.5.2
Get Intel Filters Control — IP Filters Control Response
(Intel Command 0x01, Filter Control Index 0x00) ..........................................................789
10.6.2.6
Set Intel Filters Formats................................................................................................789
10.6.2.6.1
Set Intel Filters Command (Intel Command 0x02) .........................................................789
10.6.2.6.2
Set Intel Filters Response (Intel Command 0x02) ..........................................................789
10.6.2.6.3
Set Intel Filters — Manageability to Host Command
(Intel Command 0x02, Filter Parameter 0x0A)...............................................................789
10.6.2.6.4
Set Intel Filters — Manageability to Host Response
(Intel Command 0x02, Filter Parameter 0x0A)...............................................................790
10.6.2.6.5
Set Intel Filters — Flex Filter 0 Enable Mask and Length Command
(Intel Command 0x02, Filter Parameter 0x10/0x20/0x30/0x40) ......................................790
10.6.2.6.6
Set Intel Filters — Flex Filter 0 Enable Mask and Length Response
(Intel Command 0x02, Filter Parameter 0x10/0x20/0x30/0x40) ......................................791
10.6.2.6.7
Set Intel Filters — Flex Filter 0 Data Command
(Intel Command 0x02, Filter Parameter 0x11/0x21/0x31/0x41) ......................................791
10.6.2.6.8
Set Intel Filters — Flex Filter 0 Data Response
(Intel Command 0x02, Filter Parameter 0x11/0x21/0x31/0x41) ......................................792
10.6.2.6.9
Set Intel Filters — Packet Addition Decision Filter Command
(Intel Command 0x02, Filter Parameter 0x61) ...............................................................792
10.6.2.6.10 Set Intel Filters — Packet Addition Decision Filter Response
(Intel Command 0x02, Filter Parameter 0x61) ...............................................................794
10.6.2.6.11 Set Intel Filters — Flex TCP/UDP Port Filter Command
(Intel Command 0x02, Filter Parameter 0x63) ...............................................................794
10.6.2.6.12 Set Intel Filters — Flex TCP/UDP Port Filter Response
(Intel Command 0x02, Filter Parameter 0x63) ...............................................................794
10.6.2.6.13 Set Intel Filters — IPv4 Filter Command
(Intel Command 0x02, Filter Parameter 0x64) ...............................................................794
10.6.2.6.14 Set Intel Filters — IPv4 Filter Response
(Intel Command 0x02, Filter Parameter 0x64) ...............................................................795
10.6.2.6.15 Set Intel Filters — IPv6 Filter Command
(Intel Command 0x02, Filter Parameter 0x65) ...............................................................795
10.6.2.6.16 Set Intel Filters — IPv6 Filter Response
(Intel Command 0x02, Filter Parameter 0x65) ...............................................................796
10.6.2.6.17
Set Intel Filters - EtherType Filter Command
(Intel Command 0x02, Filter parameter 0x67)............................................................... 796
10.6.2.6.18 Set Intel Filters - EtherType Filter Response (Intel Command 0x02, Filter parameter 0x67).796
10.6.2.6.19 Set Intel Filters - Packet Addition Extended Decision Filter
Command (Intel Command 0x02, Filter parameter 0x68)................................................797
10.6.2.6.20 Set Intel Filters - Packet Addition Extended Decision Filter
Response (Intel Command 0x02, Filter parameter 0x68).................................................799
10.6.2.7
Get Intel Filters Formats ............................................................................................... 799
10.6.2.7.1
Get Intel Filters Command (Intel Command 0x03)......................................................... 799
10.6.2.7.2
Get Intel Filters Response (Intel Command 0x03).......................................................... 799
10.6.2.7.3
Get Intel Filters — Manageability to Host Command
(Intel Command 0x03, Filter Parameter 0x0A)............................................................... 800
10.6.2.7.4
Get Intel Filters — Manageability to Host Response
(Intel Command 0x03, Filter Parameter 0x0A)............................................................... 800
10.6.2.7.5
Get Intel Filters — Flex Filter 0 Enable Mask and Length Command
(Intel Command 0x03, Filter Parameter 0x10/0x20/0x30/0x40) ......................................801
10.6.2.7.6
Get Intel Filters — Flex Filter 0 Enable Mask and Length Response
(Intel Command 0x03, Filter Parameter 0x10/0x20/0x30/0x40) ......................................801
10.6.2.7.7
Get Intel Filters — Flex Filter 0 Data Command
(Intel Command 0x03, Filter Parameter 0x11/0x21/0x31/0x41) ......................................801
10.6.2.7.8
Get Intel Filters — Flex Filter 0 Data Response
(Intel Command 0x03, Filter Parameter 0x11)............................................................... 802
10.6.2.7.9
Get Intel Filters — Packet Addition Decision Filter Command
(Intel Command 0x03, Filter Parameter 0x61)............................................................... 802
10.6.2.7.10 Get Intel Filters — Packet Addition Decision Filter Response
(Intel Command 0x03, Filter Parameter 0x61)............................................................... 802
10.6.2.7.11 Get Intel Filters — Flex TCP/UDP Port Filter Command
(Intel Command 0x03, Filter Parameter 0x63)............................................................... 803
10.6.2.7.12 Get Intel Filters — Flex TCP/UDP Port Filter Response
(Intel Command 0x03, Filter Parameter 0x63)............................................................... 803
10.6.2.7.13 Get Intel Filters — IPv4 Filter Command
(Intel Command 0x03, Filter Parameter 0x64)............................................................... 803
10.6.2.7.14 Get Intel Filters — IPv4 Filter Response
(Intel Command 0x03, Filter Parameter 0x64)............................................................... 804
10.6.2.7.15 Get Intel Filters — IPv6 Filter Command
(Intel Command 0x03, Filter Parameter 0x65)............................................................... 804
10.6.2.7.16 Get Intel Filters — IPv6 Filter Response
(Intel Command 0x03, Filter parameter 0x65)............................................................... 805
10.6.2.8
Set Intel Packet Reduction Filters Formats....................................................................... 805
10.6.2.8.1
Set Intel Packet Reduction Filters Command
(Intel Command 0x04) ............................................................................................... 805
10.6.2.8.2
Set Intel Packet Reduction Filters Response
(Intel Command 0x04) ............................................................................................... 805
10.6.2.8.3
Set Unicast Packet Reduction Command
(Intel Command 0x04, Reduction Filter Index 0x00).......................................................805
10.6.2.8.4
Set Unicast Packet Reduction Response
(Intel Command 0x04, Reduction Filter Index 0x00).......................................................807
10.6.2.8.5
Set Multicast Packet Reduction Command
(Intel Command 0x04, Reduction Filter Index 0x01).......................................................807
10.6.2.8.6
Set Multicast Packet Reduction Response (Intel Command 0x04, Reduction Filter Index 0x01) .809
10.6.2.8.7
Set Broadcast Packet Reduction Command
(Intel Command 0x04, Reduction Filter Index 0x02).......................................................809
10.6.2.8.8
Set Broadcast Packet Reduction Response
(Intel Command 0x04, Reduction Filter Index 0x02).......................................................811
10.6.2.9
Get Intel Packet Reduction Filters Formats....................................................................... 811
10.6.2.9.1
Get Intel Packet Reduction Filters Command
(Intel Command 0x05) ............................................................................................... 811
10.6.2.9.2
Get Intel Packet Reduction Filters Response
(Intel Command 0x05) ............................................................................................... 811
10.6.2.9.3
Get Unicast Packet Reduction Command
(Intel Command 0x05, Reduction Filter Index 0x00).......................................................812
10.6.2.9.4
Get Unicast Packet Reduction Response
(Intel Command 0x05, Reduction Filter Index 0x00).......................................................812
10.6.2.9.5
Get Multicast Packet Reduction Command
(Intel Command 0x05, Reduction Filter Index 0x01).......................................................812
10.6.2.9.6
Get Multicast Packet Reduction Response
(Intel Command 0x05, Reduction Filter Index 0x01).......................................................813
10.6.2.9.7
Get Broadcast Packet Reduction Command
(Intel Command 0x05, Reduction Filter Index 0x02).......................................................813
10.6.2.9.8
Get Broadcast Packet Reduction Response
(Intel Command 0x05, Reduction Filter Index 0x02).......................................................813
10.6.2.10
System MAC Address....................................................................................................813
10.6.2.10.1 Get System MAC Address Command
(Intel Command 0x06) ...............................................................................................813
10.6.2.10.2 Get System MAC Address Response
(Intel Command 0x06) ...............................................................................................814
10.6.2.11
Set Intel Management Control Formats ...........................................................................814
10.6.2.11.1 Set Intel Management Control Command
(Intel Command 0x20) ...............................................................................................814
10.6.2.11.2 Set Intel Management Control Response
(Intel Command 0x20) ...............................................................................................815
10.6.2.12
Get Intel Management Control Formats ...........................................................................815
10.6.2.12.1 Get Intel Management Control Command
(Intel Command 0x21) ...............................................................................................815
10.6.2.12.2 Get Intel Management Control Response
(Intel Command 0x21) ...............................................................................................815
10.6.2.13
TCO Reset...................................................................................................................816
10.6.2.13.1 Perform Intel TCO Reset Command
(Intel Command 0x22) ...............................................................................................816
10.6.2.13.2 Perform Intel TCO Reset Response (Intel Command 0x22)..............................................817
10.6.2.14
Checksum Offloading ....................................................................................................817
10.6.2.14.1 Enable Checksum Offloading Command
(Intel Command 0x23) ...............................................................................................817
10.6.2.14.2 Enable Checksum Offloading Response
(Intel Command 0x23) ...............................................................................................817
10.6.2.14.3 Disable Checksum Offloading Command
(Intel Command 0x24) ...............................................................................................818
10.6.2.14.4 Disable Checksum Offloading Response
(Intel Command 0x24) ...............................................................................................818
10.6.2.15
MACSec Control Commands format (Intel Command 0x30)................................................818
10.6.2.15.1 Transfer MACSec Ownership to MC Command (Intel Command 0x30, Parameter 0x10) ......818
10.6.2.15.2 Transfer MACSec Ownership to MC Response (Intel Command 0x30, Parameter 0x10) .......819
10.6.2.15.3 Transfer MACSec Ownership to Host Command (Intel Command 0x30, Parameter 0x11) ....819
10.6.2.15.4 Transfer MACSec Ownership to Host Response (Intel Command 0x30, Parameter 0x11) .....819
10.6.2.15.5 Initialize MACSec RX Command (Intel Command 0x30, Parameter 0x12)..........................820
10.6.2.15.6 Initialize MACSec RX Response (Intel Command 0x30, Parameter 0x12)...........................820
10.6.2.15.7 Initialize MACSec TX Command (Intel Command 0x30, Parameter 0x13) ..........................820
10.6.2.15.8 Initialize MACSec TX Response (Intel Command 0x30, Parameter 0x13) ...........................821
10.6.2.15.9 Set MACSec RX Key Command (Intel Command 0x30, Parameter 0x14)...........................822
10.6.2.15.10 Set MACSec RX Key Response (Intel Command 0x30, Parameter 0x14)............................822
10.6.2.15.11 Set MACSec TX Key Command (Intel Command 0x30, Parameter 0x15) ...........................823
10.6.2.15.12 Set MACSec TX Key Response (Intel Command 0x30, Parameter 0x15) ............................823
10.6.2.15.13 Enable Network TX Encryption Command (Intel Command 0x30, Parameter 0x16) ............823
10.6.2.15.14 Enable Network TX Encryption Response (Intel Command 0x30, Parameter 0x16) .............824
10.6.2.15.15 Disable Network TX Encryption Command (Intel Command 0x30, Parameter 0x17)............824
10.6.2.15.16 Disable Network TX Encryption Response (Intel Command 0x30, Parameter 0x17) ............824
10.6.2.15.17 Enable Network RX Decryption Command (Intel Command 0x30, Parameter 0x18) ............825
10.6.2.15.18 Enable Network RX Decryption Response (Intel Command 0x30, Parameter 0x18).............825
10.6.2.15.19 Disable Network RX Decryption Command (Intel Command 0x30, Parameter 0x19) ...........825
10.6.2.15.20 Disable Network RX Decryption Response (Intel Command 0x30, Parameter 0x19) ............826
10.6.2.15.21 Get MACSec Parameters format (Intel Command 0x31)..................................................826
10.6.2.15.22 Get MACSec RX Parameters Command (Intel Command 0x31, Parameter 0x01) ................826
10.6.2.15.23 Get MACSec RX Parameters Response (Intel Command 0x31, Parameter 0x01).................826
10.6.2.15.24 Get MACSec TX Parameters Command (Intel Command 0x31, Parameter 0x02) ................827
10.6.2.15.25 Get MACSec TX Parameters Response (Intel Command 0x31, Parameter 0x02) .................828
10.6.2.16
MACSec AEN (Intel AEN 0x80) ....................................................................................... 829
10.6.3
Basic NC-SI Workflows........................................................................................................829
10.6.3.1
Package States ............................................................................................................ 829
10.6.3.2
Channel States ............................................................................................................ 830
10.6.3.3
Discovery.................................................................................................................... 830
10.6.3.4
Configurations ............................................................................................................. 830
10.6.3.4.1
NC Capabilities Advertisement .................................................................................... 830
10.6.3.4.2
Receive Filtering ....................................................................................................... 831
10.6.3.4.2.1
MAC Address Filtering ......................................................................................... 831
10.6.3.4.3
VLAN....................................................................................................................... 831
10.6.3.5
Pass-Through Traffic States ........................................................................................... 832
10.6.3.6
Channel Enable............................................................................................................ 832
10.6.3.7
Network Transmit Enable .............................................................................................. 832
10.6.4
Asynchronous Event Notifications
10.6.5 Querying Active Parameters
10.6.6 Resets
10.6.7 Advanced Workflows
10.6.7.1 Multi-NC Arbitration
10.6.7.2 Package Selection Sequence Example
10.6.7.3 External Link Control
10.6.7.4 Set Link While LAN PCIe Functionality is Disabled
10.6.7.5 Multiple Channels (Fail-Over)
10.6.7.5.1 Fail-Over Algorithm Example
10.6.7.6 Statistics
10.7 Manageability Host Interface
10.7.1 HOST CSR Interface (Function 1/0)
10.7.2 Host Slave Command Interface to Manageability
10.7.3 Host Slave Command Interface Low Level Flow
10.7.4 Host Slave Command Registers
10.7.4.1 Host Interface Control Register (CSR Address 0x8F00; AUX 0x0700)
10.7.4.2 Firmware Status 0 (FWS0R) Register (CSR Address 0x8F0C; AUX 0x0702)
10.7.4.3 Software Status Register (CSR Address 0x8F10; AUX 0x0703)
10.7.5 Host Interface Command Structure
10.7.6 Host Interface Status Structure
10.7.7 Checksum Calculation Algorithm
10.7.8 Host Slave Interface Commands
10.7.9 Fail-Over Configuration Host Command
10.7.10 Read Fail-Over Configuration Host Command
10.8 MACSec and Manageability
10.8.1 Handover of MACSec Responsibility Between MC and Host
10.8.1.1 KaY Ownership Release by the Host
10.8.1.2 KaY Ownership Takeover by BMC
10.8.1.3 KaY Ownership Request by the Host
10.8.1.4 KaY Ownership Release by BMC
10.8.1.5 Control Registers
10.8.2 Filtering of Non-MACSec Packets
10.8.3 Sending of clear packets in a MACSec environment
11.0 Electrical / Mechanical Specification
11.1 Introduction
11.2 Operating Conditions
11.2.1 Recommended Operating Conditions
11.3 Power Delivery
11.3.1 Power Supply Specification
11.3.1.1 Power On/Off Sequence
11.4 DC/AC Specification
11.4.1 Ball Summary
11.4.2 DC specifications
11.4.2.1 Current Consumption
11.4.2.2 Digital I/O
11.4.2.3 Open Drain I/Os
11.4.2.4 NC-SI Input and Output Pads
11.4.3 Digital I/F AC Specifications
11.4.3.1 Digital I/O AC Specifications
11.4.3.2 Reset signals
11.4.3.2.1 Internal_Power_On_Reset
11.4.3.3 SMBus
11.4.3.4 FLASH AC Specification
11.4.3.5 EEPROM AC Specification
11.4.3.6 NC-SI AC Specification
11.4.3.7 JTAG AC specification
11.4.3.8 MDIO AC specification
11.4.3.9 SFP 2 Wires I/F AC Specification
11.4.3.10 PCIe/SerDes DC/AC Specification
11.4.3.11 PCIe Specification - Receiver
11.4.3.12 PCIe Specification - Transmitter
11.4.3.13 PCIe Specification - Input Clock
11.4.4 Serdes DC/AC Specification
11.4.4.1 Serdes Specification - Receiver
11.4.4.2 Serdes Specification - Transmitter
11.4.4.3 Serdes Specification - Input Clock
11.4.5 PHY Specification
11.4.6 XTAL/Clock specification
11.4.6.1 Crystal Specification
11.4.6.2 External Clock Oscillator Specification
11.4.7 RBIAS connection
11.5 EEPROM Flash Devices
11.5.1 Flash
11.5.2 EEPROM Device Options
11.6 Package Information
11.6.1 Mechanical
11.6.2 Intel® 82576 GbE Controller Package
11.6.2.1 Package Schematics
12.0 Design Guidelines
12.1 82575/82576
12.1.1 Pin Out Compatibility
12.1.1.1 Printed Circuit Board Requirements
12.1.1.2 82576 Design
12.1.1.3 82575 Design
12.2 Port Connection to the Device
12.2.1 PCIe Reference Clock
12.2.2 Other PCIe Signals
12.2.3 Physical Layer Features
12.2.3.1 Link Width Configuration
12.2.3.2 Polarity Inversion
12.2.3.3 Lane Reversal
12.2.4 PCIe Routing
12.3 Ethernet Component Design Guidelines
12.3.1 General Design Considerations for Ethernet Controllers
12.3.1.1 Clock Source
12.3.1.2 Magnetics for 1000 BASE-T
12.3.1.2.1 Magnetics Module Qualification Steps
12.3.1.2.2 Magnetics Module for 1000 BASE-T Ethernet
12.3.1.2.3 Third-Party Magnetics Manufacturers
12.3.1.2.4 Layout Guidelines for Use with Integrated and Discrete Magnetics
12.3.2 Designing with the 82576
12.3.2.1 LAN Disable
12.3.2.2 Serial EEPROM
12.3.2.2.1 EEPROM-less Operation
12.3.2.2.2 SPI EEPROMs
12.3.2.2.3 EEUPDATE
12.3.2.3 FLASH
12.3.2.3.1 FLASH Device Information
12.3.3 SMBus and NC-SI
12.3.4 NC-SI Electrical Interface Requirements
12.3.4.1 External Baseboard Management Controller (BMC)
12.3.4.2 Schematic Showing Pull-ups and Pull-downs for NC-SI Interface
12.3.4.3 Resets
12.3.4.4 Layout Requirements
12.3.4.4.1 Board Impedance
12.3.4.4.2 Trace Length Restrictions
12.3.5 Power Supplies for the Intel® 82576 GbE Controller
12.3.5.1 Power Sequencing
12.3.5.1.1 Using Regulators With Enable Pins
12.3.5.2 Device Power Supply Filtering
12.3.5.3 Power Management and Wake Up
12.3.6 Device Test Capability
12.3.7 Software-Definable Pins (SDPs)
12.4 Frequency Control Device Design Considerations
12.4.1 Frequency Control Component Types
12.4.1.1 Quartz Crystal
12.4.1.2 Fixed Crystal Oscillator
12.4.1.3 Programmable Crystal Oscillators
12.4.1.4 Ceramic Resonator
12.5 Crystal Selection Parameters
12.5.1 Vibrational Mode
12.5.2 Nominal Frequency
12.5.3 Frequency Tolerance
12.5.4 Temperature Stability and Environmental Requirements
12.5.5 Calibration Mode
12.5.6 Load Capacitance
12.5.7 Shunt Capacitance
12.5.8 Equivalent Series Resistance
12.5.9 Drive Level
12.5.10 Aging
12.5.11 Reference Crystal
12.5.11.1 Reference Crystal Selection
12.5.11.2 Circuit Board
12.5.11.3 Temperature Changes
12.6 Oscillator Support
12.6.1 Oscillator Solution
12.7 Ethernet Component Layout Guidelines
12.7.1 Layout Considerations
12.7.1.1 Guidelines for Component Placement
12.7.1.2 Crystals and Oscillators
12.7.1.2.1 Crystal layout considerations
12.7.1.3 Board Stack Up Recommendations
12.7.1.4 Differential Pair Trace Routing for 10/100/1000 Designs
12.7.1.4.1 Signal Termination and Coupling
12.7.1.5 Signal Trace Geometry for 1000 BASE-T Designs
12.7.1.6 Trace Length and Symmetry for 1000 BASE-T Designs
12.7.1.6.1 Signal Detect
12.7.1.7 Routing 1.8 V to the Magnetics Center Tap
12.7.1.8 Impedance Discontinuities
12.7.1.9 Reducing Circuit Inductance
12.7.1.10 Signal Isolation
12.7.1.11 Power and Ground Planes
12.7.1.12 Traces for Decoupling Capacitors
12.7.1.13 Light Emitting Diodes for Designs Based on the 82576
12.7.1.14 Thermal Design Considerations
12.7.2 Physical Layer Conformance Testing
12.7.2.1 Conformance Tests for 10/100/1000 Mbps Designs
12.7.3 Troubleshooting Common Physical Layout Issues
12.8 Serdes Implementation
12.8.1 Connecting the Serdes Interface
12.8.2 Output voltage Adjustment
12.8.3 Output Voltage Adjustment
12.9 Thermal Management
12.10 Reference Schematics
12.11 Checklists
12.12 Symbols
13.0 Thermal Design Specifications
13.1 Product Package Thermal Specification
13.2 Introduction
13.3 Measuring the Thermal Conditions
13.4 Thermal Considerations
13.5 Packaging Terminology
13.6 Thermal Specifications
13.6.1 Case Temperature
13.7 Thermal Attributes
13.7.1 Designing for Thermal Performance
13.7.2 Typical System Definitions
13.7.3 Package Thermal Characteristics
13.7.4 Clearance
13.7.5 Default Enhanced Thermal Solution
13.7.6 Extruded Heat sinks
13.7.7 Attaching the Extruded Heat sink
13.7.7.1 Clips
13.7.7.2 Thermal Interface Material (PCM45F)
13.7.8 Reliability
13.7.9 Thermal Interface Management for Heat-Sink Solutions
13.7.9.1 Bond Line Management
13.7.9.2 Interface Material Performance
13.7.9.2.1 Thermal Resistance of Material
13.7.9.2.2 Wetting/Filling Characteristics of Material
13.8 Measurements for Thermal Specifications
13.8.1 Case Temperature Measurements
13.8.1.1 Attaching the Thermocouple (No Heat Sink)
13.8.1.2 Attaching the Thermocouple (Heat Sink)
13.9 Heat Sink and Attach Suppliers
13.10 PCB Guidelines
14.0 Diagnostics
14.1 JTAG Test Mode Description
15.0 Models, Symbols, Testing Options, Schematics and Checklists
15.1 Models and Symbols
15.2 Physical Layer Conformance Testing
15.3 Schematics
15.4 Checklists
Appendix A. Changes from the 82575
1.0
Introduction
The Intel® 82576 GbE Controller is a single, compact, low-power component that offers two fully
integrated Gigabit Ethernet Media Access Control (MAC) and physical layer (PHY) ports. The device
uses a PCIe* v2.0 (2.5 GT/s) host interface. The 82576 enables a two-port implementation in a
relatively small area and suits server system configurations such as rack-mounted or pedestal
servers, where it can be used as an add-on NIC or a LAN on Motherboard (LOM) design. Another system
configuration is blade servers, where it can be used as a LOM. The 82576 can also be used in
embedded applications such as switch add-on cards and network appliances.
1.1
Scope
This document presents the external architecture (including device operation, pin descriptions, register
definitions, etc.) for the 82576, a dual 10/100/1000 LAN controller.
This document is intended to be a reference for software device driver developers, board designers,
test engineers, or others who may need specific technical or programming information.
1.2
Terminology and Acronyms
Table 1-1. Glossary
Term | Meaning
1000BASE-BX | The PICMG 3.1 electrical specification for transmission of 1 Gb/s Ethernet or 1 Gb/s Fibre Channel encoded data over the backplane.
1000BASE-CX | 1000BASE-X over specialty shielded 150 Ω balanced copper jumper cable assemblies, as specified in IEEE 802.3 Clause 39.
1000BASE-T | The specification for 1 Gb/s Ethernet over category 5e twisted pair cables, as defined in IEEE 802.3 Clause 40.
AH | IP Authentication Header - An IPsec header providing authentication capabilities, defined in RFC 4302 (see note 1).
b/w | Bandwidth.
BIOS | Basic Input/Output System.
BMC | Baseboard Management Controller.
BT | Byte Time.
BWG | Bandwidth Group.
CA | Secure Connectivity Association (CA): A security relationship, established and maintained by key agreement protocols. This comprises a fully connected subset of the service access points in stations attached to a single LAN that are to be supported by MACSec.
CPID | Congestion Point Identifier.
CTS | Cisco Trusted Security.
DCA | Intel® QuickData (Direct Cache Access).
DFP | Deficit Fixed Priority.
DFT | Design for Testability.
DQ | Descriptor Queue.
EEPROM | Electrically Erasable Programmable Read-Only Memory. A non-volatile memory located on the LAN controller that is directly accessible from the host.
EOP | End of Packet.
ESP | IP Encapsulating Security Payload - An IPsec header providing encryption and authentication capabilities, defined in RFC 4303 (see note 1).
FC | Flow Control.
Firmware (FW) | Embedded code on the LAN controller that is responsible for the implementation of the NC-SI protocol and pass-through functionality.
Host Interface | RAM on the LAN controller that is shared between the firmware and the host. The RAM is used to pass commands from the host to firmware and responses from the firmware to the host.
HPC | High-Performance Computing.
IPC | Inter Processor Communication.
IPG | Inter Packet Gap.
LAN (auxiliary Power-Up) | The event of connecting the LAN controller to a power source (occurs even before system power-up).
LOM | LAN on Motherboard.
LSO | Large Send Offload.
MAC | Media Access Control.
MDIO | Management Data Input/Output Interface over MDC/MDIO lines.
MIFS/MIPG | Minimum Inter Frame Spacing/Minimum Inter Packet Gap.
MMW | Maximum Memory Window.
MSS | Maximum Segment Size.
NIC | Network Interface Controller.
PCS
Physical Coding Sub layer.
PF
Physical Function (in a virtualization context).
PHY
Physical Layer Device.
PMA
Physical Medium Attachment.
PMD
Physical Medium Dependent.
PN (in a MACSec context)
packet number (PN): A monotonically increasing value used to
uniquely identify a MACSec frame in the sequence of frames
transmitted using an SA.
NC-SI (Type C)
Reduced Media Independent Interface (Reduced MII).
SA
Source Address.
SA (in a MACSec context)
Secure Association (SA): A security relationship that provides
security guarantees for frames transmitted from one member
of a CA to the others. Each SA is supported by a single secret
key, or a single set of keys where the cryptographic
operations used to protect one frame require more than one
key.
SC
Secure Channel (SC): A security relationship used to provide
security guarantees for frames transmitted from one member
of a CA to the others. An SC is supported by a sequence of
SAs thus allowing the periodic use of fresh keys without
terminating the relationship.
SCI
A globally unique identifier for a secure channel, comprising a
globally unique MAC Address and a Port Identifier, unique
within the system allocated that address.
SDP
Software Defined Pins.
SerDes
Serializer and De-Serializer Circuit.
SFD
Start Frame Delimiter.
SGMII
Serialized Gigabit Media Independent Interface.
SMBus
System Management Bus. A bus that carries various
manageability components, including the LAN controller,
BIOS, sensors and remote-control devices.
TRL
Transmit Rate Limiting or Transmit Rate Limiter, according to
the context.
TSO
Transmit Segmentation Offload - A mode in which a large TCP/
UDP I/O is handed to the device and the device segments it
into L2 packets according to the requested MSS.
VF
Virtual Function.
VM
Virtual Machine.
VPD
Vital Product Data (PCI protocol).
1. The IPsec function is present in the 82576EB SKU. IPsec is removed from the 82576NS SKU.
1.2.1
External Specification and Documents
The 82576 implements features from the following specifications.
1.2.1.1
Network Interface Documents
1. IEEE standard 802.3, 2005 Edition (Ethernet). Incorporates various IEEE Standards previously
published separately. Institute of Electrical and Electronics Engineers (IEEE).
2. IEEE standard 1149.1, 2001 Edition (JTAG). Institute of Electrical and Electronics Engineers (IEEE).
3. IEEE standard 802.1Q for VLAN
4. PICMG3.1 Ethernet/Fibre Channel Over PICMG 3.0 Draft Specification January 14, 2003 Version
D1.0
5. Serial-GMII Specification, Cisco Systems document ENG-46158, Revision 1.7.
6. INF-8074i Specification for SFP (Small Form-factor Pluggable) Transceiver (ftp://ftp.seagate.com/
sff)
1.2.1.2
Host Interface Documents
1. PCI-Express 2.0 Base specification, Revision 1.0
2. PCI Specification, version 3.0
3. PCI Bus Power Management Interface Specification, Rev. 1.2, March 2004
4. Advanced Configuration and Power Interface Specification, Rev 2.0b, October 2002
1.2.1.3
Virtualization Documents
1. PCI-Express Single Root I/O Virtualization and Sharing Specification rev 0.9
2. PCI-SIG Alternative Routing-ID Interpretation (ARI) ECN (http://teamsites.ch.ith.intel.com/sites/
PASDPA/PCIe/PCI%20Express%20Product_Spec%20Coordination/pages/
PCISIG%20WIP%20Docs.aspx)
1.2.1.4
Networking Protocol Documents
1. IPv4 specification (RFC 791)
2. IPv6 specification (RFC 2460)
3. TCP/UDP specification (RFC 793/768)
4. SCTP specification (RFC 2960)
5. ARP specification (RFC 826)
6. EUI-64 specification, http://standards.ieee.org/regauth/oui/tutorials/EUI64.html.
1.2.1.5
Manageability Documents
1. DMTF Network Controller Sideband Interface (NC-SI) Specification rev 0.7. This product is Type C.
2. System Management Bus (SMBus) Specification, SBS Implementers Forum, Ver. 2.0, August 2000
1.2.1.6
Security Documents
1. IEEE P802.1AE/D5.1 — Draft Standard for Local and Metropolitan Area Networks — Media Access
Control (MAC) Security.
2. The Use of Galois/Counter Mode (GCM) in IPsec Encapsulating Security Payload (ESP) (RFC 4106)
3. IP Authentication Header (AH) (RFC 4302)
4. IP Encapsulating Security Payload (ESP) (RFC 4303)
5. The Use of Galois Message Authentication Code (GMAC) in IPsec ESP and AH (RFC 4543).
1.2.2
Intel Application Notes
1. Intel® Ethernet Controllers Loopback Modes - application note.
1.2.3
Reference Schematics
Reference schematics (SERDES/FIBER/SFP and COPPER) are available as a separate document through
Intel documentation channels.
1.2.4
Checklists
The Schematic Checklist and the Layout and Placement Checklist are available as a separate document
through Intel documentation channels.
1.3
Product Overview
The 82576 supports two ports, each with either an internal PHY or a SerDes or SGMII port that can be
connected to an external PHY or directly to a blade connection for MAC to MAC communication.
1.3.1
System Configurations
The 82576 targets server system configurations such as rack mounted or pedestal servers, where the
82576 can be used as an add-on NIC or in a LAN on Motherboard (LOM) design. Another system configuration
is blade servers, where it can be used as LOM. The 82576 can also be used in embedded applications
such as switch add-on cards and network appliances.
1.4
External Interface
1.4.1
PCIe* Interface
The PCIe v2.0 (2.5 GT/s) interface is used by the 82576 as a host interface. It supports x4, x2 and x1
configurations, with each lane running at 2.5 GT/s. The maximum aggregated bandwidth for a
typical x4 configuration is 8 Gb/s in each direction (4 lanes x 2.5 GT/s, less the 8b/10b encoding
overhead). See Chapter 2.0 for a full description. The timing characteristics of this interface are
defined in PCI Express Card Electromechanical Specification rev 1.0 and in the PCIe v2.0 (2.5 GT/s)
specification.
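The 8 Gb/s figure can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the 8b/10b line encoding used at 2.5 GT/s, and the function name pcie_bandwidth_gbps is hypothetical, not part of any Intel software.

```python
def pcie_bandwidth_gbps(lanes, gt_per_s=2.5, encoding_efficiency=8 / 10):
    """Return the per-direction bandwidth in Gb/s after line-encoding overhead.

    Assumes 8b/10b encoding (8 payload bits per 10 transferred bits),
    which applies to 2.5 GT/s PCIe links.
    """
    if lanes not in (1, 2, 4):
        raise ValueError("only x1, x2 and x4 link widths are considered here")
    return lanes * gt_per_s * encoding_efficiency


for lanes in (1, 2, 4):
    print(f"x{lanes}: {pcie_bandwidth_gbps(lanes):.1f} Gb/s per direction")
```

The result does not account for packet-level protocol overhead (TLP headers, flow control), so usable throughput is lower.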
1.4.2
Network Interfaces
Two independent interfaces are used to connect the two 82576 ports to external devices. The following
protocols are supported:
• 10BASE-T and 100BASE-TX.
• 1000BASE-T interface to attach directly to a CAT 5e wire.
• SerDes interface to connect over a backplane to another SerDes compliant device or to an optic
module.
• SGMII interface to attach to an external PHY, either on board or via an SFP module. The SGMII
shares the same interface as the SerDes.
• MDI (Copper) support for standard IEEE 802.3 Ethernet interface for 1000BASE-T, 100BASE-TX,
and 10BASE-T applications (802.3, 802.3u, and 802.3ab).
See Section 2.1.8.2 and Section 2.1.6 for full pin description; Section 11.4.4.1 to Section 11.4.4.3 for
timing characteristics of this interface.
1.4.3
EEPROM Interface
The 82576 uses an EEPROM device for storing product configuration information. Several words of the
EEPROM are accessed automatically by the 82576 after reset in order to provide pre-boot configuration
data that must be available to the 82576 before it is accessed by host software. The remainder of the
stored information is accessed by various software modules used to report product configuration, serial
number, etc.
The 82576 is intended for use with an SPI (4-wire) serial EEPROM device such as an AT25040AN or
compatible. See Section 2.1.2 for full pin description and Section 11.4.3.5 for timing characteristics of
this interface.
The 82576 also supports an EEPROM-less mode, where all of the setup is done by software.
1.4.4
Serial Flash Interface
The 82576 provides an external SPI serial interface to a Flash or Boot ROM device such as the Atmel*
AT25F1024 or AT25FB512. The 82576 supports serial Flash devices with up to 64 Mb (8 MB) of
memory. The size of the Flash used by the 82576 can be configured by the EEPROM. See Section 2.1.2
for full pin description and Section 11.4.3.4 for timing characteristics of this interface.
Note:
Though the 82576 supports devices with up to 8 MB of memory, bigger devices can also be
used. Accesses to memory beyond the Flash device size result in access wrapping as only
the lower address bits are used by the Flash device.
1.4.5
SMBus Interface
SMBus is an optional interface for pass-through and/or configuration traffic between a MC and the
82576.
The 82576's SMBus interface can be configured to support both slow and fast timing modes. See
Section 2.1.3 for full pin description and Section 11.4.3.3 for timing characteristics of this interface.
1.4.6
NC-SI Interface
NC-SI and SMBus interfaces are optional for pass-through and/or configuration traffic between a MC
and the 82576. The NC-SI interface meets the DMTF NC-SI Specification, Rev. 1.0.0.a.
1.4.7
MDIO/2-Wire Interfaces
The 82576 implements two management interfaces for control of an optional external PHY. Each
interface can be either a 2-wire interface used to control an SFP module or an MDIO/MDC management
interface for control plane connection between the MAC and PHY devices (master side). This interface
provides the MAC and software with the ability to monitor and control the state of the PHY. The 82576
supports the data formats of 802.3 clause 22. Each MDIO interface should be connected to the relevant
PHY.
See Section 2.1.7 for full pin description and Section 11.4.3.9 for timing characteristics of this
interface.
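For orientation, the clause 22 management frame that follows the 32-bit preamble can be packed as shown below. This is a minimal sketch of the on-wire frame layout from IEEE 802.3 clause 22 (ST, OP, PHYAD, REGAD, TA, DATA); it does not model the 82576 register interface, and the function name is hypothetical.

```python
ST_CLAUSE22 = 0b01   # start-of-frame pattern for clause 22
OP_WRITE = 0b01
OP_READ = 0b10


def clause22_frame(op, phy_addr, reg_addr, data=0):
    """Pack the 32 bits that follow the preamble of a clause 22 frame.

    Bit layout (MSB first): ST[2] OP[2] PHYAD[5] REGAD[5] TA[2] DATA[16].
    For a read, the PHY drives the turnaround and data bits, so they are
    left as zero here.
    """
    if op not in (OP_READ, OP_WRITE):
        raise ValueError("op must be OP_READ or OP_WRITE")
    if not (0 <= phy_addr < 32 and 0 <= reg_addr < 32):
        raise ValueError("PHY and register addresses are 5-bit fields")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data is a 16-bit field")
    turnaround = 0b10 if op == OP_WRITE else 0b00
    payload = data if op == OP_WRITE else 0
    return (
        (ST_CLAUSE22 << 30)
        | (op << 28)
        | (phy_addr << 23)
        | (reg_addr << 18)
        | (turnaround << 16)
        | payload
    )


# Write 0x1140 to register 0 of the PHY at address 1.
print(hex(clause22_frame(OP_WRITE, phy_addr=1, reg_addr=0, data=0x1140)))
```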
1.4.8
Software-Definable Pins (SDP) Interface (General-Purpose I/O)
The 82576 has four software-defined pins (SDP pins) per port that can be used for miscellaneous
hardware or software-control purposes. These pins can be individually configured to act as either
input or output pins. The default direction of each pin is configurable via the EEPROM (see Section 6.2.8
and Section 6.2.9), as well as the default value of all pins configured as outputs. To avoid signal
contention, all pins are set as input pins until the EEPROM configuration is loaded. All four of the SDP
pins can be configured for use as general-purpose interrupt (GPI) inputs. To act as GPI pins, the desired
pins must be configured as inputs. A corresponding GPI interrupt-detection enable bit is then used to
enable rising-edge detection of the input pin (rising-edge detection occurs by comparing values
sampled at the internal clock rate, as opposed to an edge-detection circuit). When detected, a
corresponding GPI interrupt is indicated in the Interrupt Cause register.
The use, direction, and values of SDP pins are controlled and accessed using fields in the Device Control
(CTRL) register and Extended Device Control (CTRL_EXT) register.
See Section 2.1.5 for full pin description of this interface.
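The sampled rising-edge detection described in this section can be modeled in a few lines. The sketch below is an illustration of the technique (comparing consecutive samples rather than using an edge-detection circuit); the function name and the list-of-samples input are assumptions made for the example.

```python
def detect_rising_edges(samples):
    """Return the sample indexes at which a 0 -> 1 transition is observed.

    Each element of samples is the pin level (0 or 1) captured at one
    sampling-clock tick. An edge is reported when the previous sample
    was 0 and the current sample is 1.
    """
    edges = []
    previous = None
    for index, level in enumerate(samples):
        if previous == 0 and level == 1:
            edges.append(index)
        previous = level
    return edges


print(detect_rising_edges([0, 0, 1, 1, 0, 1]))
```

A consequence of this approach is that a pulse shorter than one sampling period can fall between two samples and go undetected.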
1.4.9
LEDs Interface
The 82576 implements four output drivers per port intended for driving external LED circuits. Each of
the four LED outputs can be individually configured to select the particular event, state, or activity,
which is indicated on that output. In addition, each LED can be individually configured for output
polarity as well as for blinking versus non-blinking (steady-state) indication.
The configuration for LED outputs is specified via the LEDCTL register. Furthermore, the hardware-default configuration for all LED outputs can be specified via EEPROM fields (see Section 6.2.19 and
Section 6.2.21), thereby supporting LED displays configurable to a particular OEM preference.
See Section 2.1.8.1 for full pin description of this interface.
See Section 7.5 for more detailed description of LED behavior.
1.5
Comparing Product Features
The following tables compare features of similar Intel components.
Table 1-2. 82576 Features
Feature | 82576 | 82575 | 82571EB
Number of ports | 2 | 2 | 2
Serial FLASH interface | Y | Y | Y
4-wire SPI EEPROM interface | Y | Y | Y
Configurable LED operation for software or OEM custom-tailoring of LED displays | Y | Y | Y
Protected EEPROM space for private configuration | Y | Y | Y
Device Disable capability | Y | Y | Y
Package size (mm x mm) | 25x25 | 25x25 | 17x17
Watchdog timer | Y | Y | N
Table 1-3. 82576 Network Features
Feature | 82576 | 82575 | 82571EB
Half duplex at 10/100 Mb/s operation and full duplex operation at all supported speeds | Y | Y | Y
10/100/1000 Copper PHY integrated on-chip | Y | Y | Y
Jumbo frames supported | Y | Y | Y
Max size of jumbo frames supported | 9500 bytes | 9500 bytes | 9000 bytes
Flow control support: send/receive PAUSE frames and receive FIFO thresholds | Y | Y | Y
Statistics for management and RMON | Y | Y | Y
802.1q VLAN support | Y | Y | Y
SerDes interface for external PHY connection or system interconnect | Y | Y | Y
SGMII interface for embedded applications | Y | Y | N
Fiber/copper auto-sense* | Y | Y | N
SerDes support of non-Auto-Negotiation partner | Y | Y | N
SerDes signal detect | Y | Y | N
Table 1-4. 82576 Host Interface Features
Feature | 82576 | 82575 | 82571EB
PCIe revision | 2.0 | 2.0 | 1.0a
PCIe physical layer | 2.5 GT/s | 2.5 GT/s | 2.5 GT/s
Bus width | x1, x2, x4 | x1, x2, x4 | x1, x2, x4
64-bit address support for systems using more than 4 GB of physical memory | Y | Y | Y
Outstanding requests for Tx buffers | 4 | 4 | 4
Outstanding requests for Tx descriptors | 1 | 1 | 1
Outstanding requests for Rx descriptors | 1 | 1 | 1
Credits for posted writes | 2 | 2 | 2
Max payload size supported | 512 B | 256 B | 256 B
Max request size supported | 512 B | 512 B | 256 B
Link layer retry buffer size | 2 KB | 2 KB | 2 KB
Vital Product Data (VPD) | Y | N | N
Table 1-5. 82576 LAN Functions Features
Feature | 82576 | 82575 | 82571EB
Programmable host memory receive buffers | Y | Y | Y
Descriptor ring management hardware for transmit and receive | Y | Y | Y
ACPI register set and power down functionality supporting D0 & D3 states | Y | Y | Y
Software controlled global reset bit (resets everything except the configuration registers) | Y | Y | Y
Software Definable Pins (SDP) - per port | 4 | 4 | 4
Four SDP pins can be configured as general purpose interrupts | Y | Y | Only 2
Wake up | Y | Y | Y
IPv6 wake-up filters | Y | Y | Y
Configurable (through the EEPROM) flexible filter | Y | Y | Y
Default configuration by the EEPROM for all LEDs for pre-driver functionality | Y | Y | Y
LAN function disable capability | Y | Y | Y
Programmable memory transmit buffers (up to 32 KB) | Y | Y | Y
Double VLAN | Y | Y | N
IEEE 1588 | Y | N | N
Table 1-6. 82576 LAN Performance Features
Feature | 82576 | 82575 | 82571EB
TCP segmentation offload (up to 256 KB) | Y | Y | Y
Transmit Rate Limiting (TRL) | Y | N | N
IPv6 support for IP/TCP and IP/UDP receive checksum offload | Y | Y | Y
Fragmented UDP checksum offload for packet reassembly | Y | Y | Y
Message Signaled Interrupts (MSI) | Y | Y | Y
Message Signaled Interrupts (MSI-X) | Y | Y | N
Packet interrupt coalescing timers (packet timers) and absolute-delay interrupt timers for both transmit and receive operation | Y | N | N
Interrupt throttling control to limit maximum interrupt rate and improve CPU utilization | Y | Y | Y
Rx packet split header | Y | Y | Y
Receive Side Scaling (RSS) number of queues | Up to 16 | 4 | 2
Total number of RX queues per port | 16 | 4 | 2
Total number of TX queues per port | 16 | 4 | 2
RX header replication | Y | Y | Y
Low latency interrupt | Y | Y | N
DCA support | Y | Y | N
TCP timer interrupts | Y | Y | N
No snoop | Y | Y | Y
Relax ordering | Y | Y | Y
TSO interleaving for reduced latency | Y | N | N
Receive side coalescing | N | N | N
SCTP receive and transmit checksum offload | Y | N | N
UDP TSO | Y | N | N
Table 1-7. 82576 Virtualization Features
Feature | 82576 | 82575 | 82571EB
Support for Virtual Machines Device queues (VMDq) | 8 pools | 4 | N
PCI-SIG SR IOV | 8 VF | N | N
Multicast/Broadcast Packet replication | Y | N | N
VM to VM Packet forwarding | Y | N | N
Traffic shaping | Y | N | N
MAC addresses | 24 | 16 | 15
MAC and VLAN anti-spoofing | Y | N | N
VLAN filtering | Per pool | Global | Global
Per-pool statistics | Y | N | N
Per-pool off loads | Y | Partial | N
Per-pool jumbo support | Y | N | N
Mirroring rules | 4 | 0 | 0
Table 1-8. 82576 Manageability Features
Feature | 82576 | 82575 | 82571EB
Advanced pass-through-compatible management packet transmit/receive support | Y | Y | Y
Manageability support for ASF 1.0 and Alert on LAN 2.0 | N | Y | Y
SMBus interface to external BMC | Y | Y | Y
DMTF NC-SI protocol standards support | Y | Y | N
L2 address filters | 4 | 4 | 1
VLAN L2 filters | 8 | 8 | 4
Flex L3 port filters | 16 | 16 | 3
Flex TCO filters | 4 | 4 | 2
L3 address filters (IPv4) | 4 | 4 | 4
L3 address filters (IPv6) | 4 | 4 | 1
Table 1-9. 82576 Security Features
Feature | 82576 | 82575 | 82571EB
Integrated MACSec security engines: GCM AES 128 encryption or authentication engine; one Secure Connection; two Security Associations; replay protection with zero window | Y | N | N
Integrated IPSec Offload Engine (see note 1) | Y | N | N
• Security Associations - Rx | 256 | |
• Security Associations - Tx | 256 | |
• IP Authentication Header (AH) protocol | Y | |
• IP Encapsulating Security Payload (ESP) for authentication and/or encryption | Y | |
• AES-128-GMAC (128-bit key) engine | Y | |
• IPv4 and IPv6 support (without options or extensions) | Y | |
1. IPsec functionality is present in the 82576EB SKU. IPsec is removed from the 82576NS SKU.
1.6
Overview of New Capabilities
This section describes features of the Intel® 82576 GbE Controller that are new relative to the
82575.
1.6.1
IPsec Off Load for Flows
Note:
The IPsec function is present in the 82576EB SKU. IPsec is removed from the 82576NS
SKU.
The 82576 (SKU: 82576EB) supports IPsec off load for a given number of flows. It is the operating
system’s responsibility to submit the most heavily loaded flows to hardware in order to gain the
maximum CPU utilization savings from the IPsec off load. Main features are:
• Off-load IPsec for up to 256 Security Associations (SA) for each of Tx and Rx.
• AH and ESP protocols for authentication and encryption
• AES-128-GMAC and AES-128-GCM crypto engines
• Transport mode encapsulation
• IPv4 and IPv6 versions (no options or extension headers)
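Because the number of off-loaded Security Associations is limited, system software has to choose which flows to hand to hardware. The sketch below shows one possible selection policy (the most heavily loaded flows first). The per-flow byte counters, the function name, and the policy itself are assumptions for illustration; the datasheet does not prescribe how the operating system ranks flows.

```python
import heapq

MAX_OFFLOADED_SAS = 256  # per direction, per the feature list above


def select_flows_for_offload(flow_loads, capacity=MAX_OFFLOADED_SAS):
    """Return the IDs of the most heavily loaded flows, up to capacity.

    flow_loads maps a flow identifier to a load measure (for example,
    bytes seen in the last measurement interval).
    """
    return set(heapq.nlargest(capacity, flow_loads, key=flow_loads.get))


loads = {"flow-a": 10, "flow-b": 50, "flow-c": 30}
print(select_flows_for_offload(loads, capacity=2))
```

Flows that are not selected would continue to be processed in software.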
1.6.2
Security
The 82576 supports the IEEE 802.1AE specification. It incorporates an inline packet crypto unit to
support both privacy and integrity checks on a packet by packet basis. The transmit data path includes
both encryption and signing engines. On the receive data path, the 82576 includes a decryption engine
and an integrity checker. The crypto engines use an AES GCM algorithm that is designed to support the
802.1AE protocol. Note that both host traffic and MC management traffic might be subjected to
authentication and/or encryption.
1.6.3
Transmit Rate Limiting (TRL)
The 82576 supports the ability to limit the transmit rate. TRL can be enabled for each transmit
queue. The following modes of TRL are used:
• Frame Overhead — IPG is extended by a fixed value for all transmit queues.
• Payload Rate — IPG, stretched relative to frame size, provides pre-determined data (bytes) rates
for each transmit queue.
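To illustrate the idea behind the Payload Rate mode, the sketch below computes how much idle time (expressed in byte times) would have to follow a frame so that a queue averages a target data rate. The formula is a simplified model chosen for this example; it ignores preamble, SFD, and the minimum IPG, and it is not the device's documented algorithm.

```python
def stretched_ipg_bytes(frame_bytes, link_rate_bps, target_rate_bps):
    """Return the idle gap, in byte times, needed after a frame.

    Model: a frame of frame_bytes followed by a gap of G byte times
    occupies (frame_bytes + G) byte times on the wire. Requiring
    frame_bytes / (frame_bytes + G) == target / link gives
    G = frame_bytes * (link / target - 1).
    """
    if not 0 < target_rate_bps <= link_rate_bps:
        raise ValueError("target rate must be positive and not exceed the link rate")
    return frame_bytes * (link_rate_bps / target_rate_bps - 1)


# Halving the rate on a 1 Gb/s link requires a gap as long as the frame.
print(stretched_ipg_bytes(1000, 1_000_000_000, 500_000_000))
```

Because the gap scales with frame size, the resulting byte rate is the same for small and large frames, which matches the "stretched relative to frame size" description.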
1.6.4
Performance
The 82576 improvements include:
• Latency - The 82576 reduces end-to-end latency for high priority traffic in presence of other traffic.
Specifically, the 82576 reduces the delay caused by preceding TSO packets.
• CPU Utilization - The 82576 supports reducing CPU utilization in a virtualized system by
incorporating enhancements to the VMDq feature.
1.6.4.1
Tx Descriptor Write-Back
This functionality is an improvement to the way Tx descriptors are written back to memory. Instead of
writing back the DD bit into the descriptor location, the head pointer is updated in system memory. The
head pointer is updated based on the RS bit or prior to expiration of the corresponding interrupt vector.
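With head write-back, a driver can learn which descriptors are complete from a single value in system memory instead of polling the DD bit of each descriptor. The sketch below shows how the completed range might be derived; the names next_to_clean and head_writeback are hypothetical driver-side variables, not register or field names from this document.

```python
def reclaim_completed(next_to_clean, head_writeback, ring_size):
    """Return the descriptor indexes the driver can release.

    Descriptors from next_to_clean up to, but not including, the head
    value written back by the device are treated as processed. The
    range wraps around the end of the ring.
    """
    completed = []
    index = next_to_clean
    while index != head_writeback:
        completed.append(index)
        index = (index + 1) % ring_size
    return completed


# Ring of 8 descriptors; the device head has wrapped from 6 to 2.
print(reclaim_completed(next_to_clean=6, head_writeback=2, ring_size=8))
```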
1.6.5
Rx and Tx Queues
The number of Tx and Rx queues in the 82576 was increased to 16 queues of each type per port.
1.6.6
Interrupts
The following changes in the interrupt scheme are implemented in the 82576:
• Rate controlling of low latency interrupts
• Extensions to the low latency interrupt filters to enable immediate interrupt by full 5-tuple
matching
1.6.7
Virtualization
1.6.7.1
PCI SR IOV
The 82576 supports the PCI-SIG Single-Root I/O Virtualization and Sharing specification (SR-IOV),
including the following functionality:
• Support for up to 8 virtual functions.
• Partial replication of PCI configuration space
• Allocation of MMIO space per virtual function
• Allocation of a requester ID per virtual function
• Virtualization of interrupts
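The per-VF requester ID follows from the PF's routing ID and two SR-IOV capability fields, First VF Offset and VF Stride, as defined by the PCI-SIG SR-IOV specification. The sketch below applies that rule; the offset and stride values used in the example are placeholders, not values read from an 82576.

```python
def vf_routing_id(pf_rid, first_vf_offset, vf_stride, vf_index):
    """Return the 16-bit routing ID of a virtual function.

    vf_index is zero-based. Per the SR-IOV rule, the first VF sits at
    pf_rid + first_vf_offset and each following VF is vf_stride apart.
    The arithmetic wraps at 16 bits.
    """
    return (pf_rid + first_vf_offset + vf_index * vf_stride) & 0xFFFF


def split_rid(rid):
    """Split a routing ID into the conventional (bus, device, function) view."""
    return (rid >> 8) & 0xFF, (rid >> 3) & 0x1F, rid & 0x7


# Placeholder values: PF at bus 1, device 0, function 0.
pf = 0x0100
for index in range(3):
    rid = vf_routing_id(pf, first_vf_offset=0x80, vf_stride=2, vf_index=index)
    print(index, hex(rid), split_rid(rid))
```

When ARI is in use, the low 8 bits of the routing ID are interpreted as a single function number rather than as device and function fields.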
1.6.7.2
Packets Classification
Received unicast packets are forwarded to the appropriate VM queue based on their unicast L2 address.
Broadcast and Multicast (MC) packets, however, might need to be forwarded to multiple VMs. Multicast
is commonly used to share information among a group of systems.
Received MC packets are forwarded to their destination VMs based on mapping between the MC
address and the target VMs.
Broadcast packets that are VLAN tagged are forwarded to destination VMs based on their VLAN tag.
Note that a VM might be associated with multiple VLAN addresses. A broadcast packet that is not VLAN
tagged can be optionally forwarded to all VMs.
Packet forwarding services inter-VM communication by forwarding transmit packets from a transmit
queue to an Rx software queue. The motivation to execute packet forwarding in the 82576 is in direct
assignment architecture, where it is desired that a guest VM interacts directly with the 82576 using a
standard device driver. If packet forwarding is to be done by system software, the guest VM (its device
driver) needs to filter local packets and forward those to a software switch to forward.
Transmit packets with a local destination are classified based on the same criteria as packets received
from the wire.
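The forwarding rules in this section can be summarized as a small decision function. This is a simplified software model for illustration; the table structures, parameter names, and pool numbering are assumptions, and the real device applies additional filters that are not shown here.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def is_multicast(mac):
    """A set least-significant bit in the first octet marks a group address."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def classify(dst_mac, vlan_id, unicast_table, multicast_table, vlan_table,
             all_pools, untagged_broadcast_to_all=True):
    """Return the set of destination pools (VMs) for one packet.

    unicast_table:   MAC address -> pool
    multicast_table: multicast MAC address -> iterable of pools
    vlan_table:      VLAN ID -> iterable of pools
    vlan_id is None for untagged packets.
    """
    dst_mac = dst_mac.lower()
    if dst_mac == BROADCAST:
        if vlan_id is not None:
            return set(vlan_table.get(vlan_id, ()))
        return set(all_pools) if untagged_broadcast_to_all else set()
    if is_multicast(dst_mac):
        return set(multicast_table.get(dst_mac, ()))
    pool = unicast_table.get(dst_mac)
    return {pool} if pool is not None else set()


unicast = {"00:11:22:33:44:55": 0}
multicast = {"01:00:5e:00:00:01": {0, 2}}
vlans = {10: {1, 2}}
print(classify(BROADCAST, 10, unicast, multicast, vlans, all_pools={0, 1, 2}))
```

The same function can be applied to transmit packets with a local destination, since they are classified by the same criteria as packets received from the wire.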
1.6.7.3
Hardware Virtualization
This section covers replication of hardware resources beyond the scope of PCI resources handled by PCI
SR-IOV. The following features are supported:
• Interrupts – a subset of the interrupts is assigned per VM.
• Statistics – enable read access to VMs in direct assignment model without the clear-on-read side
effect.
• Storm control - if an unusually high bandwidth of broadcast or multicast packets is detected, the
82576 can be configured to drop broadcast or multicast packets until the storm condition is over.
• Security features: VLAN and MAC anti-spoof are supported as well as insertion of VLAN according to
the physical function control.
1.6.7.4
Bandwidth Allocation
The 82576 allows allocation of transmit bandwidth among the virtual interfaces to avoid unfair use of
bandwidth by a single VM.
1.6.8
VPD
The 82576 supports the Vital Product Data (VPD) capability defined in the PCI Specification, version
3.0.
1.6.9
64-bit BARs Support
The 82576 supports different configurations of the I/O and MMIO Base Address Registers to allow
support of 64-bit mappings of BARs.
1.6.10
IEEE 1588 - Precision Time Protocol (PTP)
The IEEE 1588 International Standard enables networked Ethernet equipment to synchronize internal
clocks according to a network master clock. The protocol is implemented in software, with the 82576
providing accurate time measurements of special Tx and Rx packets close to the Ethernet link. These
packets measure the latency between the master clock and an end-point clock in both link directions.
The endpoint can then acquire an accurate estimate of the master time by compensating for link
latency.
The 82576 provides the following support for the IEEE 1588 protocol:
• Detection of specific PTP Rx packets and capturing the time of arrival of such packets in dedicated
CSRs
• Detection of specific PTP Tx packets and capturing the time of transmission of such packets in
dedicated CSRs
• A software-visible reference clock for the previously mentioned time captures.
• Both the L2 based and the UDP based version of the protocol are supported.
• Generation of an external clock on one of the SDPs.
• Triggering of external devices based on internal clock.
• Timestamps of external events.
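Since the protocol itself runs in software, the captured timestamps feed a calculation like the one sketched below. It uses the standard delay request-response arithmetic of IEEE 1588 and assumes a symmetric link delay; the function and variable names are illustrative.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset_from_master, one_way_delay).

    t1: master clock time when the Sync message was sent
    t2: local clock time when the Sync message arrived
    t3: local clock time when the Delay_Req message was sent
    t4: master clock time when the Delay_Req message arrived
    Assumes the link delay is the same in both directions.
    """
    master_to_slave = t2 - t1
    slave_to_master = t4 - t3
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Example: local clock runs 200 ns ahead of the master, link delay is 50 ns.
print(ptp_offset_and_delay(t1=1000, t2=1250, t3=1300, t4=1150))
```

Subtracting the computed offset from the local clock yields the estimate of master time described in this section.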
1.7
Device Data Flows
1.7.1
Transmit Data Flow
Tx data flow provides a high level description of all data/control transformation steps needed for
sending Ethernet packets to the line.
Table 1-10. Transmit Data Flow
Step | Description
1 | The host creates a descriptor ring and configures one of the 82576's transmit queues with the address location, length, head and tail pointers of the ring (one of 16 available Tx queues).
2 | The TCP/IP stack requests the host to transmit a packet; the host receives the packet data in one or more data buffers.
3 | The host initializes descriptor(s) that point to the data buffer(s) and have additional control parameters that describe the needed hardware functionality. The host places that descriptor in the correct location at the appropriate Tx ring.
4 | The host updates the appropriate queue tail pointer (TDT).
5 | The 82576's DMA senses a change of a specific TDT and as a result sends a PCIe request to fetch the descriptor(s) from host memory.
6 | The descriptor(s) content is received in a PCIe read completion and is written to the appropriate location in the descriptor queue internal cache.
7 | The DMA fetches the next descriptor from the internal cache and processes its content. As a result, the DMA sends PCIe requests to fetch the packet data from system memory.
8 | The packet data is received from PCIe completions and passes through the transmit DMA that performs all programmed data manipulations (various CPU off loading tasks such as checksum off load, TSO off load, etc.) on the packet data on the fly.
9 | While the packet is passing through the DMA, it is stored into the transmit FIFO. After the entire packet is stored in the transmit FIFO, it is forwarded to the transmit switch module.
10 | If the packet destination is also local, it is also sent to the local switch memory and joins the receive path.
11 | The transmit switch arbitrates between host and management packets and eventually forwards the packet to the security engine.
12 | The security engine optionally applies L3 (IPsec) or L2 (MACSec) encryption or authentication and forwards the packet to the MAC.
13 | The MAC appends the L2 CRC to the packet and sends the packet to the line using a pre-configured interface.
14 | When all the PCIe completions for a given packet are done, the DMA updates the appropriate descriptor(s).
15 | After enough descriptors are gathered for write back or the interrupt moderation timer expires, the descriptors are written back to host memory using PCIe posted writes. Alternatively, only the head pointer can be written back.
16 | After the interrupt moderation timer expires, an interrupt is generated to notify the host device driver that the specific packet has been read into the 82576 and the driver can release the buffers.
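The host-side part of the transmit flow (placing descriptors in the ring and advancing the tail pointer) and the device-side fetch can be modeled as a circular buffer. The class below is a simplified software model, not driver code; it uses the common convention of keeping one slot unused to distinguish a full ring from an empty one, which is an assumption of this example.

```python
class TxRing:
    """Toy model of a transmit descriptor ring with head and tail indexes."""

    def __init__(self, size):
        self.size = size
        self.head = 0              # advanced by the device as it fetches
        self.tail = 0              # advanced by the host (models TDT)
        self.slots = [None] * size

    def free_slots(self):
        # One slot is kept unused so that head == tail always means "empty".
        return (self.head - self.tail - 1) % self.size

    def post(self, buffer):
        """Host side: place a descriptor and bump the tail pointer."""
        if self.free_slots() == 0:
            raise BufferError("descriptor ring is full")
        self.slots[self.tail] = buffer
        self.tail = (self.tail + 1) % self.size

    def device_fetch(self):
        """Device side: consume every descriptor between head and tail."""
        fetched = []
        while self.head != self.tail:
            fetched.append(self.slots[self.head])
            self.head = (self.head + 1) % self.size
        return fetched


ring = TxRing(4)
ring.post("packet-a")
ring.post("packet-b")
print(ring.device_fetch(), ring.free_slots())
```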
1.7.2
Receive Data Flow
Receive (Rx) data flow provides a high level description of all data/control transformation steps needed
for receiving Ethernet packets.
Table 1-11. Receive Data Flow
Step | Description
1 | The host creates a descriptor ring and configures one of the 82576's receive queues with the address location, length, head, and tail pointers of the ring (one of 16 available Rx queues).
2 | The host initializes descriptors that point to empty data buffers. The host places these descriptors in the correct location at the appropriate Rx ring.
3 | The host updates the appropriate queue tail pointer (RDT).
4 | The 82576's DMA senses a change of a specific RDT and as a result sends a PCIe request to fetch the descriptors from host memory.
5 | The descriptors content is received in a PCIe read completion and is written to the appropriate location in the descriptor queue internal cache.
6 | A packet enters the Rx MAC. The Rx MAC checks the CRC of the packet.
7 | The MAC forwards the packet to an Rx filter.
8 | If the packet is a MACSec or an IPSec packet and the appropriate key is stored in the hardware, the packet is decrypted and authenticated.
9 | If the packet matches the pre-programmed criteria of the Rx filtering, it is forwarded to the Rx FIFO. VLAN and CRC are optionally stripped from the packet, L3/L4 checksums are checked, and the destination queue is determined.
10 | The receive DMA fetches the next descriptor from the internal cache of the appropriate queue to be used for the next received packet.
11 | After the entire packet is placed into the Rx FIFO, the receive DMA posts the packet data to the location indicated by the descriptor through the PCIe interface. If the packet size is greater than the buffer size, more descriptors are fetched and their buffers are used for the received packet.
12 | When the packet is placed into host memory, the receive DMA updates all the descriptor(s) that were used by packet data.
13 | After enough descriptors are gathered for write back or the interrupt moderation timer expires or the packet requires immediate forwarding, the receive DMA writes back the descriptor content along with status bits that indicate the packet information including what off loads were done on that packet.
14 | After the interrupt moderation timer completes or an immediate packet is received, the 82576 initiates an interrupt to the host to indicate that a new received packet is already in host memory.
15 | The host reads the packet data and sends it to the TCP/IP stack for further processing. The host releases the associated buffers and descriptors once they are no longer in use.
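When a received packet is larger than one receive buffer, it is spread across several descriptors. The sketch below shows the arithmetic for the simple case where every buffer has the same size; the function names are hypothetical, and features such as header split are not modeled.

```python
def descriptors_needed(packet_bytes, buffer_bytes):
    """Return how many equal-size receive buffers a packet occupies."""
    if packet_bytes <= 0 or buffer_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-packet_bytes // buffer_bytes)  # ceiling division


def split_into_buffers(packet_bytes, buffer_bytes):
    """Return the number of bytes written to each buffer, in order."""
    count = descriptors_needed(packet_bytes, buffer_bytes)
    lengths = [buffer_bytes] * (count - 1)
    lengths.append(packet_bytes - buffer_bytes * (count - 1))
    return lengths


print(descriptors_needed(9000, 2048), split_into_buffers(9000, 2048))
```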
2.0
Pin Interface
2.1
Pin Assignment
The 82576 is packaged in a 25 mm x 25 mm FCBGA package with a 1 mm ball pitch.
Table 2-1. Signal Type Definition
Type | Description | DC specification
In | Input is a standard input-only signal. | See Section 11.4.2.2
Out | Totem Pole Output is a standard active driver. | See Section 11.4.2.2
T/S | Tri-State is a bi-directional, tri-state input/output pin. | See Section 11.4.2.2
O/D | Open Drain allows multiple devices to share as a wire-OR. | See Section 11.4.2.3
NC-SI-in | Input signal | See Section 11.4.2.4
NC-SI-out | Output signal | See Section 11.4.2.4
A | Analog PHY signals | See Section 11.4.5
A-in | Analog input signals | See Section 11.4.4
A-out | Analog output signals | See Section 11.4.4
B | Input bias | See Section 11.4.7
2.1.1
PCIe
The AC specification for these pins is described in Chapter 11.0.
Table 2-2. PCI* Pins
Symbol | Ball # | Type | Name and Function
PE_CLK_p, PE_CLK_n | N2, N1 | A-in | PCIe* Differential Reference Clock in: A 100 MHz differential clock input. This clock is used as the reference clock for the PCIe* Tx/Rx circuitry and by the PCIe* core PLL to generate clocks for the PCIe* core logic.
PET_0_p, PET_0_n | D2, D1 | A-out | PCIe* Serial Data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_1_p, PET_1_n | H2, H1 | A-out | PCIe* Serial Data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_2_p, PET_2_n | R2, R1 | A-out | PCIe* Serial Data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_3_p, PET_3_n | W2, W1 | A-out | PCIe* Serial Data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PER_0_p, PER_0_n | F2, F1 | A-in | PCIe* Serial Data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_1_p, PER_1_n | K2, K1 | A-in | PCIe* Serial Data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_2_p, PER_2_n | U2, U1 | A-in | PCIe* Serial Data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_3_p, PER_3_n | AA2, AA1 | A-in | PCIe* Serial Data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PE_WAKE_N | AC20 | O/D | WAKE: Pulled to ‘0’ to indicate that a Power Management Event (PME) is pending and the PCI Express link should be restored. Defined in the PCI Express specifications.
PE_RST_N | AC9 | In | Power and Clock Good Indication: Indicates that power and PCI Express reference clock are within specified values. Defined in the PCI Express specifications. This pin is used as a fundamental reset indication for the device.
RSVDM3_NC, RSVDM2_NC | M3, M2 | A-out | Analog testing
PE_RCOMP | L1 | B | Impedance compensation. Connect to ground through an external 1.4 Kohm 1% 100ppm resistor for impedance compensation. See Figure 11-13 for details.
2.1.2
Flash and EEPROM Ports (8)
The AC specification for these pins is described in Section 11.4.3.4 to Section 11.4.3.5.
Table 2-3.
Flash and EEPROM Ports
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| FLSH_SI | AC14 | T/S | Serial Data output to the Flash |
| FLSH_SO | AD14 | In | Serial Data input from the Flash |
| FLSH_SCK | AD15 | T/S | Flash serial clock. Operates at ~20MHz. |
| FLSH_CE_N | AC15 | T/S | Flash chip select Output |
| EE_DI | A21 | T/S | Data output to EEPROM |
| EE_DO | A20 | In | Data input from EEPROM |
| EE_SK | B20 | T/S | EEPROM serial clock. Operates at ~2MHz. |
| EE_CS_N | B21 | T/S | EEPROM chip select Output |
2.1.3
System Management Bus (SMB) Interface
The AC specification for these pins is described in Section 11.4.3.3.
2.1.4
NC-SI Interface Pins
The AC specification for these pins is described in Section 11.4.3.6.
Table 2-4.
NC-SI Interface Pins
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| NCSI_CLK_IN | B5 | NC-SI-In | NC-SI Reference Clock Input – Synchronous clock reference for receive, transmit and control interface. It is a 50MHz clock +/- 50 ppm. |
| NCSI_CLK_OUT | B4 | NC-SI-Out | NC-SI Reference Clock Output – Synchronous clock reference for receive, transmit and control interface. It is a 50MHz clock +/- 50 ppm. Serves as a clock source to the MC and the 82576 (when configured so). |
| NCSI_CRS_DV | A4 | NC-SI-Out | CRS/DV – Carrier Sense / Receive Data Valid. |
| NCSI_RXD_1 / NCSI_RXD_0 | A6 / B7 | NC-SI-Out | Receive Data – Data signals from the 82576 to BMC. |
| NCSI_TX_EN | B6 | NC-SI-In | Transmit Enable. |
| NCSI_TXD_1 / NCSI_TXD_0 | A7 / B8 | NC-SI-In | Transmit Data – Data signals from MC to the 82576. |
| NCSI_ARB_OUT | B3 | NC-SI-Out/NC-SI-In | NC-SI HW arbitration token output pin. |
| NCSI_ARB_IN | AD3 | NC-SI-In | NC-SI HW arbitration token input pin. |
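As a quick check of what a +/- 50 ppm tolerance means for the 50MHz NC-SI reference clock, the following Python sketch computes the allowed frequency window. The function name `ppm_window` is illustrative only and is not part of any Intel tool.

```python
def ppm_window(nominal_hz, tolerance_ppm):
    """Return (min_hz, max_hz) for a clock with a symmetric ppm tolerance."""
    # 1 ppm is one millionth of the nominal frequency.
    delta_hz = nominal_hz * tolerance_ppm / 1_000_000
    return nominal_hz - delta_hz, nominal_hz + delta_hz


# 50MHz NC-SI reference clock with a +/- 50 ppm tolerance.
low_hz, high_hz = ppm_window(50_000_000, 50)
print(f"NC-SI reference clock window: {low_hz:.0f} Hz to {high_hz:.0f} Hz")
```

The window is 2.5 kHz on either side of the nominal 50MHz.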
2.1.5
Miscellaneous Pins
The AC specification for the XTAL pins is described in Section 11.4.6.
Table 2-5.
Miscellaneous Pins
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| SDP0_0 / SDP0_1 / SDP0_2 / SDP0_3 | A16 / B16 / B17 / B15 | T/S | SW Defined Pins for function 0: These are reserved pins that are software programmable with respect to input/output capability. These default to inputs upon power up, but may have their direction and output values defined in the EEPROM. The SDP bits may be mapped to the General Purpose Interrupt bits when configured as inputs. The SDP0[0] pin can be used as a watchdog output indication. All the SDP pins can be used as SFP sideband signals (TxDisable, present & TxFault). The 82576 does not use these signals; they are available for SW control over SFP. |
| SDP1_0 / SDP1_1 / SDP1_2 / SDP1_3 | AD10 / A12 / A13 / AC10 | T/S / NC-SI / T/S / T/S | SW Defined Pins for function 1: Reserved pins that are software programmable with respect to input/output capability. These default to inputs upon power up, but may have their direction and output values defined in the EEPROM. The SDP bits may be mapped to the General Purpose Interrupt bits when configured as inputs. The SDP1[0] pin can be used as a watchdog output indication. All the SDP pins can be used as SFP sideband signals (TxDisable, present & TxFault). The 82576 does not use these signals; they are available for SW control over SFP. |
| MAIN_PWR_OK | AD4 | In | Main Power OK – Indicates that platform main power is up. Must be connected externally to main core 3.3V power. |
| DEV_OFF_N | B9 | In | Device Off: Assertion of DEV_OFF_N puts the device in Device Disable mode. This pin is asynchronous and is sampled once the EEPROM is ready to be read following power-up. The DEV_OFF_N pin should always be connected to VCC3P3 to enable device operation. |
| XTAL1 / XTAL2 | N23 / N24 | A-In / A-out | Reference Clock / XTAL: These pins may be driven by an external 25MHz crystal or driven by a single ended external CMOS compliant 25MHz oscillator. |
2.1.6
SERDES/SGMII Pins
The AC specification for these pins is described in Section 11.4.4.
Table 2-6.
SERDES/SGMII Pins
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| SRDSI_0_p / SRDSI_0_n | J23 / J24 | A-in | SERDES/SGMII Serial Data input Port 0: Differential SERDES Receive interface. A serial differential input pair running at 1.25Gb/s. An embedded clock present in this input is recovered along with the data. |
| SRDSO_0_p / SRDSO_0_n | K23 / K24 | A-out | SERDES/SGMII Serial Data output Port 0: Differential SERDES Transmit interface. A serial differential output pair running at 1.25Gb/s. This output carries both data and an embedded 1.25GHz clock that is recovered along with data at the receiving end. |
| SRDS_0_SIG_DET | A9 | In | Port 0 Signal Detect: Indicates that signal (light) is detected from the fiber. High for signal detect, Low otherwise. |
| SRDSI_1_p / SRDSI_1_n | T23 / T24 | A-in | SERDES/SGMII Serial Data input Port 1: Differential fiber SERDES Receive interface. A serial differential input pair running at 1.25Gb/s. An embedded clock present in this input is recovered along with the data. |
| SRDSO_1_p / SRDSO_1_n | R23 / R24 | A-out | SERDES/SGMII Serial Data output Port 1: Differential fiber SERDES Transmit interface. A serial differential output pair running at 1.25Gb/s. This output carries both data and an embedded 1.25GHz clock that is recovered along with data at the receiving end. |
| SRDS_1_SIG_DET | A10 | In | Port 1 Signal Detect: Indicates that signal (light) is detected from the fiber. High for signal detect, Low otherwise. |
| SER_RCOMP | L22 | B | Impedance compensation. Connect to ground through an external 1.4 Kohm 1% 100ppm resistor for impedance compensation. See Figure 11-13 for details. |
2.1.7
SFP Pins
The AC specification for these pins is described in Chapter 11.0.
2.1.8
Media Dependent Interface (PHY’s MDI) Pins
2.1.8.1
LED’s (8)
The table below describes the functionality of the LED output pins. Default activity of the LED may be
modified in the EEPROM words 1Ch and 1Fh. The LED functionality is reflected and can be further
modified in the configuration registers LEDCTL.
Table 2-7.
LED Output Pins
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| LED0_0 | A19 | Out | Port 0 LED0. Programmable LED which indicates by default Link Up. |
| LED0_1 | B19 | Out | Port 0 LED1. Programmable LED which indicates by default activity (when packets are transmitted or received that match MAC filtering). |
| LED0_2 | B18 | Out | Port 0 LED2. Programmable LED which indicates by default a 100Mbps Link. |
| LED0_3 | A18 | Out | Port 0 LED3. Programmable LED which indicates by default a 1000Mbps Link. |
| LED1_0 | AD13 | Out | Port 1 LED0. Programmable LED which indicates by default Link Up. |
| LED1_1 | AC11 | Out | Port 1 LED1. Programmable LED which indicates by default activity (when packets are transmitted or received that match MAC filtering). |
| LED1_2 | AC13 | Out | Port 1 LED2. Programmable LED which indicates by default a 100Mbps Link. |
| LED1_3 | AC12 | Out | Port 1 LED3. Programmable LED which indicates by default a 1000Mbps Link. |
2.1.8.2
Analog Pins
The AC specification for these pins is described in Chapter 11.0.
2.1.9
Testability Pins
Table 2-8.
Testability Pins
| Symbol | Ball # | Type | Name and Function |
|---|---|---|---|
| JTCK | AC6 | In | JTAG Clock Input |
| JTDI | AD7 | In | JTAG TDI Input |
| JTDO | AC8 | O/D | JTAG TDO Output |
| JTMS | AC7 | In | JTAG TMS Input |
| RSVDAC5_3P3 | AC5 | In | JTAG Reset Input (Optional) |
| AUX_PWR | B14 | T/S | Auxiliary Power Available: When set, indicates that Auxiliary Power is available and the device should support D3COLD power state if enabled to do so. This pin is also used for testing and scan. |
| LAN1_DIS_N | A15 | T/S | This pin is a strapping option pin latched at the rising edge of PE_RST# or In-Band PCIe* Reset. This pin has an internal weak pull-up resistor. If this pin is not connected or is driven high during init time, LAN 1 is enabled. If this pin is driven low during init time, the LAN 1 function is disabled. This pin is also used for testing and scan. |
| LAN0_DIS_N | B13 | T/S | This pin is a strapping option pin latched at the rising edge of PE_RST# or In-Band PCIe* Reset. This pin has an internal weak pull-up resistor. If this pin is not connected or is driven high during init time, LAN 0 is enabled. If this pin is driven low during init time, the LAN 0 function is disabled. This pin is also used for testing and scan. |
2.1.10
Reserved Pins and No-Connects
Table 2-9.
Reserved Pins and No-Connects
| Symbol | Ball # | Name and Function |
|---|---|---|
| RSVDAB18_NC | AB18 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDAB19_NC | AB19 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDAC16_NC | AC16 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDAC17_NC | AC17 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDAD16_NC | AD16 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDAD17_NC | AD17 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDM2_NC | M2 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDM23_NC | M23 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDM24_NC | M24 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDM3_NC | M3 | Reserved, no-connect. These pins are reserved by Intel and may have factory test functions. For normal operation, do not connect any circuitry to these pins. Do not connect pull-up or pull-down resistors. |
| RSVDA8_3P3 | A8 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDA11_3P3 | A11 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDB10_3P3 | B10 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDB11_3P3 | B11 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDB12_3P3 | B12 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDAD9_3P3 | AD9 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC3P3. Do not connect them to pull-up resistors. |
| RSVDAC5_3P3 | AC5 | Reserved, VCC3P3. These pins are reserved by Intel and may have factory test functions. For normal operation, connect directly to VCC3P3 with a 10k ohm pull-up resistor. |
| RSVDL14_1P0 | L14 | Reserved, VCC1P0. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC1P0. Do not connect them to pull-up resistors. |
| RSVDP14_1P0 | P14 | Reserved, VCC1P0. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VCC1P0. Do not connect them to pull-up resistors. |
| RSVDAD8_VSS | AD8 | Reserved, VSS. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VSS. Do not connect them to pull-down resistors. |
| RSVDA14_VSS | A14 | Reserved, VSS. These pins are reserved by Intel and may have factory test functions. For normal operation, connect them directly to VSS. Do not connect them to pull-down resistors. |
| NCAC3 | AC3 | Reserved, no connect. This pin is not connected internally. |
2.1.11
Power Supply Pins
Table 2-10.
Power Supply Pins
Symbol
Ball #
Type
Name and Function
VCC3P3
AD6, AD12
3.3V
3.3V power input top
VCC3P3
A5, A17
3.3V
3.3V power input bottom
VCC1P0
R14, R13, R12, R11, P13, P12, L13, L12, K14, K13, K12, K11
1V
1V power digital
VCC1P8
P9, P8,P5, P4, N9, N8, N5, N4, M9, M8, M5, M4, L9, L8, L5, L4
1.8V
1.8V analog power
input PCIe*
VCC1P8
L15, K15, J15, H15, G15, E20, E19, D20, D19, AA20, AA19,Y20,
Y19, V15, U15, T15, R15, P15, N21, N15, M21, M15
1.8V
1.8V analog power
input PHY
VCC1P0
V5, V4, U5, U4, P11, N11, M11, L11, H5, H4, G5, G4
1.0V
1.0V analog power
input PCIe*
VCC1P0
J21, J20, J18, J17, L21, L20, L18, L17, K21, K20, K18, K17, T21, T20, T18, T17, P21, P20, P18, P17,
R21, R20, R18, R17
1.0V
1.0V analog power
input PHY
VSS
Y9, Y8, Y7, Y6, Y15, Y14, Y13, Y12, Y11, Y10, W9, W8, W7,
W14, W13, W12, W11, W10, V9, V8, V14, V13, V12, V11, V10,
U9, U14, U13, U12, U11, U10, T14, T13, T12, T11, N14, N13,
N12, M14, M13, M12, J14, J13, J12, J11, H9, H14, H13, H12,
H11, H10, G9, G8, G14, G13, G12, G11, G10, F9, F8, F7, F14,
F13, F12, F11, F10, E9, E8, E7, E6, E15, E14, E13, E12, E11,
E10, D9, D8, D7, D6, D5, D16, D15, D14, D13, D12, D11, D10,
C9, C8, C7, C6, C5, C4, C17, C16, C15, C14, C13, C12, C11,
C10, B2, B1, AD5, AD2, AD11, AD1, AC4, AC2, AC1, AB9, AB8,
AB7, AB6, AB5, AB4, AB17, AB16, AB15, AB14, AB13, AB12,
AB11, AB10, AA9, AA8, AA7, AA6, AA5, AA16, AA15, AA14,
AA13, AA12, AA11, AA10, A3, A2, A1
0V
Digital Ground
VSS
Y24, Y23, Y21, Y18, Y17, Y16, W22, W21, W20, W19, W18,
W17, W16, W15, V22, V21, V20, V19, V18, V17, V16, U24, U23,
U22, U21, U20, U19, U18, U17, U16, T22, T19, T16, R22, R19,
R16, P24, P23, P22, P19, P16, N22, N20, N19, N18, N17, N16,
M22, M20, M19, M18, M17, M16, L24, L23, L19, L16, K22, K19,
K16, J22, J19, J16, H24, H23, H22, H21, H20, H19, H18, H17,
H16, G22, G21, G20, G19, G18, G17, G16, F22, F21, F20, F19,
F18, F17, F16, F15, E24, E23, E21, E18, E17, E16, D22, D21,
D18, D17, C22, C21, C20, C19, C18, B24, B23, AD24, AD23,
AC24, AC23, AB22, AB21, AB20, AA22, AA21, AA18, AA17, A24,
A23
0V
PHY analog ground
VSS
Y5, Y4, Y3, Y2, Y1, W6, W5, W4, W3, V7, V6, V3, V2, V1, U8,
U7, U6, U3, T9, T8, T7, T6, T5, T4, T3, T2, T10, T1, R9, R8, R7,
R6, R5, R4, R3, R10, P7, P6, P3, P2, P10, P1, N7, N6, N3, N10,
M7,M6, M10, M1, L7, L6, L3, L2, L10, K9, K8, K7, K6, K5, K4,
K3, K10, J9, J8, J7, J6, J5, J4, J3, J2, J10, J1, H8, H7, H6, H3,
G7, G6, G3, G2, G1, F6, F5, F4, F3, E5, E4, E3, E2, E1, D4, D3,
C3, C2, C1, AB3, AB2, AB1, AA4, AA3
0V
PCIe* analog ground
2.2
Pull-ups/Pull-downs
The table below lists internal and external pull-up resistors and their functionality in different device
states.
Each internal PUP has a nominal value of 5 KΩ, ranging from 2.7 KΩ to 8.6 KΩ. The recommended values
for external resistors are 400 Ω for pull-down resistors and 3 KΩ for pull-up resistors.
The device states are defined as follows:
• Power-up = while 3.3V is stable, yet 1.0V isn’t
• Active = normal mode (not power up or disable)
• Disable = device disable (a.k.a. dynamic IDDQ – see Section 4.4)
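To see why a 400 Ω external pull-down can override an internal pull-up, the sketch below computes the resistor-divider voltage at a pin across the internal pull-up range quoted above. It assumes a 3.3V I/O rail and an ideal two-resistor divider with no other load on the pin; the function name is illustrative. The resulting voltages should be compared against the input thresholds in the electrical specifications and are not a guarantee on their own.

```python
VCC_3P3 = 3.3  # assumed I/O rail voltage in volts (assumption, see lead-in)


def pin_voltage(r_pullup_ohm, r_pulldown_ohm, vcc=VCC_3P3):
    """Voltage at a pin tied to vcc through a pull-up and to ground through a pull-down."""
    return vcc * r_pulldown_ohm / (r_pullup_ohm + r_pulldown_ohm)


# Internal pull-up at its minimum, nominal and maximum values,
# fighting the recommended 400 ohm external pull-down.
for r_pup in (2700, 5000, 8600):
    volts = pin_voltage(r_pup, 400)
    print(f"internal PUP {r_pup} ohm, external PD 400 ohm -> {volts:.3f} V")
```

The strongest internal pull-up (2.7 KΩ) is the worst case for a pull-down, since it leaves the highest residual voltage at the pin.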
Table 2-11.
Pull-Up Resistors
Signal Name
Power up
PUP
Active
Comments
PUP
Comments
Disable
PUP
External
Comments
PE_WAKE_N
N
N
N
Y
PE_RST_N
Y
N
N
N
FLSH_SI
Y
N
Y
N
FLSH_SO
Y
Y
Y
N
FLSH_SCK
Y
N
Y
N
FLSH_CE_N
Y
N
Y
N
EE_DI
Y
N
Y
N
EE_DO
Y
Y
Y
N
EE_SK
Y
N
Y
N
EE_CS_N
Y
N
Y
N
SMBD
N
N
N
Y
SMBCLK
N
N
N
Y
SMBALRT_N
N
N
N
Y
RSVDAD17_NC
Y
N
N
N
RSVDAC17_NC
Y
N
N
N
RSVDAC16_NC
Y
N
Y
HiZ
N
RSVDAD16_NC
Y
N
Y
HiZ
N
NC-SI_CLK_IN
N
HiZ
N
N
NC-SI_CLK_OUT
Y
HiZ
N
N
NC-SI_CRS_DV
N
HiZ
N
N
PD
NC-SI_RXD[1:0]
Y
HiZ
N
N
Y (Note 2)
NC-SI_TX_EN
N
HiZ
N
N
PD (Note 1)
NC-SI_TXD[1:0]
N
HiZ
N
N
PD (Note 1)
NC-SI_ARB_IN
N
Y
NC-SI_ARB_OUT
Y
Y
SDP0[3:0]
Y
Y
Controlled
by EEPROM
Y
PD (Note 1)
If active,
stable
output
N
Controlled
by
EEPROM
Y
Until
EEPROM
done
N
May keep
state by
EEPROM
control
N
Until
EEPROM
done
N
N
SDP1[3:0]
Y
Y
DEV_OFF_N
Y
N
N
Must be
connected on
board
MAIN_PWR_OK
Y
N
N
Must be
connected on
board
SRDS_0_SIG_DET
Y
N
N
Must be
connected
externally
SRDS_1_SIG_DET
Y
N
N
Must be
connected
externally
SFP0_I2C_CLK
Y
N
Y
Y if active
SFP0_I2C_DATA
Y
N
N
Y
SFP1_I2C_CLK
Y
N
Y
Y if active
SFP1_I2C_DATA
Y
N
N
Y
LED0_0
Y
N
N
HiZ
LED0_1
Y
N
N
HiZ
LED0_2
Y
N
N
HiZ
LED0_3
Y
N
N
HiZ
LED1_0
Y
N
N
HiZ
LED1_1
Y
N
N
HiZ
LED1_2
Y
N
N
HiZ
LED1_3
Y
N
N
HiZ
JTCK
Y
N
N
N
JTDI
Y
N
N
Y
JTDO
Y
N
N
Y
JTMS
Y
N
N
Y
AUX_PWR
Y
N
N
PU or PD
(Note 3)
LAN1_DIS_N
Y
Y when
input
Y
PU or PD
(Note 4)
LAN0_DIS_N
Y
Y when
input
Y
PU or PD
(Note 4)
Notes:
1. Should be pulled down if NC-SI interface is disabled.
2. Only if NC-SI is unused or set to multi drop configuration.
3. If Aux power is connected, should be pulled up, else should be pulled down.
4. If the specific function is disabled, should be pulled down, else should be pulled up.
2.3
Strapping
The following signals are used for static configuration. Unless otherwise stated, strapping options are
latched on the rising edge of Internal_Power_On_Reset, at power up, at in-band PCI Express reset and
at PE_RST_N assertion. At other times, they revert to their standard usage.
Table 2-12.
Strapping Options
| Purpose | Pin | Polarity | Pull-up / Pull-down |
|---|---|---|---|
| LAN1 Disable | LAN1_DIS_N | 0b – LAN1 is disabled; 1b – LAN1 is enabled | Internal pull-up |
| LAN0 Disable | LAN0_DIS_N | 0b – LAN0 is disabled; 1b – LAN0 is enabled | Internal pull-up |
| AUX_PWR | AUX_PWR | 0b – AUX power is not available; 1b – AUX power is available | None |
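The strapping polarities above can be summarized in a short Python sketch. It models an unconnected LAN disable pin as `None`, which the internal pull-up resolves to 1b; AUX_PWR has no pull resistor listed in Table 2-12, so the sketch requires an explicit level for it. The function and key names are hypothetical and only illustrate the table; they do not correspond to any driver or register interface.

```python
def decode_straps(aux_pwr, lan0_dis_n=None, lan1_dis_n=None):
    """Decode latched strap levels (0 or 1) into enabled features.

    None models an unconnected pin; the LAN disable pins have an
    internal pull-up, so they read as 1 when left unconnected.
    """
    def with_internal_pullup(level):
        return 1 if level is None else level

    return {
        "lan0_enabled": with_internal_pullup(lan0_dis_n) == 1,
        "lan1_enabled": with_internal_pullup(lan1_dis_n) == 1,
        "aux_power_available": aux_pwr == 1,
    }


# Both LAN straps left unconnected, AUX power present.
print(decode_straps(aux_pwr=1))
# LAN1 strapped low (disabled), no AUX power.
print(decode_straps(aux_pwr=0, lan1_dis_n=0))
```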
2.4
Interface Diagram
Figure 2-1.
82576 Interface Diagram
2.5
Pin List (Alphabetical)
Table 2-13 lists the pins and signals in pin alphabetical order. Note that where multiple pins are listed,
the list sorts by the lowest pin designator. VSS pins are in Table 2-14.
Table 2-13.
Pin List (Alphabetical by Pin Designation)
Signal
Pin
Signal
Pin
Signal
Pin
NCSI_CRS_DV
A4
LED0_2
B18
RSVDM2_NC
M2
VCC3P3
A5, A17
LED0_1
B19
RSVDM3_NC
M3
NC-SI_RXD_1
A6
EE_SK
B20
RSVDM23_NC
M23
NC-SI_TXD_1
A7
EE_CS_N
B21
RSVDM24_NC
M24
RSVDA8_3P3
A8
IEEE_ATEST0_n
B22
SRDS_0_SIG_DET
A9
PE_CLK_n
N1
SRDS_1_SIG_DET
A10
MDI0_n_0
C23
PE_CLK_p
N2
RSVDA11_3P3
A11
MDI0_p_0
C24
XTAL1
N23
SDP1_1
A12
PET_0_n
D1
XTAL2
N24
SDP1_2
A13
PET_0_p
D2
VCC1P8
P9, P8,P5, P4, N9, N8,
N5, N4, M9, M8, M5,
M4, L9, L8, L5, L4
RSVDA14_VSS
A14
MDI0_n_1
D23
RSVDP14_1P0
P14
LAN1_DIS_N
A15
MDI0_p_1
D24
SDP0_0
A16
RBIAS0
E22
PET_2_n
R1
LED0_3
A18
PER_0_n
F1
PET_2_p
R2
LED0_0
A19
PER_0_p
F2
VCC1P0
R14, R13, R12, R11,
P13, P12, L13, L12,
K14, K13, K12, K11
EE_DO
A20
MDI0_n_2
F23
SRDSO_1_p
R23
EE_DI
A21
MDI0_p_2
F24
SRDSO_1_n
R24
IEEE_ATEST0_p
A22
MDI0_n_3
G23
SRDSI_1_p
T23
NCSI_ARB_OUT
B3
MDI0_p_3
G24
SRDSI_1_n
T24
NCSI_CLK_OUT
B4
PET_1_n
H1
PER_2_n
U1
NC-SI_CLK_IN
B5
PET_1_p
H2
PER_2_p
U2
NC-SI_TX_EN
B6
VCC1P0
J21, J20, J18, J17,
L21, L20, L18,
L17, K21, K20,
K18, K17, T21,
T20, T18, T17,
P21, P20, P18,
P17, R21, R20,
R18, R17
MDI1_n_3
V23
NC-SI_RXD_0
B7
SRDSI_0_p
J23
MDI1_p_3
V24
NC-SI_TXD_0
B8
SRDSI_0_n
J24
VCC1P0
V5, V4, U5, U4, P11, N11, M11, L11, H5, H4, G5, G4
DEV_OFF_N
B9
PER_1_n
K1
RSVDB10_3P3
B10
PER_1_p
K2
RSVDB11_3P3
B11
SRDSO_0_p
K23
PET_3_n
W1
RSVDB12_3P3
B12
SRDSO_0_n
K24
PET_3_p
W2
LAN0_DIS_N
B13
PE_RCOMP
L1
MDI1_n_2
W23
AUX_PWR
B14
VCC1P8
P9, P8,P5, P4, N9,
N8, N5, N4, M9,
M8, M5, M4, L9,
L8, L5, L4
MDI1_p_2
W24
SDP0_3
B15
RSVDL14_1P0
L14
RBIAS1
Y22
SDP0_1
B16
VCC1P8
L15, K15, J15,
H15, G15, E20,
E19, D20, D19,
AA20, AA19,Y20,
Y19, V15,U15,
T15, R15, P15,
N21, N15, M21,
M15
SDP0_2
B17
SER_RCOMP
L22
PER_3_n
AA1
SDP1_3
AC10
VCC3P3
AD6, AD12
PER_3_p
AA2
LED1_1
AC11
JTDI
AD7
MDI1_n_1
AA23
LED1_3
AC12
RSVDAD8_VSS
AD8
MDI1_p_1
AA24
LED1_2
AC13
RSVDAD9_3P3
AD9
RSVDAB18_NC
AB18
FLSH_SI
AC14
SDP1_0
AD10
RSVDAB19_NC
AB19
FLSH_CE_N
AC15
LED1_0
AD13
MDI1_n_0
AB23
SFP1_I2C_DATA/MDIO1
AC18
FLSH_SO
AD14
MDI1_p_0
AB24
SFP1_I2C_CLK/MDC1
AC19
FLSH_SCK
AD15
NCAC3
AC3
PE_WAKE_N
AC20
SFP0_I2C_DATA/MDIO0
AD18
RSVDAC5_3P3
AC5
SMBCLK
AC21
SFP0_I2C_CLK/MDC0
AD19
JTCK
AC6
IEEE_ATEST1_n
AC22
SMBALRT_N
AD20
JTMS
AC7
SMBD
AD21
JTDO
AC8
NC-SI_ARB_IN
AD3
IEEE_ATEST1_p
AD22
PE_RST_N
AC9
MAIN_PWR_OK
AD4
Table 2-14.
VSS Pins
Signal
Pin
VSS
Y24, Y23, Y21, Y18, Y17, Y16, W22, W21, W20, W19, W18, W17, W16, W15, V22, V21, V20, V19, V18,
V17, V16, U24, U23, U22, U21, U20, U19, U18, U17, U16, T22, T19, T16, R22, R19, R16, P24, P23, P22,
P19, P16, N22, N20, N19, N18, N17, N16, M22, M20, M19, M18, M17, M16, L24, L23, L19, L16, K22, K19,
K16, J22, J19, J16, H24, H23, H22, H21, H20, H19, H18, H17, H16, G22, G21, G20, G19, G18, G17, G16,
F22, F21, F20, F19, F18, F17, F16, F15, E24, E23, E21, E18, E17, E16, D22, D21, D18, D17, C22, C21,
C20, C19, C18, B24, B23, AD24, AD23, AC24, AC23, AB22, AB21, AB20, AA22, AA21, AA18, AA17, A24,
A23, Y5, Y4, Y3, Y2, Y1, W6, W5, W4, W3, V7, V6, V3, V2, V1, U8, U7, U6, U3, T9, T8, T7, T6, T5, T4, T3,
T2, T10, T1, R9, R8, R7, R6, R5, R4, R3, R10, P7, P6, P3, P2, P10, P1, N7, N6, N3, N10, M7,M6, M10, M1,
L7, L6, L3, L2, L10, K9, K8, K7, K6, K5, K4, K3, K10, J9, J8, J7, J6, J5, J4, J3, J2, J10, J1, H8, H7, H6, H3,
G7, G6, G3, G2, G1, F6, F5, F4, F3, E5, E4, E3, E2, E1, D4, D3, C3, C2, C1, AB3, AB2, AB1, AA4, AA3
2.6
Ball Out
This section provides a top view ball map of the 82576 in a 25 mm x 25 mm package. Some names in
the layout are not accurate (short names were chosen to fit). See Figure 2-2 for the color key for the
ball out table.
Figure 2-2.
Color Key for Ball-Out
[Figure: color key. Categories: Clock/BIAS/IEEE test pins; MDI Interface; NC-SI Signals; VCC1P8; VCC1P0; VSS; Functional Pin; PCIe signals; VCC3P3; Open Drain; Reserved signals; MDIO/2 Wire Interface signals]
Figure 2-3.
Ball-Out Representation
[Figure: top-view ball map on a 24 x 24 grid, with ball coordinates A through AD and 1 through 24. Ball assignments are listed in Table 2-13 and Table 2-14.]
§§
3.0
Interconnects
3.1
PCIe
3.1.1
PCIe Overview
PCIe is a third generation I/O architecture that enables cost competitive next generation I/O solutions
providing industry leading price/performance and feature richness. It is an industry-driven
specification.
PCIe defines a basic set of requirements that encompasses the majority of the targeted application classes.
The requirements of higher-end applications, such as enterprise class servers and high-end communication
platforms, are covered by a set of advanced extensions that complement the baseline requirements.
To guarantee headroom for future applications of PCIe, a software-managed mechanism for introducing
new, enhanced capabilities in the platform is provided. Figure 3-1 shows the PCIe architecture.
Figure 3-1.
PCIe Stack Structure
PCIe's physical layer consists of a differential transmit pair and a differential receive pair. Full-duplex
data on these two point-to-point connections is self-clocked such that no dedicated clock signals are
required. The bandwidth of this interface increases linearly with frequency.
The packet is the fundamental unit of information exchange and the protocol includes a message space
to replace the various side-band signals found on many buses today. This movement of hard-wired
signals from the physical layer to messages within the transaction layer enables easy and linear
physical layer width expansion for increased bandwidth.
The common base protocol uses split transactions and several mechanisms are included to eliminate
wait states and to optimize the reordering of transactions to further improve system performance.
3.1.1.1
Architecture, Transaction and Link Layer Properties
• Split transaction, packet-based protocol
• Common flat address space for load/store access (such as PCI addressing model)
— Memory address space of 32-bit to allow compact packet header (must be used to access
addresses below 4 GB)
— Memory address space of 64-bit using extended packet header
• Transaction layer mechanisms:
— PCI-X style relaxed ordering
— Optimizations for no-snoop transactions
• Credit-based flow control
• Packet sizes/formats:
— Maximum packet size supports 128 byte and 256 byte data payload
— Maximum read request size of 512 bytes
• Reset/initialization:
— Frequency/width/profile negotiation performed by hardware
• Data integrity support
— Using CRC-32 for transaction layer packets
• Link layer retry for recovery following error detection
— Using CRC-16 for link layer messages
• No retry following error detection
— 8b/10b encoding with running disparity
• Software configuration mechanism:
— Uses PCI configuration and bus enumeration model
— PCIe-specific configuration registers mapped via PCI extended capability mechanism
• Baseline messaging:
— In-band messaging of formerly side-band legacy signals (such as interrupts, etc.)
— System-level power management supported via messages
• Power management:
— Full support for PCI-PM
— Wake capability from D3cold state
— Compliant with ACPI, PCI-PM software model
— Active state power management
• Support for PCIe v2.0 (2.5GT/s)
— Support for completion time out
— Support for additional registers in the PCIe capability structure.
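The addressing and packet-size bullets above can be illustrated with a short Python sketch. It selects the address format from the start address (32-bit below 4 GB, 64-bit otherwise) and counts how many packets a transfer needs under the 512-byte maximum read request size and a 128-byte or 256-byte maximum payload. All function names are hypothetical, and the sketch deliberately ignores alignment rules (for example 4 KB boundary crossing and read completion boundaries) that a real implementation must honor.

```python
FOUR_GB = 1 << 32
MAX_READ_REQUEST_BYTES = 512  # maximum read request size from the list above


def address_format(start_addr):
    """Addresses below 4 GB must use the compact 32-bit format."""
    return "32-bit" if start_addr < FOUR_GB else "64-bit"


def _ceil_div(numerator, denominator):
    return -(-numerator // denominator)


def read_request_count(length_bytes, max_read_request=MAX_READ_REQUEST_BYTES):
    """Minimum number of read requests needed to fetch length_bytes."""
    if length_bytes <= 0:
        raise ValueError("length_bytes must be positive")
    return _ceil_div(length_bytes, max_read_request)


def write_packet_count(length_bytes, max_payload_bytes):
    """Minimum number of write packets for length_bytes at a given max payload."""
    if max_payload_bytes not in (128, 256):
        raise ValueError("max payload is 128 or 256 bytes in this sketch")
    if length_bytes <= 0:
        raise ValueError("length_bytes must be positive")
    return _ceil_div(length_bytes, max_payload_bytes)


# Example: a 1500-byte buffer located just above the 4 GB boundary.
print(address_format(FOUR_GB + 0x1000))
print(read_request_count(1500))
print(write_packet_count(1500, 256), write_packet_count(1500, 128))
```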
3.1.1.2
Physical Interface Properties
• Point to point interconnect
— Full-duplex; no arbitration
• Signaling technology:
— Low Voltage Differential (LVD)
— Embedded clock signaling using 8b/10b encoding scheme
• Serial frequency of operation: 2.5 GHz.
• Interface width of x4, x2, or x1.
• DFT and DFM support for high volume manufacturing
3.1.1.3
Advanced Extensions
PCIe defines a set of optional features to enhance platform capabilities for specific usage modes. The
82576 supports the following optional features:
• Advanced Error Reporting - messaging support to communicate multiple types/severity of errors
• Device serial number - Allows exposure of a unique serial number for each device.
• Alternative Requester ID (ARI) - allows support of more than eight functions per device.
• Single Root I/O virtualization (PCI-SIG SR-IOV) - allows exposure of virtual functions controlling a
subset of the resources to virtual machines.
3.1.2
Functionality - General
3.1.2.1
Native/Legacy
• All the 82576 PCI functions are native PCIe functions.
3.1.2.2
Locked Transactions
• The 82576 does not support locked requests as target or master.
3.1.2.3
End to End CRC (ECRC)
• Not supported by the 82576
3.1.3
Host I/F
3.1.3.1
Tag IDs
PCIe device numbers identify logical devices within the physical device (the 82576 is a physical device).
The 82576 implements a single logical device with up to two separate PCI functions: LAN 0, and LAN 1.
The device number is captured from each type 0 configuration write transaction.
Each of the PCIe functions interfaces with the PCIe unit through one or more clients. A client ID
identifies the client and is included in the Tag field of the PCIe packet header. Completions always carry
the tag value included in the request to enable routing of the completion to the appropriate client.
Tag IDs are allocated differently for read and write. Messages are sent with a tag of 0x1F.
3.1.3.1.1
TAG ID Allocation for Read Transactions
Table 3-1 lists the Tag ID allocation for read accesses. The tag ID is interpreted by hardware in order to
forward the read data to the required device.
Table 3-1. IDs in Read Transactions
Tag ID | Description | Comment
0 | Reserved |
1 | Descriptor Rx | Like 82571/82572/82575
2 | Reserved |
3 | Reserved |
4 | Descriptor Tx | Like 82571/82572/82575
5 | Reserved |
6 | Reserved |
7 | Reserved |
8 | Data request 0 | Like 82571/82572/82575
9 | Data request 1 | Like 82571/82572/82575
0a | Data request 2 | Like 82571/82572/82575
0b | Data request 3 | Like 82571/82572/82575
10 | Reserved |
11 | Message unit |
12-1F | Reserved |
3.1.3.1.2
TAG ID Allocation for Write Transactions
Request tag allocation depends on these system parameters:
• DCA supported/not supported in the system (DCA_CTRL.DCA_DIS - see Section 8.13.4 for details)
• DCA enabled/disabled for each type of traffic (TXCTL.TX Descriptor DCA EN, RXCTL.RX Descriptor
DCA EN, RXCTL.RX Header DCA EN, RXCTL.Rx Payload DCA EN)
• System type: Legacy DCA vs. DCA 1.0 (DCA_CTRL.DCA_MODE - see Section 8.13.4 for details).
• CPU ID (RXCTL.CPUID or TXCTL.CPUID)
Since DCA is implemented differently in I/OAT 1 and in I/OAT 2/3 platforms, the tag IDs are different as
well (see Section 3.1.3.1.2.3 below).
3.1.3.1.2.1
Case 1 - DCA Disabled in the System:
Table 3-2 lists the write request tags. Unlike read tags, these values are for debug only, allowing
requests to be traced through the system.
Table 3-2. IDs in Write Transactions, DCA Disabled Mode
Tag ID | Description
0x0 - 0x1 | Reserved
0x2 | Tx descriptors write-back / Tx Head write-back
0x3 | Reserved
0x4 | Rx descriptors write-back
0x5 | Reserved
0x6 | Write data
0x7 - 0x1D | Reserved
0x1E | MSI and MSI-X
0x1F | Reserved
3.1.3.1.2.2
Case 2 - DCA Enabled in the System, but Disabled for the Request:
• Legacy DCA platforms - If DCA is disabled for the request, the tag allocation is identical to the case
where DCA is disabled in the system. See Table 3-2 above.
• DCA 1.0 platforms - All write requests have the tag of 0x00.
Note:
When in DCA 1.0 mode, messages and MSI/MSI-X write requests are sent with the no-hint
tag.
3.1.3.1.2.3
Case 3 - DCA Enabled in the System, DCA Enabled for the Request:
• Legacy DCA Platforms: the request tag is constructed as follows:
— Bit[0] – DCA Enable
— Bits[3:1] - The CPU ID field taken from the CPUID[2:0] bits of the RXCTL or TXCTL registers
— Bits[7:4] - Reserved
• DCA 1.0 Platforms: the request tag (all 8 bits) is taken from the CPUID field of the RXCTL or TXCTL
registers
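The three cases above can be summarized in a short sketch. This is an illustration only, not driver code: the function name, its parameters, and the dictionary of Table 3-2 tags are hypothetical, and it assumes that Bit[0] (DCA Enable) is set to 1 whenever a legacy DCA tag is built for a DCA-enabled request.

```python
# Debug tags from Table 3-2 (DCA disabled mode); keys are illustrative names.
TABLE_3_2_TAGS = {
    "tx_desc_writeback": 0x02,
    "rx_desc_writeback": 0x04,
    "write_data": 0x06,
    "msi": 0x1E,
}


def write_request_tag(kind, dca_in_system, dca_for_request, dca_1_0, cpu_id=0):
    """Return the 8-bit tag for a write request (hypothetical helper)."""
    if not dca_in_system:
        # Case 1: DCA disabled in the system, tags follow Table 3-2.
        return TABLE_3_2_TAGS[kind]
    if not dca_for_request:
        # Case 2: DCA 1.0 uses the no-hint tag 0x00, legacy DCA falls back to Table 3-2.
        return 0x00 if dca_1_0 else TABLE_3_2_TAGS[kind]
    if dca_1_0:
        # Case 3, DCA 1.0: all 8 tag bits come from the CPUID field.
        return cpu_id & 0xFF
    # Case 3, legacy DCA: bit 0 = DCA Enable (assumed 1), bits 3:1 = CPUID[2:0], bits 7:4 = 0.
    return ((cpu_id & 0x7) << 1) | 0x1
```

For example, a legacy DCA write with CPUID = 5 would carry tag 0x0B under these assumptions.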
3.1.3.2
Completion Timeout Mechanism
In any split transaction protocol, there is a risk associated with the failure of a requester to receive an
expected completion. To enable requesters to attempt recovery from this situation in a standard
manner, the completion timeout mechanism is defined.
The completion timeout mechanism is activated for each request that requires one or more completions
when the request is transmitted. The PCIe specification, Rev. 1.1 requires that the completion timeout
timer:
• Should not expire in less than 10 ms.
• Must expire if a request is not completed within 50 ms.
However, some platforms experience completion latencies that are longer than 50 ms, in some cases up
to seconds. PCIe specification Rev. 2.0 added a mechanism to allow configuration of the completion
timeout. The 82576 supports both the legacy Rev. 1.1 and the default Rev. 2.0 mechanisms.
To support the legacy mode, it provides a programmable range for the completion timeout, as well as
the ability to disable completion timeout altogether. The default PCIe Rev. 2.0 mode programs
completion timeout through an extension of the PCIe capability structure. The new capability structure
is assigned a PCIe capability structure version of 0x2.
The 82576 controls the following aspects of completion timeout:
• Disabling or enabling completion timeout
• Disabling or enabling resending a request on completion timeout
• A programmable range of timeout values
How the completion timeout behavior is programmed depends on whether capability structure
version 0x1 or capability structure version 0x2 is enabled. Table 3-3 lists the behavior for
both cases.
Table 3-3. Completion Timeout Programming
Capability | Capability Structure Version = 0x1 | Capability Structure Version = 0x2
Completion timeout enabling | Loaded from EEPROM into Completion_Timeout_Disable bit in the PCIe Control Register (GCR 0x05000). | Controlled through PCIe configuration space Device Control 2 Register (0xC8) bit 4. Visible through read-only CSR bit.
Resend request enable | Loaded from EEPROM into Completion_Timeout_Resend bit in the PCIe Control Register (GCR, 0x05000). | Same as version = 0x1
Completion timeout period | Loaded from EEPROM into CSR bit. | Controlled through PCIe configuration space Device Control 2 Register (0xC8) bits 3:0. Visible through read-only CSR bit.
The capability structure exposed and the mode used are fixed by the GIO_CAP field in the PCIe Init
Configuration 3 EEPROM Word (Word 0x1A).
3.1.3.2.1
Completion Timeout Enable
• Version = 0x1 - Loaded from the Completion Timeout Disable bit in the EEPROM (Word 0x15, bit 7)
into the Completion_Timeout_Disable bit in the PCIe Control Register (GCR). Completion Timeout
enabled is the default.
• Version = 0x2 - Programmed through PCIe configuration space Device Control 2 Register (0xC8) bit
4. Visible through the Completion_Timeout_Disable bit in the PCIe Control Register (GCR).
Completion Timeout enabled is the default.
3.1.3.2.2
Resend Request Enable
• Version = 0x1 - The Completion Timeout Resend EEPROM bit (Word 0x15, bit 4), loaded to the
Completion_Timeout_Resend bit in the PCIe Control Register (GCR), enables resending the request
(applies only when completion timeout is enabled). The default is to resend a request that timed
out.
• Version = 0x2 - Same as when version = 0x1.
3.1.3.2.3
Completion Timeout Period
• Version = 0x1 - Loaded from the Completion Timeout Value field in the EEPROM (word 0x15, bits
6:5) to the Completion_Timeout_Value bits in the PCIe Control Register (GCR). The following
values are supported.
Setting: Completion Timeout Value | PCIe Spec defined ranges | Ranges implemented
00 (default) | 50 μs to 10 ms | 500 μs to 1 ms
01 | 10 ms to 250 ms | 50 ms to 100 ms
10 | 250 ms to 4 s | 500 ms to 1 s
11 | 4 s to 64 s | 10 s to 20 s
• Version = 0x2 - Programmed through PCI configuration. Visible through the
Completion_Timeout_Value bits in the PCIe Control Register (GCR). The 82576 supports all four
ranges defined by the PCIe ECR.
— 50 μs to 10 ms
— 10 ms to 250 ms
— 250 ms to 4 s
— 4 s to 64 s
System software programs a range (one of nine possible ranges that sub-divide the four ranges
previously mentioned) into the PCIe configuration space Device Control 2 Register (0xC8) bits 3:0. The
following are supported sub-ranges.
Setting: Completion Timeout Value, Device Control 2 Register (0xC8) bits 3:0 | PCIe defined ranges | Ranges implemented
0000 (default) | 50 μs to 10 ms | 500 μs to 1 ms
0001 | 50 μs to 100 μs | 50 μs to 100 μs
0010 | 1 ms to 10 ms | 2 ms to 4 ms
0101 | 16 ms to 55 ms | 16 ms to 32 ms
0110 | 65 ms to 210 ms | 65 ms to 130 ms
1001 | 260 ms to 900 ms | 260 ms to 520 ms
1010 | 1 s to 3.5 s | 1 s to 2 s
1101 | 4 s to 13 s | 4 s to 8 s
1110 | 17 s to 64 s | 17 s to 34 s
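As an illustration of how system software might use the sub-range table above, the following sketch picks a Device Control 2 bits 3:0 setting for a given worst-case completion latency. The selection policy is an assumption made for this example (choose the shortest implemented range whose lower bound is still above the expected latency); the function and dictionary names are hypothetical.

```python
# Implemented timeout ranges in seconds, keyed by Device Control 2 bits 3:0.
IMPLEMENTED_RANGES = {
    0b0000: (500e-6, 1e-3),   # default
    0b0001: (50e-6, 100e-6),
    0b0010: (2e-3, 4e-3),
    0b0101: (16e-3, 32e-3),
    0b0110: (65e-3, 130e-3),
    0b1001: (260e-3, 520e-3),
    0b1010: (1.0, 2.0),
    0b1101: (4.0, 8.0),
    0b1110: (17.0, 34.0),
}


def pick_timeout_setting(max_latency_s):
    """Return the encoding whose implemented range starts above max_latency_s.

    Assumed policy: the timer may expire anywhere inside the implemented range,
    so the lower bound must exceed the expected latency. Returns None when no
    implemented range is long enough.
    """
    candidates = [
        (low, code)
        for code, (low, _high) in IMPLEMENTED_RANGES.items()
        if low > max_latency_s
    ]
    return min(candidates)[1] if candidates else None
```

With this policy, a platform expecting completions within 10 ms would select 0101 (16 ms to 32 ms).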
A memory read request for which there are multiple completions is considered completed only when all
completions are received by the requester. If some, but not all, requested data is returned before the
completion timeout timer expires, the requester is permitted to keep or to discard the data that was
returned prior to timer expiration.
Note:
The completion timeout value must be programmed correctly in PCIe configuration space
(Device Control 2 Register); the value must be set above the expected maximum latency
for completions in the system in which the device is installed. This ensures that the
device receives the completions for the requests it sends out, avoiding a completion timeout
scenario. It is expected that the system BIOS sets this value appropriately for the
system.
3.1.4
Transaction Layer
The upper layer of the PCIe architecture is the transaction layer. The transaction layer connects to the
82576 core using an implementation specific protocol. Through this core-to-transaction-layer protocol,
the application-specific parts of the 82576 interact with the PCIe subsystem and transmit and receive
requests to or from the remote PCIe agent, respectively.
3.1.4.1
Transaction Types Accepted by the 82576
Table 3-4. Transaction Types at the Rx Transaction Layer
Transaction Type | FC Type | Tx Layer Reaction | Hardware Should Keep Data From Original Packet | For Client
Configuration Read Request | NPH | CPLH + CPLD | Requester ID, TAG, Attribute | Configuration space
Configuration Write Request | NPH + NPD | CPLH | Requester ID, TAG, Attribute | Configuration space
Memory Read Request | NPH | CPLH + CPLD | Requester ID, TAG, Attribute | CSR
Memory Write Request | PH + PD | - | - | CSR
IO Read Request | NPH | CPLH + CPLD | Requester ID, TAG, Attribute | CSR
IO Write Request | NPH + NPD | CPLH | Requester ID, TAG, Attribute | CSR
Read completions | CPLH + CPLD | - | - | DMA
Message | PH | - | - | Message Unit / INT / PM / Error Unit
Flow control types:
• PH - Posted request headers
• PD - Posted request data payload
• NPH - Non-posted request headers
• NPD - Non-posted request data payload
• CPLH - Completion headers
• CPLD - Completion data payload
3.1.4.1.1
Configuration Request Retry Status
PCIe supports devices that require a lengthy self-initialization sequence to complete before they are
able to service configuration requests. This is the case for the 82576, which might have a delay in
initialization due to an EEPROM read.
If the read of the PCIe section in the EEPROM was not completed and the 82576 receives a
configuration request, the 82576 responds with a configuration request retry completion status to
terminate the request, and thus effectively stall the configuration request until such time that the
subsystem has completed local initialization and is ready to communicate with the host.
3.1.4.1.2
Partial Memory Read and Write Requests
The 82576 has limited support of read and write requests when only part of the byte enable bits are set
as described later in this section.
Partial writes to the MSI-X table are supported. All other partial writes are ignored and a completion
abort is sent.
Zero-length writes have no internal impact (nothing written, no effect such as clear-by-write). The
transaction is treated as a successful operation (no error event).
Partial reads with at least one byte enabled are answered as a full read. Any side effect of the full read
(such as clear by read) is applicable to partial reads also.
Zero-length reads generate a completion, but the register is not accessed and undefined data is
returned.
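The partial and zero-length access rules above can be sketched as a small decision function. This is illustrative only: it assumes a single-Dword access described by a 4-bit byte-enable mask, and the function names and returned strings are hypothetical.

```python
def classify_write(byte_enables, msix_table=False):
    """Describe how a single-Dword target write is handled (illustrative)."""
    be = byte_enables & 0xF
    if be == 0:
        # Zero-length write: nothing written, no side effects, no error event.
        return "no effect, treated as successful"
    if be == 0xF or msix_table:
        # Full writes, and partial writes to the MSI-X table, are performed.
        return "written"
    # Any other partial write is ignored and a completion abort is sent.
    return "ignored, completion abort"


def classify_read(byte_enables):
    """Describe how a single-Dword target read is handled (illustrative)."""
    if byte_enables & 0xF == 0:
        # Zero-length read: a completion is generated, the register is not accessed.
        return "completion with undefined data, register not accessed"
    # At least one byte enabled: answered as a full read, side effects included.
    return "full read, including side effects"
```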
3.1.4.2
Transaction Types Initiated by the 82576
Table 3-5. Transaction Types at the Tx Transaction Layer
Transaction type | Payload Size | FC Type | From Client
Configuration Read Request Completion | Dword | CPLH + CPLD | Configuration space
Configuration Write Request Completion | - | CPLH | Configuration space
I/O Read Request Completion | Dword | CPLH + CPLD | CSR
I/O Write Request Completion | - | CPLH | CSR
Read Request Completion | Dword/Qword | CPLH + CPLD | CSR
Memory Read Request | - | NPH | DMA
Memory Write Request | <= MAX_PAYLOAD_SIZE | PH + PD | DMA
Message | - | PH | Message Unit / INT / PM / Error Unit
Note:
The supported MAX_PAYLOAD_SIZE is loaded from EEPROM (128 bytes, 256 bytes or 512 bytes). If the ARI capability is not
exposed, the effective MAX_PAYLOAD_SIZE is defined for each PCI function according to the configuration space register of
that function. If the ARI capability is exposed, the effective MAX_PAYLOAD_SIZE is defined for all PCI functions according to
the configuration space register of function zero.
3.1.4.2.1
Data Alignment
Requests must never specify an address/length combination that causes a memory space access to
cross a 4 KB boundary. The 82576 breaks requests into 4 KB-aligned requests (if needed). This does
not pose any requirement on software. However, if software allocates a buffer across a 4 KB boundary,
hardware issues multiple requests for the buffer. Software should consider limiting buffer sizes and
base addresses to comply with a 4 KB boundary in cases where it improves performance.
The general rules for packet alignment are as follows:
1. The length of a single request should not exceed the PCIe limit of MAX_PAYLOAD_SIZE for write
and MAX_READ_REQ for read.
2. The length of a single request should not exceed the 82576's internal limitation of 512 bytes.
3. A single request should not span across different memory pages as noted by the 4 KB boundary
previously mentioned.
Note:
The rules apply to all the 82576 requests (read/write, snoop and no snoop).
If a request can be sent as a single PCIe packet and still meet rules 1-3, then it is not broken at a
cache-line boundary (as defined in the PCIe Cache line size configuration word), but rather, sent as a
single packet (motivation is that the chipset might break the request along cache-line boundaries, but
the 82576 should still benefit from better PCIe utilization). However, if rules 1-3 require that the
request is broken into two or more packets, then the request is broken at a cache-line boundary.
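The alignment rules above can be sketched as a splitting routine. This is an approximation for illustration, not a description of the actual hardware logic: the function name is hypothetical, the 64-byte cache-line default is an assumption, and `max_size` stands for MAX_PAYLOAD_SIZE (writes) or MAX_READ_REQ (reads).

```python
def split_request(addr, length, max_size, cache_line=64):
    """Split a DMA access into (address, length) PCIe requests.

    Rules 1-2: each request is at most min(max_size, 512) bytes.
    Rule 3: no request crosses a 4 KB boundary.
    A request that already satisfies rules 1-3 is sent whole; otherwise the
    split points are pulled back to a cache-line boundary.
    """
    limit = min(max_size, 512)
    chunks = []
    while length > 0:
        next_4k = (addr // 4096 + 1) * 4096
        end = min(addr + length, addr + limit, next_4k)
        if end < addr + length:
            # A split is required here: break at a cache-line boundary if one
            # lies strictly inside the chunk.
            aligned = (end // cache_line) * cache_line
            if aligned > addr:
                end = aligned
        chunks.append((addr, end - addr))
        length -= end - addr
        addr = end
    return chunks
```

For example, a 512-byte access starting at 0x1010 with a 256-byte limit would be issued as requests of 240, 256 and 16 bytes, so that the two split points fall on cache-line boundaries.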
3.1.4.2.2
Multiple Tx Data Read Requests
The 82576 supports four pipelined requests for transmit data. In general, the four requests might
belong to the same packet or to consecutive packets. However, the following restriction applies:
• All requests for a packet are issued before a request is issued for a consecutive packet
Read requests can be issued from any of the supported queues, as long as the restriction is met.
Pipelined requests might belong to the same queue or to separate queues. However, as previously
noted, all requests for a certain packet are issued (from same queue) before a request is issued for a
different packet (potentially from a different queue).
The PCIe specification does not ensure that completions for separate requests return in order. Read
completions for concurrent requests are not required to return in the order issued. The 82576 handles
completions that arrive in any order. Once all completions arrive for a given request, the 82576 might
issue the next pending read data request.
• The 82576 incorporates a 2 KB re-order buffer to support re-ordering of completions for four
requests. Each request/completion can be up to 512 bytes long. The maximum size of a read
request is defined as the minimum {512, Max_Read_Request_Size}.
In addition to the four pipeline requests for transmit data, the 82576 can issue a single read request for
each of the Tx descriptors and Rx descriptors. The requests for Tx data, Tx descriptor, and Rx descriptor
are independently issued. Each descriptor read request can fetch up to 16 descriptors (equal to 256
bytes of data).
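The sizing relationships in this section can be checked with a few lines of arithmetic. The constant and function names are hypothetical; the 16-byte descriptor size is derived from the statement that 16 descriptors equal 256 bytes.

```python
REORDER_BUFFER_BYTES = 2 * 1024      # 2 KB re-order buffer
PIPELINED_TX_DATA_REQUESTS = 4       # four pipelined Tx data read requests
INTERNAL_MAX_REQUEST_BYTES = 512     # internal limit per request/completion
DESCRIPTOR_BYTES = 256 // 16         # 16 descriptors per fetch, 256 bytes total


def max_read_request(max_read_request_size):
    """Effective read request size: minimum {512, Max_Read_Request_Size}."""
    return min(INTERNAL_MAX_REQUEST_BYTES, max_read_request_size)


def descriptor_fetch_bytes(count):
    """Bytes fetched by one descriptor read request of `count` descriptors."""
    if not 1 <= count <= 16:
        raise ValueError("a descriptor read request fetches 1 to 16 descriptors")
    return count * DESCRIPTOR_BYTES
```

Note that four outstanding requests of 512 bytes each exactly fill the 2 KB re-order buffer.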
3.1.4.3
Messages
3.1.4.3.1
Message Handling by the 82576 (as a Receiver)
Message packets are special packets that carry a message code.
The upstream device transmits special messages to the 82576 by using this mechanism.
The transaction layer decodes the message code and responds to the message accordingly.
Table 3-6. Supported Message in the 82576 (as a Receiver)
Message code [7:0] | Routing r2r1r0 | Message | Device later response
0x14 | 100 | PM_Active_State_NAK | Internal signal set
0x19 | 011 | PME_Turn_Off | Internal signal set
0x50 | 100 | Slot power limit support (has one Dword data) | Silently drop
0x7E | 010,011,100 | Vendor_defined type 0 no data | Unsupported request (see note 1)
0x7E | 010,011,100 | Vendor_defined type 0 data | Unsupported request (see note 1)
0x7F | 010,011,100 | Vendor_defined type 1 no data | Silently drop
0x7F | 010,011,100 | Vendor_defined type 1 data | Silently drop
0x00 | 011 | Unlock | Silently drop
1. No completion is expected for this type of packet.
3.1.4.3.2
Message Handling by the 82576 (as a Transmitter)
The transaction layer is also responsible for transmitting specific messages to report internal/external
events (such as interrupts and PMEs).
Table 3-7. Supported Message in the 82576 (as a Transmitter)
Message code [7:0] | Routing r2r1r0 | Message
0x20 | 100 | Assert INT A
0x21 | 100 | Assert INT B
0x22 | 100 | Assert INT C
0x23 | 100 | Assert INT D
0x24 | 100 | De-assert INT A
0x25 | 100 | De-assert INT B
0x26 | 100 | De-assert INT C
0x27 | 100 | De-assert INT D
0x30 | 000 | ERR_COR
0x31 | 000 | ERR_NONFATAL
0x33 | 000 | ERR_FATAL
0x18 | 000 | PM_PME
0x1B | 101 | PME_TO_ACK
3.1.4.4
Ordering Rules
The 82576 meets the PCIe ordering rules (PCI-X rules) by following the PCI simple device model:
• Deadlock avoidance - Master and target accesses are independent - The response to a target
access does not depend on the status of a master request to the bus. If master requests are
blocked, such as due to no credits, target completions might still proceed (if credits are available).
• Descriptor/data ordering - The 82576 does not proceed with some internal actions until respective
data writes have ended on the PCIe link:
— The 82576 does not update an internal header pointer until the descriptors that the header
pointer relates to are written to the PCIe link.
— The 82576 does not issue a descriptor write until the data that the descriptor relates to is
written to the PCIe link.
The 82576 might issue the following master read request from each of the following clients:
• Rx Descriptor Read (one for each LAN port)
• Tx Descriptor Read (two for each LAN port)
• Tx Data Read (up to four for each LAN port/ one for the manageability)
Completions for separate read requests are not guaranteed to return in order. Completions for a single
read request are guaranteed to return in address order.
3.1.4.4.1
Out of Order Completion Handling
In a split transaction protocol, when using multiple read requests in a multi processor environment,
there is a risk that completions arrive from the host memory out of order and interleaved. In this case,
the 82576 sorts the request completion and transfers them to the Ethernet in the correct order.
3.1.4.5
Transaction Definition and Attributes
3.1.4.5.1
Max Payload Size
The 82576 policy to determine Max Payload Size (MPS) is as follows:
• Master requests initiated by the 82576 (including completions) limit MPS to the value defined for
the function issuing the request.
• Target write accesses to the 82576 are accepted only with a size of one Dword or two Dwords.
Write accesses from three Dwords up to MPS are flagged as UR (unsupported request). Write accesses
above MPS are flagged as malformed.
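The target write size policy can be expressed as a small classifier. This is a sketch for illustration; the function name and return strings are hypothetical, and lengths are given in Dwords (4 bytes each).

```python
def classify_target_write(length_dwords, mps_bytes):
    """Classify a target write by size, following the MPS policy above."""
    if length_dwords < 1:
        raise ValueError("length must be at least one Dword")
    if length_dwords <= 2:
        # One- or two-Dword writes are accepted.
        return "accepted"
    if length_dwords * 4 <= mps_bytes:
        # Three Dwords up to MPS: flagged as unsupported request.
        return "UR"
    # Larger than MPS: flagged as malformed.
    return "malformed"
```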
3.1.4.5.2
Traffic Class (TC) and Virtual Channels (VC)
The 82576 only supports TC=0 and VC=0 (default).
3.1.4.5.3
Relaxed Ordering
The 82576 takes advantage of the relaxed ordering rules in PCIe. By setting the relaxed ordering bit in
the packet header, the 82576 enables the system to optimize performance in the following cases:
• Relaxed ordering for descriptor and data reads: When the 82576 emits a read transaction, its split
completion has no ordering relationship with the writes from the CPUs (same direction). It should
be allowed to bypass the writes from the CPUs.
• Relaxed ordering for receiving data writes: When the 82576 masters receive data writes, it also
enables them to bypass each other in the path to system memory because software does not
process this data until their associated descriptor writes complete.
• The 82576 cannot relax ordering for descriptor writes, MSI/MSI-X writes or PCIe messages.
Relaxed ordering can be used in conjunction with the no-snoop attribute to enable the memory
controller to advance non-snoop writes ahead of earlier snooped writes.
Relaxed ordering is enabled in the 82576 by clearing the RO_DIS bit in the CTRL_EXT register. Actual
setting of relaxed ordering is done for LAN traffic by the host through the DCA registers.
3.1.4.5.4
Snoop Not Required
The 82576 sets the Snoop Not Required attribute bit for master data writes. System logic might provide
a separate path into system memory for non-coherent traffic. The non-coherent path to system
memory provides higher, more uniform, bandwidth for write requests.
Note:
The Snoop Not Required attribute does not alter transaction ordering. Therefore, to achieve
maximum benefit from Snoop Not Required transactions, it is advisable to set the relaxed
ordering attribute as well (assuming that system logic supports both attributes). In fact,
some chipsets require that relaxed ordering is set for no-snoop to take effect.
Global no-snoop support is enabled in the 82576 by clearing the NS_DIS bit in the CTRL_EXT register.
Actual setting of no snoop is done for LAN traffic by the host through the DCA registers.
3.1.4.5.5
No Snoop and Relaxed Ordering for LAN Traffic
Software might configure no-snoop and relaxed ordering attributes for each queue and each type of
transaction by setting the respective bits in the RXCTL and TXCTL registers.
Table 3-8 lists the default behavior for the No-Snoop and Relaxed Ordering bits for LAN traffic when
I/OAT 2 is enabled.
Table 3-8. LAN Traffic Attributes
Transaction | No-Snoop Default | Relaxed Ordering Default | Comments
Rx Descriptor Read | N | Y |
Rx Descriptor Write-Back | N | N | Relaxed ordering must never be used for this traffic.
Rx Data Write | Y | Y | See the following note and Section 3.1.4.5.5.1
Rx Replicated Header | N | Y |
Tx Descriptor Read | N | Y |
Tx Descriptor Write-Back | N | Y |
Tx TSO Header Read | N | Y |
Tx Data Read | N | Y |
Note:
Rx payload no-snoop is also conditioned by the NSE bit in the receive descriptor. See
Section 3.1.4.5.5.1.
3.1.4.5.5.1
No-Snoop Option for Payload
Under certain conditions, which occur when I/OAT is enabled, software knows that it is safe to transfer
(DMA) a new packet into a certain buffer without snooping on the front-side bus. This scenario typically
occurs when software is posting a receive buffer to hardware that the CPU has not accessed since the
last time it was owned by hardware. This might happen if the data was transferred to an application
buffer by the I/OAT DMA engine.
In this case, software should be able to set a bit in the receive descriptor indicating that the 82576
should perform a no-snoop DMA transfer when it eventually writes a packet to this buffer.
When a non-snoop transaction is activated, the TLP header has a non-snoop attribute in the
Transaction Descriptor field.
This is triggered by the NSE bit in the receive descriptor. See Section 7.1.5.
3.1.4.5.5.2
No Snoop Option for TSO Header
Because hardware reads the header of a TSO request for each segment it sends, it is safe to assume that
after the first read of the header, the header is up to date in main memory. As a result, all subsequent
reads of the header might be done with the no-snoop option set. This option is triggered by setting the
NoSnoop_LSO_hdr_buf bit in the DTXCTL register.
3.1.4.6
Flow Control
3.1.4.6.1
82576 Flow Control Rules
The 82576 implements only the default Virtual Channel (VC0). A single set of credits is maintained for
VC0.
Table 3-9. Allocation of FC Credits
Credit Type | Operations | Number Of Credits
Posted Request Header (PH) | Target Write (one unit); Message (one unit) | Two units (to enable concurrent accesses to both LAN ports).
Posted Request Data (PD) | Target Write (Length/16 bytes=1); Message (one unit) | MAX_PAYLOAD_SIZE/16
Non-Posted Request Header (NPH) | Target Read (one unit); Configuration Read (one unit); Configuration Write (one unit) | Two units (to enable concurrent target accesses to both LAN ports).
Non-Posted Request Data (NPD) | Configuration Write (one unit) | Two units.
Completion Header (CPLH) | Read Completion (N/A) | Infinite (accepted immediately).
Completion Data (CPLD) | Read Completion (N/A) | Infinite (accepted immediately).
Rules for FC updates:
• The 82576 maintains two credits for NPD at any given time. It increments the credit by one after
the credit is consumed and sends an UpdateFC packet as soon as possible. UpdateFC packets are
scheduled immediately after a resource is available.
• The 82576 provides two credits for PH (such as for two concurrent target writes) and two credits for
NPH (such as for two concurrent target reads). UpdateFC packets are scheduled immediately after
a resource becomes available.
• The 82576 follows the PCIe recommendations for frequency of UpdateFC FCPs.
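As a worked example of the data credit arithmetic in Table 3-9, the sketch below assumes the standard PCIe data credit unit of 16 bytes. The function names are hypothetical.

```python
DATA_CREDIT_BYTES = 16  # one PCIe data credit covers 16 bytes (4 Dwords)


def pd_credits_advertised(max_payload_size):
    """Posted data credits per Table 3-9: MAX_PAYLOAD_SIZE / 16."""
    return max_payload_size // DATA_CREDIT_BYTES


def data_credits_needed(payload_bytes):
    """Data credits consumed by a payload, rounded up to whole credit units."""
    return -(-payload_bytes // DATA_CREDIT_BYTES)
```

A one- or two-Dword target write (4 or 8 bytes) therefore consumes a single PD credit, which matches the "Length/16 bytes=1" entry in the table.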
3.1.4.6.2
Upstream Flow Control Tracking
The 82576 issues a master transaction only when the required FC credits are available. Credits are
tracked for posted, non-posted, and completions (the latter to operate against a switch).
3.1.4.6.3
Flow Control Update Frequency
In any case, UpdateFC packets are scheduled immediately after a resource becomes available.
When the link is in the L0 or L0s link state, Update FCPs for each enabled type of non-infinite FC credit
must be scheduled for transmission at least once every 30 μs (-0%/+50%), except when the Extended
Sync bit of the Link Control register is set, in which case the limit is 120 μs (-0%/+50%).
3.1.4.6.4
Flow Control Timeout Mechanism
The 82576 implements the optional FC update timeout mechanism.
The mechanism is activated when the link is in the L0 or L0s link state. It uses a timer with a limit of
200 μs (-0%/+50%), where the timer is reset by the receipt of any Init or Update FCP. Alternately, the
timer may be reset by the receipt of any DLLP.
After timer expiration, the mechanism instructs the PHY to re-establish the link (via the LTSSM recovery
state).
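The (-0%/+50%) tolerances quoted for the flow control timers translate into the following allowed windows. The helper below is a hypothetical illustration of the arithmetic only.

```python
def tolerance_window(nominal_us, minus_pct=0, plus_pct=50):
    """Return (min, max) in microseconds for a nominal limit with tolerances."""
    low = nominal_us * (100 - minus_pct) / 100
    high = nominal_us * (100 + plus_pct) / 100
    return (low, high)


# UpdateFC scheduling limit, normal and with Extended Sync set.
update_fc_window = tolerance_window(30)       # 30 to 45 microseconds
extended_sync_window = tolerance_window(120)  # 120 to 180 microseconds
# FC update timeout limit.
fc_timeout_window = tolerance_window(200)     # 200 to 300 microseconds
```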
3.1.4.7
Error Forwarding
If a TLP is received with an error-forwarding trailer, the packet is dropped and not delivered to its
destination. The 82576 does not initiate any additional master requests for that PCI function until it
detects an internal reset or a software reset for the associated LAN. Software is able to access device
registers after such a fault.
System logic is expected to trigger a system-level interrupt to inform the operating system of the
problem. The operating system can then stop the process associated with the transaction, re-allocate
memory instead of the faulty area, etc.
3.1.5
Data Link Layer
3.1.5.1
ACK/NAK Scheme
The 82576 supports two alternative schemes for ACK/NAK rate:
1. ACK/NAK is scheduled for transmission according to timeouts specified in the LTIV register
2. ACK/NAK is scheduled for transmission according to timeouts specified in the PCIe specification.
The PCIe Error Recovery bit loaded from EEPROM determines which of the two schemes is used.
3.1.5.2
Supported DLLPs
The following DLLPs are supported by the 82576 as a receiver:
Table 3-10. DLLPs Received by the 82576
DLLP type | Remarks
Ack |
Nak |
PM_Request_Ack |
InitFC1-P | Virtual Channel 0 only
InitFC1-NP | Virtual Channel 0 only
InitFC1-Cpl | Virtual Channel 0 only
InitFC2-P | Virtual Channel 0 only
InitFC2-NP | Virtual Channel 0 only
InitFC2-Cpl | Virtual Channel 0 only
UpdateFC-P | Virtual Channel 0 only
UpdateFC-NP | Virtual Channel 0 only
UpdateFC-Cpl | Virtual Channel 0 only
The following DLLPs are supported by the 82576 as a transmitter:
Table 3-11. DLLPs Initiated by the 82576
DLLP type | Remarks
Ack |
Nak |
PM_Enter_L1 |
PM_Enter_L23 |
PM_Active_State_Request_L1 |
InitFC1-P | Virtual Channel 0 only
InitFC1-NP | Virtual Channel 0 only
InitFC1-Cpl | Virtual Channel 0 only
InitFC2-P | Virtual Channel 0 only
InitFC2-NP | Virtual Channel 0 only
InitFC2-Cpl | Virtual Channel 0 only
UpdateFC-P | Virtual Channel 0 only
UpdateFC-NP | Virtual Channel 0 only
Note:
UpdateFC-Cpl is not sent because of the infinite FC-Cpl allocation.
3.1.5.3
Transmit EDB Nullifying
When a link retrain is necessary, the 82576 must guarantee that the Tx packet is not terminated
abruptly. For this reason, early termination of the transmitted packet is possible. This is done by
appending an EDB (EnD Bad symbol) to the packet.
3.1.6
Physical Layer
3.1.6.1
Link Width
The 82576 supports a maximum link width of x4, x2, or x1 as determined by the Lane_Width field in
PCIe Init Configuration 3 EEPROM word.
The max link width is loaded into the Maximum Link Width field of the PCIe Capability register
(LCAP[11:6]). The hardware default is x4 link.
During link configuration, the platform and the 82576 negotiate on a common link width. The link width
must be one of the supported PCIe link widths (x1, x2, x4), such that:
• If Maximum Link Width = x4, then the 82576 negotiates to either x4, x2 or x1 (see footnote 1).
• If Maximum Link Width = x2, then the 82576 negotiates to either x2 or x1.
• If Maximum Link Width = x1, then the 82576 only negotiates to x1.
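The width rules above amount to choosing the widest supported width that both link partners can operate at. The sketch below is illustrative only: `partner_max` is a hypothetical parameter for the widest width the platform side offers, and the lane reversal restriction of Section 3.1.6.5 is deliberately not modeled.

```python
SUPPORTED_WIDTHS = (4, 2, 1)  # link widths supported by the 82576


def negotiable_widths(max_link_width):
    """Widths the 82576 may negotiate for a given Maximum Link Width."""
    if max_link_width not in SUPPORTED_WIDTHS:
        raise ValueError("Maximum Link Width must be 4, 2 or 1")
    return [w for w in SUPPORTED_WIDTHS if w <= max_link_width]


def negotiate(max_link_width, partner_max):
    """Widest common width; x1 is always available as the fallback."""
    if partner_max < 1:
        raise ValueError("the link partner must support at least x1")
    return max(w for w in negotiable_widths(max_link_width) if w <= partner_max)
```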
3.1.6.2
Polarity Inversion
If polarity inversion is detected, the receiver must invert the received data.
1. See restriction in Section 3.1.6.5.
Intel® 82576 GbE Controller
Datasheet
90
320961-015EN
Revision: 2.61
December 2010
Interconnects — Intel® 82576 GbE Controller
During the training sequence, the receiver looks at Symbols 6-15 of TS1 and TS2 as the indicator of
lane polarity inversion (D+ and D- are swapped). If lane polarity inversion occurs, the TS1 Symbols 615 received are D21.5 as opposed to the expected D10.2. Similarly, if lane polarity inversion occurs,
Symbols 6-15 of the TS2 ordered set are D26.5 as opposed to the expected D5.2. This provides the
clear indication of lane polarity inversion.
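The polarity check described above can be sketched in Python. This is a minimal illustration, not the 82576 hardware logic: the function names are hypothetical, and the symbols are assumed to be already decoded to 8-bit values, where Dx.y corresponds to the byte (y << 5) | x.

```python
from typing import Optional, Sequence


def d_code(x: int, y: int) -> int:
    """Return the 8-bit value of the 8b/10b data character Dx.y."""
    return (y << 5) | x


TS1_ID = d_code(10, 2)           # D10.2, expected in TS1 Symbols 6-15
TS1_ID_INVERTED = d_code(21, 5)  # D21.5, seen when D+ and D- are swapped
TS2_ID = d_code(5, 2)            # D5.2, expected in TS2 Symbols 6-15
TS2_ID_INVERTED = d_code(26, 5)  # D26.5, seen when D+ and D- are swapped


def lane_polarity_inverted(symbols: Sequence[int]) -> Optional[bool]:
    """Inspect Symbols 6-15 of a 16-symbol training set.

    Returns False for normal polarity, True for inverted polarity, and
    None when the identifier symbols match neither TS1 nor TS2.
    """
    ident = list(symbols[6:16])
    if len(ident) != 10:
        return None
    for normal, inverted in ((TS1_ID, TS1_ID_INVERTED), (TS2_ID, TS2_ID_INVERTED)):
        if all(s == normal for s in ident):
            return False
        if all(s == inverted for s in ident):
            return True
    return None
```

D21.5 and D26.5 are the bitwise complements of D10.2 and D5.2, which is why an inverted lane produces exactly these values.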
3.1.6.3
L0s Exit latency
The number of FTS sequences (N_FTS) sent during L0s exit is loaded from the EEPROM into an 8-bit
read-only register.
3.1.6.4
Lane-to-Lane De-Skew
A multi-lane link might have many sources of lane-to-lane skew. Although symbols are transmitted
simultaneously on all lanes, they cannot be expected to arrive at the receiver without lane-to-lane
skew. The skew can include components that are less than a bit time, whole bit times (400 ps at
2.5 Gb/s), or whole symbol times (4 ns) caused by the insert/delete operations of re-timing
repeaters. Receivers use TS1, TS2, or Skip Ordered Sets (SOS) to perform link de-skew functions.
The 82576 supports de-skew of up to 6 symbol times (24 ns).
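The timing figures above follow from the 8b/10b encoding, in which each symbol occupies 10 bit times on the wire. A short worked calculation, with illustrative constant names:

```python
BIT_TIME_PS = 400        # one bit time at 2.5 Gb/s
BITS_PER_SYMBOL = 10     # 8b/10b: each symbol is 10 bits on the wire
MAX_DESKEW_SYMBOLS = 6   # de-skew range stated for the 82576

symbol_time_ps = BIT_TIME_PS * BITS_PER_SYMBOL
max_deskew_ns = MAX_DESKEW_SYMBOLS * symbol_time_ps / 1000

print(symbol_time_ps)  # 4000 (ps), i.e. 4 ns per symbol
print(max_deskew_ns)   # 24.0 (ns)
```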
3.1.6.5
Lane Reversal
The following lane reversal modes are supported (see Figure 3-2):
• Lane configuration of x4, x2, and x1
• Lane reversal in x4 and in x2
• Degraded mode (downshift) from x4 to x2 to x1 and from x2 to x1, with one restriction - if lane
reversal is executed in x4, then downshift is only to x1 and not to x2.
Note:
The restriction requires that a x2 interface to the 82576 connect to lanes 0 and 1 of
the 82576. The PCIe Card Electromechanical specification does not allow a x2 link to be
routed to a wider connector. Therefore, a system designer is not allowed to connect a x2
link to lanes 2 and 3 of a PCIe connector. It is also recommended that, when the 82576 is
used in x2 mode on a NIC, it is connected to lanes 0 and 1 of the NIC.
Figure 3-2.
Lane Reversal Supported Modes
Configuration bits:
• EEPROM Lane Reversal Disable bit - disables lane reversal altogether. See Section 6.2.18, PCIe
Control (Word 0x1B) for the bit.
3.1.6.6
Reset
The PCIe PHY can supply core reset to the 82576. The reset can be caused by three sources:
1. Upstream move to hot reset - Inband Mechanism (LTSSM).
2. Recovery failure (LTSSM returns to detect).
3. Upstream component moves to Disable.
3.1.6.7
Scrambler Disable
The scrambler/de-scrambler functionality in the 82576 can be disabled by two mechanisms:
1. Upstream according to the PCIe specification.
2. EEPROM bit.
3.1.7
Error Events and Error Reporting
3.1.7.1
Mechanism in General
PCIe defines two error reporting paradigms: the baseline capability and the Advanced Error Reporting
(AER) capability. The baseline error reporting capabilities are required of all PCIe devices and define the
minimum error reporting requirements. The AER capability is defined for more robust error reporting
and is implemented with a specific PCIe capability structure.
Both mechanisms are supported by the 82576.
In addition, the SERR# Enable and the Parity Error bits of the legacy Command register take part in
the error reporting and logging mechanism.
Figure 3-3 shows, in detail, the flow of error reporting in the 82576.
Figure 3-3. Error Reporting Mechanism
3.1.7.2
Error Events
Table 3-12 lists the error events identified by the 82576 and the response in terms of logging,
reporting, and actions taken. Consult the PCIe specification for the effect on the PCI Status register.
Table 3-12. Response and Reporting of Error Events

Error Name              Error Events                            Default Severity      Action

PHY errors

Receiver error          • 8b/10b decode errors                  Correctable.          TLP to initiate NAK and drop data.
                        • Packet framing error                  Send ERR_CORR         DLLP to drop.

Data link errors

Bad TLP                 • Bad CRC                               Correctable.          TLP to initiate NAK and drop data.
                        • Not legal EDB                         Send ERR_CORR
                        • Wrong sequence number

Bad DLLP                • Bad CRC                               Correctable.          DLLP to drop.
                                                                Send ERR_CORR

Replay timer timeout    • REPLAY_TIMER expiration               Correctable.          Follow LL rules.
                                                                Send ERR_CORR

REPLAY NUM rollover     • REPLAY NUM rollover                   Correctable.          Follow LL rules.
                                                                Send ERR_CORR

Data link layer         • Received ACK/NACK not corresponding   Uncorrectable.        Follow LL rules.
protocol error            to any TLP                            Send ERR_FATAL

TLP errors

Poisoned TLP received   • TLP with error forwarding             Uncorrectable.        A poisoned completion is ignored and
                                                                ERR_NONFATAL          the request can be retried after
                                                                Log header            timeout. If enabled, the error is
                                                                                      reported.

Unsupported Request     • Wrong configuration access            Uncorrectable.        Send completion with UR.
(UR)                    • MRdLk                                 ERR_NONFATAL
                        • Configuration request type 1          Log header
                        • Unsupported vendor Defined type 0
                          message
                        • Not valid MSG code
                        • Not supported TLP type
                        • Wrong function number
                        • Wrong TC/VC
                        • Received target access with data
                          size > 64-bit
                        • Received TLP outside address range

Completion timeout      • Completion timeout timer expired      Uncorrectable.        Send the read request again.
                                                                ERR_NONFATAL

Completer abort         • Attempts to write to the Flash        Uncorrectable.        Send completion with CA.
                          device when writes are disabled       ERR_NONFATAL
                          (EEC.FWE=01b)                         Log header

Unexpected completion   • Received completion without a         Uncorrectable.        Discard TLP.
                          request for it (tag, ID, etc.)        ERR_NONFATAL
                                                                Log header

Receiver overflow       • Received TLP beyond allocated         Uncorrectable.        Receiver behavior is undefined.
                          credits                               ERR_FATAL

Flow control protocol   • Minimum initial flow control          Uncorrectable.        Receiver behavior is undefined. The
error                     advertisements                        ERR_FATAL             82576 doesn't report violations of
                        • Flow control update for infinite                            Flow Control initialization protocol
                          credit advertisement

Malformed TLP (MP)      • Data payload exceed                   Uncorrectable.        Drop the packet and free FC credits.
                          Max_Payload_Size                      ERR_FATAL
                        • Received TLP data size does not       Log header
                          match length field
                        • TD field value does not correspond
                          with the observed size
                        • Byte enables violations.
                        • Power management messages that
                          don't use TC0.
                        • Usage of unsupported VC

Completion with                                                 No action (already    Free FC credits.
unsuccessful                                                    done by originator of
completion status                                               completion).

Byte count integrity    When byte count isn't compatible with   No action             The 82576 doesn't check for this error
in completion process.  the length field and the actual                               and accepts these packets. This may
                        expected completion length. For                               cause a completion timeout condition.
                        example, length field is 10 (in
                        Dword), actual length is 40, but the
                        byte count field that indicates how
                        many bytes are still expected is
                        smaller than 40, which is not
                        reasonable.

3.1.7.3
Error Pollution
Error pollution can occur if error conditions for a given transaction are not isolated to the error's first
occurrence. If the physical layer detects and reports a receiver error, the same packet is not signaled
at the data link or transaction layers, so that the error does not propagate and cause subsequent
errors at upper layers.
Similarly, when the data link layer detects an error, subsequent errors that occur for the same packet
are not signaled at the transaction layer.
3.1.7.4
Completion with Unsuccessful Completion Status
A completion with unsuccessful completion status is dropped and not delivered to its destination. The
request that corresponds to the unsuccessful completion is retried by sending a new request for the
data that was not delivered.
3.1.7.5
Error Reporting Changes
The PCIe Rev. 1.1 specification defines two changes to advanced error reporting. A new Role-Based
Error Reporting bit in the Device Capabilities register is set to 1b to indicate that these changes are
supported by the 82576.
1. Setting the SERR# Enable bit in the PCI Command register also enables UR reporting (in the same
manner that the SERR# Enable bit enables reporting of correctable and uncorrectable errors). In
other words, the SERR# Enable bit overrides the UR Error Reporting Enable bit in the PCIe Device
Control register.
2. Changes in the response to some uncorrectable non-fatal errors, detected in non-posted requests
to the 82576. These are called advisory Non-fatal error cases. For each of the errors that follow, the
following behavior is defined:
a. The Advisory Non-Fatal Error Status bit is set in the Correctable Error Status register to indicate
the occurrence of the advisory error, and the corresponding Advisory Non-Fatal Error Mask bit in
the Correctable Error Mask register is checked to determine whether to proceed further with
logging and signaling.
b. If the Advisory Non-Fatal Error Mask bit is clear, logging proceeds by setting the corresponding
bit in the Uncorrectable Error Status register, based upon the specific uncorrectable error that's
being reported as an advisory error. If the corresponding uncorrectable error bit in the
Uncorrectable Error Mask register is clear, the First Error Pointer and Header Log registers are
updated to log the error, assuming they are not still occupied by a previously unserviced error.
c. An ERR_COR message is sent if the Correctable Error Reporting Enable bit is set in the Device
Control register. An ERR_NONFATAL message is not sent for this error.
The following uncorrectable non-fatal errors are considered advisory non-fatal errors:
• A completion with an Unsupported Request or Completer Abort (UR/CA) status that signals an
uncorrectable error for a non-posted request. If the severity of the UR/CA error is non-fatal, the
completer must handle this case as an advisory non-fatal error.
• When the requester of a non-posted request times out while waiting for the associated completion,
the requester is permitted to attempt to recover from the error by issuing a separate subsequent
request, or to signal the error without attempting recovery. The requester is permitted to attempt
recovery zero, one, or multiple (finite) times, but must signal the error (if enabled) with an
uncorrectable error message if no further recovery attempt is made. If the severity of the
completion timeout is non-fatal and the requester elects to attempt recovery by issuing a new
request, the requester must first handle the current error case as an advisory non-fatal error.
• When a receiver receives an unexpected completion and the severity of the unexpected completion
error is non-fatal, the receiver must handle this case as an advisory non-fatal error.
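The advisory non-fatal handling in steps a through c can be sketched as follows. This is a simplified software model, not the device's implementation: the class and field names are hypothetical stand-ins for the AER register bits named in the text, and the sketch assumes that a set Advisory Non-Fatal Error Mask bit suppresses both logging and the ERR_COR message.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class AerState:
    advisory_nonfatal_status: bool = False   # Correctable Error Status register
    advisory_nonfatal_mask: bool = False     # Correctable Error Mask register
    uncorr_status: Set[str] = field(default_factory=set)  # Uncorrectable Error Status
    uncorr_mask: Set[str] = field(default_factory=set)    # Uncorrectable Error Mask
    first_error: Optional[str] = None        # First Error Pointer (None = free)
    header_log: Optional[bytes] = None       # Header Log
    corr_reporting_enable: bool = False      # Device Control register


def handle_advisory_nonfatal(state: AerState, error: str, header: bytes) -> List[str]:
    """Apply steps a-c and return the error messages that are sent."""
    # Step a: always record the advisory error, then check its mask.
    state.advisory_nonfatal_status = True
    if state.advisory_nonfatal_mask:
        return []
    # Step b: log the specific uncorrectable error; update the First Error
    # Pointer and Header Log only if unmasked and not already occupied.
    state.uncorr_status.add(error)
    if error not in state.uncorr_mask and state.first_error is None:
        state.first_error = error
        state.header_log = header
    # Step c: ERR_COR if enabled; ERR_NONFATAL is never sent for this case.
    return ["ERR_COR"] if state.corr_reporting_enable else []
```

For example, an unexpected completion with correctable error reporting enabled yields a single ERR_COR message and leaves the error logged in the uncorrectable status bits.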
3.1.8
Performance Monitoring
The 82576 incorporates PCIe performance monitoring counters to provide common capabilities to
evaluate performance. The 82576 implements four 32-bit counters, which allow concurrent
measurements of events to be correlated, as well as sample delay and interval timers. The four 32-bit
counters can also operate as two 64-bit counters to count long intervals or payloads. Software can
reset, stop, or start the counters (all at the same time).
The list of events supported by the 82576 and the counters control bits are described in the memory
register map (Section 8.6).
Some counters operate with a threshold: the counter increments only when the monitored event
crosses a configurable threshold (for example, when the number of available credits is below a threshold).
Counters operate in the following modes:
• Count mode - The counter increments when the respective event occurred.
• Leaky bucket mode - The counter increments only when the rate of events exceeded a certain
value. See Section 3.1.8.1.
3.1.8.1
Leaky Bucket Mode
Each of the counters may be configured independently to operate in a leaky bucket mode. When in
leaky bucket mode, the following functionality is provided:
• One of four 16-bit Leaky Bucket Counters (LBC) is enabled via the LBC Enable [3:0] bits in the PCIe
Statistic Control register #1.
• The LBC is controlled by the GIO_COUNT_START, GIO_COUNT_STOP, and GIO_COUNT_RESET bits
in the PCIe Statistic Control register #1.
• The LBC increments every time the respective event occurs.
• The LBC is decremented every T ms as defined in the LBC Timer field in the PCIe Statistic Control
registers.
• When an event occurs and the value of the LBC meets or exceeds the threshold defined in the LBC
Threshold field in the PCIe Statistic Control registers, the respective statistics counter increments.
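The leaky bucket behavior listed above can be modeled in a few lines. This is a sketch under stated assumptions, not the register-level behavior: the class and method names are hypothetical, the threshold is compared after the LBC has been incremented for the current event, and the LBC is assumed to saturate at its 16-bit maximum and to stop decrementing at zero.

```python
class LeakyBucketCounter:
    """Software model of one statistics counter in leaky bucket mode."""

    LBC_MAX = 0xFFFF  # the LBC is 16 bits wide

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # LBC Threshold field
        self.lbc = 0                # leaky bucket counter
        self.statistic = 0          # statistics counter read by software

    def on_event(self) -> None:
        """Called every time the monitored event occurs."""
        self.lbc = min(self.lbc + 1, self.LBC_MAX)
        if self.lbc >= self.threshold:
            self.statistic += 1

    def on_timer_tick(self) -> None:
        """Called every T ms, as set by the LBC Timer field."""
        if self.lbc > 0:
            self.lbc -= 1
```

With a threshold of 3, isolated events drain away between timer ticks and are never counted; only a burst of at least three events within the drain period reaches the statistics counter.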
3.1.9
PCIe Power Management
Described in Section 5.4.1 - Power Management.
3.1.10
PCIe Programming Interface
Described in Section 9.0 - PCIe Programming Interface
3.2
Management Interfaces
See Chapter 10.0, System Manageability.
The 82576 provides two possible interfaces to an external BMC:
• SMBus
• NC-SI
Since the manageability sideband throughput is lower than the network link throughput, the 82576
allocates an 8 KB internal buffer for incoming network packets prior to being sent over the sideband
interface.
3.2.1
SMBus
SMBus is an optional interface for pass-through and/or configuration traffic between an external MC
and the 82576. The SMBus commands used to configure or read status from the 82576 are described in
Chapter 10.0, System Manageability.
3.2.1.1
Channel Behavior
3.2.1.1.1
SMBus Addressing
The SMBus addresses that the 82576 responds to depend on the LAN mode (teaming/non-teaming).
When the LAN is in teaming mode (fail-over), the 82576 is presented over the SMBus as one device
with one SMBus address. When the LAN ports are in non-teaming mode, the 82576 is presented
as two SMBus devices with two SMBus addresses. In dual-address mode, all pass-through
functionality is duplicated on each SMBus address, where each SMBus address is connected to a
different LAN port.
Note:
DO NOT configure both ports to the same address. When a LAN function is disabled, the
corresponding SMBus address is not presented to the external BMC.
The SMBus address method is defined through the SMBus Addressing Mode bit in the EEPROM. The
SMBus addresses are set by SMBus Address 0 and SMBus Address 1 in the EEPROM.
Note:
If the single-address mode is set, only SMBus address 0 field is valid.
The SMBus addresses (those that are enabled from the EEPROM) can be re-assigned using the SMBus
ARP protocol.
Besides the SMBus address values, all the previously stated parameters of the SMBus (SMBus channel
selection, address mode, address enable) can be set only through EEPROM configuration. The EEPROM
is read on the 82576 at power-up, resets, and other cases described in Section 4.2.
All SMBus addresses should be in Network Byte Order (NBO); most significant byte first.
3.2.1.1.2
SMBus Notification Methods
The 82576 supports three methods of notifying the external BMC that it has information for the
BMC to read:
• SMBus alert.
• Asynchronous notify.
• Direct receive.
The notification method that is used by the 82576 can be configured from the SMBus using the Receive
Enable command. The default method is set from the EEPROM in the PT init field.
The following events cause the 82576 to send a notification event to the external BMC:
• Receiving a LAN packet that was designated to the BMC.
• Receiving a request status command from the MC initiates a status response (see
Section 10.5.10.2.2).
• A status change has occurred and the 82576 is configured to notify the external MC upon one of the
status changes. The following events trigger a notification to the BMC:
— A change in any of the Status Data 1 bits of the Read Status command (see
Section 10.5.10.2.2 for description of this command).
— A Circuit Breaker indication - indicates matching of a Circuit Breaker filter (or of its counter/
threshold).
The external MC might be hung and unable to respond to the SMBus notification. The 82576 has a
time-out value defined in the EEPROM (see Section 6.8) to avoid hanging while waiting for the
notification response. If the MC does not respond before the timeout expires, the notification is
de-asserted.
3.2.1.1.2.1
SMBus Alert and Alert Response Method
The SMBus Alert# signal is an additional SMBus signal that acts as an asynchronous interrupt signal to
an external SMBus master. The 82576 asserts this signal each time it has a message that it needs the
external MC to read and if the chosen notification method is the SMBus-alert method. Note that the
SMBus alert is an open-drain signal, which means that other devices besides the 82576 can be
connected on the same alert pin and the external MC needs a mechanism to distinguish between the
alert sources as described:
The external MC can respond to the alert by issuing an ARA cycle (see Table 3-13) to detect the alert
source device. The 82576 responds to the ARA cycle (if it was the SMBus alert source) and de-asserts
the alert when the ARA cycle completes. Following the ARA cycle, the external MC issues a Read
command to retrieve the 82576 message.
Some BMCs do not implement ARA cycle transactions. These BMCs respond to an alert by issuing a
Read command to the 82576 (0xC0/0xD0 or 0xDE). The 82576 always responds to a Read command,
even if it is not the source of the notification. The default response is a status transaction. If the 82576
is the source of the SMBus alert, it replies to the read transaction and de-asserts the alert after the
command byte of the read transaction.
The ARA cycle is an SMBus receive byte transaction to SMBus Address 0001-100b. Note that the ARA
transaction does not support PEC. The ARA transaction format is as follows:
Table 3-13. SMBus ARA Cycle Format

Bits    1   7                        1    1   8                                   1   1
Field   S   Alert Response Address   Rd   A   Slave Device Address                A   P
Value       0001 100                 1    0   Manageability Slave SMBus Address   1
Note:
Since the master-receiver (BMC receiver) is involved in the transaction, it must signal the
end of data by generating a NACK (a ‘1’ in the ACK bit position) on the slave device address
byte that was clocked out. This releases the data line to allow the master to generate a stop
condition.
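The first byte of the ARA cycle in Table 3-13 can be derived as shown below. The function names are hypothetical, and the decoding of the returned byte assumes the usual SMBus convention that the 7-bit device address occupies the upper seven bits of the byte.

```python
ARA_ADDRESS = 0b0001100  # 7-bit Alert Response Address (0x0C)
READ_BIT = 1             # Rd


def ara_first_byte() -> int:
    """Byte sent by the master after the start condition."""
    return (ARA_ADDRESS << 1) | READ_BIT


def alert_source(device_address_byte: int) -> int:
    """7-bit address of the device that answered the ARA cycle."""
    return (device_address_byte >> 1) & 0x7F


print(hex(ara_first_byte()))  # 0x19
```

The master NACKs the returned address byte, as described in the note above, and then issues a Read command to the address it decoded.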
3.2.1.1.2.2
Asynchronous Notify Method
When configured to asynchronous notify method, the 82576 acts as SMBus master and notifies the
external MC by issuing a modified form of the write word transaction. The asynchronous notify
transaction SMBus address and data payload is configured using the Receive Enable command or using
the EEPROM defaults. Note that the asynchronous notify method is not protected by a PEC byte.
Table 3-14. Asynchronous Notify Command Format

Bits    1   7                   1    1   7                                   1   1
Field   S   Target Address      Wr   A   Sending Device Address                  A
Value       BMC Slave Address   0    0   Manageability Slave SMBus Address   0   0

Bits    8               1   8                1   1
Field   Data Byte Low   A   Data Byte High   A   P
Value   Interface       0   Alert Value      0
The target address and the low and high data bytes are taken from the Receive Enable command (see
Section 10.5.10.2.6) or from the EEPROM configuration (see Section 6.8).
3.2.1.1.2.3
Direct Receive Method
If configured, the 82576 has the capability to send the message it needs to transfer to the external MC
as a master over the SMBus, instead of alerting the BMC, and waiting for it to read the message.
Table 3-15 shows the message. Note that the “F”, “L” and command fields in the message are the same
as the op-code returned by the 82576 in response to a MC Receive TCO Packet Block read command
(See Section 10.5.10.2.1). The rules for the “F” and “L” flags are also the same as used in the Receive
TCO Packet Block Read command.
Table 3-15. Direct Receive Transaction Format

Bits    1   7                   1    1   1            1           6                     1
Field   S   Target Address      Wr   A   F            L           Command               A
Value       BMC Slave Address   0    0   First Flag   Last Flag   Receive TCO Command   0
                                                                  01 0000b

Bits    8            1   8             1   ...   1   8             1   1
Field   Byte Count   A   Data Byte 1   A   ...   A   Data Byte N   A   P
Value   N            0                 0   ...   0                 0

3.2.1.1.3
Receive TCO Flow
The 82576 is used as a channel for receiving packets from the network link and passing them to the
external BMC. The MC can configure the 82576 to pass specific packets to the MC as described in
Section 10.5.10.1.5. Once a full packet is received from the link and identified as a manageability
packet that should be transferred to the BMC, the 82576 starts the receive TCO transaction flow to the
BMC.
The maximum SMBus fragment length is defined in the EEPROM (see Section 6.8.2). The 82576 uses
the SMBus notification method to notify the MC that it has data to deliver. The packet is divided into
fragments, and the 82576 uses the maximum allowed fragment size for each fragment. The last
fragment of the packet transfer is always the status of the packet. As a result, the packet is transferred
in at least two fragments. The data of the packet is transferred in the Receive TCO LAN packet
transaction as described in Section 10.5.10.2.1.
When SMBus alert is selected as the MC notification method, the 82576 notifies the MC on each
fragment of a multi-fragment packet. When asynchronous notify is selected as the MC notification
method, the 82576 notifies the MC only on the first fragment of a received packet. It is the BMC's
responsibility to read the full packet, including all the fragments.
Any timeout on the SMBus notification results in discarding the entire packet. Any NACK by the MC on
one of the 82576 receive bytes also causes the packet to be silently discarded.
The maximum size of the received packet is limited by the 82576 hardware to 1536 bytes. Packets
larger than 1536 bytes are silently discarded. Packets of up to 1536 bytes are processed by the
82576.
Note:
When the RCV_EN bit is cleared, all receive TCO functionality is disabled, not just the
packets that are directed to the MC (also auto ARP packets).
3.2.1.1.4
Transmit TCO Flow
The 82576 is used as a channel for transmitting packets from the external MC to the network link. The
network packet is transferred from the external MC over the SMBus, and then, when fully received by
the 82576, is transmitted over the network link.
In dual-address mode, each SMBus address is connected to a different LAN port. A packet received in
SMBus transactions using SMBus Address 0 is transmitted to the network through LAN port 0; a packet
received on SMBus Address 1 is transmitted through LAN port 1. In single-address mode, the
transmit port is selected according to the fail-over algorithm (see Section 3.2.1.1.9).
The 82576 supports packets up to the Ethernet packet length (1536 bytes). SMBus transactions can be
up to 240 bytes in length, which means that packets can be transferred over the SMBus in more than
one fragment. In each command byte there are the F and L bits. When the F bit is set, it means that
this is the first fragment of the packet; L means that it is the last fragment of the packet.
Note:
When both flags are set, the entire packet is in one fragment.
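The fragmentation rule above can be sketched from the BMC side. Several details are assumptions made for illustration: the function name and the opcode value used in the usage example are hypothetical, the F and L flags are placed in bits 7 and 6 of the command byte (mirroring the layout in Table 3-15), and the 240-byte limit is applied to the packet data carried by each transaction.

```python
from typing import List, Tuple

MAX_FRAGMENT = 240   # assumed maximum packet bytes per SMBus transaction
MAX_PACKET = 1536    # larger packets are discarded by the 82576
F_FLAG = 0x80        # assumed position of the F (first fragment) bit
L_FLAG = 0x40        # assumed position of the L (last fragment) bit


def fragment_packet(packet: bytes, opcode: int) -> List[Tuple[int, bytes]]:
    """Split a packet into (command byte, data) pairs for block writes."""
    if not 0 <= opcode < 0x40:
        raise ValueError("opcode must fit in the 6-bit command field")
    if not 0 < len(packet) <= MAX_PACKET:
        raise ValueError("packet length must be 1..1536 bytes")
    chunks = [packet[i:i + MAX_FRAGMENT] for i in range(0, len(packet), MAX_FRAGMENT)]
    fragments = []
    for index, chunk in enumerate(chunks):
        command = opcode
        if index == 0:
            command |= F_FLAG            # first fragment of the packet
        if index == len(chunks) - 1:
            command |= L_FLAG            # last fragment of the packet
        fragments.append((command, chunk))
    return fragments
```

A 500-byte packet yields three fragments (240, 240, and 20 bytes) with F set only on the first and L set only on the last, while a 64-byte packet yields a single fragment with both flags set.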
The packet is sent over the network link, only after all its fragments are received correctly over the
SMBus.
The 82576 calculates the L2 CRC on the transmitted packet and adds its four bytes at the end of the
packet. Any other packet field (such as XSUM) must be calculated and inserted by the external MC (the
82576 does not change any field in the transmitted packet, besides adding padding and CRC bytes).
Note:
If the packet sent by the MC is larger than 1536 bytes, then the packet is silently discarded by
the 82576.
The minimum packet length defined by the 802.3 specification is 64 bytes. The 82576 pads packets
that are shorter than 64 bytes to meet the specification requirements. There is one exception: when
the packet sent over the SMBus is shorter than 32 bytes, the external MC must pad it to at least
32 bytes. The padding bytes should be zero.
Note:
Packets that are smaller than 32 bytes (including padding) are silently discarded by the
82576.
If the network link goes down at any time while the 82576 is receiving the packet, the 82576 silently
discards the packet. Note that a link-down event during the transfer of a packet over the SMBus
(after it was received from the network) does not stop the operation.
The transmit SMBus transactions are described in Section 10.5.5.2.
3.2.1.1.5
Transmit Errors in Sequence Handling
When a packet is transferred over the SMBus from the MC to the 82576, the F and L flags must follow
specific rules. The F flag indicates that the transaction contains the first fragment of the packet; the
L flag indicates that the transaction contains the last fragment of the packet.
The following table lists the different options regarding the flags in transmit packet transactions:
Table 3-16. Flags in Transmit Packet Transactions

Previous   Current     Action/Notes
Last       First       Accepts both.
Last       Not First   Error for current transaction. Current transaction is discarded and an abort
                       status is asserted.
Not Last   First       Error for previous transaction. Previous transaction (until previous first) is
                       discarded. Current packet is processed. No abort status is asserted.
Not Last   Not First   Processes the current transaction.
Note that since every other Block Write command in the TCO protocol has both the F and L flags off,
such a command flushes any pending transmit fragments that were previously received. In other words,
when running the TCO transmit flow, no other block write transactions are allowed between the fragments.
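The four cases of Table 3-16 can be expressed as a small reassembly routine. This is an illustrative model only: the class and method names are hypothetical, and the state before the first transaction is assumed to behave as if the previous transaction had its L flag set.

```python
from typing import Optional


class TxFragmentChecker:
    """Tracks the F/L flags of successive transmit-packet transactions."""

    def __init__(self) -> None:
        # None means no packet is pending: the previous transaction had L set.
        self.pending: Optional[bytearray] = None
        self.abort = False

    def on_fragment(self, first: bool, last: bool, data: bytes) -> Optional[bytes]:
        """Return the complete packet when its last fragment arrives, else None."""
        if self.pending is None and not first:
            # Previous = Last, current = Not First: discard current, assert abort.
            self.abort = True
            return None
        if first:
            # Previous = Not Last, current = First: pending fragments are dropped
            # without an abort. Previous = Last, current = First: normal start.
            self.pending = bytearray()
        self.pending.extend(data)
        if last:
            packet, self.pending = bytes(self.pending), None
            return packet
        return None
```

A checker fed the fragments (F), (none), (L) returns the reassembled packet on the third call; feeding it a fragment without F right after a completed packet sets the abort status instead.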
3.2.1.1.6
TCO Command Aborted Flow
Bit 6 in the first byte of the status returned from the 82576 to the external MC indicates that there
was a problem with previous SMBus transactions or with the completion of the operation requested in
a previous transaction.
An abort can be asserted for any of the following reasons:
• Any error in the SMBus protocol (NACK, SMBus timeouts).
• Any error in compatibility between required protocols to specific functionality (Receive Enable
command with byte count not 1/14 as defined in the command specification).
• If the 82576 does not have space to store the transmit packet from the MC (in its internal buffer
before sending it to the link). In this case, the entire transaction completes, but the packet is
discarded and the MC is notified about it through the Abort bit.
• Error in the F/L bit sequence during multi-fragment transactions.
• The Abort bit is asserted after an internal reset to the 82576 manageability unit.
Note:
An abort in the status does not always imply that the last transaction of the sequence was
incorrect. There is a gap between the time the status is read from the 82576 and the time
the transaction occurred.
3.2.1.1.7
Concurrent SMBus Transactions
Concurrent SMBus write transactions are not permitted. Once a transaction is started, it must be
completed before an additional transaction can be initiated.
3.2.1.1.8
SMBus ARP Functionality
The 82576 supports SMBus ARP protocol as defined in the SMBus 2.0 specification. The 82576 is a
persistent slave address device meaning that its SMBus address is valid after power-up and loaded
from the EEPROM. The 82576 supports all SMBus ARP commands defined in the SMBus specification,
both general and directed.
Note:
SMBus ARP can be disabled through EEPROM configuration (See Section 6.8.3).
SMBus-ARP transactions are described in Section 10.5.5.2.
3.2.1.1.8.1
SMBus ARP in Dual-/Single-Address Mode
The 82576 operates either in single SMBus address mode or in dual SMBus address mode. These
modes reflect on its SMBus-ARP behavior.
When operating in single-address mode, the 82576 presents itself on the SMBus as one device and
responds to SMBus-ARP as one device only. In this case, its SMBus address is SMBus Address 0 as
defined in the EEPROM SMBus ARP addresses word (see Section 6.7.32 and Section 6.7.33). The 82576
has only one AR flag and one AV flag. The vendor specific ID, which is the MAC address of the LAN's
port, is taken from the port 0 address.
In dual-address mode, the 82576 responds as two SMBus devices, meaning that it has two sets of AR/
AV flags (one for each port). The 82576 responds twice to the SMBus-ARP master, one time for each
port. Both SMBus addresses are taken from the SMBus ARP addresses word of the EEPROM. The UDIDs
of the two ports differ in the vendor specific ID field, which represents the MAC address and is
therefore different for each port. It is recommended for the 82576 to first answer as port 0,
and only when the address is assigned, to start answering as port 1 to the Get UDID command.
3.2.1.1.8.2
SMBus ARP Flow
SMBus-ARP flow is based on the status of two flags:
• AV - Address Valid - This flag is set when the 82576 has a valid SMBus address.
• AR - Address Resolved - This flag is set when the 82576’s SMBus address is resolved (SMBus
address was assigned by the SMBus-ARP process).
Note:
These flags are internal to the 82576 and are not visible to external SMBus devices.
Since the 82576 is a Persistent SMBus Address (PSA) device, the AV flag is always set, while the AR flag
is cleared after power-up until the SMBus-ARP process completes. Since the AV flag is always set, the
82576 always has a valid SMBus address.
When the SMBus master needs to start an SMBus-ARP process, it resets (in terms of ARP functionality)
all the devices on the SMBus by issuing either Prepare to ARP or Reset Device commands. When the
82576 accepts one of these commands, it clears its AR flag (if set from a previous SMBus-ARP process),
but not its AV flag (the current SMBus address remains valid until the end of the SMBus ARP process).
A cleared AR flag means that the 82576 answers the subsequent SMBus ARP transactions that
are issued by the master. The SMBus master then issues a Get UDID command (general or directed) to
identify the devices on the SMBus. The 82576 always responds to the directed command and responds
to the general command only if its AR flag is not set. After the Get UDID command, the master assigns
the 82576's SMBus address by issuing an Assign Address command. The 82576 checks whether the UDID
matches its own UDID, and if they match, it switches its SMBus address to the address assigned by the
command (byte 17). After accepting the Assign Address command, the AR flag is set and from this
point on (as long as the AR flag is set), the 82576 does not respond to the Get UDID general command,
while all other commands should be processed even if the AR flag is set. The 82576 stores the SMBus
address that was assigned in the SMBus-ARP process in its EEPROM, so after the next power-up, it
returns to its assigned SMBus address.
Figure 3-4 shows the SMBus-ARP behavior of the 82576.
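The flag handling described in this section can also be summarized as a small state model. The sketch below is a simplification with hypothetical class and method names; it models one SMBus device (one port) and leaves out the EEPROM write-back and the SMBus transaction framing.

```python
from typing import Optional


class SmbusArpDevice:
    """Model of the AV/AR flags of one persistent-address SMBus device."""

    def __init__(self, udid: bytes, eeprom_address: int) -> None:
        self.udid = udid
        self.address = eeprom_address  # loaded from the EEPROM at power-up
        self.av = True                 # Address Valid: always set for a PSA device
        self.ar = False                # Address Resolved: cleared after power-up

    def prepare_to_arp(self) -> None:
        """Prepare to ARP or Reset Device: clear AR, keep AV and the address."""
        self.ar = False

    def get_udid(self, directed: bool) -> Optional[bytes]:
        """Answer directed requests always, general requests only while AR is clear."""
        if directed or not self.ar:
            return self.udid
        return None

    def assign_address(self, udid: bytes, new_address: int) -> bool:
        """Accept the new address only if the UDID matches, then set AR."""
        if udid != self.udid:
            return False
        self.address = new_address
        self.ar = True
        return True
```

In dual-address mode the 82576 would correspond to two such objects, one per port, each with its own UDID and flag pair.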
Figure 3-4. SMBus ARP Flow
3.2.1.1.8.3
SMBus ARP UDID Content
The Unique Device Identifier (UDID) provides a mechanism to isolate each device for the purpose of
address assignment. Each device has a unique identifier. The 128-bit number is comprised of the
following fields:
Table 3-17.
Unique Device Identifier (UDID)
Field | Size | Value
Device Capabilities (MSB) | 1 byte | See below
Version / Revision | 1 byte | See below
Vendor ID | 2 bytes | 0x8086
Device ID | 2 bytes | 0x10C9
Interface | 2 bytes | 0x0004
Sub-system Vendor ID | 2 bytes | 0x0000
Sub-system Device ID | 2 bytes | 0x0000
Vendor Specific ID (LSB) | 4 bytes | See below
Where:
• Vendor ID — The device manufacturer's ID as assigned by the SBS Implementers' Forum or the PCI
SIG — Constant value: 0x8086.
• Device ID — The device ID as assigned by the device manufacturer (identified by the Vendor ID
field) - Constant value: 0x10C9.
• Interface — Identifies the protocol layer interfaces supported over the SMBus connection by the
device - In this case, SMBus Version 2.0 - Constant value: 0x0004.
• Sub-system Fields — These fields are not supported and return zeros.
Device Capabilities: Dynamic and Persistent Address, PEC Support bit:
Table 3-18.
Dynamic and Persistent Address, PEC Support bit
Bit | Field | Value
7 (MSB) | Address Type | 0b
6 | Address Type | 1b
5 | Reserved (0) | 0b
4 | Reserved (0) | 0b
3 | Reserved (0) | 0b
2 | Reserved (0) | 0b
1 | Reserved (0) | 0b
0 (LSB) | PEC Supported | 0b
Version/Revision: UDID Version 1, Silicon Revision:
Table 3-19.
Version/Revision: UDID Version 1, Silicon Revision
Bit | Field | Value
7 (MSB) | Reserved (0) | 0b
6 | Reserved (0) | 0b
5:3 | UDID Version | 001b
2:0 (LSB) | Silicon Revision ID | See below
Silicon Revision ID:
Table 3-20.
Silicon Revision ID
Silicon Version | Revision ID
A1 | 001b
A1/B0 | 001b
C0 | 010b
Vendor Specific ID - Four LSB bytes of the 82576’s Ethernet MAC address. The 82576’s Ethernet
address is taken from words 0x0-0x2 in the EEPROM. Note that in the 82576 there are two MAC addresses
(one for each port). Bit 0 of the port 1 MAC address has the inverted value of bit 0 from the EEPROM.
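As an illustration of how the UDID fields in this section fit together, the following C sketch assembles the 16-byte value from the constants listed above. It is a sketch under stated assumptions: the function name build_udid is hypothetical, index 0 of the output array is taken to be the most significant byte, and the MAC array is assumed to be indexed the same way as the "MAC Address, byte n" labels in Table 3-21.

```c
#include <stdint.h>

/* Builds the 16-byte UDID, most significant byte first (index 0 = MSB).
 * ASSUMPTION: mac[n] corresponds to "MAC Address, byte n" in Table 3-21. */
static void build_udid(uint8_t udid[16], uint8_t silicon_rev,
                       const uint8_t mac[6])
{
    udid[0]  = 0x40;                  /* Device Capabilities: bit 7 = 0, bit 6 = 1, no PEC */
    udid[1]  = (uint8_t)((0x1u << 3) | (silicon_rev & 0x7u)); /* UDID version 001b, revision */
    udid[2]  = 0x80; udid[3]  = 0x86; /* Vendor ID 0x8086 */
    udid[4]  = 0x10; udid[5]  = 0xC9; /* Device ID 0x10C9 */
    udid[6]  = 0x00; udid[7]  = 0x04; /* Interface 0x0004 */
    udid[8]  = 0x00; udid[9]  = 0x00; /* Sub-system Vendor ID (not supported) */
    udid[10] = 0x00; udid[11] = 0x00; /* Sub-system Device ID (not supported) */
    udid[12] = mac[3];                /* Vendor Specific ID, MSB */
    udid[13] = mac[2];
    udid[14] = mac[1];
    udid[15] = mac[0];                /* Vendor Specific ID, LSB */
}
```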
Table 3-21.
Vendor Specific ID
Field | Size
MAC Address, byte 3 (MSB) | 1 byte
MAC Address, byte 2 | 1 byte
MAC Address, byte 1 | 1 byte
MAC Address, byte 0 (LSB) | 1 byte
3.2.1.1.9
LAN Fail-Over Through SMBus
In fail-over mode, the 82576 determines which ports are used for transmit and receive (according to
the configuration). LAN fail-over is tied to the SMBus addressing mode. When the SMBus is in dual-address mode, the 82576 does not activate its fail-over mechanism (ignores the fail-over register) and
operates using individual LAN ports. When the SMBus is in single-address mode or in pass-through
mode, the 82576 operates in fail-over mode. See Section 10.5.11.
3.2.2
NC-SI
The NC-SI interface in the 82576 is a connection to an external MC defined by the DMTF NC-SI
protocol. It operates as a single interface with an external BMC, where all traffic between the 82576
and the MC flows through the interface.
3.2.2.1
Electrical Characteristics
The 82576 complies with the electrical characteristics defined in the NC-SI specification. However, the
82576 pads are not 5V tolerant and require that signals conform to 3.3V signaling.
The 82576 NC-SI behavior is configured by the 82576 on power-up:
• The 82576 provides an NC-SI clock output if enabled by the NC-SI Clock Direction EEPROM bit. The
default value is to use an external clock source as defined in the NC-SI specification.
• The output driver strength for the NC-SI_CLK_OUT pad is configured by the EEPROM NC-SI Clock
Pad Drive Strength bit (default = 0b).
• The output driver strength for the NC-SI output signals (NC-SI_DV & NC-SI_RX) is configured by
the EEPROM NC-SI Data Pad Drive Strength bit (default = 0b).
• The Multi-Drop NC-SI EEPROM bit defines the NC-SI topology (point-to-point or multi-drop; the
default is point-to-point).
The 82576 can provide an NC-SI clock output as previously mentioned. The NC-SI clock input (NC-SI_CLK_IN) serves as an NC-SI input clock in either case. That is, if the 82576 provides an NC-SI
output clock, the platform is required to route it back through the NC-SI clock input with the correct
latency. See the Electrical chapter for more details.
The 82576 dynamically drives its NC-SI output signals (NC-SI_DV and NC-SI_RX) as required by the
sideband protocol:
• On power-up, the 82576 floats the NC-SI outputs
• If the 82576 operates in point-to-point mode, then the 82576 starts driving the NC-SI outputs at
some time following power-up
• If the 82576 operates in a multi-drop mode, the 82576 drives the NC-SI outputs as configured by
the BMC.
3.2.2.2
NC-SI Transactions
The NC-SI link supports both pass-through traffic between the MC and the 82576 LAN functions, as well
as configuration traffic between the MC and the 82576 internal units as defined in the NC-SI protocol.
See
3.3
Flash / EEPROM
3.3.1
EEPROM Interface
3.3.1.1
General Overview
The 82576 uses an EEPROM device for storing product configuration information. The EEPROM is
divided into three general regions:
• Hardware accessed - Loaded by the 82576 after power-up, PCI reset de-assertion,
D3 ->D0 transition, or a software-commanded EEPROM read (CTRL_EXT.EE_RST).
• Manageability firmware accessed - Loaded by the 82576 in pass-through mode after power-up or
firmware reset.
• Software accessed - Used only by software. The meaning of these registers, as listed here, is a
convention for software only and is ignored by the 82576.
Table 3-22 lists the structure of the EEPROM image in the 82576.
Table 3-22.
EEPROM Structure
Address | Content
0x0 – 0x9 | MAC address and software area
0xA – 0x2F | Hardware area (+ pointer to analog configuration)
0x30 – 0x3F | PXE area
0x40 – 0x4F | Reserved
0x50 – 0x5A | FW pointers
… | Firmware structures
… | VPD area
… | Analog configuration (PCIe/PHY/PLL/SerDes structures)
The EEPROM mapping is described in Section 6.0.
3.3.1.2
EEPROM Device
The EEPROM interface supports an SPI interface and expects the EEPROM to be capable of 2 MHz
operation. The 82576 is compatible with many sizes of 4-wire serial EEPROM devices. Different EEPROM
sizes have differing numbers of address bits (8 bits or 16 bits). Software must take the address width
into account when doing direct access.
See Section 11.5.2, EEPROM Device Options.
3.3.1.3
Software Accesses
The 82576 provides two different methods for software access to the EEPROM. It can either use the
built-in controller to read the EEPROM or access the EEPROM directly using the EEPROM's 4-wire
interface.
In addition, the VPD area of the EEPROM can be accessed via the VPD capability structure of the PCIe.
Software can use the EEPROM Read (EERD) register to cause the 82576 to read a word from the
EEPROM that the software can then use. To do this, software writes the address to read to the Read
Address (EERD.ADDR) field and simultaneously writes a 1b to the Start Read bit (EERD.START). The 82576
reads the word from the EEPROM, sets the Read Done bit (EERD.DONE), and puts the data in the Read
Data field (EERD.DATA). Software can poll the EEPROM Read register until it sees the Read Done bit set
and then uses the data from the Read Data field. Any words read this way are not written to the
82576's internal registers.
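The EERD read sequence above can be sketched in C as follows. This is a minimal illustration, not driver code: the field positions in the macros are assumptions made for the example and must be taken from the EERD register description, and the function names and the polling limit are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED field positions, for illustration only. */
#define EERD_START      (1u << 0)
#define EERD_DONE       (1u << 1)
#define EERD_ADDR_SHIFT 2
#define EERD_ADDR_MASK  0x3FFFu
#define EERD_DATA_SHIFT 16

/* Value to write: word address and START in a single register write. */
static uint32_t eerd_start_cmd(uint16_t word_addr)
{
    return ((uint32_t)(word_addr & EERD_ADDR_MASK) << EERD_ADDR_SHIFT)
           | EERD_START;
}

/* Starts a read and polls EERD until DONE is set.
 * Returns false if DONE is not seen within max_polls reads. */
static bool eerd_read_word(volatile uint32_t *eerd, uint16_t word_addr,
                           uint16_t *data, unsigned max_polls)
{
    *eerd = eerd_start_cmd(word_addr);
    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t v = *eerd;
        if (v & EERD_DONE) {
            *data = (uint16_t)(v >> EERD_DATA_SHIFT);
            return true;
        }
    }
    return false;
}
```

A bounded poll is used here so that a missing or unresponsive EEPROM does not hang the caller; the appropriate limit depends on the platform.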
Software can also directly access the EEPROM's 4-wire interface through the EEPROM/Flash Control
(EEC) register. It can use this for reads, writes, or other EEPROM operations.
To directly access the EEPROM, software should follow these steps:
1. Write a 1b to the EEPROM Request bit (EEC.EE_REQ).
2. Read the EEPROM Grant bit (EEC.EE_GNT) until it becomes 1b. It remains 0b as long as the
hardware is accessing the EEPROM.
3. Write or read the EEPROM using the direct access to the 4-wire interface as defined in the EEPROM/
Flash Control and Data (EEC) register. The exact protocol used depends on the EEPROM placed on
the board and can be found in the appropriate datasheet.
4. Write a 0b to the EEPROM Request bit (EEC.EE_REQ).
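Steps 1, 2, and 4 of the list above form a request/grant handshake around the bit-bang access in step 3. The following C sketch shows one way to structure it; the bit positions of EE_REQ and EE_GNT are assumptions for illustration, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit positions, for illustration only. */
#define EEC_EE_REQ (1u << 6)
#define EEC_EE_GNT (1u << 7)

/* Steps 1 and 2: request the EEPROM and wait for the grant.
 * Withdraws the request and returns false if no grant is seen. */
static bool eec_acquire(volatile uint32_t *eec, unsigned max_polls)
{
    *eec |= EEC_EE_REQ;
    for (unsigned i = 0; i < max_polls; i++) {
        if (*eec & EEC_EE_GNT)
            return true;
    }
    *eec &= ~EEC_EE_REQ;
    return false;
}

/* Step 4: release the EEPROM once the bit-bang access is complete. */
static void eec_release(volatile uint32_t *eec)
{
    *eec &= ~EEC_EE_REQ;
}
```

Step 3 (the actual 4-wire transfer) is omitted because it depends on the EEPROM device placed on the board.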
Finally, software can cause the 82576 to re-read the hardware accessed fields of the EEPROM (setting
the 82576's internal registers appropriately) by writing a 1b to the EEPROM Reset bit of the Extended
Device Control register (CTRL_EXT.EE_RST).
Note:
If the EEPROM does not contain a valid signature (see Section 3.3.1.4), the 82576 assumes
16-bit addressing. In order to access an EEPROM that requires 8-bit addressing, software
must use the direct access mode.
3.3.1.4
Signature Field
The 82576 determines if an EEPROM is present by attempting to read it. The 82576 first reads the
EEPROM Sizing and Protected Fields word at address 0x12. It checks the signature value for bits 15 and
14. If bit 15 is 0b and bit 14 is 1b, it considers the EEPROM to be present and valid and reads additional
EEPROM words and then programs its internal registers based on the values read. Otherwise, it ignores
the values it reads from that location and does not read any other words as part of the auto-read
process. However, the EEPROM is still accessible to software.
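The signature test described above reduces to a two-bit comparison on word 0x12. A minimal C sketch, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when bits 15:14 of EEPROM word 0x12 equal 01b
 * (bit 15 = 0b, bit 14 = 1b), the valid-signature pattern. */
static bool eeprom_signature_valid(uint16_t word_0x12)
{
    return ((word_0x12 >> 14) & 0x3u) == 0x1u;
}
```

Under this check a word that reads as 0xFFFF or 0x0000 is treated as having no valid signature.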
3.3.1.5
Protected EEPROM Space
The 82576 provides a mechanism to hide an area of the EEPROM from the host. The hidden area cannot
be accessed via the EEPROM registers in the CSR space. It can be accessed only by the manageability
subsystem. This area is located at the end of the EEPROM memory. Its size is defined by the HEPSize
field in EEPROM word 0x12. Note that the current 82576 firmware does not use this mechanism.
A mechanism to protect part of the EEPROM from host writes is also provided. This mechanism is
controlled by words 0x2D and 0x2C, which define the start and the end of the read-only area.
3.3.1.5.1
Initial EEPROM Programming
In most applications, initial EEPROM programming is done directly on the EEPROM pins. Nevertheless, it
is desired to enable existing software utilities (accessing the EEPROM via the host interface) to initially
program the entire EEPROM without breaking the protection mechanism. Following a power-up
sequence, the 82576 reads the hardware initialization words in the EEPROM. If the signature in word
0x12 does not equal 01b, the EEPROM is assumed to be non-programmed. There are two effects of a non-valid signature:
• The 82576 does not read any further EEPROM data and sets the relevant registers to default.
• The 82576 enables access to any location in the EEPROM via the EEPROM CSR registers.
3.3.1.5.2
Activating the Protection Mechanism
Following initialization, the 82576 reads the EEPROM and turns on the protection mechanism if word
0x12 contains a valid signature (equals 01b) and word 0x12, bit 4 is set (enable protection). Once the
protection mechanism is turned on, words 0x12, 0x2C and 0x2D become write-protected, the area that
is defined by word 0x12 becomes hidden (that is, read/write protected) and the area defined by words
0x2C and 0x2D becomes write protected.
• No matter what is designated as the read only protected area, words 0x30:0x3F (used by PXE
driver) are writeable, unless they are defined as hidden.
3.3.1.5.3
Non Permitted Accessing to Protected Areas in the EEPROM
This paragraph refers to EEPROM accesses via the EEC (bit banging) or EERD (parallel read access)
registers. Following a write access to the protected areas in the EEPROM, hardware responds properly
on the PCIe interface but does not initiate any access to the EEPROM. Following a read access to the
hidden area in the EEPROM (as defined by word 0x12), hardware does not access the EEPROM and
returns meaningless data to the host.
Note:
Using bit banging, the SPI EEPROM can be accessed in a burst mode. For example,
providing op-code, address, and then read or write data for multiple bytes. Hardware
inhibits any attempt to access the protected EEPROM locations even in burst accesses.
Software should not access the EEPROM in a burst-write mode starting in a non-protected
area and continue to a protected one. In such a case it is not guaranteed that the write
access to any area ever takes place.
3.3.1.6
EEPROM Recovery
The EEPROM contains fields that if programmed incorrectly might affect the functionality of the 82576.
The impact can range from an incorrect setting of some function (such as LED programming), via
disabling of entire features (such as no manageability) and link disconnection, to the inability to access
the 82576 via the regular PCIe interface.
The 82576 implements a mechanism that enables recovery from a faulty EEPROM no matter what the
impact is, using an SMBus message that instructs firmware to invalidate the EEPROM.
This mechanism uses an SMBus message that the firmware is able to receive in all modes, no matter
what the content of the EEPROM is (even in diagnostic mode). After receiving this kind of message,
firmware clears the signature of the EEPROM in word 0x12 (bits 15/14 to 00b). Afterwards, the BIOS/
operating system initiates a reset to force an EEPROM auto-load process that fails in order to enable
access to the 82576.
Firmware is programmed to receive such a command only from a PCIe reset until one of the functions
changes its status from D0u to D0a. Once one of the functions moves to D0a, it can be safely assumed
that the 82576 is accessible to the host and there is no further need for this function. This reduces the
possibility of malicious software using this command as a back door and limits the time firmware must
be active in non-manageability mode.
If firmware is programmed not to do any other function apart from answering this command, it can
request clock gating immediately after one of the functions changed its status from D0u to D0a.
The command is sent on a fixed SMBus address of 0xC8. The format of the command is the SMBus
write data byte as follows:
Table 3-23.
Command Format
Function | Command | Data Byte
Release EEPROM | 0xC7 | 0xAA
Note:
This solution requires a controllable SMBus connection to the 82576.
If more than one 82576 is in a state to accept this solution, all of the 82576 devices on the
board ACK this command and accept it. A device supporting this mode does not ACK this
command if not in D0u state.
The 82576 is guaranteed to accept the command on the SMBus interface and on address
0xC8, but it might be accepted on other configured interfaces and addresses as well.
After receiving a release EEPROM command, firmware keeps its current state. It is the responsibility of
the programmer that is updating the EEPROM to send a firmware reset (if required) after the full
EEPROM update process completes.
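For illustration, the Release EEPROM command can be laid out as the three bytes of an SMBus write data byte transaction. The sketch below only fills a buffer; the function name is hypothetical, the way the bytes are handed to an SMBus controller is platform specific, and it is an assumption of this example that 0xC8 is the 8-bit address byte with the write bit clear.

```c
#include <stddef.h>
#include <stdint.h>

#define RELEASE_EEPROM_ADDR 0xC8u  /* ASSUMED to be the 8-bit write address byte */
#define RELEASE_EEPROM_CMD  0xC7u
#define RELEASE_EEPROM_DATA 0xAAu

/* Fills buf with address, command, and data byte.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t build_release_eeprom(uint8_t *buf, size_t len)
{
    if (len < 3)
        return 0;
    buf[0] = RELEASE_EEPROM_ADDR;
    buf[1] = RELEASE_EEPROM_CMD;
    buf[2] = RELEASE_EEPROM_DATA;
    return 3;
}
```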
3.3.1.7
EEPROM-Less Support
The 82576 supports EEPROM-less operation with the following limitations:
• Non-manageability mode only.
• No support for legacy Wake on LAN (magic packets).
• No support for Flash (no PXE code).
• No support for serial ID PCIe capability.
• No support for Vital Product Data (VPD).
• All initialization values usually taken from the EEPROM must be set by a custom host driver.
• Intel SW drivers do not support EEPROM-less operation.
3.3.1.7.1
Access to the EEPROM Controlled Feature
The EEARBC register enables access to registers that are not accessible via regular CSR access (such as
PCIe configuration read-only registers) by emulating the auto-read process. EEARBC contains three
strobe fields that emulate the internal strobes of the internal auto-read process. This register is
common to both functions and should be accessed only after the coordination with the other port.
Table 3-24 lists the strobe to be used when emulating a read of a specific word of the EEPROM auto-read feature.
Table 3-24.
Strobes for EEARBC Auto-Read Emulation
EEPROM Word Emulated (In Hex) | Content | Strobe for Port 0 | Strobe for Port 1
0:2 | MAC address | VALID_CORE0 | VALID_CORE1
0A/0F | Init control 1/2 | VALID_CORE0 | VALID_CORE1
0B/0C [1] | Sub-system device and vendor | VALID_COMMON | VALID_COMMON
1E/1D [2] | Dummy device ID, Rev ID | VALID_COMMON | VALID_COMMON
21 | Function control | VALID_COMMON | VALID_COMMON
0D [2] | Device ID port 0 | VALID_COMMON | N/A
11 [2] | Device ID port 1 | N/A | VALID_COMMON
10 | SDP control | N/A | VALID_CORE1
20 | SDP control | VALID_CORE0 | N/A
14 | Init control 3 | N/A | VALID_CORE1
24 | Init control 3 | VALID_CORE0 | N/A
15/16/18/19/1A/1B/22/25/26 | PCIe and NC-SI configuration | VALID_COMMON | VALID_COMMON
1C/1F | LED control port 0 | VALID_CORE0 | N/A
2A/2B [3] | LED control port 1 | N/A | VALID_CORE1
2E [3] | Watchdog configuration | VALID_CORE0 | VALID_CORE1
2F | VPD area | N/A | N/A
1. If word 0xA was accessed before the subsystem or subvendor ID are set, care must be taken that the load Subsystem IDs bit in
word 0xA is set.
2. If word 0xA was accessed before one of the device IDs is set, care must be taken that the load Device IDs bit in word 0xA is set.
3. Some of the parameters that can be configured through the EEARBC register can be directly set through regular registers, and thus
usage of this mechanism is not needed for them. Specifically, words 0x2A, 0x2B and 0x2E control only parameters that can be
set through regular registers.
3.3.2
Shared EEPROM
The 82576 uses a single EEPROM device to configure hardware default parameters for both LAN
devices, including Ethernet Individual Addresses (IA), LED behaviors, receive packet filters for
manageability, wake-up capability, etc. Certain EEPROM words are used to specify hardware
parameters that are LAN device-independent (such as those that affect circuit behavior). Other
EEPROM words are associated with a specific LAN device. Both LAN devices access the EEPROM to
obtain their respective configuration settings.
3.3.2.1
EEPROM Deadlock Avoidance
The EEPROM is a shared resource between the following clients:
• Hardware auto-read.
• Port 0 LAN driver accesses.
• Port 1 LAN driver accesses.
• Firmware accesses.
All clients can access the EEPROM using parallel access, where hardware implements the actual access
to the EEPROM. Hardware can schedule these accesses so that all clients get served without starvation.
However, software and hardware clients can access the EEPROM using bit banging. In this case, there is
a request/grant mechanism that locks the EEPROM to the exclusive usage of one client. If this client is
stuck (without releasing the lock), the other clients are not able to access the EEPROM. In order to
avoid this, the 82576 implements a timeout mechanism, which releases the grant from a client that
didn't toggle the EEPROM bit-bang interface for more than two seconds.
Note:
If an agent that was granted access to the EEPROM for bit-bang access didn't toggle the bit-bang
interface for 500 ms, it should check if it still owns the interface before continuing the
bit-banging.
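The two time limits above (500 ms for the agent's own re-check, more than two seconds for the hardware timeout) can be expressed as a small decision helper. This is an illustrative sketch; the enum and function names are hypothetical, and measuring the idle time is left to the caller.

```c
#include <stdint.h>

#define BITBANG_RECHECK_MS 500u   /* agent should re-check ownership */
#define BITBANG_TIMEOUT_MS 2000u  /* hardware releases the grant beyond this */

enum bitbang_action {
    BB_CONTINUE,       /* keep using the interface */
    BB_RECHECK_GRANT,  /* read the grant bit before continuing */
    BB_GRANT_LOST      /* grant has timed out; request access again */
};

/* idle_ms: time since this agent last toggled the bit-bang interface. */
static enum bitbang_action bitbang_idle_action(uint32_t idle_ms)
{
    if (idle_ms > BITBANG_TIMEOUT_MS)
        return BB_GRANT_LOST;
    if (idle_ms >= BITBANG_RECHECK_MS)
        return BB_RECHECK_GRANT;
    return BB_CONTINUE;
}
```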
3.3.2.2
EEPROM Map Shared Words
The EEPROM map in Section 6.1 identifies those words configuring either LAN devices or the entire
Intel® 82576 GbE Controller component as “both”. Those words configuring a specific LAN device
parameter are identified by their LAN number.
The following EEPROM words warrant additional notes specifically related to dual-LAN support:
Table 3-25.
Notes on EEPROM Words
EEPROM Word | Notes
Ethernet Address (IA) (shared between LANs) | The EEPROM specifies the IA associated with the LAN 0 device and used as the hardware default of the Receive Address registers for that device. The hardware-default IA for the LAN 1 device is automatically determined by the same EEPROM word and is set to the value of {IA LAN 0 XOR 0x010000000000}.
Initialization Control 1, Initialization Control 2 (shared between LANs) | These EEPROM words specify hardware-default values for parameters that apply a single value to both LAN devices, such as link configuration parameters required for auto-negotiation, wake-up settings, PCIe bus advertised capabilities, etc.
Initialization Control 3 (unique to each LAN) | This EEPROM word configures default values associated with each LAN device’s hardware connections, including which link mode (internal PHY, SGMII, SerDes) is used with this LAN device. Because a separate EEPROM word configures the defaults for each LAN, extra care must be taken to ensure that the EEPROM image does not specify a resource conflict.
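The LAN 1 default address rule in Table 3-25 is a single XOR on the 48-bit IA value. The C sketch below applies the constant exactly as written in the table; the function name is hypothetical, and how the 48-bit value maps to byte order on the wire is not specified here and is not assumed by the example.

```c
#include <stdint.h>

/* Derives the LAN 1 hardware-default IA from the LAN 0 IA,
 * both held as 48-bit values in the low bits of a uint64_t. */
static uint64_t lan1_default_ia(uint64_t lan0_ia)
{
    return (lan0_ia ^ 0x010000000000ULL) & 0xFFFFFFFFFFFFULL;
}
```

Because XOR is its own inverse, applying the function twice returns the LAN 0 value.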
3.3.3
Vital Product Data (VPD) Support
The EEPROM image might contain an area for VPD. This area is managed by the OEM vendor and
doesn’t influence the behavior of hardware. Word 0x2F of the EEPROM image contains a pointer to the
VPD area in the EEPROM. A value of 0xFFFF means VPD is not supported and the VPD capability doesn’t
appear in the configuration space.
The VPD area should be aligned to a Dword boundary in the EEPROM and should start in the first
1Kbyte of the EEPROM.
The maximum area size is 256 bytes but can be smaller. The VPD block is built from a list of resources.
A resource can be either large or small. The structures of these resources are listed in the following
tables.
Table 3-26.
Small Resource Structure
Offset | Content
0 | Tag = 0xxx, xyyyb (Type = Small(0), Item Name = xxxx, length = yyy bytes)
1-n | Data
Table 3-27.
Large Resource Structure
Offset | Content
0 | Tag = 1xxx, xxxxb (Type = Large(1), Item Name = xxxxxxx)
1-2 | Length
3-n | Data
The 82576 parses the VPD structure during the auto-load process (power up and PCIe reset or warm
reset) in order to detect the read-only and read/write area boundaries. The 82576 assumes the
following VPD structure:
Table 3-28.
VPD Structure
Tag | Structure Type | Length (Bytes) | Data | Resource Description
0x82 | Large | Length of identifier string | Identifier | Identifier string.
0x90 | Large | Length of RO area | RO data | VPD-R list containing one or more VPD keywords. This part is optional and might not appear.
0x91 | Large | Length of R/W area | RW data | VPD-W list containing one or more VPD keywords. This part is optional and might not appear.
0x78 | Small | N/A | N/A | End tag.
Note:
The VPD-R and VPD-W structures can be in any order.
If the 82576 doesn’t detect a value of 0x82 in the first byte of the VPD area, or the structure doesn’t
follow the description listed in Table 3-28, it assumes the area is not programmed and the entire 256-byte
area is read only. If a VPD-W tag is found after the VPD-R tag, the area defined by its size is
writable via the VPD structure. Refer to the PCI 3.0 specification (Appendix I) for details of the different
tags.
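The boundary detection described above can be sketched as a walk over the resource tags of Table 3-28. The following C code is an illustration only and is not the 82576 parsing logic: the structure and function names are hypothetical, and the two length bytes of a large resource are assumed to be stored least significant byte first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct vpd_layout {
    size_t rw_offset;  /* offset of the VPD-W data, valid when rw_len > 0 */
    size_t rw_len;     /* 0 means the whole area is treated as read only */
};

/* Returns true if the area follows the Table 3-28 structure.
 * On failure, *out describes a fully read-only area. */
static bool vpd_parse(const uint8_t *vpd, size_t size, struct vpd_layout *out)
{
    struct vpd_layout found = {0, 0};
    size_t pos = 0;

    out->rw_offset = 0;
    out->rw_len = 0;
    if (size == 0 || vpd[0] != 0x82)
        return false;                     /* area not programmed */

    while (pos < size) {
        uint8_t tag = vpd[pos];

        if (tag == 0x78) {                /* small-resource end tag */
            *out = found;
            return true;
        }
        if (tag != 0x82 && tag != 0x90 && tag != 0x91)
            return false;                 /* unexpected tag */
        if (size - pos < 3)
            return false;                 /* truncated header */

        /* ASSUMPTION: length is little-endian (LSB at offset 1). */
        size_t len = (size_t)vpd[pos + 1] | ((size_t)vpd[pos + 2] << 8);
        if (len > size - pos - 3)
            return false;                 /* resource overruns the area */
        if (tag == 0x91) {                /* VPD-W: writable region */
            found.rw_offset = pos + 3;
            found.rw_len = len;
        }
        pos += 3 + len;
    }
    return false;                         /* no end tag found */
}
```

The keyword entries inside the VPD-R and VPD-W lists are not interpreted; only the resource boundaries are needed to separate the read-only and read/write areas.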
In any case, the VPD area is accessible for read and write via the regular EEPROM mechanisms, subject to
the EEPROM protection capabilities that are enabled. For example, if VPD is in the protected area, the VPD area
is not accessible to the software device driver (parallel or serial), but is accessible through the VPD
mechanism. If the VPD area is not in the protected area, then the software device driver can access all
of it for read and write.
The VPD area can be accessed through the PCIe configuration space VPD capability structure described
in Section 9.5.4. Write accesses to a read-only area or any access outside of the VPD area via this
structure are ignored.
Note:
Write accesses to Dwords that are only partially in the read/write area are ignored. It is the
responsibility of VPD software to use the right alignment to enable a write to the entire
area.
3.3.4
Flash Interface
3.3.4.1
Flash Interface Operation
The 82576 provides two different methods for software access to the Flash.
Using the legacy Flash transactions, the Flash is read from or written to each time the host CPU
performs a read or a write operation to a memory location that is within the Flash address mapping or
after a re-boot via accesses in the space indicated by the Expansion ROM Base Address register. All
accesses to the Flash require the appropriate command sequence for the device used. Refer to the
specific Flash data sheet for more details on reading from or writing to Flash. Accesses to the Flash are
based on a direct decode of CPU accesses to a memory window defined in either:
1. The 82576's Flash Base Address register (PCIe Control register at offset 0x14 or 0x18).
2. A certain address range of the IOADDR register defined by the IO Base Address register (PCIe
Control register at offset 0x18 or 0x20).
3. The Expansion ROM Base Address register (PCIe Control register at offset 0x30).
The 82576 controls accesses to the Flash when it decodes a valid access.
Note:
Flash read accesses must always be assembled by the 82576 each time the access is
greater than a byte-wide access.
The 82576 byte reads or writes to the Flash take on the order of 2 µs. The 82576 continues
to issue retry accesses during this time.
The 82576 supports only byte writes to the Flash.
Another way for software to access the Flash is directly using the Flash's 4-wire interface through the
Flash Access (FLA) register. It can use this for reads, writes, or other Flash operations (accessing the
Flash status register, erase, etc.).
To directly access the Flash, software should follow these steps:
1. Write a 1b to the Flash Request bit (FLA.FL_REQ).
2. Read the Flash Grant bit (FLA.FL_GNT) until it becomes 1b. It remains 0b as long as there are other
accesses to the Flash.
3. Write or read the Flash using the direct access to the 4-wire interface as defined in the FLA register.
The exact protocol used depends on the Flash placed on the board and can be found in the
appropriate datasheet.
4. Write a 0b to the Flash Request bit (FLA.FL_REQ).
3.3.4.2
Flash Write Control
The Flash is write controlled by the FWE bits in the EEPROM/FLASH Control and Data (EEC) register.
Software should not attempt to write to the Flash device when writes are disabled (EEC.FWE=01b).
Behavior after such an operation is undefined and can result in component and/or system
hangs.
After sending a one-byte write to the Flash, software checks whether it can send the next byte (that is,
whether the write process in the Flash has finished) by reading the FLA register. If the FLA.FL_BUSY bit in this
register is set, the current write has not finished. If FLA.FL_BUSY is clear, software can continue
and write the next byte to the Flash.
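The write-then-poll sequence above can be sketched in C as follows. This is a minimal illustration under stated assumptions: the FL_BUSY bit position is assumed for the example, the function names are hypothetical, the window pointer stands for the memory-mapped Flash range, and any device-specific command sequence required by the Flash part, as well as enabling writes through EEC.FWE, is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLA_FL_BUSY (1u << 30)  /* ASSUMED bit position, for illustration */

/* Polls FLA until FL_BUSY is clear; false if still busy after max_polls. */
static bool flash_wait_idle(const volatile uint32_t *fla, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (!(*fla & FLA_FL_BUSY))
            return true;
    }
    return false;
}

/* Writes n bytes one at a time, waiting for each write to finish.
 * Returns the number of bytes whose completion was observed. */
static size_t flash_write_bytes(volatile uint8_t *window,
                                const volatile uint32_t *fla,
                                const uint8_t *src, size_t n,
                                unsigned max_polls)
{
    size_t i;

    for (i = 0; i < n; i++) {
        window[i] = src[i];               /* byte writes only */
        if (!flash_wait_idle(fla, max_polls))
            break;                        /* byte i did not complete in time */
    }
    return i;
}
```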
3.3.4.3
Flash Erase Control
When software needs to erase the Flash, it should set bit FLA.FL_ER in the FLA register to 1b (Flash
erase) and then set bits EEC.FWE in the EEPROM/Flash Control register to 0b.
Hardware gets this command and sends the Erase command to the Flash. The erase process finishes by
itself. Software should wait for the end of the erase process before any further access to the Flash. This
can be checked by using the Flash write control mechanism previously described.
The op-code used for erase operation is defined in the FLASHOP register.
Note:
Sector erase by software is not supported. In order to delete a sector, the serial (bit bang)
interface should be used.
3.3.5
Shared FLASH
The 82576 provides an interface to an external serial Flash/ROM memory device, as described in
Section 2.1.2. This Flash/ROM device can be mapped into memory and/or I/O address space for each
LAN device through the use of Base Address Registers (BARs). Bit 13 of the EEPROM Initialization
Control Word 3, associated with each LAN device, selectively disables/enables whether the Flash can be
mapped for each LAN device, by controlling the BAR register advertisement and write ability.
3.3.5.1
Flash Access Contention
The 82576 implements internal arbitration between Flash accesses initiated through the LAN 0 device
and those initiated through the LAN 1 device. If accesses from both LAN devices are initiated during the
same approximate size window, the first one is served first and only then the next one.
Note:
The 82576 does not synchronize between the two entities accessing the Flash. Contentions
caused by one entity reading and the other modifying the same location are possible.
To avoid this contention, accesses from both LAN devices should be synchronized using external
software synchronization of the memory or I/O transactions responsible for the access. It might be
possible to ensure contention-avoidance by the nature of the software sequence.
3.3.5.2
Flash Deadlock Avoidance
The Flash is a shared resource between the following clients:
• Port 0 LAN driver accesses.
• Port 1 LAN driver accesses.
• BIOS parallel access via expansion ROM mechanism.
• Firmware accesses.
All clients can access the flash using parallel access, where hardware implements the actual access to
the Flash. Hardware can schedule these accesses so that all the clients get served without starvation.
However, the driver and firmware clients can access the serial Flash using bit banging. In this case,
there is a request/grant mechanism that locks the serial Flash to the exclusive usage of one client. If
this client is stuck without releasing the lock, the other clients are unable to access the Flash. In order
to avoid this, the 82576 implements a time-out mechanism that releases the grant from a client that
doesn’t toggle the Flash bit-bang interface for more than two seconds.
Note:
If an agent that was granted access to the Flash for bit-bang access doesn’t toggle the bit-bang interface for 500 ms, it should check that it still owns the interface before continuing
the bit banging.
This mode is enabled by bit five in word 0xA of the EEPROM.
3.4
Configurable I/O Pins
3.4.1
General-Purpose I/O (Software-Definable Pins)
The 82576 has four software-defined pins (SDP pins) per port that can be used for miscellaneous
hardware or software-controllable purposes. These pins and their function are bound to a specific LAN
device. For example, eight SDP pins cannot be associated with a single LAN device. These pins can each
be individually configured to act as either input or output pins. The default direction of each of the
four pins is configurable via the EEPROM as well as the default value of any pins configured as outputs.
To avoid signal contention, all four pins are set as input pins until after the EEPROM configuration has
been loaded.
In addition to all four pins being individually configurable as inputs or outputs, they can be configured
for use as General-Purpose Interrupt (GPI) inputs. To act as GPI pins, the desired pins must be
configured as inputs. A separate GPI interrupt-detection enable is then used to enable rising-edge
detection of the input pin (rising-edge detection occurs by comparing values sampled at the internal
clock rate as opposed to an edge-detection circuit). When detected, a corresponding GPI interrupt is
indicated in the Interrupt Cause register.
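The sampled rising-edge detection mentioned above compares each sample of the pin with the previous one. The following C sketch models that behavior in software for illustration; the structure and function names are hypothetical and do not describe the internal hardware implementation.

```c
#include <stdbool.h>

/* Holds the pin level seen at the previous sample. */
struct gpi_detector {
    bool prev;
};

/* Call once per sample; returns true when the level went from 0 to 1. */
static bool gpi_sample(struct gpi_detector *d, bool level)
{
    bool rising = level && !d->prev;

    d->prev = level;
    return rising;
}
```

One consequence of sampling rather than using an edge-detection circuit is that a pulse shorter than the sampling period can fall between two samples and go undetected.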
The use, direction, and values of SDP pins are controlled and accessed using fields in the Device Control
(CTRL) register and Extended Device Control (CTRL_EXT) register.
The SDPs can be used for special-purpose mechanisms such as watchdog indication (see Section 3.4.2
for details) or IEEE 1588 support.
3.4.2
Software Watchdog
In some situations, it might be useful to give an indication to the manageability firmware or to external
devices that the 82576 hardware or software device driver is not functional (because, in a pass-through
NIC, the 82576 can be bypassed if it is not functional).
Once the host driver is up and determines that the hardware is functional, the driver might reset the
watchdog timer to indicate that the 82576 is functional. The driver then could re-arm the timer
periodically. If the timer is not re-armed after a programmed timeout, an interrupt could be given to
firmware and a pre-programmed SDP (SDP0[0] or SDP1[0]) could be raised. Note that an SDP
indication is shared between the ports. In addition, ICR[26] could be set to give an interrupt to the
driver when a timeout is reached.
The register controlling this feature is WDSTP. This register enables the setting of a time-out period and
the activation of this mode. Both values get their default from the EEPROM. Re-arming of the timer is
accomplished by setting the WDSWSTS.Dev_functional bit.
If the device driver needs to trigger the watchdog immediately because it suspects the 82576 is stuck,
the driver can set the WDSWSTS.Force_WD bit. It can also give firmware a reason indication by using
the WDSWSTS.stuck_reason field.
The watchdog feature provides the driver a way to indicate to the firmware that the 82576 is not
functional. Note that the watchdog feature has no logic to detect if hardware is not functional. If the
82576 is not functional, the watchdog timer expires due to the driver not being able to access the
hardware, indicating a problem.
The SDP associated with the watchdog indication is set using the CTRL.SDP0_WDE bit. In this mode,
the CTRL.SDP0_IODIR should be set to output. The CTRL.SDP0_DATA bit indicates polarity. Setting this
bit in one core causes watchdog indications for both ports on the SDP.
3.4.2.1
Watchdog rearm
After a watchdog indication has been received, the following flow should be used to rearm the
mechanism:
1. Clear WD_enable bit in the WDSTP register.
2. Clear SDP0_WDE bit in CTRL register.
3. Set SDP0_WDE bit in CTRL register.
4. Set WD_enable in the WDSTP register.
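The four-step rearm flow can be sketched as a sequence of read-modify-write accesses. The sketch below runs against an in-memory stand-in for the register file; the FakeRegisters class and the two bit positions are assumptions made for illustration and should be checked against the register descriptions before use.

```python
# Assumed bit positions; confirm against the WDSTP and CTRL register descriptions.
WDSTP_WD_ENABLE = 1 << 0   # assumed position of WDSTP.WD_enable
CTRL_SDP0_WDE = 1 << 21    # assumed position of CTRL.SDP0_WDE


class FakeRegisters:
    """In-memory stand-in for memory-mapped registers; records each write."""

    def __init__(self, initial):
        self.values = dict(initial)
        self.log = []

    def read(self, name):
        return self.values[name]

    def write(self, name, value):
        self.values[name] = value
        self.log.append((name, value))


def rearm_watchdog(regs):
    """Apply the rearm flow in order, one read-modify-write per step."""
    regs.write("WDSTP", regs.read("WDSTP") & ~WDSTP_WD_ENABLE)  # step 1
    regs.write("CTRL", regs.read("CTRL") & ~CTRL_SDP0_WDE)      # step 2
    regs.write("CTRL", regs.read("CTRL") | CTRL_SDP0_WDE)       # step 3
    regs.write("WDSTP", regs.read("WDSTP") | WDSTP_WD_ENABLE)   # step 4
```

Read-modify-write is used so that the other fields in each register, such as the time-out value in WDSTP, are preserved across the flow.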
3.4.3
LEDs
The 82576 provides four LEDs per port that can be used to indicate different statuses of the traffic. The
default setup of the LEDs is done via EEPROM words 0x1C, 0x1F for port 0 and words 0x2A, 0x2B for
port 1. This setup is reflected in the LEDCTL register of each port. Each software device driver can
change its setup individually. For each of the LEDs the following parameters can be defined:
• Mode: Defines which information is reflected by this LED. The encoding is described in the LEDCTL
register.
• Polarity: Defines the polarity of the LED.
• Blink mode: Determines whether or not the LED should blink or be stable.
In addition, the blink rate of all LEDs can be defined. The possible rates are 200 ms or 83 ms for each
phase. There is one rate for all the LEDs of a port.
3.5
Network Interfaces
3.5.1
Overview
The 82576 MAC provides a complete CSMA/CD function supporting IEEE 802.3 (10 Mb/s), 802.3u (100
Mb/s), 802.3z and 802.3ab (1000 Mb/s) implementations. The 82576 performs all of the functions
required for transmission, reception, and collision handling called out in the standards.
Each 82576 MAC can be configured to use a different media interface. The 82576 supports the following
potential configurations:
• Internal copper PHY.
• External SerDes device such as an optical SerDes (SFP or on board) or backplane connections.
• External SGMII device. This mode is used for SFP connections or external SGMII PHYs.
Selection between the various configurations is programmable via each MAC's Extended Device Control
register (CTRL_EXT.LINK_MODE bits) and default is set via EEPROM settings. Table 3-29 lists the
encoding on the LINK_MODE field for each of the modes.
Table 3-29. Link Mode Encoding
Link Mode   82576 Mode
00b         Internal PHY
01b         Reserved
10b         SGMII
11b         SerDes
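As an illustration of the Table 3-29 encoding, the following sketch decodes and updates the LINK_MODE field of a CTRL_EXT value. The field position (bits 23:22) is an assumption made for this example and should be confirmed against the CTRL_EXT register description; the function names are hypothetical.

```python
CTRL_EXT_LINK_MODE_SHIFT = 22  # assumed: LINK_MODE occupies bits 23:22
CTRL_EXT_LINK_MODE_MASK = 0x3 << CTRL_EXT_LINK_MODE_SHIFT

# Encoding taken from Table 3-29.
LINK_MODES = {
    0b00: "Internal PHY",
    0b01: "Reserved",
    0b10: "SGMII",
    0b11: "SerDes",
}


def decode_link_mode(ctrl_ext):
    """Return the Table 3-29 mode name for a CTRL_EXT register value."""
    field = (ctrl_ext & CTRL_EXT_LINK_MODE_MASK) >> CTRL_EXT_LINK_MODE_SHIFT
    return LINK_MODES[field]


def set_link_mode(ctrl_ext, mode):
    """Return ctrl_ext with LINK_MODE replaced; the reserved code is rejected."""
    if mode not in LINK_MODES or LINK_MODES[mode] == "Reserved":
        raise ValueError("unsupported LINK_MODE encoding")
    return (ctrl_ext & ~CTRL_EXT_LINK_MODE_MASK) | (mode << CTRL_EXT_LINK_MODE_SHIFT)
```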
The GMII/MII interface used to communicate between the MAC and the internal PHY or the SGMII PCS
supports 10/100/1000 Mb/s operation, with both half- and full-duplex operation at 10/100 Mb/s, and
full-duplex operation at 1000 Mb/s.
The SerDes function can be used to implement a fiber-optics-based solution or backplane connection
without requiring an external TBI mode transceiver/SerDes.
The SGMII interface can be used to connect to SFP modules. As such, this SGMII interface has the
following limitations:
• No Tx clock
• AC coupling only
The internal copper PHY features 10/100/1000-BaseT signaling and is capable of performing intelligent
power-management based on both the system power-state and LAN energy-detection (detection of
unplugged cables). Power management includes the ability to shut-down to an extremely low
(powered-down) state when not needed, as well as the ability to auto-negotiate to lower-speed (and
less power-hungry) 10/100 Mb/s operation when the system is in low power-states.
3.5.2
MAC Functionality
3.5.2.1
Internal GMII/MII Interface
The 82576’s MAC and PHY/PCS communicate through an internal GMII/MII interface that can be
configured for either 1000 Mb/s operation (GMII) or 10/100 Mb/s (MII) mode of operation. For proper
network operation, both the MAC and PHY must be properly configured (either explicitly via software or
via hardware auto-negotiation) to identical speed and duplex settings.
All MAC configuration is performed using Device Control registers mapped into system memory or I/O
space; an internal MDIO/MDC interface, accessible via software, is used to configure the PHY operation.
3.5.2.2
MDIO/MDC
The 82576 implements an IEEE 802.3 MII Management Interface (also known as the Management Data
Input/Output or MDIO Interface) between the MAC and the PHY. This interface provides the MAC and
software the ability to monitor and control the state of the PHY. The MDIO interface defines a physical
connection, a special protocol that runs across the connection, and an internal set of addressable
registers. The internal or external interface consists of a data line (MDIO) and clock line (MDC), which
are accessible by software via the MAC register space.
• MDC (management data clock): This signal is used by the PHY as a clock timing reference for
information transfer on the MDIO signal. The MDC is not required to be a continuous signal and can
be frozen when no management data is transferred. The MDC signal has a maximum operating
frequency of 2.5 MHz.
• MDIO (management data I/O): This internal signaling between the MAC and PHY logically
represents a bi-directional data signal that is used to transfer control information and status to and from
the PHY (to read and write the PHY management registers). Asserting and interpreting value(s) on
this interface requires knowledge of the special MDIO protocol to avoid possible internal signal
contention or miscommunication to/from the PHY.
Software can use MDIO accesses to read or write registers in internal PHY mode by accessing the
82576's MDIC register (see Section 8.2.4).
When working in SGMII/SerDes mode, the external PHY (if it exists) can be accessed either through
MDC/MDIO as previously described, or via a two wire interface bus using the I2CCMD register (see
Section 8.18.8). The two wire interface bus or the MDC/MDIO bus are connected via the same pins, and
thus are mutually exclusive. In order to be able to control an external device, either by SFP or MDC/
MDIO, the I2C SFP Enable bit in Initialization Control 3 EEPROM word should be set.
As the MDC/MDIO command can be targeted either to the internal PHY or to an external bus, the
MDIC.destination bit is used to define the target of the transaction.
Note:
Each port has its own MDC/MDIO or two wire interface bus and there is no sharing between
the ports of the control port. In order to control both ports’ PHYs, via the same control bus,
accesses to both PHYs should be done via the same port with different device addresses.
3.5.2.2.1
MDIC Register Usage
For an MDI read cycle, the sequence of events is as follows:
1. The processor performs a PCIe write cycle to the MII register with:
— Ready = 0b
— Interrupt Enable set to 1b or 0b
— Opcode = 10b (read)
— PHYADD = PHY address from the MDI register
— REGADD = Register address of the specific register to be accessed (0 through 31).
2. The MAC applies the following sequence on the MDIO signal to the PHY:
<PREAMBLE><01><10><PHYADD><REGADD><Z> where Z stands for the MAC tri-stating the
MDIO signal.
3. The PHY returns the following sequence on the MDIO signal:
<0><DATA><IDLE>.
4. The MAC discards the leading bit and places the following 16 data bits in the MII register.
5. The 82576 asserts an interrupt indicating MDI “Done” if the Interrupt Enable bit was set.
6. The 82576 sets the Ready bit in the MII register indicating the read is complete.
7. The processor might read the data from the MII register and issue a new MDI command.
For an MDI write cycle, the sequence of events is as follows:
1. The processor performs a PCIe write cycle to the MII register with:
— Ready = 0b
— Interrupt Enable set to 1b or 0b
— Opcode = 01b (write)
— PHYADD = PHY address from the MDI register
— REGADD = Register address of the specific register to be accessed (0 through 31)
— Data = Specific data for desired control of the PHY
2. The MAC applies the following sequence on the MDIO signal to the PHY:
<PREAMBLE><01><01><PHYADD><REGADD><10><DATA><IDLE>
3. The 82576 asserts an interrupt indicating MDI “Done” if the Interrupt Enable bit was set.
4. The 82576 sets the Ready bit in the MII register to indicate that the write operation completed.
5. The CPU might issue a new MDI command.
Note:
An MDI read or write might take as long as 64 µs from the processor write to the Ready bit
assertion.
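To put this latency in context, the sketch below builds the bit sequences driven by the MAC and computes the minimum frame time at the maximum MDC frequency. It assumes the 32-bit all-ones preamble of IEEE 802.3 clause 22 (this document only names the field <PREAMBLE>); the helper names are hypothetical.

```python
MDC_MAX_HZ = 2_500_000  # maximum MDC frequency stated in this section
PREAMBLE = "1" * 32     # assumed length, per IEEE 802.3 clause 22


def _bits(value, width):
    """Format value as a fixed-width binary string, MSB first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in {} bits".format(width))
    return format(value, "0{}b".format(width))


def mdio_read_header(phy_addr, reg_addr):
    """Bits driven by the MAC for a read, before it tri-states MDIO."""
    return PREAMBLE + "01" + "10" + _bits(phy_addr, 5) + _bits(reg_addr, 5)


def mdio_write_frame(phy_addr, reg_addr, data):
    """Bits driven by the MAC for a write: start, opcode, addresses, 10, data."""
    return (PREAMBLE + "01" + "01" + _bits(phy_addr, 5) + _bits(reg_addr, 5)
            + "10" + _bits(data, 16))


def min_frame_time_us(bit_count=64, mdc_hz=MDC_MAX_HZ):
    """Shortest possible frame duration in microseconds at the given MDC rate."""
    return bit_count * 1_000_000 / mdc_hz
```

Under these assumptions a 64-bit frame needs at least 25.6 µs on the wire, so the 64 µs bound leaves margin for internal processing and idle time.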
If an invalid opcode is written by software, the MAC does not execute any accesses to the PHY registers.
If the PHY does not generate a 0b as the second bit of the turn-around cycle for reads, the MAC aborts
the access, sets the E (error) bit, writes 0xFFFF to the data field to indicate an error condition, and sets
the Ready bit.
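The read, write, and error handling described above can be sketched as helpers that compose and interpret MDIC register values. The bit positions below are assumptions recalled from open-source driver headers rather than taken from this section; verify them against Section 8.2.4. The function names are hypothetical.

```python
# Assumed MDIC field layout; verify against the MDIC register description.
MDIC_DATA_MASK = 0x0000FFFF
MDIC_REG_SHIFT = 16
MDIC_PHY_SHIFT = 21
MDIC_OP_WRITE = 0b01 << 26
MDIC_OP_READ = 0b10 << 26
MDIC_READY = 1 << 28
MDIC_INT_EN = 1 << 29
MDIC_ERROR = 1 << 30


def _address_bits(phy_addr, reg_addr):
    if not (0 <= phy_addr <= 31 and 0 <= reg_addr <= 31):
        raise ValueError("PHY and register addresses must be 0 through 31")
    return (phy_addr << MDIC_PHY_SHIFT) | (reg_addr << MDIC_REG_SHIFT)


def mdic_read_command(phy_addr, reg_addr, interrupt=False):
    """Value to write to MDIC to start a read (Ready = 0b)."""
    word = MDIC_OP_READ | _address_bits(phy_addr, reg_addr)
    return word | MDIC_INT_EN if interrupt else word


def mdic_write_command(phy_addr, reg_addr, data, interrupt=False):
    """Value to write to MDIC to start a write (Ready = 0b)."""
    word = MDIC_OP_WRITE | _address_bits(phy_addr, reg_addr) | (data & MDIC_DATA_MASK)
    return word | MDIC_INT_EN if interrupt else word


def mdic_parse_completion(value):
    """Return None while busy, the 16-bit data when done; raise on error."""
    if not value & MDIC_READY:
        return None
    if value & MDIC_ERROR:
        raise IOError("MDI access failed (error bit set)")
    return value & MDIC_DATA_MASK
```

A driver would typically write the command word, then poll the register and pass each value read back to mdic_parse_completion until it returns data or raises.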
Note:
After a PHY reset, access through the MDIC register should not be attempted for 300 µs.
3.5.2.3
Duplex Operation with Copper PHY
The 82576 supports half-duplex and full-duplex 10/100 Mb/s MII mode either through the internal
copper PHY or SGMII interface. However, only full-duplex mode is supported when SerDes mode is used
or in any 1000 Mb/s connection.
Configuration of the duplex operation of the 82576 can either be forced or determined via the auto-negotiation process. See Section 3.5.4.3 for details on link configuration setup and resolution.
3.5.2.3.1
Full Duplex
All aspects of the IEEE 802.3, 802.3u, 802.3z, and 802.3ab specifications are supported in full-duplex
operation. Full-duplex operation is enabled by several mechanisms, depending on the speed
configuration of the 82576 and the specific capabilities of the link partner used in the application.
During full-duplex operation, the 82576 can transmit and receive packets simultaneously across the
link interface.
In full-duplex, transmission and reception are delineated independently by the GMII/MII control
signals. Transmission starts when TX_EN is asserted, which indicates there is valid data on the TX_DATA bus
driven from the MAC to the PHY/PCS. Reception is signaled by the PHY/PCS by asserting the RX_DV
signal, which indicates valid receive data on the RX_DATA lines to the MAC.
3.5.2.3.2
Half Duplex
In half-duplex operation, the MAC attempts to avoid contention with other traffic on the link by
monitoring the CRS signal provided by the PHY and deferring to passing traffic. When the CRS signal is
de-asserted or after a sufficient Inter-Packet Gap (IPG) has elapsed after a transmission, frame
transmission begins. The MAC signals the PHY/PCS with TX_EN at the start of transmission.
In the case of a collision, the PHY/SGMII detects the collision and asserts the COL signal to the MAC.
Frame transmission stops within four link clock times and then the 82576 sends a JAM sequence onto
the link. After the end of a collided transmission, the 82576 backs off and attempts to re-transmit per
the standard CSMA/CD method.
Note:
The re-transmissions are done from the data stored internally in the 82576 MAC transmit
packet buffer (no re-access to the data in host memory is performed).
The MAC behavior is different if a regular collision or a late collision is detected. If a regular collision is
detected, the MAC always tries to re-transmit until the number of excessive collisions is reached. In
case of late collision, the MAC retransmission is configurable. In addition, statistics are gathered on late
collisions.
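For reference, the standard CSMA/CD retry behavior for regular collisions is truncated binary exponential backoff. The sketch below uses the IEEE 802.3 constants (backoff limit of 10, attempt limit of 16); it models the standard algorithm and does not describe the internal implementation of the 82576. The function name is hypothetical.

```python
import random

BACKOFF_LIMIT = 10   # exponent stops growing after 10 collisions (IEEE 802.3)
ATTEMPT_LIMIT = 16   # frame is abandoned on the 16th collision (IEEE 802.3)


def backoff_slots(collision_count, rng=random):
    """Return the number of slot times to wait after the given collision.

    Returns None when the attempt limit is reached, meaning the frame is
    dropped because of excessive collisions.
    """
    if not 1 <= collision_count <= ATTEMPT_LIMIT:
        raise ValueError("collision_count must be 1 through 16")
    if collision_count == ATTEMPT_LIMIT:
        return None
    exponent = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** exponent)  # uniform in [0, 2**exponent - 1]
```

The waiting range doubles with each collision up to 1023 slot times, which spreads retries from competing stations apart as contention rises.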
In the case of a successful transmission, the 82576 is ready to transmit any other frame(s) queued in
the MAC's transmit FIFO, after the minimum inter-frame spacing (IFS) of the link has elapsed.
During transmit, the PHY is expected to signal a carrier-sense (assert the CRS signal) back to the MAC
before one slot time has elapsed. The transmission completes successfully even if the PHY fails to
indicate CRS within the slot time window. If this situation occurs, the PHY might be configured
incorrectly or be in a link-down state. Such an event is counted in the Transmit without CRS statistic
register (see Section 8.19.11).
3.5.3
SerDes, SGMII Support
The 82576 can be configured to follow either the SGMII or the SerDes standard. When in SGMII mode, the
82576 can be configured to operate in 1 Gb/s, 100 Mb/s or 10 Mb/s speeds. When in the 10/100 Mb/s
speed, it can be configured to half-duplex mode of operation. When configured for SerDes operation,
the port supports only 1 Gb/s, full-duplex operation. Since the serial interfaces are defined as
differential signals, internally the hardware has analog and digital blocks. Following is the initialization/
configuration sequence for the analog and digital blocks.
3.5.3.1
SerDes Analog Block
The analog block may require some changes to its configuration registers in order to work properly.
There is no special requirement for designers to do these changes as the hardware internally updates
the configuration using a default sequence or a sequence loaded from the EEPROM. There is a provision
for EEPROM-less systems, where software can generate the same changes that the hardware generates
by writing the initialization sequence through the SCCTL register.
3.5.3.2
SerDes/SGMII PCS Block
The link setup for SerDes and SGMII are described in sections 3.5.4.1 and 3.5.4.2, respectively.
3.5.3.3
GbE Physical Coding Sub-Layer (PCS)
The 82576 integrates the 802.3z PCS function on-chip. The on-chip PCS circuitry is used when the link
interface is configured for SerDes or SGMII operation and is bypassed for internal PHY mode.
The packet encapsulation is based on the Fibre Channel (FC0/FC1) physical layer and uses the same
coding scheme to maintain transition density and DC balance. The physical layer device is the SerDes
and is used for 1000BASE-SX, -LX, or -CX configurations.
3.5.3.3.1
8B10B Encoding/Decoding
The GbE PCS circuitry uses the same transmission-coding scheme used in the Fibre Channel physical
layer specification. The 8B10B-coding scheme was chosen by the standards committee in order to
provide a balanced, continuous stream with sufficient transition density to allow for clock recovery at
the receiving station. There is a 25% overhead for this transmission code, which accounts for the data-signaling rate of 1250 Mb/s with 1000 Mb/s of actual data.
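The overhead figure follows directly from the code ratio: every 8 data bits are sent as a 10-bit code group. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def line_rate_mbps(data_rate_mbps, data_bits=8, code_bits=10):
    """Signaling rate needed to carry data_rate_mbps with an 8B10B-style code."""
    return data_rate_mbps * code_bits / data_bits


def coding_overhead(data_bits=8, code_bits=10):
    """Extra bits sent per data bit, as a fraction of the data rate."""
    return (code_bits - data_bits) / data_bits
```

With the defaults, 1000 Mb/s of data requires a 1250 Mb/s signaling rate, and the overhead relative to the data rate is 0.25.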
3.5.3.3.2
Code Groups and Ordered Sets
Code group and ordered set definitions are defined in clause 36 of the IEEE 802.3z standard. These
represent special symbols used in the encapsulation of GbE packets. The following table contains a brief
description of the defined ordered sets and is included for informational purposes only. See clause 36 of the
IEEE 802.3z specification for more details.
Table 3-30. Brief Description of Defined Ordered Sets
Code   Ordered_Set         # of Code Groups   Usage
/C/    Configuration       4                  General reference to configuration ordered sets, either /C1/ or /C2/,
                                              which are used during auto-negotiation to advertise and negotiate link
                                              operation information between link partners. Last 2 code groups
                                              contain configuration base and next page registers.
/C1/   Configuration 1     4                  See /C/. Differs from /C2/ in 2nd code group for maintaining proper
                                              signaling disparity (see note 1).
/C2/   Configuration 2     4                  See /C/. Differs from /C1/ in 2nd code group for maintaining proper
                                              signaling disparity (see note 1).
/I/    IDLE                2                  General reference to idle ordered sets. Idle characters are continually
                                              transmitted by the end stations and are replaced by encapsulated
                                              packet data. The transitions in the idle stream enable the SerDes to
                                              maintain clock and symbol synchronization between link partners.
/I1/   IDLE 1              2                  See /I/. Differs from /I2/ in 2nd code group for maintaining proper
                                              signaling disparity (see note 1).
/I2/   IDLE 2              2                  See /I/. Differs from /I1/ in 2nd code group for maintaining proper
                                              signaling disparity (see note 1).
/R/    Carrier_Extend      1                  This ordered set is used to indicate carrier extension to the receiving
                                              PCS. It is also used as part of the end_of_packet encapsulation
                                              delimiter as well as IPG for packets in a burst of packets.
/S/    Start_of_Packet     1                  The SPD (start_of_packet delimiter) ordered set is used to indicate the
                                              starting boundary of a packet transmission. This symbol replaces the
                                              last byte of the preamble received from the MAC layer.
/T/    End_of_Packet       1                  The EPD (end_of_packet delimiter) is comprised of three ordered sets.
                                              The /T/ symbol is always the first of these and indicates the ending
                                              boundary of a packet.
/V/    Error_Propagation   1                  The /V/ ordered set is used by the PCS to indicate error propagation
                                              between stations. This is normally intended to be used by repeaters to
                                              indicate collisions.
1. The concept of running disparity is defined in the standard. In summary, this refers to the 1-0 and 0-1 transitions within 8B10B
code groups.
3.5.4
Auto-Negotiation and Link Setup Features
The method for configuring the link between two link partners is highly dependent on the mode of
operation as well as the functionality provided by the specific physical layer device (PHY or SerDes). For
SerDes mode, the 82576 provides the complete 802.3z PCS function. For internal PHY mode, the PCS
and auto-negotiation functions are maintained within the PHY. For SGMII mode, the 82576 supports the
SGMII link auto-negotiation process, whereas the link auto-negotiation is done by the external PHY.
Configuring the link can be accomplished by several methods ranging from software forcing link
settings, software-controlled negotiation, MAC-controlled auto-negotiation, to auto-negotiation initiated
by a PHY. The following sections describe processes of bringing the link up including configuration of the
82576 and the transceiver, as well as the various methods of determining duplex and speed
configuration.
The process of determining link configuration differs slightly based on the specific link mode (internal
PHY, external SerDes or SGMII) being used.
When operating in a SerDes mode, the PCS layer performs auto-negotiation per clause 37 of the 802.3z
standard. The transceiver used in this mode (the SerDes) does not participate in the auto-negotiation
process as all aspects of auto-negotiation are controlled by the 82576.
When operating in internal PHY mode, the PHY performs auto-negotiation per 802.3ab clause 40 and
extensions to clause 28. Link resolution is obtained by the MAC from the PHY after the link has been
established. The MAC accomplishes this via the MDIO interface, via specific signals from the internal
PHY to the MAC, or by MAC auto-detection functions.
When operating in SGMII mode, the PCS layer performs SGMII auto-negotiation per the SGMII
specification. The external PHY is responsible for the Ethernet auto-negotiation process.
3.5.4.1
SerDes Link Configuration
When using SerDes link mode, link mode configuration can be performed using the PCS function in the
82576. The hardware supports both hardware and software auto-negotiation methods for determining
the link configuration, as well as allowing for a manual configuration to force the link. Hardware auto-negotiation is the preferred method.
3.5.4.1.1
Signal Detect Indication
The SRDS_0/1_SIG_DET pins can be connected to a Signal Detect or loss-of-signal output that
indicates when no laser light is being received when the 82576 is used in a 1000BASE-SX or -LX
implementation (SerDes operation). It prevents false carrier cases occurring when transmission by a
non-connected port couples into the input. Unfortunately, there is no standard polarity for this signal
coming from different manufacturers. The CTRL.ILOS bit provides for inversion of the signal from
different external SerDes vendors, and should be set when the external SerDes provides a negative-true loss-of-signal.
Note:
This bit also inverts the LINK input that provides link status indication from the PHY (in
GMII/MII mode) and thus should be set to 0 for proper internal PHY operation.
3.5.4.1.2
MAC Link Speed
SerDes operation is only defined for 1000 Mb/s operation. Other link speeds are not supported. When
configured for the SerDes interface, the MAC speed-determination function is disabled and the Device
Status register bits (STATUS.SPEED) indicate a value of 10b for 1000 Mb/s.
3.5.4.1.3
SerDes Mode Auto-Negotiation
In SerDes mode, after power up or a reset of the 82576 via PERST#, the 82576 initiates auto-negotiation
based on the default settings in the device control and transmit configuration or PCS Link Control Word
registers, as well as settings read from the EEPROM. If enabled in the EEPROM, the 82576 immediately
performs auto-negotiation.
TBI mode auto-negotiation, as defined in clause 37 of the IEEE 802.3z standard, provides a protocol for
two devices to advertise and negotiate a common operational mode across a GbE link. The 82576 fully
supports the IEEE 802.3z auto-negotiation function when using the on-chip PCS and internal SerDes.
TBI mode auto-negotiation is used to determine the following information:
• Duplex resolution (even though the 82576 MAC only supports full-duplex in SerDes mode).
• Flow control configuration.
Note:
Since speed for SerDes modes is fixed at 1000 Mb/s, speed settings in the Device Control
register are unaffected by the auto-negotiation process.
Auto-negotiation can be initiated at power up or on assertion of PERST# by enabling specific bits
in the EEPROM.
The auto-negotiation process is accomplished by the exchange of /C/ ordered sets that contain the
capabilities defined in the PCS_ANADV register in the 3rd and 4th symbols of the ordered sets. Next
pages are supported using the PCS_NPTX_AN register.
Bits FD and LU in the Device Status (STATUS) register, and bits in the PCS_LSTS register provide status
information regarding the negotiated link.
Auto-negotiation can be initiated by the following:
• PCS_LCMD.AN_ENABLE transition from 0b to 1b
• Receipt of /C/ ordered set during normal operation
• Receipt of a different value of the /C/ ordered set during the negotiation process
• Transition from loss of synchronization to synchronized state (if AN_ENABLE is set).
• PCS_LCMD.AN_RESTART transition from 0b to 1b
Resolution of the negotiated link determines device operation with respect to flow control capability and
duplex settings. These negotiated capabilities override advertised and software-controlled device
configuration.
Software must configure the PCS_ANADV fields to the desired advertised base page. The bits in the
Device Control register are not mapped to the txConfigWord field in hardware until after auto-negotiation completes. Table 3-31 lists the mapping of the PCS_ANADV fields to the Config_reg Base
Page encoding per clause 37 of the standard.
Table 3-31. 802.3z Advertised Base Page Mapping
Bits    15      14    13:12   11:9   8:7   6    5    4:0
Field   Nextp   Ack   RFLT    rsv    ASM   Hd   Fd   rsv
The partner advertisement can be seen in the PCS_LPAB and PCS_LPABNP registers.
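The bit layout in Table 3-31 can be exercised with a small pack/unpack sketch. The field names and positions come from the table; the function names and the dictionary-based interface are illustrative assumptions. Reserved fields are left at zero.

```python
# Field name -> (least significant bit, width), per Table 3-31.
BASE_PAGE_FIELDS = {
    "Nextp": (15, 1),
    "Ack": (14, 1),
    "RFLT": (12, 2),
    "ASM": (7, 2),
    "Hd": (6, 1),
    "Fd": (5, 1),
}


def pack_base_page(**fields):
    """Build a 16-bit base page word from named field values."""
    word = 0
    for name, value in fields.items():
        lsb, width = BASE_PAGE_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError("{} does not fit in {} bit(s)".format(name, width))
        word |= value << lsb
    return word


def unpack_base_page(word):
    """Split a 16-bit base page word into its named fields."""
    return {
        name: (word >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in BASE_PAGE_FIELDS.items()
    }
```

For example, advertising full duplex with both bits of the ASM field set packs to 0x01A0, and the same helper can decode a link partner word for comparison.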
3.5.4.1.4
Forcing Link
Forcing link can be accomplished by software by writing a 1b to CTRL.SLU, which forces the MAC PCS
logic into a link-up state (enables listening to incoming characters when LOS is de-asserted by the
internal or external SerDes).
Note:
The PCS_LCMD.AN_ENABLE bit must be set to a logic zero to enable forcing link.
When link is forced via the CTRL.SLU bit, the link does not come up unless the LOS signal is
asserted or an energy indication is received from the SerDes receiver, implying that there is
a valid signal being received by the optics or the SerDes.
The source of the signal detect is fixed using bit ENRGSRC in the CONNSW register.
3.5.4.1.5
HW Detection of Non-Auto-Negotiation Partner
Hardware can detect a SerDes partner that sends idle code groups continuously, but does not initiate or
answer an auto-negotiation process. In this case, hardware initiates an auto-negotiation process, and if
it fails after some timeout, a link up is assumed. To enable this functionality the
PCS_LCTL.AN_TIMEOUT_EN bit should be set. This mode can be used instead of the force link mode as
a way to support a partner that does not support auto-negotiation.
3.5.4.2
SGMII Link Configuration
When working in SGMII mode, the actual link setting is done by the external PHY and is dependent on
the settings of this PHY. The SGMII auto-negotiation process described in the sections that follow is only
used to establish the MAC/PHY connection.
3.5.4.2.1
SGMII Auto-Negotiation
This auto-negotiation process is not dependent on the SRDS0/1_SIG_DET signal, as this signal
indicates the status of the PHY signal detection (usually used in an optical PHY).
The outcome of this auto-negotiation process includes the following information:
• Link status
• Speed
• Duplex
This information is used by hardware to configure the MAC, when operating in SGMII mode.
Bits FD and LU of the Device Status (STATUS) register and bits in the PCS_LSTS register provide status
information regarding the negotiated link.
Auto-negotiation can be initiated by the following:
• LRST transition from 1b to 0b.
• PCS_LCMD.AN_ENABLE transition from 0b to 1b.
• Receipt of /C/ ordered set during normal operation.
• Receipt of different value of the /C/ ordered set during the negotiation process.
• Transition from loss of synchronization to a synchronized state (if AN_ENABLE is set).
• PCS_LCMD.AN_RESTART transition from 0b to 1b.
Resolving the negotiated link determines the 82576 operation with respect to speed and duplex
settings. These negotiated capabilities override advertised and software controlled device configuration.
When working in SGMII mode, there is no need to set the PCS_ANADV register, as the MAC
advertisement word is fixed. The result of the SGMII level auto-negotiation can be read from the
PCS_LPAB register.
3.5.4.2.2
Forcing Link
In SGMII, forcing of the link cannot be done at the PCS level, only in the external PHY. The forced speed
and duplex settings are reflected by the SGMII auto-negotiation process; the MAC settings are
automatically done according to this functionality.
3.5.4.2.3
MAC Speed Resolution
The MAC speed and duplex settings are always set according to the SGMII auto-negotiation process.
3.5.4.3
Copper PHY Link Configuration
When operating with the internal PHY, link configuration is generally determined by PHY auto-negotiation. The software device driver must intervene in cases where a successful link is not
negotiated or the designer desires to manually configure the link. The following sections discuss the
methods of link configuration for copper PHY operation.
3.5.4.3.1
PHY Auto-Negotiation (Speed, Duplex, Flow Control)
When using a copper PHY, the PHY performs the auto-negotiation function. The actual operational
details of this operation are described in the IEEE P802.3ab draft standard and are not included here.
Auto-negotiation provides a method for two link partners to exchange information in a systematic
manner in order to establish a link configuration providing the highest common level of functionality
supported by both partners. Once configured, the link partners exchange configuration information to
resolve link settings such as:
• Speed: 10/100/1000 Mb/s
• Duplex: Full or half
• Flow control operation
PHY specific information required for establishing the link is also exchanged.
Note:
If flow control is enabled in the 82576, the settings for the desired flow control behavior
must be set by software in the PHY registers and auto-negotiation restarted. After auto-negotiation completes, the software device driver must read the PHY registers to determine
the resolved flow control behavior of the link and reflect these in the MAC register settings
(CTRL.TFCE and CTRL.RFCE).
Once PHY auto-negotiation completes, the PHY asserts a link indication (LINK) to the MAC.
Software must have set the Set Link Up bit in the Device Control register (CTRL.SLU) before
the MAC recognizes the LINK indication from the PHY and can consider the link to be up.
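The flow control step in the note above can be sketched as follows. The resolution rules follow the pause resolution table of IEEE 802.3 Annex 28B, using the PAUSE and ASM_DIR bits advertised by each side. The CTRL bit positions for RFCE and TFCE are assumptions to be checked against the CTRL register description, and the function names are hypothetical.

```python
CTRL_RFCE = 1 << 27  # assumed position of CTRL.RFCE
CTRL_TFCE = 1 << 28  # assumed position of CTRL.TFCE


def resolve_flow_control(local_pause, local_asm_dir, partner_pause, partner_asm_dir):
    """Return (tx_pause, rx_pause) for the local device."""
    if local_pause and partner_pause:
        return (True, True)    # symmetric flow control
    if not local_pause and local_asm_dir and partner_pause and partner_asm_dir:
        return (True, False)   # local device sends PAUSE frames only
    if local_pause and local_asm_dir and not partner_pause and partner_asm_dir:
        return (False, True)   # local device honors received PAUSE frames only
    return (False, False)      # flow control disabled


def apply_flow_control(ctrl, tx_pause, rx_pause):
    """Return a CTRL value with TFCE and RFCE set to the resolved result."""
    ctrl &= ~(CTRL_TFCE | CTRL_RFCE)
    if tx_pause:
        ctrl |= CTRL_TFCE
    if rx_pause:
        ctrl |= CTRL_RFCE
    return ctrl
```

The four inputs would come from the local advertisement and link partner ability registers of the PHY, read over MDIO after auto-negotiation completes.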
3.5.4.3.2
MAC Speed Resolution
For proper link operation, both the MAC and PHY must be configured for the same speed of link
operation. The speed of the link can be determined and set by several methods with the 82576. These
include:
• Software-forced configuration of the MAC speed setting based on PHY indications, which might be
determined as follows:
— Software reads of PHY registers directly to determine the PHY's auto-negotiated speed
— Software reads the PHY's internal PHY-to-MAC speed indication (SPD_IND) using the MAC
STATUS.SPEED register
• Software asks the MAC to attempt to auto-detect the PHY speed from the PHY-to-MAC RX_CLK,
then programs the MAC speed accordingly
• MAC automatically detects and sets the link speed of the MAC based on PHY indications by using
the PHY's internal PHY-to-MAC speed indication (SPD_IND)
Aspects of these methods are discussed in the sections that follow.
3.5.4.3.2.1
Forcing MAC Speed
There might be circumstances when the software device driver must forcibly set the link speed of the
MAC. This can occur when the link is manually configured. To force the MAC speed, the software device
driver must set the CTRL.FRCSPD (force-speed) bit to 1b and then write the speed bits in the Device
Control register (CTRL.SPEED) to the desired speed setting. See Section 8.2.1 for details.
Note:
Forcing the MAC speed using CTRL.FRCSPD overrides all other mechanisms for configuring
the MAC speed and can yield non-functional links if the MAC and PHY are not operating at
the same speed/configuration.
When forcing the 82576 to a specific speed configuration, the software device driver must also ensure
the PHY is configured to a speed setting consistent with MAC speed settings. This implies that software
must access the PHY registers to either force the PHY speed or to read the PHY status register bits that
indicate link speed of the PHY.
Note:
Forcing speed settings by CTRL.SPEED can also be accomplished by setting the
CTRL_EXT.SPD_BYPS bit. This bit bypasses the MAC's internal clock switching logic and
enables the software device driver complete control of when the speed setting takes place.
The CTRL.FRCSPD bit uses the MAC's internal clock switching logic, which delays the
effect of the speed change.
3.5.4.3.2.2
Using Internal PHY Direct Link-Speed Indication
The 82576’s internal PHY provides a direct internal indication of its speed to the MAC (SPD_IND). When
using the internal PHY, the most direct method for determining the PHY link speed and either manually
or automatically configuring the MAC speed is based on these direct speed indications.
For MAC speed to be set/determined from these direct internal indications from the PHY, the MAC must
be configured such that CTRL.ASDE and CTRL.FRCSPD are both 0b (both auto-speed detection and
forced-speed override disabled). After configuring the Device Control register, MAC speed is reconfigured automatically each time the PHY indicates a new link-up event to the MAC.
When MAC speed is neither forced nor auto-sensed by the MAC, the current MAC speed setting and the
speed indicated by the PHY are reflected in the Device Status register bits STATUS.SPEED.
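The interaction between the speed-resolution methods above can be illustrated with a short Python sketch. The function names and return strings are hypothetical. The STATUS.SPEED decoding (bits 7:6, with 00b = 10 Mb/s, 01b = 100 Mb/s, and 1Xb = 1000 Mb/s) is an assumption that is not stated in this section and should be checked against the Device Status register description.

```python
def mac_speed_source(asde: int, frcspd: int) -> str:
    """Return which mechanism sets the MAC speed for the given CTRL bit values."""
    if frcspd:
        # CTRL.FRCSPD overrides all other mechanisms for configuring MAC speed.
        return "forced (CTRL.SPEED)"
    if asde:
        # CTRL.ASDE asks the MAC to auto-detect speed from the PHY-to-MAC RX_CLK.
        return "auto-detected (RX_CLK)"
    # Both bits 0b: the MAC follows the internal PHY's SPD_IND indication.
    return "PHY indication (SPD_IND)"


def decode_status_speed(status: int) -> int:
    """Decode STATUS.SPEED into Mb/s (assumed field position: bits 7:6)."""
    field = (status >> 6) & 0x3
    return {0b00: 10, 0b01: 100}.get(field, 1000)
```

For example, with CTRL.ASDE and CTRL.FRCSPD both cleared, `mac_speed_source(0, 0)` reports that the MAC follows SPD_IND, which matches the configuration described for the internal PHY.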
3.5.4.3.3
MAC Full-/Half- Duplex Resolution
The duplex configuration of the link is also resolved by the PHY during the auto-negotiation process.
The 82576’s internal PHY provides an internal indication to the MAC of the resolved duplex configuration
using an internal full-duplex indication (FDX).
When using the internal PHY, this internal duplex indication is normally sampled by the MAC each time
the PHY indicates the establishment of a good link (LINK indication). The PHY's indicated duplex
configuration is applied in the MAC and reflected in the MAC Device Status register (STATUS.FD).
Software can override the duplex setting of the MAC via the CTRL.FD bit when the CTRL.FRCDPLX (force
duplex) bit is set. If CTRL.FRCDPLX is 0b, the CTRL.FD bit is ignored and the PHY's internal duplex
indication is applied.
3.5.4.3.4
Using PHY Registers
The software device driver might be required under some circumstances to read from, or write to, the
MII management registers in the PHY. These accesses are performed via the MDIC registers (see
Section 8.2.4). The MII registers enable the software device driver to have direct control over the PHY's
operation, which can include:
• Resetting the PHY
• Setting preferred link configuration for advertisement during the auto-negotiation process
• Restarting the auto-negotiation process
• Reading auto-negotiation status from the PHY
• Forcing the PHY to a specific link configuration
The set of PHY management registers required for all PHY devices can be found in the IEEE P802.3ab
draft standard. The registers for the 82576 PHY are described in Section 3.5.8.
3.5.4.3.5
Comments Regarding Forcing Link
Forcing link in GMII/MII mode (internal PHY) requires the software device driver to configure both the
MAC and PHY in a consistent manner with respect to each other as well as the link partner. After
initialization, the software device driver configures the desired modes in the MAC, then accesses the
PHY registers to set the PHY to the same configuration.
Before enabling the link, the speed and duplex settings of the MAC can be forced by software using the
CTRL.FRCSPD, CTRL.FRCDPLX, CTRL.SPEED, and CTRL.FD bits. After the PHY and MAC have both been
configured, the software device driver should write a 1b to the CTRL.SLU bit.
3.5.4.4
Loss of Signal/Link Status Indication
For all modes of operation, an LOS/LINK signal provides an indication of physical link status to the MAC.
When the MAC is configured for optical SerDes mode, the input reflects loss-of-signal connection from
the optics. In backplane mode, where there is no LOS external indication, an internal indication from
the SerDes receiver can be used. In SFP systems the LOS indication from the SFP can be used. In
internal PHY mode, this signal from the PHY indicates whether the link is up or down; typically indicated
after successful auto-negotiation. Assuming that the MAC has been configured with CTRL.SLU=1b, the
MAC status bit STATUS.LU, when read, generally reflects whether the PHY or SerDes has link (except
under forced-link setup where even the PHY link indication might have been forced).
When the link indication from the PHY is de-asserted or the loss-of-signal asserted from the SerDes, the
MAC considers this to be a transition to a link-down situation (such as cable unplugged, loss of link
partner, etc.). If the Link Status Change (LSC) interrupt is enabled, the MAC generates an interrupt to
be serviced by the software device driver.
3.5.5
Ethernet Flow Control (FC)
The 82576 supports flow control as defined in 802.3x as well as the specific operation of asymmetrical
flow control defined by 802.3z.
Flow control is implemented as a means of reducing the possibility of receive buffer overflows, which
result in the dropping of received packets, and allows for local controlling of network congestion levels.
This can be accomplished by sending an indication to a transmitting station of a nearly full receive
buffer condition at a receiving station.
The implementation of asymmetric flow control allows for one link partner to send flow control packets
while being allowed to ignore their reception; that is, it is not required to respond to PAUSE frames.
The following registers are defined for the implementation of flow control:
• CTRL.RFCE field is used to enable reception of legacy flow control packets and reaction to them.
• CTRL.TFCE field is used to enable transmission of legacy flow control packets.
• Flow Control Address Low, High (FCAL/H) - 6-byte flow control multicast address
• Flow Control Type (FCT) - 16-bit field to indicate flow control type
• Flow Control bits in Device Control (CTRL) register - Enables flow control modes.
• Discard PAUSE Frames (DPF) and Pass MAC Control Frames (PMCF) in RCTL - controls the
forwarding of control packets to the host.
• Flow Control Receive Threshold High (FCRTH[1:0]) - A set of 13-bit high watermarks indicating
receive buffer fullness. A single watermark is used in link FC mode.
• Flow Control Receive Threshold Low (FCRTL[1:0]) - A set of 13-bit low watermarks indicating
receive buffer emptiness. A single watermark is used in link FC mode.
• Flow Control Transmit Timer Value (FCTTV) - a set of 16-bit timer values to include in transmitted
PAUSE frame. A single timer is used in Link FC mode.
• Flow Control Refresh Threshold Value (FCRTV) - 16-bit PAUSE refresh threshold value
3.5.5.1
MAC Control Frames and Receiving Flow Control Packets
3.5.5.1.1
Structure of 802.3X FC Packets
Three comparisons are used to determine the validity of a flow control frame:
1. A match on the 6-byte multicast address for MAC control frames or to the station address of the
82576 (Receive Address Register 0).
2. A match on the type field
3. A comparison of the MAC Control Op-Code field
The 802.3x standard defines the MAC control frame multicast address as 01-80-C2-00-00-01.
The Type field in the FC packet is compared against an IEEE reserved value of 0x8808.
The final check for a valid PAUSE frame is the MAC control op-code. At this time only the PAUSE control
frame op-code is defined. It has a value of 0x0001.
Frame-based flow control differentiates XOFF from XON based on the value of the PAUSE timer field.
Non-zero values constitute XOFF frames while a value of zero constitutes an XON frame. Values in the
Timer field are in units of pause quantum (slot time). A pause quantum lasts 64 byte times, which is
converted to an absolute time duration according to the line speed.
Note:
An XON frame signals the cancellation of the pause initiated by an XOFF frame (a pause for
zero pause quanta).
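As a worked example of converting the PAUSE timer field to an absolute time, the following Python sketch applies the 64-byte-time (512-bit-time) pause quantum stated above. The function name is hypothetical, and the line speed is passed in Mb/s so that the result is in microseconds.

```python
PAUSE_QUANTUM_BITS = 64 * 8  # one pause quantum = 64 byte times = 512 bit times


def pause_duration_us(quanta: int, line_speed_mbps: int) -> float:
    """Return the pause time in microseconds for a PAUSE timer value."""
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("the PAUSE timer is a 16-bit field")
    if line_speed_mbps <= 0:
        raise ValueError("line speed must be positive")
    bit_times = quanta * PAUSE_QUANTUM_BITS
    # bits divided by (megabits per second) yields microseconds
    return bit_times / line_speed_mbps
```

At 1 Gb/s a single quantum is 0.512 microseconds, so the same timer value pauses the link partner ten times longer at 100 Mb/s than at 1 Gb/s. A timer value of zero (XON) always yields a zero duration.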
Table 3-32 lists the structure of an 802.3X FC packet.
Table 3-32. 802.3X Packet Format
DA         01_80_C2_00_00_01 (6 bytes)
SA         Port MAC address (6 bytes)
Type       0x8808 (2 bytes)
Op-code    0x0001 (2 bytes)
Time       XXXX (2 bytes)
Pad        42 bytes
CRC        4 bytes
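The packet layout in Table 3-32 and the three validity comparisons can be sketched in Python as follows. This is an illustrative model only, not the hardware implementation: the function names are hypothetical, and the frame is built without the 4-byte CRC, which the MAC normally appends.

```python
import struct

PAUSE_DA = bytes.fromhex("0180C2000001")  # MAC control multicast address
MAC_CONTROL_TYPE = 0x8808                 # IEEE reserved Type value
PAUSE_OPCODE = 0x0001                     # only op-code currently defined


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a PAUSE frame per Table 3-32, excluding the CRC (60 bytes)."""
    if len(src_mac) != 6:
        raise ValueError("source MAC address must be 6 bytes")
    fields = struct.pack("!HHH", MAC_CONTROL_TYPE, PAUSE_OPCODE, quanta)
    return PAUSE_DA + src_mac + fields + bytes(42)  # 42 bytes of zero padding


def classify_pause_frame(frame: bytes, station_mac: bytes):
    """Return "XOFF", "XON", or None if the frame is not a valid PAUSE frame."""
    if len(frame) < 18:
        return None
    eth_type, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    # 1. Destination matches the multicast address or the station address.
    if frame[0:6] not in (PAUSE_DA, station_mac):
        return None
    # 2. Type field matches 0x8808.
    if eth_type != MAC_CONTROL_TYPE:
        return None
    # 3. MAC control op-code is PAUSE.
    if opcode != PAUSE_OPCODE:
        return None
    # A zero timer is XON; any non-zero timer is XOFF.
    return "XON" if quanta == 0 else "XOFF"
```

The big-endian (`!`) format matches network byte order for the Type, Op-code, and Time fields.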
3.5.5.1.2
Operation and Rules
The 82576 operates in Link FC.
• Link FC is enabled by the RFCE bit in the CTRL Register.
Note:
Link flow control capability is negotiated between link partners via the auto-negotiation
process. It is the software device driver's responsibility to reconfigure the link flow control
configuration after the capabilities to be used were negotiated, as it might modify the value
of these bits based on the resolved capability between the local device and the link partner.
Receiving a link FC frame while in PFC mode might be ignored. Receiving a PFC frame while
in link FC mode is ignored.
Once the receiver has validated receiving an XOFF, or PAUSE frame, the 82576 performs the following:
• Increments the appropriate statistics register(s).
• Sets the Flow_Control State bit in the relevant FCSTS[0-1] register.
• Initializes the pause timer based on the packet's PAUSE timer field (overwriting any current timer’s
value).
• Disables packet transmission or schedules the disabling of transmission after the current packet
completes.
Resumption of transmission might occur under the following conditions:
• Expiration of the PAUSE timer
• Reception of an XON frame (a frame with its PAUSE timer set to zero)
Both conditions clear the relevant Flow_Control State bit in the relevant FCSTS[0-1] register and
transmission can resume. Hardware records the number of received XON frames.
3.5.5.1.3
Timing Considerations
When operating at 1 Gb/s line speed, the 82576 must not begin to transmit a (new) frame more than
two pause-quantum-bit times after receiving a valid link XOFF frame, as measured at the wire. A
pause quantum is 512-bit times.
When operating in full duplex at 10 Mb/s or 100 Mb/s line speeds, the 82576 must not begin to transmit
a (new) frame more than 576-bit times after receiving a valid link XOFF frame, as measured at the
wire.
3.5.5.2
PAUSE and MAC Control Frames Forwarding
Two bits in the Receive Control register control forwarding of PAUSE and MAC control frames to the
host. These bits are Discard PAUSE Frames (DPF) and Pass MAC Control Frames (PMCF):
• The DPF bit controls forwarding of PAUSE packets to the host.
• The PMCF bit controls forwarding of non-PAUSE packets to the host.
Note:
When virtualization is enabled, forwarded control packets are queued according to the
regular switching procedure defined in Section 7.10.3.5.
When flow control reception is disabled (CTRL.RFCE = 0), flow control packets are not
recognized and are parsed as regular packets.
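The DPF and PMCF forwarding rules (Table 3-33 and Table 3-34) can be summarized as two small decision functions. This Python sketch is a reading aid under the assumption that the tables are complete; the function names are hypothetical, and a "forwarded" result still requires the packet to pass the L2 filters.

```python
def pause_forwarded_to_host(rfce: int, dpf: int) -> bool:
    """Table 3-33: is a PAUSE packet forwarded to the host (subject to L2 filters)?"""
    if not rfce:
        # Flow control reception disabled: the packet is parsed as a regular packet.
        return True
    # Flow control reception enabled: DPF = 1b discards PAUSE frames.
    return not dpf


def non_pause_control_forwarding(rfce: int, pmcf: int) -> str:
    """Table 3-34: handling of non-PAUSE MAC control packets."""
    if rfce and pmcf:
        return "reserved"
    return "forwarded"
```

Returning a string for the second table keeps the reserved RFCE = 1b, PMCF = 1b combination distinct from a simple yes or no answer.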
Table 3-33. Forwarding of PAUSE Packet to Host (DPF Bit)
RFCE  DPF   Are FC Packets Forwarded to Host?
0     X     Yes. Packets need to pass the L2 filters (see Section 7.1.2.1). (1)
1     0     Yes. Packets need to pass the L2 filters (see Section 7.1.2.1).
1     1     No.
1. The flow control multicast address is not part of the L2 filtering unless explicitly required.
Table 3-34. Transfer of Non-PAUSE Control Packets to Host (PMCF Bit)
RFCE  PMCF  Are Non-FC MAC Control Packets Forwarded to Host?
0     X     Yes. Packets need to pass the L2 filters (see Section 7.1.2.1).
X     0     Yes. Packets need to pass the L2 filters (see Section 7.1.2.1).
1     1     Reserved.
3.5.5.3
Transmission of PAUSE Frames
The 82576 generates PAUSE packets to ensure there is enough space in its receive packet buffers to
avoid packet drop. The 82576 monitors the fullness of its receive packet buffers and compares it with
the contents of a programmable threshold. When the threshold is reached, the 82576 sends a PAUSE
frame. The 82576 also supports the sending of link Flow Control (FC).
Note:
Similar to receiving link flow control packets previously mentioned, link XOFF packets can
be transmitted only if this configuration has been negotiated between the link partners via
the auto-negotiation process or some higher level protocol. The setting of this bit by the
software device driver indicates the desired configuration.
The transmission of flow control frames should only be enabled in full-duplex mode per the
IEEE 802.3 standard. Software should ensure that the transmission of flow control packets
is disabled when the 82576 is operating in half-duplex mode.
3.5.5.3.1
Operation and Rules
Transmission of link PAUSE frames is enabled by software writing a 1b to the TFCE bit in the Device
Control register.
The content of the Flow Control Receive Threshold High (FCRTH) register determines at what point the
82576 first transmits a PAUSE frame. The 82576 monitors the fullness of the receive packet buffer and
compares it with the contents of FCRTH. When the threshold is reached, the 82576 sends a PAUSE
frame with its pause time field equal to FCTTV.
At this time, the 82576 starts counting an internal shadow counter (reflecting the pause timeout
counter at the partner end) from zero. When the counter reaches the value indicated in FCRTV register,
then, if the PAUSE condition is still valid (meaning that the buffer fullness is still above the high
watermark), an XOFF message is sent again.
Once the receive buffer fullness reaches the low watermark, the 82576 sends an XON message (a
PAUSE frame with a timer value of zero). Software enables this capability with the XONE field of the
FCRTL.
The 82576 sends a PAUSE frame if it has previously sent one and the packet buffer overflows. This is
intended to minimize the number of packets dropped if the first PAUSE frame did not reach its target.
Since the secure receive packets use the same data path, the behavior is identical when secure packets
are received.
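The watermark behavior described above can be modeled with a small state machine. The Python sketch below is a simplified illustration, not the hardware algorithm: the class and method names are hypothetical, the shadow counter advances by one unit per call (the real counting unit is not given in this section), and the extra PAUSE frame sent on buffer overflow is not modeled.

```python
class XoffGenerator:
    """Simplified model of hardware PAUSE generation from buffer watermarks."""

    def __init__(self, fcrth, fcrtl, fcttv, fcrtv, xone=True):
        self.fcrth = fcrth    # high watermark
        self.fcrtl = fcrtl    # low watermark
        self.fcttv = fcttv    # timer value placed in transmitted XOFF frames
        self.fcrtv = fcrtv    # refresh threshold for the shadow counter
        self.xone = xone      # FCRTL.XONE: enable XON transmission
        self.paused = False   # True after an XOFF has been sent
        self.shadow = 0       # mirrors the partner's pause timeout counter

    def step(self, fullness):
        """Advance one time unit; return the list of (kind, timer) frames sent."""
        if not self.paused:
            if fullness >= self.fcrth:
                self.paused = True
                self.shadow = 0
                return [("XOFF", self.fcttv)]
            return []
        if fullness <= self.fcrtl:
            # Buffer drained to the low watermark: cancel the pause.
            self.paused = False
            return [("XON", 0)] if self.xone else []
        self.shadow += 1
        if self.shadow >= self.fcrtv and fullness >= self.fcrth:
            # Still above the high watermark at refresh time: send XOFF again.
            self.shadow = 0
            return [("XOFF", self.fcttv)]
        return []
```

Feeding the model a buffer that fills past the high watermark, stays full, and then drains produces an initial XOFF, a refresh XOFF once the shadow counter reaches FCRTV, and a final XON.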
3.5.5.3.2
Software Initiated PAUSE Frame Transmission
The 82576 has the added capability to transmit an XOFF frame via software. This is accomplished by
software writing a 1b to the SWXOFF bit of the Transmit Control register. Once this bit is set, hardware
initiates the transmission of a PAUSE frame in a manner similar to that automatically generated by
hardware.
The SWXOFF bit is self-clearing after the PAUSE frame has been transmitted.
Note:
The Flow Control Refresh Threshold mechanism does not work in the case of software-initiated
flow control. Therefore, it is the software’s responsibility to re-generate PAUSE
frames before expiration of the pause counter at the other partner's end.
The state of the CTRL.TFCE bit or the negotiated flow control configuration does not affect software
generated PAUSE frame transmission.
Note:
Software sends an XON frame by programming a 0b in the PAUSE timer field of the FCTTV
register. The software emission of an XON packet is not allowed while the hardware flow
control mechanism is active, as both use the FCTTV registers for different purposes.
XOFF transmission is not supported in 802.3x for half-duplex links. Software should not
initiate an XOFF or XON transmission if the 82576 is configured for half-duplex operation.
When flow control is disabled, pause packets (XON, XOFF, and other FC) are not detected as
flow control packets and can be counted in a variety of counters (such as multicast).
3.5.5.4
IPG Control and Pacing
The 82576 supports the following modes of controlling IPG duration:
• Fixed IPG - IPG is extended by a fixed duration
• Limiting payload rate - IPG is extended to limit the average data rate on the link.
3.5.5.4.1
Fixed IPG Extension
The 82576 allows controlling of the IPG duration. The IPGT configuration field enables an extension of
IPG in 4-byte increments. One possible use of this capability is to allow the insertion of bytes into the
transmit packet after it has been transmitted by the 82576 without violating the minimum IPG
requirements. For example, a security device connected in series to the 82576 might add security
headers to transmit packets before the packets go to the network.
3.5.5.4.2
Limiting Payload Rate
The 82576 allows controlling the maximum payload rate transmitted on the wire. Frames are spaced by
an amount of idle time proportional to the maximum rate to achieve and to the length of the last frame
transmitted. This feature is enabled by clearing bits TCRSBYP and TCRSCOMP in the RTTPCS register.
The maximum payload rate is defined for the entire link by setting the RS_ENA bit in the RTTPTCRC[0]
register and by configuring RTTPTCRC[0] and RTTPTCRM[0] registers.
3.5.6
Loopback Support
3.5.6.1
General
The 82576 supports the following types of internal loopback in the LAN interfaces:
• MAC Loopback (Point 1 in figure)
• Internal PHY Loopback (Point 2 in figure)
• Internal SerDes Loopback (Point 3 in figure)
• External PHY Loopback (Point 4 in figure)
By setting the device to loopback mode, packets that are transmitted towards the line will be looped
back to the host. The 82576 is fully functional in these modes, just not transmitting data over the lines.
Figure 3-5 shows the points of loopback.
For more details on loopback usage and test setup, see the Intel® Ethernet Controllers Loopback
Modes application note.
Figure 3-5. Intel® 82576 GbE Controller Loopback Modes
3.5.6.2
MAC Loopback
In MAC loopback, the PHY and SerDes blocks are not functional and data is looped back before these
blocks. MAC loopback is operational only when working in PHY mode (CTRL_EXT.LINK_MODE = 00b).
3.5.6.2.1
Setting the 82576 to MAC loopback Mode
The following procedure should be used to put the 82576 in MAC loopback mode:
• Set RCTL.LBM to 2'b01 (bits 7:6)
• Set CTRL.SLU (bit 6, should be set by default)
• Set CTRL.FRCSPD & FRCDPLX (bits 11&12)
• Set CTRL.SPEED to 2'b10 (1G) and CTRL.FD
• Set CTRL.ILOS.
• Disable Auto negotiation in the PHY control register (Address 0 in the PHY):
— Clear Auto Neg enable bit (Bit 12)
Filter configuration and other TX/RX processes are the same as in normal mode.
Note:
This configuration is for the case in which there is no link in the PHY. If there is a link, the ILOS bit
should be cleared.
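The MAC loopback procedure above amounts to a handful of read-modify-write operations. The following Python sketch computes the resulting register values from the current ones; it performs no hardware access. The function name is hypothetical, and the CTRL.SPEED field position (bits 9:8) and CTRL.FD position (bit 0) are assumptions not stated in the procedure itself.

```python
CTRL_FD = 1 << 0            # full duplex (assumed bit position)
CTRL_SLU = 1 << 6           # set link up
CTRL_ILOS = 1 << 7          # invert loss-of-signal
CTRL_SPEED_MASK = 0x3 << 8  # assumption: SPEED field occupies bits 9:8
CTRL_SPEED_1000 = 0b10 << 8
CTRL_FRCSPD = 1 << 11       # force speed
CTRL_FRCDPLX = 1 << 12      # force duplex
RCTL_LBM_MASK = 0x3 << 6    # loopback mode field, bits 7:6
RCTL_LBM_MAC = 0b01 << 6
PHY_CTRL_AN_ENABLE = 1 << 12


def mac_loopback_values(ctrl, rctl, phy_ctrl, phy_has_link=False):
    """Return (CTRL, RCTL, PHY register 0) values for MAC loopback mode."""
    rctl = (rctl & ~RCTL_LBM_MASK) | RCTL_LBM_MAC
    ctrl = (ctrl & ~CTRL_SPEED_MASK) | CTRL_SPEED_1000
    ctrl |= CTRL_SLU | CTRL_FRCSPD | CTRL_FRCDPLX | CTRL_FD
    if phy_has_link:
        ctrl &= ~CTRL_ILOS   # per the note: clear ILOS when the PHY has link
    else:
        ctrl |= CTRL_ILOS
    phy_ctrl &= ~PHY_CTRL_AN_ENABLE  # disable auto-negotiation in the PHY
    return ctrl & 0xFFFFFFFF, rctl & 0xFFFFFFFF, phy_ctrl & 0xFFFF
```

Computing the values separately from writing them makes the sequence easy to inspect before it is applied through the CSR and MDIC access routines of a particular driver.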
3.5.6.3
Internal PHY Loopback
In Internal PHY loopback, the SerDes block is not functional and data is looped back at the end of the
PHY functionality. This means that all of the design that is functional in copper mode is involved in the
loopback.
3.5.6.3.1
Setting the 82576 to PHY loopback Mode
The following procedure should be used to put the 82576 in PHY loopback mode:
• Set Link mode to PHY: CTRL_EXT.LINK_MODE (CSR 0x18 BITS 23:22) = 0b00
• In PHY control register (Address 0 in the PHY):
— Set Duplex mode (bit 8)
— Set Loopback bit (Bit 14)
— Clear Auto Neg enable bit (Bit 12)
— Set speed using bits 6 and 13 as described in EAS.
— Register value should be:
For 10 Mbps 0x4100
For 100 Mbps 0x6100
For 1000 Mbps 0x4140.
• In port control register (Address 16 (0x10) in the PHY), set bit 14 (Link disable). This is not mandatory
for 1 Gbps but is required for 10/100 Mbps.
While in loopback mode, polling for link might not return a valid link state. Transmit and receive
normally.
Note:
Make sure a Configure command is re-issued (loopback bits set to 00b) to cancel the
loopback mode.
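The PHY control register values listed in the procedure can be derived from the individual bits. The Python sketch below shows the derivation; the constant and function names are hypothetical, and the mapping of speed to bits 6 and 13 is inferred from the three register values given above.

```python
PHY_CTRL_SPEED_MSB = 1 << 6   # set for 1000 Mbps
PHY_CTRL_DUPLEX = 1 << 8      # full duplex
PHY_CTRL_SPEED_LSB = 1 << 13  # set for 100 Mbps
PHY_CTRL_LOOPBACK = 1 << 14   # loopback enable
# Bit 12 (auto-negotiation enable) is left cleared in every case.

_SPEED_BITS = {
    10: 0,
    100: PHY_CTRL_SPEED_LSB,
    1000: PHY_CTRL_SPEED_MSB,
}


def phy_loopback_control(speed_mbps: int) -> int:
    """Return the PHY register 0 value for internal PHY loopback."""
    if speed_mbps not in _SPEED_BITS:
        raise ValueError("speed must be 10, 100, or 1000 Mbps")
    return PHY_CTRL_LOOPBACK | PHY_CTRL_DUPLEX | _SPEED_BITS[speed_mbps]
```

The results reproduce the documented values 0x4100, 0x6100, and 0x4140 for 10, 100, and 1000 Mbps respectively.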
3.5.6.4
SerDes Loopback
In SerDes loopback the PHY block is not functional and data is looped back at the end of the SerDes
functionality. This means all the design that is functional in SerDes/SGMII mode, is involved in the
loopback.
Note:
SerDes loopback is functional only if the SerDes link is up.
3.5.6.4.1
Setting SerDes loopback Mode
The following procedure should be used to put the 82576 in SerDes loopback mode:
• Set Link mode to SerDes: CTRL_EXT.LINK_MODE (CSR 0x18 BITS 23:22) = 0b11
• Configure SERDES (register 4 bit 1) to loopback: write to SERDESCTL (CSR 0x00024) the value
0x410
• Move to Force mode by setting the following bits:
— CTRL.FD (CSR 0x0 bit 0) = 1
— CTRL.SLU (CSR 0x0 bit 6) = 1
— CTRL.RFCE (CSR 0x0 bit 27) = 0
— CTRL.TFCE (CSR 0x0 bit 28) = 0
— CTRL.ILOS (CSR 0x0 bit 7) = 1
— CTRL.LRST (CSR 0x0 bit 3) = 0
— PCS_LCTL.FORCE_LINK (CSR 0x04208 bit 5) = 1
— PCS_LCTL.FSD (CSR 0x04208 bit 4) = 1
— PCS_LCTL.FDV (CSR 0x04208 bit 3) = 1
— PCS_LCTL.FLV (CSR 0x04208 bit 0) = 1
— PCS_LCTL.AN_ENABLE (CSR 0x04208 bit 16) = 0
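The SerDes loopback steps above can be expressed as a table of clear and set masks per register. The Python sketch below applies such a table to a dictionary that stands in for the register file; the helper names are hypothetical, and a real driver would replace the dictionary with its own CSR read and write routines.

```python
CTRL = 0x00000
CTRL_EXT = 0x00018
SERDESCTL = 0x00024
PCS_LCTL = 0x04208


def bits(*positions):
    """Return a mask with the given bit positions set."""
    mask = 0
    for position in positions:
        mask |= 1 << position
    return mask


# Each step is (register offset, bits to clear, bits to set).
SERDES_LOOPBACK_STEPS = [
    (CTRL_EXT, bits(22, 23), bits(22, 23)),     # LINK_MODE = 11b (SerDes)
    (SERDESCTL, 0xFFFFFFFF, 0x410),             # whole-register write of 0x410
    (CTRL, bits(27, 28, 3), bits(0, 6, 7)),     # clear RFCE/TFCE/LRST; set FD/SLU/ILOS
    (PCS_LCTL, bits(16), bits(5, 4, 3, 0)),     # clear AN_ENABLE; set FORCE_LINK/FSD/FDV/FLV
]


def apply_steps(regs, steps):
    """Return a copy of regs with every read-modify-write step applied."""
    out = dict(regs)
    for offset, clear_mask, set_mask in steps:
        value = out.get(offset, 0)
        out[offset] = ((value & ~clear_mask) | set_mask) & 0xFFFFFFFF
    return out
```

Keeping the steps in a table makes it straightforward to compare the programmed values against the procedure line by line.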
3.5.6.5
External PHY Loopback
In External PHY loopback the SerDes block is not functional and data is sent through the MDI interface
and looped back using an external loopback plug. This means all the design that is functional in copper
mode, is involved in the loopback.
3.5.6.5.1
Setting the 82576 to External PHY loopback Mode
The following procedure should be used to put the 82576 in External PHY loopback mode:
• Set Link mode to PHY: CTRL_EXT.LINK_MODE (CSR 0x18 BITS 23:22) = 0b00
• In PHY control register (Address 0 in the PHY), write 0x0140 to:
— Set Duplex mode (bit 8)
— Clear Loopback bit (Bit 14)
— Clear Auto Neg enable bit (Bit 12)
— Force 1 Gbps mode (set bit 6 and clear bit 13)
• Force master mode by setting GCON PHY register (Address 9 in the PHY) to 0x1a00
• Tune the PHY DSP to Loopback operation (in 1 Gbps mode only) using the following sequence:
— Set PHY Register address 0x12 to 0x1610:
• Enable Loopback on Twisted Pair
• Disable Flip Chip
• Auto MDI-X
— Turn off NEXT cancellers using the following command:
• Set PHY Register address 0x1f37 to 0x3f1c.
The above procedure puts the device in PHY loopback mode. After using the procedure, wait for the link to
come up. Once PHY register 1 bit 2 is set (this can take up to 750 ms), transmit and receive normally.
If you are unable to get link after 750 ms, reset the PHY using CTRL.PHY_RST (see Section 4.2.1.10)
and then repeat the above procedure.
When exiting External PHY loopback mode, a full PHY reset must be done. Use CTRL.PHY_RST (see
Section 4.2.1.10).
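The wait-for-link step of the external loopback procedure can be sketched as a bounded polling loop. In this Python illustration, `read_phy_reg` is a hypothetical callable supplied by the caller (for example, a wrapper around MDIC accesses); the polling interval is an arbitrary choice, and only the 750 ms limit and the use of PHY register 1 bit 2 come from the text above.

```python
import time

PHY_STATUS_REG = 1
PHY_STATUS_LINK_UP = 1 << 2


def wait_for_phy_link(read_phy_reg, timeout_s=0.75, poll_interval_s=0.01,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll PHY register 1 bit 2 until link is up or the timeout expires.

    Returns True if link came up, False if the caller should reset the PHY
    (CTRL.PHY_RST) and repeat the loopback setup.
    """
    deadline = clock() + timeout_s
    while True:
        if read_phy_reg(PHY_STATUS_REG) & PHY_STATUS_LINK_UP:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

Passing the clock and sleep functions as parameters allows the loop to be exercised without hardware or real delays.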
3.5.7
Integrated Copper PHY Functionality
The PHY default configuration is determined by data from the EEPROM, which is read right after power-on reset.
The register set used to control the PHY functionality (PHYREG) is described in Section 8.25.
3.5.7.1
PHY Initialization Functionality
3.5.7.1.1
Auto MDIO Register Initialization
The 82576 PHY supports an option to automatically initialize MDIO registers with values from EEPROM/
ROM if the hardware defaults are not adequate.
In the 82576, this is performed by the MMS unit (firmware).
There are two types of register initialization:
1. General register initialization - any register in PHY can be initialized.
2. EEPROM bit initialization - some bits in the PHY mirror EEPROM bits: 25.6,
25.3:0, and 26.0.
After any PHY reset (power down included), the PHY needs to be initialized for both steps 1 and 2.
The register initialization is done by the MMS (firmware) through the MAC/PHY MDIO interface (MDIC).
3.5.7.1.2
General Register Initialization
A block of data is allocated in EEPROM/ROM (see Section 6.4).
This block holds register addresses and data in MDIC format (Section 8.2.4).
Every time a PHY reset ends, this block is read from EEPROM by the MMS and is written to PHY registers
through the MDIC registers and the MDIO interface.
3.5.7.1.3
Mirror Bit Initialization
There are a number of bits (OEM bits) that reside in EEPROM/MAC control registers that have a mirror
bit in the PHY registers. These bits are also updated by the MMS after every PHY reset.
These bits are updated after the general register initialization, through a read-modify-write
sequence.
The current mirror bits are register bits 25.6, 25.3:0, and 26.0.
3.5.7.2
Determining Link State
The PHY and its link partner determine the type of link established through one of three methods:
• Auto-negotiation
• Parallel detection
• Forced operation
Auto-negotiation is the only method allowed by the 802.3ab standard for establishing a 1000BASE-T
link, although forced operation could be used for test purposes. For 10/100 links, any of the three
methods can be used. The following sections discuss each in greater detail.
Figure 3-6 provides an overview of link establishment. First the PHY checks if auto-negotiation is
enabled. By default, the PHY supports auto-negotiation (see PHY Register 0, bit 12). If auto-negotiation is
disabled, the PHY forces operation as directed. If auto-negotiation is enabled, the PHY begins transmitting Fast Link
Pulses (FLPs) and receiving FLPs from its link partner. If FLPs are received by the PHY, auto-negotiation
proceeds. It also can receive 100BASE-TX MLT3 and 10BASE-T Normal Link Pulses (NLPs). If either
MLT3 or NLPs are received, it aborts FLP transmission and immediately brings up the corresponding
half-duplex link.
Figure 3-6. Overview of Link Establishment
3.5.7.2.1
False Link
The PHY does not falsely establish link with a partner operating at a different speed. For example, the
PHY does not establish a 1 Gb/s or 10 Mb/s link with a 100 Mb/s link partner.
When the PHY is first powered on, reset, or encounters a link down state, it must determine the line
speed and operating conditions to use for the network link.
The PHY first checks the MDIO registers (initialized via the hardware control interface or written by
software) for operating instructions. Using these mechanisms, designers can command the PHY to do
one of the following:
• Force twisted-pair link operation to:
— 1000T, full duplex
— 1000T, half duplex
— 100TX, full duplex
— 100TX, half duplex
— 10BASE-T, full duplex
— 10BASE-T, half duplex
• Allow auto-negotiation/parallel-detection.
In the first six cases (forced operation), the PHY immediately begins operating the network interface as
commanded. In the last case, the PHY begins the auto-negotiation/parallel-detection process.
3.5.7.2.2
Forced Operation
Forced operation can be used to establish 10 Mb/s and 100 Mb/s links, and 1000 Mb/s links for test
purposes. In this method, auto-negotiation is disabled completely and the link state of the PHY is
determined by MII Register 0.
Note:
When speed is forced, the auto cross-over feature is not functional.
In forced operation, the designer sets the link speed (10, 100, or 1000 Mb/s) and duplex state (full or
half). For Gigabit (1000 Mb/s) links, designers must explicitly designate one side as the master and the
other as the slave.
Note:
The paradox (per the standard): If one side of the link is forced to full-duplex operation and
the other side has auto-negotiation enabled, the auto-negotiating partner parallel-detects
to a half-duplex link while the forced side operates as directed in full-duplex mode. The
result is spurious, unexpected collisions on the side configured to auto-negotiate.
Table 3-35 lists link establishment procedures.
Table 3-35. Determining Duplex State Via Parallel Detection
Configuration                                                            Result
Both sides set for auto-negotiate                                        Link is established via auto-negotiation.
Both sides set for forced operation                                      No problem as long as duplex settings match.
One side set for auto-negotiation and the other for forced, half-duplex  Link is established via parallel detect.
One side set for auto-negotiation and the other for forced full-duplex   Link is established; however, sides disagree, resulting in transmission problems (forced side is full-duplex, auto-negotiation side is half-duplex).
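The outcomes in Table 3-35 can be captured in a short Python sketch that reports how the link is established and whether the two sides end up with mismatched duplex settings. The function names and the configuration strings ("auto", "forced-full", "forced-half") are hypothetical labels chosen for this illustration.

```python
_FORCED = {"forced-full": "full", "forced-half": "half"}


def link_outcome(side_a: str, side_b: str):
    """Return (method, duplex_a, duplex_b) for two link-partner configurations.

    Duplex values are None when they are resolved by auto-negotiation.
    """
    for side in (side_a, side_b):
        if side != "auto" and side not in _FORCED:
            raise ValueError(f"unknown configuration: {side}")
    if side_a == "auto" and side_b == "auto":
        return ("auto-negotiation", None, None)
    if side_a != "auto" and side_b != "auto":
        return ("forced", _FORCED[side_a], _FORCED[side_b])
    # Exactly one side auto-negotiates: it parallel-detects and assumes half duplex.
    if side_a == "auto":
        return ("parallel detection", "half", _FORCED[side_b])
    return ("parallel detection", _FORCED[side_a], "half")


def duplex_mismatch(outcome) -> bool:
    """True when both duplex settings are known and they disagree."""
    _, duplex_a, duplex_b = outcome
    return duplex_a is not None and duplex_a != duplex_b
```

The problematic case from the table, one auto-negotiating side against a forced full-duplex partner, is reported as a parallel-detection link with a duplex mismatch.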
3.5.7.2.3
Auto Negotiation
The PHY supports the IEEE 802.3u auto-negotiation scheme with next page capability. Next page
exchange uses Register 7 to send information and Register 8 to receive it. Next page exchange can
only occur if both ends of the link advertise their ability to exchange next pages.
3.5.7.2.4
Parallel Detection
Parallel detection can only be used to establish 10 and 100 Mb/s links. It occurs when the PHY tries to
negotiate (transmit FLPs to its link partner), but instead of sensing FLPs from the link partner, it senses
100BASE-TX MLT3 code or 10BASE-T Normal Link Pulses (NLPs). In this case, the PHY
immediately stops auto-negotiation (terminates transmission of FLPs) and brings up
whatever link corresponds to what it has sensed (MLT3 or NLPs). If the PHY senses both technologies,
a parallel detection fault is detected and the PHY continues sending FLPs.
With parallel detection, it is impossible to determine the true duplex state of the link partner and the
IEEE standard requires the PHY to assume a half-duplex link. Parallel detection also does not allow
exchange of flow-control ability (PAUSE and ASM_DIR) or the master/slave relationship required by
1000BASE-T. This is why parallel detection cannot be used to establish GbE links.
3.5.7.2.5
Auto Cross-Over
Twisted-pair Ethernet PHYs must be correctly configured for MDI or MDI-X operation to interoperate.
This has historically been accomplished using special patch cables, magnetics pinouts, or Printed Circuit
Board (PCB) wiring. The PHY supports the automatic MDI/MDI-X configuration originally developed for
1000BASE-T and standardized in IEEE 802.3ab (Clause 40). Manual (non-automatic) configuration is still
possible.
For 1000BASE-T links, pair identification is determined automatically in accordance with the standard.
For 10/100 Mb/s links and during auto-negotiation, pair usage is determined by bits 12 and 13 in the
Port Control Register (PHYREG18).
In addition, the PHY has an automatic cross-over detection function. If bit 18.12 = 1b, the PHY
automatically detects which application is being used and configures itself accordingly.
The automatic MDI/MDI-X state machine facilitates switching the MDI_PLUS[0] and MDI_MINUS[0]
signals with the MDI_PLUS[1] and MDI_MINUS[1] signals, respectively, prior to the auto-negotiation
mode of operation so that FLPs can be transmitted and received in compliance with Clause 28 auto-negotiation
specifications. An algorithm that controls the switching function determines the correct
polarization of the cross-over circuit. This algorithm uses an 11-bit Linear Feedback Shift Register
(LFSR) to create a pseudo-random sequence that each end of the link uses to determine its proposed
configuration. After making the selection to either MDI or MDI-X, the node waits for a specified amount
of time while evaluating its receive channel to determine whether the other end of the link is sending
link pulses or PHY-dependent data. If link pulses or PHY-dependent data are detected, it remains in that
configuration. If link pulses or PHY-dependent data are not detected, it increments its LFSR and makes
a decision to switch based on the value of the next bit. The state machine does not move from one
state to another while link pulses are being transmitted.
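The pseudo-random switching decision can be illustrated with a small 11-bit LFSR model in Python. This is a sketch under stated assumptions rather than the PHY's actual logic: the feedback taps (state bits 10 and 8) follow a common reading of the IEEE 802.3 Clause 40 automatic MDI/MDI-X description and are not given in this section, the class and function names are hypothetical, and the rule "switch when the next bit is 1" is one possible interpretation of the text above.

```python
class MdixLfsr:
    """11-bit Fibonacci LFSR producing one pseudo-random bit per step."""

    def __init__(self, seed: int = 0x001):
        if not 0 < seed < (1 << 11):
            raise ValueError("seed must be a non-zero 11-bit value")
        self.state = seed

    def next_bit(self) -> int:
        # Assumed taps: XOR of state bits 10 and 8 is shifted into bit 0.
        feedback = ((self.state >> 10) ^ (self.state >> 8)) & 1
        self.state = ((self.state << 1) | feedback) & 0x7FF
        return (self.state >> 10) & 1


def next_crossover_config(current: str, lfsr: MdixLfsr) -> str:
    """Pick the next configuration when no link pulses or data were detected."""
    if current not in ("MDI", "MDI-X"):
        raise ValueError("current must be 'MDI' or 'MDI-X'")
    if lfsr.next_bit():
        return "MDI-X" if current == "MDI" else "MDI"
    return current
```

Because each end of the link runs its own sequence from a different starting state, the two ends are unlikely to keep switching in lockstep, which is what allows them to settle on complementary configurations.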
Figure 3-7. Cross-Over Function
3.5.7.2.6
10/100 Mb/s Mismatch Resolution
It is a common occurrence that a link partner (such as a switch) is configured for forced full-duplex 10/
100 Mb/s operation. The normal auto-negotiation sequence would result in the other end settling for
half-duplex 10/100 Mb/s operation. The mechanism described in this section resolves the mismatch
and automatically transitions the 82576 into FDX mode, enabling it to operate with a partner configured
for FDX operation.
The 82576 enables the system software device driver to detect the mismatch event previously
described and set the duplex mode to the appropriate value without going through another
auto-negotiation sequence or breaking the link. Once software detects a possible mismatch, it might
instruct the 82576 to change its duplex setting to either HDX or FDX mode. Software sets the
Duplex_manual_set bit to indicate that the duplex setting should be changed to the value indicated by the
Duplex Mode bit in PHY Register 0. Any change in the value of the Duplex Mode bit in PHY Register 0
while the Duplex_manual_set bit is set to 1b also causes a change in the device duplex setting.
The Duplex_manual_set bit is cleared on all PHY resets, following auto-negotiation, and when the link
goes down. Software might track the change in duplex through the PHY Duplex Mode bit in Register 17
or a MAC indication.
3.5.7.2.7
Link Criteria
Once the link state is determined via auto-negotiation, parallel detection, or forced operation, the PHY
and its link partner bring up the link.
3.5.7.2.7.1
1000BASE-T
For 1000BASE-T links, the PHY and its link partner enter a training phase. They exchange idle symbols
and use the information gained to set their adaptive filter coefficients. These coefficients are used to
equalize the incoming signal, as well as eliminate signal impairments such as echo and cross talk.
Either side indicates completion of the training phase to its link partner by changing the encoding of the
idle symbols it transmits. When both sides so indicate, the link is up. Each side continues sending idle
symbols each time it has no data to transmit. The link is maintained as long as valid idle, data, or
carrier extension symbols are received.
3.5.7.2.7.2
100BASE-TX
For 100BASE-TX links, the PHY and its link partner immediately begin transmitting idle symbols. Each
side continues sending idle symbols each time it has no data to transmit. The link is maintained as long
as valid idle symbols or data is received.
In 100 Mb/s mode, the PHY establishes a link each time the scrambler becomes locked and remains
locked for approximately 50 ms. Link remains up unless the descrambler receives fewer than 12
consecutive idle symbols in any 2 ms period. This provides for a very robust operation, essentially
filtering out any small noise hits that might otherwise disrupt the link.
3.5.7.2.7.3
10BASE-T
For 10BASE-T links, the PHY and its link partner begin exchanging Normal Link Pulses (NLPs). The PHY
transmits an NLP every 16 ms and expects to receive one every 10 to 20 ms. The link is maintained as
long as normal link pulses are received.
In 10 Mb/s mode, the PHY establishes link based on the link state machine found in 802.3, clause 14.
Note:
100 Mb/s idle patterns do not bring up a 10 Mb/s link.
3.5.7.3
Link Enhancements
The PHY offers two enhanced link functions, each of which is discussed in the sections that follow:
• SmartSpeed
• Flow control
3.5.7.3.1
SmartSpeed
SmartSpeed is an enhancement to auto-negotiation that enables the PHY to react intelligently to
network conditions that prohibit establishment of a 1000BASE-T link, such as cable problems. Such
problems might allow auto-negotiation to complete, but then inhibit completion of the training phase.
Normally, if a 1000BASE-T link fails, the PHY returns to the auto-negotiation state with the same speed
settings indefinitely. With SmartSpeed enabled, after a configurable number (1-5, Register 27.8:6) of
failed attempts, the PHY automatically downgrades the highest ability it advertises to the next lower
speed: from 1000 to 100 to 10 Mb/s. Once a link is established, and if it is later broken, the PHY
automatically upgrades the capabilities advertised to the original setting. This enables the PHY to
automatically recover once the cable plant is repaired.
3.5.7.3.1.1
Using SmartSpeed
SmartSpeed is enabled by setting PHYREG.16.7 = 1b. When SmartSpeed downgrades the PHY
advertised capabilities, it sets bit PHYREG.19.5. When link is established, its speed is indicated in
PHYREG.17.15:14. SmartSpeed automatically resets the highest-level auto-negotiation abilities
advertised, if link is established and then lost for more than 2 seconds.
The number of failed attempts allowed is configured by Register 27.8:6.
Note:
SmartSpeed and M/S fault - When SmartSpeed is enabled, the M/S (Master-Slave)
resolution is not given seven attempts to resolve M/S status (see IEEE 802.3 clause
40.5.2), because SmartSpeed downgrades the link after at most five
attempts.
Time To Link with SmartSpeed - In most cases, each attempt takes approximately 2.5
seconds; in other cases it can take longer, depending on configuration and
other factors.
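The SmartSpeed downgrade behavior can be modeled in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, and it assumes that each further downgrade (1000 to 100, then 100 to 10) requires the same configured number of failed attempts, which the text above does not state explicitly.

```python
def smartspeed_advertisement(original, failed_attempts, attempts_per_downgrade):
    """Return the speeds (Mb/s, highest first) advertised after failed attempts."""
    if not 1 <= attempts_per_downgrade <= 5:
        raise ValueError("the attempt count is configurable from 1 to 5")
    ordered = sorted(set(original), reverse=True)
    # Drop one top speed per completed group of failures, never the last speed.
    downgrades = min(failed_attempts // attempts_per_downgrade, len(ordered) - 1)
    return ordered[downgrades:]


# Bad cable: 1000BASE-T training keeps failing with three attempts allowed.
for failures in (0, 3, 6):
    print(failures, smartspeed_advertisement([10, 100, 1000], failures, 3))
```

Once link is established and later lost, the PHY restores the original advertisement, which corresponds to calling the function again with zero failed attempts.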
3.5.7.4
Flow Control
Flow control is a function that is described in Clause 31 of the IEEE 802.3 standard. It allows congested
nodes to pause traffic. Flow control is essentially a MAC-to-MAC function. MACs indicate their ability to
implement flow control during auto-negotiation. This ability is communicated through two bits in the
auto-negotiation registers (PHYREG.4.10 and PHYREG.4.11).
The PHY transparently supports MAC-to-MAC advertisement of flow control through its auto-negotiation
process. Prior to auto-negotiation, the MAC indicates its flow control capabilities via PHYREG.4.10
(Pause) and PHYREG.4.11 (ASM_DIR). After auto-negotiation, the link partner's flow control capabilities
are indicated in PHYREG.5.10 and PHYREG.5.11.
There are two forms of flow control that can be established via auto-negotiation: symmetric and
asymmetric. Symmetric flow control is for point-to-point links; asymmetric for hub-to-end-node
connections. Symmetric flow control enables either node to flow-control the other. Asymmetric flow control enables a repeater or switch to flow-control a DTE, but not vice versa.
Table 3-36 lists the intended operation for the various settings of ASM_DIR and PAUSE. This
information is provided for reference only; it is the responsibility of the MAC to implement the correct
function. The PHY merely enables the two MACs to communicate their abilities to each other.
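The resolution that Table 3-36 describes can be sketched as a small function that a MAC driver might apply to the local advertisement (register 4) and the link partner ability (register 5). The function name and return strings are hypothetical; the bit positions follow the prose above (bit 10 = Pause, bit 11 = ASM_DIR).

```python
PAUSE_BIT = 1 << 10    # PHYREG.4.10 (local) / PHYREG.5.10 (link partner)
ASM_DIR_BIT = 1 << 11  # PHYREG.4.11 (local) / PHYREG.5.11 (link partner)


def resolve_flow_control(local_reg4: int, partner_reg5: int) -> str:
    """Resolve the flow control mode from the two advertised abilities."""
    local_pause = bool(local_reg4 & PAUSE_BIT)
    remote_pause = bool(partner_reg5 & PAUSE_BIT)
    both_asm = bool(local_reg4 & ASM_DIR_BIT) and bool(partner_reg5 & ASM_DIR_BIT)

    if local_pause and remote_pause:
        return "symmetric"
    if both_asm and local_pause:
        return "remote can flow control local"
    if both_asm and remote_pause:
        return "local can flow control remote"
    return "none"


print(resolve_flow_control(0x0C00, 0x0800))
```

As the text notes, the PHY only carries these bits; acting on the result is the responsibility of the MAC.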
Table 3-36. Pause And Asymmetric Pause Settings
ASM_DIR Settings, Local (PHYREG.4.11) and Remote (PHYREG.5.11) | Pause Setting, Local (PHYREG.4.10) | Pause Setting, Remote (PHYREG.5.10) | Result
Both ASM_DIR = 1b | 1 | 1 | Symmetric - Either side can flow control the other
Both ASM_DIR = 1b | 1 | 0 | Asymmetric - Remote can flow control local only
Both ASM_DIR = 1b | 0 | 1 | Asymmetric - Local can flow control remote
Both ASM_DIR = 1b | 0 | 0 | No flow control
Either or both ASM_DIR = 0b | 1 | 1 | Symmetric - Either side can flow control the other
Either or both ASM_DIR = 0b | Either or both = 0 | Either or both = 0 | No flow control
3.5.7.5
Management Data Interface
The PHY supports the IEEE 802.3 MII Management Interface also known as the Management Data
Input/Output (MDIO) Interface. This interface enables upper-layer devices to monitor and control the
state of the PHY. The MDIO interface consists of a physical connection, a specific protocol that runs
across the connection, and an internal set of addressable registers.
The PHY supports the core 16-bit MDIO registers. Registers 0-10 and 15 are required and their
functions are specified by the IEEE 802.3 specification. Additional registers are included for expanded
functionality. Specific bits in the registers are referenced using a PHYREG.X.Y notation, where X is the
register number (0-31) and Y is the bit number (0-15). See the software interface chapter.
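To make the register notation concrete, the following sketch extracts a bit or bit field from a set of 16-bit register values given a reference such as "0.11" or "17.15:14". The helper name and the dictionary used to hold register values are hypothetical; only the X.Y notation itself comes from the text.

```python
def phy_field(registers, spec: str) -> int:
    """Return the value of the bit (X.Y) or field (X.HI:LO) named by spec."""
    reg_text, _, bits_text = spec.partition(".")
    hi_text, _, lo_text = bits_text.partition(":")
    reg = int(reg_text)
    hi = int(hi_text)
    lo = int(lo_text) if lo_text else hi
    if not (0 <= reg <= 31 and 0 <= lo <= hi <= 15):
        raise ValueError(f"bad register reference: {spec}")
    width = hi - lo + 1
    return (registers[reg] >> lo) & ((1 << width) - 1)


# Example register snapshot (values chosen for illustration only).
snapshot = {0: 0x0800, 17: 0x8000}
print(phy_field(snapshot, "0.11"), phy_field(snapshot, "17.15:14"))
```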
3.5.7.6
Low Power Operation and Power Management
The PHY incorporates numerous features to maintain the lowest power possible.
The PHY can be placed into a low-power state by MAC control (Power Management controls)
or via PHY Register 0. In either power down mode, the PHY is not capable of receiving or transmitting
packets.
3.5.7.6.1
Power Down via the PHY Register
The PHY can be powered down using the control bit found in PHYREG.0.11. This bit powers down a
significant portion of the port but clocks to the register section remain active. This enables the PHY
management interface to remain active during register power down. The power down bit is active high.
When the PHY exits software power-down (PHYREG.0.11 = 0b), it re-initializes all analog functions, but
retains its previous configuration settings.
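A read-modify-write of the power-down bit might look like the sketch below. The FakeMdio class is a stand-in for real MDIO access (a driver would issue these accesses through the MAC), and all names are hypothetical; only the bit position PHYREG.0.11 and its active-high polarity come from the text.

```python
POWER_DOWN = 1 << 11  # PHYREG.0.11, active high


class FakeMdio:
    """In-memory register file used here in place of real MDIO access."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, reg: int) -> int:
        return self.regs[reg]

    def write(self, reg: int, value: int) -> None:
        self.regs[reg] = value & 0xFFFF


def set_phy_power_down(mdio, enable: bool) -> None:
    """Set or clear the power-down bit without disturbing other control bits."""
    control = mdio.read(0)
    if enable:
        control |= POWER_DOWN
    else:
        control &= ~POWER_DOWN
    mdio.write(0, control)


mdio = FakeMdio()
mdio.write(0, 0x1140)          # example control value, for illustration only
set_phy_power_down(mdio, True)
print(hex(mdio.read(0)))
```

Preserving the other bits matters because the PHY retains its previous configuration settings across software power-down.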
3.5.7.6.2
Power Management State
The PHY is aware of the power management state. If the PHY is not in a power-down state, its behavior
regarding several features differs depending on the power state. See Section 3.5.7.6.4 for
details.
3.5.7.6.3
AN1000_dis
AN1000_dis is an option to disable 1000 Mb/s advertisement in PHY regardless of Register 9.
This is for cases where the system doesn't support working in 1000 Mb/s due to power limitations.
This option is enabled by the following bits in the PHY registers:
• PHYREG 25.3 - disable 1000 Mb/s when in non-D0a states only.
• PHYREG 25.6 - disable 1000 Mb/s always.
• PHYREG 26.0 - same as 25.6, but this is a secure bit (see Secure Register chapter).
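The three bits above combine as shown in this sketch. The function name and argument names are hypothetical; the bit positions are the ones listed above.

```python
def an1000_disabled(reg25: int, reg26: int, in_d0a: bool) -> bool:
    """Return True when 1000 Mb/s advertisement is suppressed by AN1000_dis."""
    disable_in_non_d0a = bool(reg25 & (1 << 3))   # PHYREG 25.3
    disable_always = bool(reg25 & (1 << 6))       # PHYREG 25.6
    disable_secure = bool(reg26 & (1 << 0))       # PHYREG 26.0 (secure bit)
    if disable_always or disable_secure:
        return True
    return disable_in_non_d0a and not in_d0a


print(an1000_disabled(reg25=1 << 3, reg26=0, in_d0a=False))
```

Because the option applies regardless of Register 9, a False result only means that AN1000_dis itself does not block 1000 Mb/s; the normal advertisement registers still apply.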
3.5.7.6.4
Low Power Link Up - Link Speed Control
Normal PHY speed negotiation drives to establish a link at the highest possible speed. The PHY supports
an additional mode of operation, where the PHY drives to establish a link at a low speed. The link-up
process enables a link to come up at the lowest possible speed in cases where power is more important
than performance. Different behavior is defined for the D0 state and the other non-D0 states.
Note:
The Low-Power Link-Up (LPLU) feature should be disabled (in both the D0a
state and non-D0a states) when the advertised speeds are anything other than 10/100/
1000 Mb/s (all three). This avoids reaching (through the LPLU procedure) a link speed
that is not advertised by the user.
Table 3-37 lists link speed as a function of power management state, link speed control, and GbE speed
enabling:
Table 3-37. Link Speed vs. Power State
Power Management State | Low Power Link Up (reg 25.1, reg 25.2) | GbE Disable Bits: Disable 1000 (reg 25.6) | GbE Disable Bits: Disable 1000 in non-D0a (reg 25.3) | PHY Speed Negotiation
D0a | 0, Xb | 0b | X | PHY negotiates to highest speed advertised (normal operation).
D0a | 0, Xb | 1b | X | PHY negotiates to highest speed advertised (normal operation), excluding 1000 Mb/s.
D0a | 1, Xb | 0b | X | PHY goes through Low Power Link Up (LPLU) procedure, starting with advertised values.
D0a | 1, Xb | 1b | X | PHY goes through LPLU procedure, starting with advertised values. Does not advertise 1000 Mb/s.
Non-D0a | X, 0b | 0b | 0b | PHY negotiates to highest speed advertised.
Non-D0a | X, 0b | 0b | 1b | PHY negotiates to highest speed advertised, excluding 1000 Mb/s.
Non-D0a | X, 0b | 1b | X | PHY negotiates to highest speed advertised, excluding 1000 Mb/s.
Non-D0a | X, 1b | 0b | 0b | PHY goes through LPLU procedure, starting at 10 Mb/s.
Non-D0a | X, 1b | 0b | 1b | PHY goes through LPLU procedure, starting at 10 Mb/s. Does not advertise 1000 Mb/s.
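The rows of Table 3-37 reduce to two independent decisions, sketched below with already-decoded bit values as inputs. All names are hypothetical. The sketch applies the same rule to the combination of non-D0a, LPLU set, and Disable 1000 set, which the table does not list; that case is an extrapolation.

```python
from typing import NamedTuple


class Negotiation(NamedTuple):
    lplu: bool            # True: Low Power Link Up procedure is used
    start: str            # where the first negotiation attempt starts
    advertise_1000: bool  # False: 1000 Mb/s is excluded


def phy_speed_negotiation(in_d0a, lplu_d0, lplu_non_d0a,
                          disable_1000, disable_1000_non_d0a):
    """Map power state and control bits to the negotiation behavior.

    lplu_d0 is reg 25.1, lplu_non_d0a is reg 25.2, disable_1000 is reg 25.6,
    and disable_1000_non_d0a is reg 25.3.
    """
    lplu = lplu_d0 if in_d0a else lplu_non_d0a
    advertise_1000 = not (disable_1000 or (disable_1000_non_d0a and not in_d0a))
    if not lplu:
        start = "highest advertised"
    elif in_d0a:
        start = "advertised values"
    else:
        start = "10 Mb/s"
    return Negotiation(lplu, start, advertise_1000)


print(phy_speed_negotiation(False, False, True, False, True))
```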
The PHY initiates auto-negotiation without a direct driver command in the following cases:
• When the state of Disable_1000 changes. For example, if 1000 Mb/s is disabled on D3 or Dr entry
(but not in D0a), the PHY auto-negotiates on entry.
• When LPLU changes state with a change in a power management state. For example, on transition
from D0a without LPLU to D3 with LPLU. Or, on transition from D3 with LPLU to D0 without LPLU.
• On a transition from D0a state to a non-D0a state, or from a non-D0a state to D0a state, and LPLU
is set.
3.5.7.6.4.1
D0a State
A power-managed link speed control lowers link speed (and power) when highest link performance is
not required. When enabled (D0 Low Power Link Up mode), any link negotiation tries to establish a low link speed, starting with an initial advertisement defined by software.
The D0LPLU configuration bit enables D0 Low Power Link Up. Before enabling this feature, software
must advertise one of the following speed combinations: 10 Mb/s only, 10/100 Mb/s only, or 10/100/
1000 Mb/s.
When speed negotiation starts, the PHY tries to negotiate at a speed based on the currently advertised
values. If link establishment fails, the PHY tries to negotiate with different speeds; it enables all speeds
up to the lowest speed supported by the partner. For example, PHY advertises 10 Mb/s only, and the
partner supports 1000 Mb/s only. After the first try fails, the PHY enables 10/100/1000 Mb/s and tries
again. The PHY continues to try and establish a link until it succeeds or until it is instructed otherwise.
In the second step (adjusting to partner speed), the PHY also enables parallel detect, if needed.
Automatic MDI/MDI-X resolution is done during the first auto-negotiation stage.
3.5.7.6.4.2
Non-D0a State
The PHY might negotiate to a low speed while in non-D0a states (Dr, D0u, D3). This applies only when
the link is required by one of the following: SMBus manageability, APM Wake, or PME. Otherwise, the
PHY is disabled during the non-D0 state.
The Low Power on Link-Up bit (Register 25.2, also loaded from the EEPROM) enables reduction in link
speed:
• At power-up entry to Dr state, the PHY advertises support for 10 Mb/s only and goes through the
link up process.
• At any entry to a non-D0a state (Dr, D0u, D3), the PHY advertises support for
10 Mb/s only and goes through the link up process.
• While in a non-D0 state, if auto-negotiation is required, the PHY advertises support for 10 Mb/s only
and goes through the link up process.
Link negotiation begins with the PHY trying to negotiate at 10 Mb/s speed only regardless of user auto-negotiation advertisement. If link establishment fails, the PHY tries to negotiate at additional speeds; it
enables all speeds up to the lowest speed supported by the partner. For example, the PHY advertises 10
Mb/s only and the partner supports 1000 Mb/s only. After the first try fails, PHY enables 10/100/
1000 Mb/s and tries again. The PHY continues to try and establish a link until it succeeds or until it is
instructed otherwise. In the second step (adjusting to partner speed), the PHY also enables parallel
detect, if needed. Automatic MDI/MDI-X resolution is done during the first auto-negotiation stage.
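The two-step LPLU procedure can be illustrated with a small model. This is a sketch only: the function name is hypothetical, parallel detect and MDI/MDI-X resolution are ignored, and the partner is assumed to support at least one of 10, 100, or 1000 Mb/s.

```python
ALL_SPEEDS = (10, 100, 1000)  # Mb/s


def lplu_negotiate(first_advertisement, partner_speeds):
    """Return (link speed, list of advertisements tried) for the LPLU procedure."""
    attempts = [sorted(first_advertisement)]
    common = set(first_advertisement) & set(partner_speeds)
    if not common:
        # Second step: enable all speeds up to the partner's lowest speed.
        lowest_partner = min(partner_speeds)
        widened = [speed for speed in ALL_SPEEDS if speed <= lowest_partner]
        attempts.append(widened)
        common = set(widened) & set(partner_speeds)
    # Auto-negotiation settles on the highest speed both sides advertise.
    return max(common), attempts


# The example from the text: PHY starts at 10 Mb/s only, partner is 1000 Mb/s only.
print(lplu_negotiate([10], [1000]))
```

In the D0a state the first advertisement is whatever software programmed; in non-D0a states it is 10 Mb/s only.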
3.5.7.6.5
Smart Power-Down (SPD)
Smart power-down is a link-disconnect capability applicable to all power management states. SPD
combines a power saving mechanism with the fact that the link might disappear and resume.
Smart power-down is enabled by PHYREG 25.0 or by SPD Enable bit in the EEPROM and is entered
when the PHY detects link loss. Auto-negotiation must also be enabled. While in the smart power-down
state, the PHY powers down circuits and clocks that are not required for detection of link activity. The
PHY is still able to detect link pulses (including parallel detect) and wake up to engage in link
negotiation. The PHY does not send link pulses (NLP) while in SPD state; however, register accesses are
still possible.
When the PHY is in smart power-down and detects link activity, it re-negotiates link speed based on the
power state and the Low Power Link Up bit as described in PHYREG 25.1 and 25.2.
Note:
The link-disconnect state applies to all power management states (Dr, D0u, D0a, D3).
The link might change status, that is go up or go down, while in any of these states.
3.5.7.6.5.1
Back-to-Back Smart Power-Down
While in link disconnect, the 82576 monitors the link for link pulses to identify when a link is reconnected. The 82576 also periodically transmits pulses to resolve the case of two 82576s (or
devices with 82576-like behavior) connected to each other across the link. Otherwise, two such
devices might be locked in smart power-down mode, not capable of identifying that a link was reconnected.
The link pulses are transmitted on average every 100 ms on alternate channels (A/B and C/D) and add
less than 1% to total 82576 power in link disconnect mode. Pulses do not conform to the IEEE specification
regarding link pulse template. A single pulse should be enough to bring a receiver out of smart power-down mode in a worst-case configuration (such as maximum cable length, highest cable attenuation,
etc.).
If the link partners are disconnected and then reconnected, it is possible that the two controllers
transmit their pulses at the same time. Since the 82576 masks its receiver during pulse transmission,
such synchronization causes pulses to be missed by both partners. A randomization factor is therefore
applied to the timing of transmitted pulses, affecting the period between pulses. The randomization
factor is specific per device and should reduce the probability of a lock to 10^-4. Note that if the two
partners happen to transmit within the same slot, and if the randomization factor happens to be similar,
it takes longer for the partners to get out of sync with each other.
Back-to-back smart power-down is enabled by the SPD_B2B_EN bit in the PHY registers. The default
value is enabled. The Enable bit applies to smart power-down mode.
Note:
This bit should not be altered by software once the 82576 is set in smart power-down
mode. If software requires changing the back-to-back status, it first needs to transition the
PHY out of smart power-down mode and only then change the back-to-back bit to the
required state.
3.5.7.6.6
Link Energy Detect
The PHY asserts the Link Energy Detect bit (PHYREG 25.4) each time energy is detected on the link.
This bit provides an indication of a cable becoming plugged or unplugged.
This bit is valid only if auto-negotiation is enabled and smart power-down is enabled (reg 25.0).
In order to correctly deduce that there is no energy, the bit must read 0b for three consecutive reads
each second.
3.5.7.6.7
PHY Power-Down State
Each 82576 port enters a power-down state when none of its clients is enabled and therefore has no
need to maintain a link. This can happen in one of the following cases. Note that PHY power-down must
be enabled through the EEPROM PHY Power Down Enable bit.
1. D3/Dr state: Each PHY enters a low-power state if the following conditions are met:
a. The LAN function associated with this PHY is in a non-D0 state.
b. APM WOL is inactive.
c. Manageability doesn't use this port.
d. ACPI PME is disabled for this port.
e. The PHY Power Down Enable EEPROM bit is set (word 0xF, bit 6).
2. SerDes mode: Each PHY is disabled when its LAN function is configured to SerDes mode.
3. LAN disable: Each PHY can be disabled if its LAN function's LAN Disable input indicates that the
relevant function should be disabled. Since the PHY is shared between the LAN function and
manageability, it might not be desirable to power down the PHY in LAN Disable. The
PHY_in_LAN_Disable EEPROM bit determines whether the PHY (and MAC) are powered down when
the LAN Disable pin is asserted. The default is not to power down.
A LAN port can also be disabled through EEPROM settings. If the LAN_DIS EEPROM bit is set, the PHY
enters power down. Note, however, that setting the EEPROM LAN_PCI_DIS bit does not bring the PHY
into power down.
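The power-down cases in this section can be summarized as a predicate. This is a sketch with hypothetical names; it assumes the PHY Power Down Enable bit gates only the D3/Dr case, where it is explicitly listed as a condition, which is one reading of the text rather than a confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class PortState:
    power_down_enable: bool = True         # EEPROM PHY Power Down Enable (word 0xF, bit 6)
    function_in_d0: bool = True            # LAN function power state
    apm_wol_active: bool = False
    manageability_uses_port: bool = False
    acpi_pme_enabled: bool = False
    serdes_mode: bool = False              # LAN function configured to SerDes
    lan_disable_pin: bool = False          # LAN Disable input asserted
    phy_in_lan_disable: bool = False       # EEPROM PHY_in_LAN_Disable
    lan_dis_eeprom: bool = False           # EEPROM LAN_DIS


def phy_powered_down(port: PortState) -> bool:
    """Return True when the PHY of this port enters the power-down state."""
    if port.serdes_mode:                                   # case 2
        return True
    if port.lan_dis_eeprom:                                # EEPROM LAN disable
        return True
    if port.lan_disable_pin and port.phy_in_lan_disable:   # case 3
        return True
    # Case 1: D3/Dr state with no client that needs the link.
    return (port.power_down_enable
            and not port.function_in_d0
            and not port.apm_wol_active
            and not port.manageability_uses_port
            and not port.acpi_pme_enabled)


print(phy_powered_down(PortState(function_in_d0=False)))
```

Note that LAN_PCI_DIS is deliberately absent from the model, because setting it does not bring the PHY into power down.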
3.5.7.7
Advanced Diagnostics
The 82576 PHY incorporates hardware support for advanced diagnostics.
The hardware support enables output of internal PHY data to host memory for post processing by the
software device driver.
Diagnostics supported are:
3.5.7.7.1
TDR - Time Domain Reflectometry
By sending a pulse onto the twisted pair and observing the returned signal, the following can be
deduced:
1. Is there a short?
2. Is there an open?
3. Is there an impedance mismatch?
4. What is the length to any of these faults?
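The post-processing that answers these questions rests on two standard transmission-line relations: the reflection coefficient (Z_load - Z0) / (Z_load + Z0) and distance = velocity x round-trip time / 2. The sketch below illustrates them; it is not the driver's algorithm. The velocity factor of 0.64, the 100-ohm reference impedance, the classification tolerance, and all names are assumptions for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def reflection_coefficient(z_load: float, z0: float = 100.0) -> float:
    """Reflection coefficient seen at a discontinuity of impedance z_load."""
    return (z_load - z0) / (z_load + z0)


def classify_reflection(coefficient: float, tolerance: float = 0.1) -> str:
    """Classify a fault from the sign and size of the reflected pulse."""
    if coefficient >= 1.0 - tolerance:
        return "open"
    if coefficient <= -1.0 + tolerance:
        return "short"
    if abs(coefficient) <= tolerance:
        return "matched"
    return "impedance mismatch"


def fault_distance_m(round_trip_s: float, velocity_factor: float = 0.64) -> float:
    """Distance to the fault; the pulse travels there and back."""
    return SPEED_OF_LIGHT * velocity_factor * round_trip_s / 2.0


print(classify_reflection(reflection_coefficient(0.0)),
      round(fault_distance_m(1e-6), 1), "m")
```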
3.5.7.7.2
Channel Frequency Response
By analyzing the Tx and Rx data, the channel's frequency response (also known as insertion loss)
can be established and used to determine whether the channel is within specification limits (clause
40.7.2.1 in IEEE 802.3).
3.5.7.8
1000 Mb/s Operation
3.5.7.8.1
Introduction
Figure 3-8 shows an overview of 1000BASE-T functions, followed by discussion and review of the
internal functional blocks.
Figure 3-8. 1000BASE-T Functions Overview
3.5.7.8.2
Transmit Functions
This section describes functions used when the Media Access Controller (MAC) transmits data through
the PHY and out onto the twisted-pair connection (see Figure 3-8).
3.5.7.8.2.1
Scrambler
The scrambler randomizes the transmitted data. The purpose of scrambling is twofold:
1. Scrambling eliminates repeating data patterns (also known as spectral lines) from the 4DPAM5
waveform in order to reduce EMI.
2. Each channel (A, B, C, D) has a unique signature that the receiver uses for identification.
The scrambler is driven by a 33-bit Linear Feedback Shift Register (LFSR), which is randomly loaded at
power up. The LFSR function used by the master differs from that used by the slave, giving each
direction its own unique signature. The LFSR, in turn, generates twelve mutually uncorrelated outputs.
Eight of these are used to randomize the inputs to the 4DPAM5 and Trellis encoders. The remaining four
outputs randomize the sign of the 4DPAM5 outputs.
3.5.7.8.2.2
Transmit FIFO
The transmit FIFO re-synchronizes data transmitted by the MAC to the transmit reference used by the
PHY. The FIFO is large enough to support a frequency differential of up to +/- 1000 ppm over a packet
size of 9500 bytes (max jumbo frame).
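The sizing statement can be checked with simple arithmetic: over one maximum-size packet, a clock differential of 1000 ppm lets the writer and reader drift apart by a small number of bytes. The function name is hypothetical, and the result is the drift to be absorbed, not the actual FIFO depth, which the text does not give.

```python
def fifo_slip_bytes(packet_bytes: int, differential_ppm: int) -> float:
    """Bytes gained or lost between FIFO writer and reader over one packet."""
    return packet_bytes * differential_ppm / 1_000_000


slip = fifo_slip_bytes(9500, 1000)
print(f"drift over one jumbo frame: {slip} bytes ({slip * 8:.0f} bits), either direction")
```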
3.5.7.8.2.3
Transmit Phase-Locked Loop (PLL)
This function generates the 125 MHz timing reference used by the PHY to transmit 4DPAM5 symbols.
When the PHY is the master side of the link, the XI input is the reference for the transmit PLL. When the
PHY is the slave side of the link, the recovered receive clock is the reference for the transmit PLL.
3.5.7.8.2.4
Trellis Encoder
The Trellis encoder uses the two high-order bits of data and its previous output to generate a ninth bit,
which determines if the next 4DPAM5 pattern should be even or odd.
For data, this function is:
Trellis_n = Data7_{n-1} XOR Data6_{n-2} XOR Trellis_{n-3}
This provides forward error correction and enhances the Signal-to-Noise Ratio (SNR) by 6 dB.
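The recurrence above can be evaluated directly, as in the sketch below. It implements the formula literally as written in this section, not the complete IEEE 802.3 encoder. It assumes that all history values before the first octet are zero; the function name is hypothetical.

```python
def trellis_sequence(octets):
    """Return the ninth (Trellis) bit computed for each octet in order."""
    data7 = [0]          # padded so data7[-1] is Data7 at n-1
    data6 = [0, 0]       # padded so data6[-2] is Data6 at n-2
    trellis = [0, 0, 0]  # padded so trellis[-3] is Trellis at n-3
    result = []
    for octet in octets:
        bit = data7[-1] ^ data6[-2] ^ trellis[-3]
        result.append(bit)
        data7.append((octet >> 7) & 1)
        data6.append((octet >> 6) & 1)
        trellis.append(bit)
    return result


print(trellis_sequence([0x80, 0x40, 0x00, 0x00, 0x00]))
```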
3.5.7.8.2.5
4DPAM5 Encoder
The 4DPAM5 encoder translates 8-bit codes transmitted by the MAC into 4DPAM5 symbols. The
encoder operates at 125 MHz, which is both the frequency of the MAC interface and the baud rate used
by 1000BASE-T.
Each 8-bit code represents one of 2^8 or 256 data patterns. Each 4DPAM5 symbol consists of one of
five signal levels (-2,-1,0,1,2) on each of the four twisted pairs (A,B,C,D) representing 5^4 or 625
possible patterns per baud period. Of these, 113 patterns are reserved for control codes, leaving 512
patterns for data. These data patterns are divided into two groups of 256 even and 256 odd data
patterns. Thus, each 8-bit octet has two possible 4DPAM5 representations: one even and one odd
pattern.
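The pattern counts in this section follow from a few lines of arithmetic, shown here as a quick check. The constant names are hypothetical; the numbers are the ones given in the text.

```python
LEVELS = 5            # signal levels per pair: -2, -1, 0, 1, 2
PAIRS = 4             # twisted pairs A, B, C, D
CONTROL_PATTERNS = 113
BAUD_RATE_MHZ = 125
BITS_PER_SYMBOL = 8

total_patterns = LEVELS ** PAIRS                    # patterns per baud period
data_patterns = total_patterns - CONTROL_PATTERNS   # left for data
octet_values = 2 ** BITS_PER_SYMBOL                 # distinct 8-bit codes
representations = data_patterns // octet_values     # even and odd variants
throughput_mbps = BAUD_RATE_MHZ * BITS_PER_SYMBOL   # data rate in Mb/s

print(total_patterns, data_patterns, representations, throughput_mbps)
```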
3.5.7.8.2.6
Spectral Shaper
This function causes the 4DPAM5 waveform to have a spectral signature that is very close to that of the
MLT3 waveform used by 100BASE-TX. This enables 1000BASE-T to take advantage of infrastructure
(cables, magnetics) designed for 100BASE-TX.
The shaper works by transmitting 75% of a 4DPAM5 code in the current baud period, and adding the
remaining 25% into the next baud period.
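The 75%/25% split corresponds to a two-tap filter, y[n] = 0.75 x[n] + 0.25 x[n-1]. The sketch below applies it to a sequence of symbol levels on one pair; the function name is hypothetical and the level before the first symbol is assumed to be zero.

```python
def spectral_shape(symbols):
    """Apply y[n] = 0.75 * x[n] + 0.25 * x[n-1] to a sequence of levels."""
    shaped = []
    previous = 0
    for symbol in symbols:
        shaped.append(0.75 * symbol + 0.25 * previous)
        previous = symbol
    return shaped


print(spectral_shape([2, -2, 0, 1]))
```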
3.5.7.8.2.7
Low-Pass Filter
To aid with EMI, this filter attenuates signal components above 180 MHz. In 1000BASE-T, the
fundamental symbol rate is 125 MHz.
3.5.7.8.2.8
Line Driver
The line driver drives the 4DPAM5 waveforms onto the four twisted-pair channels (A, B, C, D), adding
them onto the waveforms that are simultaneously being received from the link partner.
Figure 3-9. 1000BASE-T Transmit Flow And Line Coding Scheme
3.5.7.8.3
Receive Functions
Figure 3-10. Transmit/Receive Flow
This section describes function blocks that are used when the PHY receives data from the twisted pair
interface and passes it back to the MAC (see Figure 3-10).
3.5.7.8.3.1
Hybrid
The hybrid subtracts the transmitted signal from the input signal, enabling the use of simple 100BASE-TX compatible magnetics.
3.5.7.8.3.2
Automatic Gain Control (AGC)
AGC normalizes the amplitude of the received signal, adjusting for the attenuation produced by the
cable.
3.5.7.8.3.3
Timing Recovery
This function re-generates a receive clock from the incoming data stream which is used to sample the
data. On the slave side of the link, this clock is also used to drive the transmitter.
3.5.7.8.3.4
Analog-to-Digital Converter (ADC)
The ADC function converts the incoming data stream from an analog waveform to digitized samples for
processing by the DSP core.
3.5.7.8.3.5
Digital Signal Processor (DSP)
DSP provides per-channel adaptive filtering, which eliminates various signal impairments including:
• Inter-symbol interference (equalization)
• Echo caused by impedance mismatch of the cable
• Near-end crosstalk (NEXT) between adjacent channels (A, B, C, D)
• Far-end crosstalk (FEXT)
• Propagation delay variations between channels of up to 120 ns
• Extraneous tones that have been coupled into the receive path
The adaptive filter coefficients are initially set during the training phase. They are continuously adjusted
(adaptive equalization) during operation through the decision-feedback loop.
3.5.7.8.3.6
Descrambler
The descrambler identifies each channel by its characteristic signature, removing the signature and rerouting the channel internally. In this way, the receiver can correct for channel swaps and polarity
reversals. The descrambler uses the same base 33-bit LFSR used by the transmitter on the other side
of the link.
The descrambler automatically loads the seed value from the incoming stream of scrambled idle
symbols. The descrambler requires approximately 15 µs to lock, normally accomplished during the
training phase.
3.5.7.8.3.7
Viterbi Decoder/Decision Feedback Equalizer (DFE)
The Viterbi decoder generates clean 4DPAM5 symbols from the output of the DSP. The decoder includes
a Trellis encoder identical to the one used by the transmitter. The Viterbi decoder simultaneously looks
at the received data over several baud periods. For each baud period, it predicts whether the symbol
received should be even or odd, and compares that to the actual symbol received. The 4DPAM5 code is
organized in such a way that a single level error on any channel changes an even code to an odd one
and vice versa. In this way, the Viterbi decoder can detect single-level coding errors, effectively
improving the signal-to-noise ratio (SNR) by 6 dB. When an error occurs, this information is
quickly fed back into the equalizer to prevent future errors.
3.5.7.8.3.8
4DPAM5 Decoder
The 4DPAM5 decoder generates 8-bit data from the output of the Viterbi decoder.
3.5.7.8.3.9
100 Mb/s Operation
The MAC passes data to the PHY over the MII. The PHY encodes and scrambles the data, then transmits
it using MLT-3 for 100TX over copper. The PHY descrambles and decodes MLT-3 data received from the
network. When the MAC is not actively transmitting data, the PHY sends out idle symbols on the line.
3.5.7.8.3.10
10 Mb/s Operation
The PHY operates as a standard 10 Mb/s transceiver. Data transmitted by the MAC as 4-bit nibbles is
serialized, Manchester-encoded, and transmitted on the MDI[0]+/- outputs. Received data is decoded,
de-serialized into 4-bit nibbles and passed to the MAC across the internal MII. The PHY supports all the
standard 10 Mb/s functions.
3.5.7.8.3.11
Link Test
In 10 Mb/s mode, the PHY always transmits link pulses. If the link test function is enabled, it monitors the
connection for link pulses. Once it detects two to seven link pulses, data transmission is enabled and
remains enabled as long as link pulses or data reception continue. If the link pulses stop, data
transmission is disabled.
If the link test function is disabled, the PHY might transmit packets regardless of detected link pulses.
Setting the Port Configuration register bit (PHYREG.16.14) can disable the link test function.
3.5.7.8.3.12
10Base-T Link Failure Criteria and Override
Link failure occurs if link test is enabled and link pulses stop being received. If this condition occurs, the
PHY returns to the auto-negotiation phase, if auto-negotiation is enabled. Setting the Port Configuration
register bit (PHYREG.16.14) disables the link integrity test function; the PHY then transmits packets
regardless of link status.
3.5.7.8.3.13
Jabber
If the MAC begins a transmission that exceeds the jabber timer, the PHY disables the transmit and
loopback functions and asserts collision indication to the MAC. The PHY automatically exits jabber mode
after 250-750 ms. This function can be disabled by setting bit PHYREG.16.10 = 1b.
3.5.7.8.3.14
Polarity Correction
The PHY automatically detects and corrects for the condition where the receive signal (MDI_PLUS[0]/
MDI_MINUS[0]) is inverted. Reversed polarity is detected if eight inverted link pulses or four inverted
end-of-frame markers are received consecutively. If link pulses or data are not received for 96-130 ms,
the polarity state is reset to a non-inverted state.
Automatic polarity correction can be disabled by setting bit PHYREG.27.5.
3.5.7.8.3.15
Dribble Bits
The PHY handles dribble bits for all of its modes. If between one and four dribble bits are received, the
nibble is passed across the interface. The data passed across is padded with 1s if necessary. If
between five and seven dribble bits are received, the second nibble is not sent onto the internal MII bus
to the MAC. This ensures that one to seven dribble bits do not cause the MAC to discard the frame due
to a CRC error.
3.5.7.8.3.16
PHY Address
The PHY address for MDIO accesses is 00001b.
3.5.8
Media Auto Sense
The 82576 provides considerable flexibility in pairing a LAN device with a particular type of
media (such as copper or fiber-optic) as well as the specific transceiver/interface used to communicate
with the media. Each MAC, representing a distinct LAN device, can be coupled with an internal copper
PHY (the default) or a SerDes/SGMII interface independently. The link configuration for each
LAN device is specified in the LINK_MODE field of the Extended Device Control (CTRL_EXT)
register and is initialized from the EEPROM Initialization Control Word 3 associated with each LAN device.
In some applications, software might need to be aware of the presence of a link on the media that is not
currently active. To supply such an indication, any of the 82576 ports can set the
AUTOSENSE_EN bit in the CONNSW register (address 0x00034) to enable sensing of activity on the
non-active media.
Note:
When in SerDes/SGMII detect mode, software should define which indication is used to
detect the energy change on the SerDes/SGMII media. It can be either the external signal
detect pin or the internal signal detect. This is done using the CONNSW.ENRGSRC bit. The
signal detect pin is normally used when connecting in SerDes mode to optical media where
the receive LED provides such an indication.
Software can then enable the OMED interrupt in ICR in order to get an indication on any detection of
energy in the non-active media.
Note:
The auto-sense capability can be used in either port independent of the usage of the other
port.
The following sections describe the procedures for enabling auto-sense mode.
3.5.8.1
Auto Sense Setup
3.5.8.1.1
SerDes/SGMII Detect Mode (PHY is active)
1. Set CONNSW.ENRGSRC to determine the sources for the signal detect indication (1b = external
SIG_DET, 0b = internal SerDes electrical idle). The default of this bit is set by EEPROM.
2. Set CONNSW.AUTOSENSE_EN.
3. When link is detected on the SerDes/SGMII media, the 82576 sets the interrupt bit OMED in ICR
and, if enabled, issues an interrupt. The CONNSW.AUTOSENSE_EN bit is cleared.
3.5.8.1.2
PHY Detect Mode (SerDes/SGMII is active)
1. Set CONNSW.AUTOSENSE_CONF = 1b.
2. Reset the PHY as described in Section 4.2.
3. Place the PHY into link-disconnect mode by setting PHY_REG 25.5 using the MDIC register.
4. Set CONNSW.AUTOSENSE_EN = 1b and then clear CONNSW.AUTOSENSE_CONF.
5. When signal is detected on the PHY media, the 82576 sets the interrupt bit OMED in ICR and if
enabled, issues an interrupt.
6. The 82576 puts the PHY in power down mode.
According to the result of the interrupt, software can then decide to switch to the other media.
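The two setup flows above can be sketched as register operations. This is a minimal illustration, not driver code: the `Regs` class is a hypothetical stand-in for memory-mapped CSR access, the CONNSW bit positions are assumptions that must be checked against the register description, and the PHY reset and link-disconnect steps are passed in as caller-supplied functions.

```python
CONNSW = 0x00034               # register address given in the text
AUTOSENSE_EN = 1 << 0          # assumed bit position
AUTOSENSE_CONF = 1 << 1        # assumed bit position
ENRGSRC = 1 << 2               # assumed bit position


class Regs:
    """Hypothetical stand-in for memory-mapped CSR access."""

    def __init__(self):
        self.mem = {}

    def read(self, addr):
        return self.mem.get(addr, 0)

    def write(self, addr, value):
        self.mem[addr] = value & 0xFFFFFFFF

    def set_bits(self, addr, mask):
        self.write(addr, self.read(addr) | mask)

    def clear_bits(self, addr, mask):
        self.write(addr, self.read(addr) & ~mask)


def enable_serdes_detect(regs, use_external_sig_det):
    """PHY is active; watch the SerDes/SGMII media for energy."""
    if use_external_sig_det:
        regs.set_bits(CONNSW, ENRGSRC)      # 1b = external SIG_DET
    else:
        regs.clear_bits(CONNSW, ENRGSRC)    # 0b = internal electrical idle
    regs.set_bits(CONNSW, AUTOSENSE_EN)


def enable_phy_detect(regs, reset_phy, set_phy_link_disconnect):
    """SerDes/SGMII is active; watch the copper PHY for a signal."""
    regs.set_bits(CONNSW, AUTOSENSE_CONF)
    reset_phy()                      # Section 4.2 procedure (caller-supplied)
    set_phy_link_disconnect()        # PHY register 25, bit 5, written via MDIC
    regs.set_bits(CONNSW, AUTOSENSE_EN)
    regs.clear_bits(CONNSW, AUTOSENSE_CONF)
```

In both flows software would then wait for ICR.OMED before deciding whether to switch media.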
3.5.8.2
Switching Between Media
The 82576's link mode is controlled by the Extended Device Control register; CTRL_EXT (0x00018) bits
23:22. The default value for the LINK_MODE setting is directly mapped from the EEPROM's initialization
Control Word 3 (bits 1:0). Software can modify the LINK_MODE indication by writing the corresponding
value into this register.
Note:
Before dynamically switching between media, the software should ensure that the current
mode of operation is not in the process of transmitting or receiving data. This is achieved by
disabling the transmitter and receiver, waiting until the 82576 is in an idle state, and then
beginning the process for changing the link mode.
The mode switch in this method is only valid until the next hardware reset of the 82576.
After a hardware reset, the link mode is restored to the default setting by the EEPROM. To
get a permanent change of the link mode, the default in the EEPROM should be changed.
The following procedures need to be followed to actually switch between the two modes.
3.5.8.2.1
Transition to SerDes/SGMII mode
1. Disable the receiver by clearing RCTL.RXEN.
2. Disable the transmitter by clearing TCTL.EN.
3. Verify the 82576 has stopped processing outstanding cycles and is idle.
4. Modify LINK mode to SerDes or SGMII by setting CTRL_EXT.LINK_MODE to 11b or 10b,
respectively.
5. Set up the link as described in Section 4.5.7.3 or Section 4.5.7.4.
6. Set up Tx and Rx queues and enable Tx and Rx processes.
3.5.8.2.2
Transition to Internal PHY Mode
1. Disable the receiver by clearing RCTL.RXEN.
2. Disable the transmitter by clearing TCTL.EN.
3. Verify the 82576 has stopped processing outstanding cycles and is idle.
4. Modify LINK mode to PHY mode by setting CTRL_EXT.LINK_MODE to 00b.
5. Set link-up indication by setting CTRL.SLU.
6. Reset the PHY as described in Section 4.2.
7. Set up the link as described in Section 4.5.7.4.
8. Set up the Tx and Rx queues and enable the Tx and Rx processes.
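Steps 1 through 4 of both transitions amount to quiescing the port and rewriting CTRL_EXT bits 23:22. The sketch below illustrates that under stated assumptions: registers are modeled as a plain dictionary, the RCTL/TCTL offsets and enable-bit positions are assumed rather than taken from this section, and the idle check is a caller-supplied function.

```python
CTRL_EXT = 0x00018             # address given in the text
LINK_MODE_SHIFT = 22           # LINK_MODE occupies bits 23:22
LINK_MODE_MASK = 0b11 << LINK_MODE_SHIFT
LINK_MODE_PHY, LINK_MODE_SGMII, LINK_MODE_SERDES = 0b00, 0b10, 0b11

RCTL, RCTL_RXEN = 0x00100, 1 << 1   # assumed offset and bit position
TCTL, TCTL_EN = 0x00400, 1 << 1     # assumed offset and bit position


def switch_link_mode(regs, mode, wait_idle):
    """regs: dict mapping register address to its 32-bit value."""
    regs[RCTL] = regs.get(RCTL, 0) & ~RCTL_RXEN     # 1. disable the receiver
    regs[TCTL] = regs.get(TCTL, 0) & ~TCTL_EN       # 2. disable the transmitter
    wait_idle()                                     # 3. caller-supplied idle check
    ctrl_ext = regs.get(CTRL_EXT, 0) & ~LINK_MODE_MASK
    regs[CTRL_EXT] = ctrl_ext | (mode << LINK_MODE_SHIFT)   # 4. new LINK_MODE
    # Link setup, PHY reset (PHY mode only) and queue re-enable follow.
```

A read-modify-write is used so that the other CTRL_EXT fields keep their values.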
§§
4.0
Initialization
4.1
Power Up
4.1.1
Power-Up Sequence
Figure 4-1 shows the 82576 power-up sequence, from power ramp-up until the device is ready to
accept host commands.
Figure 4-1.
82576 Power-Up - General Flow
Note:
The keep_PHY_link_up bit (Veto bit) can be set by firmware when the MC is running IDER
or SoL. Its purpose is to prevent interruption of these processes when power is being turned
on.
4.1.2
Power-Up Timing Diagram
Figure 4-2.
Power-Up Timing Diagram
Table 4-1.
Notes to Power-Up Timing Diagram
Note  Description
1     Xosc is stable txog after the power is stable.
2     Internal Reset is released after all power supplies are good and tppg after Xosc is stable.
3     An NVM read starts on the rising edge of the internal Reset.
4     After reading the NVM, PHY might exit power down mode.
5     APM Wakeup and/or manageability might be enabled based on NVM contents.
6     The PCIe reference clock is valid tPE_RST-CLK before the de-assertion of PE_RST# (according to PCIe spec).
7     PE_RST# is de-asserted tPVPGL after power is stable (according to PCIe spec).
8     De-assertion of PE_RST# causes the NVM to be re-read, asserts PHY power-down (except if veto bit, also known as
      keep_PHY_link_up bit, is set), and disables Wake Up.
9     After reading the NVM, PHY exits power-down mode.
10    Link training starts after tpgtrn from PE_RST# de-assertion.
11    A first PCIe configuration access might arrive after tpgcfg from PE_RST# de-assertion.
12    A first PCI configuration response can be sent after tpgres from PE_RST# de-assertion.
13    Writing a 1 to the Memory Access Enable bit in the PCI Command Register transitions the device from D0u to D0
      state.
4.1.2.1
Timing Requirements
The 82576 requires the following start-up and power-state transition timings.
Table 4-2.
Power-Up Timing Requirements
Parameter    Description                                               Min.     Max.      Notes
txog         Base 25 clock stable from power stable                             10 msec
tPWRGD-CLK   PCIe clock valid to PCIe power good                       100 µs   -         According to PCIe spec
tPVPGL       Power rails stable to PCIe Reset inactive                 100 ms   -         According to PCIe spec
tpgcfg       External PCIe Reset signal to first configuration cycle   100 ms             According to PCIe spec
4.1.2.2
Timing Guarantees
The 82576 guarantees the following start-up and power-state transition timing parameters.
Table 4-3.
Power-Up Timing Guarantees
Parameter   Description                                       Min.   Max.       Notes
txog        Xosc stable from power stable                            10 msec
tppg        Internal power good delay from valid power rail          35 msec    Use internal counter for external devices stabilization
tee         EEPROM read duration                                     20 msec    Actual time depends on the EEPROM content
topll       PCIe Reset to start of link training                     10 msec
tpcipll     PCIe Reset to first configuration cycle                  5 msec
tpgtrn      PCIe Reset to start of link training                     20 msec    According to PCIe spec
tpgres      PCIe Reset to first configuration cycle                  100 msec   According to PCIe spec
4.2
Reset Operation
4.2.1
Reset Sources
The 82576 reset sources are described below:
4.2.1.1
Internal_Power_On_Reset
The 82576 has an internal mechanism for sensing the power pins. Once the power is up and stable, the
82576 creates an internal reset that acts as a master reset of the entire chip. It is level sensitive,
and while it is zero it holds all of the registers in reset. Internal_Power_On_Reset is interpreted to be an
indication that device power supplies are all stable. Internal_Power_On_Reset changes state during
system power-up.
4.2.1.2
PE_RST_N
The assertion of PE_RST_N indicates that both the power and the PCIe clock sources are stable. This
pin asserts an internal reset also after a D3cold exit. Most units are reset on the rising edge of
PE_RST_N. The only exception is the GIO unit, which is kept in reset while PE_RST_N is de-asserted
(level).
4.2.1.3
In-Band PCIe Reset
The 82576 generates an internal reset in response to a Physical layer message from the PCIe or when
the PCIe link goes down (entry to Polling or Detect state). This reset is equivalent to PCI reset in
previous (PCI) gigabit LAN controllers.
4.2.1.4
D3hot to D0 Transition
This is also known as ACPI Reset. The 82576 generates an internal reset on the transition from D3hot
power state to D0 (caused after configuration writes from D3 to D0 power state). Note that this reset is
per function and resets only the function that transitions from D3hot to D0.
4.2.1.5
Function Level Reset (FLR)
The FLR bit is required for the PF and per VF (Virtual Function). Setting of this bit for a VF resets only
the part of the logic dedicated to the specific VF and does not influence the shared part of the port.
Setting the PF FLR bit resets the entire function.
4.2.1.5.1
PF (Physical Function) FLR or FLR in non-IOV Mode
An FLR reset to a function is equivalent to a D0 → D3 → D0 transition with the exception that this reset
does not require driver intervention in order to stop the master transactions of this function. In an IOV
enabled system, this reset resets all the VFs attached to the PF. The EEPROM is partially reloaded after
an FLR reset.
The words read from EEPROM at FLR are the same as those read at a full software reset.
4.2.1.5.2
VF (Virtual Function) FLR (Function Level Reset)
An FLR reset to a VF function resets all the queues, interrupts, and statistics registers attached to this
VF. It also resets the PCIe R/W configuration bits allocated to this function. It also disables Tx & Rx flow
for the queues allocated to this VF. All pending read requests are dropped and PCIe read completions to
this function might be completed as unsupported requests.
4.2.1.5.3
IOV (IO Virtualization) Disable
Clearing of the IOV enable bit in the IOV structure is equivalent to a VFLR to all the active VFs in the PF.
4.2.1.6
Software Reset
4.2.1.6.1
Full Software Reset
Device Reset, RST, can be used to globally reset the entire component. This reset is provided primarily
as a last-ditch software mechanism to recover from an indeterminate or suspected hung hardware
state. Most registers (receive, transmit, interrupt, statistics, etc.), and state machines are set to their
power-on reset values, approximating the state following a power-on or PCI reset. However, PCIe
configuration registers are not reset, thereby leaving the device mapped into system memory space
and accessible by a driver. The internal Packet Buffer Allocation registers
(RXPBS, TXPBS & SWPBS) also retain their values through a global reset.
Note:
To ensure that global device reset was fully completed and that the 82576 responds to
subsequent accesses, wait approximately 1 millisecond after setting the Device Reset bit (CTRL.RST)
before attempting to check if the bit was cleared, or to access (read or write) any other device register.
Software can reset the 82576 by writing the Device Reset bit of the Device Control Register
(CTRL.RST). The 82576 re-reads part of the per-function EEPROM fields after a software reset. Bits that
are normally read from the EEPROM are reset to their default hardware values.
Fields controlled by the LED, SDP & Init3 words of the EEPROM are not reset and not re-read after a
software reset.
Note:
This reset is per function and resets only the function that received the software reset. PCI
Configuration space (configuration and mapping) of the device is unaffected. Prior to issuing
software reset the Driver needs to operate the master disable algorithm as defined in
Section 5.2.3.2.
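The ordering described above (master disable, set CTRL.RST, wait about 1 millisecond, then poll) can be sketched as follows. This is an illustration under assumptions: the CTRL offset and the position of the Device Reset bit are assumed, register access and the master disable algorithm are caller-supplied functions, and the poll limit is arbitrary.

```python
import time

CTRL = 0x00000                 # assumed offset of the Device Control register
CTRL_RST = 1 << 26             # assumed position of the Device Reset bit


def full_software_reset(read_reg, write_reg, master_disable,
                        sleep=time.sleep, max_polls=10):
    """Return True once CTRL.RST reads back as cleared."""
    master_disable()                               # Section 5.2.3.2 (caller-supplied)
    write_reg(CTRL, read_reg(CTRL) | CTRL_RST)     # request the reset
    sleep(0.001)                                   # ~1 ms before touching any register
    for _ in range(max_polls):
        if not read_reg(CTRL) & CTRL_RST:          # bit cleared: reset has completed
            return True
        sleep(0.001)
    return False
```

Returning False rather than looping forever lets the caller decide how to handle a device that never completes the reset.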
4.2.1.6.2
Physical Function (PF) Software Reset
A software reset by the PF in IOV mode has the same consequences as a software reset in a non-IOV
mode. The procedure for PF software reset is as follows:
• The PF driver disables master accesses by the device through the Master Disable mechanism (see
Section 5.2.3.2). Master Disable affects all VFs traffic.
• Execute the procedure described in Section 4.5.11.2.3 to synchronize between the PF and VFs.
VFs are expected to timeout and check on the VFMailbox.RSTD bit in order to identify a PF software
reset event. The VFMailbox.RSTD bits are cleared on read.
4.2.1.6.3
VF Software Reset
A software reset applied to a VF is equivalent to an FLR reset to this VF with the exception that the PCIe
configuration bits allocated to this function are not reset. This can be activated by setting the
VTCTRL.RST bit.
Setting VTCTRL.RST resets interrupts and queue enable bits. Other VF registers are not reset.
4.2.1.7
Force TCO
This reset is generated when manageability logic is enabled. It is only generated if the Reset on Force
TCO bit of the EEPROM's Management Control word is 1. In pass through mode it is generated when
receiving a ForceTCO SMBus command with bit 1 or bit 7 set.
4.2.1.8
Firmware Reset
This reset is activated by writing a 1 to the FWR bit in the HOST Interface Control Register (HICR) in
CSR address 0x8F00.
4.2.1.9
EEPROM Reset
Writing a 1 to the EEPROM Reset bit of the Extended Device Control Register (CTRL_EXT.EE_RST)
causes the 82576 to re-read the per-function configuration from the EEPROM, setting the appropriate
bits in the registers loaded by the EEPROM.
4.2.1.10
PHY Reset
Software can write a 1 to the PHY Reset bit of the Device Control Register (CTRL.PHY_RST) to reset the
internal PHY. The PHY is internally configured after a PHY reset.
Note:
The PHY should not be reset using PHYREG 0 bit 15, as in this case the internal
configuration process is bypassed and there is no guarantee the PHY will operate correctly.
As the PHY may be accessed by the internal firmware and the driver software, the driver software
should coordinate any PHY reset with the firmware using the following procedure:
1. Check that MANC.BLK_Phy_Rst_On_IDE (offset 0x5820 bit 18) is cleared. If it is set, the MC
requires a stable link and thus the PHY should not be reset at this stage. The driver may skip the
PHY reset if not mandatory or wait for MANC.BLK_Phy_Rst_On_IDE to clear. See Section 4.2.3 for
more details.
2. Take ownership of the relevant PHY using the following flow:
   a. Get ownership of the software/software semaphore SWSM.SMBI (offset 0x5B50 bit 0).
      • Read the SWSM register.
      • If SWSM.SMBI is read as zero, the semaphore was taken.
      • Otherwise, go back to step a.
      • This step ensures that other software will not access the shared resources register
        (SW_FW_SYNC).
   b. Get ownership of the software/firmware semaphore SWSM.SWESMBI (offset 0x5B50 bit 1):
      • Set the SWSM.SWESMBI bit.
      • Read SWSM.
      • If SWSM.SWESMBI was successfully set, the semaphore was acquired; otherwise, go back
        to step a.
      • This step ensures that the internal firmware will not access the shared resources register
        (SW_FW_SYNC).
   c. Software reads the Software-Firmware Synchronization Register (SW_FW_SYNC) and checks
      both bits in the pair of bits that control the PHY it wishes to own.
      • If both bits are cleared (neither firmware nor other software owns the PHY), software
        sets the software bit in the pair of bits that control the resource it wishes to own.
      • If one of the bits is set (firmware or other software owns the PHY), software tries again
        later.
   d. Release ownership of the software/firmware semaphore by clearing the SWSM.SWESMBI bit.
3. Drive PHY reset bit in CTRL bit 31.
4. Wait 100 µs.
5. Release PHY reset in CTRL bit 31.
6. Release ownership of the relevant PHY to the FW using the following flow:
   a. Get ownership of the software/firmware semaphore SWSM.SWESMBI (offset 0x5B50 bit 1):
      • Set the SWSM.SWESMBI bit.
      • Read SWSM.
      • If SWSM.SWESMBI was successfully set, the semaphore was acquired; otherwise, go back
        to step a.
      • Clear the bit in SW_FW_SYNC that controls the software ownership of the resource to
        indicate this resource is free.
      • Release ownership of the software/firmware semaphore by clearing the SWSM.SWESMBI
        bit.
7. Wait for the relevant CFG_DONE bit (EEMNGCTL.CFG_DONE0 - offset 0x1010 bit 18 or
   EEMNGCTL.CFG_DONE1 - offset 0x1010 bit 19).
8. Take ownership of the relevant PHY using the following flow:
   a. Get ownership of the software/firmware semaphore SWSM.SWESMBI (offset 0x5B50 bit 1):
      • Set the SWSM.SWESMBI bit.
      • Read SWSM.
      • If SWSM.SWESMBI was successfully set, the semaphore was acquired; otherwise, go back
        to step a.
      • This step ensures that the internal firmware will not access the shared resources register
        (SW_FW_SYNC).
   b. Software reads the Software-Firmware Synchronization Register (SW_FW_SYNC) and checks
      both bits in the pair of bits that control the PHY it wishes to own.
      • If both bits are cleared (neither firmware nor other software owns the PHY), software
        sets the software bit in the pair of bits that control the resource it wishes to own.
      • If one of the bits is set (firmware or other software owns the PHY), software tries again
        later.
   c. Release ownership of the software/software semaphore and the software/firmware semaphore
      by clearing SWSM.SMBI and SWSM.SWESMBI bits.
9. Configure the PHY.
10. Release ownership of the relevant PHY using the flow described in Section 4.6.2.
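The ownership flow of step 2 can be sketched as below. It is a simplified illustration, not the reference implementation: the SW_FW_SYNC offset and the PHY 0 ownership bit pair are assumptions, register access goes through caller-supplied functions, and a failed SWESMBI attempt retries step b in place instead of returning to step a, because SMBI is already held at that point. `FakeNic` is a toy register model included only so the sketch can be exercised without hardware.

```python
SWSM = 0x5B50                   # offset given in the text
SWSM_SMBI = 1 << 0              # software/software semaphore
SWSM_SWESMBI = 1 << 1           # software/firmware semaphore
SW_FW_SYNC = 0x5B5C             # assumed offset
PHY0_SW, PHY0_FW = 1 << 1, 1 << 17   # assumed ownership pair for PHY 0


def try_acquire_phy(read_reg, write_reg, sw_bit=PHY0_SW, fw_bit=PHY0_FW, polls=100):
    """Steps 2a-2d: return True if software now owns the PHY."""
    for _ in range(polls):                       # a. SMBI reads as 0 when taken
        if not read_reg(SWSM) & SWSM_SMBI:
            break
    else:
        return False
    for _ in range(polls):                       # b. set SWESMBI and read it back
        write_reg(SWSM, read_reg(SWSM) | SWSM_SWESMBI)
        if read_reg(SWSM) & SWSM_SWESMBI:
            break
    else:
        write_reg(SWSM, read_reg(SWSM) & ~SWSM_SMBI)   # give SMBI back on failure
        return False
    sync = read_reg(SW_FW_SYNC)                  # c. inspect the ownership pair
    free = not sync & (sw_bit | fw_bit)
    if free:
        write_reg(SW_FW_SYNC, sync | sw_bit)
    write_reg(SWSM, read_reg(SWSM) & ~SWSM_SWESMBI)    # d. release SWESMBI
    return free


class FakeNic:
    """Toy register model for exercising the sketch; not real hardware."""

    def __init__(self, sync=0):
        self.swsm, self.sync = 0, sync

    def read(self, addr):
        if addr != SWSM:
            return self.sync
        value = self.swsm
        self.swsm |= SWSM_SMBI               # reading SMBI as 0 takes it
        return value

    def write(self, addr, value):
        if addr == SWSM:
            self.swsm = value
        else:
            self.sync = value
```

For example, `try_acquire_phy(nic.read, nic.write)` on a fresh `FakeNic()` returns True and leaves only the software ownership bit set in the modeled SW_FW_SYNC; when it returns False, the caller retries later, as step c describes.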
4.2.2
Reset Effects
The resets affect the following registers and logic:
Table 4-4.
82576 Reset Effects - Common Resets
Internal_Power_
On_Reset
PE_
RST_N
In-Band PCIe Reset
LTSSM (PCIe back to detect/
polling)
X
X
X
PCIe Link data path
X
X
X
Reset Activation
FW
Reset
Notes
Read EEPROM (Per Function)
Read EEPROM (Complete
Load)
X
X
X
PCI Configuration Registers - non sticky
X
X
X
3.
PCI Configuration Registers sticky
X
X
X
4.
PCIe local registers
X
X
X
5.
Data path
X
X
X
On-die memories
X
X
X
MAC, PCS, Auto Negotiation,
MACSec, IPsec
X
X
X
Virtual function queue enable
X
X
X
Virtual function interrupt &
statistics registers
X
X
X
Wake Up (PM) Context
X
1
Wake Up Control Register
X
9.
Wake Up Status Registers
X
11.
Rule Checker Tables
X
Manageability Control
Registers
X
MMS Unit
X
Wake-Up Management
Registers
X
X
X
3.,13.
Memory Configuration
Registers
X
X
X
3.
EEPROM and flash request
X
PHY/SERDES PHY
X
X
X
Strapping Pins
X
X
X
Reset Activation
Circuit Breaker
Table 4-5.
FW
Reset
Notes
4.
2.
7.
12.
X
5.
2.
X
82576 Reset Effects - Per Function Resets
Reset Activation
D3hot → D0
FLR
Full SW
Reset
Force
TCO
EE
Reset
X
X
X
X
X
Read EEPROM (Per
Function)
PCI Configuration
Registers RO
PCI Configuration
Registers , MSI-X
PCI Configuration
Registers RW shared
PHY
Reset
Notes
3.
X
X
6.
8.
Reset Activation
PCI Configuration
Registers RW
D3hot → D0
FLR
X
X
Full SW
Reset
Force
TCO
EE
Reset
PHY
Reset
Notes
9.
PCIe local registers
5.
Data path
X
X
X
X
On-die memories
X
X
X
X
MAC, PCS, Auto
Negotiation, MACSec
IPsec
X
X
X
X
4.
Wake Up (PM)
Context
7.
Wake Up Control
Register
9.
Wake Up Status
Registers
11.
Rule Checker Tables
Manageability
Control Registers
12.
Virtual function
queue enable
X
X
X
X
2.
Virtual function
interrupt & statistics
registers
X
X
X
Wake-Up
Management
Registers
X
X
X
X
3.,13.
Memory
Configuration
Registers
X
X
X
X
3.
EEPROM and flash
request
X
X
PHY/SERDES PHY
X
X
2.
5.
X
X
2.
Strapping Pins
Table 4-6.
82576 Reset Effects - Virtual Function Resets
Reset Activation                       VFLR (6.)   Software Reset   Notes
Interrupt registers                    X           X                2.
Queue disable                          X           X
VF specific PCIe configuration space   X                            1.
Data path
Notes:
1. If AUX_POWER = 0b the Wakeup Context is reset (PME_Status and PME_En bits should be 0b at
reset if the 82576 does not support PME from D3cold).
2. The MMS unit must configure the PHY after any PHY reset.
3. The following register fields do not follow the general rules above:
   a. CTRL.SDP0_IODIR, CTRL.SDP1_IODIR, CTRL_EXT.SDP2_IODIR, CTRL_EXT.SDP3_IODIR,
      CONNSW.ENRGSRC field, CTRL_EXT.SFP_Enable, CTRL_EXT.LINK_MODE, CTRL_EXT.EXT_VLAN
      and LED configuration registers are reset on Internal_Power_On_Reset only. Any EEPROM read
      resets these fields to the values in the EEPROM.
   b. The Aux Power Detected bit in the PCIe Device Status register is reset on
      Internal_Power_On_Reset and GIO Power Good only.
   c. The bits mentioned in the next note.
4. The following registers are part of this group:
   a. VPD registers
   b. Max payload size field in PCIe Capability Control register (offset 0xA8).
   c. Active State Link PM Control field, Common Clock Configuration field and Extended Synch field
      in PCIe Capability Link Control register (Offset 0xB0).
   d. ARI enable bit in IOV capability Command register (offset 0x168).
   e. Read Completion Boundary in the PCIe Link Control register (Offset 0xB0).
5. The following registers are part of this group:
   a. SWSM
   b. GCR (only part of the bits - see register description for details)
   c. FUNCTAG
   d. GSCL_1/2/3/4
   e. GSCN_0/1/2/3
   f. SW_FW_SYNC - only part of the bits - see register description for details.
6. The following registers are part of this group:
   a. MSIX control register, MSIX PBA and MSIX per vector mask.
7. The Wake Up Context is defined in the PCI Bus Power Management Interface Specification (Sticky
   bits). It includes:
   a. PME_En bit of the Power Management Control/Status Register (PMCSR).
   b. PME_Status bit of the Power Management Control/Status Register (PMCSR).
   c. Aux_En in the PCIe registers
   d. The device Requester ID (since it is required for the PM_PME TLP).
   The shadow copies of these bits in the Wakeup Control Register are treated identically.
8. The following fields are part of the PCI Configuration Registers RW shared group:
   a. Captured Slot Power Limit Value in the Device Capabilities register
   b. Captured Slot Power Limit Scale in the Device Capabilities register
   c. Max_Payload_Size in the Device Control register
   d. Active State Power Management (ASPM) Control in the Link Control register
   e. Read Completion Boundary (RCB) in the Link Control register
   f. Common Clock Configuration in the Link Control register
   g. Extended Synch in the Link Control register
   h. Enable Clock Power Management in the Link Control register
   i. Hardware Autonomous Width Disable bit in Link Control register
   j. Hardware Autonomous Speed Disable bit in the Link Control 2 register
9. Refers to all the PCI Configuration Registers RW registers not included in notes 8. and 6.
10. Refers to bits in the Wake Up Control Register that are not part of the Wake-Up Context (the
    PME_En and PME_Status bits).
11. The Wake Up Status Registers include the following:
   a. Wake Up Status Register
   b. Wake Up Packet Length.
   c. Wake Up Packet Memory.
12. The manageability control registers refer to the following registers:
   a. MANC 0x5820
   b. MFUTP01-7 0x5030 - 0x504C
   c. MFVAL 0x05824
   d. MANC2H 0x5860
   e. MAVTV1-7 0x5010 - 0x502C
   f. MDEF0-7 0x5890 - 0x58AC
   g. MDEF_EXT 0x5930 - 0x594C
   h. METF 0x5060 - 0x506C
   i. MIPAF0-15 0x58B0 - 0x58EC
   j. MMAH/MMAL0-3 0x5910 - 0x592C
   k. FWSM
13. The Wake-up Management Registers include the following:
   a. Wake Up Filter Control
   b. IP Address Valid
   c. IPv4 Address Table
   d. IPv6 Address Table
   e. Flexible Filter Length Table
   f. Flexible Filter Mask Table
14. The Other Configuration Registers include:
   a. General Registers
   b. Interrupt Registers
   c. Receive Registers
   d. Transmit Registers
   e. Statistics Registers
   f. Diagnostic Registers
Of these registers, MTA[n], VFTA[n], WUPM[n], FFMT[n], FFVT[n], TDBAH/TDBAL, and RDBAH/RDBAL
registers have no default value. If the functions associated with the registers are enabled they must be
programmed by software. Once programmed, their value is preserved through all resets as long as
power is applied to the 82576.
Note:
In situations where the device is reset using the software reset CTRL.RST, the TX data lines
are forced to all zeros. This causes a substantial number of symbol errors to be detected by
the link partner. In TBI mode, if the duration is long enough, the link partner might restart
the Auto-Negotiation process by sending “break-link” (/C/ codes with the configuration
register value set to all zeros).
1. These registers include:
   a. MSI/MSI-X enable bits
   b. BME
   c. Error indications
2. These registers include:
   a. VTEICS
   b. VTEIMS
   c. VTEIAC
   d. VTEIAM
   e. VTEITR 0-2
   f. VTIVAR0
   g. VTIVAR_MISC
   h. PBACL
   i. VFMailbox
3. These registers include:
   a. RXDCTL.Enable
   b. Adequate bit in VFTE & VFRE.
4. The contents of the following memories are cleared to support the requirements of PCIe FLR:
   a. The Tx packet buffers
   b. The Rx packet buffers
   c. IPsec Tx SA tables
   d. IPsec Rx SA tables
5. Includes EEC.REQ, EEC.GNT, FLA.REQ and FLA.GNT fields.
6. A VFLR does not reset the configuration of the VF; it only disables the interrupts and the queues.
4.2.3
PHY Behavior During a Manageability Session
During some manageability sessions (e.g. an IDER or SoL session as initiated by an external MC), the
platform is reset so that it boots from a remote media. This reset must not cause the Ethernet link to
drop, since the manageability session would then be lost. Also, the Ethernet link should be kept on continuously
during the session for the same reasons. The 82576 therefore limits the cases in which the internal PHY
would restart the link, by masking two types of events from the internal PHY:
• PE_RST# and PCIe resets (in-band and link drop) do not reset the PHY during such a manageability
session
• The PHY does not change link speed as a result of a change in power management state, to avoid
link loss. For example, the transition to D3hot state is not propagated to the PHY.
— Note however that if main power is removed, the PHY is allowed to react to the change in power
state (i.e., the PHY might respond in link speed change). The motivation for this exception is to
reduce power when operating on auxiliary power by reducing link speed.
The capability described in this section is disabled by default on LAN Power Good reset. The
Keep_PHY_Link_Up_En bit in the EEPROM must be set to '1' to enable it. Once enabled, the feature is
enabled until the next LAN Power Good (i.e., the 82576 does not revert to the hardware default value
on PE_RST#, PCIe reset or any other reset but LAN Power Good).
When the keep_PHY_link_up bit (also known as “veto bit”) in the MANC Register is set, the following
behaviors apply:
• The PHY is not reset on PE_RST# and PCIe resets (in-band and link drop). Other reset events are
not affected - LAN Power Good reset, Device Disable, Force TCO, and PHY reset by software.
• The PHY does not change its power state. As a result link speed does not change.
• The 82576 does not initiate configuration of the PHY to avoid losing link.
The keep_PHY_link_up bit is set by the MC through the Management Control command (See
Section 10.5 for SMBus commands and Section 10.6 for NC-SI commands) on the sideband interface. It
is cleared by the external MC (again, through a command on the sideband interface) when the
manageability session ends. Once the keep_PHY_link_up bit is cleared, the PHY updates its Dx state
and acts accordingly (e.g. negotiates its speed).
The keep_PHY_link_up bit is also cleared on de-assertion of the MAIN_PWR_OK input pin.
MAIN_PWR_OK must be de-asserted at least 1 msec before power drops below its 90% value. This
allows enough time to respond before auxiliary power takes over.
The keep_PHY_link_up bit is a R/W bit and can be accessed by host software, but software is not
expected to clear the bit. The bit is cleared in the following cases:
• On LAN Power Good
• When the MC resets or initializes it
• On de-assertion of the MAIN_PWR_OK input pin. The MC should set the bit again if it wishes to
maintain speed on exit from Dr state.
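The effect of the veto bit on PHY reset events, as listed above, can be summarized in a small decision function. This is an illustrative sketch only; the event names are hypothetical labels chosen for this example and do not correspond to register or signal names.

```python
# Reset events masked from the PHY while keep_PHY_link_up is set.
MASKED_BY_VETO = {"PE_RST", "PCIE_INBAND_RESET", "PCIE_LINK_DROP"}
# Reset events that reach the PHY regardless of the veto bit.
NEVER_MASKED = {"LAN_POWER_GOOD", "DEVICE_DISABLE", "FORCE_TCO", "SW_PHY_RESET"}


def phy_is_reset(event, keep_phy_link_up):
    """Return True if `event` resets the PHY given the veto bit state."""
    if event in NEVER_MASKED:
        return True
    if event in MASKED_BY_VETO:
        return not keep_phy_link_up
    raise ValueError(f"unknown reset event: {event}")
```

With the veto bit set, `phy_is_reset("PE_RST", True)` is False while `phy_is_reset("FORCE_TCO", True)` remains True.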
4.3
Function Disable
4.3.1
General
For a LOM (LAN on Motherboard) design, it might be desirable for the system to provide BIOS-setup
capability for selectively enabling or disabling LAN functions. This gives the end user more control over
system resource management and avoids conflicts with add-in NIC solutions. The 82576 provides
support for selectively enabling or disabling one or both LAN device(s) in the system.
4.3.2
Overview
Device presence (or non-presence) must be established early during BIOS execution, in order to ensure
that BIOS resource-allocation (of interrupts, of memory or IO regions) is done according to devices that
are present only. This is frequently accomplished using a BIOS CVDR (Configuration Values Driven on
Reset) mechanism. The 82576 LAN-disable mechanism is implemented in order to be compatible with
such a solution.
The 82576 provides two mechanisms to disable LAN ports:
• Two pins (LANx_DIS_N, one per LAN port) are sampled on reset to determine the LAN-enable
configuration
• Port 1 might be disabled using EEPROM configuration.
Disabling a LAN port affects the PCI function it resides on. When function 0 is disabled (either LAN0 or
LAN1), two different behaviors are possible:
• Dummy Function mode — In some systems, it is required to keep all the functions at their respective
location, even when other functions are disabled. In Dummy Function mode, if function #0 (either
LAN0 or LAN1) is disabled, then it does not disappear from the PCIe configuration space. Rather,
the function presents itself as a dummy function. The device ID and class code of this function
changes to other values (dummy function Device ID 0x10A6, Class Code 0xFF0000). In addition,
the function does not require any memory or I/O space, and does not require an interrupt line.
• Legacy mode — When function 0 is disabled (either LAN0 or LAN1), then the port residing on
function 1 moves to reside on function 0. Function 1 disappears from the PCI configuration space.
Note:
In some systems, the dummy function is not recognized by the enumeration process as a
valid PCI function. In these systems, both ports will not be enumerated and it is
recommended to work in legacy mode.
The disabled LAN port is still available for manageability purposes if it was disabled using the
LAN_PCI_DIS bit of the SDP control word in the EEPROM or if it was disabled through the pin
mechanism and the PHY_in_LAN_Disable bit in the SDP control word in the EEPROM is cleared. In this
case, and if the LPLU bit is set, the PHY will attempt to create a link at 10 Mb/s.
Note:
Dummy Function mode should not be used if SR-IOV capability is exposed (since PF0 is
required to support certain functionality). SR-IOV is enabled by the IOV enable bit in
EEPROM word 0x25 (Section 6.2.24).
Mapping between function and LAN ports is summarized in the following tables.
Table 4-7. PCI Functions Mapping (Legacy Mode)
PCI Function #                    LAN Function Select   Function 0   Function 1
Both LAN functions are enabled    0                     LAN 0        LAN 1
                                  1                     LAN 1        LAN 0
LAN 0 is disabled                 x                     LAN 1        Disable
LAN 1 is disabled                 x                     LAN 0        Disable
Both LAN functions are disabled   Both PCI functions are disabled. Device is in low power mode.

Table 4-8. PCI Functions Mapping (Dummy Function Mode)
PCI Function #                    LAN Function Select   Function 0   Function 1
Both LAN functions are enabled    0                     LAN 0        LAN 1
                                  1                     LAN 1        LAN 0
LAN 0 is disabled                 0                     Dummy        LAN 1
                                  1                     LAN 1        Disable
LAN 1 is disabled                 0                     LAN 0        Disable
                                  1                     Dummy        LAN 0
Both LAN functions are disabled   Both PCI functions are disabled. Device is in low power mode.
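The two mapping tables can be condensed into a small lookup, sketched here in Python. This is an illustration only: the function name, the argument names, and the use of strings and None for the result are assumptions made for this sketch. None stands for a PCI function that is not present (the "Disable" entries in the tables).

```python
def pci_function_map(lan0_enabled, lan1_enabled, lan_function_select, dummy_mode):
    """Return (function 0, function 1) following Tables 4-7 and 4-8.

    None means the PCI function is not exposed ("Disable" in the tables).
    """
    if not lan0_enabled and not lan1_enabled:
        return (None, None)  # both PCI functions disabled, device in low power
    if lan0_enabled and lan1_enabled:
        # LAN Function Select swaps the ports between the two functions.
        return ("LAN 1", "LAN 0") if lan_function_select else ("LAN 0", "LAN 1")

    survivor = "LAN 0" if lan0_enabled else "LAN 1"
    if not dummy_mode:
        # Legacy mode: the remaining port always resides on function 0.
        return (survivor, None)

    # Dummy Function mode: ports keep their location. If the port that would
    # sit on function 0 is the disabled one, function 0 becomes a dummy.
    port_on_function0 = "LAN 1" if lan_function_select else "LAN 0"
    if port_on_function0 == survivor:
        return (survivor, None)
    return ("Dummy", survivor)
```

For example, with LAN 0 disabled, LAN Function Select = 0, and Dummy Function mode enabled, the sketch returns ("Dummy", "LAN 1"), matching the corresponding row of Table 4-8.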
The following EEPROM bits control Function Disable:
• The access of the host through a PCI function to LAN1 can be enabled or disabled according to the
“LAN PCI Disable” bit in EEPROM word 0x10 (Section 6.2.8).
• The “LAN Disable Select” EEPROM field in word 0x10 indicates if port 1 is disabled (Section 6.2.8).
• The “LAN Function Select” bit in EEPROM word 0x21 defines the correspondence between LAN Port
and PCI function (Section 6.2.22)
• The “Dummy Function Enable” bit in EEPROM word 0x1B enables the Dummy Function mode.
Default value is disabled (Section 6.2.18).
• The “PHY_in_LAN_disable” bit in EEPROM words 0x10 and 0x20 controls the availability of the
disabled function to manageability channel when disabled through the LAN0_Dis_N or LAN1_Dis_N
pins (Section 6.2.8 and Section 6.2.9).
When a particular LAN is fully disabled, all internal clocks to that LAN are disabled, the device is held in
reset, and the internal PHY for that LAN is powered-down. In both modes, the device does not respond
to PCI configuration cycles. Effectively, the LAN device becomes invisible to the system from both a
configuration and power-consumption standpoint.
4.3.3
Control Options
Each function has a separate enabling mechanism. Any function that is not enabled does not operate
and does not expose its PCI configuration registers.
4.3.3.1
PCI Functions Disable Options
Table 4-9. Strapping for Control Options
Function   Control Options
LAN 0      Strapping option + EEPROM word 0x20 bit 13 (full/PCI-only disable in case of strap)
LAN 1      Strapping option + EEPROM word 0x10 bit 13 (full/PCI-only disable in case of strap) /
           EEPROM word 0x10 bit 11 (full disable) / EEPROM word 0x10 bit 10 (PCI-only disable)
The 82576 strapping option for LAN Disable feature:
Table 4-10. Strapping for LAN Disable
Symbol       Ball #   Name and Function
LAN0_Dis_N   B13      Strapping option pin, always active. This pin has an internal weak pull-up
                      resistor. If this pin is not connected or is driven high during init time,
                      LAN 0 is enabled. If this pin is driven low during init time, LAN 0 is
                      disabled. This pin is also used for testing and scan. When used for testing
                      or scan, the LAN disable functionality is not active.
LAN1_Dis_N   A15      Strapping option pin, always active. This pin has an internal weak pull-up
                      resistor. If this pin is not connected or is driven high during init time,
                      LAN 1 is enabled. If this pin is driven low during init time, the LAN 1
                      function is disabled. This pin is also used for testing and scan. When used
                      for testing or scan, the LAN disable functionality is not active.
4.3.4
Event Flow for Enable/Disable Functions
This section describes the driving levels and event sequence for device functionality. Following a Power
on Reset / Internal Power / PE_RST_N / In-Band reset, the LANx_DIS_N signals should be driven high (or
left open) for nominal operation. If a LAN function is never required, its associated
disable strapping pin can be tied statically low.
Case A - BIOS disables the LAN function at boot time by using strapping:
1. Assume that, following the power-up sequence, the LANx_DIS_N signals are driven high.
2. The PCIe link is established following PERST.
3. The BIOS recognizes that a LAN function in the 82576 should be disabled.
4. The BIOS drives the LANx_DIS_N signal low.
5. The BIOS should assert the PCIe reset, either in-band or via PE_RST_N.
6. As a result, the 82576 samples the LANx_DIS_N signals, disables the LAN function, and issues an
internal reset to this function.
7. The BIOS might start the device enumeration procedure (the disabled LAN function is invisible or
changed to a dummy function).
8. Proceed with nominal operation.
9. To re-enable the function, drive the LANx_DIS_N signal high and then ask the user to issue
a warm boot, which triggers bus enumeration.
4.3.4.1
Multi-Function Advertisement
If one of the LAN devices is disabled, the 82576 is no longer a multi-function device. The 82576
normally reports 0x80 in the Header Type field of the PCI configuration header, indicating multi-function
capability. However, if a LAN is disabled, the 82576 reports 0x0 in this field to signify single-function
capability.
4.3.4.2
Legacy Interrupts Utilization
When both LAN devices are enabled, the 82576 can use the INTA# to INTC# interrupts for interrupt
reporting. The EEPROM Initialization Control Word 3 (bits 12:11) associated with each LAN device
controls which of these interrupts is used for that LAN device. The specific interrupt pin used is
reported in the PCI Configuration Header Interrupt Pin field associated with each LAN device.
However, if only one LAN device is enabled, INTA# must be used for this LAN device,
regardless of the EEPROM configuration. Under these circumstances, the Interrupt Pin field of the PCI
header always reports a value of 0x1, indicating INTA# usage.
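The Header Type and Interrupt Pin rules from the two preceding subsections can be summarized in a short Python sketch. The function name is hypothetical, and eeprom_interrupt_pin is assumed to be the Interrupt Pin value already decoded from Initialization Control Word 3 bits 12:11 (the decoding itself is not shown because the encoding is not given here).

```python
def advertised_header(lan0_enabled, lan1_enabled, eeprom_interrupt_pin):
    """Return (Header Type, Interrupt Pin) as seen on an enabled LAN function.

    eeprom_interrupt_pin is only honored when both LAN devices are enabled.
    """
    both_enabled = lan0_enabled and lan1_enabled
    # 0x80 advertises a multi-function device, 0x00 a single-function one.
    header_type = 0x80 if both_enabled else 0x00
    # A single enabled LAN device must use INTA#, reported as 0x1.
    interrupt_pin = eeprom_interrupt_pin if both_enabled else 0x1
    return header_type, interrupt_pin
```

The case where both LAN devices are disabled is not meaningful for this sketch, since no LAN function is visible to report either field.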
4.3.4.3
Power Reporting
When both LAN devices are enabled, the PCI Power Management Register Block has the capability of
reporting a “Common Power” value. The Common Power value is reflected in the Data field of the PCI
Power Management registers. The value reported as Common Power is specified via EEPROM, and is
reflected in the Data field whenever the Data_Select field has a value of 0x8 (0x8 = Common Power
Value Select).
When only one LAN is enabled, the 82576 appears as a single-function device. In this case, the Common Power
value, if selected, reports 0x0 (undefined value), because Common Power is undefined for a single-function
device.
4.4
Device Disable
For a LOM design, it might be desirable for the system to provide BIOS-setup capability for selectively
enabling or disabling LOM devices. This might give the end user more control over system resource
management, avoid conflicts with add-in NIC solutions, and so on. The 82576 supports being selectively
enabled or disabled.
Note:
If the 82576 is configured to provide a 50MHz NC-SI clock (via the NC-SI Output Clock
EEPROM bit), then the device should not be disabled.
Device Disable is initiated by assertion of the asynchronous DEV_OFF_N pin. The DEV_OFF_N pin
should always be connected to enable correct device operation.
The EEPROM "Power Down Enable" bit (Section 6.2.7) enables device disable mode (hardware default is
that the mode is disabled).
While in device disable mode, the PCIe link is in the L3 state, the PHY is in power-down mode, and the
output buffers are tri-stated.
Assertion or de-assertion of PCIe PE_RST_N does not have any effect while the device is in device
disable mode (i.e., the device stays in the respective mode as long as DEV_OFF_N is asserted).
However, the device might momentarily exit the device disable mode from the time PCIe PE_RST_N is
de-asserted again and until the EEPROM is read.
During power-up, the DEV_OFF_N pin is ignored until the EEPROM is read. From that point, the device
might enter Device Disable if DEV_OFF_N is asserted.
Note:
De-assertion of the DEV_OFF_N pin causes a fundamental reset to the 82576.
Note to system designer: The DEV_OFF_N pin should maintain its state during system reset and system
sleep states. The design should also ensure the proper default value on system power-up. For example, one could
use a GPIO pin that defaults to '1' (enable) and is on system suspend power (i.e., it maintains state in
S0-S5 ACPI states).
4.4.1
BIOS Handling of Device Disable
1. Assume that, following the power-up sequence, the DEV_OFF_N signal is driven high (otherwise the
device is already disabled).
2. The PCIe link is established following PERST.
3. The BIOS recognizes that the whole device should be disabled.
4. The BIOS drives the DEV_OFF_N signal low.
5. As a result, the 82576 samples the DEV_OFF_N signal and enters the device disable mode.
6. The BIOS puts the link in the Electrical Idle state (at the other end of the PCIe link) by setting the
Link Disable bit in the Link Control register.
7. The BIOS might start the device enumeration procedure (all of the device functions are invisible).
8. Proceed with nominal operation.
9. To re-enable the device, drive the DEV_OFF_N signal high and later perform bus enumeration.
4.5
Software Initialization and Diagnostics
4.5.1
Introduction
This chapter discusses general software notes for the 82576, especially initialization steps. This
includes general hardware, power-up state, basic device configuration, initialization of transmit and
receive operation, link configuration, software reset capability, statistics, and diagnostic hints.
4.5.2
Power Up State
When the 82576 powers up it reads the EEPROM. The EEPROM contains sufficient information to bring
the link up and configure the 82576 for manageability and/or APM wakeup. However, software
initialization is required for normal operation.
The power-up sequence, as well as transitions between power states, are described in section 4.1.1.
The detailed timing is given in Section 5.5. The next section gives more details on configuration
requirements.
4.5.3
Initialization Sequence
The software device driver typically issues the following sequence of commands to the device in order
to initialize the 82576 to normal operation. The major initialization steps are:
• Disable Interrupts - see Interrupts during initialization.
• Issue Global Reset and perform General Configuration - see Global Reset and General
Configuration.
• Setup the PHY and the link - see Link Setup Mechanisms and Control/Status Bit Summary.
• Initialize all statistical counters - see Initialization of Statistics.
• Initialize Receive - see Receive Initialization.
• Initialize Transmit - see Transmit Initialization.
• Enable Interrupts - see Interrupts during initialization.
4.5.4
Interrupts During Initialization
Most drivers disable interrupts during initialization to prevent re-entering the interrupt routine.
Interrupts are disabled by writing to the IMC register. Note that interrupts must also be disabled
after issuing a global reset, so a typical driver initialization flow is:
• Disable interrupts
• Issue a Global Reset
• Disable interrupts (again)
• …
After the initialization is done, a typical driver enables the desired interrupts by writing to the IMS
register.
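The ordering described in this section can be sketched as follows. This is a minimal Python illustration, not driver code: FakeNic is a hypothetical stand-in that only records operations, registers are addressed by name rather than by offset, the mask value 0xFFFFFFFF is an assumption, and global_reset() hides however the reset is actually issued.

```python
ALL_CAUSES = 0xFFFFFFFF  # assumed "all interrupt causes" mask for this sketch


class FakeNic:
    """Hypothetical stand-in for the device; it only logs what was done."""

    def __init__(self):
        self.log = []

    def write(self, register, value):
        self.log.append(("write", register, value))

    def global_reset(self):
        self.log.append(("reset",))


def init_with_interrupts_masked(nic, wanted_causes):
    nic.write("IMC", ALL_CAUSES)     # disable interrupts
    nic.global_reset()               # issue a global reset
    nic.write("IMC", ALL_CAUSES)     # disable interrupts again after the reset
    # ... the remaining initialization steps would run here ...
    nic.write("IMS", wanted_causes)  # enable the desired interrupts last
```

The second IMC write is the step that is easy to forget: it is needed because interrupts must be disabled again once the reset has been issued.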
4.5.5
Global Reset and General Configuration
Device initialization typically starts with a global reset that puts the device into a known state and
enables the device driver to continue the initialization sequence.
Several values in the Device Control Register (CTRL) need to be set upon power up or after a device
reset for normal operation.
• FD should be set per interface negotiation (if done in software), or is set by the hardware if the
interface is Auto-Negotiating. This is reflected in the Device Status Register in the Auto-Negotiating
case.
• Speed is determined via Auto-Negotiation by the PHY, Auto-Negotiation by the PCS layer in SGMII/
SerDes mode, or forced by software if the link is forced. Status information for speed is also
readable in STATUS.
• ILOS should normally be set to 0.
Set the packet buffer allocation for transmit and receive flows in the RXPBS, TXPBS & SWPBS registers.
This should be done before RCTL.RXEN & TCTL.TXEN are set. An ordered disabling of all queues and of
the Rx & Tx flows is required before any change in the packet buffer allocation is done.
4.5.6
Flow Control Setup
If flow control is enabled, program the FCRTL, FCRTH, FCTTV and FCRTV registers. To avoid
packet loss, FCRTH should be set at least two maximum-size packets below the receive
buffer size. For example, assuming a packet buffer size of 32 KB and an expected maximum packet size of
9.5 KB, the FCRTH value should be set to 32 - 2 * 9.5 = 13 KB, i.e. RTH should be set to 0x340
(13 KB expressed in 16-byte units).
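The threshold arithmetic can be checked with a small Python helper. The function name is hypothetical, and the 16-byte granularity of the RTH field is an assumption of this sketch that should be confirmed against the FCRTH register description.

```python
def fcrth_rth(buffer_bytes, max_packet_bytes, unit_bytes=16):
    """Highest RTH value that leaves room for two maximum-size packets.

    unit_bytes is the assumed granularity of the RTH field.
    """
    threshold_bytes = buffer_bytes - 2 * max_packet_bytes
    if threshold_bytes <= 0:
        raise ValueError("packet buffer too small for two maximum-size packets")
    return threshold_bytes // unit_bytes


# 32 KB buffer, 9.5 KB maximum packet size: 32768 - 2 * 9728 = 13312 bytes (13 KB).
rth = fcrth_rth(32 * 1024, 9728)
```

With these inputs the helper yields 832 (0x340), that is, 13 KB in 16-byte units.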
4.5.7
Link Setup Mechanisms and Control/Status Bit Summary
Note:
The CTRL_EXT.LINK_MODE value should be set to the desired mode prior to the setting of
the other fields in the link setup procedures.
4.5.7.1
PHY Initialization
Refer to the PHY documentation for the initialization and link setup steps. The device driver uses the
MDIC register to initialize the PHY and setup the link. Section 3.5.4.3 describes the link setup for the
internal copper PHY. Section 3.5.2.2 describes the usage of the MDIC register.
4.5.7.2
MAC/PHY Link Setup (CTRL_EXT.LINK_MODE = 00)
This section summarizes the various means of establishing proper MAC/PHY link setups, differences in
MAC CTRL register settings for each mechanism, and the relevant MAC status bits. The methods are
ordered in terms of preference (the first mechanism being the most preferred).
4.5.7.2.1
MAC Settings Automatically Based on Duplex and Speed Resolved by PHY (CTRL.FRCDPLX = 0b,
CTRL.FRCSPD = 0b)
CTRL.FD        Don't care; duplex setting is established from PHY's internal indication to the
               MAC (FDX) after PHY has auto-negotiated a successful link-up.
CTRL.SLU       Must be set to 1 by software to enable communications between MAC and PHY.
CTRL.RFCE      Must be set by S/W after reading flow control resolution from PHY registers.
CTRL.TFCE      Must be set by S/W after reading flow control resolution from PHY registers.
CTRL.SPEED     Don't care; speed setting is established from PHY's internal indication to the MAC
               (SPD_IND) after PHY has auto-negotiated a successful link-up.
STATUS.FD      Reflects the actual duplex setting (FDX) negotiated by the PHY and indicated to
               MAC.
STATUS.LU      Reflects link indication (LINK) from PHY qualified with CTRL.SLU (set to 1).
STATUS.SPEED   Reflects actual speed setting negotiated by the PHY and indicated to the MAC
               (SPD_IND).
4.5.7.2.2
MAC Duplex and Speed Settings Forced by Software Based on Resolution of PHY
(CTRL.FRCDPLX = 1b, CTRL.FRCSPD = 1b)
CTRL.FD        Set by software based on reading PHY status register after PHY has auto-negotiated
               a successful link-up.
CTRL.SLU       Must be set to 1 by software to enable communications between MAC and PHY.
CTRL.RFCE      Must be set by S/W after reading flow control resolution from PHY registers.
CTRL.TFCE      Must be set by S/W after reading flow control resolution from PHY registers.
CTRL.SPEED     Set by software based on reading PHY status register after PHY has auto-negotiated
               a successful link-up.
STATUS.FD      Reflects the MAC forced duplex setting written to CTRL.FD.
STATUS.LU      Reflects link indication (LINK) from PHY qualified with CTRL.SLU (set to 1).
STATUS.SPEED   Reflects MAC forced speed setting written in CTRL.SPEED.
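As an illustration of this mode, the following Python sketch derives the CTRL field values that software would write once it has read the PHY's resolved link parameters. The function name and the dictionary representation are hypothetical. The speed encoding 10b for 1000 Mb/s is the one used elsewhere in this chapter; the codes for 10 and 100 Mb/s are assumptions of this sketch.

```python
# CTRL.SPEED codes: 0b10 for 1000 Mb/s appears in this chapter,
# the 10 and 100 Mb/s codes below are assumed for illustration.
SPEED_CODE = {10: 0b00, 100: 0b01, 1000: 0b10}


def forced_ctrl_fields(phy_speed_mbps, phy_full_duplex, rx_pause, tx_pause):
    """CTRL fields for the 'forced by software from PHY resolution' mode."""
    return {
        "FRCSPD": 1,                       # MAC speed is forced
        "FRCDPLX": 1,                      # MAC duplex is forced
        "SLU": 1,                          # enable MAC/PHY communication
        "SPEED": SPEED_CODE[phy_speed_mbps],
        "FD": 1 if phy_full_duplex else 0,
        "RFCE": 1 if rx_pause else 0,      # from PHY flow control resolution
        "TFCE": 1 if tx_pause else 0,
    }
```

A driver would then merge these values into the CTRL register with a read-modify-write, which is outside the scope of this sketch.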
4.5.7.2.3
MAC/PHY Duplex and Speed Settings Both Forced by Software (Fully-Forced Link Setup)
(CTRL.FRCDPLX = 1b, CTRL.FRCSPD = 1b, CTRL.SLU = 1b)
CTRL.FD        Set by software to desired full/half duplex operation (must match duplex setting
               of PHY).
CTRL.SLU       Must be set to 1 by software to enable communications between MAC and PHY.
               PHY must also be forced/configured to indicate positive link indication (LINK) to
               the MAC.
CTRL.RFCE      Must be set by S/W to desired flow-control operation (must match flow-control
               settings of PHY).
CTRL.TFCE      Must be set by S/W to desired flow-control operation (must match flow-control
               settings of PHY).
CTRL.SPEED     Set by software to desired link speed (must match speed setting of PHY).
STATUS.FD      Reflects the MAC duplex setting written by software to CTRL.FD.
STATUS.LU      Reflects 1 (positive link indication LINK from PHY qualified with CTRL.SLU). Note
               that since both CTRL.SLU and the PHY link indication LINK are forced, this bit being
               set does not guarantee that operation of the link has been truly established.
STATUS.SPEED   Reflects MAC forced speed setting written in CTRL.SPEED.
4.5.7.3
MAC/SERDES Link Setup (CTRL_EXT.LINK_MODE = 11b)
Link setup procedures using an external SERDES interface mode:
4.5.7.3.1
Hardware Auto-Negotiation Enabled (PCS_LCTL.AN ENABLE = 1b; CTRL.FRCSPD = 0b;
CTRL.FRCDPLX = 0b)
CTRL.FD                       Ignored; duplex is set by priority resolution of PCS_ANDV and PCS_LPAB.
CTRL.SLU                      Must be set to 1 by software to enable communications to the SerDes.
CTRL.RFCE                     Set by hardware according to auto-negotiation resolution (see footnote 1).
CTRL.TFCE                     Set by hardware according to auto-negotiation resolution (see footnote 1).
CTRL.SPEED                    Ignored; speed is always 1000 Mb/s when using SerDes mode communications.
STATUS.FD                     Reflects hardware-negotiated priority resolution.
STATUS.LU                     Reflects PCS_LSTS.AN COMPLETE (Auto-Negotiation complete).
STATUS.SPEED                  Reflects 1000 Mb/s speed, reporting a fixed value of 10b.
PCS_LCTL.FSD                  Must be zero.
PCS_LCTL.Force Flow Control   Must be zero (see footnote 1).
PCS_LCTL.FSV                  Must be set to 10b. Only 1000 Mb/s is supported in SerDes mode.
PCS_LCTL.FDV                  Ignored; duplex is set by priority resolution of PCS_ANDV and PCS_LPAB.
1. If PCS_LCTL.Force Flow Control is set, the auto-negotiation result is not reflected in the CTRL.RFCE
and CTRL.TFCE registers. In this case, software must set these fields after reading the flow control
resolution from the PCS registers.
4.5.7.3.2
Auto-Negotiation Skipped (PCS_LCTL.AN ENABLE = 0b; CTRL.FRCSPD = 1b; CTRL.FRCDPLX = 1b)
CTRL.FD                       Must be set to 1b. Only full duplex is supported in SerDes mode.
CTRL.SLU                      Must be set to 1 by software to enable communications to the SerDes.
CTRL.RFCE                     Set by software for the desired mode of operation.
CTRL.TFCE                     Set by software for the desired mode of operation.
CTRL.SPEED                    Must be set to 10b. Only 1000 Mb/s is supported in SerDes mode.
STATUS.FD                     Reflects the value written by software to CTRL.FD.
STATUS.LU                     Reflects whether the PCS detected comma symbols, qualified with CTRL.SLU
                              (set to 1).
STATUS.SPEED                  Reflects 1000 Mb/s speed, reporting a fixed value of 10b.
PCS_LCTL.FSD                  Must be set to 1 by software to enable communications to the SerDes.
PCS_LCTL.Force Flow Control   Must be set to 1.
PCS_LCTL.FSV                  Must be set to 10b. Only 1000 Mb/s is supported in SerDes mode.
PCS_LCTL.FDV                  Must be set to 1b. Only full duplex is supported in SerDes mode.
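The fixed requirements of the SerDes mode with auto-negotiation skipped lend themselves to a table-driven check. The Python sketch below is illustrative only: the field names are written as plain dictionary keys chosen for this example, and the function is hypothetical rather than part of any driver.

```python
# Required values for SerDes mode with auto-negotiation skipped, taken from
# the field list above. CTRL.RFCE and CTRL.TFCE are free for software to choose.
SERDES_FORCED_REQUIRED = {
    "PCS_LCTL.AN_ENABLE": 0,
    "CTRL.FRCSPD": 1,
    "CTRL.FRCDPLX": 1,
    "CTRL.FD": 1,                       # only full duplex is supported
    "CTRL.SLU": 1,
    "CTRL.SPEED": 0b10,                 # only 1000 Mb/s is supported
    "PCS_LCTL.FSD": 1,
    "PCS_LCTL.FORCE_FLOW_CONTROL": 1,
    "PCS_LCTL.FSV": 0b10,
    "PCS_LCTL.FDV": 1,
}


def serdes_forced_violations(fields):
    """Return the sorted names of fields that differ from the required value."""
    return sorted(
        name
        for name, required in SERDES_FORCED_REQUIRED.items()
        if fields.get(name) != required
    )
```

An empty result means the proposed configuration matches the list; a missing field is reported the same way as a wrong value.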
4.5.7.4
MAC/SGMII Link Setup (CTRL_EXT.LINK_MODE = 10b)
Link setup procedures using an external SGMII interface mode:
4.5.7.4.1
Hardware Auto-Negotiation Enabled (PCS_LCTL.AN ENABLE = 1b, CTRL.FRCDPLX = 0b,
CTRL.FRCSPD = 0b)
CTRL.FD                       Ignored; duplex is set by priority resolution of PCS_ANDV and PCS_LPAB.
CTRL.SLU                      Must be set to 1 by software to enable communications to the SerDes.
CTRL.RFCE                     Must be set by software after reading flow control resolution from PCS
                              registers.
CTRL.TFCE                     Must be set by software after reading flow control resolution from PCS
                              registers.
CTRL.SPEED                    Ignored; speed setting is established from SGMII's internal indication to
                              the MAC after SGMII has auto-negotiated a successful link-up.
STATUS.FD                     Reflects hardware-negotiated priority resolution.
STATUS.LU                     Reflects PCS_LSTS.Link OK.
STATUS.SPEED                  Reflects actual speed setting negotiated by the SGMII and indicated to the
                              MAC.
PCS_LCTL.Force Flow Control   Ignored.
PCS_LCTL.FSD                  Should be set to zero.
PCS_LCTL.FSV                  Ignored; speed is set by priority resolution of PCS_ANDV and PCS_LPAB.
PCS_LCTL.FDV                  Ignored; duplex is set by priority resolution of PCS_ANDV and PCS_LPAB.
4.5.8
Initialization of Statistics
Statistics registers are hardware-initialized to values as detailed in each particular register's
description. The initialization of these registers begins upon transition to D0active power state (when
internal registers become accessible, as enabled by setting the Memory Access Enable of the PCIe
Command register), and is guaranteed to be completed within 1 sec. of this transition. Access to
statistics registers prior to this interval might return indeterminate values.
All of the statistical counters are cleared on read and a typical device driver reads them (thus making
them zero) as a part of the initialization sequence.
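The clear-on-read behavior can be modeled in a few lines of Python. ClearOnReadCounter is a toy stand-in for a hardware statistics register, and the counter names used with it are placeholders, not register names from this document.

```python
class ClearOnReadCounter:
    """Toy model of a statistics register that resets to zero when read."""

    def __init__(self, value):
        self._value = value

    def read(self):
        value, self._value = self._value, 0  # reading returns and clears
        return value


def clear_statistics(counters):
    """Read every counter once and discard the result, leaving all at zero."""
    for counter in counters.values():
        counter.read()
```

After clear_statistics() runs, the next read of any counter in the collection returns zero until new events are counted.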
4.5.9
Receive Initialization
Program the Receive address register(s) per the station address. This can come from the EEPROM or
from any other means (for example, on some machines, this comes from the system PROM not the
EEPROM on the adapter card).
Set up the MTA (Multicast Table Array) per software. This means zeroing all entries initially and adding
in entries as requested.
Program RCTL with appropriate values. If initializing it at this stage, it is best to leave the receive logic
disabled (EN = 0b) until after the receive descriptor ring has been initialized. If VLANs are not used,
software should clear VFE. Then there is no need to initialize the VFTA. Select the receive descriptor
type.
The following should be done once per receive queue needed:
• Allocate a region of memory for the receive descriptor list.
• Receive buffers of appropriate size should be allocated and pointers to these buffers should be
stored in the descriptor ring.
• Program the descriptor base address with the address of the region.
• Set the length register to the size of the descriptor ring.
• Program SRRCTL of the queue according to the size of the buffers and the required header
handling.
• If header split or header replication is required for this queue, program the PSRTYPE register
according to the required headers.
• Enable the queue by setting RXDCTL.ENABLE. In the case of queue zero, the enable bit is set by
default - so the ring parameters should be set before RCTL.RXEN is set.
• Poll the RXDCTL register until the ENABLE bit is set. The tail should not be bumped before this bit
has been read as one.
• Program the direction of packets to this queue according to the mode selected in MRQC. Packets
directed to a disabled queue are dropped.
Note:
The tail register of the queue (RDT[n]) should not be bumped until the queue is enabled.
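The per-queue ordering (program the ring, enable, poll, and only then bump the tail) can be sketched against a toy register model. Everything in this Python example is hypothetical: FakeRxQueue is not hardware access, the register names are shorthand keys (for instance "RDBA" for the descriptor base address), and the two-poll enable delay is invented to exercise the polling loop.

```python
class FakeRxQueue:
    """Toy model of one receive queue's registers (not real hardware access)."""

    def __init__(self):
        self.regs = {"RDBA": 0, "RDLEN": 0, "SRRCTL": 0, "RXDCTL_ENABLE": 0, "RDT": 0}
        self._enable_requested = False
        self._polls_until_enabled = 2   # pretend the enable takes a few polls

    def write(self, name, value):
        if name == "RXDCTL_ENABLE" and value:
            self._enable_requested = True   # takes effect only later
            return
        if name == "RDT" and not self.regs["RXDCTL_ENABLE"]:
            raise RuntimeError("tail bumped before the queue was enabled")
        self.regs[name] = value

    def read(self, name):
        if name == "RXDCTL_ENABLE" and self._enable_requested:
            if self._polls_until_enabled > 0:
                self._polls_until_enabled -= 1
            else:
                self.regs["RXDCTL_ENABLE"] = 1
        return self.regs[name]


def init_rx_queue(queue, ring_base, ring_bytes, srrctl, buffers_posted, max_polls=100):
    queue.write("RDBA", ring_base)      # descriptor ring base address
    queue.write("RDLEN", ring_bytes)    # descriptor ring length
    queue.write("SRRCTL", srrctl)       # buffer sizes and header handling
    queue.write("RXDCTL_ENABLE", 1)     # request the enable
    for _ in range(max_polls):
        if queue.read("RXDCTL_ENABLE"): # wait until ENABLE reads as one
            break
    else:
        raise TimeoutError("queue did not report ENABLE")
    queue.write("RDT", buffers_posted)  # only now bump the tail
```

The model raises an error if the tail is written early, which makes the ordering constraint from the note above directly testable.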
4.5.9.1
Initialize the Receive Control Register
To properly receive packets the receiver should be enabled by setting RCTL.RXEN. This should be done
only after all other setup is accomplished. If software uses the Receive Descriptor Minimum Threshold
Interrupt, that value should be set.
4.5.9.2
Dynamic Enabling and Disabling of Receive Queues
Receive queues can be dynamically enabled or disabled, provided the following procedure is followed:
Enabling:
• Follow the per-queue initialization described in the previous section.
• Note that if there are still packets in the packet buffer directed to this queue according to previous
settings, they are received after the queue is re-enabled. To avoid this condition,
software might poll the PBRWAC register. Once two wrap-arounds or an empty condition of the
relevant packet buffer is detected, the queue might be re-enabled.
Disabling:
• Disable the direction of packets to this queue.
• Disable the queue by clearing RXDCTL.ENABLE. The 82576 stops fetching and writing back
descriptors from this queue immediately. The 82576 eventually completes the storage of one buffer
allocated to this queue. Any further packet directed to this queue is dropped. If the currently
processed packet is spread over more than one buffer, all subsequent buffers are not written.
• The 82576 clears RXDCTL.ENABLE only after all pending memory accesses to the descriptor ring or
to the buffers are done. The driver should poll this bit before releasing the memory allocated to this
queue.
The Rx path might be disabled only after all Rx queues are disabled.
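The key point of the disable sequence is that memory is released only after ENABLE reads back as zero. The Python sketch below illustrates that rule with a toy model; FakeRxQueue, its pending_accesses counter, and the release_memory callback are all hypothetical names introduced for this example.

```python
class FakeRxQueue:
    """Toy model: ENABLE reads back 0 only after pending accesses have drained."""

    def __init__(self, pending_accesses):
        self.enable = 1
        self.steering = True
        self._pending = pending_accesses
        self._clear_requested = False

    def clear_enable(self):
        self._clear_requested = True

    def read_enable(self):
        if self._clear_requested:
            if self._pending > 0:
                self._pending -= 1      # one outstanding access completes
            else:
                self.enable = 0
        return self.enable


def disable_rx_queue(queue, release_memory, max_polls=1000):
    queue.steering = False              # stop directing packets to the queue
    queue.clear_enable()                # request the disable
    for _ in range(max_polls):
        if queue.read_enable() == 0:    # hardware finished its accesses
            release_memory()
            return True
    return False                        # still busy: memory is not released
```

Returning False without calling release_memory keeps the ring and buffers alive when the poll budget runs out, so the caller can retry instead of freeing memory the device may still touch.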
4.5.10
Transmit Initialization
Program the TCTL register according to the MAC behavior needed.
If half-duplex operation is expected, program the TCTL_EXT.COLD field. For internal PHY mode, the
default value of 0x41 is adequate. For SGMII mode, use a value reflecting the 82576 and the PHY SGMII
delays. A suggested value for a typical PHY is 0x46 for 10 Mb/s and 0x4C for 100 Mb/s.
The following should be done once per transmit queue:
• Allocate a region of memory for the transmit descriptor list.
• Program the descriptor base address with the address of the region.
• Set the length register to the size of the descriptor ring.
• Program the TXDCTL register with the desired TX descriptor write back policy. Suggested values
are:
— WTHRESH = 1b
— All other fields 0b.
• If needed, set TDWBAL/TDWBAH to enable head write-back.
• Enable the queue using TXDCTL.ENABLE (queue zero is enabled by default).
• Poll the TXDCTL register until the ENABLE bit is set.
Note:
The tail register of the queue (TDT[n]) should not be bumped until the queue is enabled.
Enable transmit path by setting TCTL.EN. This should be done only after all other settings are done.
4.5.10.1
Dynamic Queue Enabling and Disabling
Transmit queues can be dynamically enabled or disabled, provided the following procedure is followed:
Enabling:
• Follow the per queue initialization described in the previous section.
Disabling:
• Stop storing packets for transmission in this queue.
• Wait until the head of the queue (TDH) is equal to the tail (TDT), i.e. the queue is empty.
• Disable the queue by clearing TXDCTL.ENABLE.
The Tx path might be disabled only after all Tx queues are disabled.
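The transmit disable sequence waits for the queue to drain before clearing the enable. The Python sketch below models this with a toy queue in which each read of the head pretends the hardware has completed one more descriptor; the class, its methods, and the poll limit are assumptions made for illustration.

```python
class FakeTxQueue:
    """Toy model of a transmit ring; each head read advances one descriptor."""

    def __init__(self, head, tail, ring_size):
        self.head = head
        self.tail = tail
        self.ring_size = ring_size
        self.enabled = True

    def read_tdh(self):
        current = self.head
        if self.head != self.tail:
            # Pretend the hardware transmitted one more descriptor.
            self.head = (self.head + 1) % self.ring_size
        return current


def disable_tx_queue(queue, max_polls=1000):
    """Assumes the caller has already stopped posting packets to the queue."""
    for _ in range(max_polls):
        if queue.read_tdh() == queue.tail:  # TDH == TDT: the queue is empty
            queue.enabled = False           # clear TXDCTL.ENABLE
            return True
    return False                            # not drained: leave it enabled
```

The modulo in the model covers the case where the head wraps past the end of the ring before reaching the tail.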
4.5.11
Virtualization Initialization Flow
4.5.11.1
Next Generation VMDq Mode
4.5.11.1.1
Global Filtering and Offload Capabilities
• Select one of the Next Generation VMDq pooling methods - MAC/VLAN filtering for pool selection
and RSS for the queue in pool selection. MRQC.Multiple Receive Queues Enable = 011b, 100b or
101b.
• In RSS mode, the RSS key (RSSRK) and redirection table (RETA) should be programmed. Note that
the redirection table is common to all the pools and only indicates the queue inside the pool to use
once the pool is chosen.
• Set the RPLOLR and RPLPSRTYPE registers to define the behavior of replicated packets.
• Configure VT_CTL.DEF_PL to define the default pool. If packets with no pools should be dropped,
set VT_CTL.Dis_def_Pool field.
• Enable replication via VT_CTL.replication_en.
• Enable loopback via DTXSWC.Loopback.
• If needed, enable padding of small packets via the RCTL.PSP
4.5.11.1.2
Mirroring Rules
For each mirroring rule to be activated:
a. Set the type of traffic to be mirrored in the VMRCTL[n] register.
b. Set the mirror pool in the VMRCTL[n].MP field.
c. For pool mirroring, set the VMRVM[n] register with the pools to be mirrored.
d. For VLAN mirroring, set VMRVLAN[n] with the indexes from the VLVF registers of the VLANs
to be mirrored.
4.5.11.1.3
Per Pool Settings
As soon as a pool of queues is associated with a VM, software should set the following parameters:
1. Address filtering:
   a. Set the unicast MAC address of the VM by enabling the pool in the RAH/RAL registers.
   b. If all the MAC addresses are used, the unicast hash table (UTA) can be used. Pools servicing VMs
      whose address is in the hash table should be declared as such by setting VMOLR.ROPE. Packets
      received according to this method did not pass perfect filtering and are indicated as such.
   c. Enable the pool in all the RAH/RAL registers representing the multicast MAC addresses this VM
      belongs to.
   d. If all the MAC addresses are used, the multicast hash table (MTA) can be used. Pools servicing
      VMs using multicast addresses in the hash table should be declared as such by setting
      VMOLR.ROMPE. Packets received according to this method did not pass perfect filtering and are
      indicated as such.
   e. Define whether this VM should get all multicast/broadcast packets in the same VLAN via the
      VMOLR.MPE and VMOLR.BAM fields.
   f. Enable the pool in each VLVF register representing a VLAN this VM belongs to.
   g. A VM might be set to receive its own traffic, in case the source and the destination are in the
      same pool, via the DTXSWC.LLE field.
   h. Define whether the pool belongs to the default VLAN and should accept untagged packets via
      the VMOLR.AUPE field.
2. Offloads:
   a. Define whether the VLAN header should be stripped from the packet. The CRC is always stripped
      from the packet.
   b. Set which header split is required via the PSRTYPE register.
   c. Set whether larger-than-standard packets are allowed by the VM and what the largest allowed
      packet is (jumbo packet support) via VMOLR.RLPML and VMOLR.RLE.
   d. In RSS mode, define whether the pool uses RSS via the VMOLR.RSSE bit.
3. Queues:
   a. Enable Rx and Tx queues as described in Section 4.5.9 and Section 4.5.10.
   b. For each Rx queue, a drop/no-drop flag can be set in SRRCTL.DROP_EN or via the QDE register,
      controlling the behavior when no receive buffers are available in the queue to receive packets.
      The usual behavior is to allow drops in order to avoid head-of-line blocking, unless no-drop
      behavior is needed for some type of traffic (e.g. storage).
4.5.11.1.4
Security Features
4.5.11.1.4.1
Anti spoofing
For each pool, the driver may activate the MAC and VLAN anti spoof features via the relevant bit in
DTXSWC.MACAS and DTXSWC.VLANAS respectively.
4.5.11.1.4.2
Storm control
The driver may set limits on the broadcast or multicast traffic it can receive.
1. It should set how many 64-byte chunks of broadcast and multicast traffic are acceptable per
interval via BSCTRH and MSCTRH, respectively.
2. It should then set the interval to be used via the SCCRL.Interval field, and the action to take when
the broadcast or multicast traffic crosses the programmed threshold via the SCCRL.BDIPW,
SCCRL.BDICW, SCCRL.MDIPW, and SCCRL.MDICW fields.
3. The driver may be notified of storm control events through the ICR.SCE interrupt cause.
4.5.11.1.5
Allocation of Tx Bandwidth to VMs
4.5.11.1.5.1
Configuring Tx Bandwidth to VMs
Allocation of Tx Bandwidth to VMs feature is enabled or disabled via the programming of VMBACS and
VMBAMMW registers. When enabled, bandwidth to VMs (i.e. to Tx Queues) is configured via writing into
VMBASEL and VMBAC registers for each queue again.
The bandwidth configuring procedure is as follow 1. Allocate non-null rates to VMs present in the system RVMi (i=0..7), in Gb/s units, so that:
RVM0 + RVM1 + ... + RVM7 = 0.5 Gb/s
Assume also that for any different i,j:
RVMi / RVMj < 10 and RVMj / RVMi < 10
2. Allocate rates to enabled queues RQi (i=0..7), in Gb/s units, so that:
RQi = RQ(i+8) = RVMi / 2
3. Compute rate factors RFQi (i=0..15) for all the enabled Tx queues, so that:
RFQi = 1 Gb/s / RQi
4. Format the rate factors obtained in the previous step as fixed-point binary numbers, with a 10-bit
integer part left of the binary point and a 14-bit fractional part right of it. Then, for i=0..15, set
RTTDQSEL.TXDQ_IDX=i and:
a. Set RTTDVMRC.RF_INT = integer part of RFQi
b. Set RTTDVMRC.RF_DEC = fractional part of RFQi
5. Compute VM_MMW_SIZE for the VM rate-scheduler as follows:
VM_MMW_SIZE = 16 x MSS
This value avoids saturation under full workload. Refer to Section 4.5.11.1.5.2.
6. Set VMBAMMW.MMW_SIZE = VM_MMW_SIZE
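The arithmetic in steps 2 through 4 can be sketched as follows. This is an illustration only: the function name vm_rate_factors is hypothetical, the sketch does not check the sum of the VM rates, and it does no register access. It only derives the integer and fractional parts that step 4 asks for.

```python
from fractions import Fraction

RF_INT_BITS = 10   # width of the integer part, per step 4
RF_DEC_BITS = 14   # width of the fractional part, per step 4


def vm_rate_factors(vm_rates_gbps, link_gbps=1):
    """Return {queue_index: (rf_int, rf_dec)} for queues i and i+8 of each VM.

    Hypothetical helper: rates are given in Gb/s, one entry per VM.
    """
    rates = [Fraction(r) for r in vm_rates_gbps]
    if any(r <= 0 for r in rates):
        raise ValueError("every VM needs a non-null rate")
    if max(rates) / min(rates) >= 10:
        raise ValueError("rates of any two VMs must differ by less than 10x")

    factors = {}
    for i, rvm in enumerate(rates):
        rq = rvm / 2                                # step 2: RQi = RQ(i+8) = RVMi / 2
        rf = Fraction(link_gbps) / rq               # step 3: RFQi = 1 Gb/s / RQi
        fixed = round(rf * (1 << RF_DEC_BITS))      # step 4: 10.14 fixed point
        rf_int, rf_dec = divmod(fixed, 1 << RF_DEC_BITS)
        if rf_int >= (1 << RF_INT_BITS):
            raise ValueError("rate factor does not fit in 10 integer bits")
        factors[i] = factors[i + 8] = (rf_int, rf_dec)
    return factors
```

For example, eight VMs at 1/16 Gb/s each give every queue a rate of 1/32 Gb/s, so each rate factor is exactly 32 with a zero fractional part.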
4.5.11.1.5.2
Link Speed Change Procedure
Whenever the link status or speed changes, the 82576 operates the VM arbiters in a packet-based
round-robin mode and disables the VM rate-controllers. Software is responsible for re-enabling and
reconfiguring them according to the new link speed. However, to avoid any race condition between
hardware and software, the driver must perform the following procedure whenever a link
speed/status change interrupt occurs:
1. Check that the SPEED_CHG bit in the VMBACS register was asserted by hardware.
2. Read the VMBA_SET bit in the VMBACS register.
3. If the bit reads as 1, the VM rate-controllers were not completely disabled by hardware
(that is, a race occurred between hardware and software). Software must therefore clear the RC_ENA
bit in the VMBAC register for all the queues, or at least for the queue(s) for which it is still set.
4. Clear the SPEED_CHG bit in the VMBACS register.
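The four steps above can be sketched against a toy register model. Everything here is an assumption made for illustration: the bit positions are placeholders rather than datasheet values, MockDevice and on_link_change are hypothetical names, and the sketch assumes SPEED_CHG is cleared by writing it as 0.

```python
# Bit positions below are placeholders for illustration, not datasheet values.
SPEED_CHG = 1 << 31   # hypothetical position in VMBACS
VMBA_SET = 1 << 30    # hypothetical position in VMBACS
RC_ENA = 1 << 31      # hypothetical position in VMBAC
NUM_TX_QUEUES = 16


class MockDevice:
    """Toy register model: VMBAC is banked per queue and selected via VMBASEL."""

    def __init__(self):
        self.vmbacs = 0
        self.vmbasel = 0
        self.vmbac = [0] * NUM_TX_QUEUES

    def read(self, reg):
        if reg == "VMBAC":
            return self.vmbac[self.vmbasel]
        return getattr(self, reg.lower())

    def write(self, reg, value):
        if reg == "VMBAC":
            self.vmbac[self.vmbasel] = value
        else:
            setattr(self, reg.lower(), value)


def on_link_change(dev):
    """Run steps 1-4; return True if a speed change was pending and handled."""
    status = dev.read("VMBACS")
    if not (status & SPEED_CHG):              # step 1
        return False
    if status & VMBA_SET:                     # steps 2-3: hardware lost the race
        for queue in range(NUM_TX_QUEUES):
            dev.write("VMBASEL", queue)
            dev.write("VMBAC", dev.read("VMBAC") & ~RC_ENA)
    dev.write("VMBACS", dev.read("VMBACS") & ~SPEED_CHG)   # step 4
    return True
```

After the handler returns, the driver would reprogram the rate factors for the new link speed and re-enable the rate-controllers.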
4.5.11.2
IOV Initialization
The initialization flow used to enable an IOV function can be found in chapter 2 of the PCI-Express
Single Root I/O Virtualization and Sharing Specification.
4.5.11.2.1
PF Driver Initialization
The PF driver is responsible for the link setup and for handling all the filtering and offload
capabilities for all the VFs as described in Section 4.5.11.1.1, and the security features as described in
Section 4.5.11.1.4. It should also set the bandwidth allocation per transmit queue for each VF as
described in Section 4.5.10 and Section 4.5.11.1.5.
Note:
The link setup might include an authentication process (802.1X or other) and setup of the
MACSec channel.
In IOV mode, Next Generation VMDq + RSS mode is not available. RSS mode might be
used, but this assumes that all the VMs use the same key, RSS hash algorithms, and redirection
table, which is currently not in the plan of record (POR) of any VMM vendor.
After all the common parameters are set, the PF driver should set the RSTD bit in all the VFMailbox
registers by setting the CTRL_EXT.PFRSTD bit.
The PF might disable all active VF traffic (via the VFTE & VFRE registers) until the parameters of a VF
are set; see Section 4.5.11.1.3. VFs can be enabled using the same registers.
4.5.11.2.1.1
VF Specific Reset Coordination
After the PF driver receives an indication of a VF FLR via the VFLRE register, it should enable receive
and transmit for the VF only once the device is programmed with the right parameters as defined in
Section 4.5.11.1.3. The receive filtering is enabled using the VFRE register and the transmit filtering is
enabled via the VFTE register.
Note:
The filtering and offload setup might be based on central IT settings or on requests from
the VF drivers.
The PF driver should assert the VF reset via the VCTRL register before configuration of the
VF parameters.
4.5.11.2.2
VF Driver Initialization
Upon initialization, after the PF has indicated via the VFMailbox.RSTD bit that the global initialization
is done, the VF driver should communicate with the PF, either via the mailbox or via other software
mechanisms, to ensure that the right parameters of the VF are programmed as described in Section 4.5.11.1.3.
The mailbox mechanism is described in Section 7.10.2.9.1.
The PF should also set up the security measures as described in Section 4.5.11.1.4. In addition, the PF
may also program whether the VF is allowed to control VLAN insertion or whether VLAN insertion is
controlled by the PF via the relevant VMVIR register.
The PF driver might then send an acknowledgment message with the actual setup done according to the VF
request and the IT policy.
The VF driver should then set up the interrupts and the queues as described in Section 4.5.9 and
Section 4.5.10.
4.5.11.2.3
Full Reset Coordination
A mechanism is provided to synchronize reset procedures between the Physical Function and the VFs. It
is provided specifically for PF software reset but can be used in other reset cases as described below.
The procedure is as follows:
One of the following reset cases takes place:
• Internal_Power_On_Reset
• PCIe Reset (PERST# and in-band)
• D3hot --> D0
• FLR
• Software reset by the PF
The 82576 sets the RSTI bits in all the VFMailbox registers. Once the reset completes, each VF might
read its VFMailbox register to identify a reset in progress.
Once the PF has completed configuring the device, it sets the CTRL_EXT.PFRSTD bit. As a result, the 82576
clears the RSTI bits and sets the RSTD (Reset Done) bits in all the
VFMailbox registers.
Until an RSTD condition is detected, the VFs should access only the VFMailbox register and should not
attempt to activate the interrupt mechanism or the transmit and receive processes.
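From the VF side, the rule above amounts to polling VFMailbox and touching nothing else until RSTD appears. The following is a minimal sketch under stated assumptions: the bit positions of RSTI and RSTD are assumed (the text names the bits only), and wait_for_reset_done and the read_vfmailbox callback are hypothetical names.

```python
import time

# Assumed bit positions in VFMailbox; this section names the bits only.
VFMAILBOX_RSTI = 1 << 6   # reset in progress (assumed position)
VFMAILBOX_RSTD = 1 << 7   # reset done (assumed position)


def wait_for_reset_done(read_vfmailbox, timeout_s=1.0, poll_s=0.01):
    """Poll VFMailbox until RSTD is set, accessing no other register meanwhile.

    read_vfmailbox is a caller-supplied function returning the register value.
    Returns True once RSTD is observed, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if read_vfmailbox() & VFMAILBOX_RSTD:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Only after this function returns True would a VF driver go on to set up interrupts and queues.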
4.5.11.2.4
IOV disable
After IOV is disabled, the PF cannot immediately reuse the resources released by the VFs. It should first
wait 100 ms to make sure all the pending requests and completions of the defunct VFs are processed.
After that, it should set the IOVCTL.Use VF Queues bit. Only then may the released queues be reused
by the PF.
4.5.11.2.5
VFRE/VFTE
This mechanism ensures that a VF cannot transmit or receive before the Tx and Rx paths have been
initialized by the PF. It is required for VFLR reset and must also be used in case of VF software reset. It
is optional for PF software reset as described above. The VFRE register contains a bit per VF. When the
bit is cleared, assignment of Rx packets to the VF’s pool is disabled. When set, assignment of Rx packets
to the VF’s pool is enabled.
The VFTE register contains a bit per VF. When the bit is cleared, fetching of data for the VF’s pool is
disabled. When set, fetching of data for the VF’s pool is enabled. Fetching of descriptors for the VF pool
is maintained, up to the limit of the internal descriptor queues, regardless of the VFTE settings.
Note:
The VFRE and VFTE registers apply in all device modes (not just IOV). The default values
for both registers are therefore ‘1’, enabling transmission and reception in non-IOV modes.
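The gating described above can be sketched as a pair of helpers operating on a dictionary that stands in for the two registers. The mapping of VF n to bit n is an assumption (the text only says there is a bit per VF), and set_vf_traffic, handle_vf_reset, and the configure_vf hook are hypothetical names.

```python
NUM_VFS = 8  # assumption: one bit per VF, with bit n belonging to VF n


def set_vf_traffic(regs, vf, enable):
    """Set or clear the VF's bit in both VFRE and VFTE (regs is a dict)."""
    if not 0 <= vf < NUM_VFS:
        raise ValueError("VF index out of range")
    bit = 1 << vf
    for name in ("VFRE", "VFTE"):
        regs[name] = (regs[name] | bit) if enable else (regs[name] & ~bit)


def handle_vf_reset(regs, vf, configure_vf):
    """Gate the VF off, let the PF program its parameters, then gate it on."""
    set_vf_traffic(regs, vf, enable=False)
    configure_vf(vf)     # hypothetical hook that programs the pool parameters
    set_vf_traffic(regs, vf, enable=True)
```

The point of the ordering is that the VF's bits stay cleared for the whole time the PF is programming the pool.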
4.5.12
Transmit Rate Limiting Configuration
4.5.12.1
Link Speed Change Procedure
Whenever the link status or speed changes, the 82576 disables the rate-schedulers. Software is
responsible for re-enabling and re-configuring them according to the new link speed. However, to
avoid any race condition between hardware and software, the driver must perform the following
procedure whenever a link speed/status change interrupt occurs:
1. Check that the SPEED_CHG bit in the TRLDCS register was asserted by hardware.
2. Read the TRL_RS_SET bit in the TRLDCS register.
3. If the bit reads as 1, the rate-schedulers were not completely disabled by hardware (that is,
a race occurred between hardware and software). Software must therefore clear the RS_ENA bit in
the TRLRC register for all the queues, or at least for the queue(s) for which it is still set.
4. Clear the SPEED_CHG bit in the TRLDCS register.
5. Set the appropriate LINK_SPEED field in the TRLDC register.
4.5.12.2
Configuration Flow
At the initialization stage, the following registers shall be configured:
— Tx Rate-Limiter MMW (TRLMMW) with typically MMW_SIZE=0x014 if 9500 bytes jumbo is
supported over the TC, 0x004 otherwise
— Tx Rate-Limiter Control Register (TRLCR)
The driver then updates the rate-limiter parameters of the relevant Tx queue, on the fly, as follows:
— Tx Descriptor plane Queue Select (TRLDQSEL.TXQ_IDX) with the index of the rate-limited
queue
— Tx Rate-Limiter Rate Config (TRLRC), RS_ENA=1 and with desired maximum rate
4.5.12.3
Configuration Rules
Setting a rate limiter on Tx queue N to a TargetRate requires the following settings:
• Select the requested queue by programming the queue index - TRLDQSEL.TXQ_IDX
• Program the desired rate as follows:
— Compute the Rate_Factor which equals Link_Speed / Target_Rate. Link_Speed could be either 1
Gb/s or 100 Mb/s. Note that the Rate_Factor is composed of an integer number plus a fraction.
The integer part is a 10 bit number field and the fraction part is a 14 bit binary fraction number.
— Integer (Rate_Factor) is programmed by the TRLRC.RF_INT[9:0] field
— Fraction (Rate_Factor) is programmed by the TRLRC.RF_DEC[13:0] field. It equals RF_DEC[13]
* 2^-1 + RF_DEC[12] * 2^-2 + ... + RF_DEC[0] * 2^-14
• Enable the rate scheduler by setting the TRLRC.RS_ENA bit
Numerical Example
• Target_Rate = 24 Mb/s ; Link_Speed = 1 Gb/s
• Rate_Factor = 1 / 0.024 = 41.6666... = 101001.10101010101011b
• RF_DEC = 10101010101011b ; RF_INT = 0000101001b
• Therefore, set TRLRC to 0x800A6AAB
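The numerical example can be reproduced with a short sketch. The field positions used here are inferred from the example value 0x800A6AAB (RS_ENA in bit 31, RF_INT directly above the 14-bit RF_DEC field) and are not taken from a register table; encode_trlrc and decode_rate_factor are hypothetical helper names.

```python
from fractions import Fraction

RS_ENA = 1 << 31      # position inferred from the example value 0x800A6AAB
RF_INT_SHIFT = 14     # RF_INT[9:0] sits above RF_DEC[13:0] in that example


def encode_trlrc(link_mbps, target_mbps):
    """Return a TRLRC value that limits a queue to target_mbps (hypothetical helper)."""
    factor = Fraction(link_mbps) / Fraction(target_mbps)   # Rate_Factor
    fixed = round(factor * (1 << RF_INT_SHIFT))            # 10.14 fixed point
    if not 0 < fixed < (1 << 24):
        raise ValueError("rate factor does not fit in 10.14 bits")
    return RS_ENA | fixed


def decode_rate_factor(trlrc):
    """Recover Rate_Factor as an exact fraction from a TRLRC value."""
    return Fraction(trlrc & 0xFFFFFF, 1 << RF_INT_SHIFT)
```

Calling encode_trlrc(1000, 24) yields 0x800A6AAB, matching the example. Because the fraction is rounded to 14 bits, the programmed factor is very slightly above 41.6666..., so the effective rate is marginally below 24 Mb/s.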
4.6
Access to shared resources
Some of the resources in the 82576 are shared between several software entities, namely the drivers of
the two ports and the internal firmware. To avoid contention, a driver that needs to access
one of these resources should use the flow described in Section 4.6.1 to acquire ownership of
the resource and the flow described in Section 4.6.2 to relinquish ownership of it.
The shared resources are:
1. The EEPROM
2. Both PHYs
3. CSRs accessed by the internal firmware after the initialization process. Currently there are no such
CSRs.
4. The flash.
Note:
Any other software tool that accesses the 82576 register set directly should also follow
the flow described below.
4.6.1
Acquiring ownership over a shared resource
The following flow should be used to acquire a shared resource:
1. Get ownership of the software/software semaphore SWSM.SMBI (offset 0x5B50 bit 0).
a. Read the SWSM register.
b. If SWSM.SMBI is read as zero, the semaphore was taken.
c. Otherwise, go back to step a.
This step ensures that other software does not access the shared resources register (SW_FW_SYNC).
2. Get ownership of the software/firmware semaphore SWSM.SWESMBI (offset 0x5B50 bit 1):
a. Set the SWSM.SWESMBI bit.
b. Read SWSM.
c. If SWSM.SWESMBI was successfully set, the semaphore was acquired. Otherwise, go back to
step a.
This step ensures that the internal firmware does not access the shared resources register
(SW_FW_SYNC).
3. Software reads the Software-Firmware Synchronization Register (SW_FW_SYNC) and checks both
bits in the pair of bits that control the resource it wishes to own.
a. If both bits are cleared (neither firmware nor other software owns the resource), software
sets the software bit in the pair of bits that control the resource it wishes to own.
b. If one of the bits is set (firmware or other software owns the resource), software tries again later.
4. Release ownership of the software/software semaphore and the software/firmware semaphore by
clearing the SWSM.SMBI and SWSM.SWESMBI bits.
5. At this stage, the shared resource is owned by the driver and it may access it. The SWSM and
SW_FW_SYNC registers can now be used to take ownership of another shared resource.
4.6.2
Releasing ownership over a shared resource
The following flow should be used to release a shared resource:
1. Get ownership of the software/software semaphore SWSM.SMBI (offset 0x5B50 bit 0).
a. Read the SWSM register.
b. If SWSM.SMBI is read as zero, the semaphore was taken.
c. Otherwise, go back to step a.
This step ensures that other software does not access the shared resources register (SW_FW_SYNC).
2. Get ownership of the software/firmware semaphore SWSM.SWESMBI (offset 0x5B50 bit 1):
a. Set the SWSM.SWESMBI bit.
b. Read SWSM.
c. If SWSM.SWESMBI was successfully set, the semaphore was acquired. Otherwise, go back to
step a.
This step ensures that the internal firmware does not access the shared resources register
(SW_FW_SYNC).
3. Clear the bit in SW_FW_SYNC that controls the software ownership of the resource to indicate that this
resource is free.
4. Release ownership of the software/software semaphore and the software/firmware semaphore by
clearing the SWSM.SMBI and SWSM.SWESMBI bits.
5. At this stage, the shared resource is released by the driver and the driver may no longer access it. The SWSM
and SW_FW_SYNC registers can now be used to take ownership of another shared resource.
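The acquire and release flows of Section 4.6.1 and Section 4.6.2 can be sketched against a toy register file. Several details are assumptions made for illustration: the SW_FW_SYNC offset, the layout in which the firmware bit of each pair is the software bit shifted left by 16, and the mock's behavior that a read of SWSM returns SMBI and leaves it set. MockRegisters, acquire_resource, and release_resource are hypothetical names, and the delay between retries is omitted.

```python
SWSM = 0x5B50            # offset given in the flow above
SW_FW_SYNC = 0x5B5C      # assumed offset, not stated in this section
SWSM_SMBI = 1 << 0
SWSM_SWESMBI = 1 << 1
FW_SHIFT = 16            # assumed: firmware bit of a pair = software bit << 16


class MockRegisters:
    """Toy register file; a read of SWSM returns SMBI and leaves it set."""

    def __init__(self):
        self.mem = {SWSM: 0, SW_FW_SYNC: 0}

    def read(self, offset):
        value = self.mem[offset]
        if offset == SWSM:
            self.mem[offset] |= SWSM_SMBI
        return value

    def write(self, offset, value):
        self.mem[offset] = value


def _unlock(regs):
    """Step 4: clear both SMBI and SWESMBI."""
    regs.write(SWSM, regs.read(SWSM) & ~(SWSM_SMBI | SWSM_SWESMBI))


def _lock(regs, retries):
    """Steps 1-2: take SMBI, then SWESMBI. Returns False on timeout."""
    for _ in range(retries):
        if (regs.read(SWSM) & SWSM_SMBI) == 0:
            break
    else:
        return False
    for _ in range(retries):
        regs.write(SWSM, regs.read(SWSM) | SWSM_SWESMBI)
        if regs.read(SWSM) & SWSM_SWESMBI:
            return True
    _unlock(regs)
    return False


def acquire_resource(regs, sw_bit, retries=100):
    """Section 4.6.1: claim the resource whose software bit is sw_bit."""
    pair = sw_bit | (sw_bit << FW_SHIFT)
    for _ in range(retries):
        if not _lock(regs, retries):
            return False
        sync = regs.read(SW_FW_SYNC)
        free = (sync & pair) == 0            # step 3: both bits must be clear
        if free:
            regs.write(SW_FW_SYNC, sync | sw_bit)
        _unlock(regs)
        if free:
            return True                      # step 5: resource is now owned
    return False


def release_resource(regs, sw_bit, retries=100):
    """Section 4.6.2: clear the software ownership bit of the resource."""
    if not _lock(regs, retries):
        return False
    regs.write(SW_FW_SYNC, regs.read(SW_FW_SYNC) & ~sw_bit)
    _unlock(regs)
    return True
```

A driver would wrap each access to a shared resource between acquire_resource and release_resource, and a real implementation would sleep between retries instead of spinning.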
5.0
Power Management
This section describes how power management is implemented in the 82576. The 82576 supports the
Advanced Configuration and Power Interface (ACPI) specification as well as Advanced Power
Management (APM).
Note:
Power management can be disabled via the power management bit in the Initialization
Control Word 1 EEPROM word (see Section 6.2.2).
5.1
General Power State Information
5.1.1
PCI Device Power States
The PCIe specification defines function power states (D-states) that enable the platform to establish
and control power states for the 82576 ranging from fully on to fully off (drawing no power) and various
in-between levels of power-saving states, annotated as D0-D3. Similarly, PCIe defines a series of link
power states (L-states) that work specifically within the link layer between the 82576 and its upstream
PCIe port (typically in the host chipset).
Since the 82576 is a multi-port device, each of its PCI functions may be in a different state at any given
moment. The device power state is defined by the most active function. For example, if function 0 is in
D0 state and all other functions are in D3 state, the device state is D0. The link state follows the device state.
For a given device D-state, only certain L-states are possible, as follows.
• D0 (fully on): The 82576 is completely active and responsive during this D-state. The link can be in
either L0 or a low-latency idle state referred to as L0s. Minimizing L0s exit latency is paramount for
enabling frequent entry into L0s while facilitating performance needs via a fast exit. A deeper link
power state, L1 state, is supported as well.
• D1 and D2: These modes are not supported by the 82576.
• D3 (off): Two sub-states of D3 are supported:
— D3hot, where primary power is maintained.
— D3cold, where primary power is removed.
Link states are mapped into device states as follows:
— D3hot maps to L1 to support clock removal on mobile platforms
— D3cold maps to L2 if auxiliary power is supported on 82576 with wake-capable logic, or to L3 if
no power is delivered to 82576. A sideband PE_WAKE_N mechanism is supported to interface
wake-enabled logic on mobile platforms during the L2 state.
5.1.2
PCIe Link Power States
5.1.3
PCIe Link Power States
Configuring an 82576 into D-states automatically causes the PCIe links to transition to the appropriate
L-states.
• L2/L3 Ready: This link state prepares the PCIe link for the removal of power and clock. The 82576
is in the D3hot state and is preparing to enter D3cold. The power-saving opportunities for this state
include, but are not limited to, clock gating of all PCIe architecture logic, shutdown of the PLL, and
shutdown of all transceiver circuitry.
• L2: This link state is intended to comprehend D3cold with auxiliary power support. Note that
sideband WAKE# signaling is recommended to cause wake-capable devices to exit this state. The
power-saving opportunities for this state include, but are not limited to, shutdown of all transceiver
circuitry except detection circuitry to support exit, clock gating of all PCIe logic, and shutdown of
the PLL as well as appropriate platform voltage and clock generators.
• L3 (link off): Power and clock are removed in this link state, and there is no auxiliary power
available. To bring the 82576 and its link back up, the platform must go through a boot sequence
where power, clock, and reset are reapplied appropriately.
5.2
82576 Power States
The 82576 supports the D0 and D3 architectural power states as described earlier. Internally, the
82576 supports the following power states:
• D0u (D0 un-initialized) - an architectural sub-state of D0
• D0a (D0 active) - an architectural sub-state of D0
• D3 - architecture state D3hot
• Dr - internal state that contains the architecture D3cold state. Dr state is entered when PE_RST_N
is asserted or a PCIe in-band reset is received
Figure 5-1 shows the power states and transitions between them.
Figure 5-1. Power Management State Diagram
5.2.1
D0 Uninitialized State (D0u)
The D0u state is an architectural low-power state.
When entering D0u, the 82576:
• Asserts a reset to the PHY while the EEPROM is being read
• Disables wake up. However, if the APM Mode bit in the EEPROM's Initialization Control Word 2 is set,
then APM wake up is enabled.
5.2.1.1
Entry into D0u state
D0u is reached from either the Dr state (on de-assertion of PE_RST_N) or the D3hot state (by
configuration software writing a value of 00b to the Power State field of the PCI PM registers).
5.2.1.2
Exit from D0u state
De-asserting PE_RST_N means that the entire state of the 82576 is cleared, other than sticky bits.
State is loaded from the EEPROM, followed by establishment of the PCIe link. Once this is done,
configuration software can access the 82576.
On a transition from D3 to D0u state, the 82576 PCI configuration space is not reset. However, the
82576 requires that software perform a full re-initialization of the function including its PCI
configuration space.
5.2.2
D0active State
Once memory space is enabled, the 82576 enters a D0 active state. It can transmit and receive packets
if properly configured by the software device driver. The PHY is enabled or re-enabled by the software
device driver to operate/auto-negotiate to full line speed/power if not already operating at full
capability. Any APM wake up previously active remains active. The software device driver can deactivate
APM wake up by writing to the Wake Up Control (WUC) register or activate other wake-up filters by
writing to the Wake Up Filter Control (WUFC) register.
5.2.2.1
Entry to D0a state
D0a is entered from the D0u state by writing a 1b to the Memory Access Enable or the
I/O Access Enable bit of the PCI Command register. The DMA, MAC, and PHY of the appropriate LAN
function are also enabled.
5.2.3
D3 State (PCI-PM D3hot)
The 82576 transitions to D3 when the system writes a 11b to the Power State field of the Power
Management Control/Status Register (PMCSR). Any wake-up filter settings that were enabled before
entering this state are maintained. Upon completion or during the transition to D3 state, the 82576
clears the Memory Access Enable and I/O Access Enable bits of the PCI Command register, which
disables memory access decode. While in D3, the 82576 does not generate master cycles.
Configuration and message requests are the only TLPs accepted by a function in the D3hot state. All
other received requests must be handled as unsupported requests, and all received completions are
handled as unexpected completions. If an error caused by a received TLP (such as an unsupported
request) is detected while in D3hot, and reporting is enabled, the link must be returned to L0 if it is not
already in L0 and an error message must be sent. See section 5.3.1.4.1 in the PCIe Base Specification.
5.2.3.1
Entry to D3 State
Transition to D3 state is through a configuration write to the Power State field of the PCI-PM registers.
Prior to the transition from D0 to the D3 state, the software device driver disables scheduling of further
tasks to the 82576: it masks all interrupts, does not write to the Transmit Descriptor Tail (TDT)
register or to the Receive Descriptor Tail (RDT) register, and operates the master disable algorithm as
defined in Section 5.2.3.3.
If wake-up capability is needed, the system should enable it by setting the PME_En bit
in the Power Management Control / Status Register (PMCSR) to 1b. After wake capability has been enabled,
the software device driver should set up the appropriate wake-up registers prior to the D3 transition.
Note:
If operation during D3cold is required even when wake capability is not required (for example, for
manageability operation), the system should also set the Auxiliary (AUX) Power PM Enable bit in
the PCIe Device Control register.
As a response to being programmed into D3 state, the 82576 transitions its PCIe link into the L1 link
state. As part of the transition into L1 state, the 82576 suspends scheduling of new TLPs and waits for
the completion of all previous TLPs it has sent. The 82576 clears the Memory Access Enable and I/O
Access Enable bits of the PCI Command register, which disables memory access decode. Any receive
packets that have not been transferred into system memory are kept in the 82576 (and discarded later
on D3 exit). Any transmit packets that have not been sent can still be transmitted (assuming the Ethernet
link is up).
In order to reduce power consumption, if the link is still needed for manageability or wake-up
functionality, the PHY auto-negotiates to a lower link speed on D3 entry (see Section 3.5.7.6.4).
5.2.3.2
Exit from D3 State
A D3 state is followed by either a D0u state (in preparation for a D0a state) or by a transition to Dr
state (PCI-PM D3cold state). To transition back to D0u, the system writes a 00b to the Power State field
of the Power Management Control/Status Register (PMCSR). Transition to Dr state is through PE_RST_N
assertion.
The 82576 always sets the No_Soft_Reset bit in the PCIe Power Management Control / Status Register
(PMCSR) to 0b to indicate that the 82576 performs an internal reset on transition from D3hot to D0.
Configuration context is lost when performing the soft reset. After transition from the D3hot to the D0
state, a full re-initialization sequence is needed to return the 82576 to D0 Initialized.
5.2.3.3
Master Disable Via CTRL Register
System software can disable master accesses on the PCIe link by either clearing the PCI Bus Master bit
or by bringing the function into a D3 state. From that time on, the 82576 must not issue master
accesses for this function. Due to the full-duplex nature of PCIe and the pipelined design of the 82576,
multiple requests from several functions might be pending when the master disable
request arrives. The protocol described in this section ensures that a function does not issue master
requests to the PCIe link after its Master Enable bit is cleared (or after entry to D3 state).
Two configuration bits are provided for the handshake between the 82576 function and its software
device driver:
• GIO Master Disable bit in the Device Control (CTRL) register - When the GIO Master Disable bit is
set, the 82576 blocks new master requests by this function. The 82576 then proceeds to issue any
pending requests by this function. This bit is cleared on master reset (Internal_Power_On_Reset to
software reset) to enable master accesses.
• GIO Master Enable Status bit in the Device Status register - Cleared by the 82576 when the GIO
Master Disable bit is set and no master requests are pending by the relevant function. Set
otherwise. When cleared, it indicates that no master requests are issued by this function as long as
the GIO Master Disable bit is set. The following activities must end before the 82576 clears the
GIO Master Enable Status bit:
— Master requests by the transmit and receive engines
— All pending completions to the 82576 are received.
Note:
The software device driver sets the GIO Master Disable bit when notified of a pending
master disable (or D3 entry). The 82576 then blocks new requests and proceeds to issue
any pending requests by this function. The software device driver then polls the GIO Master
Enable Status bit. Once the bit is cleared, it is guaranteed that no requests are pending
from this function. The software device driver might time out if the GIO Master Enable
Status bit is not cleared within a given time.
The GIO Master Disable bit must be cleared to enable a master request to the PCIe link.
This can be done either through reset or by the software device driver.
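The handshake in the note above can be sketched as a set-then-poll routine with a timeout. The bit positions are assumptions not given in this section, and disable_master together with its read_reg and write_reg callbacks are hypothetical names standing in for whatever register access layer a driver uses.

```python
import time

CTRL_GIO_MASTER_DISABLE = 1 << 2       # assumed bit position in CTRL
STATUS_GIO_MASTER_ENABLE = 1 << 19     # assumed bit position in Device Status


def disable_master(read_reg, write_reg, timeout_s=0.1, poll_s=0.0001):
    """Set GIO Master Disable, then poll GIO Master Enable Status until clear.

    read_reg(name) and write_reg(name, value) are caller-supplied accessors.
    Returns True when no master requests remain pending, False on timeout.
    """
    write_reg("CTRL", read_reg("CTRL") | CTRL_GIO_MASTER_DISABLE)
    deadline = time.monotonic() + timeout_s
    while read_reg("STATUS") & STATUS_GIO_MASTER_ENABLE:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

A False return corresponds to the timeout case mentioned in the note; how the driver recovers from it is outside the scope of this sketch.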
5.2.4
Dr State (D3cold)
Transition to Dr state is initiated on several occasions:
• On system power up - Dr state begins with the assertion of the internal power detection circuit
(PE_RST_N) and ends with de-assertion of PE_RST_N.
• On transition from a D0a state - During operation the system might assert PE_RST_N at any time.
In an ACPI system, a system transition to the G2/S5 state causes a transition from D0a to Dr state.
• On transition from a D3 state - The system transitions the 82576 into the Dr state by asserting
PCIe PE_RST_N.
Any wake-up filter settings that were enabled before entering this reset state are maintained.
The system might maintain PE_RST_N asserted for an arbitrary time. The de-assertion (rising edge) of
PE_RST_N causes a transition to D0u state.
While in Dr state, the 82576 might enter one of several modes with different levels of functionality and
power consumption. The lower-power modes are achieved when the 82576 is not required to maintain
any functionality (see Section 5.2.4.1).
Note:
If the 82576 is configured to provide a 50 MHz NC-SI clock (via the NC-SI Output Clock
EEPROM bit), then the NC-SI clock must be provided in Dr state as well.
5.2.4.1
Dr Disable Mode
The 82576 enters Dr disable mode on transition to D3cold state when it does not need to maintain
any functionality. The conditions to enter this mode are:
• The 82576 (all PCI functions) is in Dr state
• APM WOL is inactive for both LAN functions
• Pass-through manageability is disabled
• ACPI PME is disabled for all PCI functions
• The 82576 Power Down En EEPROM bit (word 0x1E, bit 15) is set (default hardware value is
disabled).
• The PHY Power Down Enable EEPROM bit is set (word 0xF, bit 6).
Entering Dr disable mode is usually done by asserting PCIe PE_RST_N. It might also be possible to
enter Dr disable mode by reading the EEPROM while already in Dr state. The usage model for this latter
case is on system power up, assuming that manageability and wake up are not required. Once the
82576 enters Dr state on power-up, the EEPROM is read. If the EEPROM contents determine that the
conditions to enter Dr disable mode are met, the 82576 then enters this mode (assuming that PCIe
PE_RST_N is still asserted).
Note:
The 82576 exits Dr disable mode when Dr state is exited (See Figure 5-1 for conditions to
exit Dr state).
5.2.4.2
Entry to Dr State
Dr entry on platform power-up begins with the assertion of the internal power detection circuit
(PE_RST_N). The EEPROM is read and determines 82576 configuration. If the APM Enable bit in the
EEPROM's Initialization Control Word 3 is set, then APM wake up is enabled. PHY and MAC states are
determined by the state of manageability and APM wake. To reduce power consumption, if
manageability or APM wake is enabled, the PHY auto-negotiates to a lower link speed on Dr entry (see
Section 3.5.7.6.4). The PCIe link is not enabled in Dr state following system power up (since PE_RST_N
is asserted).
Entering Dr state from D0a state is done by asserting PE_RST_N. An ACPI transition to the G2/S5 state
is reflected in an 82576 transition from D0a to Dr state. The transition can be orderly (such as, user
selected the shut down option), in which case the software device driver might have a chance to
intervene. Or, it might be an emergency transition (such as power button override), in which case, the
software device driver is not notified.
To reduce power consumption, if any of manageability, APM wake, or PCI-PM PME (see footnote 1) is enabled,
the PHY auto-negotiates to a lower link speed on D0a to Dr transition (see Section 3.5.7.6.4).
Transition from D3(hot) state to Dr state is done by asserting PE_RST_N. Prior to that, the system
initiates a transition of the PCIe link from L1 state to either the L2 or L3 state (assuming all functions
were already in D3 state). The link enters L2 state if PCI-PM PME is enabled.
5.2.4.3
Auxiliary Power Usage
The EEPROM D3COLD_WAKEUP_ADVEN bit and the AUX_PWR strapping pin determine when D3cold
PME is supported:
• D3COLD_WAKEUP_ADVEN denotes that PME wake should be supported
• AUX_PWR strapping pin indicates that auxiliary power is provided
D3cold PME is supported as follows:
• If the D3COLD_WAKEUP_ADVEN is set to ‘1’ and the AUX_PWR strapping is set to ‘1’, then D3cold
PME is supported
• Else D3cold PME is not supported
The amount of power required for the function (including the entire NIC) is advertised in the Power
Management Data register, which is loaded from the EEPROM.
If D3cold is supported, the PME_En and PME_Status bits of the Power Management Control/Status
Register (PMCSR), as well as their shadow bits in the Wake Up Control (WUC) register are reset only by
the power up reset (detection of power rising).
5.2.5
Link Disconnect
In any of the D0u, D0a, D3, or Dr power states, the 82576 enters a link-disconnect state if it detects a
link-disconnect condition on the Ethernet link. Note that the link-disconnect state is invisible to software
(other than the Link Energy Detect bit state). In particular, while in D0 state, software can
access any of the 82576 registers as in a link-connect state.
5.2.6
Device Power-Down State
The 82576 enters a global power-down state if all of the following conditions are met:
• The 82576 Power Down Enable EEPROM bit (word 0x1E bit 15) was set (default hardware value is
disabled).
• The 82576 is in Dr state.
• The link connections of both ports (PHY or SerDes) are in power down mode.
Footnote 1: ACPI 2.0 specifies that OSPM will not disable wake events before setting the SLP_EN bit when
entering the S5 sleeping state. This provides support for remote management initiatives by
enabling Remote Power On (RPO) capability. This is a change from ACPI 1.0 behavior.
The 82576 also enters a power-down state when the DEV_OFF_N pin is active and the relevant EEPROM
bits were configured as previously described (see Section 4.4 for more details on DEV_OFF_N
functionality).
5.3
Power Limits by Certain Form Factors
The 82576 exceeds the allocated auxiliary power in some configurations (such as both ports running at
1000 Mb/s speed). The 82576 must therefore be configured to meet requirements. To do so, the 82576
implements three EEPROM bits to disable operation in certain cases:
1. The Disable_1000 PHY register bit disables 1000 Mb/s operation under all conditions.
2. The Disable 1000 in non-D0a PHY CSR bit disables 1000 Mb/s operation in non-D0a states (see footnote 1). If
Disable 1000 in non-D0a is set, and the 82576 is at 1000 Mb/s speed on entry to a non-D0a state,
then the 82576 removes advertisement for 1000 Mb/s and auto-negotiates.
Note that the 82576 restarts link auto-negotiation each time it transitions from a state where 1000 Mb/s
or 100 Mb/s speed is enabled to a state where 1000 Mb/s or 100 Mb/s speed is disabled, or vice
versa. For example, if Disable 1000 in non-D0a is set but Disable_1000 is cleared, the 82576 restarts
link auto-negotiation on transition from D0 state to D3 or Dr states.
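The interaction of the two disable bits can be sketched as a small model. This is an illustration only: the class and function names are hypothetical, only the 1000 Mb/s case is modeled (the 100 Mb/s disable control is not described in this section), and power states are represented by the string labels used in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedConfig:
    """Hypothetical container for the two disable bits named above."""
    disable_1000: bool             # Disable_1000 bit
    disable_1000_in_non_d0a: bool  # Disable 1000 in non-D0a bit


def gigabit_enabled(cfg: SpeedConfig, power_state: str) -> bool:
    """Return True if 1000 Mb/s operation is allowed in power_state.

    power_state is one of "D0a", "D0u", "D3", "Dr".
    """
    if cfg.disable_1000:
        return False  # disabled under all conditions
    if cfg.disable_1000_in_non_d0a and power_state != "D0a":
        return False  # disabled in every non-D0a state
    return True


def restarts_autoneg(cfg: SpeedConfig, old_state: str, new_state: str) -> bool:
    """Auto-negotiation restarts when the 1000 Mb/s enable changes across a transition."""
    return gigabit_enabled(cfg, old_state) != gigabit_enabled(cfg, new_state)
```

With Disable 1000 in non-D0a set and Disable_1000 cleared, `restarts_autoneg(cfg, "D0a", "D3")` is true, which matches the example in the text.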
5.4
Interconnects Power Management
This section describes the power reduction techniques employed by the 82576 main interconnects.
5.4.1
PCIe Link Power Management
The PCIe link state follows the power management state of the 82576. Since the 82576 incorporates
multiple PCI functions, its power management state is defined as the power management state of the
most awake function (see Figure 5-2):
• If any function is in D0 state (either D0a or D0u), the PCIe link assumes the 82576 is in D0 state.
Else,
• If the functions are in D3 state, the PCIe link assumes the 82576 is in D3 state. Else,
• The 82576 is in Dr state (PE_RST_N is asserted to all functions).
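The "most awake function" rule above can be expressed as a short function. This is a minimal sketch under stated assumptions: the function name and the string labels for the per-function states are hypothetical, and the Dr case is assumed to apply to all functions together because PE_RST_N is shared.

```python
def device_power_state(function_states):
    """Return the device-level state ("D0", "D3" or "Dr") seen by the PCIe link.

    function_states is an iterable of per-function labels:
    "D0a", "D0u", "D3" or "Dr".
    """
    states = list(function_states)
    # Any function in D0 (active or uninitialized) keeps the device in D0.
    if any(s in ("D0a", "D0u") for s in states):
        return "D0"
    # Otherwise the device follows the functions into D3.
    if any(s == "D3" for s in states):
        return "D3"
    # Otherwise PE_RST_N is asserted to all functions.
    return "Dr"
```

For a dual-function device, `device_power_state(["D3", "D0u"])` yields `"D0"`, because the function still in D0u is the most awake one.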
The 82576 supports all PCIe power management link states:
• L0 state is used in D0u and D0a states.
• The L0s state is used in D0a and D0u states each time link conditions apply.
• The L1 state is also used in D0a and D0u states when idle conditions apply for a longer period of
time. The L1 state is also used in the D3 state.
• The L2 state is used in the Dr state following a transition from a D3 state if PCI-PM PME is enabled.
• The L3 state is used in the Dr state following power up, on transition from D0a, and if PME is not
enabled in other Dr transitions.
The 82576’s support for active state link power management is reported via the PCIe Active State Link
PM Support register and is loaded from the EEPROM.
1. The restriction is defined for all non-D0a states to have compatible behavior with previous
products.
Figure 5-2. Link Power Management State Diagram
While in L0 state, the 82576 transitions the transmit lane(s) into L0s state once the idle conditions are
met for a period of time as follows:
L0s configuration fields are:
• L0s enable - The default value of the Active State Link PM Control field in the PCIe Link Control
register is set to 00b (both L0s and L1 disabled). System software might later write a different
value into the PCIe Link Control register. The default value is loaded on any reset of the PCI
configuration registers.
• L0s entry latency - The L0S_ENTRY_LAT bit in the PCIe Control Register (GCR) determines the L0s entry
latency. When set to 0b, the L0s entry latency is the same as the L0s exit latency of the device at the
other end of the link. When set to 1b, the L0s entry latency is 1/4 of the L0s exit latency of the device
at the other end of the link. The default value is 0b (entry latency is the same as the L0s exit latency
of the device at the other end of the link).
• L0s exit latency (as published in the L0s Exit Latency field of the Link Capabilities register) is loaded
from EEPROM. Separate values are loaded when the 82576 shares the same reference PCIe clock
with its partner across the link, and when the 82576 uses a different reference clock than its
partner across the link. The 82576 reports whether it uses the slot clock configuration, through the
PCIe Slot Clock Configuration bit loaded from the Slot_Clock_Cfg EEPROM bit.
• L0s acceptable latency (as published in the Endpoint L0s Acceptable Latency field of the Device
Capabilities register) is loaded from EEPROM.
L1 configuration fields are:
• L1 entry latency — The 82576 enters the L1 state after it has been in the L0s state (in both
directions) for a period of time determined by the Latency_To_Enter_L1 CSR register. The initial
value is loaded from the Latency_To_Enter_L1 EEPROM field.
• L1 exit latency (as published in the L1 Exit Latency field of the Link Capabilities register) is loaded
from the L1_Act_Ext_Latency field in the EEPROM.
• L1 acceptable latency (as published in the Endpoint L1 Acceptable Latency field of the Device
Capabilities register) is loaded from EEPROM.
5.4.2
NC-SI Clock Control
The 82576 can be configured to provide a 50 MHz output clock to its NC-SI interface and other platform
devices. When enabled (through the NC-SI Output Clock EEPROM bit), the NC-SI clock is provided in all
power states without exception.
5.4.3
PHY Power-Management
The PHY power management features are described in Section 3.5.7.6.
5.4.4
SerDes/SGMII Power Management
Each 82576 SerDes enters a power-down state when none of its clients is enabled and therefore has no
need to maintain a link. This can happen in one of the following cases. Note that SerDes power-down
must be enabled through the EEPROM SerDes Low Power Enable bit.
1. D3/Dr state: Each SerDes enters a low-power state if the following conditions are met:
a. The LAN function associated with this SerDes is in a non-D0 state
b. APM WOL is inactive
c. Pass-through manageability is disabled
d. ACPI PME is disabled
2. PHY mode: Each SerDes is disabled when its LAN function is configured to PHY mode.
3. LAN disable: Each SerDes can be disabled if its LAN function's LAN Disable input indicates that the
relevant function should be disabled. Since the SerDes is shared between the LAN function and
manageability, it might not be desired to power down the SerDes in LAN Disable. The
PHY_in_LAN_Disable EEPROM bit determines whether the SerDes is powered down when the LAN
Disable pin is asserted. The default is not to power down.
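The three SerDes power-down cases can be summarized as a single decision function. This is a sketch, not a register-accurate model: the field names are hypothetical, and it assumes that the SerDes Low Power Enable bit gates all three cases, which the text implies but does not state for each case individually.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerdesContext:
    """Hypothetical snapshot of the inputs named in the list above."""
    low_power_enable: bool = False      # EEPROM SerDes Low Power Enable bit
    function_in_d0: bool = False        # LAN function is in D0 (D0u or D0a)
    apm_wol_active: bool = False
    pass_through_enabled: bool = False  # pass-through manageability
    acpi_pme_enabled: bool = False
    phy_mode: bool = False              # LAN function configured to PHY mode
    lan_disabled: bool = False          # LAN Disable input asserted
    phy_in_lan_disable: bool = False    # PHY_in_LAN_Disable EEPROM bit


def serdes_powered_down(ctx: SerdesContext) -> bool:
    """Return True if the SerDes would enter its power-down state."""
    if not ctx.low_power_enable:
        return False  # assumption: the enable bit gates every case
    # Case 1: D3/Dr with no wake-up or manageability client.
    d3_dr_case = (
        not ctx.function_in_d0
        and not ctx.apm_wol_active
        and not ctx.pass_through_enabled
        and not ctx.acpi_pme_enabled
    )
    # Case 2: the port uses the internal PHY, so the SerDes is unused.
    phy_mode_case = ctx.phy_mode
    # Case 3: LAN disabled, and the EEPROM allows power-down in that case.
    lan_disable_case = ctx.lan_disabled and ctx.phy_in_lan_disable
    return d3_dr_case or phy_mode_case or lan_disable_case
```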
5.5
Timing of Power-State Transitions
The following sections give detailed timing for the state transitions. In the diagrams the dotted
connecting lines represent the 82576 requirements, while the solid connecting lines represent the
82576 guarantees.
The timing diagrams are not to scale. The clock edges are shown only to indicate running clocks and are
not used to indicate the actual number of cycles for any operation.
5.5.1
Power Up (Off to Dup to D0u to D0a)
Figure 5-3. Power Up (Off to Dup to D0u to D0a)
Table 5-1. Power Up (Off to Dup to D0u to D0a)
| Note | Description |
|------|-------------|
| 1 | Xosc is stable txog after power is stable. |
| 2 | Internal_Power_On_Reset is asserted after all power supplies are good and tppg after Xosc is stable. |
| 3 | An EEPROM read starts on the rising edge of Internal_Power_On_Reset. |
| 4 | After reading the EEPROM, PHY reset is de-asserted. |
| 5 | APM wake-up mode can be enabled based on what is read from the EEPROM. |
| 6 | The PCIe reference clock is valid tPE_RST-CLK before de-asserting PE_RST_N (according to PCIe specification). |
| 7 | PE_RST_N is de-asserted tPVPGL after power is stable (according to PCIe specification). |
| 8 | The internal PCIe clock is valid and stable tppg-clkint from PE_RST_N de-assertion. |
| 9 | The PCIe internal PWRGD signal is asserted tclkpr after the external PE_RST_N signal. |
| 10 | Asserting internal PCIe PWRGD causes the EEPROM to be re-read, asserts PHY reset, and disables wake up. |
| 11 | After reading the EEPROM, PHY reset is de-asserted. |
| 12 | Link training starts after tpgtrn from PE_RST_N de-assertion. |
| 13 | A first PCIe configuration access might arrive after tpgcfg from PE_RST_N de-assertion. |
| 14 | A first PCI configuration response can be sent after tpgres from PE_RST_N de-assertion. |
| 15 | Writing a 1b to the Memory Access Enable bit in the PCI Command Register transitions the 82576 from D0u to D0 state. |
5.5.2
Transition from D0a to D3 and Back Without PE_RST_N
Figure 5-4. Transition from D0a to D3 and Back Without PE_RST_N
Table 5-2. Transition from D0a to D3 and Back Without PE_RST_N
| Note | Description |
|------|-------------|
| 1 | Writing 11b to the Power State field of the Power Management Control/Status Register (PMCSR) transitions the 82576 to D3. |
| 2 | The system can keep the 82576 in D3 state for an arbitrary amount of time. |
| 3 | To exit D3 state, the system writes 00b to the Power State field of the PMCSR. |
| 4 | APM wake-up or SMBus mode might be enabled based on what is read in the EEPROM. |
| 5 | After reading the EEPROM, reset to the PHY is de-asserted. The PHY operates at reduced-speed if APM wake up or SMBus is enabled, else powered-down. |
| 6 | The system can delay an arbitrary time before enabling memory access. |
| 7 | Writing a 1b to the Memory Access Enable bit or to the I/O Access Enable bit in the PCI Command Register transitions the 82576 from D0u to D0 state and returns the PHY to full-power/speed operation. |
5.5.3
Transition From D0a to D3 and Back With PE_RST_N
Figure 5-5. Transition From D0a to D3 and Back With PE_RST_N
Table 5-3. Transition From D0a to D3 and Back With PE_RST_N
| Note | Description |
|------|-------------|
| 1 | Writing 11b to the Power State field of the PMCSR transitions the 82576 to D3. PCIe link transitions to L1 state. |
| 2 | The system can delay an arbitrary amount of time between setting D3 mode and transitioning the link to an L2 or L3 state. |
| 3 | Following link transition, PE_RST_N is asserted. |
| 4 | The system must assert PE_RST_N before stopping the PCIe reference clock. It must also wait tl2clk after link transition to L2/L3 before stopping the reference clock. |
| 5 | On assertion of PE_RST_N, the 82576 transitions to Dr state. |
| 6 | The system starts the PCIe reference clock tPE_RST-CLK before de-asserting PE_RST_N. |
| 7 | The internal PCIe clock is valid and stable tppg-clkint from PE_RST_N de-assertion. |
| 8 | The PCIe internal PWRGD signal is asserted tclkpr after the external PE_RST_N signal. |
| 9 | Asserting internal PCIe PWRGD causes the EEPROM to be re-read, asserts PHY reset, and disables wake up. |
| 10 | APM wake-up mode might be enabled based on what is read from the EEPROM. |
| 11 | After reading the EEPROM, PHY reset is de-asserted. |
| 12 | Link training starts after tpgtrn from PE_RST_N de-assertion. |
| 13 | A first PCIe configuration access might arrive after tpgcfg from PE_RST_N de-assertion. |
| 14 | A first PCI configuration response can be sent after tpgres from PE_RST_N de-assertion. |
| 15 | Writing a 1b to the Memory Access Enable bit in the PCI Command Register transitions the 82576 from D0u to D0 state. |
5.5.4
Transition From D0a to Dr and Back Without Transition to D3
Figure 5-6. Transition From D0a to Dr and Back Without Transition to D3
Table 5-4. Transition From D0a to Dr and Back Without Transition to D3
| Note | Description |
|------|-------------|
| 1 | The system must assert PE_RST_N before stopping the PCIe reference clock. It must also wait tl2clk after link transition to L2/L3 before stopping the reference clock. |
| 2 | On assertion of PE_RST_N, the 82576 transitions to Dr state and the PCIe link transitions to electrical idle. |
| 3 | The system starts the PCIe reference clock tPE_RST-CLK before de-asserting PE_RST_N. |
| 4 | The internal PCIe clock is valid and stable tppg-clkint from PE_RST_N de-assertion. |
| 5 | The PCIe internal PWRGD signal is asserted tclkpr after the external PE_RST_N signal. |
| 6 | Asserting internal PCIe PWRGD causes the EEPROM to be re-read, asserts PHY reset, and disables wake up. |
| 7 | APM wake-up mode might be enabled based on what is read from the EEPROM. |
| 8 | After reading the EEPROM, PHY reset is de-asserted. |
| 9 | Link training starts after tpgtrn from PE_RST_N de-assertion. |
| 10 | A first PCIe configuration access might arrive after tpgcfg from PE_RST_N de-assertion. |
| 11 | A first PCI configuration response can be sent after tpgres from PE_RST_N de-assertion. |
| 12 | Writing a 1b to the Memory Access Enable bit in the PCI Command Register transitions the 82576 from D0u to D0 state. |
5.6
Wake Up
The 82576 supports two modes of wake-up management:
1. Advanced Power Management (APM) wake up
2. ACPI/PCIe defined wake up
The usual model is to activate one mode at a time but not both modes together. If both modes are
activated, the 82576 might wake up the system on unexpected events. For example, if APM is enabled
together with PCIe PME, a magic packet might wake up the system even if APMPME is disabled.
Alternatively, if APM is enabled together with some PCIe filters, packets matching these filters might
wake up the system even if PCIe PME is disabled.
5.6.1
Advanced Power Management Wake Up
Advanced Power Management Wake Up or APM Wakeup (also known as Wake on LAN) is a feature that
existed in earlier 10/100 Mb/s NICs. This functionality was designed to receive a broadcast or unicast
packet with an explicit data pattern, and then assert a subsequent signal to wake up the system. This
was accomplished by using a special signal that ran across a cable to a defined connector on the
motherboard. The NIC would assert the signal for approximately 50 ms to signal a wake up. The 82576
now uses (if configured) an in-band PM_PME message for this functionality.
On power up, the 82576 reads the APM Enable bits from the EEPROM Initialization Control Word 3 into
the APM Enable (APME) bits of the Wakeup Control (WUC) register. These bits control enabling of APM
wake up.
When APM wake up is enabled, the 82576 checks all incoming packets for Magic Packets. See
Section 5.6.3.1.4 for a definition of Magic Packets.
Once the 82576 receives a matching magic packet, and if the Assert PME On APM Wakeup (APMPME)
bit is set in the Wake Up Control (WUC) register, it:
• Sets the PME_Status bit in the PMCSR and issues a PM_PME message (in some cases, this might
require asserting the WAKE# signal first to resume power and clock to the PCIe interface).
• Stores the first 128 bytes of the packet in the Wake Up Packet Memory (WUPM) register.
• Sets the Magic Packet Received bit in the Wake Up Status (WUS) register.
• Sets the packet length in the Wake Up Packet Length (WUPL) register.
The 82576 maintains the first Magic Packet received in the Wake Up Packet Memory (WUPM) register
until the software device driver writes a 1b to the Magic Packet Received (MAG) bit in the Wake Up Status
(WUS) register.
APM wake up is supported in all power states and only disabled if a subsequent EEPROM read results in
the APM Wake Up bit being cleared or software explicitly writes a 0b to the APM Wake Up (APM) bit of
the WUC register.
5.6.2
PCIe Power Management Wake Up
The 82576 supports PCIe power management based wake ups. It can generate system wake-up events
from three sources:
• Reception of a Magic Packet.
• Reception of a network wakeup packet.
• Detection of a link change of state.
Activating PCIe power management wake up requires the following:
• The software device driver programs the Wake Up Filter Control (WUFC) register to indicate the
packets it needs to wake up and supplies the necessary data to the IPv4/v6 Address Table (IP4AT,
IP6AT) and the Flexible Host Filter Table (FHFT). It can also set the Link Status Change Wake Up
Enable (LNKC) bit in the Wake Up Filter Control (WUFC) register to cause wake up when the link
changes state.
• The operating system (at configuration time) writes a 1b to the PME_En bit of the Power
Management Control/Status (PMCSR.8) register.
Normally, after enabling wake up, the operating system writes 11b to the lower two bits of the PMCSR
to put the 82576 into low-power mode.
Once wake up is enabled, the 82576 monitors incoming packets, first filtering them according to its
standard address filtering method, then filtering them with all of the enabled wakeup filters. If a packet
passes both the standard address filtering and at least one of the enabled wakeup filters, the 82576:
• Sets the PME_Status bit in the PMCSR.
• Asserts PE_WAKE_N (if the PME_En bit in the PMCSR is set).
• Stores the first 128 bytes of the packet in the Wake Up Packet Memory (WUPM) register.
• Sets one or more of the received bits in the Wake Up Status (WUS) register. Note that the 82576
sets more than one bit if a packet matches more than one filter.
• Sets the packet length in the Wake Up Packet Length (WUPL) register.
If enabled, a link state change wake up causes similar results, setting PME_Status, asserting
PE_WAKE_N and setting the Link Status Changed (LNKC) bit in the Wake Up Status (WUS) register
when the link goes up or down.
The 82576 supports the following change described in the PCIe Base Specification, Rev. 1.1RD (section
5.3.3.4) - On receiving a PME_Turn_Off message, the 82576 must block the transmission of PM_PME
messages and transmit a PME_TO_Ack message upstream. The 82576 is permitted to send a PM_PME
message after the Link is returned to an L0 state through LDn.
PE_WAKE_N remains asserted until the operating system either writes a 1b to the PME_Status bit of the
PMCSR register or writes a 0b to the PME_En bit.
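The PME handshake described above can be modeled in a few lines. This is a simplified sketch: the class and method names are hypothetical, only the PME_Status, PME_En, and Power State fields are modeled, and the PME_Status position (bit 15, write-1-to-clear) is taken from the PCI Power Management specification rather than from this section.

```python
PME_STATUS = 1 << 15     # write-1-to-clear; bit position per the PCI PM specification
PME_EN = 1 << 8          # PMCSR.8, as stated above
POWER_STATE_MASK = 0b11  # lower two bits; 11b selects D3


class Pmcsr:
    """Toy model of the PMCSR fields involved in wake-up signaling."""

    def __init__(self) -> None:
        self.value = 0

    def hw_set_pme_status(self) -> None:
        """Hardware side: a wake-up event sets PME_Status."""
        self.value |= PME_STATUS

    def write(self, data: int) -> None:
        """Software side: a configuration write to the register."""
        status = self.value & PME_STATUS
        if data & PME_STATUS:
            status = 0  # writing 1b clears PME_Status; writing 0b leaves it
        self.value = status | (data & (PME_EN | POWER_STATE_MASK))

    def pe_wake_n_asserted(self) -> bool:
        """PE_WAKE_N is asserted while PME_Status and PME_En are both set."""
        return bool(self.value & PME_STATUS) and bool(self.value & PME_EN)
```

In this model, the operating system enables wake up and enters D3 with `write(PME_EN | 0b11)`, and later releases PE_WAKE_N either with `write(PME_STATUS | PME_EN)` or by clearing PME_En.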
After receiving a wake-up packet, the 82576 ignores any subsequent wake-up packets until the
software device driver clears all of the received bits in the Wake Up Status (WUS) register. It also
ignores link change events until the software device driver clears the Link Status Changed (LNKC) bit in
the Wake Up Status (WUS) register.
Note:
A wake on link change is not supported when configured to SerDes mode.
5.6.3
Wake-Up Packets
The 82576 supports various wake-up packets using two types of filters:
• Pre-defined filters
• Flexible filters
Each of these filters is enabled if the corresponding bit in the Wake Up Filter Control (WUFC) register
is set to 1b.
5.6.3.1
Pre-Defined Filters
The following packets are supported by the 82576's pre-defined filters:
• Directed packet (including exact, multicast indexed, and broadcast)
• Magic Packet
• ARP/IPv4 request packet
• Directed IPv4 packet
• Directed IPv6 packet
Each of these filters is enabled if the corresponding bit in the Wake Up Filter Control (WUFC) register is
set to 1b.
The explanation of each filter includes a table showing which bytes at which offsets are compared to
determine if the packet passes the filter.
Note:
Both VLAN frames and LLC/SNAP can increase the given offsets if they are present.
5.6.3.1.1
Directed Exact Packet
The 82576 generates a wake-up event after receiving any packet whose destination address matches
one of the 24 valid programmed receive addresses, if the Directed Exact Wake Up Enable bit is set in
the Wake Up Filter Control (WUFC.EX) register.
5.6.3.1.2
Directed Multicast Packet
For multicast packets, the upper bits of the incoming packet's destination address index a bit vector, the
Multicast Table Array (MTA), that indicates whether to accept the packet. If the Directed Multicast Wake
Up Enable bit is set in the Wake Up Filter Control (WUFC.MC) register and the indexed bit in the vector is
one, then the 82576 generates a wake-up event. The exact bits used in the comparison are
programmed by software in the Multicast Offset field of the Receive Control (RCTL.MO) register.
5.6.3.1.3
Broadcast
If the Broadcast Wake Up Enable bit in the Wake Up Filter Control (WUFC.BC) register is set, the 82576
generates a wake-up event when it receives a broadcast packet.
5.6.3.1.4
Magic Packet
Magic packets are defined in:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/20213.pdf as:
“Once the LAN controller has been put into the Magic Packet mode, it scans all incoming frames
addressed to the node for a specific data sequence. This sequence indicates to the controller that this is
a Magic Packet frame. A Magic Packet frame must also meet the basic requirements for the LAN
technology chosen, such as SOURCE ADDRESS, DESTINATION ADDRESS (which may be the receiving
station's IEEE address or a MULTICAST address which includes the BROADCAST address), and CRC. The
specific data sequence consists of 16 repetitions of the IEEE address of this node, with no breaks or
interruptions. This sequence can be located anywhere within the packet, but must be preceded by a
synchronization stream. The synchronization stream allows the scanning state machine to be much
simpler. The synchronization stream is defined as 6 bytes of 0xFF. The device will also accept a
BROADCAST frame, as long as the 16 repetitions of the IEEE address match the address of the machine
to be awakened.”
The 82576 expects the destination address to either:
• Be the broadcast address (FF.FF.FF.FF.FF.FF)
• Match the value in Receive Address 0 (RAH0, RAL0) register. This is initially loaded from the
EEPROM but can be changed by the software device driver.
• Match any other address filtering enabled by the software device driver.
The 82576 searches for the contents of the Receive Address 0 (RAH0, RAL0) register as the embedded IEEE
address. It considers any non-0xFF byte after a series of at least six 0xFF bytes to be the start of the
IEEE address for comparison purposes (for example, it catches the case of seven 0xFF bytes followed by the
IEEE address). As soon as one of the first 96 bytes after a string of 0xFF bytes doesn't match, it
continues to search for another set of at least six 0xFF bytes followed by the 16 copies of the IEEE
address later in the packet. Note that this definition precludes the first byte of the destination address
from being FF.
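The search procedure can be illustrated in software. The following is a behavioral sketch only, not a description of the hardware state machine: the function name is hypothetical, the input is the raw frame bytes, and the destination-address check performed by the main address filter is left out.

```python
def contains_magic_sequence(frame: bytes, node_address: bytes) -> bool:
    """Return True if frame holds >= 6 bytes of 0xFF followed by 16 copies of node_address."""
    if len(node_address) != 6:
        raise ValueError("node_address must be 6 bytes")
    target = node_address * 16  # the 96 bytes that must follow the sync stream
    i = 0
    n = len(frame)
    while i < n:
        # Measure the run of 0xFF bytes starting at position i.
        run_start = i
        while i < n and frame[i] == 0xFF:
            i += 1
        run_length = i - run_start
        # The first non-0xFF byte after a run of at least six 0xFF bytes
        # is treated as the start of the embedded address.
        if run_length >= 6 and frame[i:i + 96] == target:
            return True
        if run_length == 0:
            i += 1  # not inside a run; advance to the next byte
    return False
```

Because the scan resumes after a failed comparison, a frame containing an incomplete sequence followed later by a complete one still matches.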
A Magic Packet's destination address must match the address filtering enabled in the configuration
registers with the exception that broadcast packets are considered to match even if the Broadcast
Accept bit of the Receive Control (RCTL.BAM) register is 0b. If APM wake up (wake up by a Magic
Packet) is enabled in the EEPROM, the 82576 starts up with the Receive Address 0 (RAH0, RAL0)
register loaded from the EEPROM. This enables the 82576 to accept packets with the matching IEEE
address before the software device driver loads.
Table 5-5. Magic Packet Structure
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | MAC header, processed by main address filter. |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Skip | |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Type | | Skip | |
| Any | 6 | Synchronizing Stream | FF*6+ | Compare | |
| any+6 | 96 | 16 copies of Node Address | A*16 | Compare | Compared to Receive Address 0 (RAH0, RAL0) register. |
Note:
Accepting broadcast Magic Packets for wake up purposes when the Broadcast Accept bit of
the Receive Control (RCTL.BAM) register is 0b is a change from a previous device, which
initialized RCTL.BAM to 1 if APM was enabled in the EEPROM, but then required that bit to
be 1b to accept broadcast Magic Packets, unless broadcast packets passed another perfect
or multicast filter.
5.6.3.1.5
ARP/IPv4 Request Packet
The 82576 supports receiving ARP request packets for wake up if the ARP bit is set in the Wake Up
Filter Control (WUFC) register. Four IPv4 addresses are supported, which are programmed in the IPv4
Address Table (IP4AT). A successfully matched packet must contain a broadcast MAC address, a
protocol type of 0x0806, an ARP op-code of 0x01, and one of the four programmed IPv4 addresses. The
82576 also handles ARP request packets that have VLAN tagging on both Ethernet II and Ethernet
SNAP types.
Table 5-6. ARP Packet Structure and Processing
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | MAC header, processed by main address filter. |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Compare | Processed by main address filter. |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Ethernet Type | 0x0806 | Compare | ARP |
| 14 + S + D | 2 | HW Type | 0x0001 | Compare | |
| 16 + S + D | 2 | Protocol Type | 0x0800 | Compare | |
| 18 + S + D | 1 | Hardware Size | 0x06 | Compare | |
| 19 + S + D | 1 | Protocol Address Length | 0x04 | Compare | |
| 20 + S + D | 2 | Operation | 0x0001 | Compare | |
| 22 + S + D | 6 | Sender HW Address | - | Ignore | |
| 28 + S + D | 4 | Sender IP Address | - | Ignore | |
| 32 + S + D | 6 | Target HW Address | - | Ignore | |
| 38 + S + D | 4 | Target IP Address | IP4AT | Compare | Compare if the Directed ARP bit is set to 1b. May match any of four values in IP4AT. |
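The offsets in Table 5-6 can be exercised with a short software check. This is a sketch under stated assumptions: the function names are hypothetical; S is detected from a 0x8100 tag type at offset 12 and D from a length field followed by an AA-AA-03 LLC/SNAP header with a zero OUI, which are standard Ethernet conventions rather than details given in the table; and the target IP address is always compared.

```python
from typing import Iterable, Tuple

BROADCAST = b"\xff" * 6
# Ethernet Type, HW Type, Protocol Type, Hardware Size, Protocol Address Length, Operation
ARP_REQUEST_FIXED = bytes.fromhex("08060001080006040001")


def header_offsets(frame: bytes) -> Tuple[int, int]:
    """Return (S, D): extra bytes added by a VLAN tag and by an LLC/SNAP header."""
    s = 4 if frame[12:14] == b"\x81\x00" else 0
    length_or_type = int.from_bytes(frame[12 + s:14 + s], "big")
    has_snap = (length_or_type <= 1500
                and frame[14 + s:20 + s] == b"\xaa\xaa\x03\x00\x00\x00")
    d = 8 if has_snap else 0
    return s, d


def arp_wake_match(frame: bytes, ip4at: Iterable[bytes]) -> bool:
    """Return True if frame is a broadcast ARP request for an address in ip4at."""
    s, d = header_offsets(frame)
    base = 12 + s + d                       # offset of the Ethernet Type field
    if len(frame) < base + 30:              # frame must reach the end of Target IP
        return False
    if frame[0:6] != BROADCAST:
        return False
    if frame[base:base + 10] != ARP_REQUEST_FIXED:
        return False
    target_ip = frame[base + 26:base + 30]  # offset 38 + S + D
    return any(target_ip == entry for entry in ip4at)
```

A VLAN-tagged request shifts every ARP field by four bytes, which the `base` offset absorbs.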
5.6.3.1.6
Directed IPv4 Packet
The 82576 supports receiving directed IPv4 packets for wake up if the IPV4 bit is set in the Wake Up
Filter Control (WUFC) register. Four IPv4 addresses are supported, which are programmed in the IPv4
Address Table (IP4AT). A successfully matched packet must contain the station's MAC address, a
protocol type of 0x0800, and one of the four programmed IPv4 addresses. The 82576 also handles
directed IPv4 packets that have VLAN tagging on both Ethernet II and Ethernet SNAP types.
Table 5-7. IPv4 Packet Structure and Processing
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | MAC header, processed by main address filter. |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Compare | Processed by main address filter. |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Ethernet Type | 0x0800 | Compare | IPv4 |
| 14 + S + D | 1 | Version/HDR length | 0x4X | Compare | Check IPv4 |
| 15 + S + D | 1 | Type of Service | - | Ignore | |
| 16 + S + D | 2 | Packet Length | - | Ignore | |
| 18 + S + D | 2 | Identification | - | Ignore | |
| 20 + S + D | 2 | Fragment Info | - | Ignore | |
| 22 + S + D | 1 | Time to live | - | Ignore | |
| 23 + S + D | 1 | Protocol | - | Ignore | |
| 24 + S + D | 2 | Header Checksum | - | Ignore | |
| 26 + S + D | 4 | Source IP Address | - | Ignore | |
| 30 + S + D | 4 | Destination IP Address | IP4AT | Compare | May match any of four values in IP4AT. |
5.6.3.1.7
Directed IPv6 Packet
The 82576 supports receiving directed IPv6 packets for wake up if the IPV6 bit is set in the Wake Up
Filter Control (WUFC) register. One IPv6 address is supported and is programmed in the IPv6 Address
Table (IP6AT). A successfully matched packet must contain the station's MAC address, a protocol type
of 0x86DD, and the programmed IPv6 address. In addition, the IPAV.V60 bit should be set. The 82576
also handles directed IPv6 packets that have VLAN tagging on both Ethernet II and Ethernet SNAP
types.
Table 5-8. IPv6 Packet Structure and Processing
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | MAC header, processed by main address filter. |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Compare | Processed by main address filter. |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Ethernet Type | 0x86DD | Compare | IPv6 |
| 14 + S + D | 1 | Version/Priority | 0x6X | Compare | Check IPv6 |
| 15 + S + D | 3 | Flow Label | - | Ignore | |
| 18 + S + D | 2 | Payload Length | - | Ignore | |
| 20 + S + D | 1 | Next Header | - | Ignore | |
| 21 + S + D | 1 | Hop Limit | - | Ignore | |
| 22 + S + D | 16 | Source IP Address | - | Ignore | |
| 38 + S + D | 16 | Destination IP Address | IP6AT | Compare | Match value in IP6AT. |
5.6.3.2
Flexible Filters
The 82576 supports a total of six flexible filters. Each filter can be configured to recognize an arbitrary
pattern within the first 128 bytes of the packet. To configure the flexible filters, software programs the
mask values, the required values, and the minimum packet length into the Flexible Host Filter Table (FHFT
and FHFT_EXT). Each of the six flexible filters has its own set of values. Software must also
enable the filters in the Wake Up Filter Control (WUFC) register and enable the overall wake-up
functionality by setting PME_En in the PMCSR or the Wake Up Control (WUC) register.
Once enabled, the flexible filters scan incoming packets for a match. If the filter encounters any byte in
the packet where the mask bit is one and the byte doesn't match the value programmed in the Flexible
Host Filter Table (FHFT), then the filter fails that packet. If the filter reaches the required length without
failing the packet, it passes the packet and generates a wake-up event. It ignores any mask bits set to
one beyond the required length.
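The matching rule can be sketched as follows. This is an illustration only: the function name is hypothetical, and the mask is represented as a Python integer whose bit i covers packet byte i, which is a convenience for the example and not the FHFT register layout.

```python
def flex_filter_match(packet: bytes, values: bytes, mask: int, length: int) -> bool:
    """Return True if packet passes a flexible filter.

    values : required byte values, indexed by packet offset
    mask   : bit i set means byte i must equal values[i]
    length : required (minimum) packet length, at most 128
    """
    if not 0 <= length <= 128:
        raise ValueError("length must be within the first 128 bytes")
    if len(values) < length:
        raise ValueError("values must cover the required length")
    if len(packet) < length:
        return False  # the filter never reaches the required length
    for i in range(length):
        if (mask >> i) & 1 and packet[i] != values[i]:
            return False  # a masked byte differs, so the filter fails the packet
    # Mask bits beyond the required length are ignored.
    return True
```

For example, a filter with mask bits 12 and 13 set, values 0x81 and 0x37 at those offsets, and a length of 14 passes any frame of at least 14 bytes whose bytes 12 and 13 are 0x8137.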
Note:
The flex filters are temporarily disabled when read from or written to by the host. Any
packet received during a read or write operation is dropped. Filter operation resumes once
the read or write access completes.
The following packets are listed for reference purposes only. The flexible filter could be used to filter
these packets.
5.6.3.2.1
IPX Diagnostic Responder Request Packet
An IPX diagnostic responder request packet must contain a valid MAC address, a protocol type of
0x8137, and an IPX diagnostic socket of 0x0456. It might include LLC/SNAP headers and VLAN tags.
Since filtering this packet relies on the flexible filters, which use offsets specified by the operating
system directly, the operating system must account for the extra offset introduced by LLC/SNAP headers
and VLAN tags.
Table 5-9. IPX Diagnostic Responder Request Packet Structure and Processing
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Skip | |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Ethernet Type | 0x8137 | Compare | IPX |
| 14 + S + D | 16 | Some IPX Stuff | - | Ignore | |
| 30 + S + D | 2 | IPX Diagnostic Socket | 0x0456 | Compare | |
5.6.3.2.2
Directed IPX Packet
A valid directed IPX packet contains the station's MAC address, a protocol type of 0x8137, and an IPX
node address that is equal to the station's MAC address. It might include LLC/SNAP headers and VLAN
tags. Since filtering this packet relies on the flexible filters, which use offsets specified by the operating
system directly, the operating system must account for the extra offset introduced by LLC/SNAP headers
and VLAN tags.
Table 5-10. IPX Packet Structure and Processing
| Offset | # of bytes | Field | Value | Action | Comment |
|--------|------------|-------|-------|--------|---------|
| 0 | 6 | Destination Address | | Compare | MAC header, processed by main address filter. |
| 6 | 6 | Source Address | | Skip | |
| 12 | S=(0/4) | Possible VLAN Tag | | Skip | |
| 12 + S | D=(0/8) | Possible Length + LLC/SNAP Header | | Skip | |
| 12 + S + D | 2 | Ethernet Type | 0x8137 | Compare | IPX |
| 14 + S + D | 10 | Some IPX Info | - | Ignore | |
| 24 + S + D | 6 | IPX Node Address | Receive Address 0 | Compare | Must match receive address 0. |
5.6.3.2.3
IPv6 Neighbor Discovery Filter
In IPv6, a neighbor discovery packet is used for address resolution. A flexible filter can be used to check
for a neighbor discovery packet.
5.6.3.3
Wake Up Packet Storage
The 82576 saves the first 128 bytes of the wake-up packet in its internal buffer, which can be read
through the Wake Up Packet Memory (WUPM) register after the system wakes up.
Non-Volatile Memory Map - EEPROM — Intel® 82576 GbE Controller
6.0
Non-Volatile Memory Map - EEPROM
6.1
EEPROM General Map
Table 6-1 lists the 82576 EEPROM map.
Table 6-1. EEPROM Map
# | Used By | High Byte / Low Byte | LAN
00, 01, 02 | HW | Section 6.2.1, Ethernet Address (Words 0x00:02) | Both
03 | SW | Section 6.10.1, Compatibility (Word 0x03) | Both
04 | SW | Section 6.10.2, OEM specific (Word 0x04) | Both
05 | SW | Section 6.10.4, EEPROM Image Revision (Word 0x05) | Both
06:07 | SW | Section 6.10.3, OEM Specific (Word 0x06, 0x07) | Both
08:09 | SW | Section 6.10.5, PBA Number Module (Word 0x08, 0x09) |
0A | HW | Section 6.2.2, Initialization Control Word 1 (Word 0x0A) | Both
0B | HW | Section 6.2.3, Subsystem ID (Word 0x0B) | Both
0C | HW | Section 6.2.4, Subsystem Vendor ID (Word 0x0C) | Both
0D | HW | Section 6.2.5, Device ID (Word 0x0D, 0x11) | LAN0
0E | HW | Reserved |
0F | HW | Section 6.2.7, Initialization Control Word 2 LAN1 (Word 0x0F) | Both
10 | HW | Section 6.2.8, Software Defined Pins Control LAN1 (Word 0x10) | LAN1
11 | HW | Section 6.2.5, Device ID (Word 0x0D, 0x11) | LAN1
12 | HW | Section 6.2.10, EEPROM Sizing and Protected Fields (Word 0x12) | Both
13 | HW | Reserved |
14 | HW | Section 6.2.12, Initialization Control 3 (Word 0x14, 0x24) | LAN1
15 | HW | Section 6.2.13, PCIe Completion Timeout Configuration (Word 0x15) | Both
16 | HW | Section 6.2.14, MSI-X Configuration (Word 0x16) | Both
17 | FW | Section 6.3.1, Analog Configuration Pointers Start Address (Offset 0x17) | Both
18 | HW | Section 6.2.15, PCIe Init Configuration 1 Word (Word 0x18) | Both
19 | HW | Section 6.2.16, PCIe Init Configuration 2 Word (Word 0x19) | Both
Table 6-1. EEPROM Map (Continued)
# | Used By | High Byte / Low Byte | LAN
1A | HW | Section 6.2.17, PCIe Init Configuration 3 Word (Word 0x1A) | Both
1B | HW | Section 6.2.18, PCIe Control (Word 0x1B) | Both
1C | HW | Section 6.2.19, LED 1,3 Configuration Defaults (Word 0x1C, 0x2A) | LAN 0
1D | HW | Section 6.2.6, Dummy Device ID (Word 0x1D) | Both
1E | HW | Section 6.2.20, Device Rev ID (Word 0x1E) | Both
1F | FW | Section 6.2.21, LED 0,2 Configuration Defaults (Word 0x1F, 0x2B) | LAN 0
20 | HW | Section 6.2.9, Software Defined Pins Control LAN0 (Word 0x20) | LAN 0
21 | HW | Section 6.2.22, Functions Control (Word 0x21) | Both
22 | HW | Section 6.2.23, LAN Power Consumption (Word 0x22) | Both
23 | HW | Section 6.5.9, Management HW Config Control (Word 0x23) | Both
24 | HW | Section 6.2.12, Initialization Control 3 (Word 0x14, 0x24) | LAN 0
25 | HW | Section 6.2.24, I/O Virtualization (IOV) Control (Word 0x25) | Both
26 | HW | Section 6.2.25, IOV Device ID (Word 0x26) | Both
27 | HW | Reserved |
28 | HW | Reserved |
29 | HW | Reserved |
2A | HW | Section 6.2.19, LED 1,3 Configuration Defaults (Word 0x1C, 0x2A) | LAN 1
2B | HW | Section 6.2.21, LED 0,2 Configuration Defaults (Word 0x1F, 0x2B) | LAN 1
2C | HW | Section 6.2.26, End of Read-Only (RO) Area (Word 0x2C) | Both
2D | HW | Section 6.2.27, Start of RO Area (Word 0x2D) | Both
2E | HW | Section 6.2.28, Watchdog Configuration (Word 0x2E) | Both
2F | OEM | Section 6.2.29, VPD Pointer (Word 0x2F) |
30 | PXE | Section 6.10.6.1, Main Setup Options PCI Function 0 (Word 0x30) |
31 | PXE | Section 6.10.6.2, Configuration Customization Options PCI Function 0 (Word 0x31) |
32 | PXE | Section 6.10.6.3, PXE Version (Word 0x32) |
33 | PXE | Section 6.10.6.4, IBA Capabilities (Word 0x33) |
34 | PXE | Section 6.10.6.5, Setup Options PCI Function 1 (Word 0x34) |
35 | PXE | Section 6.10.6.6, Configuration Customization Options PCI Function 1 (Word 0x35) |
36 | HW | Section 6.10.6.7, iSCSI Option ROM Version (Word 0x36) |
38 | PXE | Section 6.10.6.8, Setup Options PCI Function 2 (Word 0x38) |
39 | PXE | Section 6.10.6.9, Configuration Customization Options PCI Function 2 (Word 0x39) |
3A | PXE | Section 6.10.6.10, Setup Options PCI Function 3 (Word 0x3A) |
3B | PXE | Section 6.10.6.11, Configuration Customization Options PCI Function 3 (Word 0x3B) |
Table 6-1. EEPROM Map (Continued)
# | Used By | High Byte / Low Byte | LAN
37 | MNG | Section 6.10.8, Alternate MAC Address Pointer (Word 0x37) |
3D | iSCSI | |
3F | PXE | Section 6.10.9, Checksum Word (Word 0x3F) |
40 | HW | Section 6.2.30, NC-SI Arbitration Enable (Word 0x40) |
41 | HW | Reserved |
42 | SW | Section 6.10.10, Image Unique ID (Word 0x42, 0x43) |
43 | SW | Section 6.10.10, Image Unique ID (Word 0x42, 0x43) |
44:4F | HW | Reserved |
50:5XX | FW | Section 6.5, Firmware Pointers & Control Words |
6.2
Hardware Accessed Words
This section describes the EEPROM words that are loaded by 82576 hardware. Most of these bits are
located in configuration registers. The words are only read and used if the signature field in the
EEPROM Sizing & Protected Fields (word 0x12) is valid.
Note:
There are two values given for many locations. One value is the hardware default (no
EEPROM present). The other is an example of a value loaded from EEPROM (the values used
are from the 82576_dev_start_No_Mgmt_Copper_A1 image). Depending on the image
loaded, the value you see may be different. The EEPROM values provided are illustrations.
Pointers and inactive areas have not been transcribed.
6.2.1
Ethernet Address (Words 0x00:02)
The Ethernet Individual Address (IA) is a 6-byte field that must be unique for each NIC, and thus
unique for each copy of the EEPROM image.
The first three bytes are vendor specific. For example, the IA is equal to [00 AA 00] or [00 A0 C9] for
Intel products. The value from this field is loaded into the Receive Address Register 0 (RAL0/RAH0).
For the purpose of this specification, the IA byte numbering convention is indicated as follows:
Vendor | IA Byte 1 | IA Byte 2 | IA Byte 3 | IA Byte 4 | IA Byte 5 | IA Byte 6
Intel Original | 00 | AA | 00 | variable | variable | variable
Intel New | 00 | A0 | C9 | variable | variable | variable
The Ethernet address is loaded for LAN0 and bit 41 (8th MSB) is inverted for LAN1 (bit 0 byte 6 in the
EEPROM = bit 8 in EEPROM word 0x2).
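As a minimal sketch of the rule above, the following derives the LAN1 address from the LAN0 address by inverting bit 0 of IA byte 6. The function names are hypothetical. The byte order within each EEPROM word (low byte first) is an assumption inferred from the statement that bit 0 of byte 6 corresponds to bit 8 of word 0x2.

```python
def mac_from_eeprom_words(words):
    """Build the 6-byte IA from EEPROM words 0x00..0x02.

    Assumption: the low byte of each word holds the odd-numbered IA byte
    (1, 3, 5) and the high byte holds the even-numbered IA byte (2, 4, 6).
    """
    mac = bytearray()
    for w in words:
        mac.append(w & 0xFF)         # low byte
        mac.append((w >> 8) & 0xFF)  # high byte
    return bytes(mac)


def lan1_mac(lan0_mac):
    """Return the LAN1 address: LAN0 address with bit 0 of IA byte 6 inverted."""
    mac = bytearray(lan0_mac)
    mac[5] ^= 0x01
    return bytes(mac)
```

For example, words 0xA000, 0x12C9, 0x5634 give a LAN0 address of 00:A0:C9:12:34:56 and a LAN1 address of 00:A0:C9:12:34:57 under these assumptions.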
6.2.2
Initialization Control Word 1 (Word 0x0A)
The first word read by the 82576 contains initialization values that:
• Set defaults for some internal registers
• Enable/disable specific features
• Determine which PCI configuration space values are loaded from the EEPROM
Bit | Name | Hardware Default | Loaded from EEPROM (0x046B) [1] | Description
15:13 | Reserved | 0b | 000b | Reserved - must be zero.
12 | Reserved | 0b | 0b | Reserved - must be zero.
11 | FRCSPD | 0b | 0b | Default setting for the Force Speed bit in the Device Control register (CTRL[11]). See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
10 | FD | 0b | 1b | Default setting for duplex setting. Mapped to CTRL[0]. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
9 | Reserved | 1b | 0b | Reserved - should be set to zero.
8:7 | Reserved | 0b | 0b | Reserved - must be zero.
6 | SDP_IDDQ_EN | 0b | 1b | When set, SDPs keep their value and direction when the 82576 enters dynamic IDDQ mode. Otherwise, SDPs move to HighZ + pull-up mode in dynamic IDDQ mode. Reflected in EEDIAG (See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
5 | Deadlock Timeout Enable | 1b | 1b | If set, a device granted access to the EEPROM or Flash that does not toggle the interface for more than 1 second might have the grant revoked. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
4 | ILOS | 0b | 0b | Default setting for the loss-of-signal polarity setting for CTRL[7]. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
3 | Power Management | 1b | 1b | Reserved - must be one. 1b = Full support for power management (for normal operation, this bit must be set to 1b). See Section 9.5.1, PCI Power Management Registers.
2 | Reserved | 0b | 0b | Reserved - must be zero.
1 | Load Subsystem IDs | 1b | 1b | This bit, when set to 1b, indicates that the 82576 is to load its PCIe Subsystem ID and Subsystem Vendor ID from the EEPROM (words 0x0B, 0x0C).
0 | Load Vendor/Device IDs | 1b | 1b | This bit, when set to 1b, indicates that the 82576 is to load its PCIe Device IDs from the EEPROM (words 0x0D, 0x11, 0x1D, 0x26).
1. Example EEPROM values are from the 82576_dev_start_No_Mgmt_Copper_A1 image. As there are numerous images, your
values may differ.
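A short sketch of how the single-bit fields of word 0x0A could be unpacked from a raw 16-bit value. The dictionary keys and function name are hypothetical labels for the fields in the table above; the bit positions come from that table.

```python
# Bit positions of the non-reserved fields of Initialization Control Word 1.
ICW1_BITS = {
    "FRCSPD": 11,
    "FD": 10,
    "SDP_IDDQ_EN": 6,
    "DEADLOCK_TIMEOUT_EN": 5,
    "ILOS": 4,
    "POWER_MANAGEMENT": 3,
    "LOAD_SUBSYSTEM_IDS": 1,
    "LOAD_VENDOR_DEVICE_IDS": 0,
}


def decode_icw1(word):
    """Return {field name: 0 or 1} for a raw word 0x0A value."""
    return {name: (word >> bit) & 1 for name, bit in ICW1_BITS.items()}
```

Applied to the sample image value 0x046B, this reports FD, SDP_IDDQ_EN, Deadlock Timeout Enable, Power Management, and both Load bits as set, which agrees with the "Loaded from EEPROM" column.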
6.2.3
Subsystem ID (Word 0x0B)
If the Load Subsystem IDs bit in word 0x0A is set, this word is read in to initialize the Subsystem ID.
See Section 9.4.14.
• Hardware Default: 0x0000; Loaded from sample EEPROM: 0x0000.
6.2.4
Subsystem Vendor ID (Word 0x0C)
If the Load Subsystem IDs bit in word 0x0A is set, this word is read in to initialize the Subsystem Vendor
ID. The default value is 0x8086. See Section 9.4.13, Subsystem Vendor ID Register (0x2C; RO).
• Hardware Default 0x8086; loaded from sample EEPROM 0x0000.
6.2.5
Device ID (Word 0x0D, 0x11)
If the Load Vendor/Device IDs bit in word 0x0A is set, these words are read in to initialize the Device ID
of the LAN0 and LAN1 functions, respectively. The default value is 0x10C9. See Section 9.4.3, Command
Register (0x4; R/W).
• Hardware Default: 0x10C9; loaded from sample EEPROM: 0x10C9.
6.2.6
Dummy Device ID (Word 0x1D)
If the Load Vendor/Device IDs bit in word 0x0A is set, this word is read in to initialize the Device ID of
dummy devices. The default value is 0x10A6. See Section 9.4.1, Vendor ID Register (0x0; RO).
• Hardware Default: 0x10A6; loaded from sample EEPROM:0x10A6.
6.2.7
Initialization Control Word 2 LAN1 (Word 0x0F)
This is the second word read by the 82576 and contains additional initialization values that:
• Set defaults for some internal registers
• Enable/disable specific features
Bit | Name | Hardware Default | Loaded from EEPROM (0xF14B) | Description
15 | APM PME# Enable | 0b | 1b | Initial value of the Assert PME On APM Wakeup bit in the Wake Up Control (WUC.APMPME) register. See Section 8.20.1, Wakeup Control Register - WUC (0x05800; R/W).
14 | PCS parallel detect | 1b | 1b | Enables PCS parallel detect. Mapped to PCS_LCTL.AN TIMEOUT EN bit. See Section 8.18.2, PCS Link Control - PCS_LCTL (0x04208; RW).
13:12 | Pause Capability | 11b | 11b | Desired pause capability for advertised configuration base page. Mapped to PCS_ANADV.ASM. See Section 8.18.4, AN Advertisement - PCS_ANADV (0x04218; R/W).
11 | ANE | 0b | 0b | Auto-Negotiation Enable. Mapped to PCS_LCTL.AN_ENABLE. See Section 8.18.2, PCS Link Control - PCS_LCTL (0x04208; RW).
10:8 | FLASH Size Indication | 000b | 001b | Indicates Flash size according to the following equation: Size = 64 KB * 2**(Flash Size Indication field). From 64 KB up to 8 MB in powers of 2. The Flash size impacts the requested memory space for the Flash and Expansion ROM BARs in PCIe configuration space.
7 | DMA clock gating enabled | 1b | 0b | Enables automatic reduction of DMA and MAC frequency. Mapped to STATUS[31]. This bit is relevant only if the L1 indication enable bit is set. See Section 8.2.2, Device Status Register - STATUS (0x00008; R).
6 | PHY Power Down Enable | 1b | 1b | When set, enables the PHY to enter a low-power state. This bit is mapped to CTRL_EXT[20]. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
5 | Reserved | 0b | 0b | Reserved - must be zero.
4 | CCM PLL Shutdown Enable | 0b | 0b | When set, enables shutting down the CCM PLL in low-power states when the PHY is powered down (such as link disconnect). When cleared, the CCM PLL is not shut down in a low-power state. Reflected in EEDIAG (See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
3 | L1 indication Enable | 0b | 1b | When set, enables idle indication to L1 mechanism. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
2 | SerDes Low Power Enable | 0b | 0b | When set, enables the SerDes to enter a low power state when the function is in Dr state. See Chapter 5.0 and Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
1 | SPD Enable | 1b | 1b | Smart Power Down. When set, enables PHY Smart Power Down mode. See Section 3.5.7.6.5, Smart Power-Down (SPD). This bit is loaded to each of the PHYs, only when the LAN1_OEM_DIS and LAN0_OEM_DIS bits (word 0x23 bits 8:7) are cleared.
0 | LPLU | 1b | 1b | Low Power Link Up. Enables a decrease in link speed in non-D0a states when power policy and power management states dictate it. See Section 3.5.7.6.4, Low Power Link Up - Link Speed Control. This bit is loaded to each of the PHYs only when LAN0/1 OEM Bits Disable (word 0x23 bit 8:7) are cleared.
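The Flash Size Indication equation (Size = 64 KB * 2**field, with the field in bits 10:8 of word 0x0F) can be worked through as below. The function name is hypothetical; only the equation and the bit positions come from the table.

```python
def flash_size_bytes(icw2):
    """Flash size in bytes implied by bits 10:8 of Initialization Control Word 2."""
    field = (icw2 >> 8) & 0x7   # FLASH Size Indication, 3 bits
    return (64 * 1024) << field # 64 KB * 2**field
```

For the sample image value 0xF14B the field is 001b, giving 128 KB; the largest encoding, 111b, gives 8 MB, matching the stated range.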
6.2.8
Software Defined Pins Control LAN1 (Word 0x10)
This word is used to configure initial settings for the Software Definable Pins (SDPs) for LAN1.
Bit | Name | Hardware Default | Loaded from EEPROM (0xE30C) | Description
15 | SDPDIR[3] | 0b | 1b | SDP3 Pin – Initial Direction. This bit configures the initial hardware value of the SDP3_IODIR bit in the Extended Device Control (CTRL_EXT) register following power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
14 | SDPDIR[2] | 0b | 1b | SDP2 Pin – Initial Direction. This bit configures the initial hardware value of the SDP2_IODIR bit in the Extended Device Control (CTRL_EXT) register following power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
13 | PHY_in_LAN_disable | 0b | 1b | Determines the behavior of the MAC and PHY when a LAN port is disabled through an external pin. 0b = MAC and PHY are kept functional in LAN Disable (to support manageability). 1b = MAC and PHY are powered down in LAN Disable (manageability cannot access the network through this port).
12 | Reserved | 0b | 0b | Reserved - must be zero.
11 | LAN DISABLE SELECT | 0b | 0b | LAN Disable. When set to 1b, the appropriate LAN is disabled.
10 | LAN PCI DISABLE | 0b | 0b | LAN PCI Disable. When set to 1b, the appropriate LAN PCI function is disabled. For example, the LAN is functional for manageability operation but is not connected to the host through the PCIe interface. Reflected in EEDIAG. See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO).
9 | SDPDIR[1] | 0b | 1b | SDP1 Pin – Initial Direction. This bit configures the initial hardware value of the SDP1_IODIR bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
8 | SDPDIR[0] | 0b | 1b | SDP0 Pin – Initial Direction. This bit configures the initial hardware value of the SDP0_IODIR bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
7 | SDPVAL[3] | 0b | 0b | SDP3 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP3 (when configured as an output) by configuring the initial hardware value of the SDP3_DATA bit in the Extended Device Control (CTRL_EXT) register after power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
6 | SDPVAL[2] | 0b | 0b | SDP2 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP2 (when configured as an output) by configuring the initial hardware value of the SDP2_DATA bit in the Extended Device Control (CTRL_EXT) register after power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
5 | WD_SDP0 | 0b | 0b | When set, SDP[0] is used as a watchdog timeout indication. When reset, it is used as an SDP (as defined in bits 8 and 0). See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
4 | Giga Disable | 0b | 0b | When set, GbE operation is disabled. A usage example for this bit is to disable GbE operation if system power limits are exceeded. This bit is loaded to the PHY only when LAN1_OEM_DIS (word 0x23 bit 8) is cleared.
3 | Disable 1000 in non-D0a | 0b | 1b | Disables 1000 Mb/s operation in non-D0a states. This bit is loaded to the PHY only when LAN1_OEM_DIS (word 0x23 bit 8) is cleared. See Section 3.5.7.6.4, Low Power Link Up - Link Speed Control.
2 | D3COLD_WAKEUP_ADVEN | 1b | 1b | Configures the initial hardware default value of the ADVD3WUC bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
1 | SDPVAL[1] | 0b | 0b | SDP1 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP1 (when configured as an output) by configuring the initial hardware value of the SDP1_DATA bit in the Device Control (CTRL) register after power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
0 | SDPVAL[0] | 0b | 0b | SDP0 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP0 (when configured as an output) by configuring the initial hardware value of the SDP0_DATA bit in the Device Control (CTRL) register after power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
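Because the SDPDIR and SDPVAL bits are not contiguous in this word, a small lookup can help when inspecting an image. The sketch below is illustrative only: the names are hypothetical, the bit positions are those listed in the table, and the raw direction bit is returned as-is because this section does not state which value means input or output.

```python
SDPDIR_BITS = (8, 9, 14, 15)  # SDPDIR[0..3]
SDPVAL_BITS = (0, 1, 6, 7)    # SDPVAL[0..3]


def decode_sdp_word(word):
    """Return the initial IODIR and DATA bit for each of SDP0..SDP3."""
    return [
        {
            "pin": n,
            "iodir": (word >> SDPDIR_BITS[n]) & 1,
            "data": (word >> SDPVAL_BITS[n]) & 1,
        }
        for n in range(4)
    ]
```

For the sample value 0xE30C, all four direction bits are 1 and all four data bits are 0, consistent with the "Loaded from EEPROM" column.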
6.2.9
Software Defined Pins Control LAN0 (Word 0x20)
This word is used to configure initial settings for the Software Definable Pins (SDPs) for LAN0.
Bit | Name | Hardware Default | Loaded from EEPROM (0xE30C) | Description
15 | SDPDIR[3] | 0b | 1b | SDP3 Pin – Initial Direction. This bit configures the initial hardware value of the SDP3_IODIR bit in the Extended Device Control (CTRL_EXT) register following power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
14 | SDPDIR[2] | 0b | 1b | SDP2 Pin – Initial Direction. This bit configures the initial hardware value of the SDP2_IODIR bit in the Extended Device Control (CTRL_EXT) register following power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
13 | PHY_in_LAN_disable | 0b | 1b | Determines the behavior of the MAC and PHY when a LAN port is disabled through an external pin. 0b = MAC and PHY are kept functional in LAN disable (to support manageability). 1b = MAC and PHY are powered down in LAN disable (manageability cannot access the network through this port). Reflected in EEDIAG. See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO).
12:10 | Reserved | 0b | 0b | Reserved - must be zero.
9 | SDPDIR[1] | 0b | 0b | SDP1 Pin – Initial Direction. This bit configures the initial hardware value of the SDP1_IODIR bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
8 | SDPDIR[0] | 0b | 0b | SDP0 Pin – Initial Direction. This bit configures the initial hardware value of the SDP0_IODIR bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
7 | SDPVAL[3] | 0b | 1b | SDP3 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP3 (when configured as an output) by configuring the initial hardware value of the SDP3_DATA bit in the Extended Device Control (CTRL_EXT) register after power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
6 | SDPVAL[2] | 0b | 1b | SDP2 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP2 (when configured as an output) by configuring the initial hardware value of the SDP2_DATA bit in the Extended Device Control (CTRL_EXT) register after power up. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
5 | WD_SDP0 | 0b | 0b | When set, SDP[0] is used as a watchdog timeout indication. When reset, it is used as an SDP (as defined in bits 8 and 0). See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
4 | Giga Disable | 0b | 0b | When set, GbE operation is disabled. A usage example for this bit is to disable GbE operation if system power limits are exceeded. This bit is loaded to the PHY only when LAN0_OEM_DIS (word 0x23 bit 7) is cleared.
3 | Disable 1000 in non-D0a | 0b | 0b | Disables 1000 Mb/s operation in non-D0a states. This bit is loaded to the PHY only when LAN0_OEM_DIS (word 0x23 bit 7) is cleared. See Section 3.5.7.6.4, Low Power Link Up - Link Speed Control.
2 | D3COLD_WAKEUP_ADVEN | 1b | 0b | Configures the initial hardware default value of the ADVD3WUC bit in the Device Control (CTRL) register following power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
1 | SDPVAL[1] | 0b | 1b | SDP1 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP1 (when configured as an output) by configuring the initial hardware value of the SDP1_DATA bit in the Device Control (CTRL) register after power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
0 | SDPVAL[0] | 0b | 1b | SDP0 Pin – Initial Output Value. This bit configures the initial power-on value output on SDP0 (when configured as an output) by configuring the initial hardware value of the SDP0_DATA bit in the Device Control (CTRL) register after power up. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
6.2.10
EEPROM Sizing and Protected Fields (Word 0x12)
Provides indication on EEPROM size and protection.
Note:
If the Enable Protection bit in this word is set and the signature is valid, the software device
driver has read but no write access to this word via the EEC and EERD registers. In this
case, write access is possible only via an authenticated firmware interface.
Bit | Name | Hardware Default | Loaded from EEPROM (0x5C00) | Description
15:14 | Signature | 01b | 01b | The Signature field indicates to the 82576 that there is a valid EEPROM present. If the signature field is 01b, EEPROM read is performed, otherwise the other bits in this word are ignored, no further EEPROM read is performed, and default values are used for the configuration space IDs.
13:10 | EEPROM Size | 0111b | 0111b | These bits indicate the EEPROM's actual size. Mapped to EEC[14:11]. 0000b = 128 bytes; 0001b = 256 bytes; 0010b = 512 bytes; 0011b = 1 KB; 0100b = 2 KB; 0101b = 4 KB; 0110b = 8 KB; 0111b = 16 KB; 1000b = 32 KB; 1001b = Reserved; 1011b = Reserved. See Section 8.4.1, EEPROM/Flash Control Register - EEC (0x00010; R/W).
9:5 | Reserved | 00000b | 00000b | Reserved - must be zero.
4 | Enable EEPROM Protection | 0b | 0b | If set, all EEPROM protection schemes are enabled.
3:0 | HEPSize | 0000b | 0000b | Hidden EEPROM Block Size. This field defines the area at the end of the EEPROM memory accessible only by manageability firmware. It can be used to store secured data and other manageability functions. The size in bytes of the secured area equals 0 bytes if HEPSize equals zero; otherwise it equals 2^HEPSize bytes (for example, 2 B, 4 B, ... 32 KB).
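The field encodings of word 0x12 can be sketched as a small decoder. The function and key names are hypothetical; the signature value, the size encoding (128 bytes doubling per step up to 1000b = 32 KB), and the HEPSize rule are taken from the table above. Size codes above 1000b are returned as None because the table marks them reserved or does not list them.

```python
def decode_eeprom_sizing(word):
    """Decode EEPROM Sizing and Protected Fields (word 0x12)."""
    signature = (word >> 14) & 0x3
    size_code = (word >> 10) & 0xF
    hep = word & 0xF
    return {
        "signature_valid": signature == 0b01,
        # 0000b = 128 bytes, doubling for each step up to 1000b = 32 KB
        "eeprom_bytes": (128 << size_code) if size_code <= 0b1000 else None,
        "protection_enabled": bool((word >> 4) & 1),
        # 0 bytes when HEPSize is zero, otherwise 2**HEPSize bytes
        "hidden_block_bytes": 0 if hep == 0 else 1 << hep,
    }
```

For the sample value 0x5C00 this yields a valid signature, a 16 KB EEPROM, protection disabled, and no hidden block.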
6.2.11
Reserved (Word 0x13)
• Hardware Default: 0x0; loaded from EEPROM: 0x0.
6.2.12
Initialization Control 3 (Word 0x14, 0x24)
This word controls general initialization values.
Bit | Name | Hardware Default | Word 0x14 Loaded from EEPROM (0x8C00) | Word 0x24 Loaded from EEPROM (0x8400) | Description
15 | SerDes Energy Source | 0b | 1b | 1b | SerDes Energy Source Detection. When set to 0b, internal SerDes Rx electrical idle indication. When set to 1b, external LOS signal. This bit also indicates the source of the signal detect while establishing a link in SerDes mode. This bit sets the default value of the CONNSW.ENRGSRC bit. See Section 8.2.6, Copper/Fiber Switch Control - CONNSW (0x00034; R/W).
14 | 2 wires SFP Enable | 0b | 0b | 0b | 2 wires interface SFP Enable. 0b = Disabled. When disabled, the 2 wires I/F pads are isolated. 1b = Enabled. Used to set the default value of CTRL_EXT[25]. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
13 | LAN Flash Disable | 1b | 0b | 0b | A value of 1b disables the Flash logic. Flash access BAR in the PCI configuration space is disabled.
12:11 | Interrupt Pin | 00b LAN 0; 01b LAN 1 | 01b | 00b | Controls the value advertised in the Interrupt Pin field of the PCI Configuration header for this device/function. The encoding of this field is as follows (Value = INT Line, Interrupt Pin Field Value): 00b = INTA, 1; 01b = INTB, 2; 10b = INTC, 3; 11b = INTD, 4. If only a single device/function of the 82576 component is enabled, this value is ignored and the Interrupt Pin field of the enabled 82576 reports INTA# usage. See Section 9.4.18, Interrupt Pin Register (0x3D; RO).
10 | APM Enable | 0b | 1b | 1b | Initial value of Advanced Power Management Wake Up Enable bit in the Wake Up Control (WUC.APME) register. Mapped to CTRL[6] and to WUC[0]. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W) and Section 8.20.1, Wakeup Control Register - WUC (0x05800; R/W).
9:8 | Link Mode | 00b | 00b | 00b | Initial value of Link Mode bits of the Extended Device Control (CTRL_EXT.LINK_MODE) register, specifying which link interface and protocol is used by the MAC. 00b = MAC operates with internal copper PHY (1000Base-T). 01b = Reserved. 10b = MAC operates in SGMII mode. 11b = MAC operates in SerDes mode. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
7 | LAN Boot Disable | 1b | 0b | 0b | A value of 1b disables the expansion ROM BAR in the PCI configuration space.
6:2 | Reserved | 0b | 00000b | 0b | Reserved - must be zero.
1 | Ext_VLAN | 0b | 0b | 0b | Sets the default for CTRL_EXT[26] bit. Indicates that additional VLAN is expected in this system. See Section 8.2.3, Extended Device Control Register - CTRL_EXT (0x00018; R/W).
0 | Keep_PHY_Link_Up_En | 0b | 0b | 0b | Enables No PHY Reset when the Baseboard Management Controller (BMC) indicates that the PHY should be kept on. When asserted, this bit prevents the PHY reset signal and the power changes reflected to the PHY according to the MANC.keep_PHY_link_up value. This bit should be set to the same value at both words (0x14 and 0x24) to reflect the same option to both LANs.
The following table lists the different combinations of bits 13 and 7:
Flash Disable (Bit 13) | Boot Disable (Bit 7) | Functionality (Active Windows)
0b | 0b | Flash and expansion ROM BARs are active.
0b | 1b | Flash BAR is enabled and expansion ROM BAR is disabled.
1b | 0b | Flash BAR is disabled and expansion ROM BAR is enabled.
1b | 1b | Flash and expansion ROM BARs are disabled.
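The bit 13 / bit 7 combinations and the Interrupt Pin encoding can be checked against a raw word with a sketch like the one below. The function and key names are hypothetical; the bit positions and encodings are those given in this section.

```python
def decode_init_control3(word):
    """Decode selected fields of Initialization Control 3 (word 0x14 or 0x24)."""
    flash_disable = (word >> 13) & 1   # LAN Flash Disable
    boot_disable = (word >> 7) & 1     # LAN Boot Disable
    pin_code = (word >> 11) & 0x3      # Interrupt Pin, bits 12:11
    return {
        "flash_bar_active": flash_disable == 0,
        "expansion_rom_bar_active": boot_disable == 0,
        "interrupt_line": "INT" + "ABCD"[pin_code],  # 00b=INTA ... 11b=INTD
        "interrupt_pin_field": pin_code + 1,         # value 1..4
        "link_mode": (word >> 8) & 0x3,
    }
```

With the sample values, word 0x14 (0x8C00) decodes to INTB and word 0x24 (0x8400) to INTA, and both leave the Flash and expansion ROM BARs active.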
6.2.13
PCIe Completion Timeout Configuration (Word 0x15)
Bit | Name | Hardware Default | Loaded from EEPROM (0x0014) | Description
15 | Reserved | 0b | 0b | Reserved - must be zero.
14:12 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
11:8 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
7 | Completion Timeout Disable | 0b | 0b | Disables the PCIe completion timeout mechanism. 0b = Completion timeout enabled. 1b = Completion timeout disabled. See Section 8.6.1, PCIe Control - GCR (0x05B00; RW). This bit is relevant only if the GIO cap field in word 0x1A is set to 01b.
6:5 | Completion Timeout Value | 0x0 | 0x0 | Determines the range of the PCIe completion timeout. 00b = 50 µs to 10 ms. 01b = 10 ms to 250 ms. 10b = 250 ms to 4 s. 11b = 4 s to 64 s. See Section 9.5.5.12, Device Control 2 Register (0xC8; RW). This field is relevant only if the GIO cap field in word 0x1A is set to 01b.
4 | Completion Timeout Resend | 1b | 1b | When set, enables resending a request once the completion timeout has expired. 0b = Do not re-send request on completion timeout. 1b = Re-send request on completion timeout. See Section 9.5.5.12, Device Control 2 Register (0xC8; RW).
3:0 | Reserved | 0100b | 0100b | Reserved.
6.2.14
MSI-X Configuration (Word 0x16)
Bit | Name | Hardware Default | Loaded from EEPROM (0x4A40) | Description
15:11 | MSI_X0_N | 0x9 | 0x9 | This field specifies the number of entries in MSI-X tables of LAN 0. The range is 0-24. MSI_X_N is equal to the number of entries minus one. See Section 9.5.3.3, Message Control Register (0x72; R/W).
10:6 | MSI_X1_N | 0x9 | 0x9 | This field specifies the number of entries in MSI-X tables of LAN 1. The range is 0-24. MSI_X_N is equal to the number of entries minus one. See Section 9.5.3.3, Message Control Register (0x72; R/W).
5:0 | Reserved | 0x0 | 0 0000 | Reserved - must be zero.
6.2.15
PCIe Init Configuration 1 Word (Word 0x18)
This word is used to:
• Set defaults for some internal registers.
• Enable/disable specific features.
Bit | Name | Hardware Default | Loaded from EEPROM (0x6CF6) | Description
15 | Reserved | 0b | 0b | Reserved - must be zero.
14:12 | L1_Act_Ext_Latency | 0x6 (32 µs to 64 µs) | 0x6 (32 µs to 64 µs) | L1 active exit latency for the configuration space. See Section 9.5.5.7, Link CAP Register (0xAC; RO).
11:9 | L1_Act_Acc_Latency | 0x6 (32 µs to 64 µs) | 0x6 (32 µs to 64 µs) | L1 active acceptable latency for the configuration space. See Section 9.5.5.4, Device Capability Register (0xA4; RW).
8:6 | L0s_Acc_Latency | 0x3 (512 ns) | 0x3 (512 ns) | L0s acceptable latency for the configuration space. See Section 9.5.5.4, Device Capability Register (0xA4; RW).
5:3 | L0s_Se_Ext_Latency | 0x6 | 0x6 | L0s exit latency for active state power management (separated reference clock) – (latency between 64 ns – 128 ns). See Section 9.5.5.7, Link CAP Register (0xAC; RO).
2:0 | L0s_Co_Ext_Latency | 0x5b | 0x6 | L0s exit latency for active state power management (common reference clock) – (latency between 64 ns – 128 ns). See Section 9.5.5.7, Link CAP Register (0xAC; RO).
6.2.16
PCIe Init Configuration 2 Word (Word 0x19)
This word is used to set defaults for some internal PCIe configuration registers.
Bit | Name | Hardware Default | Loaded from EEPROM (0xD7B0) | Description
15 | Reserved | 1b | 1b | Reserved - must be one.
14 | IO_Sup | 1b | 1b | I/O Support (affects the I/O BAR request). When set to 1b, I/O is supported.
13 | Reserved | 0b | 0b | Reserved - must be zero.
12 | Serial Number enable | 0b | 1b | When set, the PCIe Serial Number capability is exposed in the configuration space. See Section 9.6.2, Serial Number for details.
11:8 | Reserved | 0x7 | 0x7 | Reserved - must be 0111b.
7:0 | Reserved | 0xB0 | 0xB0 | Reserved - must be 0xB0.
6.2.17
PCIe Init Configuration 3 Word (Word 0x1A)
This word is used to set defaults for some internal registers.
Bit | Name | Hardware Default | Loaded from EEPROM (0x0ABE) | Description
15:13 | Reserved | 0b | 000b | Reserved - must be zero.
12 | Cache_Lsize | 0b | 0b | Cache Line Size. 0b = 64 bytes. 1b = 128 bytes.
11:10 | GIO_Cap | 10b | 10b | PCIe Capability Version. The value of this field is reflected in the two LSBs of the capability version in the PCIe CAP register (config space, offset 0xA2). Note: This is the PCIe capability version, not the PCIe version. It is a field in the PCIe capability structure and changes only when the content of the capability structure changes. For example, PCIe 1.0, 1.0a, and 1.1 all have a capability version of one; PCIe 2.0 has a capability version of two because it added registers to the capability structures. See Section 9.5.5.3, PCIe CAP Register (0xA2; RO).
9:8 | Max Payload Size | 10b | 10b | Default packet size. 00b = 128 bytes. 01b = 256 bytes. 10b = 512 bytes. 11b = Reserved. See Section 9.5.5.4, Device Capability Register (0xA4; RW).
7:6 | Lane_Width | 10b | 10b | Max link width. 00b = 1 lane. 01b = 2 lanes. 10b = 4 lanes. 11b = Reserved. See Section 9.5.5.7, Link CAP Register (0xAC; RO).
5:4 | Reserved | 11b | 11b | Reserved.
3:2 | Act_Stat_PM_Sup | 11b | 11b | Determines support for active state link power management. Loaded into the PCIe Active State Link PM Support register. See Section 9.5.5.7, Link CAP Register (0xAC; RO).
1 | Slot_Clock_Cfg | 1b | 1b | When set, the 82576 uses the PCIe reference clock supplied on the connector (for add-in solutions).
0 | Reserved | 0b | 0b | Reserved - must be zero.
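As an illustration of the bit layout of word 0x1A, the following minimal sketch splits a 16-bit value into the fields listed for this word. The function name and the dictionary keys are hypothetical and chosen for this example only; the shifts and masks follow the bit positions given above.

```python
def decode_pcie_init_config3(word: int) -> dict:
    """Split EEPROM word 0x1A into its documented fields (illustrative helper)."""
    payload_bytes = {0b00: 128, 0b01: 256, 0b10: 512}  # 11b is reserved
    lane_counts = {0b00: 1, 0b01: 2, 0b10: 4}          # 11b is reserved
    return {
        "cache_line_bytes": 128 if (word >> 12) & 0x1 else 64,
        "pcie_cap_version_lsbs": (word >> 10) & 0x3,
        # .get() returns None for the reserved encoding
        "max_payload_bytes": payload_bytes.get((word >> 8) & 0x3),
        "max_link_lanes": lane_counts.get((word >> 6) & 0x3),
        "aspm_support": (word >> 2) & 0x3,
        "slot_clock_cfg": bool((word >> 1) & 0x1),
    }


# Decode the EEPROM value shown for this word (0x0ABE).
fields = decode_pcie_init_config3(0x0ABE)
```

With the listed EEPROM value, the sketch reports a 64-byte cache line, a 512-byte maximum payload, and a 4-lane maximum link width.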
010
6.2.18
PCIe Control (Word 0x1B)
Used to configure initial settings for the PCIe default functionality.
Bit | Name | Hardware Default | Loaded from EEPROM (0x8403) | Description
15 | Enable WAKE# Assertion | 0b | 1b | Enable WAKE# assertion when the PCIe link is up.
14 | Dummy Function Enable | 0b | 0b | 0b = When function 0 is disabled, it is replaced by function 1. 1b = When function 0 is disabled, it is replaced with a dummy function.
13 | Reserved | 0b | 0b | Reserved - must be zero.
12 | Lane Reversal Disable | 0b | 0b | When set, disables the ability to negotiate lane reversal.
11 | Reserved | 0b | 0b | Reserved.
10 | Reserved | 1b | 1b | Reserved.
9:2 | Reserved | 0b | 0b | Reserved - must be zero.
1:0 | Latency_To_Enter_L1 | 11b | 11b | Period in L0s state before transition into an L1 state: 00b = 64 µs. 01b = 256 µs. 10b = 1 ms. 11b = 4 ms.
6.2.19
LED 1,3 Configuration Defaults (Word 0x1C, 0x2A)
These EEPROM words specify the hardware defaults for the LEDCTL register fields controlling the LED1
(ACTIVITY indication) and LED3 (LINK_1000 indication) output behaviors. Word 0x1C controls LAN0
LEDs behavior and word 0x2A controls LAN1.
Bit | Name | Hardware Default | Word 0x1C, Loaded from EEPROM (0x0783) | Word 0x2A, Loaded from EEPROM (0x0783) | Description
15 | LED3 Blink | 0b | 0b | 0b | Initial value of LED3_BLINK field. 0b = Non-blinking. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
14 | LED3 Invert | 0b | 0b | 0b | Initial value of LED3_IVRT field. 0b = Active-low output. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
13 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
12 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
11:8 | LED3 Mode | 0x7 | 0x7 | 0x7 | Initial value of the LED3_MODE field specifying what event/state/pattern is displayed on LED3 (LINK_1000) output. A value of 0111b (0x7) indicates 1000 Mb/s operation. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
7 | LED1 Blink | 1b | 1b | 1b | Initial value of LED1_BLINK field. 0b = Non-blinking. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
6 | LED1 Invert | 0b | 0b | 0b | Initial value of LED1_IVRT field. 0b = Active-low output. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
5 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
4 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
3:0 | LED1 Mode | 0x3 | 0x3 | 0x3 | Initial value of the LED1_MODE field specifying what event/state/pattern is displayed on LED1 (ACTIVITY) output. A value of 0011b (0x3) indicates the ACTIVITY state. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
A value of 0x0783 (LED3 mode 0x7, LED1 blink set, LED1 mode 0x3) is used to configure default hardware LED behavior equivalent to previous copper
adapters (LED0=LINK_UP, LED1=blinking ACTIVITY, LED2=LINK_100, and LED3=LINK_1000).
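The LED configuration words pack two LEDs into one word, one per byte, each with the same blink/invert/mode layout. A minimal sketch of decoding such a word follows; the helper names are hypothetical, and the layout (blink in bit 7, invert in bit 6, mode in bits 3:0 of each byte) is taken from the bit positions listed for words 0x1C and 0x2A.

```python
def _decode_led_byte(byte: int) -> dict:
    """Decode one LED byte: bit 7 = blink, bit 6 = invert, bits 3:0 = mode."""
    return {
        "blink": bool(byte & 0x80),
        "invert": bool(byte & 0x40),
        "mode": byte & 0x0F,
    }


def decode_led13_word(word: int) -> dict:
    """Decode EEPROM word 0x1C or 0x2A (LED3 in the high byte, LED1 in the low byte)."""
    return {
        "led3": _decode_led_byte((word >> 8) & 0xFF),
        "led1": _decode_led_byte(word & 0xFF),
    }


# EEPROM value listed for both words: LED3 mode 0x7, LED1 blinking with mode 0x3.
leds = decode_led13_word(0x0783)
```

The same per-byte layout appears to apply to the LED 0,2 words (0x1F and 0x2B), except that bit 5 of the low byte carries the Global Blink Mode there.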
6.2.20
Device Rev ID (Word 0x1E)
Bit | Name | Hardware Default | Loaded from EEPROM (0x0001) | Description
15 | Power Down Enable | 0b | 0b | Device Off (dynamic IDDQ) enable/disable bit. See Section 5.2.4.1, Dr Disable Mode for details. Reflected in EEDIAG (Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
14 | Reserved | 0b | 0b | Reserved - must be zero.
13 | Reserved | 0b | 0b | Reserved - must be zero.
12 | LAN 1 iSCSI enable | 0b | 0b | When set, LAN 1 class code is set to 0x010000 (SCSI). When reset, LAN 1 class code is set to 0x020000 (LAN). See Section 9.4.5, Revision Register (0x8; RO).
11 | LAN 0 iSCSI enable | 0b | 0b | When set, LAN 0 class code is set to 0x010000 (SCSI). When reset, LAN 0 class code is set to 0x020000 (LAN). See Section 9.4.5, Revision Register (0x8; RO).
10:8 | Reserved | 0b | 0b | Reserved - must be zero.
7:0 | DEVREVID | 0x1 | 0x1 | Device Revision ID. For the 82576 A1, the default value is one. See Section 9.4.5, Revision Register (0x8; RO).
6.2.21
LED 0,2 Configuration Defaults (Word 0x1F, 0x2B)
These EEPROM words specify the hardware defaults for the LEDCTL register fields controlling the LED0
(LINK_UP) and LED2 (LINK_100) output behaviors. Word 0x1F controls LAN0 LEDs behavior and word
0x2B controls LAN1.
Bit | Name | Hardware Default | Word 0x1F, Loaded from EEPROM (0x8403) | Word 0x2B, Loaded from EEPROM (0x0602) | Description
15 | LED2 Blink | 0b | 1b | 0b | Initial value of LED2_BLINK field. 0b = Non-blinking. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
14 | LED2 Invert | 0b | 0b | 1b | Initial value of LED2_IVRT field. 0b = Active-low output. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
13 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
12 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
11:8 | LED2 Mode | 0x6 | 0x4 | 0x6 | Initial value of the LED2_MODE field specifying what event/state/pattern is displayed on LED2 (LINK_100) output. A value of 0110b (0x6) indicates 100 Mb/s operation. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
7 | LED0 Blink | 0b | 0b | 0b | Initial value of LED0_BLINK field. 0b = Non-blinking. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
6 | LED0 Invert | 0b | 0b | 0b | Initial value of LED0_IVRT field. 0b = Active-low output. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
5 | Global Blink Mode | 0b | 0b | 0b | Global Blink Mode. 0b = Blink at 200 ms on and 200 ms off. 1b = Blink at 83 ms on and 83 ms off. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
4 | Reserved | 0b | 0b | 0b | Reserved - must be zero.
3:0 | LED0 Mode | 0x2 | 0x3 | 0x2 | Initial value of the LED0_MODE field specifying what event/state/pattern is displayed on LED0 (LINK_UP) output. A value of 0010b (0x2) indicates the LINK_UP state. See Section 8.2.1, Device Control Register - CTRL (0x00000; R/W).
A value of 0x0602 is used to configure default hardware LED behavior equivalent to previous copper
adapters (LED0=LINK_UP, LED1=blinking ACTIVITY, LED2=LINK_100, and LED3=LINK_1000).
6.2.22
Functions Control (Word 0x21)
Bit | Name | Hardware Default | Loaded from EEPROM (0x2020) | Description
15 | NC-SI Clock Pad Drive Strength | 0b | 0b | Defines the drive strength of the NC-SI_CLK_OUT pad. If set, the driving strength is doubled. See Section 11.4.2.4, NC-SI Input and Output Pads for details. Reflected in EEDIAG (See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
14 | NC-SI Data Pad Drive Strength | 0b | 0b | Defines the drive strength of the NC-SI_DV, NC-SI_RXD[1:0] and NC-SI_ARB_OUT pads. If set, the driving strength is doubled. See Section 11.4.2.4, NC-SI Input and Output Pads for details. Reflected in EEDIAG (See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
13 | NC-SI Output Clock Disable | 0b | 1b | If set, the clock source is external. In this case, the NC-SI_CLK_OUT pad is kept stable at zero, and the NC-SI_CLK_IN pad is used as an input source of the clock. If cleared, the 82576 outputs the NC-SI clock through the NC-SI_CLK_OUT pad. The NC-SI_CLK_IN pad is still used as an NC-SI clock input. If NC-SI is not used, then this bit should be set. If this bit is cleared, the Device Power Down Enable bit in word 0x1E (bit 15) should not be set. Reflected in EEDIAG (See Section 8.4.5, EEPROM Diagnostic - EEDIAG (0x01038; RO)).
12 | LAN Function Select | 0b | 0b | When both LAN ports are enabled and LAN Function Sel = 0b, LAN 0 is routed to PCI Function 0 and LAN 1 is routed to PCI Function 1. If LAN Function Sel = 1b, LAN 0 is routed to PCI Function 1 and LAN 1 is routed to PCI Function 0. This bit is mapped to FACTPS[30]. See Section 8.6.4.
11:10 | BAR mapping | 00b | 00b | 00b = 32-bit BARs. 01b = Reserved. 10b = 64-bit BARs, no I/O BAR. 11b = 64-bit BARs, no flash BAR. See Section 9.4.11, Base Address Registers (0x10:0x27; R/W).
9 | Prefetchable | 0b | 0b | 0b = BARs are marked as non prefetchable. 1b = BARs are marked as prefetchable. See Section 9.4.11, Base Address Registers (0x10:0x27; R/W).
8:6 | Reserved | 0b | 0b | Reserved - must be zero.
5 | Reserved | 1b | 1b | Reserved - must be one.
4:0 | Reserved | 0b | 0b | Reserved - must be zero.
6.2.23
LAN Power Consumption (Word 0x22)
Bit | Name | Hardware Default | Loaded from EEPROM (0x1AE5) | Description
15:8 | LAN D0 Power | 0x0 | 0x1A | The value in this field is reflected in the PCI Power Management Data Register of the LAN functions for D0 power consumption and dissipation (Data_Select = 0 or 4). Power is defined in 100 mW units. The power also includes the external logic required for the LAN function. See Section 9.5.1.4, Power Management Control / Status Register - PMCSR (0x44; R/W).
7:5 | Function 0 Common Power | 0x0 | 0x7 | The value in this field is reflected in the PCI Power Management Data register of function 0 when the Data_Select field is set to 8 (common function). The MSBs in the data register that reflects the power values are padded with zeros. See Section 9.5.1.4, Power Management Control / Status Register - PMCSR (0x44; R/W).
4:0 | LAN D3 Power | 0x0 | 0x5 | The value in this field is reflected in the PCI Power Management Data register of the LAN functions for D3 power consumption and dissipation (Data_Select = 3 or 7). Power is defined in 100 mW units. The power also includes the external logic required for the LAN function. The MSBs in the data register that reflects the power values are padded with zeros. See Section 9.5.1.4, Power Management Control / Status Register - PMCSR (0x44; R/W).
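To make the 100 mW scaling concrete, here is a minimal sketch that unpacks word 0x22 and converts the D0 and D3 fields to milliwatts. The function and key names are hypothetical. The Function 0 Common Power field is returned as a raw value because its unit is not stated in this section.

```python
def decode_lan_power(word: int) -> dict:
    """Unpack EEPROM word 0x22 (LAN Power Consumption); illustrative only."""
    d0_units = (word >> 8) & 0xFF   # bits 15:8, 100 mW units
    common = (word >> 5) & 0x7      # bits 7:5, unit not stated here
    d3_units = word & 0x1F          # bits 4:0, 100 mW units
    return {
        "d0_power_mw": d0_units * 100,
        "func0_common_power_raw": common,
        "d3_power_mw": d3_units * 100,
    }


# EEPROM value listed for this word: 0x1A -> 2600 mW in D0, 0x5 -> 500 mW in D3.
power = decode_lan_power(0x1AE5)
```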
6.2.24
I/O Virtualization (IOV) Control (Word 0x25)
This word controls IOV functionality.
Bit | Name | Hardware Default | Loaded from EEPROM (0x00F7) | Description
15:8 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
7:5 | Max VFs | 0x7 | 0x7 | Defines the value of MaxVF exposed in the IOV structure. Valid values are 0-7. The value exposed is the value of this field + one.
4:3 | MSI-X table | 0x2 | 0x2 | Defines the size of the VF function MSI-X table to request. Valid values are 0-2.
2 | 64-bit Advertisement | 1b | 1b | 0b = VF BARs advertise 32-bit size. 1b = VF BARs advertise 64-bit size.
1 | Prefetchable | 0b | 1b | 0b = IOV memory BARs (0 and 3) are declared as non prefetchable. 1b = IOV memory BARs (0 and 3) are declared as prefetchable.
0 | IOV Enabled | 1b | 1b | 0b = IOV and ARI capability structures are not exposed as part of the capabilities link list. 1b = IOV and ARI capability structures are exposed as part of the capabilities link list.
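The Max VFs field is stored minus one, which is easy to overlook. The following minimal sketch decodes word 0x25 and applies that adjustment; the function and key names are hypothetical, while the bit positions are those listed for this word.

```python
def decode_iov_control(word: int) -> dict:
    """Decode EEPROM word 0x25 (IOV Control); illustrative helper."""
    return {
        # The exposed MaxVF is the field value plus one.
        "max_vfs_exposed": ((word >> 5) & 0x7) + 1,
        "msix_table_field": (word >> 3) & 0x3,
        "vf_bars_64bit": bool(word & 0x4),
        "iov_bars_prefetchable": bool(word & 0x2),
        "iov_enabled": bool(word & 0x1),
    }


# EEPROM value listed for this word: field 0x7 exposes MaxVF = 8.
iov = decode_iov_control(0x00F7)
```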
6.2.25
IOV Device ID (Word 0x26)
This word defines the device ID for virtual functions.
Bit | Name | Hardware Default | Loaded from EEPROM (0x10CA) | Description
15:0 | VDev ID | 0x10CA | 0x10CA | Virtual function device ID.
6.2.26
End of Read-Only (RO) Area (Word 0x2C)
Defines the end of the area in the EEPROM that is RO.
Bit | Name | Hardware Default | Loaded from EEPROM (0x0000) | Description
15 | Reserved | 0b | 0b | Reserved - must be zero.
14:0 | EORO_area | 0x0 | 0x0 | Defines the end of the area in the EEPROM that is RO. The resolution is one word and can be up to byte address 0xFFFF (0x7FFF words). A value of zero indicates no RO area.
6.2.27
Start of RO Area (Word 0x2D)
Defines the start of the area in the EEPROM that is RO.
Bit | Name | Hardware Default | Loaded from EEPROM (0x0000) | Description
15 | Reserved | 0b | 0b | Reserved - must be zero.
14:0 | SORO_area | 0x0 | 0x0 | Defines the start of the area in the EEPROM that is RO. The resolution is one word and can be up to byte address 0xFFFF (0x7FFF words).
6.2.28
Watchdog Configuration (Word 0x2E)
Bit | Name | Hardware Default | Loaded from EEPROM (0x0000) | Description
15 | Watchdog Enable | 0b | 0b | Enable watchdog interrupt. See Section 8.16.1, Watchdog Setup - WDSTP (0x01040; R/W).
14:11 | Watchdog Timeout | 0x2 | 0x0 | Watchdog timeout period (in seconds). See Section 8.16.1, Watchdog Setup - WDSTP (0x01040; R/W).
10:0 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
6.2.29
VPD Pointer (Word 0x2F)
This word points to the Vital Product Data (VPD) structure. This structure is available for the NIC vendor
to store its own data. A value of 0xFFFF indicates that the structure is not available.
Bit | Name | Hardware Default | Loaded from EEPROM (0xFFFF) | Description
15:0 | VPD offset | 0xFFFF | 0xFFFF | Offset to VPD structure in words. Bits 15:9 must be set to 0 (the VPD area must be in the first 1 Kbyte of EEPROM).
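The two rules for word 0x2F (0xFFFF means no VPD structure; bits 15:9 must be zero) can be checked as in the following minimal sketch. The function name and the choice to raise an exception on an out-of-range pointer are assumptions made for illustration, not behavior described in this datasheet.

```python
from typing import Optional


def vpd_word_offset(word_0x2f: int) -> Optional[int]:
    """Return the VPD word offset, or None when no VPD structure is present."""
    if word_0x2f == 0xFFFF:
        return None  # structure not available
    if word_0x2f & 0xFE00:
        # Bits 15:9 set: the pointer would fall outside the first 1 Kbyte.
        raise ValueError(f"VPD pointer 0x{word_0x2f:04X} has bits 15:9 set")
    return word_0x2f


# A pointer of 0x0100 words corresponds to byte address 0x0200.
offset = vpd_word_offset(0x0100)
```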
6.2.30
NC-SI Arbitration Enable (Word 0x40)
Bit | Hardware Default | Loaded from EEPROM (0x0001) | Description
15:2 | 0x0 | 0x0 | Reserved - must be 0x0.
1 | 0b | 0b | Reserved - must be 1b.
0 | 1b | 1b | 0 = NCSI_ARB_IN and NCSI_ARB_OUT pads are not used. NCSI_ARB_IN is pulled up internally to provide stable input. 1 = NCSI_ARB_IN and NCSI_ARB_OUT pads are used.
6.3
Analog Blocks Configuration Structures
6.3.1
Analog Configuration Pointers Start Address (Offset 0x17)
Bit(s) | Name | Loaded from EEPROM (0x0060) | Description
15:0 | Address | 0x0060 | Defines the word address in the EEPROM of the pointers to the PHY, PCIe, and SerDes initialization spaces.
Note: Word 0x17 points to the pointers of three configuration blocks: SerDes, PHY, and PCIe.
6.3.2
PCIe Initialization Pointer (Offset 0, Relative to Word 0x17
Value)
Bit
Name
Description
15:0
PCIe Config
Pointer
Defines the location of the PCIe initialization structure. From this location, the PCIe lane all, PCIe
lanes 0/1/2/3, CCM and PLL structures are linked.
6.3.3
PHY Initialization Pointer (Offset 1, Relative to Word 0x17
Value)
Bit
Name
Description
15:0
PHY Config
Pointer
Defines the location of the PHY initialization structure. From this location, the PHY structures are
linked.
6.3.4
SerDes Initialization Pointer (Offset 2, Relative to Word
0x17 Value)
Bit
Name
Description
15:0
SerDes
Config
Pointer
Defines the location of the SerDes initialization structure. From this location, the SerDes
structures are linked.
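The two-level indirection described in Sections 6.3.1 through 6.3.4 (word 0x17 holds the address of a three-word pointer table) can be sketched as follows. The `eeprom` argument is assumed to be a sequence of 16-bit words indexed by word address; that representation and the function name are hypothetical.

```python
def analog_block_pointers(eeprom) -> dict:
    """Resolve the PCIe, PHY, and SerDes initialization pointers.

    `eeprom` is assumed to be a sequence of 16-bit words indexed by word address.
    """
    base = eeprom[0x17]  # word address of the pointer table
    return {
        "pcie": eeprom[base + 0],    # offset 0 relative to the word 0x17 value
        "phy": eeprom[base + 1],     # offset 1
        "serdes": eeprom[base + 2],  # offset 2
    }


# Toy image: pointer table at 0x60, as in the EEPROM value listed for word 0x17.
image = [0x0000] * 0x80
image[0x17] = 0x0060
image[0x60:0x63] = [0x0070, 0x0078, 0x007C]
pointers = analog_block_pointers(image)
```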
6.4
SerDes/PHY/PCIe/PLL/CCM Initialization
Structures
6.4.1
Block Header (Offset 0x0)
Bit | Name | Description
15:12 | Destination Type | Defines the module type that this block configures: 0x0 = 802.3 PHY. 0x1 = 802.3 SerDes. 0x2 = CCM, GBE PLL. 0x3 = PCIe lane all; write to all four PCIe lanes together. 0x4 = PCIe PLL. 0x5 = PCIe lane 0. 0x6 = PCIe lane 1. 0x7 = PCIe lane 2. 0x8 = PCIe lane 3.
11:10 | Next Block | 00b = The next configuration block follows immediately after this one. 01b = This is the last configuration block. 10b = The next configuration block starts at an offset defined by the NBP (second, optional header word). 11b = Reserved.
9:8 | Core Destination | Indicates the port to be accessed. 00b = LAN0. 01b = LAN1. 10b = Both cores; this block should be written for both LAN 0 and LAN 1. This field is relevant only if the destination is an 802.3 PHY or SerDes block.
7:0 | Word Count | Size of this structure.
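As a reading aid for the block header layout, the following minimal sketch splits the header word into its four fields. The function name, dictionary keys, and the "unknown" fallback are hypothetical; the destination-type names come from the list above.

```python
DESTINATION_TYPES = {
    0x0: "802.3 PHY",
    0x1: "802.3 SerDes",
    0x2: "CCM, GBE PLL",
    0x3: "PCIe lane all",
    0x4: "PCIe PLL",
    0x5: "PCIe lane 0",
    0x6: "PCIe lane 1",
    0x7: "PCIe lane 2",
    0x8: "PCIe lane 3",
}


def parse_block_header(word: int) -> dict:
    """Split an initialization-structure block header (offset 0x0) into fields."""
    return {
        "destination": DESTINATION_TYPES.get((word >> 12) & 0xF, "unknown"),
        "next_block": (word >> 10) & 0x3,  # 0 = follows, 1 = last, 2 = via NBP
        "core": (word >> 8) & 0x3,         # 0 = LAN0, 1 = LAN1, 2 = both
        "word_count": word & 0xFF,
    }


# Example header: SerDes block, next block located via NBP, both cores, 5 words.
header = parse_block_header(0x1A05)
```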
6.4.2
CRC8 (Offset 1)
Bit
Name
Description
15:8
Block CRC
CRC8.
7:0
Reserved
Reserved - must be zero.
6.4.3
Next Buffer Pointer (Offset 2 - Optional)
Bit
Name
Description
15:0
NBP
Pointer to the starting word of the next configuration block.
6.4.4
Address/Data (Offset 3:Word Count)
Bit | Name | Description
15:8 | Address | Internal register address to write to. Refer to the following table.
7:0 | Data | Data to write.

ID | Structure Type | Register to Use | Address
0 | PHY | MDIC | 0x20
1 | SerDes | SERDESCTL | 0x24
2 | CCM | CCMCTL | 0x5B48
3 | All lanes | GIOANACTLALL | 0x5B44
4 | PCIe PLL | SCCTL | 0x5B4C
5 | Lane 0 | GIOANACTL0 | 0x5B34
6 | Lane 1 | GIOANACTL1 | 0x5B38
7 | Lane 2 | GIOANACTL2 | 0x5B3C
8 | Lane 3 | GIOANACTL3 | 0x5B40
For the PHY configuration structure, the description for the configuration words is as follows:
Bit
Name
Description
15:0
MDIC Value
Even words: bits 15:0; Odd words: bits 31:16.
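For the PHY structure, the even/odd rule implies that each pair of configuration words forms one 32-bit MDIC value. A minimal sketch of that pairing follows. It assumes that word numbering starts at zero with the first data word, so the first word of each pair supplies bits 15:0; this indexing and the function name are assumptions, not statements from the table.

```python
def mdic_values(config_words) -> list:
    """Combine PHY configuration words pairwise into 32-bit MDIC values.

    Assumes even-indexed words carry bits 15:0 and odd-indexed words
    carry bits 31:16, counting from the first data word.
    """
    if len(config_words) % 2:
        raise ValueError("PHY configuration words must come in pairs")
    return [
        (config_words[i + 1] << 16) | config_words[i]
        for i in range(0, len(config_words), 2)
    ]


# Two word pairs yield two 32-bit MDIC values.
values = mdic_values([0x1140, 0x0400, 0x0000, 0x0404])
```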
6.5
Firmware Pointers & Control Words
Words 0x51:0x52 point to the loader and no manageability patches and to the test structure. Words
0x55:0x57 point to firmware structures specific to pass-through (PT) mode. Words 0x54 and 0x23 control
some aspects of the firmware functionality.
A value of zero for a pointer indicates the relevant structure is not present in the EEPROM.
6.5.1
Loader Patch Pointer (Word 0x51)
Bit
Name
Description
15:0
Pointer
Pointer to loader patch structure. See Section 6.6, Patch Structure for
details of the structure.
6.5.2
No Manageability Patch Pointer (Word 0x52)
Bit
Name
Description
15:0
Pointer
Pointer to no manageability patch structure. See Section 6.6, Patch
Structure for details of the structure.
6.5.3
Manageability Capability/Manageability Enable (Word
0x54)
Bit | Name | Hardware Default | Loaded from EEPROM (0x0000) | Description
15 | Reserved | 0b | 0b | Reserved - must be zero.
14 | Redirection Sideband Interface | 0b | 0b | 0b = SMBus. 1b = NC-SI.
13:11 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
10:8 | Manageability Mode | 0x0 | 0x0 | 0x0 = None. 0x1 = Reserved. 0x2 = Pass Through (PT) mode. 0x3 = Reserved. 0x4 = Host interface enable only. 0x5:0x7 = Reserved.
7 | Port1 Manageability Capable | 0b | 0b | 0 = Not capable. 1 = Bit 3 is applicable to port 1.
6 | Port0 Manageability Capable | 0b | 0b | 0 = Not capable. 1 = Bit 3 is applicable to port 0.
5:4 | Reserved | 0b | 0b | Reserved - must be zero.
3 | Pass Through Capable | 0b | 0b | 0b = Disable. 1b = Enable.
2:0 | Reserved | 0x0 | 0x0 | Reserved - must be zero.
6.5.4
PT Patch Configuration Pointer (Word 0x55)
Bit
Name
Description
15:0
Pointer
Pointer to the PT patch configuration pointer structure. See
Section 6.6, Patch Structure for details of the structure.
6.5.5
PT LAN0 Configuration Pointer (Word 0x56)
Bit
Name
Description
15:0
Pointer
Pointer to the PT LAN0 configuration pointer structure. See
Section 6.7, PT LAN Configuration Structure for details of the
structure.
6.5.6
Sideband Configuration Pointer (Word 0x57)
Bit
Name
Description
15:0
Pointer
Pointer to the Sideband configuration pointer structure. See
Section 6.8, Sideband Configuration Structure for details of the
structure.
6.5.7
Flex TCO Filter Configuration Pointer (Word 0x58)
Bit
Name
Description
15:0
Pointer
Pointer to the flex TCO configuration pointer structure. See
Section 6.9, Flex TCO Filter Configuration Structure for details
of the structure.
6.5.8
PT LAN1 Configuration Pointer (Word 0x59)
Bit
Name
Description
15:0
Pointer
Pointer to the PT LAN1 configuration pointer structure. See
Section 6.7, PT LAN Configuration Structure for details of the
structure.
6.5.9
Management HW Config Control (Word 0x23)
This word contains bits that direct special firmware behavior when configuring the PHY, PCIe, and SerDes
interfaces.
Bit | Name | Hardware Default | Loaded from EEPROM (0x0000) | Description
15 | LAN1_FTCO_DIS | 0b | 0b | LAN1 force TCO reset disable (1b disable; 0b enable).
14 | LAN0_FTCO_DIS | 0b | 0b | LAN0 force TCO reset disable (1b disable; 0b enable).
13:10 | Reserved | 0b | 0b | Reserved - must be zero.
9 | FW Code Exist | 0b | 0b | If set, indicates to the firmware that there is firmware EEPROM code at address 0x50.
8 | LAN1_OEM_DIS | 0b | 0b | LAN1 OEM bits configuration disable.
7 | LAN0_OEM_DIS | 0b | 0b | LAN0 OEM bits configuration disable.
6 | CRC_DIS | 0b | 0b | PHY, SerDes, and PCIe CRC disable.
5 | LAN1_ROM_DIS | 0b | 0b | LAN1 ROM Disable. Disables PHY and SerDes ROM configuration for port 1.
4 | LAN0_ROM_DIS | 0b | 0b | LAN0 ROM Disable. Disables PHY and SerDes ROM configuration for port 0.
3 | MNG_wake_check_dis | 0b | 0b | When set, directs the firmware to always configure the PHY after power-up without checking whether manageability or wake-up is enabled.
2 | PCIe ROM Disable | 0b | 0b | When set, indicates to firmware not to configure the PCIe from the ROM tables.
1 | PHY ROM Disable | 0b | 1b | When set, indicates to firmware not to configure the PHY of both ports from the ROM tables.
0 | SerDes ROM Disable | 0b | 0b | When set, indicates to firmware not to configure the SerDes of both ports from the ROM tables.
6.6
Patch Structure
This structure is used for all the patches in different modes: loader, no manageability, and pass
through.
6.6.1
Patch Data Size (Offset 0x0)
Bit | Name | Description
15:0 | Data Size (Bytes) |
6.6.2
Block CRC8 (Offset 0x1)
Bit
Name
Description
15:8
Reserved
Reserved - must be zero
7:0
CRC8
6.6.3
Patch Entry Point Pointer Low Word (Offset 0x2)
Bit | Name | Description
15:0 | Patch Entry Point Pointer Low Word |
6.6.4
Patch Entry Point Pointer High Word (Offset 0x3)
Bit | Name | Description
15:0 | Patch Entry Point Pointer High Word |
6.6.5
Patch Version 1 Word (Offset 0x4)
Bit | Name | Description
15:8 | Patch Generation Hour |
7:0 | Patch Generation Minutes |
6.6.6
Patch Version 2 Word (Offset 0x5)
Bit | Name | Description
15:8 | Patch Generation Month |
7:0 | Patch Generation Day |
6.6.7
Patch Version 3 Word (Offset 0x6)
Bit | Name | Description
15:8 | Patch Silicon Version Compatibility | 0x00 = A0. 0x01 = A1. 0x10 = B0. 0x11 = B1.
7:0 | Patch Generation Year |
6.6.8
Patch Version 4 Word (Offset 0x7)
Bit | Name | Description
15:8 | Patch Major Number |
7:0 | Patch Minor Number |
6.6.9
Patch Data Words (Offset 0x8, Block Length)
Bit | Name | Description
15:0 | Patch Firmware Data |
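The eight header words of a patch structure (offsets 0x0 through 0x7) can be unpacked as in the following minimal sketch. Several points are assumptions rather than statements from this section: the 32-bit entry point is formed as (high word << 16) | low word, the Version 1 word holds hour and minutes while the Version 2 word holds month and day, and the date and time bytes are returned raw because their encoding (binary or BCD) is not specified. The function and key names are hypothetical.

```python
def parse_patch_header(words) -> dict:
    """Unpack patch structure offsets 0x0 to 0x7 from a sequence of 16-bit words."""
    if len(words) < 8:
        raise ValueError("a patch header needs at least 8 words")
    return {
        "data_size_bytes": words[0],
        "crc8": words[1] & 0xFF,                    # bits 15:8 are reserved
        "entry_point": (words[3] << 16) | words[2],  # assumed high:low order
        "hour_raw": words[4] >> 8,
        "minutes_raw": words[4] & 0xFF,
        "month_raw": words[5] >> 8,
        "day_raw": words[5] & 0xFF,
        "silicon_version": words[6] >> 8,
        "year_raw": words[6] & 0xFF,
        "version": (words[7] >> 8, words[7] & 0xFF),  # (major, minor)
    }


# Made-up header words, for illustration only.
header = parse_patch_header(
    [0x0040, 0x005A, 0x2000, 0x0001, 0x0E1E, 0x0C0F, 0x010A, 0x0203]
)
```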
6.7
PT LAN Configuration Structure
Used to pre-configure manageability filters so that pass-thru traffic can be received without explicit
configuration by the BMC.
6.7.1
Section Header (Offset 0x0)
Bit | Name | Description
15:8 | Block CRC8 |
7:0 | Block Length |
6.7.2
LAN0 IPv4 Address 0 LSB, MIPAF0 (Offset 0x01)
This value will be stored in the IPV4ADDR0 register (0x58E0).
Bit
Name
Description
15:8
LAN0 IPv4 Address 0 (Byte 1)
Manageability IP Address Filter (Byte 1).
7:0
LAN0 IPv4 Address 0 (Byte 0)
Manageability IP Address Filter (Byte 0).
6.7.3
LAN0 IPv4 Address 0 MSB, MIPAF0 (Offset 0x02)
This value will be stored in the IPV4ADDR0 register (0x58E0).
Bit
Name
Description
15:8
LAN0 IPv4 Address 0 (Byte 3)
Manageability IP Address Filter (Byte 3).
7:0
LAN0 IPv4 Address 0 (Byte 2)
Manageability IP Address Filter (Byte 2).
6.7.4
LAN0 IPv4 Address 1; MIPAF1 (Offset 0x03:0x04)
Same structure as LAN0 IPv4 Address 0.
This value will be stored in the IPV4ADDR1 register (0x58E4).
6.7.5
LAN0 IPv4 Address 2; MIPAF2 (Offset 0x05:0x06)
Same structure as LAN0 IPv4 Address 0.
This value will be stored in the IPV4ADDR2 register (0x58E8).
6.7.6
LAN0 IPv4 Address 3; MIPAF3 (Offset 0x07:0x08)
Same structure as LAN0 IPv4 Address 0.
This value will be stored in the IPV4ADDR3 register (0x58EC).
6.7.7
LAN0 MAC Address 0 LSB, MMAL0 (Offset 0x09)
This value will be stored in the MMAL0 register (0x5910).
Bit
Name
Description
15:8
LAN0 MAC Address 0 (Byte 1)
Manageability MAC address low (Byte 1).
7:0
LAN0 MAC Address 0 (Byte 0)
Manageability MAC address low (Byte 0).
6.7.8
LAN0 MAC Address 0 LSB, MMAL0 (Offset 0x0A)
This value will be stored in the MMAL0 register (0x5910).
Bit
Name
Description
15:8
LAN0 MAC Address 0 (Byte 3)
Manageability MAC address low (Byte 3).
7:0
LAN0 MAC Address 0 (Byte 2)
Manageability MAC address low (Byte 2).
6.7.9
LAN0 MAC Address 0 MSB, MMAH0 (Offset 0x0B)
This value will be stored in the MMAH0 register (0x5914).
Bit
Name
Description
15:8
LAN0 MAC Address 0 (Byte 5)
Manageability MAC address high (Byte 5)
7:0
LAN0 MAC Address 0 (Byte 4)
Manageability MAC address high (Byte 4)
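The three MAC address words (offsets 0x09 through 0x0B) each carry two bytes, low byte first. The following minimal sketch rebuilds the six-byte address and the MMAL0/MMAH0 register values from them. It assumes that Byte 0 is the first byte of the address as transmitted and that the word at offset 0x0A supplies the upper 16 bits of MMAL0; the function name and return layout are hypothetical.

```python
def mac_from_words(w0: int, w1: int, w2: int) -> dict:
    """Rebuild a manageability MAC address from EEPROM offsets 0x09, 0x0A, 0x0B."""
    # Each word holds two address bytes, with the lower-numbered byte in bits 7:0.
    raw = b"".join(w.to_bytes(2, "little") for w in (w0, w1, w2))
    return {
        "mac": ":".join(f"{b:02x}" for b in raw),
        "mmal": (w1 << 16) | w0,  # bytes 3..0
        "mmah": w2,               # bytes 5..4
    }


# Made-up words for illustration: bytes 00 1b 21 22 44 55.
entry = mac_from_words(0x1B00, 0x2221, 0x5544)
```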
6.7.10
LAN0 MAC Address 1; MMAL/H1 (Offset 0x0C:0x0E)
Same structure as LAN0 MAC Address 0.
This value will be stored in the MMAL1/MMAH1 registers (0x5918/1C).
6.7.11
LAN0 MAC Address 2; MMAL/H2 (Offset 0x0F:0x11)
Same structure as LAN0 MAC Address 0.
This value will be stored in the MMAL2/MMAH2 registers (0x5920/24).
6.7.12
LAN0 MAC Address 3; MMAL/H3 (Offset 0x12:0x14)
Same structure as LAN0 MAC Address 0.
This value will be stored in the MMAL3/MMAH3 registers (0x5928/2C).
6.7.13
LAN0 UDP Flex Filter Ports 0:15; MFUTP Registers (Offset
0x15:0x24)
This value will be stored in the MFUTP register (0x5030 - bits 15:0).
Bit
Name
Description
15:0
LAN UDP Flex Filter Value
Management Flex UDP/TCP port
6.7.14
LAN0 VLAN Filter 0:7; MAVTV Registers (Offset 0x25:0x2C)
This value will be stored in the MAVTV[7:0] registers (0x5010 - 0x502C).
Bit
Name
Description
15:12
Reserved
Reserved - must be zero
11:0
LAN0 VLAN Filter Value
VLAN ID value
6.7.15
LAN0 Manageability Filters Valid; MFVAL LSB (Offset 0x2D)
This value will be stored in the MFVAL register (0x5824).
Bit
Name
Description
15:8
VLAN
Indicates whether or not the VLAN filter registers (MAVTV) contain valid
VLAN tags. Bit 8 corresponds to filter 0, etc.
7:4
Reserved
Reserved - must be zero.
3:0
MAC
Indicates whether or not the MAC unicast filter registers (MMAH and
MMAL) contain valid MAC addresses. Bit 0 corresponds to filter 0, etc.
6.7.16
LAN0 Manageability Filters Valid; MFVAL MSB (Offset 0x2E)
This value will be stored in the MFVAL register (0x5824).
Bit
Name
Description
15:12
Reserved
Reserved - must be zero.
11:8
IPv6
Indicates whether or not the IPv6 address filter registers (MIPAF) contain
valid IPv6 addresses. Bit 8 corresponds to address 0, etc. Bit 11 (filter 3)
applies only when IPv4 address filters are not enabled
(MANC.EN_IPv4_FILTER=0b).
7:4
Reserved
Reserved - must be zero.
3:0
IPv4
Indicates whether or not the IPv4 address filters (MIPAF) contain a valid
IPv4 address. These bits apply only when IPv4 address filters are enabled
(MANC.EN_IPv4_FILTER=1b)
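The two MFVAL words (offsets 0x2D and 0x2E) together describe which filters hold valid entries. A minimal sketch of combining them and listing the valid filter indices follows. It assumes the LSB word maps to register bits 15:0 and the MSB word to bits 31:16; the function names are hypothetical.

```python
def mfval_from_words(lsb_word: int, msb_word: int) -> int:
    """Combine EEPROM offsets 0x2D (LSB) and 0x2E (MSB) into a 32-bit MFVAL value."""
    return (msb_word << 16) | lsb_word


def _set_bits(value: int, width: int) -> list:
    """Return the indices of the set bits in the low `width` bits of value."""
    return [i for i in range(width) if (value >> i) & 1]


def valid_filters(mfval: int) -> dict:
    """List which manageability filters are marked valid."""
    return {
        "mac": _set_bits(mfval & 0xF, 4),            # LSB word bits 3:0
        "vlan": _set_bits((mfval >> 8) & 0xFF, 8),   # LSB word bits 15:8
        "ipv4": _set_bits((mfval >> 16) & 0xF, 4),   # MSB word bits 3:0
        "ipv6": _set_bits((mfval >> 24) & 0xF, 4),   # MSB word bits 11:8
    }


# Made-up words: MAC filter 0, VLAN filters 0 and 1, IPv4 filter 1, IPv6 address 0.
mfval = mfval_from_words(0x0301, 0x0102)
filters = valid_filters(mfval)
```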
6.7.17
LAN0 MANC Value LSB (Offset 0x2F)
This value will be stored in the MANC register (0x5820).
Bit
Name
Description
15:0
Reserved
Reserved - must be zero.
6.7.18
LAN0 MANC Value MSB (Offset 0x30)
This value will be stored in the MANC register (0x5820).
Bit | Name | Description
15:12 | Reserved | Reserved - must be zero.
11 | MACSec Mode | When set, only packets that match one of the following three conditions are forwarded to manageability: (1) the packet is a MACSec packet authenticated and/or decrypted adequately by the hardware; (2) the packet EtherType matches METF[2]; (3) the packet EtherType matches METF[3].
10 | NET_TYPE | NET TYPE: 0b = pass only un-tagged packets. 1b = pass only VLAN tagged packets. Valid only if FIXED_NET_TYPE is set.
9 | FIXED_NET_TYPE | Fixed net type: If set, only packets matching the net type defined by the NET_TYPE field pass to manageability. Otherwise, both tagged and untagged packets can be forwarded to the manageability engine.
8 | Enable IPv4 Address Filters | When set, the last 128 bits of the MIPAF register are used to store four IPv4 addresses for IPv4 filtering. When cleared, these bits store a single IPv6 filter.
7 | Enable Xsum Filtering to MNG | When this bit is set, only packets that pass the L3 and L4 checksum are sent to the manageability block.
6 | Bypass VLAN | When set, VLAN filtering is bypassed for MNG packets.
5 | Enable MNG Packets to Host Memory | This bit enables the functionality of the MANC2H register. When set, the packets that are specified in the MANC2H registers are also sent to host memory if they pass the manageability filters.
4:0 | Reserved | Reserved - must be zero.
6.7.19
LAN0 Receive Enable 1 (Offset 0x31)
Bit | Name | Description
15:8 | Receive Enable Byte 12 | BMC SMBus slave address.
7 | Enable MC Dedicated MAC |
6 | Reserved | Always set to 1b.
5:4 | Notification Method | 00b = SMBus alert. 01b = Asynchronous notify. 10b = Direct receive. 11b = Reserved.
3 | Enable ARP Response |
2 | Enable Status Reporting |
1 | Enable Receive All |
0 | Enable Receive TCO |
6.7.20
LAN0 Receive Enable 2 (Offset 0x32)
Bit
Name
Description
15:8
Receive Enable Byte 14
Alert value.
7:0
Receive Enable Byte 13
Interface value.
6.7.21
LAN0 MANC2H Value LSB (Offset 0x33)
This value will be stored in the MANC2H register (0x5860).
Bit
Name
Description
15:8
Reserved
Must be 0.
7:0
Host Enable
When set, indicates that packets routed by the manageability filters to
manageability are also sent to the host. Bit 0 corresponds to decision rule
0, etc.
6.7.22
LAN0 MANC2H Value MSB (Offset 0x34)
This value will be stored in the MANC2H register (0x5860).
Bit
Name
Description
15:0
Reserved
Reserved - must be zero.
6.7.23
Manageability Decision Filters; MDEF0,1 (Offset 0x35)
This value will be stored in the MDEF0 register (0x5890).
Bit | Name | Description
15:12 | Flex Port | Controls the inclusion of flex port filtering in the manageability filter decision (OR section). Bit 12 corresponds to flex port 0, etc. (see also bits 11:0 of the next word).
11 | Port 0x26F | Controls the inclusion of port 0x26F filtering in the manageability filter decision (OR section).
10 | Port 0x298 | Controls the inclusion of port 0x298 filtering in the manageability filter decision (OR section).
9 | Neighbor Discovery | Controls the inclusion of neighbor discovery filtering in the manageability filter decision (OR section).
8 | ARP Response | Controls the inclusion of ARP response filtering in the manageability filter decision (OR section).
7 | ARP Request | Controls the inclusion of ARP request filtering in the manageability filter decision (OR section).
6 | Multicast | Controls the inclusion of Multicast addresses filtering in the manageability filter decision (AND section).
5 | Broadcast | Controls the inclusion of broadcast address filtering in the manageability filter decision (OR section).
4 | Unicast | Controls the inclusion of unicast address filtering in the manageability filter decision (OR section).
3 | IP Address | Controls the inclusion of IP address filtering in the manageability filter decision (AND section).
2 | VLAN | Controls the inclusion of VLAN addresses filtering in the manageability filter decision (AND section).
1 | Broadcast | Controls the inclusion of broadcast address filtering in the manageability filter decision (AND section).
0 | Unicast | Controls the inclusion of unicast address filtering in the manageability filter decision (AND section).
6.7.24
Manageability Decision Filters; MDEF0,2 (Offset 0x36)
This value will be stored in the MDEF0 register (0x5890).
Reserved - must be zero
Bit | Name | Description
15:12 | Flex TCO | Controls the inclusion of flex TCO filtering in the manageability filter decision (OR section). Bit 12 corresponds to flex TCO filter 0, etc.
11:0 | Flex Port | Controls the inclusion of flex port filtering in the manageability filter decision (OR section). Bit 11 corresponds to flex port 0, etc. (see also bits 15:12 of the previous word).
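The bit assignments of word 0x35 can be checked with a small decoder. The following is only a sketch: the function and dictionary names are hypothetical, and it covers word 0x35 alone, since how words 0x35 and 0x36 are packed into the 32-bit MDEF0 register is not spelled out here.

```python
# Bit positions copied from the word 0x35 table; the names are illustrative only.
MDEF_WORD_0X35_BITS = {
    0: "Unicast (AND)",
    1: "Broadcast (AND)",
    2: "VLAN (AND)",
    3: "IP Address (AND)",
    4: "Unicast (OR)",
    5: "Broadcast (OR)",
    6: "Multicast (AND)",
    7: "ARP Request (OR)",
    8: "ARP Response (OR)",
    9: "Neighbor Discovery (OR)",
    10: "Port 0x298 (OR)",
    11: "Port 0x26F (OR)",
}


def decode_mdef_low_word(word):
    """Return (enabled filter names, flex port field) for EEPROM word 0x35."""
    enabled = [name for bit, name in MDEF_WORD_0X35_BITS.items() if word & (1 << bit)]
    flex_port = (word >> 12) & 0xF  # bits 15:12, Flex Port
    return enabled, flex_port
```

For example, a word value of 0x0181 would enable the unicast (AND), ARP request (OR), and ARP response (OR) terms of decision rule 0.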
6.7.25
Manageability Decision Filters; MDEF0,3 (Offset 0x37)
This value will be stored in the MDEF_EXT0 register (0x5930).
Bit | Name | Description
15:12 | Reserved | Reserved - must be zero.
11:8 | L2 EtherType OR | L2 EtherType - Controls the inclusion of L2 EtherType filtering in the manageability filter decision (OR section).
7:4 | Reserved | Reserved - must be zero.
3:0 | L2 EtherType AND | L2 EtherType - Controls the inclusion of L2 EtherType filtering in the manageability filter decision (AND section).
6.7.26
Manageability Decision Filters; MDEF0,4 (Offset 0x38)
This value will be stored in the MDEF_EXT0 register (0x5930).
Bit | Name | Description
15:0 | Reserved | Reserved - must be zero.
6.7.27
Manageability Decision Filters; MDEF1:6, 1:4 (Offset 0x39:0x50)
Same as words 0x35 to 0x38 for MDEF1:MDEF6.
These values are stored in the MDEF [6:1] registers (0x5894 - 0x58AC) and MDEF_EXT[6:1] registers
(0x5934 - 0x594C).
6.7.28
Ethertype Data (Word 0x
6.7.29
Ethertype filter; METF0, 1 (Offset 0x51)
This value is stored in the METF0 register (0x5060).
Bit | Name | Description
15:0 | METF | EtherType value to be compared against the L2 EtherType field in the Rx packet.
6.7.30
Ethertype filter; METF0, 2 (Offset 0x52)
This value is stored in the METF0 register (0x5060).
Bit | Name | Description
15 | Reserved | Reserved - must be zero
14 | Polarity | 0 = Positive filter - forward packets matching this filter to the manageability block. 1 = Negative filter - block packets matching this filter from the manageability block.
13:0 | Reserved | Reserved - must be zero
6.7.31
Ethertype filter; METF1:3,1:2 (Offset 0x53:0x58)
Same as words 0x51 and 0x52 for METF1:METF3.
These values are stored in the METF[3:1] registers (0x5064 - 0x506C).
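The word and register layout of the four EtherType filters follows a regular stride, which the sketch below makes explicit. The helper names are hypothetical. It assumes, as the text implies, that each filter uses two consecutive words (EtherType value first, polarity word second) and that the METF registers are 4 bytes apart.

```python
def metf_location(n):
    """Return (value_word_offset, control_word_offset, register_address) for METFn."""
    if not 0 <= n <= 3:
        raise ValueError("this section describes METF0 through METF3 only")
    return 0x51 + 2 * n, 0x52 + 2 * n, 0x5060 + 4 * n


def metf_is_negative(control_word):
    """Bit 14 of the second word: 1 = negative filter, 0 = positive filter."""
    return bool(control_word & (1 << 14))
```

With this layout METF3 lands on words 0x57 and 0x58 and register 0x506C, which matches the ranges quoted above.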
6.7.32
ARP Response IPv4 Address 0 LSB (Offset 0x59)
Bit | Name | Description
15:8 | ARP Response IPv4 Address Byte 1 |
7:0 | ARP Response IPv4 Address Byte 0 |
6.7.33
ARP Response IPv4 Address 0 MSB (Offset 0x5A)
Bit | Name | Description
15:8 | ARP Response IPv4 Address Byte 3 |
7:0 | ARP Response IPv4 Address Byte 2 |
6.7.34
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5B)
This value will be stored in the MIPAF0 register (0x58B0).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 1 |
7:0 | LAN0 IPv6 Address 0 Byte 0 |
6.7.35
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x5C)
This value will be stored in the MIPAF0 register (0x58B0).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 3 |
7:0 | LAN0 IPv6 Address 0 Byte 2 |
6.7.36
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5D)
This value will be stored in the MIPAF1 register (0x58B4).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 5 |
7:0 | LAN0 IPv6 Address 0 Byte 4 |
6.7.37
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x5E)
This value will be stored in the MIPAF1 register (0x58B4).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 7 |
7:0 | LAN0 IPv6 Address 0 Byte 6 |
6.7.38
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x5F)
This value will be stored in the MIPAF2 register (0x58B8).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 9 |
7:0 | LAN0 IPv6 Address 0 Byte 8 |
6.7.39
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x60)
This value will be stored in the MIPAF2 register (0x58B8).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 11 |
7:0 | LAN0 IPv6 Address 0 Byte 10 |
6.7.40
LAN0 IPv6 Address 0 LSB; MIPAF (Offset 0x61)
This value will be stored in the MIPAF3 register (0x58BC).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 13 |
7:0 | LAN0 IPv6 Address 0 Byte 12 |
6.7.41
LAN0 IPv6 Address 0 MSB; MIPAF (Offset 0x62)
This value will be stored in the MIPAF3 register (0x58BC).
Bit | Name | Description
15:8 | LAN0 IPv6 Address 0 Byte 15 |
7:0 | LAN0 IPv6 Address 0 Byte 14 |
6.7.42
LAN0 IPv6 Address 1; MIPAF (Offset 0x63:0x6A)
Same structure as LAN0 IPv6 Address 0.
These values are stored in the MIPAF[7:4] registers (0x58C0 - 0x58CC).
6.7.43
LAN0 IPv6 Address 2; MIPAF (Offset 0x6B:0x72)
Same structure as LAN0 IPv6 Address 0.
These values are stored in the MIPAF[11:8] registers (0x58D0 - 0x58DC).
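Each IPv6 address occupies eight consecutive words, with the even-numbered byte in bits 7:0 and the odd-numbered byte in bits 15:8 of each word. As a rough illustration, the 16 address bytes can be reassembled as follows. The function names are hypothetical, and whether Byte 0 is the first byte of the address on the wire is not stated in this section, so the sketch only returns the bytes in their numbered order.

```python
def ipv6_bytes_from_words(words):
    """Assemble address bytes 0..15 from eight EEPROM words (lowest offset first)."""
    if len(words) != 8:
        raise ValueError("expected eight 16-bit words")
    raw = bytearray()
    for w in words:
        raw.append(w & 0xFF)         # bits 7:0  hold the even-numbered byte
        raw.append((w >> 8) & 0xFF)  # bits 15:8 hold the odd-numbered byte
    return bytes(raw)


def ipv6_first_word_offset(index):
    """First word offset of LAN0 IPv6 Address `index` (0, 1, or 2)."""
    if index not in (0, 1, 2):
        raise ValueError("only addresses 0, 1, and 2 are described")
    return 0x5B + 8 * index
```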
6.8
Sideband Configuration Structure
This section defines parameters of the SMBus and NC-SI interfaces.
6.8.1
Section Header (Offset 0x0)
Bit | Name | Description
15:8 | Block CRC8 |
7:0 | Block Length |
6.8.2
SMBus Max Fragment Size (Offset 0x1)
Bit | Name | Description
15:0 | SMBus Max Fragment Size (Bytes) | Between 32 and 240 bytes.
6.8.3
SMBus Notification Timeout and Flags (Offset 0x2)
Bit | Name | Description
15:8 | SMBus Notification Timeout (ms) | Timeout until the discarding of a packet not read by the external MC completes. 0b - No discard.
7:6 | SMBus Connection Speed | 00b = Slow SMBus connection. 01b = Fast SMBus connection (1 MHz). 10b = Reserved. 11b = Reserved.
5 | SMBus Block Read Command | 0b = Block read command is C0. 1b = Block read command is D0.
4 | SMBus Addressing Mode | 0b = Single address mode. 1b = Dual address mode.
3 | Reserved | Reserved - must be zero
2 | Disable SMBus ARP Functionality |
1 | SMBus ARP PEC |
0 | Reserved | Reserved - must be zero
6.8.4
SMBus Slave Address (Offset 0x3)
Bit | Name | Description
15:9 | SMBus 1 Slave Address | Dual-address mode only.
8 | Reserved | Reserved - must be zero.
7:1 | SMBus 0 Slave Address |
0 | Reserved | Reserved - must be zero.
6.8.5
SMBus Fail-Over Register; Low Word (Offset 0x4)
Bit | Name | Description
15:12 | Gratuitous ARP Counter |
11:10 | Reserved | Reserved - must be zero.
9 | Enable Teaming Fail-over on DX |
8 | Remove Promiscuous on DX |
7 | Enable MAC Filtering |
6 | Enable Repeated Gratuitous ARP |
5 | Reserved | Reserved - set to 1.
4 | Enable Preferred Primary |
3 | Preferred Primary Port |
2 | Transmit Pair |
1:0 | Reserved | Reserved - must be zero.
6.8.6
SMBus Fail-Over Register; High Word (Offset 0x5)
Bit | Name | Description
15:8 | Gratuitous ARP Transmission Interval (seconds) |
7:0 | Link Down Fail-over Time |
6.8.7
NC-SI Configuration (Offset 0x6)
Bit | Name | Description
15:11 | Reserved | Reserved - must be zero.
10 | Reserved | Reserved - must be zero.
9 | NC-SI HW arbitration supported | 0b = Not supported. 1b = Supported.
8 | NC-SI HW-based packet copy enable | 0b = Disable. 1b = Enable.
7:5 | Package ID |
4:0 | LAN 0 Internal channel ID | Valid values are 0 to 0x1D. Note: If not disabled, LAN1 channel ID is assigned the value of LAN0+1.
6.8.8
NC-SI Hardware arbitration Configuration (Offset 0x8)
Bit | Name | Description
15:0 | Token timeout | NC-SI HW-Arbitration TOKEN Timeout (in 16 ns cycles). To get the value in NC-SI REF_CLK cycles, multiply this field by 4/5. Setting the value to 0 disables the timeout mechanism.
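The 4/5 factor is consistent with a 20 ns REF_CLK period (16 ns / 20 ns). The 20 ns figure is an assumption here, based on the 50 MHz reference clock that NC-SI normally uses. A minimal conversion sketch, with hypothetical function names:

```python
def token_timeout_ns(field):
    """Timeout in nanoseconds; the field counts 16 ns cycles."""
    return field * 16


def token_timeout_ref_clk_cycles(field):
    """Timeout in NC-SI REF_CLK cycles, or None when the timeout is disabled."""
    if field == 0:
        return None  # a value of 0 disables the timeout mechanism
    return field * 4 / 5
```

A field value of 1000, for example, corresponds to 16000 ns, or 800 REF_CLK cycles.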
6.8.9
Reserved (Offset 0x9 - 0xC)
Reserved. Must be zero
6.9
Flex TCO Filter Configuration Structure
Used to pre-configure the manageability-TCO flex filters so that pass-thru traffic can be received
without explicit configuration by the BMC. This structure should be used in conjunction with the PT-LAN
configuration structure.
6.9.1
Section Header (Offset 0x0)
Bit | Name
15:8 | Block CRC8
7:0 | Block Length
6.9.2
Flex Filter Length and Control (Offset 0x01)
Bit | Name | Description
15:8 | Flex Filter Length (Bytes) |
7:5 | Reserved | Reserved - must be zero.
4 | Last Filter |
3:2 | Filter Index (3:0) |
1 | Apply Filter to LAN 1 |
0 | Apply Filter to LAN 0 |
6.9.3
Flex Filter Enable Mask (Offset 0x02:0x09)
Bit | Name | Description
15:0 | Flex Filter Enable Mask |
6.9.4
Flex Filter Data (Offset 0x0A - Block Length)
Bit | Name | Description
15:0 | Flex Filter Data |
6.10
Software Accessed Words
Words 0x03 to 0x07 in the EEPROM image are used for compatibility information. New bits within these
fields will be defined as the need arises for determining software compatibility between various
hardware revisions.
Words 0x08 and 0x09 are used to indicate the Printed Board Assembly (PBA) number and words 0x42
and 0x43 identify the EEPROM image.
Words 0x30 to 0x3E have been used for configuration and version values by PXE code. The only
exceptions are word 0x3D, which is used for the iSCSI boot configuration and word 0x37 used for
alternate MAC address pointer.
6.10.1
Compatibility (Word 0x03)
Bit | Loaded from EEPROM: 0x0410 | Description
15:13 | 000 | Reserved (set to 000b).
12 | 0 | ASF SMBus Connected. 0b = Not connected. 1b = Connected.
11 | 0 | LOM/Not a LOM. 0b = NIC. 1b = LOM.
10 | 1 | Server/Not a Server NIC. 0b = Client. 1b = Server.
9 | 0 | Client/Not a Client NIC. 0b = Server. 1b = Client.
8 | 0 | Retail/OEM. 0b = Retail. 1b = OEM.
7:6 | 00 | Reserved (set to 00b).
5 | 0 | Reserved (set to 1b).
4 | 1 | SMBus Connected. 0b = Not connected. 1b = Connected.
3 | 0 | Reserved (set to 0b).
2 | 0 | PCI Bridge/No PCI Bridge. 0b = PCI bridge not present. 1b = PCI bridge present.
1:0 | 00 | Reserved (set to 00b)
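As a quick cross-check of the sample value 0x0410 against the bit definitions, the word can be decoded as shown below. This is only an illustrative sketch: the flag labels and function name are hypothetical, and reserved bits are ignored.

```python
# Non-reserved bits of the Compatibility word; labels describe the "1b" meaning.
COMPATIBILITY_FLAGS = {
    12: "ASF SMBus connected",
    11: "LOM",
    10: "Server",
    9: "Client",
    8: "OEM",
    4: "SMBus connected",
    2: "PCI bridge present",
}


def decode_compatibility(word):
    """Return the labels of all non-reserved flags set in EEPROM word 0x03."""
    return [name for bit, name in COMPATIBILITY_FLAGS.items() if word & (1 << bit)]
```

Decoding 0x0410 yields the Server and SMBus connected flags, which agrees with the per-bit values listed in the table.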
6.10.2
OEM specific (Word 0x04)
Bit | Loaded from EEPROM: 0xFFFF | Description
15:12 | 0xF | Control for LED 3.
11:8 | 0xF | Control for LED 2 – same encoding as for LED 3.
7:4 | 0xF | Control for LED 1 – same encoding as for LED 3.
3:0 | 0xF | Control for LED 0 – same encoding as for LED 3.
Control encoding for LED 3:
0001b = Default in STATE1 + Default in STATE2.
0010b = Default in STATE1 + LED is ON in STATE2.
0011b = Default in STATE1 + LED is OFF in STATE2.
0100b = LED is ON in STATE1 + Default in STATE2.
0101b = LED is ON in STATE1 + LED is ON in STATE2.
0110b = LED is ON in STATE1 + LED is OFF in STATE2.
0111b = LED is OFF in STATE1 + Default in STATE2.
1000b = LED is OFF in STATE1 + LED is ON in STATE2.
1001b = LED is OFF in STATE1 + LED is OFF in STATE2.
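The nine listed codes enumerate every (STATE1, STATE2) pair of Default, ON, and OFF, so a code can be decoded arithmetically instead of with a lookup table. The sketch below uses hypothetical names and returns None for codes the table does not list (0000b and 1010b to 1111b), since their meaning is not given here.

```python
LED_STATES = ("Default", "ON", "OFF")


def decode_led_code(code):
    """Map a 4-bit LED control code to (STATE1, STATE2), or None if unlisted."""
    if not 1 <= code <= 9:
        return None  # code not listed in the encoding table
    return LED_STATES[(code - 1) // 3], LED_STATES[(code - 1) % 3]


def decode_led_word(word):
    """Decode word 0x04 into a dict keyed by LED number (0 to 3)."""
    return {led: decode_led_code((word >> (4 * led)) & 0xF) for led in range(4)}
```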
6.10.3
OEM Specific (Word 0x06, 0x07)
These words are available for OEM use.
Loaded from sample EEPROM: 0xFFFF 0xFFFF.
6.10.4
EEPROM Image Revision (Word 0x05)
This word is valid only for device starter images and indicates the ID and version of the EEPROM image.
Bit | Loaded from EEPROM: 0x2011 | Description
15:12 | 0x2 | EEPROM major version.
11:4 | 0x01 | EEPROM minor version.
3:0 | 0x1 | EEPROM image ID.
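The field split can be verified against the sample value 0x2011 with a few shifts and masks. The function name below is hypothetical.

```python
def decode_image_revision(word):
    """Split EEPROM word 0x05 into major version, minor version, and image ID."""
    return {
        "major": (word >> 12) & 0xF,   # bits 15:12
        "minor": (word >> 4) & 0xFF,   # bits 11:4
        "image_id": word & 0xF,        # bits 3:0
    }
```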
6.10.5
PBA Number Module (Word 0x08, 0x09)
Loaded from sample EEPROM: 0xFFFF 0xFFFF.
The nine-digit Printed Board Assembly (PBA) number used for Intel manufactured Network Interface
Cards (NICs) is stored in EEPROM.
Through the course of hardware ECOs, the suffix field is incremented. The purpose of this information is
to enable customer support (or any user) to identify the revision level of a product.
Network driver software should not rely on this field to identify the product or its capabilities.
PBA numbers have exceeded the length that can be stored as HEX values in two words. For newer NICs,
the high word in the PBA Number Module is a flag (0xFAFA) indicating that the actual PBA is stored in a
separate PBA block. The low word is a pointer to the starting word of the PBA block.
The following shows the format of the PBA Number Module field for new products.
PBA Number | Word 0x8 | Word 0x9
G23456-003 | FAFA | Pointer to PBA Block
The following provides the format of the PBA block; pointed to by word 0x9 above:
Word Offset | Description
0x0 | Length in words of the PBA Block (default is 0x6)
0x1 ... 0x5 | PBA Number stored in hexadecimal ASCII values.
The new PBA block contains the complete PBA number and includes the dash and the first digit of the 3-digit suffix, which were not included previously. Each digit is represented by its hexadecimal ASCII
value.
The following shows an example PBA number (in the new style):
PBA Number | Word Offset 0 | Word Offset 1 | Word Offset 2 | Word Offset 3 | Word Offset 4 | Word Offset 5
G23456-003 | 0006 | 4732 | 3334 | 3536 | 2D30 | 3033
 | Specifies 6 words | G2 | 34 | 56 | -0 | 03
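The new-style example above can be reproduced with a short decoder. This is a sketch under two assumptions drawn from the example: the length word counts itself (6 words in total), and each data word holds two ASCII characters with the first character in the high byte. The function name is hypothetical.

```python
def decode_pba_block(block_words):
    """Decode a new-style PBA block into its ASCII PBA number string."""
    length = block_words[0]            # length in words, including this word
    if length < 1 or length > len(block_words):
        raise ValueError("PBA block length does not fit the supplied words")
    data = block_words[1:length]
    # High byte is the first character of each pair, low byte the second.
    return "".join(chr(w >> 8) + chr(w & 0xFF) for w in data)
```

Applied to the example words 0006 4732 3334 3536 2D30 3033, this yields the string G23456-003.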
Older NICs have PBA numbers starting with [A,B,C,D,E], which are stored directly in words 0x8-0x9. The
dash in the PBA number is not stored, nor is the first digit of the 3-digit suffix (the first digit is always
0 for older products).
The following example shows a PBA number stored in the PBA Number Module field (in the old style):
PBA Number | Byte 1 | Byte 2 | Byte 3 | Byte 4
E23456-003 | E2 | 34 | 56 | 03
6.10.6
PXE Configuration Words (Word 0x30:3B)
PXE configuration is controlled by the following words.
6.10.6.1
Main Setup Options PCI Function 0 (Word 0x30)
The main setup options are stored in word 30h. These options are those that can be changed by the
user via the Control-S setup menu. Word 30h has the following format:
Bit(s) | Name | Hardware Default | Loaded from EEPROM: 0x0100 | Description
15:13 | RFU | 0x0 | 0x0 | Reserved. Must be 0.
12:10 | FSD | 0x0 | 0x0 | Bits 12-10 control forcing speed and duplex during driver operation. Valid values are: 000b – Auto-negotiate; 001b – 10Mbps Half Duplex; 010b – 100Mbps Half Duplex; 011b – Not valid (treated as 000b); 100b – 10Mbps Full Duplex; 101b – 100Mbps Full Duplex; 111b – 1000Mbps Full Duplex. Only applicable for copper-based adapters. Not applicable to 10GbE. Default value is 000b.
9 | RSV | 0b | 0b | Reserved. Set to 0.
8 | DSM | 1b | 1b | Display Setup Message. If the bit is set to 1, the Press Control-S message is displayed after the title message. Default value is 1.
7:6 | PT | 0x0 | 0x0 | Prompt Time. These bits control how long the CTRL-S setup prompt message is displayed, if enabled by DSM. 00 = 2 seconds (default); 01 = 3 seconds; 10 = 5 seconds; 11 = 0 seconds. Note: CTRL-S message is not displayed if 0 seconds prompt time is selected.
5 | IBD | 0b | 0b | iSCSI Boot Disable.
4:3 | DBS | 0b | 0b | Default Boot Selection. These bits select which device is the default boot device. These bits are only used if the agent detects that the BIOS does not support boot order selection or if the MODE field of word 31h is set to MODE_LEGACY. 00 = Network boot, then local boot (default); 01 = Local boot, then network boot; 10 = Network boot only; 11 = Local boot only.
2 | DEP | 0b | 0b | Deprecated. Must be 0.
1:0 | PS | 0x0 | 0x0 | Protocol Select. These bits select the active boot protocol. 00 = PXE (default value); 01 = RPL (only if RPL is in the flash); 10 = iSCSI Boot primary port (only if iSCSI Boot is using this adapter); 11 = iSCSI Boot secondary port (only if iSCSI Boot is using this adapter). Only the default value of 00b should be initially programmed into the adapter; other values should only be set by configuration utilities.
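To illustrate how the fields of word 30h pack together, the sketch below decodes the sample value 0x0100. The function, dictionary keys, and lookup tables are hypothetical names. FSD value 110b is not listed in the table, so the sketch reports it as None instead of guessing.

```python
FSD_MODES = {
    0b000: "Auto-negotiate",
    0b001: "10Mbps Half Duplex",
    0b010: "100Mbps Half Duplex",
    0b011: "Auto-negotiate",  # not valid, treated as 000b
    0b100: "10Mbps Full Duplex",
    0b101: "100Mbps Full Duplex",
    0b111: "1000Mbps Full Duplex",
}
PROMPT_SECONDS = {0b00: 2, 0b01: 3, 0b10: 5, 0b11: 0}
BOOT_SELECTION = {
    0b00: "Network boot, then local boot",
    0b01: "Local boot, then network boot",
    0b10: "Network boot only",
    0b11: "Local boot only",
}


def decode_main_setup(word):
    """Decode the PXE main setup options word (word 0x30)."""
    return {
        "speed_duplex": FSD_MODES.get((word >> 10) & 0x7),   # FSD, bits 12:10
        "display_setup_message": bool(word & (1 << 8)),      # DSM, bit 8
        "prompt_seconds": PROMPT_SECONDS[(word >> 6) & 0x3], # PT, bits 7:6
        "iscsi_boot_disable": bool(word & (1 << 5)),         # IBD, bit 5
        "default_boot": BOOT_SELECTION[(word >> 3) & 0x3],   # DBS, bits 4:3
        "protocol_select": word & 0x3,                       # PS, bits 1:0
    }
```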
6.10.6.2
Configuration Customization Options PCI Function 0 (Word 0x31)
Word 31h of the EEPROM contains settings that can be programmed by an OEM or network
administrator to customize the operation of the software. These settings cannot be changed from within
the Control-S setup menu. The lower byte contains settings that would typically be configured by a
network administrator using an external utility; these settings generally control which setup menu
options are changeable. The upper byte is generally settings that would be used by an OEM to control
the operation of the agent in a LOM environment, although there is nothing in the agent to prevent
their use on a NIC implementation. The default value for this word is 4000h.
Bit(s) | Name | Hardware Default | Loaded from EEPROM: 0x4000 | Function
15:14 | SIG | 0x1 | 0x1 | Signature. Must be set to 01 to indicate that this word has been programmed by the agent or other configuration software.
13 | RFU | 0b | 0b | Reserved. Must be 0.
12 | RFU | 0b | 0b | Reserved. Must be 0.
11 | RETRY | 0b | 0b | Selects Continuous Retry operation. If this bit is set, IBA will NOT transfer control back to the BIOS if it fails to boot due to a network error (such as failure to receive DHCP replies). Instead, it will restart the PXE boot process again. If this bit is set, the only way to cancel PXE boot is for the user to press ESC on the keyboard. Retry will not be attempted due to hardware conditions such as an invalid EEPROM checksum or failing to establish link. Default value is 0.
10:8 | MODE | 0b | 0b | Selects the agent’s boot order setup mode. This field changes the agent’s default behavior in order to make it compatible with systems that do not completely support the BBS and PnP Expansion ROM standards. Valid values and their meanings are listed below.
7 | RFU | 0b | 0b | Reserved. Must be 0.
6 | RFU | 0b | 0b | Reserved. Must be 0.
5 | DFU | 0b | 0b | Disable Flash Update. If this bit is set to 1, the user is not allowed to update the flash image using PROSet. Default value is 0.
4 | DLWS | 0b | 0b | Disable Legacy Wakeup Support. If this bit is set to 1, the user is not allowed to change the Legacy OS Wakeup Support menu option. Default value is 0.
3 | DBS | 0b | 0b | Disable Boot Selection. If this bit is set to 1, the user is not allowed to change the boot order menu option. Default value is 0.
2 | DPS | 0b | 0b | Disable Protocol Select. If set to 1, the user is not allowed to change the boot protocol. Default value is 0.
1 | DTM | 0b | 0b | Disable Title Message. If this bit is set to 1, the title message displaying the version of the Boot Agent is suppressed; the Control-S message is also suppressed. This is for OEMs who do not wish the boot agent to display any messages at system boot. Default value is 0.
0 | DSM | 0b | 0b | Disable Setup Menu. If this bit is set to 1, the user is not allowed to invoke the setup menu by pressing Control-S. In this case, the EEPROM may only be changed via an external program. Default value is 0.
MODE field values (bits 10:8):
000b - Normal behavior. The agent will attempt to detect BBS and PnP Expansion ROM support as it normally does.
001b - Force Legacy mode. The agent will not attempt to detect BBS or PnP Expansion ROM supports in the BIOS and will assume the BIOS is not compliant. The user can change the BIOS boot order in the Setup Menu.
010b - Force BBS mode. The agent will assume the BIOS is BBS-compliant, even though it may not be detected as such by the agent’s detection code. The user can NOT change the BIOS boot order in the Setup Menu.
011b - Force PnP Int18 mode. The agent will assume the BIOS allows boot order setup for PnP Expansion ROMs and will hook interrupt 18h (to inform the BIOS that the agent is a bootable device) in addition to registering as a BBS IPL device. The user can NOT change the BIOS boot order in the Setup Menu.
100b - Force PnP Int19 mode. The agent will assume the BIOS allows boot order setup for PnP Expansion ROMs and will hook interrupt 19h (to inform the BIOS that the agent is a bootable device) in addition to registering as a BBS IPL device. The user can NOT change the BIOS boot order in the Setup Menu.
101b - Reserved for future use. If specified, is treated as a value of 000b.
110b - Reserved for future use. If specified, is treated as a value of 000b.
111b - Reserved for future use. If specified, is treated as a value of 000b.
6.10.6.3
PXE Version (Word 0x32)
Word 32h of the EEPROM is used to store the version of the boot agent that is stored in the flash image.
When the Boot Agent loads, it can check this value to determine if any first-time configuration needs to
be performed. The agent then updates this word with its version. Some diagnostic tools also read this word to report the
version of the Boot Agent in the flash. The format of this word is:
Bit(s) | Name | Hardware Default | Loaded from EEPROM: 0x1314 | Function
15:12 | MAJ | 0x0 | 0x1 | PXE Boot Agent Major Version. Default value is 0.
11:8 | MIN | 0x0 | 0x3 | PXE Boot Agent Minor Version. Default value is 0.
7:0 | BLD | 0x0 | 0x14 | PXE Boot Agent Build Number. Default value is 0.
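The sample value 0x1314 decodes to major 1, minor 3, build 0x14 (decimal 20). A minimal sketch of the split, using a hypothetical function name:

```python
def decode_pxe_version(word):
    """Split word 0x32 into (major, minor, build)."""
    major = (word >> 12) & 0xF   # MAJ, bits 15:12
    minor = (word >> 8) & 0xF    # MIN, bits 11:8
    build = word & 0xFF          # BLD, bits 7:0
    return major, minor, build
```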
6.10.6.4
IBA Capabilities (Word 0x33)
Word 33h of the EEPROM is used to enumerate the boot technologies that have been programmed into
the flash. This is updated by flash configuration tools and is not updated or read by IBA.
Bit(s) | Name | Hardware Default | Loaded from EEPROM: 0x4003 | Function
15:14 | SIG | 0x1 | 0x1 | Signature. Must be set to 01 to indicate that this word has been programmed by the agent or other configuration software.
13:5 | RFU | 0b | 0b | Reserved. Must be 0.
4 | ISCSI | 0b | 0b | iSCSI Boot is present in flash if set to 1.
3 | EFI | 0b | 0b | EFI UNDI driver is present in flash if set to 1.
2 | Reserved | 0b | 0b | Set to 0.
1 | UNDI | 0b | 1b | PXE UNDI driver is present in flash if set to 1.
0 | BC | 0b | 1b | PXE Base Code is present in flash if set to 1.
6.10.6.5
Setup Options PCI Function 1 (Word 0x34)
This word is the same as word 30h, but for function 1 of the device.
6.10.6.6
Configuration Customization Options PCI Function 1 (Word 0x35)
This word is the same as word 31h, but for function 1 of the device.
6.10.6.7
iSCSI Option ROM Version (Word 0x36)
Word 0x36 of the NVM stores the version of the iSCSI Option ROM, in the same format as the PXE Version
at word 0x32. The value must be above 0x2000; lower values are reserved (word 0x1FFF = 16
KB NVM size). iSCSIUtl, FLAUtil, and DMiX update the iSCSI Option ROM version if the value is
above 0x2000, 0x0000, or 0xFFFF. A value in the range 0x0040 - 0x1FFF should be kept and not overwritten.
6.10.6.8
Setup Options PCI Function 2 (Word 0x38)
This word is the same as word 30h, but for function 2 of the device.
6.10.6.9
Configuration Customization Options PCI Function 2 (Word 0x39)
This word is the same as word 31h, but for function 2 of the device.
6.10.6.10
Setup Options PCI Function 3 (Word 0x3A)
This word is the same as word 30h, but for function 3 of the device.
6.10.6.11
Configuration Customization Options PCI Function 3 (Word 0x3B)
This word is the same as word 31h, but for function 3 of the device.
6.10.7
iSCSI Boot Configuration Offset (Word 0x3D)
Bit | Name | Description
15:0 | Offset | Defines the offset in EEPROM where the iSCSI boot configuration structure starts.
6.10.7.1
iSCSI Module Structure
Configuration Item | Size in Bytes | Comments
iSCSI Boot Signature | 2 | ‘i’, ‘S’
iSCSI Block Size | 2 | Total byte size of the iSCSI configuration block
Structure Version | 1 | Version of this structure. Should be set to 1.
Reserved | 1 | Reserved for future use.
Initiator Name | 255 + 1 | iSCSI initiator name. This field is optional and built by manual input, DHCP host name, or with MAC address as defined in section 4.4.
Reserved | 34 | Reserved for future use.
BELOW FIELDS ARE PER PORT.
Flags | 2 | Bit 00h: Enable DHCP. 0 – Use static configurations from this structure; 1 – Overrides configurations retrieved from DHCP. Bit 01h: Enable DHCP for getting iSCSI target information. 0 – Use static target configuration; 1 – Use DHCP to get target information by the Option 17 Root Path. Bit 02h – 03h: Authentication Type. 00 – none; 01 – one way chap; 02 – mutual chap. Bit 04h – 05h: Ctrl-D setup menu. 00 – enabled; 03 – disabled, skip Ctrl-D entry. Bit 06h – 07h: Reserved. Bit 08h – 09h: ARP Retries. Retry value. Bit 0Ah – 0Fh: ARP Timeout. Timeout value for each try.
Initiator IP | 4 | Initiator DHCP flag not set: this field should contain the initiator IP address. Set: this field is ignored.
Subnet Mask | 4 | Initiator DHCP flag not set: this field should contain the subnet mask. Set: this field is ignored.
Gateway IP | 4 | Initiator DHCP flag not set: this field should contain the gateway IP address. Set: this field is ignored.
Boot LUN | 2 | Target DHCP flag not set: iSCSI target LUN number should be specified. Set: this field is ignored.
Target IP | 4 | Target DHCP flag not set: IP address of iSCSI target. Set: this field is ignored.
Target Port | 2 | Target DHCP flag not set: TCP port used by iSCSI target. Default is 3260. Set: this field is ignored.
Target Name | 255 + 1 | Target DHCP flag not set: iSCSI target name should be specified. Set: this field is ignored.
CHAP Password | 16 + 2 | The minimum CHAP secret must be 12 octets and maximum CHAP secret size is 16. The last 2 bytes are null alignment padding.
CHAP User Name | 127 + 1 | The user name must be non-null value and maximum size of user name allowed is 127 characters.
Reserved | 2 | Reserved
Mutual CHAP Password | 16 + 2 | The minimum mutual CHAP secret must be 12 octets and maximum mutual CHAP secret size is 16. The last 2 bytes are null alignment padding.
Reserved | 160 | Reserved for future use.
The maximum amount of boot configuration information that is stored is 834 bytes (417 words);
however, the iSCSI boot implementation can limit this value in order to work with a smaller EEPROM.
Variable length fields are used to limit the total amount of EEPROM that is used for iSCSI boot
information. Each field is preceded by a single byte that indicates how much space is available for that
field. For example, if the Initiator Name field is being limited to 128 bytes, then it is preceded with a
single byte with the value of 128. The following field begins at 128 bytes after the beginning of the
Initiator Name field regardless of the actual size of the field. The variable length fields must be NULL
terminated unless they reach the maximum size specified in the length byte.
6.10.8
Alternate MAC Address Pointer (Word 0x37)
This word may point to a location in the EEPROM containing additional MAC addresses used by system
management functions. If the additional MAC addresses are not supported, the word shall be set to
0xFFFF.
6.10.9
Checksum Word (Word 0x3F)
The checksum word (0x3F) is used to ensure that the base EEPROM image is a valid image. The value
of this word should be calculated such that after adding all the words (0x00:0x3F), including the
checksum word itself, the sum should be 0xBABA. The initial value in the 16-bit summing register
should be 0x0000 and the carry bit should be ignored after each addition.
Note:
Hardware does not calculate the word 0x3F checksum during EEPROM write; it must be
calculated by software independently and included in the EEPROM write data. Hardware
does not compute a checksum over words 0x00:0x3F during EEPROM reads in order to
determine validity of the EEPROM image; this field is provided strictly for software
verification of EEPROM validity. All hardware configurations based on word 0x00:0x3F
content are based on the validity of the Signature field of EEPROM Initialization Control Word
1 (Signature must be 01b).
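Because hardware neither computes nor verifies this checksum, software has to do both. The following sketch shows the arithmetic described above: a 16-bit sum with carries discarded, targeting 0xBABA. The function names are hypothetical, and reading or writing the EEPROM words themselves is outside the scope of the example.

```python
CHECKSUM_TARGET = 0xBABA


def eeprom_checksum_word(words_00_to_3e):
    """Compute the value for word 0x3F from words 0x00..0x3E (63 words)."""
    if len(words_00_to_3e) != 0x3F:
        raise ValueError("expected words 0x00 through 0x3E")
    # The summing register starts at 0 and carries out of bit 15 are ignored.
    return (CHECKSUM_TARGET - sum(words_00_to_3e)) & 0xFFFF


def eeprom_checksum_ok(words_00_to_3f):
    """Check that words 0x00..0x3F (64 words) sum to 0xBABA modulo 2**16."""
    if len(words_00_to_3f) != 0x40:
        raise ValueError("expected words 0x00 through 0x3F")
    return (sum(words_00_to_3f) & 0xFFFF) == CHECKSUM_TARGET
```

A typical use would be to compute the checksum word after modifying any of words 0x00:0x3E and include it in the same EEPROM write.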
6.10.10
Image Unique ID (Word 0x42, 0x43)
These words contain a unique 32-bit ID for each image generated by Intel to enable tracking of images
and comparison to the original image if testing a customer EEPROM image.
7.0
Inline Functions
7.1
Receive Functionality
7.1.1
Rx Queues Assignment
A received packet goes through three stages of filtering as shown in Figure 7-1. Figure 7-1 describes a
switch-like structure that is used in virtualization mode to route packets between the network port (top
of drawing) and one or more virtual ports (bottom of figure), where each virtual port can be associated
with a virtual machine, an IOVM, a VMM, or the like.
The first step in queue assignment is to make sure that the packet is received by the port. This is done
by a set of L2 filters as described in Section 7.1.2.
The second stage is specific to virtualization environments and defines the virtual ports (called pools)
that are the targets for the Rx packet. A packet can be associated with any number of ports/pools, and
the selection process is described in Section 7.1.1.2.
Figure 7-1. Stages in Packet Filtering
In the third stage, a receive packet that successfully passed the Rx filters is associated with one or
more receive descriptor queues as described in this section.
The following filter mechanisms determine the destination of a receive packet. These are described
briefly in this section and in full details in separate sections:
• Virtualization — In a virtualized environment, DMA resources are shared between more than one
software entity (operating system and/or software device driver). This is done by allocating receive
descriptor queues to virtual partitions (VMM, IOVM, VMs, or VFs). Allocating queues to virtual
partitions is done in sets called queue pools or pools, each with the same number of queues.
Virtualization assigns to each received packet one or more pool indices. Packets are routed to a pool
based on their pool index and other considerations, such as Receive Side Scaling (RSS). See
Section 7.1.1.2 for details on routing for virtualization.
• RSS — RSS distributes packet processing between several processor cores by assigning packets
into different descriptor queues. RSS assigns to each received packet an RSS index. Packets are
routed to one of a set of Rx queues based on their RSS index and other considerations such as
virtualization. See Section 7.1.1.7 for details on RSS.
• L2 Ethertype filters — These filters identify packets by their L2 Ether type and assign them to
receive queues. Examples of possible uses are LLDP packets and 802.1X packets. See
Section 7.1.1.4 for more details. The 82576 incorporates four Ether-type filters.
• L3/L4 5-tuple filters — These filters identify specific L3/L4 flows or sets of L3/L4 flows. Each filter
consists of a 5-tuple (protocol, source and destination IP addresses, source, and destination TCP/
UDP port) and routes packets into one of the Rx queues. The 82576 has eight such filters. See
Section 7.1.1.5 for details.
• TCP SYN filters — The 82576 might route TCP packets with their SYN flag set into a separate queue.
SYN packets are often used in SYN attacks to load the system with numerous requests for new
connections. By filtering such packets to a separate queue, security software can monitor and act
on SYN attacks. See Section 7.1.1.6 for more details.
Typically, packet reception consists of recognizing the presence of a packet on the wire, performing
address filtering, storing the packet in the receive data FIFO, transferring the data to one of the 16
receive queues in host memory, and updating the state of a receive descriptor.
Note:
Maximum supported received-packet size is 9.5 KB (9728 bytes).
A received packet is allocated to a queue based on the previous criteria and the following order:
• Queue by L2 Ether-type filters (if a match)
• If RFCTL.SYNQFP is 0b, then:
— Queue by L3/L4 5-tuple filters (if a match)
— Queue by SYN filter (if a match)
• If RFCTL.SYNQFP is 1b, then:
— Queue by SYN filter (if a match)
— Queue by L3/L4 5-tuple filters (if a match)
• Define a pool (in case of virtualization)
• Queue by RSS.
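The allocation order above can be modeled in a few lines of software. The following is a minimal illustrative sketch, not a description of the hardware implementation. The function name select_queue and its arguments are hypothetical; each filter result is assumed to be supplied by the caller as a queue number, or None when the filter did not match.

```python
from typing import Optional


def select_queue(ethertype_q: Optional[int],
                 tuple_q: Optional[int],
                 syn_q: Optional[int],
                 rss_q: int,
                 synqfp: int) -> int:
    """Pick an Rx queue following the allocation order listed above.

    Each *_q argument is the queue chosen by that filter, or None when the
    filter did not match. rss_q is the queue produced by pool selection and
    RSS, used when no special filter matched.
    """
    # L2 Ether-type filters have the highest priority.
    if ethertype_q is not None:
        return ethertype_q
    # RFCTL.SYNQFP selects whether the SYN filter or the 5-tuple filters
    # are consulted first.
    ordered = (syn_q, tuple_q) if synqfp else (tuple_q, syn_q)
    for queue in ordered:
        if queue is not None:
            return queue
    return rss_q
```

For example, a packet that matches both a 5-tuple filter (queue 3) and the SYN filter (queue 5) lands in queue 3 when SYNQFP is 0b and in queue 5 when SYNQFP is 1b.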
Table 7-1 lists the allocation of the queues in each of the modes.
Table 7-1. Queue Allocation (1)
Virtualization  RSS       Queue allocation
Disabled        Disabled  One default queue (MRQC.DEF_Q).
Disabled        Enabled   Up to 16 queues by RSS.
Enabled         Disabled  One queue per VM (queues 0-7 for VM 0-7).
Enabled         Enabled   Two queues per VM (queues 0, 8; 1, 9; 2, 10; 3, 11; 4, 12; 5, 13; 6, 14; 7, 15
                          for VM 0-7, respectively). Spread between the queues by RSS.
(1) On top of this allocation, the special filters can override the queueing decision.
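The virtualization rows of Table 7-1 follow a simple pattern: VM n owns queue n, plus queue n + 8 when RSS is enabled. A minimal sketch of that mapping is shown below. The helper name vm_queues is hypothetical, and the sketch ignores the special filters that can override the queueing decision.

```python
def vm_queues(vm: int, rss_enabled: bool) -> tuple:
    """Return the Rx queues owned by a VM according to Table 7-1."""
    if not 0 <= vm <= 7:
        raise ValueError("VM index must be in the range 0-7")
    # Without RSS each VM has one queue; with RSS it also owns queue vm + 8.
    return (vm, vm + 8) if rss_enabled else (vm,)
```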
7.1.1.1
Queuing in a Non-Virtualized Environment
A received packet is assigned to a queue in the following manner:
• L2 Ether-type filters — Each filter identifies one of 16 Rx queues.
• SYN filter — Identifies one of 16 Rx queues.
• L3/L4 5-tuple filters — Each filter identifies one of 16 Rx queues.
• RSS filters - Identifies one of 2 x 8 queues through the RSS index. The following modes are
supported:
— No RSS — The default queue as defined in MRQC.DEF_Q is used for packets that do not meet
any of the previous conditions.
— RSS — A set of 16 queues is allocated for RSS. The queue is identified through the RSS index.
Note that it is possible to use a subset of the 16 queues.
Figure 7-2. Rx Queuing Flow (Non-Virtualized)
7.1.1.2
Rx Queuing in a Virtualized Environment
The 16 Rx queues are allocated to a pre-configured number of queue sets called pools. In Next
Generation VMDq mode, system software allocates the pools to the VMM, an IOVM, or to VMs. In IOV
mode, each pool is associated with a VF.
Incoming packets are associated with pools based on their L2 characteristics as described in
Section 7.10.3. This section describes the following stage, where an Rx queue is assigned to each
replication of the Rx packet as determined by its pool’s association.
A received packet is assigned to a queue within a pool in the following manner:
• L2 Ether-type filters — Each filter identifies a specific queue, belonging to some pool (the queue
designation determines the pool and is usually allocated to the VMM or a service operating system).
• SYN filter — Not supported in VT modes.
• L3/L4 5-tuple filters — Each filter is associated with a single Rx queue, belonging to a specific pool.
• RSS filters — The following modes are supported:
— No RSS — A single queue is allocated per pool (queue 0 of each pool).
— RSS — All 16 queues are allocated to pools. Note that it is possible to enable RSS usage per
pool using the VMOLR.RSSE bit. If the packet is not suitable for RSS, then queue 0 of each
pool is used.
Figure 7-3. Rx Queuing Flow (Virtualization)
7.1.1.3
Queue Configuration Registers
Configuration registers (CSRs) that control queue operation are replicated per queue (total of 16 copies
of each register). Each replicated register corresponds to a queue such that the queue index
equals the serial number of the register (for example, register 0 corresponds to queue 0). Registers
included in this category are:
• RDBAL and RDBAH — Rx Descriptor Base
• RDLEN — Rx Descriptor Length
• RDH — Rx Descriptor Head
• RDT — Rx Descriptor Tail
• RXDCTL — Receive Descriptor Control
• RXCTL — Rx DCA Control
CSRs that define the functionality of descriptor queues are replicated per VF index to allow for a
separate configuration in a virtualized environment (total of eight copies of each register). Each
replicated register corresponds to a set of queues with the same VF index, such that the VF index of
the queue identifies the serial number of the register. Registers included in this category are:
• SRRCTL — Split and Replication Receive Control
• PSRTYPE — Packet Split Receive type
7.1.1.4
L2 Ether-Type Filters
These filters identify packets by L2 Ether-type and assign them to a receive queue. The following
usages have been identified:
• IEEE 802.1X packets — Extensible Authentication Protocol over LAN (EAPOL).
• Time sync packets (such as IEEE 1588) — Identifies Sync or Delay_Req packets
The 82576 incorporates eight Ether-type filters.
The Packet Type field in the Rx descriptor captures the filter number that matched the L2
Ether-type. See Section 7.1.5 for decoding of the Packet Type field.
The Ether-type filters are configured via the ETQF register as follows:
• The EType field contains the 16-bit Ether-type compared against all L2 type fields in the Rx packet.
• The Filter Enable bit enables identification of Rx packets by Ether-type according to this filter. If this
bit is cleared, the filter is ignored for all purposes.
• The Rx Queue field contains the absolute destination queue for the packet.
• The 1588 Time Stamp field indicates that the packet should be time stamped according to the IEEE
1588 specification.
• The Queue Enable field enables forwarding Rx packets based on the Ether-type defined in this
register.
Special considerations for Virtualization modes:
• Packets that match an Ether-type filter are diverted from their original pool (the pool identified by
the L2 filters) to the pool to which the queue in the Queue field belongs. In other
words, the L2 filters are ignored in determining the pool for such packets.
• The same applies for multicast packets. A single copy is posted to the pool defined by the filter.
• Mirroring rules:
— If a pool is being mirrored, the pool to which the queue in the Queue field belongs is used to
determine if a packet that matches the filter should be mirrored.
— The queue inside the pool (indicated by the Queue field) is used for both the original pool and
the mirroring pool.
7.1.1.5
L3/L4 5-Tuple Filters
These filters identify specific L3/L4 flows or sets of L3/L4 flows. Each filter consists of a 5-tuple
(protocol, source and destination IP addresses, source and destination TCP/UDP port) and forwards
packets into one of the Rx queues. In a virtualized environment, each filter can be associated with one
specific VF and a packet must match the L2 conditions for that VF to match the 5-tuple filter.
The 82576 incorporates eight such filters.
The 5-tuple filters are configured via the FTQF, SPQF, IMIR, IMIR_EXT, DAQF & SAQF registers as
follows (per filter):
• Protocol — Identifies the IP protocol, part of the 5-tuple queue filters. Enabled by a bit in the Mask
field.
• Source address — Identifies the IP source address, part of the 5-tuple queue filters. Enabled by a
bit in the Mask field. Only IPv4 addresses are supported.
• Destination address — Identifies the IP destination address, part of the 5-tuple queue filters.
Enabled by a bit in the Mask field. Only IPv4 addresses are supported.
• Source port — Identifies the TCP/UDP source port, part of the 5-tuple queue filters. Enabled by a bit
in the Mask field.
• Destination port — Identifies the TCP/UDP destination port, part of the 5-tuple queue filters.
Enabled if the IMIR.PORT_BP field is cleared.
• Size threshold — Identifies the length of the packet that should trigger the filter. This is the length
as received by the host, not including any part of the packet removed by hardware. Enabled by the
Size_BP field.
• Control Bits — Identify TCP flags that might be part of the filtering process. Enabled by the
CtrlBit_BP field.
• Rx queue — Determines the Rx queue for packets that match this filter. Only the LSB bits are used:
— In a non-virtualized configuration, the Rx Queue field contains the queue serial number.
— In the virtualized configuration, the Rx Queue field contains the queue serial number within the
set of queues of the VF associated (via the VF field) with this filter. In this case, the packet is
sent to all VFs in the VF index list (see Section 7.1.1.2 for details) in the queue defined in the
filter.
• Queue enable — Enables forwarding a packet that uses this filter.
• VF — Identifies the VF associated with this filter by its VF index (virtualization modes only). A
packet must match the VF filters (such as MAC address) and the 5-tuple filter for this filter to apply.
Note:
The above field should not be set to match a mirror port (such as a port that receives
promiscuous traffic), as it influences the queuing of packets sent to the mirrored port.
• VF Mask — Determines if the VF field participates in the 5-tuple match or is ignored:
— Must be set to 1b in non-virtualized case
— In a virtualized configuration:
• When set to 0b, only unicast packets that match the VF field are candidates for this filter.
• When set to 1b, unicast, multicast, and broadcast packets might all match with the 5-tuple
filter. VF association is not checked. The Rx Queue field defines a queue for each VF.
• Mask — A 5-bit field that masks each of the fields in the 5-tuple (L4 protocol, IP addresses, TCP/UDP
ports). The filter is a logical AND of the non-masked 5-tuple fields. If all 5-tuple fields are masked,
the filter is not used for queue forwarding.
Note:
If more than one 5-tuple filter with the same priority are matched by the packet, the first
filter (lowest ordinal number) is used in order to define the queue destination of this packet.
The immediate interrupt and 1588 actions are defined by the OR of all the matching filters.
Filtering rules for IPv6 packets are:
• If a filter defines at least one of the IP source and destination addresses, then an IPv6 packet
always misses such a filter.
• If a filter masks both the IP source and destination addresses, then an IPv6 packet is compared
against the remaining fields of the filter.
• Tunnelled packets are not matched by the 5-tuple filters.
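The masking and IPv6 rules above can be illustrated with a small software model. This is a sketch under simplifying assumptions: the FiveTuple type and the matches function are hypothetical names, the set of masked fields is passed as field names rather than register bits, and the size threshold, control bits, and VF checks are left out.

```python
from dataclasses import dataclass

FIELDS = ("protocol", "src_ip", "dst_ip", "src_port", "dst_port")


@dataclass
class FiveTuple:
    protocol: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    is_ipv6: bool = False


def matches(filt: FiveTuple, masked: set, pkt: FiveTuple) -> bool:
    """Return True when pkt hits the filter, given the set of masked fields."""
    active = [name for name in FIELDS if name not in masked]
    # If all 5-tuple fields are masked, the filter is not used.
    if not active:
        return False
    # A filter that compares an IP address is always missed by IPv6 packets.
    if pkt.is_ipv6 and ("src_ip" in active or "dst_ip" in active):
        return False
    # The filter is a logical AND of the non-masked fields.
    return all(getattr(filt, name) == getattr(pkt, name) for name in active)
```

For instance, a filter on protocol 6 and destination port 80 with both IP addresses masked is still compared against IPv6 packets, while the same filter with the destination address unmasked never matches them.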
Note:
These filters are not available for VM to VM traffic forwarding.
7.1.1.6
SYN Packet Filters
The 82576 might forward TCP packets whose SYN flag is set into a separate queue. SYN packets are
often used in SYN attacks to load the system with numerous requests for new connections. By filtering
such packets to a separate queue, security software can monitor and act on SYN attacks.
SYN filters are configured via the SYNQF registers as follows:
• Queue En — Enables forwarding of SYN packets to a specific queue.
• Rx Queue field — Contains the destination queue for the packet.
This filter is not to be used in a virtualized environment.
7.1.1.7
Receive-Side Scaling (RSS)
RSS is a mechanism to distribute received packets into several descriptor queues. Software then
assigns each queue to a different processor, sharing the load of packet processing among several
processors.
As described in Section 7.1.1, the 82576 uses RSS as one ingredient in its packet assignment policy
(the others are the various filters and virtualization). The RSS output is an RSS index. The 82576’s global
assignment uses these bits (or only some of the LSBs) as part of the queue number.
RSS is enabled in the MRQC register. The RSS Status field in the descriptor write-back is enabled when
the RXCSUM.PCSD bit is set (fragment checksum is disabled). RSS is therefore mutually exclusive with
UDP fragmentation. Also, support for RSS is not provided when legacy receive descriptor format is
used.
When RSS is enabled, the 82576 provides software with the following information as required by
Microsoft* RSS or for device driver assistance:
• A Dword result of the Microsoft* RSS hash function, to be used by the stack for flow classification,
is written into the receive packet descriptor (required by Microsoft* RSS).
• A 4-bit RSS Type field conveys the hash function used for the specific packet (required by
Microsoft* RSS).
Figure 7-4 shows the process of computing an RSS output:
1. The receive packet is parsed into the header fields used by the hash operation (such as IP
addresses, TCP port, etc.).
2. A hash calculation is performed. The 82576 supports a single hash function, as defined by
Microsoft* RSS. The 82576 does not indicate to the software device driver which hash function is
used. The 32-bit result is fed into the packet receive descriptor.
3. The seven LSBs of the hash result are used as an index into a 128-entry indirection table. Each
entry provides a 4-bit RSS output index.
When RSS is disabled, packets are assigned an RSS output index = zero. System software might enable
or disable RSS at any time. While disabled, system software might update the contents of any of the
RSS-related registers.
When multiple receive queues are enabled in RSS mode, un-decodable packets are assigned an RSS
output index = zero. The 32-bit tag (normally a result of the hash function) equals zero.
Figure 7-4. RSS Block Diagram
7.1.1.7.1
RSS Hash Function
Section 7.1.1.7.3 provides a verification suite used to validate that the hash function is computed
according to Microsoft* nomenclature.
The 82576 hash function follows the Microsoft* definition. A single hash function is defined with several
variations for the following cases:
• TcpIPv4 — The 82576 parses the packet to identify an IPv4 packet containing a TCP segment per
the criteria described later in this section. If the packet is not an IPv4 packet containing a TCP
segment, RSS is not done for the packet.
• IPv4 — The 82576 parses the packet to identify an IPv4 packet. If the packet is not an IPv4 packet,
RSS is not done for the packet.
• TcpIPv6 — The 82576 parses the packet to identify an IPv6 packet containing a TCP segment per
the criteria described later in this section. If the packet is not an IPv6 packet containing a TCP
segment, RSS is not done for the packet.
• TcpIPv6Ex — The 82576 parses the packet to identify an IPv6 packet containing a TCP segment
with extensions per the criteria described later in this section. If the packet is not an IPv6 packet
containing a TCP segment, RSS is not done for the packet. Extension headers should be parsed for
a Home-Address-Option field (for source address) or the Routing-Header-Type-2 field (for
destination address).
• IPv6Ex — The 82576 parses the packet to identify an IPv6 packet. Extension headers should be
parsed for a Home-Address-Option field (for source address) or the Routing-Header-Type-2 field
(for destination address). Note that the packet is not required to contain any of these extension
headers to be hashed by this function. In this case, the IPv6 hash is used. If the packet is not an
IPv6 packet, RSS is not done for the packet.
• IPv6 — The 82576 parses the packet to identify an IPv6 packet. If the packet is not an IPv6 packet,
RSS is not done for the packet.
The following additional cases are not part of the Microsoft* RSS specification:
• UdpIPv4 — The 82576 parses the packet to identify a packet with UDP over IPv4.
• UdpIPv6 — The 82576 parses the packet to identify a packet with UDP over IPv6.
• UdpIPv6Ex — The 82576 parses the packet to identify a packet with UDP over IPv6 with
extensions.
A packet is identified as containing a TCP segment if all of the following conditions are met:
• The transport layer protocol is TCP (not UDP, ICMP, IGMP, etc.).
• The TCP segment can be parsed (such as IP options can be parsed, packet not encrypted).
• The packet is not fragmented (even if the fragment contains a complete TCP header).
Bits[31:16] of the Multiple Receive Queues Command (MRQC) register enable each of the above hash
function variations (several can be set at a given time). If several functions are enabled at the same
time, priority is defined as follows (skip functions that are not enabled):
IPv4 packet:
1. Try using the TcpIPv4 function.
2. Try using the UdpIPv4 function.
3. Try using the IPv4 function.
IPv6 packet:
1. If TcpIPv6Ex is enabled, try using the TcpIPv6Ex function; else if TcpIPv6 is enabled try using the
TcpIPv6 function.
2. If UdpIPv6Ex is enabled, try using the UdpIPv6Ex function; else if UdpIPv6 is enabled, try using the
UdpIPv6 function.
3. If IPv6Ex is enabled, try using the IPv6Ex function, else if IPv6 is enabled, try using the IPv6
function.
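The priority rules above can be expressed as a short selection routine. The sketch below is illustrative only. The function name pick_hash_function is hypothetical, the enabled variations are passed as a set of the names used in this section rather than as MRQC bits, and is_tcp / is_udp are assumed to be true only when the corresponding header could be parsed.

```python
def pick_hash_function(is_ipv6: bool, is_tcp: bool, is_udp: bool, enabled: set):
    """Return the name of the hash variation to use, or None if RSS is not done."""
    if not is_ipv6:
        # IPv4 order: TcpIPv4, then UdpIPv4, then IPv4; disabled ones are skipped.
        candidates = (("TcpIPv4", is_tcp), ("UdpIPv4", is_udp), ("IPv4", True))
        for name, applicable in candidates:
            if applicable and name in enabled:
                return name
        return None
    # IPv6 order: at each step the Ex variation is preferred over the plain one.
    steps = ((("TcpIPv6Ex", "TcpIPv6"), is_tcp),
             (("UdpIPv6Ex", "UdpIPv6"), is_udp),
             (("IPv6Ex", "IPv6"), True))
    for names, applicable in steps:
        if not applicable:
            continue
        for name in names:
            if name in enabled:
                return name
    return None
```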
The following combinations are currently supported:
• Any combination of IPv4, TcpIPv4, and UdpIPv4, and/or
• Any combination of either IPv6, TcpIPv6, and UdpIPv6 or IPv6Ex, TcpIPv6Ex, and UdpIPv6Ex.
When a packet cannot be parsed by the previously mentioned rules, it is assigned an RSS output index
= zero. The 32-bit tag (normally a result of the hash function) equals zero.
The 32-bit result of the hash computation is written into the packet descriptor and also provides an
index into the indirection table.
The following notation is used to describe the hash functions:
• Ordering is little endian in both bytes and bits. For example, the IP address 161.142.100.80
translates into 0xa18e6450 in the signature.
• A “^” denotes a bit-wise XOR operation of same-width vectors.
• @x-y denotes bytes x through y (including both of them) of the incoming packet, where byte 0 is
the first byte of the IP header. In other words, all byte offsets are treated as offsets into a
packet from which the framing layer header has been stripped. Therefore, the source IPv4 address is
referred to as @12-15, while the destination v4 address is referred to as @16-19.
• @x-y, @v-w denotes concatenation of bytes x-y, followed by bytes v-w, preserving the order in
which they occurred in the packet.
All hash function variations (IPv4 and IPv6) follow the same general structure. Specific details for each
variation are described in the following section. The hash uses a random secret key of 320 bits
(40 bytes); the key is typically supplied through the RSS Random Key Register (RSSRK).
The algorithm works by examining each bit of the hash input from left to right. Intel’s nomenclature
defines left and right for a byte-array as follows: Given an array K with k bytes, Intel’s nomenclature
assumes that the array is laid out as shown:
K[0] K[1] K[2] … K[k-1]
K[0] is the left-most byte, and the MSB of K[0] is the left-most bit. K[k-1] is the right-most byte, and
the LSB of K[k-1] is the right-most bit.
ComputeHash(input[], N)
  For hash-input input[] of length N bytes (8N bits) and a random secret key K of 320 bits:
    Result = 0;
    For each bit b in input[], from left to right {
      if (b == 1) then Result ^= (left-most 32 bits of K);
      shift K left 1 bit position;
    }
    return Result;
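The ComputeHash pseudo-code above can be written as a runnable software model. The following is a minimal sketch, not a description of the hardware implementation. The names RSS_KEY and compute_hash are hypothetical; the key bytes and the sample addresses and ports are taken from the RSS verification suite in Section 7.1.1.7.3. Instead of physically shifting K, the sketch extracts the 32-bit window of K that starts at the current input bit position, which is equivalent.

```python
import socket
import struct

# Random key byte-stream from the RSS verification suite (Section 7.1.1.7.3).
RSS_KEY = bytes([
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
])


def compute_hash(data: bytes, key: bytes = RSS_KEY) -> int:
    """Software model of ComputeHash(input[], N) with key K."""
    key_bits = len(key) * 8
    # Every input bit needs a full 32-bit window of the key.
    if len(data) * 8 + 31 > key_bits:
        raise ValueError("hash input is too long for this key")
    k = int.from_bytes(key, "big")  # K[0] is the left-most byte
    result = 0
    for pos in range(len(data) * 8):
        # Walk the input from the left-most bit (MSB of the first byte).
        if data[pos // 8] & (0x80 >> (pos % 8)):
            # Left-most 32 bits of K after pos left shifts.
            result ^= (k >> (key_bits - 32 - pos)) & 0xFFFFFFFF
    return result


# First row of Table 7-2: source 66.9.149.187:2794, destination 161.142.100.80:1766.
src = socket.inet_aton("66.9.149.187")
dst = socket.inet_aton("161.142.100.80")
ports = struct.pack("!HH", 2794, 1766)  # source port, then destination port

ipv4_only = compute_hash(src + dst)
ipv4_tcp = compute_hash(src + dst + ports)
# Table 7-2 lists 0x323e8fc2 (IPv4 only) and 0x51ccc178 (IPv4 with TCP) for this row.
print(hex(ipv4_only), hex(ipv4_tcp))
```

With a 40-byte key, the longest input the sketch accepts is 36 bytes, which is the size of the IPv6 with TCP or UDP input.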
The following pseudo-code examples are intended to help clarify exactly how the hash is to be
performed for IPv4 and IPv6, with and without the ability to parse the TCP or UDP header.
7.1.1.7.1.1
Hash for IPv4 with TCP
Concatenate SourceAddress, DestinationAddress, SourcePort, DestinationPort into one single
byte-array, preserving the order in which they occurred in the packet:
Input[12] = @12-15, @16-19, @20-21, @22-23.
Result = ComputeHash(Input, 12);
7.1.1.7.1.2
Hash for IPv4 with UDP
Concatenate SourceAddress, DestinationAddress, SourcePort, DestinationPort into one single
byte-array, preserving the order in which they occurred in the packet:
Input[12] = @12-15, @16-19, @20-21, @22-23.
Result = ComputeHash(Input, 12);
7.1.1.7.1.3
Hash for IPv4 without TCP
Concatenate SourceAddress and DestinationAddress into one single byte-array
Input[8] = @12-15, @16-19
Result = ComputeHash(Input, 8)
7.1.1.7.1.4
Hash for IPv6 with TCP
Similar to above:
Input[36] = @8-23, @24-39, @40-41, @42-43
Result = ComputeHash(Input, 36)
7.1.1.7.1.5
Hash for IPv6 with UDP
Similar to above:
Input[36] = @8-23, @24-39, @40-41, @42-43
Result = ComputeHash(Input, 36)
7.1.1.7.1.6
Hash for IPv6 without TCP
Input[32] = @8-23, @24-39
Result = ComputeHash(Input, 32)
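The byte ranges used in the six cases above can be gathered with a small helper that implements the @x-y notation. This is an illustrative sketch with hypothetical function names (at, ipv4_hash_input, ipv6_hash_input). It takes a packet that starts at the first byte of the IP header and, like the offsets listed above, assumes the TCP or UDP header directly follows a 20-byte IPv4 header or a 40-byte IPv6 header.

```python
def at(packet: bytes, first: int, last: int) -> bytes:
    """Return bytes first..last (inclusive), i.e. the @first-last notation."""
    if last >= len(packet):
        raise ValueError("packet is too short")
    return packet[first:last + 1]


def ipv4_hash_input(packet: bytes, with_ports: bool) -> bytes:
    """Build the 8-byte (addresses) or 12-byte (addresses + ports) IPv4 input."""
    data = at(packet, 12, 15) + at(packet, 16, 19)
    if with_ports:
        data += at(packet, 20, 21) + at(packet, 22, 23)
    return data


def ipv6_hash_input(packet: bytes, with_ports: bool) -> bytes:
    """Build the 32-byte (addresses) or 36-byte (addresses + ports) IPv6 input."""
    data = at(packet, 8, 23) + at(packet, 24, 39)
    if with_ports:
        data += at(packet, 40, 41) + at(packet, 42, 43)
    return data
```

The resulting byte string is what would be passed to ComputeHash as Input, with its length as N.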
7.1.1.7.2
Indirection Table
The indirection table is a 128-entry structure, indexed by the seven LSBs of the hash function output.
Each entry of the table contains the following:
• Bits [3:0] — RSS index
Note:
In RSS mode, all bits are used. In Next Generation VMDq + RSS mode only bit 0 is used.
System software might update the indirection table during run time. Such updates of the table are not
synchronized with the arrival time of received packets. Therefore, it is not guaranteed that a table
update takes effect on a specific packet boundary.
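The lookup described in this section can be sketched as follows. The function name rss_index and the vmdq_rss flag are hypothetical, and the table is modeled as a plain list of 128 integers rather than as device registers.

```python
def rss_index(hash32: int, table: list, vmdq_rss: bool = False) -> int:
    """Look up the RSS index for a 32-bit hash result."""
    if len(table) != 128:
        raise ValueError("the indirection table has 128 entries")
    # The seven LSBs of the hash select the table entry.
    entry = table[hash32 & 0x7F]
    # RSS mode uses bits [3:0]; Next Generation VMDq + RSS mode uses only bit 0.
    return entry & 0x1 if vmdq_rss else entry & 0xF
```

As an example, with a table filled so that entry i holds i modulo 16, the hash value 0x51ccc178 selects entry 0x78 (120), giving RSS index 8 in RSS mode and 0 in Next Generation VMDq + RSS mode.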
7.1.1.7.3
RSS Verification Suite
Assume that the random key byte-stream is:
0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
7.1.1.7.3.1
IPv4
Table 7-2. IPv4
Destination Address/Port  Source Address/Port   IPv4 Only   IPv4 With TCP
161.142.100.80:1766       66.9.149.187:2794     0x323e8fc2  0x51ccc178
65.69.140.83:4739         199.92.111.2:14230    0xd718262a  0xc626b0ea
12.22.207.184:38024       24.19.198.95:12898    0xd2d0a5de  0x5c2b394a
209.142.163.6:2217        38.27.205.30:48228    0x82989176  0xafc7327f
202.188.127.2:1303        153.39.163.191:44251  0x5d1809c5  0x10e828a2
7.1.1.7.3.2
IPv6
The IPv6 address tuples are only for verification purposes and might not make sense as a tuple.
Table 7-3. IPv6
Destination Address/Port          Source Address/Port                          IPv6 Only   IPv6 With TCP
3ffe:2501:200:3::1 (1766)         3ffe:2501:200:1fff::7 (2794)                 0x2cc18cd5  0x40207d3d
ff02::1 (4739)                    3ffe:501:8::260:97ff:fe40:efab (14230)       0x0f0c461c  0xdde51bbf
fe80::200:f8ff:fe21:67cf (38024)  3ffe:1900:4545:3:200:f8ff:fe21:67cf (44251)  0x4b61e985  0x02d1feef
7.1.1.7.4
Association Through MAC Address
Each of the 24 MAC address filters can be associated with a VF/VM. The VIND field in the Receive
Address High (RAH) register determines the target VM. Packets that do not match any of the MAC filters
(such as promiscuous) are assigned with the default VT.
Software can program different values to the MAC filters (any bits in RAH or RAL) at any time. The
82576 responds to the change on a packet boundary but does not guarantee that the change takes
effect at a precise time.
7.1.2
L2 Packet Filtering
The role of receive packet filtering is to determine which incoming packets are allowed to pass to
the local system and which should be dropped because they are not targeted to
the local system. Received packets can be destined to the host, to a manageability controller (BMC), or
to both. This section describes how host filtering is done, and the interaction with management
filtering.
As shown in Figure 7-5, host filtering has three stages:
1. Packets are filtered by L2 filters (MAC address, unicast/multicast/broadcast). See Section 7.1.2.1
for details.
2. Packets are then filtered by VLAN if a VLAN tag is present. See Section 7.1.2.2 for details.
3. Packets are filtered by the manageability filters (port, IP, flex, other). See Section 10.4.1 for
details.
A packet is not forwarded to the host if any of the following takes place:
• The packet does not pass MAC address filters as described later in this section.
• The packet does not pass VLAN filtering as described later in this section.
• The packet passes manageability filtering and then the manageability filters determine that the
packet should not pass to host as well (see Section 10.4.1 and the MANC2H register).
A packet that passes receive filtering as previously described might still be dropped due to other
reasons. Normally, only good packets are received. These are defined as packets in which no Under
Size Error, Over Size Error, Packet Error, Length Error, or CRC Error is detected. However, if the
store-bad-packet bit is set (FCTRL.SBP), then bad packets that pass the filter function are stored in host
memory. Packet errors are indicated by error bits in the receive descriptor (RDESC.ERRORS). It is
possible to receive all packets, regardless of whether they are bad, by setting the promiscuous enables
and the store-bad-packet bit.
If there is insufficient space in the receive FIFO, hardware drops the packet and indicates the missed
packet in the appropriate statistics registers.
Note:
CRC errors before the SFD are ignored. Any packet must have a valid SFD in order to be
recognized by the 82576 (even bad packets).
Figure 7-5. Rx Filtering Flow Chart
7.1.2.1
MAC Address Filtering
Figure 7-6 shows the MAC address filtering. A packet passes successfully through the MAC address
filtering if any of the following conditions are met:
1. It is a unicast packet and promiscuous unicast filtering is enabled.
2. It is a multicast packet and promiscuous multicast filtering is enabled.
3. It is a unicast packet and it matches one of the unicast MAC filters (host or manageability).
4. It is a multicast packet and it matches one of the multicast filters.
5. It is a broadcast packet and Broadcast Accept Mode (BAM) is enabled. Note that in this case, for
manageability traffic, the packet does not go through VLAN filtering (VLAN filtering is assumed to
match).
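The five pass conditions above can be summarized as a single predicate. The sketch below is a simplified software model with hypothetical names. It treats broadcast, multicast, and unicast as disjoint classes (an assumption; the list above does not say whether promiscuous multicast also admits broadcast packets), identifies multicast by the group bit of the first address octet as defined by IEEE 802.3, and takes the filter match results as precomputed booleans.

```python
BROADCAST = b"\xff" * 6


def passes_mac_filter(dst: bytes, *,
                      promisc_unicast: bool,
                      promisc_multicast: bool,
                      broadcast_accept: bool,
                      unicast_match: bool,
                      multicast_match: bool) -> bool:
    """Return True if the destination MAC address passes MAC address filtering."""
    if len(dst) != 6:
        raise ValueError("a MAC address has 6 bytes")
    if dst == BROADCAST:
        # Condition 5: broadcast packets pass when BAM is enabled.
        return broadcast_accept
    if dst[0] & 0x01:
        # Conditions 2 and 4: multicast (group bit set in the first octet).
        return promisc_multicast or multicast_match
    # Conditions 1 and 3: unicast.
    return promisc_unicast or unicast_match
```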
Figure 7-6. MAC Address Rx Filtering Flow Chart
7.1.2.1.1
Unicast Filter
The entire MAC address is checked against the 16 host unicast addresses and four management unicast
addresses (if enabled). The 16 host unicast addresses are controlled by the host interface (the MC must
not change them). The other four addresses are dedicated to management functions and are only
accessed by the BMC. The destination address of an incoming packet must exactly match one of the
preconfigured host address filters or the manageability address filters. These addresses can be unicast or
multicast. Those filters are configured through RAL, RAH, MMAL, and MMAH registers.
Promiscuous Unicast — Receive all unicasts. Promiscuous unicast mode can be set/cleared only through
the host interface (not by the BMC) and it is usually used when the 82576 is used as a sniffer.
Unicast Hash Table — Destination address matching the Unicast Hash Table (UTA).
7.1.2.1.2
Multicast Filter (Partial)
A 12-bit portion of the incoming packet's multicast address must exactly match the Multicast Filter Address
(MFA) in order to pass multicast filtering. These 12 bits of the 48-bit destination address can be
selected by the MO field of RCTL (Section 8.10.1). These entries can be configured only by the host
interface and cannot be controlled by the BMC. Packets received according to this mode have the PIF bit
in the descriptor set to indicate imperfect filtering that should be validated by the software device
driver.
Promiscuous Multicast — Receive all multicast. Promiscuous multicast mode can be set/cleared only
through the host interface (not by the BMC) and it is usually used when the 82576 is used as a sniffer.
Note:
When the promiscuous bit is set and a multicast packet is received, the PIF bit of the packet
status is not set.
7.1.2.2
VLAN Filtering
A receive packet that successfully passed MAC address filtering is then subjected to VLAN header
filtering.
1. If the packet does not have a VLAN header, it passes to the next filtering stage.
Note:
If extended VLAN is enabled (CTRL_EXT.EXTENDED_VLAN is set), it is assumed that the
first VLAN tag is an extended VLAN and it is skipped. All next stages refer to the second
VLAN.
2. If the packet has a VLAN header and it passes a valid manageability VLAN filter, then it passes to
the next filtering stage.
3. If VLAN filtering is disabled (RCTL.VFE bit is cleared), the packet is forwarded to the next filtering
stage.
4. If the packet has a VLAN header, and it matches an enabled host VLAN filter, the packet is
forwarded to the next filtering stage.
5. If the packet has a VLAN header and MANC.Bypass VLAN is set, the packet is forwarded to the next
filtering stage, but is a candidate for manageability forwarding only.
6. Otherwise, the packet is dropped.
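The six steps above form an ordered decision chain, which can be sketched as shown below. The function name vlan_filter, its arguments, and the returned strings are hypothetical. The sketch assumes that, with extended VLAN enabled, the caller has already skipped the first tag, so has_vlan and the match flags refer to the second VLAN tag.

```python
def vlan_filter(has_vlan: bool,
                mng_vlan_match: bool,
                vfe: bool,
                host_vlan_match: bool,
                bypass_vlan: bool) -> str:
    """Return 'pass', 'manageability_only', or 'drop' for a received packet."""
    if not has_vlan:          # step 1: no VLAN header
        return "pass"
    if mng_vlan_match:        # step 2: valid manageability VLAN filter
        return "pass"
    if not vfe:               # step 3: RCTL.VFE cleared, VLAN filtering disabled
        return "pass"
    if host_vlan_match:       # step 4: enabled host VLAN filter
        return "pass"
    if bypass_vlan:           # step 5: MANC.Bypass VLAN set
        return "manageability_only"
    return "drop"             # step 6
```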
Figure 7-7 shows the VLAN filtering flow.
Figure 7-7. VLAN Filtering
7.1.2.3
Manageability Filtering
Manageability filtering is described in Section 10.4.1.
Figure 7-8 shows the manageability portion of the packet filtering. It is included here to complete the
description of the receive packet filtering functionality.
Figure 7-8. Manageability Filtering
Note:
The manageability engine might decide to snoop or redirect part of the received packets,
according to the external MC instructions and the EEPROM settings.
7.1.3
Receive Data Storage
7.1.3.1
Host Buffers
Each descriptor points to one or more memory buffers that are designated by the software device
driver to store packet data.
The size of the buffer can be set using either the generic RCTL.BSIZE field, or the per queue
SRRCTL[n].BSIZEPACKET field.
The receive buffer size is selected by bit settings in the Receive Control register (RCTL.BSIZE). The register
supports buffer sizes of 256, 512, 1024, and 2048 bytes. See section 12.7.1 for details.
If SRRCTL[n].BSIZEPACKET is set to zero for any queue, the buffer size defined by RCTL.BSIZE is used.
Otherwise, the buffer size defined by SRRCTL[n].BSIZEPACKET is used.
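The selection rule above reduces to a simple fallback, sketched here. The function name effective_rx_buffer_size is hypothetical, and both arguments are assumed to be buffer sizes already decoded to bytes; the register field encodings are not modeled.

```python
def effective_rx_buffer_size(rctl_bsize: int, srrctl_bsizepacket: int) -> int:
    """Return the buffer size used for a queue, in bytes."""
    # A per-queue value of zero means the generic RCTL.BSIZE setting applies.
    return srrctl_bsizepacket if srrctl_bsizepacket != 0 else rctl_bsize
```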
For advanced descriptor usage, the SRRCTL.BSIZEHEADER field is used to define the size of the buffers
allocated to headers. The maximum buffer size supported is 960 bytes.
The 82576 places no alignment restrictions on receive memory buffer addresses. This is desirable in
situations where the receive buffer was allocated by higher layers in the networking software stack, as
these higher layers might have no knowledge of a specific device's buffer alignment requirements.
Note:
When the No-Snoop Enable bit is used in advanced descriptors, the buffer address is 16-bit
(2-byte) aligned.
7.1.3.2
On-Chip Rx Buffers
The 82576 contains a 64 KB packet buffer that can be used to store packets until they are
forwarded to the host.
In addition, to support the forwarding of local packets as described in Section 7.10.3, a switch buffer of
20 KB is provided. This buffer serves as a receive buffer for all the local traffic.
7.1.3.3
On-Chip descriptor Buffers
The 82576 contains a 32-descriptor cache for each receive queue. The cache reduces the latency of packet
processing and optimizes the usage of PCIe bandwidth by fetching and writing back descriptors in
bursts. The fetch and write-back algorithms are described in Section 7.1.6 and Section 7.1.7.
7.1.4
Legacy Receive Descriptor Format
A receive descriptor is a data structure that contains the receive data buffer address and fields for
hardware to store packet information. If SRRCTL[n].DESCTYPE = 000b, the 82576 uses the legacy Rx
descriptor as shown in Table 7-4. The shaded areas indicate fields that are modified by hardware upon
packet reception (so-called descriptor write-back).
Note:
Legacy descriptors must not be used when advanced features such as virtualization or
security features are activated.
Table 7-4.
Legacy Receive Descriptor (RDESC) Layout
Offset   63:48      47:40    39:32    31:16               15:0
0        Buffer Address [63:0]
8        VLAN Tag   Errors   Status   Fragment Checksum   Length
After receiving a packet for the 82576, hardware stores the packet data into the indicated buffer and
writes the length, fragment checksum, status, errors, and VLAN tag fields.
Packet Buffer Address (64) — Physical address of the packet buffer.
Length Field (16)
Length covers the data written to a receive buffer including CRC bytes (if any). Software must read
multiple descriptors to determine the complete length for a packet that spans multiple receive buffers.
Fragment Checksum (16)
This field is used to provide the fragment checksum value. This field equals the unadjusted 16-bit
one's complement checksum of the packet. Checksum calculation starts at the L4 layer (after the IP
header) and runs until the end of the packet, excluding the CRC bytes. In order to use the fragment
checksum assist to offload L4 checksum verification, software might need to back out some of the bytes
in the packet. For more details see Section 7.1.10.2.
Status Field (8)
Status information indicates whether the descriptor has been used and whether the referenced buffer is
the last one for the packet. See Table 7-5 for the layout of the Status field. Error status information is
shown in Table 7-8.
Table 7-5.
Receive Status (RDESC.STATUS) Layout
7     6      5      4       3    2     1     0
PIF   IPCS   L4CS   UDPCS   VP   Rsv   EOP   DD
• PIF (bit 7) — Passed inexact filter
• IPCS (bit 6) — IPv4 checksum calculated on packet
• L4CS (bit 5) — L4 (UDP or TCP) checksum calculated on packet
• UDPCS (bit 4) — UDP checksum calculated on packet
• VP (bit 3) — Packet is 802.1q (matched VET); indicates that the VLAN tag was stripped from the 802.1q packet
• RSV (bit 2) — Reserved
• EOP (bit 1) — End of packet
• DD (bit 0) — Descriptor done
EOP and DD
The following table lists the meaning of these bits:
Table 7-6.
Receive Status Bits
DD   EOP   Description
0b   0b    Software setting of the descriptor when it hands it off to the hardware.
0b   1b    Reserved (invalid option).
1b   0b    A completion status indication for a non-last descriptor of a packet that spans across
           multiple descriptors. In a single packet case, DD indicates that the hardware is done with
           the descriptor and its buffers. Only the Length fields are valid on this descriptor.
1b   1b    A completion status indication of the entire packet. Note that software might take ownership
           of its descriptors. All fields in the descriptor are valid (reported by the hardware).
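As a rough sketch of how a driver might use the DD and EOP bits together with the Length field, the C fragment below mirrors the layout of Table 7-4 and sums the lengths of a packet that spans several descriptors. The struct, macro, and function names are hypothetical, a little-endian host is assumed, and a real driver would also need volatile accesses or memory barriers that are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy receive descriptor, fields listed from bit 0 upward
 * (assumes a little-endian host; names are illustrative). */
struct legacy_rx_desc {
    uint64_t buffer_addr;  /* offset 0: Buffer Address [63:0] */
    uint16_t length;       /* offset 8, bits 15:0  */
    uint16_t frag_csum;    /* bits 31:16 */
    uint8_t  status;       /* bits 39:32 */
    uint8_t  errors;       /* bits 47:40 */
    uint16_t vlan_tag;     /* bits 63:48 */
};

#define RXD_STAT_DD   0x01u  /* bit 0: descriptor done */
#define RXD_STAT_EOP  0x02u  /* bit 1: end of packet   */

/*
 * Walk the ring from index 'first' and add up Length until EOP.
 * Returns the total packet length and stores the number of descriptors
 * used in *used, or returns -1 if hardware has not finished the packet.
 */
static long legacy_rx_packet_length(const struct legacy_rx_desc *ring,
                                    size_t ring_size, size_t first,
                                    size_t *used)
{
    long total = 0;

    for (size_t n = 0; n < ring_size; n++) {
        const struct legacy_rx_desc *d = &ring[(first + n) % ring_size];

        if (!(d->status & RXD_STAT_DD))
            return -1;               /* still owned by hardware */
        total += d->length;
        if (d->status & RXD_STAT_EOP) {
            *used = n + 1;
            return total;
        }
    }
    return -1;                       /* no EOP found in the ring */
}
```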
VP Field
The VP field indicates whether the incoming packet's type matches VET. For example, if the packet is a
VLAN (802.1q) type, it is set if the packet type matches VET and CTRL.VME is set. It also indicates that
the VLAN tag has been stripped from the 802.1q packet. For more details, see Section 7.4.
IPCS (Ipv4 Checksum), L4CS (L4 Checksum), and UDPCS (UDP Checksum)
The meaning of these bits is shown in the table below:
Table 7-7.
IPCS, L4CS, and UDPCS
L4CS   UDPCS   IPCS      Functionality
0b     0b      0b        Hardware does not provide checksum offload. Special case: hardware does not
                         provide UDP checksum offload for an IPv4 packet with UDP checksum = 0b.
1b     0b      1b / 0b   Hardware provides IPv4 checksum offload if IPCS is active, and TCP checksum
                         offload. A pass/fail indication is provided in the Error field (IPE and L4E).
0b     1b      1b / 0b   Hardware provides IPv4 checksum offload if IPCS is active, and UDP checksum
                         offload. A pass/fail indication is provided in the Error field (IPE and L4E).
Refer to Table 7-18 for a description of supported packet types for receive checksum offloading.
Unsupported packet types do not have the IPCS or L4CS bits set. IPv6 packets do not have the IPCS bit
set, but might have the L4CS bit set if the 82576 recognized the TCP or UDP packet.
PIF
Hardware supplies the PIF field to expedite software processing of packets. Software must examine any
packet with PIF set to determine whether to accept the packet. If PIF is clear, then the packet is known
to be for this station, so software need not look at the packet contents. Multicast packets passing only
the Multicast Vector (MTA) or unicast packets passing only the Unicast Hash Table (UTA) but not any of
the MAC address exact filters (RAH, RAL) have PIF set. In addition, the following conditions cause PIF to
be cleared:
• The DA of the packet is a multicast address and promiscuous multicast is set (RCTL.MPE = 1b).
• The DA of the packet is a broadcast address and accept broadcast mode is set (RCTL.BAM = 1b).
A MAC control frame forwarded to the host (RCTL.PMCF = 0b) that does not match any of the exact
filters, has the PIF bit set.
Error Field (8)
Most error information appears only when the store-bad-packet bit (RCTL.SBP) is set and a bad packet
is received. See Table 7-8 for a definition of the possible errors and their bit positions.
Table 7-8.
RXE, IPE, L4E
7     6     5     4:0
RXE   IPE   L4E   Reserved
• RXE (bit 7) — RX Data Error
• IPE (bit 6) — IPv4 Checksum Error
• L4E (bit 5) — TCP/UDP Checksum Error
• Reserved (bits 4:0)
IPE/L4E
The IP and TCP checksum error bits from Table 7-8 are valid only when the IPv4 or TCP/UDP
checksum(s) is performed on the received packet as indicated via IPCS and L4CS. These, along with the
other error bits, are valid only when the EOP and DD bits are set in the descriptor.
Note:
Receive checksum errors have no effect on packet filtering.
If receive checksum offloading is disabled (RXCSUM.IPOFL and RXCSUM.TUOFL), the IPE and L4E bits
are 0b.
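A minimal sketch of how software might combine the status bits of Table 7-5 with the error bits of Table 7-8 is shown next. The enum, macro, and function names are hypothetical; the logic only restates the rules above (error bits are meaningful only when DD and EOP are set and the matching "checksum calculated" bit is set).

```c
#include <stdint.h>

/* Status bits (Table 7-5) and error bits (Table 7-8); names are illustrative. */
#define RXD_STAT_DD     0x01u
#define RXD_STAT_EOP    0x02u
#define RXD_STAT_UDPCS  0x10u
#define RXD_STAT_L4CS   0x20u
#define RXD_STAT_IPCS   0x40u
#define RXD_ERR_L4E     0x20u
#define RXD_ERR_IPE     0x40u

enum rx_csum { RX_CSUM_NOT_CHECKED, RX_CSUM_GOOD, RX_CSUM_BAD };

/* True only for the last descriptor of a completed packet. */
static int rx_desc_is_last_done(uint8_t status)
{
    return (status & (RXD_STAT_DD | RXD_STAT_EOP)) ==
           (RXD_STAT_DD | RXD_STAT_EOP);
}

static enum rx_csum rx_ipv4_csum(uint8_t status, uint8_t errors)
{
    if (!rx_desc_is_last_done(status) || !(status & RXD_STAT_IPCS))
        return RX_CSUM_NOT_CHECKED;
    return (errors & RXD_ERR_IPE) ? RX_CSUM_BAD : RX_CSUM_GOOD;
}

static enum rx_csum rx_l4_csum(uint8_t status, uint8_t errors)
{
    if (!rx_desc_is_last_done(status) ||
        !(status & (RXD_STAT_L4CS | RXD_STAT_UDPCS)))
        return RX_CSUM_NOT_CHECKED;
    return (errors & RXD_ERR_L4E) ? RX_CSUM_BAD : RX_CSUM_GOOD;
}
```

Software that receives RX_CSUM_NOT_CHECKED would fall back to verifying the checksum itself.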
RXE
The RXE error bit is asserted in one of two cases (software might distinguish between these errors by
monitoring the respective statistics registers):
1. A CRC error is detected. A CRC error can result from reception of a /V/ symbol on the TBI interface
(see section 3.5.3.3.2), assertion of RxERR on the MII/GMII interface, a bad EOP, or loss of sync
during packet reception. Packets with a CRC error are posted to host memory only when the
store-bad-packet bit (RCTL.SBP) is set.
2. Hardware checks the data integrity when received packets are fetched from its internal packet
buffer (see Section 7.6 for details). Packets with integrity errors are posted to host memory
regardless of store-bad-packet setting (RCTL.SBP).
VLAN Tag Field (16)
Hardware stores additional information in the receive descriptor for 802.1q packets. If the packet type
is 802.1q (determined when a packet matches VET and CTRL.VME = 1b), then the VLAN Tag field
records the VLAN information and the four-byte VLAN information is stripped from the packet data
storage. Otherwise, the VLAN Tag field contains 0x0000. The rule for VLAN tag is to use network
ordering (also called big endian). It appears in the following manner in the descriptor:
Table 7-9.
VLAN Tag Field Layout (for 802.1q Packet)
15:13   12    11:0
PRI     CFI   VLAN
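The field split of Table 7-9 can be illustrated with a small C helper. The struct and function names are hypothetical, and the sketch assumes the 16-bit VLAN Tag field has already been read from the descriptor as a host-order value whose bit 15 is the top bit of PRI.

```c
#include <stdint.h>

/* Decoded VLAN Tag field (names are illustrative). */
struct vlan_fields {
    uint8_t  pri;   /* bits 15:13, user priority */
    uint8_t  cfi;   /* bit 12 */
    uint16_t vlan;  /* bits 11:0, VLAN identifier */
};

static struct vlan_fields vlan_tag_decode(uint16_t tag)
{
    struct vlan_fields f;

    f.pri  = (uint8_t)((tag >> 13) & 0x7u);
    f.cfi  = (uint8_t)((tag >> 12) & 0x1u);
    f.vlan = (uint16_t)(tag & 0x0FFFu);
    return f;
}
```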
7.1.5
Advanced Receive Descriptors
7.1.5.1
Advanced Receive Descriptors (Read Format)
Table 7-10 shows the receive descriptor. This is the format that software writes to the descriptor queue
and hardware reads from the descriptor queue in host memory. Hardware writes back the descriptor in
a different format, shown in Table 7-11.
Table 7-10.
Descriptor Read Format
Offset   63:1                           0
0        Packet Buffer Address [63:1]   A0/NSE
8        Header Buffer Address [63:1]   DD
Packet Buffer Address (64) — Physical address of the packet buffer. The lowest bit is either A0 (LSB of
address) or NSE (No-Snoop Enable), depending on bit RXCTL.RXdataWriteNSEn of the relevant queue.
See Section 8.13.1.
Header Buffer Address (64) — Physical address of the header buffer. The lowest bit is DD.
Note:
The 82576 does not support null descriptors (descriptors in which a packet or header address is
always equal to zero).
When software sets the NSE bit, the 82576 places the received packet associated with this descriptor in
memory at the packet buffer address with NSE set in the PCIe attribute fields. NSE does not affect the
data written to the header buffer address.
When a packet spans more than one descriptor, the header buffer address is not used for the second,
third, etc. descriptors; only the packet buffer address is used in this case.
NSE is enabled for packet buffers that the software device driver knows have not been touched by the
processor since the last time they were used, so the data cannot be in the processor cache and snoop is
always a miss. Avoiding these snoop misses improves system performance. No-snoop is particularly
useful when the DMA engine is moving the data from the packet buffer into application buffers, and the
software device driver is using the information in the header buffer for its work with the packet.
Note:
When No-Snoop Enable is used, relaxed ordering should also be enabled with
CTRL_EXT.RO_DIS.
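The read format can be sketched as a two-quadword structure that software fills before handing the descriptor to hardware. The names below are hypothetical. The flag bit0_is_nse stands for the queue being configured (through RXCTL.RXdataWriteNSEn) so that bit 0 of the packet address is interpreted as NSE; in that case the sketch requires a 2-byte aligned packet buffer, as noted in Section 7.1.3.1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Advanced receive descriptor, read format (names are illustrative). */
struct adv_rx_desc_read {
    uint64_t pkt_addr;  /* offset 0: packet buffer address, bit 0 = A0 or NSE */
    uint64_t hdr_addr;  /* offset 8: header buffer address, bit 0 = DD */
};

/*
 * Returns 0 on success, -1 if the packet buffer is not 2-byte aligned
 * while bit 0 is used as the No-Snoop Enable flag.
 */
static int adv_rx_desc_prepare(struct adv_rx_desc_read *d,
                               uint64_t pkt_phys, uint64_t hdr_phys,
                               bool bit0_is_nse, bool no_snoop)
{
    if (bit0_is_nse) {
        if (pkt_phys & 1u)
            return -1;
        d->pkt_addr = pkt_phys | (no_snoop ? 1u : 0u);
    } else {
        d->pkt_addr = pkt_phys;          /* bit 0 is the address LSB (A0) */
    }
    d->hdr_addr = hdr_phys & ~UINT64_C(1);  /* DD starts out cleared */
    return 0;
}
```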
7.1.5.2
Advanced Receive Descriptors — Writeback Format
When the 82576 writes back the descriptors, it uses the descriptor format shown in Table 7-11.
Note:
SRRCTL[n].DESCTYPE must be set to a value other than 000b for the 82576 to write back
the special descriptors.
Table 7-11.
Descriptor Write-Back Format
Offset 0:
63:32                                31    30:21     20:17   16:4          3:0
RSS Hash Value / Fragment Checksum   SPH   HDR_LEN   RSV     Packet Type   RSS Type
and IP identification
Offset 8:
63:48      47:32     31:20            19:0
VLAN Tag   PKT_LEN   Extended Error   Extended Status
RSS Type (4)
Table 7-12.
RSS Type
Packet Type   Description
0x0           No hash computation done for this packet.
0x1           HASH_TCP_IPV4
0x2           HASH_IPV4
0x3           HASH_TCP_IPV6
0x4           HASH_IPV6_EX
0x5           HASH_IPV6
0x6           HASH_TCP_IPV6_EX
0x7           HASH_UDP_IPV4
0x8           HASH_UDP_IPV6
0x9           HASH_UDP_IPV6_EX
0xA:0xF       Reserved
The 82576 must identify the packet type and then choose the appropriate RSS hash function to be used
on the packet. The RSS type reports the packet type that was used for the RSS hash function.
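For debugging or statistics, the RSS Type encoding of Table 7-12 could be mapped to names with a simple lookup, as in the sketch below. The function name and the "NONE" and "RESERVED" strings are illustrative choices, not identifiers defined by this document.

```c
#include <stddef.h>

/* Map the 4-bit RSS Type field to the names used in Table 7-12. */
static const char *rss_type_name(unsigned rss_type)
{
    static const char *const names[] = {
        "NONE",              /* 0x0: no hash computation done */
        "HASH_TCP_IPV4",     /* 0x1 */
        "HASH_IPV4",         /* 0x2 */
        "HASH_TCP_IPV6",     /* 0x3 */
        "HASH_IPV6_EX",      /* 0x4 */
        "HASH_IPV6",         /* 0x5 */
        "HASH_TCP_IPV6_EX",  /* 0x6 */
        "HASH_UDP_IPV4",     /* 0x7 */
        "HASH_UDP_IPV6",     /* 0x8 */
        "HASH_UDP_IPV6_EX",  /* 0x9 */
    };

    if (rss_type < sizeof(names) / sizeof(names[0]))
        return names[rss_type];
    return "RESERVED";       /* 0xA:0xF */
}
```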
Packet Type (13)
• VPKT (bit 12) — VLAN Packet indication
The 12 LSB bits of the packet type report the packet type identified by the hardware as follows:
Table 7-13.
LSB Bits
Bit Index   Bit 11 = 0b                               Bit 11 = 1b (L2 packet)
0           IPV4 — IPv4 header present                EtherType — ETQF register index that matches the
1           IPV4E — IPv4 header includes extensions   packet. Special types might be defined for 1588,
2           IPV6 — IPv6 header present                802.1X, or any other requested type.
3           IPV6E — IPv6 header includes extensions   Reserved — for future expansion of ETQF
4           TCP — TCP header present
5           UDP — UDP header present                  Reserved
6           SCTP — SCTP header present                Reserved
7           NFS — NFS header present                  Reserved
8           IPSec ESP – IPSec encapsulation¹          Reserved
9           IPSec AH – IPSec encapsulation            Reserved
10          MACSec – MACSec encapsulation             MACSec – MACSec encapsulation
1. IPsec functionality not available in 82576NS.
RSV(5) — Reserved.
HDR_LEN (10) — The length (bytes) of the header as parsed by the 82576. In split mode when HBO is
set, the HDR_LEN can be greater than zero though nothing is written to the header buffer. In header
replication mode (SPH is also set in this mode), this does not reflect the size of the data actually stored
in the header buffer because the 82576 fills the buffer up to the size configured by
SRRCTL[n].BSIZEHEADER, which might be larger than the header size reported here. This field is only
valid in the first descriptor of a packet and should be ignored in all subsequent descriptors.
Packet types supported by the header split and header replication are listed later. Other packet types
are posted sequentially in the host packet buffer. Each line in the following table has an enable bit in the
PSRTYPE register. When one of the bits is set, the corresponding packet type is split. If the bit is not
set, a packet matching the header layout is not split.
Header split and replication is described in Section 7.1.9 while the packet types for this functionality are
enabled by the PSRTYPE[n] registers (Section 8.10.3).
Note:
The header of a fragmented IPv6 packet is defined before the fragmented extension header.
SPH (1) — Split Header — When set, indicates that the HDR_LEN field reflects the length of the header
found by hardware. If cleared, the HDR_LEN field should be ignored. The exception is the split, always
use header buffer mode with PKT_LEN = 0, in which case HDR_LEN reflects the size of the packet even
if SPH is cleared.
RSS Hash / Fragment Checksum (32)
This field has multiplexed functionality according to the received packet type (reported on the Packet
Type field in this descriptor) and device setting.
Fragment Checksum (16-Bit; 63:48)
The fragment checksum word contains the unadjusted one’s complement checksum of the IP payload
and is used to offload checksum verification for fragmented UDP packets as described in
Section 7.1.10.2. This field is mutually exclusive with the RSS hash. It is enabled when the
RXCSUM.PCSD bit is cleared and the RXCSUM.IPPCSE bit is set.
IP identification (16-Bit; 47:32)
The IP identification word identifies the IP packet to which this fragment belongs and is used to offload
checksum verification for fragmented UDP packets as described in Section 7.1.10.2. This field is
mutually exclusive with the RSS hash. It is enabled when the RXCSUM.PCSD bit is cleared and the
RXCSUM.IPPCSE bit is set.
RSS Hash Value (32)
The RSS hash value is required for RSS functionality as described in Section 7.1.1.7. This field is mutually
exclusive with the IP identification and the fragment checksum. It is enabled when the RXCSUM.PCSD
bit is set.
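Because bits 63:32 of the first write-back quadword are multiplexed, software has to interpret them according to its own RXCSUM setting. The sketch below assumes the caller passes the first quadword as a host-order 64-bit value together with the PCSD setting it programmed; the type and function names are hypothetical, and the IPPCSE condition is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpretation of bits 63:32 of write-back quadword 0 (illustrative). */
struct rx_hash_or_csum {
    bool     is_rss_hash;  /* true when RXCSUM.PCSD is set */
    uint32_t rss_hash;     /* bits 63:32, valid if is_rss_hash */
    uint16_t frag_csum;    /* bits 63:48, valid if !is_rss_hash */
    uint16_t ip_id;        /* bits 47:32, valid if !is_rss_hash */
};

static struct rx_hash_or_csum rx_decode_hash_or_csum(uint64_t qword0,
                                                     bool rxcsum_pcsd)
{
    struct rx_hash_or_csum r = { rxcsum_pcsd, 0, 0, 0 };
    uint32_t upper = (uint32_t)(qword0 >> 32);

    if (rxcsum_pcsd) {
        r.rss_hash = upper;
    } else {
        r.frag_csum = (uint16_t)(upper >> 16);
        r.ip_id     = (uint16_t)(upper & 0xFFFFu);
    }
    return r;
}
```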
Extended Status (20)
Status information indicates whether the descriptor has been used and whether the referenced buffer is
the last one for the packet. Table 7-14 lists the extended status word in the last descriptor of a packet
(EOP is set). Table 7-15 lists the extended status word in any descriptor but the last one of a packet
(EOP is cleared).
Table 7-14.
Receive Status (RDESC.STATUS) Layout of the Last Descriptor
19    18   17     16   15:13      12          11      10     9      8
Rsv   LB   SECP   TS   Reserved   Strip CRC   LLINT   UDPV   VEXT   Rsv
7     6      5     4       3    2     1     0
PIF   IPCS   L4I   UDPCS   VP   Rsv   EOP   DD
Table 7-15.
Receive Status (RDESC.STATUS) Layout of Non-Last Descriptor
19:2       1          0
Reserved   EOP = 0b   DD
TS (16) — Time Stamped Packet (Time Sync). The Time Stamp bit is set when the device recognized a
Time Sync packet. In such a case the hardware captures its arrival time and stores it in the “Time
Stamp” register.
Note:
If TSYNCRXCTL.TYPE=100b, all the packets are time stamped; however, this bit is never set
as the time stamp value is not locked.
Reserved (2, 8, 15:13, 19) — Reserved at zero.
PIF (7), IPCS(6), UDPCS(4), VP(3), EOP (1), DD (0) — These bits are described in the legacy descriptor
format in Section 7.1.4.
L4I (5) — This bit indicates that an L4 integrity check was done on the packet, either TCP checksum,
UDP checksum or SCTP CRC checksum. This bit is valid only for the last descriptor of the packet. An
error in the integrity check is indicated by the L4E bit in the error field. The type of check done can be
deduced from packet type bits 4, 5, and 6. If bit 4 is set, a TCP checksum was done. If bit 5 is set, a
UDP checksum was done, and if bit 6 is set, an SCTP CRC check was done.
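The deduction described for L4I can be written out as follows. This is a sketch with hypothetical names; it takes the Extended Status and Packet Type fields as already extracted values and treats packets with Packet Type bit 11 set (L2 packets) as having no L4 check, which is an assumption based on Table 7-13.

```c
#include <stdint.h>

#define RXD_EXT_STAT_L4I  (1u << 5)   /* Extended Status bit 5 */
#define RXD_PKT_TYPE_TCP  (1u << 4)   /* Packet Type bit 4 */
#define RXD_PKT_TYPE_UDP  (1u << 5)   /* Packet Type bit 5 */
#define RXD_PKT_TYPE_SCTP (1u << 6)   /* Packet Type bit 6 */
#define RXD_PKT_TYPE_L2   (1u << 11)  /* Packet Type bit 11: L2 packet */

enum l4_check { L4_CHECK_NONE, L4_CHECK_TCP, L4_CHECK_UDP, L4_CHECK_SCTP };

/* Which L4 integrity check did hardware perform, if any? */
static enum l4_check rx_l4_check_done(uint32_t ext_status, uint32_t pkt_type)
{
    if (!(ext_status & RXD_EXT_STAT_L4I) || (pkt_type & RXD_PKT_TYPE_L2))
        return L4_CHECK_NONE;
    if (pkt_type & RXD_PKT_TYPE_TCP)
        return L4_CHECK_TCP;
    if (pkt_type & RXD_PKT_TYPE_UDP)
        return L4_CHECK_UDP;
    if (pkt_type & RXD_PKT_TYPE_SCTP)
        return L4_CHECK_SCTP;
    return L4_CHECK_NONE;
}
```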
VEXT (9)- First VLAN is found on a double VLAN packet. This bit is valid only when
CTRL_EXT.EXTENDED_VLAN is set. For more details see Section 7.4.5.
UDPV (10) — This bit indicates that the incoming packet contains a valid (non-zero value) checksum
field in an incoming fragmented UDP IPv4 packet. This means that the Fragment Checksum field in the
receive descriptor contains the UDP checksum as described in Section 7.1.10.2. When this field is
cleared in the first fragment that contains the UDP header, the packet does not contain a
valid UDP checksum and the checksum field in the Rx descriptor should be ignored. This field is always
cleared in incoming fragments that do not contain the UDP header.
LLINT (11) — This bit indicates that the packet caused an immediate interrupt via the low latency
interrupt mechanism.
SECP (17) — The security processing bit indicates that hardware identified the security encapsulation
and processed it as configured.
• MACSec processing: This bit is set each time MACSec processing of the packet was attempted (such
as when a MACSec header was found and MACSec offload is enabled), regardless of whether a matched
SA was found. This bit is not set for clear packets even if they have a MACSec header (such as SECP
packets).
• IPsec processing: This bit is set only if a matched SA was found. Note that hardware does not
process packets with the IPv4 option or IPv6 extension header and SECP is not set.1
LB (18) - This bit provides a loopback status indication meaning that this packet is sent by a local
virtual machine (VM-to-VM switch indication).
Extended Error (12)
Table 7-16 and the text that follows describe the possible errors reported by hardware.
Table 7-16.
Receive Errors (RDESC.ERRORS) Layout
11    10    9     8:7      6:4        3     2:0
RXE   IPE   L4E   SECERR   Reserved   HBO   Reserved
RXE (bit 11) — RXE is described in the legacy descriptor format in Section 7.1.4.
IPE (bit 10) — The IPE error indication is described in the legacy descriptor format in Section 7.1.4.
L4E (bit 9) — L4 error indication — When set, indicates that hardware attempted to do an L4 integrity
check as described in the L4I bit, but the check failed.
Security Error (bit 8:7)
MACSec Status
Indicates potential errors in the MACSec processing according to the following encoding.
Code   Error Type
00b    No error
01b    No SA match
10b    Replay error
11b    Bad signature
IPSec Status (Valid Only if SA Match, Else Zero)2
1. IPsec functionality not available in 82576NS.
2. IPsec functionality not available in the 82576NS.
Indicates potential errors in the IPSec processing according to the following encoding:
Code   Error Type
00b    No error, either no SA match (SECP is cleared), or the incoming packet was successfully
       authenticated by hardware.¹
01b    Invalid IPsec protocol, the Protocol field value included in the IP header (or in the IP next
       header for IPv6) does not match the PROTO field stored in the corresponding Rx SA entry.
10b    Packet length error, ESP packet is not 4-bytes aligned or the AH/ESP header is truncated (for
       example, a 28-byte IPv4 packet with IPv4 header and ESP header that contains only SPI and SN)
       or AH Length field content in AH header is not valid (i.e. not equal to 0x07 for IPv4 or to
       0x08 for IPv6).
11b    Authentication failed. For example, the computed ICV field does not match the ICV field
       included in the packet.
1. For incoming IPv4 packets where the Protocol field is AH/ESP, and for which any IPv4 option is present; or for incoming IPv6
packets where there is an AH/ESP extension header together with any other extension header (even another AH/ESP extension
header), no IPsec or Layer4 offload is performed by hardware and the packet is passed to software with the SECP bit cleared (such
as no SA match) - without performing any SA lookup.
Reserved (bit 6:4)
HBO (bit 3) — Header Buffer Overflow
Note:
This bit is relevant only if SPH is set.
1. In both header replication modes, HBO is set if the header size (as calculated by hardware) is
bigger than the allocated buffer size (SRRCTL.BSIZEHEADER) but the replication still takes place up
to the header buffer size. Hardware sets this bit in order to indicate to software that it needs to
allocate bigger buffers for the headers.
2. In header split mode, when SRRCTL[n].BSIZEHEADER is smaller than HDR_LEN, HBO is set to
1b. In this case, the header is not split. Instead, the header resides within the host packet buffer.
The HDR_LEN field is still valid and equal to the calculated size of the header. However, the header
is not copied into the header buffer.
3. In the split, always use header buffer mode, when SRRCTL[n].BSIZEHEADER is smaller
than HDR_LEN, HBO is set to 1b. In this case, the header buffer is used as part of the data
buffers and contains the first BSIZEHEADER bytes of the packet. The HDR_LEN field is still valid and
equal to the calculated size of the header.
Note:
Most error information appears only when the store–bad–packet bit (RCTL.SBP) is set and a
bad packet is received.
Using SRRCTL.BSIZEHEADER, the maximum buffer size supported is 960 bytes.
Reserved (bits 2:0) — Reserved
PKT_LEN (16) – Number of bytes existing in the host packet buffer
The length covers the data written to a receive buffer including CRC bytes (if any). Software must read
multiple descriptors to determine the complete length for packets that span multiple receive buffers. If
SRRCTL.DESC_TYPE = 4 (advanced descriptor header replication large packet only) and the total
packet length is smaller than the size of the header buffer (no replication is done), this field continues
to reflect the size of the packet, although no data is written to the packet buffer. Otherwise, if the buffer
is not split because the header is bigger than the allocated header buffer, this field reflects the size of
the data written to the first packet buffer (header and data).
VLAN Tag (16)
These bits are described in the legacy descriptor format in Section 7.1.4.
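Putting the field descriptions of this section together, a write-back descriptor could be unpacked as in the following sketch. The bit positions follow Table 7-11 and the field widths given above; the struct and function names are hypothetical, and the two quadwords are assumed to have been read as host-order 64-bit values on a little-endian host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw advanced write-back descriptor (two quadwords). */
struct adv_rx_wb {
    uint64_t qword0;  /* offset 0 */
    uint64_t qword1;  /* offset 8 */
};

/* Unpacked fields (names are illustrative). */
struct adv_rx_fields {
    uint8_t  rss_type;      /* qword0 bits 3:0   */
    uint16_t pkt_type;      /* qword0 bits 16:4  */
    uint16_t hdr_len;       /* qword0 bits 30:21 */
    bool     sph;           /* qword0 bit 31     */
    uint32_t hash_or_csum;  /* qword0 bits 63:32 */
    uint32_t ext_status;    /* qword1 bits 19:0  */
    uint16_t ext_error;     /* qword1 bits 31:20 */
    uint16_t pkt_len;       /* qword1 bits 47:32 */
    uint16_t vlan_tag;      /* qword1 bits 63:48 */
};

static struct adv_rx_fields adv_rx_decode(const struct adv_rx_wb *wb)
{
    struct adv_rx_fields f;

    f.rss_type     = (uint8_t)(wb->qword0 & 0xFu);
    f.pkt_type     = (uint16_t)((wb->qword0 >> 4) & 0x1FFFu);
    f.hdr_len      = (uint16_t)((wb->qword0 >> 21) & 0x3FFu);
    f.sph          = ((wb->qword0 >> 31) & 0x1u) != 0;
    f.hash_or_csum = (uint32_t)(wb->qword0 >> 32);
    f.ext_status   = (uint32_t)(wb->qword1 & 0xFFFFFu);
    f.ext_error    = (uint16_t)((wb->qword1 >> 20) & 0xFFFu);
    f.pkt_len      = (uint16_t)((wb->qword1 >> 32) & 0xFFFFu);
    f.vlan_tag     = (uint16_t)(wb->qword1 >> 48);
    return f;
}
```

Software would normally test DD in ext_status first and only then trust the remaining fields.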
7.1.6
Receive Descriptor Fetching
The fetching algorithm attempts to make the best use of PCIe bandwidth by fetching a cache line (or
more) of descriptors with each burst. The following paragraphs briefly describe the descriptor fetch
algorithm and the software control provided.
When the on-chip buffer is empty, a fetch happens as soon as any descriptors are made available (host
writes to the tail pointer). When the on-chip buffer is nearly empty (RXDCTL.PTHRESH), a prefetch is
performed each time enough valid descriptors (RXDCTL.HTHRESH) are available in host memory.
When the number of descriptors in host memory is greater than the available on-chip descriptor
storage, the 82576 might elect to perform a fetch that is not a multiple of cache-line size. Hardware
performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a
cache-line boundary. This enables the descriptor fetch mechanism to be most efficient in the cases where it
has fallen behind software.
All fetch decisions are based on the number of descriptors available and do not take into account any
split of the transaction due to bus access limitations.
Note:
The 82576 NEVER fetches descriptors beyond the descriptor tail pointer.
7.1.7
Receive Descriptor Write-Back
Processors have cache-line sizes that are larger than the receive descriptor size (16 bytes).
Consequently, writing back descriptor information for each received packet would cause expensive
partial cache-line updates. A receive descriptor packing mechanism minimizes the occurrence of partial
line write-backs.
To maximize memory efficiency, receive descriptors are packed together and written as a cache-line
whenever possible. Descriptor write-backs accumulate and are opportunistically written out in
cache-line-oriented chunks under the following scenarios:
• RXDCTL.WTHRESH descriptors have been used (the specified maximum threshold of unwritten
used descriptors has been reached).
• The receive timer expires (EITR) — in this case all descriptors are flushed ignoring any cache-line
boundaries.
• Explicit software flush (RXDCTLn.SWFLS).
• Dynamic packets — if at least one of the descriptors that are waiting for write-back are classified as
packets requiring immediate notification the entire queue is flushed out.
When the number of descriptors specified by RXDCTL.WTHRESH have been used, they are written back
regardless of cache-line alignment. It is therefore recommended that WTHRESH be a multiple of the
cache-line size (expressed in descriptors). When the receive timer (EITR) expires, all used descriptors
are forced to be written back prior to initiating the interrupt, for consistency. Software might explicitly
flush accumulated descriptors by writing the RXDCTLn register with the SWFLS bit set.
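One way to follow the WTHRESH recommendation is to round the desired threshold down to a whole number of cache lines' worth of 16-byte descriptors. The helper below is a hypothetical sketch of that arithmetic only; it reads the recommendation as "a multiple of the number of descriptors per cache line" and does not check the value against the width of the WTHRESH field.

```c
#include <stdint.h>

#define RX_DESC_BYTES 16u  /* receive descriptor size */

/*
 * Round 'wanted' down to a multiple of the descriptors that fit in one
 * cache line. Zero is passed through unchanged, and a non-zero request
 * is never rounded down to zero.
 */
static uint32_t wthresh_cache_aligned(uint32_t wanted,
                                      uint32_t cache_line_bytes)
{
    uint32_t per_line = cache_line_bytes / RX_DESC_BYTES;
    uint32_t rounded;

    if (wanted == 0 || per_line == 0)
        return wanted;
    rounded = (wanted / per_line) * per_line;
    return rounded ? rounded : per_line;
}
```

With a 64-byte cache line, four descriptors fill one line, so a request of 10 becomes 8.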
When the 82576 does a partial cache-line write-back, it attempts to recover to cache-line alignment on
the next write-back.
For applications where the latency of received packets is more important than the bus efficiency and the
CPU utilization, an EITR value of zero may be used. In this case, each receive descriptor will be written
to the host immediately. If RXDCTL.WTHRESH equals zero, then each descriptor will be written back
separately; otherwise, write-back of descriptors may be coalesced if descriptors accumulate in the
internal descriptor ring due to bandwidth constraints.
All write-back decisions are based on the number of descriptors available and do not take into account
any split of the transaction due to bus access limitations.
7.1.8
Receive Descriptor Ring Structure
Figure 7-9 shows the structure of each of the 16 receive descriptor rings. Hardware maintains 16
circular queues of descriptors and writes back used descriptors just prior to advancing the head
pointer(s). Head and tail pointers wrap back to the base once the number of descriptors defined by the
ring size has been processed.
Figure 7-9.
Receive Descriptor Ring Structure
Software inserts receive descriptors by advancing the tail pointer(s) to refer to the address of the entry
just beyond the last valid descriptor. This is accomplished by writing the descriptor tail register(s) with
the offset of the entry beyond the last valid descriptor. The hardware adjusts its internal tail pointer(s)
accordingly. As packets arrive, they are stored in memory and the head pointer(s) is incremented by
hardware. When the head pointer(s) is equal to the tail pointer(s), the queue(s) is empty. Hardware
stops storing packets in system memory until software advances the tail pointer(s), making more
receive buffers available.
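The head and tail arithmetic described above can be sketched with two small helpers. The names are hypothetical; head, tail, and ring size are all expressed in descriptors, and the values are assumed to have been read from the corresponding registers by the caller.

```c
#include <stdint.h>

/* Next ring index, wrapping back to the base at the end of the ring. */
static uint32_t rx_ring_next(uint32_t index, uint32_t ring_size)
{
    return (index + 1 == ring_size) ? 0 : index + 1;
}

/*
 * Number of descriptors currently available to hardware, that is, the
 * distance from head to tail. Zero means head == tail (queue empty),
 * in which case hardware stops storing packets.
 */
static uint32_t rx_ring_hw_available(uint32_t head, uint32_t tail,
                                     uint32_t ring_size)
{
    return (tail + ring_size - head) % ring_size;
}
```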
The receive descriptor head and tail pointers reference 16-byte blocks of memory. Shaded boxes in
Figure 7-9 represent descriptors that have stored incoming packets but have not yet been recognized
by software. Software can determine if a receive buffer is valid by reading the descriptors in memory.
Any descriptor with a non-zero DD value has been processed by the hardware and is ready to be
handled by the software.
Note:
The head pointer points to the next descriptor that is written back. After the descriptor
write-back operation completes, this pointer is incremented by the number of descriptors
written back. Hardware owns all descriptors between [head..tail]. Any descriptor not in this
range is owned by software.
The receive descriptor rings are described by the following registers:
• Receive Descriptor Base Address (RDBA15 to RDBA0) registers:
This register indicates the start of the descriptor ring buffer. This 64-bit address is aligned on a
16-byte boundary and is stored in two consecutive 32-bit registers. Note that hardware ignores the
lower 4 bits.
• Receive Descriptor Length (RDLEN15 to RDLEN0) registers:
This register determines the number of bytes allocated to the circular buffer. This value must be a
multiple of 128 (the maximum cache-line size). Since each descriptor is 16 bytes in length, the
total number of receive descriptors is always a multiple of eight.
• Receive Descriptor Head (RDH15 to RDH0) registers:
This register holds a value that is an offset from the base and indicates the in-progress descriptor.
There can be up to 64K minus 8 descriptors in the circular buffer. Hardware maintains a shadow
copy that includes those descriptors completed but not yet stored in memory.
• Receive Descriptor Tail (RDT15 to RDT0) registers:
This register holds a value that is an offset from the base and identifies the location beyond the last
descriptor hardware can process. This is the location where software writes the first new descriptor.
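The alignment and length constraints on the ring registers lend themselves to a quick validity check before the registers are programmed. The helpers below are a sketch with hypothetical names; they only encode the rules stated above (16-byte aligned base, length a multiple of 128 bytes, 16 bytes per descriptor).

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_DESC_BYTES     16u   /* size of one receive descriptor */
#define RX_RING_LEN_ALIGN 128u  /* RDLEN must be a multiple of this */

/* The ring base must sit on a 16-byte boundary. */
static bool rx_ring_base_ok(uint64_t base_phys)
{
    return (base_phys & 0xFu) == 0;
}

/*
 * Convert a descriptor count into an RDLEN value in bytes.
 * Returns 0 on success, -1 if the count is zero or the resulting
 * length is not a multiple of 128 bytes.
 */
static int rx_ring_rdlen(uint32_t num_desc, uint32_t *rdlen_bytes)
{
    uint64_t bytes = (uint64_t)num_desc * RX_DESC_BYTES;

    if (num_desc == 0 || bytes % RX_RING_LEN_ALIGN != 0 ||
        bytes > UINT32_MAX)
        return -1;
    *rdlen_bytes = (uint32_t)bytes;
    return 0;
}
```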
If software statically allocates buffers, uses legacy receive descriptors, and uses memory read to check
for completed descriptors, it has to zero the status byte in the descriptor before bumping the tail
pointer to make it ready for reuse by hardware. Zeroing the status byte is not a hardware requirement
but is necessary for performing an in-memory scan.
All the registers controlling the descriptor rings behavior should be set before receive is enabled, apart
from the tail registers that are used during the regular flow of data.
7.1.8.1
Low Receive Descriptors Threshold
As described above, the size of the receive queues is measured by the number of receive descriptors.
During run time the software processes completed descriptors and then increments the Receive
Descriptor Tail registers (RDT). At the same time, the hardware may post new packets received from
the LAN, incrementing the Receive Descriptor Head registers (RDH) for each used descriptor.
The number of usable (free) descriptors for the hardware is the distance between the Tail and Head
registers. When the Tail reaches the Head, there are no free descriptors and further packets may be
either dropped or block the receive FIFO. In order to avoid this, the 82576 may generate a low latency
interrupt (associated with the relevant Rx queue) once the number of free descriptors is less than or
equal to a threshold. The threshold is defined per queue, in units of 16 descriptors, in the
SRRCTL[n].RDMTS field.
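The threshold comparison can be modeled in software, for example to predict when the low latency interrupt would fire. This is a sketch with a hypothetical function name; it takes head, tail, and ring size in descriptors and applies the 16-descriptor granularity literally, without modeling any special meaning that a zero RDMTS value might have.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDMTS_GRANULARITY 16u  /* threshold unit, in descriptors */

/* True when the free descriptor count is at or below the threshold. */
static bool rx_low_threshold_reached(uint32_t head, uint32_t tail,
                                     uint32_t ring_size, uint32_t rdmts)
{
    uint32_t free_desc = (tail + ring_size - head) % ring_size;

    return free_desc <= rdmts * RDMTS_GRANULARITY;
}
```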
7.1.9
Header Splitting and Replication
7.1.9.1
Purpose
This feature consists of splitting or replicating a packet's header to a different memory space. This helps
the host to fetch only the headers for processing: headers are replicated through a regular snoop
transaction in order to be processed by the host CPU. It is recommended to perform this transaction
with the DCA feature enabled (see section 8.3) or in conjunction with a software prefetch.
The packet (header and payload) is stored in memory through an (optionally) non-snoop transaction.
Later, a data movement engine transaction moves the payload from the software device driver buffer to
application memory, or it is moved using a normal memory copy operation.
The 82576 supports header splitting in several modes:
• Legacy mode: legacy descriptors are used; headers and payloads are not split.
• Advanced mode, no split: advanced descriptors are in use; header and payload are not split.
• Advanced mode, split: advanced descriptors are in use; header and payload are split to different
buffers. If the packet cannot be split, only the packet buffer is used.
• Advanced mode, replication: advanced descriptors are in use; header is replicated in a separate
buffer and also in a payload buffer.
• Advanced mode, replication, conditioned by packet size: advanced descriptors are in use;
replication is performed only if the packet is larger than the header buffer size.
• Advanced mode, split, always use header buffer: advanced descriptors are in use; header and
payload are split to different buffers. If no split is done, the first part of the packet is stored in the
header buffer.
7.1.9.2
Description
In Figure 7-10 and Figure 7-11, the header splitting and header replication modes are shown.
Figure 7-10. Header Splitting
Figure 7-11. Header Replication
The physical address of each buffer is written in the Buffer Addresses fields. The sizes of these buffers
are statically defined by BSIZEPACKET in the SRRCTL[n] registers.
The packet buffer address includes the address of the buffer assigned to the replicated packet,
including header and data payload portions of the received packet. In the case of a split header, only
the payload is included.
The header buffer address includes the address of the buffer that contains the header information. The
receive DMA module stores the header portion of the received packets into this buffer.
The 82576 uses the packet replication or splitting feature when SRRCTL[n].DESCTYPE is larger than
one. The software device driver must also program the buffer sizes in the SRRCTL[n] registers.
When header split is selected, the packet is split only on selected types of packets. A bit exists for each
option in PSRTYPE[n] registers so several options can be used in conjunction with them. If one or more
bits are set, the splitting is performed for the corresponding packet type.
The following table lists the behavior of the 82576 in the different modes:
Table 7-17. Intel® 82576 GbE Controller Behavior
DESCTYPE | Condition | SPH | HBO | PKT_LEN | HDR_LEN | Header and Payload DMA
Split | 1. Header can't be decoded | 0b | 0b | Min(Packet length, Buffer size) | N/A | Header + Payload → Packet buffer
Split | 2. Header <= BSIZEHEADER | 1b | 0b | Min(Payload length, Buffer size) (note 1) | Header size | Header → Header buffer; Payload → Packet buffer
Split | 3. Header > BSIZEHEADER | 1b | 1b | Min(Packet length, Buffer size) | Header size (note 2) | Header + Payload → Packet buffer
Split - always use header buffer | 1. Packet length <= BSIZEHEADER | 0b | 0b | Zero | Packet length | Header + Payload → Header buffer
Split - always use header buffer | 2. Header can't be decoded and Packet length > BSIZEHEADER | 0b | 0b | Min(Packet length - BSIZEHEADER, Data Buffer size) | BSIZEHEADER | Header + Payload → Header + Packet buffers (note 3)
Split - always use header buffer | 3. Header <= BSIZEHEADER and Packet length >= BSIZEHEADER | 1b | 0b | Min(Payload length, Data Buffer size) | Header size | Header → Header buffer; Payload → Packet buffer
Split - always use header buffer | 4. Header > BSIZEHEADER | 1b | 1b | Min(Packet length - BSIZEHEADER, Data Buffer size) | Header size (note 2) | Header + Payload → Header + Packet buffer (note 3)
Replicate Large Packet only | 1. Header + Payload <= BSIZEHEADER | 0b/1b (note 4) | 0b | Packet length | Header size, N/A (note 4) | Header + Payload → Header buffer
Replicate Large Packet only | 2. Header + Payload > BSIZEHEADER | 0b/1b (note 4) | 0b/1b (note 5) | Min(Packet length, Buffer size) | Header size, N/A (note 4) | (Header + Payload) (partial, note 6) → Header buffer; Header + Payload → Packet buffer
1. In a header only packet (such as TCP ACK packet), the PKT_LEN is zero.
2. The HDR_LEN doesn't reflect the actual data size stored in the Header buffer. It reflects the header size determined by the parser.
3. If the packet spans more than one descriptor, only the header buffer of the first descriptor is used. The header buffer is used for
the first part of the packet until it is filled up, and then the first packet buffer is used for the continuation of the packet.
Software Notes:
• If SRRCTL[n].NSE is set, all buffer addresses in a packet descriptor must be word aligned.
• A packet header cannot span buffers; therefore, the size of the header buffer must be larger
than any expected header size. Otherwise, only the part of the header that fits in the header buffer is
replicated. In header split mode (SRRCTL.DESCTYPE = 010b), a packet with a header
larger than the header buffer is not split.
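As an illustration of the three "Split" rows of Table 7-17, the following sketch models which descriptor fields result for a single-descriptor packet. It is a reading aid under stated assumptions, not a description of the hardware implementation: the function and type names are hypothetical, `header_len=None` stands for "header can't be decoded", and `bsizepacket` stands for the packet buffer size.

```python
from typing import NamedTuple, Optional


class SplitResult(NamedTuple):
    sph: int                 # split header bit
    hbo: int                 # header buffer overflow bit
    pkt_len: int             # PKT_LEN field
    hdr_len: Optional[int]   # HDR_LEN field, None where the table says N/A
    placement: str           # where header and payload are written


def split_mode(packet_len: int, header_len: Optional[int],
               bsizeheader: int, bsizepacket: int) -> SplitResult:
    """Model of the 'Split' rows of Table 7-17 (hypothetical helper)."""
    if header_len is None:
        # Row 1: header can't be decoded, so the packet is not split.
        return SplitResult(0, 0, min(packet_len, bsizepacket), None,
                           "header + payload -> packet buffer")
    if header_len <= bsizeheader:
        # Row 2: header fits, so header and payload go to separate buffers.
        payload_len = packet_len - header_len
        return SplitResult(1, 0, min(payload_len, bsizepacket), header_len,
                           "header -> header buffer, payload -> packet buffer")
    # Row 3: header is larger than the header buffer, so the packet is not split.
    return SplitResult(1, 1, min(packet_len, bsizepacket), header_len,
                       "header + payload -> packet buffer")
```

A header-only packet (for example a TCP ACK whose 54 bytes are all header) falls into row 2 with a PKT_LEN of zero, matching footnote 1.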
7.1.10
Receive Packet Checksum Off Loading
The 82576 supports the off loading of three receive checksum calculations: the packet checksum, the
IPv4 header checksum, and the TCP/UDP checksum.
The packet checksum is the one's complement checksum computed over the received packet, starting from
the byte indicated by RXCSUM.PCSS (zero corresponds to the first byte of the packet), after stripping.
For packets with a VLAN header, the packet checksum includes the header if VLAN stripping is not
enabled by CTRL.VME. If VLAN header stripping is enabled, the packet checksum and the starting offset
of the packet checksum exclude the VLAN header because the VLAN header is masked. For example, for an
Ethernet II frame encapsulated as an 802.3ac VLAN packet, with CTRL.VME set and RXCSUM.PCSS set to 14,
the packet checksum includes the entire encapsulated frame, excluding the 14-byte Ethernet
header (DA, SA, type/length) and the 4-byte q-tag. The packet checksum does not include the Ethernet
CRC if the RCTL.SECRC bit is set.
Software must make the required offsetting computation (to back out the bytes that should not have
been included and to include the pseudo-header) prior to comparing the packet checksum against the
TCP checksum stored in the packet.
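The following sketch shows the kind of 16-bit one's complement sum the packet checksum is based on, taken from a start offset that plays the role of RXCSUM.PCSS. It is an illustration only: the function names are hypothetical, the frame is assumed to be already stripped of any bytes the hardware excludes, and the sketch does not claim to reproduce the exact value or form the hardware reports.

```python
def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's complement sum of big-endian words, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def packet_checksum(frame: bytes, pcss: int) -> int:
    """Sum over the frame starting at byte offset pcss (0 = first byte)."""
    return ones_complement_sum16(frame[pcss:])
```

With a start offset of 14, the 14-byte Ethernet header is skipped and only the encapsulated data contributes to the sum.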
For supported packet/frame types, the entire checksum calculation can be off loaded to the 82576. If
RXCSUM.IPOFL is set to 1b, the 82576 calculates the IPv4 checksum and reports a pass/fail result
to software via the IPv4 Checksum Error bit (RDESC.IPE) in the Error field of the receive descriptor.
Similarly, if RXCSUM.TUOFL is set to 1b, the 82576 calculates the TCP or UDP checksum and reports a
pass/fail result to software via the TCP/UDP Checksum Error bit (RDESC.L4E). These error bits are
valid when the respective status bits indicate the checksum was calculated for the packet (RDESC.IPCS
and RDESC.L4CS, respectively). Likewise, if RFCTL.Ipv6_DIS and RFCTL.IP6Xsum_DIS are cleared to
0b and RXCSUM.TUOFL is set to 1b, the 82576 calculates the TCP or UDP checksum for IPv6 packets. It
then reports a pass/fail result in the TCP/UDP Checksum Error bit (RDESC.L4E).
If neither RXCSUM.IPOFL nor RXCSUM.TUOFL is set, the Checksum Error bits (IPE and L4E) are 0b for
all packets.
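Because an error bit is meaningful only when its matching status bit is set, software typically interprets each status/error pair together. A minimal sketch, assuming the driver has already extracted the two bits from the descriptor (the function name and return strings are hypothetical):

```python
def checksum_result(calculated: bool, error: bool) -> str:
    """Interpret a status/error bit pair such as (IPCS, IPE) or (L4CS, L4E)."""
    if not calculated:
        # The error bit carries no information when the checksum was not calculated.
        return "not checked"
    return "bad" if error else "good"
```

A packet reported as "not checked" still needs its checksum verified in software.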
Supported frame types:
• Ethernet II
• Ethernet SNAP
Table 7-18. Supported Receive Checksum Capabilities
Packet Type | Hardware IP Checksum Calculation | Hardware TCP/UDP Checksum Calculation
IPv4 packets | Yes | Yes
IPv6 packets | No (n/a) | Yes
IPv6 packet with next header options:
• Hop-by-hop options | No (n/a) | Yes
• Destination options | No (n/a) | Yes
• Routing (with len zero) | No (n/a) | Yes
• Routing (with len > zero) | No (n/a) | No
• Fragment | No (n/a) | No
• Home option | No (n/a) | No
IPv4 tunnels:
• IPv4 packet in an IPv4 tunnel | No | No
• IPv6 packet in an IPv4 tunnel | Yes (IPv4) | Yes (note 1)
IPv6 tunnels:
• IPv4 packet in an IPv6 tunnel | No | No
• IPv6 packet in an IPv6 tunnel | No | No
Packet is an IPv4 fragment | Yes | No
Packet is greater than 1518/1522/1526 bytes (LPE=1b) | Yes | Yes
Packet has 802.3ac tag | Yes | Yes
IPv4 packet has IP options (IP header is longer than 20 bytes) | Yes | Yes
Packet has TCP or UDP options | Yes | Yes
IP header's protocol field contains a protocol number other than TCP or UDP | Yes | No
1. The IPv6 header portion can include supported extension headers as described in the IPv6 filter section.
7.1.10.1
Filters details
The previous table lists general details about what packets are processed. In more detail, the packets
are passed through a series of filters to determine if a receive checksum is calculated:
7.1.10.1.1
MAC Address Filter
This filter checks the MAC destination address to be sure it is valid (such as IA match, broadcast,
multicast, etc.). The receive configuration settings determine which MAC addresses are accepted. See
the various receive control configuration registers such as RCTL (RCTL.UPE, RCTL.MPE, RCTL.BAM),
MTA, RAL, and RAH.
7.1.10.1.2
SNAP/VLAN Filter
This filter checks the next headers looking for an IP header. It is capable of decoding Ethernet II,
Ethernet SNAP, and IEEE 802.3ac headers. It skips past any of these intermediate headers and looks
for the IP header. The receive configuration settings determine which next headers are accepted. See
the various receive control configuration registers such as RCTL (RCTL.VFE), VET, and VFTA.
7.1.10.1.3
IPv4 Filter
This filter checks for valid IPv4 headers. The version field is checked for a correct value (4).
IPv4 headers are accepted if they are any size greater than or equal to five (Dwords). If the IPv4
header is properly decoded, the IP checksum is checked for validity. The RXCSUM.IPOFL bit must be set
for this filter to pass.
7.1.10.1.4
IPv6 Filter
This filter checks for valid IPv6 headers, which are a fixed size and have no checksum. The IPv6
extension headers accepted are: hop-by-hop, destination options, and routing. The maximum size next
header accepted is 16 Dwords (64 bytes).
7.1.10.1.5
IPv6 Extension Headers
IPv4 and TCP provide header lengths, which enable hardware to easily navigate through these headers
on packet reception when calculating checksums, CRCs, and so on. For received IPv6 packets, however,
there is no IP header length to help hardware find the packet's ULP (such as TCP or UDP) header. One or
more IPv6 extension headers might exist in a packet between the basic IPv6 header and the ULP
header. The hardware must skip over these extension headers to calculate the TCP or UDP checksum
for received packets.
The IPv6 header length without extensions is 40 bytes. The IPv6 field Next Header Type indicates what
type of header follows the IPv6 header at offset 40. It might be an upper layer protocol header such as
TCP or UDP (Next Header Type of 6 or 17, respectively), or it might indicate that an extension header
follows. The final extension header indicates with its Next Header Type field the type of ULP header for
the packet.
IPv6 extension headers have a specified order. However, destinations must be able to process these
headers in any order. Also, IPv6 (or IPv4) might be tunneled using IPv6, and thus another IPv6 (or
IPv4) header and potentially its extension headers might be found after the extension headers.
The IPv4 Next Header Type is at byte offset nine. In IPv6, the first Next Header Type is at byte offset
six.
All IPv6 extension headers have the Next Header Type in their first eight bits. Most have the length in
the second eight bits (Offset Byte[1]) as shown:
Table 7-19. Typical IPv6 Extended Header Format (Traditional Representation)
Bits (of a 0 to 31 bit ruler) | Field
0-7 | Next Header Type
8-15 | Length
The following table lists the encoding of the Next Header Type field and information on determining
each header type's length. The IPv6 extension headers are not otherwise processed by the 82576 so
their details are not covered here.
Table 7-20. Header Type Encoding and Lengths
Header | Next Header Type | Header Length (Units are Bytes Unless Otherwise Specified)
IPv6 | 6 | Always 40 bytes
IPv4 | 4 | Offset Bits[7:4], Unit = 4 bytes
TCP | 6 | Offset Byte[12].Bits[7:4], Unit = 4 bytes
UDP | 17 | Always 8 bytes
Hop by Hop Options | 0 (Note 1) | 8+Offset Byte[1]
Destination Options | 60 | 8+Offset Byte[1]
Routing | 43 | 8+Offset Byte[1]
Fragment | 44 | Always 8 bytes
Authentication | 51 | 8+4*(Offset Byte[1])
Encapsulating Security Payload | 50 | Note 3
No Next Header | 59 | Note 2
Notes:
1. Hop-by-hop options header is only found in the first Next Header Type of an IPv6 header.
2. When a No Next Header type is encountered, the rest of the packet should not be processed.
3. Encapsulated security payload — Intel® 82576 GbE Controller cannot offload packets with this header type.
Note that the 82576 hardware acceleration does not support all IPv6 extension header types (refer to
Table 7-20).
Also, the RFCTL.Ipv6_DIS bit must be cleared for this filter to pass.
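The header walk described in this section can be sketched in software as follows. This is an illustrative parser, not the hardware algorithm: the function name is hypothetical, the input is assumed to start at the first byte of the IPv6 header, and the Length byte of the hop-by-hop, destination options, and routing headers is assumed to count 8-byte units beyond the first 8 bytes, as in the IPv6 specification (verify this against Table 7-20 before relying on it). The sketch only locates the ULP header; it does not decide whether the 82576 would offload the checksum for that packet.

```python
HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 43, 44, 51, 60
TCP, UDP = 6, 17


def find_ulp(packet: bytes):
    """Return (next_header_type, byte_offset) of the TCP/UDP header, or None."""
    if len(packet) < 40:
        return None
    next_type = packet[6]   # first Next Header Type is at byte offset six
    offset = 40             # basic IPv6 header is always 40 bytes
    while True:
        if next_type in (TCP, UDP):
            return next_type, offset
        if offset + 2 > len(packet):
            return None     # truncated extension header
        if next_type in (HOP_BY_HOP, ROUTING, DEST_OPTS):
            length = 8 + 8 * packet[offset + 1]   # assumption: 8-byte units
        elif next_type == AUTH:
            length = 8 + 4 * packet[offset + 1]
        elif next_type == FRAGMENT:
            length = 8
        else:
            # ESP, No Next Header, tunnels, or an unknown type: stop processing.
            return None
        if offset + length > len(packet):
            return None
        next_type = packet[offset]   # first byte of every extension header
        offset += length
```

For a packet with one 8-byte hop-by-hop header followed by TCP, the function returns `(6, 48)`.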
7.1.10.1.6
UDP/TCP Filter
This filter checks for a valid UDP or TCP header. The protocol (Next Header) values are 0x11 and 0x06,
respectively. The RXCSUM.TUOFL bit must be set for this filter to pass.
7.1.10.2
Receive UDP Fragmentation Checksum
The 82576 can provide checksum offload for received fragmented UDP packets. The 82576 should be
configured in the following manner to enable this mode:
The RXCSUM.PCSD bit should be cleared. The Packet Checksum and IP Identification fields are mutually
exclusive with the RSS hash. When the PCSD bit is cleared, Packet Checksum and IP Identification are
active instead of RSS hash.
The RXCSUM.IPPCSE bit should be set. This bit enables the IP payload checksum, which is
designed for the fragmented UDP checksum.
The RXCSUM.PCSS field must be zero. The packet checksum start should be zero to enable auto-start
of the checksum calculation. The following table describes the checksum calculation and lists the
resulting descriptor fields for each incoming packet type:
Table 7-21. Descriptor Fields
Incoming Packet Type | Fragment Checksum | UDPV | UDPCS / L4CS
Non IP Packet | 0b | 0b | 0b / 0b
IPv6 Packet | 0b | 0b | Depends on transport header.
Non fragmented IPv4 packet | 0b | 0b | Depends on transport header.
Fragmented IPv4, when not first fragment | The unadjusted one's complement checksum of the IP payload. | 0b | 1b / 0b
Fragmented IPv4, for the first fragment | Same as above | 1 if the UDP header checksum is valid (not zero) | 1b / 0b
Note:
When the software device driver computes the 16-bit one's complement sum over the
incoming packets of the UDP fragments, it should expect a value of 0xFFFF. Refer to
Section 7.1.10 for supported packet formats.
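The 0xFFFF expectation can be demonstrated in software. The sketch below builds a UDP datagram, splits its IP payload into two fragments, sums each fragment separately (standing in for the per-fragment checksums), and combines them with the pseudo-header sum. All names are hypothetical, the addresses and ports are arbitrary example values, and the sketch assumes the driver adds the UDP pseudo-header itself, since that header is not part of any fragment's IP payload.

```python
import struct


def csum16(data: bytes) -> int:
    """16-bit one's complement sum of big-endian words, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def add16(a: int, b: int) -> int:
    """One's complement addition of two 16-bit values."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def build_udp(src_ip: bytes, dst_ip: bytes, sport: int, dport: int, payload: bytes):
    """Return (pseudo_header, udp_datagram) with a valid UDP checksum."""
    length = 8 + len(payload)
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    # A computed checksum of zero is transmitted as 0xFFFF.
    checksum = (~csum16(pseudo + header + payload) & 0xFFFF) or 0xFFFF
    return pseudo, struct.pack("!HHHH", sport, dport, length, checksum) + payload


def fragments_ok(fragment_sums, pseudo_header: bytes) -> bool:
    """Combine per-fragment sums with the pseudo-header sum; expect 0xFFFF."""
    total = csum16(pseudo_header)
    for fragment_sum in fragment_sums:
        total = add16(total, fragment_sum)
    return total == 0xFFFF


pseudo, datagram = build_udp(bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
                             5000, 53, bytes(range(40)))
# IP fragments carry payload in multiples of 8 bytes (except the last one).
fragments = [datagram[:24], datagram[24:]]
sums = [csum16(fragment) for fragment in fragments]
```

Because one's complement addition is commutative and associative, the fragments can be summed in any arrival order.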
7.1.11
SCTP Offload
If a receive packet is identified as SCTP, the 82576 checks the CRC32 checksum of this packet and
identifies this packet as SCTP. Software is notified of the CRC check via the CRCV bit in the Extended
Status field of the Rx descriptor. The detection of an SCTP packet is indicated via the SCTP bit in the
packet Type field of the Rx descriptor. The checker assumes the following SCTP packet format:
Table 7-22. SCTP Header
Bits 0-15 | Bits 16-31
Source Port | Destination Port
Verification Tag (bits 0-31)
Checksum (bits 0-31)
Chunks 1..n
7.2
Transmit Functionality
7.2.1
Packet Transmission
Output packets are made up of pointer-length pairs constituting a descriptor chain (descriptor based
transmission). Software forms transmit packets by assembling the list of pointer-length pairs, storing
this information in the transmit descriptor, and then updating the on-chip transmit tail pointer to the
descriptor. The transmit descriptor and buffers are stored in host memory. Hardware typically transmits
the packet only after it has completely fetched all the L2 packet data from host memory and deposited
it into the on-chip transmit FIFO. This permits TCP or UDP checksum computation and avoids problems
with PCIe under-runs.
Another transmit feature of the 82576 is TCP/UDP segmentation. The hardware has the capability to
perform packet segmentation on large data buffers offloaded from the Network Operating System
(NOS). This feature is discussed in detail in Section 7.2.4.
In addition, the 82576 supports SCTP offloading for transmit requests. See Section 7.2.5.3 for
details about SCTP.
7.2.1.1
Transmit Data Storage
Data is stored in buffers pointed to by the descriptors. Alignment of data is on an arbitrary byte
boundary with the maximum size per descriptor limited only to the maximum allowed packet size (9728
bytes). A packet typically consists of two (or more) buffers, one (or more) for the header and one for
the actual data. Each buffer is referenced by a different descriptor. Some software implementations
copy the header(s) and packet data into one buffer and use only one descriptor per transmitted packet.
7.2.1.2
On-Chip Tx Buffers
The 82576 contains a 40 KB packet buffer that can be used to store packets until they are forwarded to
the network or locally to another Virtual Machine (VM).
7.2.1.3
On-Chip Descriptor Buffers
The 82576 contains a 32-descriptor cache for each transmit queue, used to reduce the latency of packet
processing and to optimize the usage of the PCIe bandwidth by fetching and writing back descriptors in
bursts. The fetch and write-back algorithms are described in Section 7.2.2.5 and Section 7.2.2.6.
7.2.1.4
Transmit Contexts
The 82576 provides hardware checksum offload and TCP/UDP segmentation facilities. These features
enable TCP and UDP packet types to be handled more efficiently by performing additional work in
hardware, thus reducing the software overhead associated with preparing these packets for
transmission. Some of the parameters used by these features are handled through contexts.
A context refers to a set of device registers loaded or accessed as a group to provide a particular
function. The 82576 supports 32 context register sets on-chip (two per queue). The transmit queues
can contain transmit data descriptors (much like the receive queue) as well as transmit context
descriptors.
The contexts are queue specific and one context cannot be reused from one queue to another. This
differs from the method used in previous devices that supported a pool of contexts to be shared
between queues.
A transmit context descriptor differs from a data descriptor as it does not point to packet data. Instead,
this descriptor provides the ability to write to the on-chip contexts that support the transmit checksum
offloading and the segmentation features of the 82576.
The 82576 supports one type of transmit context. This on-chip context is written with a transmit
context descriptor DTYP=2 and is always used for transmit data descriptor DTYP=3.
The IDX field contains an index to one of the two queue contexts. Software must track what context is
stored in each IDX location.
Each advanced data descriptor that uses any of the advanced offloading features must refer to a
context.
Contexts can be initialized with a transmit context descriptor and then used for a series of related
transmit data descriptors. The context, for example, defines the checksum and offload capabilities for a
given type of TCP/IP flow. All packets of this type can be sent using this context.
Software is responsible for ensuring that a context is only overwritten when it is no longer needed.
Hardware does not include any logic to manage the on-chip contexts; it is completely up to software to
populate and then use the on-chip context table.
Each context defines information about the packet sent including the total size of the MAC header
(TDESC.MACHDR), the amount of payload data that should be included in each packet (TDESC.MSS),
TCP header length (TDESC.TCPHDR), IP header length (TDESC.IPHDR), and information about what
type of protocol (TCP, IP, etc.) is used. Other than TCP, IP (TDESC.TUCMD), most information is specific
to the segmentation capability.
Because there are dedicated on-chip resources for contexts, they remain constant until they are
modified by another context descriptor. This means that a context can be used for multiple packets (or
multiple segmentation blocks) unless a new context is loaded prior to each new packet. Depending on
the environment, it might be unnecessary to load a new context for each packet. For example, if most
traffic generated from a given node is standard TCP frames, this context could be set up once and used
for many frames. Only when some other frame type is required would a new context need to be loaded
by software. This new context could use a different index or the same index.
This same logic can also be applied to the TCP/UDP segmentation scenario, though the environment is
a more restrictive one. In this scenario, the host is commonly asked to send messages of the same
type, TCP/IP for instance, and these messages also have the same Maximum Segment Size (MSS). In
this instance, the same context could be used for multiple TCP messages that require hardware
segmentation.
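Since hardware does not manage the two per-queue contexts, a driver needs some bookkeeping of its own. The sketch below shows one possible approach; it is not taken from any existing driver. The class name, the tuple used to describe a context, and the round-robin replacement policy are all assumptions made for illustration, and the policy assumes that overwriting a slot is safe because descriptors in a queue are consumed in order.

```python
class QueueContexts:
    """Tracks which offload parameters are loaded in a queue's two context slots."""

    SLOTS = 2  # two contexts per transmit queue

    def __init__(self):
        self.slots = [None] * self.SLOTS
        self.next_victim = 0  # slot to overwrite next (round-robin, an assumption)

    def select(self, params):
        """Return (idx, needs_context_descriptor) for a packet using params."""
        if params in self.slots:
            # Context already on-chip: reuse it, no context descriptor needed.
            return self.slots.index(params), False
        idx = self.next_victim
        self.slots[idx] = params
        self.next_victim = (idx + 1) % self.SLOTS
        # Caller must queue a context descriptor with this IDX before the data.
        return idx, True
```

With this scheme, a queue that carries only one flow type writes a single context descriptor and then reuses IDX 0 for every later packet.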
7.2.2
Transmit Descriptors
The 82576 supports legacy descriptors and the 82576 advanced descriptors.
Legacy descriptors are intended to support legacy drivers to enable fast platform power up and to
facilitate debug.
Note:
These descriptors must not be used when advanced features such as virtualization or MACSec
are used.
If legacy descriptors are used when CTRL_EXT.RT, DTXSWC.Loopback enable,
STATUS.VFE, one of the DTXSWC.MACAS bits, or one of the DTXSWC.VLANAS bits is set,
packets are ignored and not sent.
The Legacy descriptors are recognized as such based on the DEXT bit as discussed later in this section.
In addition, the 82576 supports two types of advanced transmit descriptors:
1. Advanced Transmit Context Descriptor, DTYP = 0010b.
2. Advanced Transmit Data Descriptor, DTYP = 0011b.
Note:
DTYP values 0000b and 0001b are reserved.
The transmit data descriptor (both legacy and advanced) points to a block of packet data to be
transmitted. The advanced transmit context descriptor does not point to packet data. It contains
control/context information that is loaded into on-chip registers that affect the processing of packets for
transmission. The following sections describe the descriptor formats.
7.2.2.1
Legacy Transmit Descriptor Format
Legacy descriptors are identified by having bit 29 of the descriptor (TDESC.DEXT) set to 0b. In this
case, the descriptor format is defined as shown in Table 7-23. Note that the address and length must be
supplied by software. Also note that bits in the command byte are optional, as are the CSO and CSS
fields.
Table 7-23. Transmit Descriptor (TDESC) Fetch Layout - Legacy Mode
Byte offset | 63:48 | 47:40 | 39:36 | 35:32 | 31:24 | 23:16 | 15:0
0 | Buffer Address [63:0]
8 | VLAN | CSS | ExtCMD | STA | CMD | CSO | Length
Table 7-24. Transmit Descriptor (TDESC) Write-Back Layout - Legacy Mode
Byte offset | 63:48 | 47:40 | 39:36 | 35:32 | 31:24 | 23:16 | 15:0
0 | Reserved
8 | VLAN | CSS | Reserved | STA | CMD | CSO | Length
Note:
For frames that span multiple descriptors, the VLAN, CSS, CSO, CMD.VLE, CMD.IC, and
CMD.IFCS fields are valid only in the first descriptor and are ignored in the subsequent ones.
7.2.2.1.1
Address (64)
Physical address of a data buffer in host memory that contains a portion of a transmit packet.
7.2.2.1.2
Length
Length (TDESC.LENGTH) specifies the length in bytes to be fetched from the buffer address provided;
the maximum length associated with any single legacy descriptor is 9728 bytes.
Note:
The maximum allowable packet size for transmits changes based on the value written to the
Tx Packet Buffer Allocation (TXPBS) register.
Descriptor length(s) might be limited by the size of the transmit FIFO. All buffers comprising a single
packet must be able to be stored simultaneously in the transmit FIFO. For any individual packet, the
sum of the individual descriptors' lengths must be below 9728 bytes.
Note:
Descriptors with zero length (null descriptors) transfer no data. Null descriptors can only
appear between packets and must have their EOP bits set.
If the TCTL.PSP bit is set, the total length of the packet transmitted, not including FCS,
should be at least 17 bytes.
7.2.2.1.3
Checksum Offset and Start — CSO and CSS
A Checksum Offset (TDESC.CSO) field indicates where, relative to the start of the packet, to insert a
TCP checksum if this mode is enabled. A Checksum Start (TDESC.CSS) field indicates where to begin
computing the checksum.
Both CSO and CSS are in units of bytes and must be in the range of data provided to the 82576 in the
descriptor. This means that for short packets that are not padded by software, CSS and CSO must be in
the range of the unpadded data length, not the eventual padded length (64 bytes). CSO must be larger
than CSS, CSS must be greater than or equal to 14 bytes, and CSO must be smaller than the packet
length minus four bytes. Checksum calculation is not done if CSO or CSS is out of range. This occurs
if (CSS > length) OR (CSO > length - 1).
In the case of an 802.1Q header, the offset values depend on the VLAN insertion enable (VLE) bit. If
it is not set (VLAN tagging is included in the packet buffers), the offset values should include the
VLAN tagging. If it is set (VLAN tagging is taken from the packet descriptor), the offset values
should exclude the VLAN tagging.
Note:
Software must compute an offsetting entry to back out the bytes of the header that are not
part of the IP pseudo header and should not be included in the TCP checksum and store it in
the position where the hardware computed checksum is to be inserted. Hardware does not
add the 802.1Q Ethertype or the VLAN field following the 802.1Q Ethertype to the
checksum. So for VLAN packets, software can compute the values to back out only on the
encapsulated packet rather than on the added fields.
UDP checksum calculation is not supported with legacy descriptors. When using legacy
descriptors, the 82576 is not aware of the L4 type of the packet and thus does not support
the translation of a checksum result of 0x0000 to 0xFFFF needed to differentiate between
a UDP packet with a checksum of zero and a UDP packet without a checksum.
Because the CSO field is eight bits wide, it puts a limit on the location of the checksum to 255 bytes
from the beginning of the packet.
Hardware adds the checksum to the field at the offset indicated by the CSO field. Checksum
calculations are for the entire packet starting at the byte indicated by the CSS field. A value of zero
corresponds to the first byte in the packet.
CSS must be set in the first descriptor for a packet.
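The CSO and CSS constraints in this section can be collected into a single software check before a descriptor is queued. The sketch below is illustrative only: the function name and message strings are hypothetical, and `length` is taken as the unpadded packet length supplied to the 82576.

```python
def check_cso_css(css: int, cso: int, length: int) -> list:
    """Return a list of violated CSO/CSS constraints (empty if all hold)."""
    problems = []
    if css < 14:
        problems.append("CSS must be at least 14")
    if cso <= css:
        problems.append("CSO must be larger than CSS")
    if cso >= length - 4:
        problems.append("CSO must be smaller than the packet length minus four")
    if cso > 255:
        problems.append("CSO is an 8-bit field and cannot exceed 255")
    if css > length or cso > length - 1:
        problems.append("out of range: checksum is not calculated")
    return problems
```

For a 74-byte TCP/IPv4 frame without a VLAN tag, a CSS of 34 (after the 14-byte Ethernet and 20-byte IP headers) and a CSO of 50 (the TCP checksum field) satisfy every constraint.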
Table 7-25. Transmit Command (TDESC.CMD) Layout
Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
Field | RSV | VLE | DEXT | Rsv | RS | IC | IFCS | EOP
7.2.2.1.4
Command Byte — CMD
The CMD byte stores the applicable command and has the fields shown in Table 7-25.
• RSV (bit 7) — Reserved
• VLE (bit 6) — VLAN Packet Enable
• DEXT (bit 5) — Descriptor Extension (0 for legacy mode)
• Reserved (bit 4) — Reserved
• RS (bit 3) — Report Status
• IC (bit 2) — Insert Checksum
• IFCS (bit 1) — Insert FCS
• EOP (bit 0) — End of Packet
VLE: Indicates that the packet is a VLAN packet. That is, hardware should add the VLAN Ethertype
and an 802.1q VLAN tag to the packet.
RS: Signals the hardware to report the status information. This is used by software that does in-memory
checks of the transmit descriptors to determine which ones are done. For example, if software
queues up 10 packets to transmit, it can set the RS bit in the last descriptor of the last packet. If
software maintains a list of descriptors with the RS bit set, it can look at them to determine if all
packets up to (and including) the one with the RS bit set have been buffered in the output FIFO.
This is done by looking at the status byte and checking the Descriptor Done (DD) bit. If DD is set, the
descriptor has been processed. Refer to Table 7-27 for the layout of the status field.
IC: If set, requests hardware to add the checksum of the data from CSS to the end of the packet at the
offset indicated by the CSO field.
IFCS: When set, hardware appends the MAC FCS at the end of the packet. When cleared, software
should calculate the FCS for proper CRC check. There are several cases in which software must set
IFCS:
• Transmitting a short packet while padding is enabled by the TCTL.PSP bit.
• Checksum offload is enabled by the IC bit in the TDESC.CMD.
• VLAN header insertion enabled by the VLE bit in the TDESC.CMD or by the VMVIR registers.
• MACSec offload is requested.
EOP, when set, indicates the last descriptor making up the packet. Note that one or many descriptors
can be used to form a packet.
Note:
As opposed to 82571EB: VLE, IFCS, CSO, and IC must be set correctly in the first descriptor
of each packet. In previous silicon generations, some of these bits were required to be set
in the last descriptor of a packet.
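The legacy descriptor layout and command bits described above can be sketched as a 16-byte structure. This is an illustration under stated assumptions: the function and constant names are hypothetical, the field positions follow the legacy fetch layout (Length 15:0, CSO 23:16, CMD 31:24, STA 35:32, CSS 47:40, VLAN 63:48 of the second quadword), and the descriptor is assumed to be stored little-endian in host memory.

```python
import struct

# TDESC.CMD bits
CMD_EOP, CMD_IFCS, CMD_IC, CMD_RS = 0x01, 0x02, 0x04, 0x08
CMD_DEXT, CMD_VLE = 0x20, 0x40
STA_DD = 0x1  # Descriptor Done, bit 0 of TDESC.STA


def pack_legacy_tx_desc(addr: int, length: int, cmd: int,
                        cso: int = 0, css: int = 0, vlan: int = 0) -> bytes:
    """Build a 16-byte legacy transmit descriptor (assumed little-endian)."""
    if not 0 <= length <= 9728:
        raise ValueError("length exceeds the 9728-byte legacy descriptor limit")
    if cmd & CMD_DEXT:
        raise ValueError("DEXT must be 0 for a legacy descriptor")
    upper = (length
             | (cso & 0xFF) << 16
             | (cmd & 0xFF) << 24
             | (css & 0xFF) << 40
             | (vlan & 0xFFFF) << 48)
    return struct.pack("<QQ", addr, upper)


def descriptor_done(desc: bytes) -> bool:
    """Check the DD bit in the status nibble of a written-back descriptor."""
    _, upper = struct.unpack("<QQ", desc)
    return bool((upper >> 32) & STA_DD)
```

A single-descriptor 64-byte packet that requests FCS insertion and a status report would use `CMD_EOP | CMD_IFCS | CMD_RS`; software then polls `descriptor_done` on the same descriptor.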
Table 7-26. VLAN Tag Insertion Decision Table
VLE | Action
0b | Send generic Ethernet packet.
1b | Send 802.1Q packet; the Ethernet Type field comes from the VET register and the VLAN data comes from the VLAN field of the TX descriptor.
Note:
This table is relevant only if VMVIR.VLANA = 00b (use descriptor command) for the queue.
7.2.2.1.5
Status – STA
One bit provides transmit status, when RS is set in the command: DD indicates that the descriptor is
done and is written back after the descriptor has been processed.
Note:
When head write-back is enabled, the write-back of the DD bit to the descriptor is not
executed.
Table 7-27. Transmit Status (TDESC.STA) Layout
Bit | 3:1 | 0
Field | Reserved | DD
7.2.2.1.6
DD (Bit 0) — Descriptor Done Status
7.2.2.1.7
VLAN
The VLAN field is used to provide the 802.1q/802.3ac tagging information. The VLAN field is qualified
only on the first descriptor of each packet when the VLE bit is set. The VLAN tag uses
network ordering (also called big endian). It appears in the following manner in the descriptor:
Table 7-28. VLAN Field (TDESC.VLAN) Layout
Bits | 15:13 | 12 | 11:0
Field | PRI | CFI | VLAN ID
• VLAN ID — the 12-bit tag indicating the VLAN group of the packet.
• Canonical Form Indication (CFI) — Set to zero for Ethernet packets.
• PRI — indicates the priority of the packet.
Note:
The VLAN tag should be sent in network order.
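As an illustration of the field layout above, the following minimal Python sketch packs PRI, CFI, and VLAN ID into the 16-bit tag and serializes it in network order. The function names are hypothetical and are not part of any Intel driver; how a particular driver stores the value into the descriptor is outside what this section specifies.

```python
import struct


def pack_vlan_tci(pri: int, cfi: int, vlan_id: int) -> int:
    """Combine PRI (bits 15:13), CFI (bit 12) and VLAN ID (bits 11:0)."""
    if not 0 <= pri <= 7:
        raise ValueError("PRI is a 3-bit field")
    if cfi not in (0, 1):
        raise ValueError("CFI is a 1-bit field")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit field")
    return (pri << 13) | (cfi << 12) | vlan_id


def vlan_tci_network_bytes(tci: int) -> bytes:
    """Serialize the 16-bit tag in network (big-endian) byte order."""
    return struct.pack("!H", tci)


# Priority 5, CFI 0 (Ethernet), VLAN group 100.
tci = pack_vlan_tci(pri=5, cfi=0, vlan_id=100)
wire = vlan_tci_network_bytes(tci)
```

The most significant byte, which carries PRI and CFI, comes first in the serialized form.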
7.2.2.2
Advanced Transmit Context Descriptor
Table 7-29. Transmit Context Descriptor (TDESC) Layout (Type = 0010b)

Byte offset 0:
Bits   63:40      39:32            31:16   15:9     8:0
Field  Reserved   IPsec SA Index   VLAN    MACLEN   IPLEN

Byte offset 8:
Bits   63:48   47:40   39    38:36   35:30      29     28:24   23:20   19:9    8:0
Field  MSS     L4LEN   RSV   IDX     Reserved   DEXT   RSV     DTYP    TUCMD   IPsec ESP_LEN

7.2.2.2.1
IPLEN (9)
IP header length. If an offload is requested, IPLEN must be greater than or equal to 20 and less than or
equal to 511. For IPsec flows, it includes the length of the IPsec header.
7.2.2.2.2
MACLEN (7)
This field indicates the length of the MAC header. When an offload is requested (one of TSE or IXSM or
TXSM is set), MACLEN must be larger than or equal to 14 and less than or equal to 127. This field
should include only the part of the L2 header supplied by the software device driver and not the parts
added by hardware. The following table lists the value of MACLEN in the different cases.
Table 7-30. MACLEN Values

SNAP   Regular VLAN        Extended VLAN   MACLEN
No     By hardware or no   No              14
No     By hardware or no   Yes             18
No     By software         No              18
No     By software         Yes             22
Yes    By hardware or no   No              22
Yes    By hardware or no   Yes             26
Yes    By software         No              26
Yes    By software         Yes             30
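The rows of Table 7-30 are consistent with a 14-byte Ethernet header plus 8 bytes for SNAP and 4 bytes for each VLAN tag that software places in the buffer. The sketch below encodes that reading; the function name and the byte decomposition are assumptions inferred from the table values rather than statements from the datasheet.

```python
def maclen_value(snap: bool, vlan_by_software: bool, extended_vlan: bool) -> int:
    """Return the MACLEN value for one row of Table 7-30.

    A VLAN tag inserted by hardware (or absent) is not counted, because
    MACLEN covers only the L2 header bytes supplied by software.
    """
    length = 14                 # destination MAC + source MAC + EtherType
    if snap:
        length += 8             # LLC/SNAP bytes (inferred from the table)
    if vlan_by_software:
        length += 4             # regular VLAN tag already in the buffer
    if extended_vlan:
        length += 4             # extended VLAN tag
    return length
```

For example, a SNAP frame with a software-inserted VLAN tag and no extended VLAN gives 26, matching the table.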
VLAN (16) — 802.1Q VLAN tag to be inserted in the packet during transmission. This VLAN tag is
inserted and needed only when a packet using this context has its DCMD.VLE bit set. This field should
include the entire 16-bit VLAN field including the CFI and Priority fields as shown in Table 7-28.
Note:
The VLAN tag should be sent in network order.
7.2.2.2.3
IPsec SA IDX (8)
IPsec SA Index. If an IPsec offload is requested for the packet (IPSEC bit is set in the advanced Tx data
descriptor), indicates the index in the SA table where the IPsec key and SALT are stored for that flow.
7.2.2.2.4
Reserved (24)
7.2.2.2.5
IPS_ESP_LEN (9)
Size of the ESP trailer and ESP ICV appended by software. Meaningful only if the IPSEC_TYPE bit is set
in the TUCMD field and to single send packets for which the IPSEC bit is set in their advanced Tx data
descriptor.
7.2.2.2.6
TUCMD (11)
• RSV (bit 10-6) — Reserved
• Encryption (bit 5) — ESP encryption offload is required. Meaningful only to packets for which the
IPSEC bit is set in their advanced Tx data descriptor.
• IPSEC_TYPE (bit 4) — Set for ESP. Cleared for AH. Meaningful only to packets for which the IPSEC
bit is set in their advanced Tx data descriptor.
• L4T (bit 3:2) — L4 Packet TYPE (00b: UDP; 01b: TCP; 10b: SCTP; 11b: RSV)
• IPV4 (bit 1) — IP Packet Type: When 1b, IPv4; when 0b, IPv6
• SNAP (bit 0) — SNAP indication
7.2.2.2.7
DTYP (4)
Always 0010b for this type of descriptor.
7.2.2.2.8
RSV (5)
Reserved.
7.2.2.2.9
DEXT
Descriptor Extension (1b for advanced mode).
7.2.2.2.10
RSV (6)
Reserved.
7.2.2.2.11
IDX (3)
Index into the hardware context table where this context is stored.
7.2.2.2.12
RSV (1)
7.2.2.2.13
L4LEN (8)
Layer 4 header length. If TSE is set in the data descriptor pointing to this context, this field must be
greater than or equal to 12 and less than or equal to 255. Otherwise, this field is ignored.
7.2.2.2.14
MSS (16)
Controls the Maximum Segment Size (MSS). This specifies the maximum TCP payload segment sent
per frame, not including any header or trailer. The total length of each frame (or section) sent by the
TCP/UDP segmentation mechanism (excluding Ethernet CRC) is equal to:
MACLEN + 4 (if VLE is set) + 4 or 8 (if CMTGI is set, or if RLTTGI is also set, assuming BCNTLEN is clear) +
IPLEN + L4LEN + MSS + [PADLEN + 18] (if ESP packet)
The one exception is the last packet of a TCP/UDP segmentation, which is typically shorter.
MSS is ignored when DCMD.TSE is not set.
PADLEN ranges from 0 to 3 in Tx. It is the content of the ESP Padding Length field that is computed
when offloading ESP in cipher blocks of 16-bytes (AES-128) with respect to the following alignment
formula:
[L4LEN + MSS + PADLEN + 2] modulo(4) = 0
For single send packets: IPS_ESP_LEN = PADLEN + 18.
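The alignment formula and the total-length expression above can be checked with a short calculation. The following is a minimal sketch; the function names and the `extra_tag_bytes` parameter (standing in for the 4-byte or 8-byte CMTGI/RLTTGI term) are hypothetical.

```python
def esp_padlen(l4len: int, mss: int) -> int:
    """Smallest PADLEN in 0..3 with (L4LEN + MSS + PADLEN + 2) % 4 == 0."""
    return (-(l4len + mss + 2)) % 4


def tso_frame_length(maclen: int, iplen: int, l4len: int, mss: int,
                     vle: bool = False, extra_tag_bytes: int = 0,
                     esp: bool = False) -> int:
    """Length of a full-size segment, excluding the Ethernet CRC."""
    if extra_tag_bytes not in (0, 4, 8):
        raise ValueError("extra_tag_bytes must be 0, 4 or 8")
    total = maclen + iplen + l4len + mss + extra_tag_bytes
    if vle:
        total += 4                              # VLAN tag added by hardware
    if esp:
        total += esp_padlen(l4len, mss) + 18    # [PADLEN + 18] term
    return total
```

With MACLEN = 14, IPLEN = 20, L4LEN = 20, and MSS = 1460, the sketch gives 1514 bytes for an untagged frame and 1518 bytes when VLE is set.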
Note:
The header lengths must meet the following:
MACLEN + IPLEN + L4LEN <= 512
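The field widths listed in this section add up to two 64-bit words, which suggests the packing sketched below. This is an illustrative sketch only: the function names are hypothetical, the bit positions are derived from the field widths given above, and the little-endian byte order of the two quadwords in host memory is an assumption.

```python
import struct

DTYP_CONTEXT = 0b0010  # descriptor type of the advanced context descriptor


def check_header_lengths(maclen: int, iplen: int, l4len: int, tse: bool) -> None:
    """Apply the range rules quoted in the field descriptions."""
    if not 14 <= maclen <= 127:
        raise ValueError("MACLEN must be in 14..127 when an offload is requested")
    if not 20 <= iplen <= 511:
        raise ValueError("IPLEN must be in 20..511 when an offload is requested")
    if tse and not 12 <= l4len <= 255:
        raise ValueError("L4LEN must be in 12..255 when TSE is set")
    if maclen + iplen + l4len > 512:
        raise ValueError("MACLEN + IPLEN + L4LEN must not exceed 512")


def pack_context_descriptor(maclen: int, iplen: int, l4len: int = 0, mss: int = 0,
                            vlan: int = 0, sa_idx: int = 0, esp_len: int = 0,
                            tucmd: int = 0, idx: int = 0, tse: bool = False) -> bytes:
    """Build the 16-byte context descriptor for an offload request."""
    check_header_lengths(maclen, iplen, l4len, tse)
    widths = (("VLAN", vlan, 16), ("SA index", sa_idx, 8), ("ESP_LEN", esp_len, 9),
              ("TUCMD", tucmd, 11), ("IDX", idx, 3), ("L4LEN", l4len, 8), ("MSS", mss, 16))
    for name, value, bits in widths:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    # First quadword: IPLEN[8:0], MACLEN[15:9], VLAN[31:16], SA index[39:32].
    q0 = iplen | (maclen << 9) | (vlan << 16) | (sa_idx << 32)
    # Second quadword: ESP_LEN[8:0], TUCMD[19:9], DTYP[23:20], DEXT[29],
    # IDX[38:36], L4LEN[47:40], MSS[63:48]; reserved bits stay zero.
    q1 = (esp_len | (tucmd << 9) | (DTYP_CONTEXT << 20) | (1 << 29)
          | (idx << 36) | (l4len << 40) | (mss << 48))
    return struct.pack("<QQ", q0, q1)  # assumption: little-endian in host memory
```

A TCP over IPv4 segmentation context would pass `tucmd=0b0110` (L4T = 01b, IPV4 = 1b) together with `tse=True`.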
The context descriptor requires valid data only in the fields used by the specific offload options. The
following table lists the required valid fields according to the different offload options.
Table 7-31. Valid Field in Context vs. Required Offload

Required Offload            Valid Fields in Context
TSE  TXSM  IXSM  IPSEC      VLAN  L4LEN  IPLEN  MACLEN  MSS  L4T  IPV4  IPsec SA Index  IPsec ESP_LEN
1b   1b    X     0b         VLE   Yes    Yes    Yes     Yes  Yes  Yes   No              No
1b   1b    X     1b         VLE   Yes    Yes    Yes     Yes  Yes  Yes   Yes             IPSEC_TYPE
0b   1b    X     0b         VLE   No     Yes    Yes     No   Yes  Yes   No              No
0b   1b    X     1b         VLE   No     Yes    Yes     No   Yes  Yes   Yes             IPSEC_TYPE
0b   0b    1b    0b         VLE   No     Yes    Yes     No   No   Yes   No              No
0b   0b    1b    1b         VLE   No     Yes    Yes     No   No   Yes   Yes             IPSEC_TYPE
0b   0b    0b    0b         No context required unless VLE is set.
0b   0b    0b    1b         VLE   No     Yes    Yes     No   No   Yes   Yes             IPSEC_TYPE

7.2.2.3
Advanced Transmit Data Descriptor
Table 7-32. Advanced Tx Descriptor Read Format

Byte offset 0:
Bits   63:0
Field  Address[63:0]

Byte offset 8:
Bits   63:46    45:40   39   38:36   35:32   31:24   23:20   19:18   17:16   15:0
Field  PAYLEN   POPTS   CC   IDX     STA     DCMD    DTYP    MAC     RSV     DTALEN

Table 7-33. Advanced Tx Descriptor Write-Back Format

Byte offset 0:
Bits   63:0
Field  RSV

Byte offset 8:
Bits   63:36   35:32   31:0
Field  RSV     STA     RSV

Note:
For frames that span multiple descriptors, all fields apart from DCMD.EOP, DCMD.RS,
DCMD.DEXT, DTALEN, Address and DTYP are valid only in the first descriptor and are
ignored in the subsequent ones.
7.2.2.3.1
Address (64)
Physical address of a data buffer in host memory that contains a portion of a transmit packet.
7.2.2.3.2
DTALEN (16)
Length in bytes of data buffer at the address pointed to by this specific descriptor.
Note:
If the TCTL.PSP bit is set, the total length of the packet transmitted, not including FCS,
should be at least 17 bytes.
7.2.2.3.3
RSV (2)
Reserved.
7.2.2.3.4
MAC (2)
• ILSec (bit 0) - Apply MACSec on packet
• 1588 (bit 1) — IEEE1588 Timestamp packet.
When ILSec is set, hardware includes the MACSec header (SecTAG) and MACSec header digest
(signature). The MACSec processing is defined by the Enable Tx MACSec field in the LSECTXCTRL
register. The ILSec bit in the packet descriptor should not be set if MACSec processing is not enabled by
the Enable Tx MACSec field. If the ILSec bit is set erroneously while the Enable Tx MACSec field is set to
00b, then the packet is dropped.
7.2.2.3.5
DTYP (4)
0011b is the value for this descriptor type.
7.2.2.3.6
DCMD (8)
• TSE (bit 7) — TCP/UDP Segmentation Enable
• VLE (bit 6) — VLAN Packet Enable
• DEXT (bit 5) — Descriptor Extension (1b for advanced mode)
• Reserved (bit 4)
• RS (bit 3) — Report Status
• Reserved (bit 2)
• IFCS (bit 1) — Insert FCS
• EOP (bit 0) — End Of Packet
TSE indicates a TCP/UDP segmentation request. When TSE is set in the first descriptor of a TCP packet,
hardware must use the corresponding context descriptor in order to perform TCP segmentation. The
type of segmentation applied is defined according to the TUCMD.L4T field in the context descriptor.
Note:
It is recommended that TCTL.PSP be enabled when TSE is used since the last frame can be
shorter than 60 bytes - resulting in a bad frame if PSP is disabled.
VLE indicates that the packet is a VLAN packet and hardware must add the VLAN Ethertype and an
802.1q VLAN tag to the packet.
DEXT must be 1b to indicate advanced descriptor format (as opposed to legacy).
RS signals hardware to report the status information. This is used by software that does in-memory
checks of the transmit descriptors to determine which ones are done. For example, if software queues
up 10 packets to transmit, it can set the RS bit in the last descriptor of the last packet. If software
maintains a list of descriptors with the RS bit set, it can look at them to determine if all packets up to
(and including) the one with the RS bit set have been buffered in the output FIFO. This is done by
looking at the status byte and checking the DD bit. If DD is set, the descriptor has been processed.
Refer to the sections that follow for the layout of the status field.
Note:
Descriptors with zero length transfer no data.
When IFCS is set, hardware appends the MAC FCS at the end of the packet. When cleared, software
should calculate the FCS for proper CRC check. There are several cases in which software must set
IFCS:
• Transmitting a short packet while padding is enabled by the TCTL.PSP bit.
• Checksum offload is enabled by either the TXSM or IXSM bit in the TDESC.POPTS.
• VLAN header insertion enabled by the VLE bit in the TDESC.DCMD.
• TCP/UDP segmentation offload enabled by TSE bit in the TDESC.DCMD.
EOP indicates whether this is the last buffer of a packet.
7.2.2.3.7
STA (4)
• Rsv (bits 1-3) — Reserved
• DD (bit 0) — Descriptor Done
7.2.2.3.8
IDX (3)
Index into the hardware context table to indicate which context should be used for this request. If no
offload is required, this field is not relevant and no context needs to be initiated before the packet is
sent. See Table 7-31 for details on which packets require a context reference.
7.2.2.3.9
RSV (1)
Reserved. Set to 0.
7.2.2.3.10
POPTS (6)
• RSV (bit 5:3) — Reserved
• IPSEC (bit 2) — IPSec Offload Request
• TXSM (bit 1) — Insert L4 Checksum
• IXSM (bit 0) — Insert IP Checksum
TXSM, when set, indicates that L4 checksum should be inserted. In this case, TUCMD.L4T indicates
whether the checksum is TCP, UDP, or SCTP.
When DCMD.TSE is set, TXSM must be set to 1b.
If this bit is set, the packet should at least contain a TCP header.
IXSM, when set, indicates that IP checksum should be inserted. For IPv6 packets, this bit must be
cleared.
If the DCMD.TSE bit is set, and TUCMD.IPV4 is set, IXSM must be set as well.
If this bit is set, the packet should at least contain an IP header.
7.2.2.3.11
PAYLEN (18)
PAYLEN indicates the size (in byte units) of the data buffer(s) in host memory for transmission. In a
single send packet, PAYLEN defines the entire packet size fetched from host memory. It does not
include the fields that hardware adds such as: optional VLAN tagging, Ethernet CRC or Ethernet
padding. When MACSec offload is enabled, it does not include the MACSec encapsulation. When IPsec
offload is enabled, it does not include the ESP trailer added by hardware. In a large send case
(regardless if it is transmitted on a single or multiple packets), PAYLEN defines the protocol payload size
fetched from host memory. In TCP or UDP segmentation offload, PAYLEN defines the TCP/UDP payload
size.
Note:
When a packet spreads over multiple descriptors, all the descriptor fields are only valid in
the first descriptor of the packet, except for RS, which is always checked, DTALEN that
reflects the size of the buffer in the current descriptor and EOP, which is always set at last
descriptor of the series.
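The read-format fields described in this section can be assembled as in the following minimal sketch. The constant and function names are hypothetical; the bit positions follow the field widths and bit numbers given above, and the little-endian layout of the two quadwords in host memory is an assumption.

```python
import struct

# DCMD bits as listed in the DCMD field description.
DCMD_EOP, DCMD_IFCS, DCMD_RS = 1 << 0, 1 << 1, 1 << 3
DCMD_DEXT, DCMD_VLE, DCMD_TSE = 1 << 5, 1 << 6, 1 << 7
# POPTS bits as listed in the POPTS field description.
POPTS_IXSM, POPTS_TXSM, POPTS_IPSEC = 1 << 0, 1 << 1, 1 << 2
DTYP_DATA = 0b0011  # descriptor type of the advanced data descriptor


def pack_data_descriptor(address: int, dtalen: int, paylen: int, dcmd: int,
                         popts: int = 0, idx: int = 0, mac: int = 0) -> bytes:
    """Build a 16-byte advanced Tx data descriptor in read format."""
    widths = (("Address", address, 64), ("DTALEN", dtalen, 16), ("PAYLEN", paylen, 18),
              ("DCMD", dcmd, 8), ("POPTS", popts, 6), ("IDX", idx, 3), ("MAC", mac, 2))
    for name, value, bits in widths:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    dcmd |= DCMD_DEXT  # DEXT must be 1b for the advanced format
    # DTALEN[15:0], MAC[19:18], DTYP[23:20], DCMD[31:24], STA[35:32] left at 0,
    # IDX[38:36], POPTS[45:40], PAYLEN[63:46].
    q1 = (dtalen | (mac << 18) | (DTYP_DATA << 20) | (dcmd << 24)
          | (idx << 36) | (popts << 40) | (paylen << 46))
    return struct.pack("<QQ", address, q1)  # assumption: little-endian in host memory


# A single-buffer 60-byte packet with FCS insertion and status reporting.
desc = pack_data_descriptor(0x1000, dtalen=60, paylen=60,
                            dcmd=DCMD_EOP | DCMD_IFCS | DCMD_RS)
```

STA is written as zero so that a later DD write-back by hardware can be distinguished from the value software stored.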
7.2.2.4
Transmit Descriptor Ring Structure
The transmit descriptor ring structure is shown in Figure 7-12. A pair of hardware registers maintains
each transmit descriptor ring in the host memory. New descriptors are added to the queue by software
by writing descriptors into the circular buffer memory region and moving the tail pointer associated
with that queue. The tail pointer points to one entry beyond the last hardware owned descriptor.
Transmission continues up to the descriptor where head equals tail at which point the queue is empty.
Descriptors passed to hardware should not be manipulated by software until the head pointer has
advanced past them.
Figure 7-12.
Transmit Descriptor Ring Structure
The shaded boxes in the figure represent descriptors that are not currently owned by hardware that
software can modify.
The transmit descriptor ring is described by the following registers:
• Transmit Descriptor Base Address register (TDBA 0-15):
This register indicates the start address of the descriptor ring buffer in the host memory; this 64-bit
address is aligned on a 16-byte boundary and is stored in two consecutive 32-bit registers.
Hardware ignores the lower four bits.
• Transmit Descriptor Length register (TDLEN 0-15):
This register determines the number of bytes allocated to the circular buffer. This value must be
zero modulo 128.
• Transmit Descriptor Head register (TDH 0-15):
This register holds a value that is an offset from the base and indicates the in-progress descriptor.
There can be up to 64 K descriptors in the circular buffer. Reading this register returns the value of
head corresponding to descriptors already loaded in the output FIFO. This register reflects the
internal head of the hardware write-back process including the descriptor in the posted write pipe
and might point further ahead than the last descriptor actually written back to the memory.
• Transmit Descriptor Tail register (TDT 0-15):
This register holds a value, which is an offset from the base, and indicates the location beyond the
last descriptor hardware can process. This is the location where software writes the first new
descriptor.
The driver should not hand the 82576 descriptors that describe a partial packet.
Consequently, the number of descriptors used to describe a packet cannot be larger than the ring
size.
The base register indicates the start of the circular descriptor queue and the length register indicates
the maximum size of the descriptor ring. The lower seven bits of length are hard wired to 0b. Byte
addresses within the descriptor buffer are computed as follows: address = base + (ptr * 16), where ptr
is the value in the hardware head or tail register.
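The ring arithmetic described above can be sketched as follows. The helper names are hypothetical; the 16-byte descriptor size, the 128-byte length granularity, and the head/tail semantics are taken from the register descriptions in this section.

```python
DESC_SIZE = 16  # bytes per transmit descriptor


def ring_entries(tdlen_bytes: int) -> int:
    """Number of descriptors in a ring whose TDLEN is tdlen_bytes."""
    if tdlen_bytes <= 0 or tdlen_bytes % 128:
        raise ValueError("TDLEN must be a non-zero multiple of 128 bytes")
    return tdlen_bytes // DESC_SIZE


def descriptor_address(base: int, ptr: int) -> int:
    """Byte address of the descriptor at head/tail offset ptr."""
    if base % 16:
        raise ValueError("ring base must be 16-byte aligned")
    return base + ptr * DESC_SIZE


def hardware_owned(head: int, tail: int, entries: int) -> int:
    """Descriptors between head and tail; zero means the queue is empty."""
    return (tail - head) % entries


def advance(ptr: int, entries: int, count: int = 1) -> int:
    """Move a head or tail offset forward with wrap-around."""
    return (ptr + count) % entries
```

For a 4096-byte ring this gives 256 entries, and a head of 250 with a tail of 2 means eight descriptors are still owned by hardware.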
The size chosen for the head and tail registers permits a maximum of 65528 (64 K minus 8) descriptors, or
approximately 16 K packets for the transmit queue given an average of four descriptors per packet.
Once activated, hardware fetches the descriptor indicated by the hardware head register. The hardware
tail register points one beyond the last valid descriptor. Software can detect which packets have
already been processed by hardware as follows:
• Read the head register to determine which packets (those logically before the head) have been
transferred to the on-chip FIFO or transmitted. Note that this method is not recommended as races
between the internal update of the head register and the actual write-back of descriptors might
occur.
• Read the value of the head as stored at the address pointed to by the TDWBAH/TDWBAL pair.
• Track the DD bits in the descriptor ring.
All the registers controlling the descriptor rings behavior should be set before transmit is enabled, apart
from the tail registers which are used during the regular flow of data.
Note:
Software can determine if a packet has been sent by any of three methods: setting the
RS bit in the transmit descriptor command field and checking the DD bit, performing a PIO read
of the transmit head register, or reading the head value written by the 82576 to the address
pointed to by the TDWBAL and TDWBAH registers (see Section 7.2.3 for details).
Checking the transmit descriptor DD bit or head value in memory eliminates a potential
race condition. All descriptor data is written to the I/O bus prior to incrementing the head
register, but a read of the head register could pass the data write in systems performing
I/O write buffering. Updates to transmit descriptors use the same I/O write path and follow
all data writes. Consequently, they are not subject to the race.
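The DD-bit method can be sketched as below, with the ring modeled as a plain list of status values as software would read them from host memory. The function name and the bookkeeping queue of RS-marked indices are hypothetical; real drivers keep equivalent state in their own structures.

```python
from collections import deque

STA_DD = 0x1  # bit 0 of the status field


def reclaim_completed(status, pending_rs, next_to_clean, entries):
    """Release descriptors up to each RS-marked descriptor whose DD bit is set.

    status        per-descriptor STA values as read from host memory
    pending_rs    deque of ring indices that were queued with RS set, oldest first
    next_to_clean first ring index not yet returned to software
    Returns the updated next_to_clean.
    """
    while pending_rs and status[pending_rs[0]] & STA_DD:
        done = pending_rs.popleft()
        status[done] = 0                        # clear so a stale DD is not seen on reuse
        next_to_clean = (done + 1) % entries    # everything up to 'done' is free
    return next_to_clean
```

Because only memory written by hardware is inspected, this approach avoids the head-register race described in the note.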
In general, hardware prefetches packet data prior to transmission. Hardware typically updates the
value of the head pointer after storing data in the transmit FIFO.
7.2.2.5
Transmit Descriptor Fetching
The descriptor processing strategy for transmit descriptors is essentially the same as for receive
descriptors except that a different set of thresholds is used. As for receives, the number of on-chip
transmit descriptors has been increased (from 8 to 64) and the fetch and write-back algorithms
have been modified.
When an on-chip descriptor buffer is empty, a fetch happens as soon as any descriptors are made
available (host writes to the tail pointer). If several on-chip descriptor queues are in this situation at the
same time, the highest indexed queue must be served first and so forth, down to the lowest indexed
queue.
A queue is considered empty for the transmit descriptor fetch algorithm as long as:
• There is still not at least one complete packet (single or large send) in its corresponding internal
queue.
• There is no descriptor already on its way from system memory to the internal cache.
• The corresponding internal descriptor cache is not full.
Each time a descriptor fetch request is sent for an empty queue, the maximum available number of
descriptors is requested, regardless of cache alignment issues.
When the on-chip buffer is nearly empty (TXDCTL[n].PTHRESH), a prefetch is performed each time
enough valid descriptors (TXDCTL[n].HTHRESH) are available in host memory and no other DMA
activity of greater priority is pending (descriptor fetches and write-backs or packet data transfers). If
several on-chip descriptor queues are in this situation at the same time, then start from the more
starved queue, and among those equally starved, start from the highest indexed queue, as before.
Note:
The starvation level of a queue corresponds to the number of descriptors above the
prefetch threshold that are already in the internal queue. The queue is more starved if there
are fewer descriptors in the internal queue. Starvation levels might be compared roughly,
not at the descriptor level of resolution.
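The prefetch ordering rule (most starved queue first, highest index among equally starved queues) can be modeled as a simple selection. This is only a software model of the stated rule with a hypothetical function name; it compares exact descriptor counts, whereas hardware might compare starvation levels more coarsely.

```python
def pick_prefetch_queue(cached):
    """Choose which eligible queue is served first.

    cached maps a queue index to the number of descriptors currently held
    in that queue's on-chip descriptor buffer. Fewer descriptors means more
    starved; ties go to the highest queue index.
    """
    if not cached:
        raise ValueError("no queue is eligible for prefetch")
    return min(cached, key=lambda q: (cached[q], -q))
```

For example, with queues 0, 3, and 7 holding 4, 2, and 2 descriptors, queue 7 is selected.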
When the number of descriptors in host memory is greater than the available on-chip descriptor
storage, the 82576 might elect to perform a fetch that is not a multiple of cache-line size. Hardware
performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a
cache-line boundary. This enables the descriptor fetch mechanism to be more efficient in the cases
where it has fallen behind software.
Note:
The 82576 NEVER fetches descriptors beyond the descriptor tail pointer.
7.2.2.6
Transmit Descriptor Write-Back
The descriptor write-back policy for transmit descriptors is similar to that for receive descriptors when
the TXDCTL[n].WTHRESH value is not 0b. In this case, all descriptors are written back regardless of the
value of their RS bit.
When the TXDCTL[n].WTHRESH value is 0b, since transmit descriptor write-backs do not happen for
every descriptor (controlled by RS in the transmit descriptor), only descriptors that have RS bit set are
written back.
Any descriptor write-back includes the full 16 bytes of the descriptor.
Since the benefit of delaying and then bursting transmit descriptor write-backs is small at best, it is
likely that the threshold is left at the default value (0b) to force immediate write-back of transmit
descriptors and to preserve backward compatibility.
Descriptors are written back in one of three cases:
• TXDCTL[n].WTHRESH = 0b and a descriptor which has RS set is ready to be written back
• The corresponding EITR counter has reached zero
• TXDCTL[n].WTHRESH > 0b and TXDCTL[n].WTHRESH descriptors have accumulated
For the first condition, write-backs are immediate. This is the default operation and is backward
compatible with previous device implementations.
The other two conditions are only valid if descriptor bursting is enabled (Section 8.12.13). In the
second condition, the EITR counter is used to force timely write-back of descriptors. The first packet
after timer initialization starts the timer. Timer expiration flushes any accumulated descriptors and sets
an interrupt event (TXDW).
For the final condition, if TXDCTL[n].WTHRESH descriptors are ready for write-back, the write-back is
performed.
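The three write-back conditions can be summarized as a small decision function. This is a reading of the rules above rather than a description of the hardware logic; the function and parameter names are hypothetical, and "descriptor bursting enabled" is taken here to mean a non-zero WTHRESH.

```python
def write_back_due(wthresh: int, rs_ready: bool, accumulated: int,
                   eitr_expired: bool) -> bool:
    """Return True when a descriptor write-back would be triggered.

    wthresh       value of TXDCTL[n].WTHRESH
    rs_ready      a processed descriptor with RS set is waiting
    accumulated   processed descriptors not yet written back
    eitr_expired  the corresponding EITR counter has reached zero
    """
    if wthresh == 0:
        # Default mode: immediate write-back, only for descriptors with RS set.
        return rs_ready
    # Bursting mode: flush on timer expiry or once WTHRESH descriptors accumulate.
    return eitr_expired or accumulated >= wthresh
```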
An additional mode, in which transmit descriptors are not written back at all and the head pointer of the
descriptor ring is written instead, is described in Section 7.2.3.
7.2.3
Tx Completions Head Write-Back
In legacy hardware, transmit requests are completed by writing the DD bit to the transmit descriptor
ring. This causes cache thrash since both the software device driver and hardware are writing to the
descriptor ring in host memory. Instead of writing the DD bits to signal that a transmit request
completed, hardware can write the contents of the descriptor queue head to host memory. The
software device driver reads that memory location to determine which transmit requests are complete.
In order to improve the performance of this feature, the software device driver needs to program DCA
registers to configure which CPU is processing each TX queue.
7.2.3.1
Description
The head counter is reflected in a memory location that is allocated by software, for each queue.
Head write-back occurs if TDWBAL#.Head_WB_En is set for this queue and the RS bit is set in the Tx
descriptor, following corresponding data upload into packet buffer. If the head write-back feature is
enabled, the 82576 ignores WTHRESH and takes into account only descriptors with the RS bit set (as if
WTHRESH was set to 0b). In addition, the head write-back occurs upon EITR expiration for queues where
the WB_on_EITR field in TDWBAL is set.
The software device driver controls this feature through the Tx queue 0-15 head write-back address
registers, low and high (thus allowing a 64-bit address). See Section 8.12.8 and Section 8.12.9.
The LSBs of the low register hold the control bits.
• The Head_WB_En bit enables activation of head write-back. In this case, no descriptor write-back is
executed.
• The 30 upper bits of this register hold the lowest 32 bits of the head write-back address, assuming
that the two last bits are zero.
The high register holds the high part of the 64-bit address.
Note:
Hardware writes a full Dword when writing this value, so software should reserve enough
space for each head value and make sure the TDWBAL value is Dword aligned.
If software enables Head Write-Back, it must also disable PCI Express Relaxed Ordering on
the write-back transactions. This is done by disabling bit 11 in the TXCTL register for each
active transmit queue. See Section 8.13.2.
The 82576 might update the Head with values that are larger than the last Head pointer
which holds a descriptor with RS bit set, but the value always points to a free
descriptor (a descriptor that is no longer owned by the 82576).
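A driver using head write-back needs to program the address registers and then interpret the head value found in memory. The following is a minimal sketch: the function names are hypothetical, and the exact positions of Head_WB_En and WB_on_EITR within the two low control bits are assumptions, since this section states only that the control bits occupy the LSBs of the low register.

```python
HEAD_WB_EN = 1 << 0   # assumed position within the two control bits
WB_ON_EITR = 1 << 1   # assumed position within the two control bits


def head_wb_registers(addr: int, wb_on_eitr: bool = False):
    """Return (low, high) register values for a head write-back address."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    if addr % 4:
        raise ValueError("head write-back address must be Dword aligned")
    low = (addr & 0xFFFFFFFC) | HEAD_WB_EN   # upper 30 bits carry the address
    if wb_on_eitr:
        low |= WB_ON_EITR
    high = addr >> 32
    return low, high


def reclaimable(head_wb: int, next_to_clean: int, entries: int) -> int:
    """Descriptors software may take back, given the head value read from memory.

    Descriptors before the reported head are treated as released by hardware.
    """
    return (head_wb - next_to_clean) % entries
```

For a 256-entry ring with `next_to_clean` at 250 and a written-back head of 3, nine descriptors can be reclaimed.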
7.2.4
TCP/UDP Segmentation
Hardware TCP segmentation is one of the offloading options supported by the Windows* and Linux*
TCP/IP stack. This is often referred to as TCP Segmentation Offloading or TSO. This feature enables the
TCP/IP stack to pass to the network device driver a message to be transmitted that is bigger than the
Maximum Transmission Unit (MTU) of the medium. It is then the responsibility of the software device driver
and hardware to divide the TCP message into MTU size frames that have appropriate layer 2 (Ethernet),
3 (IP), and 4 (TCP) headers. These headers must include sequence number, checksum fields, options
and flag values as required. Note that some of these values (such as the checksum values) are unique
for each packet of the TCP message and other fields such as the source IP address are constant for all
packets associated with the TCP message.
The 82576 also supports UDP segmentation for embedded applications, although this offload is not
supported by the regular Windows* and Linux* stacks. Any reference in this section to TCP
segmentation should be considered as referring to both TCP and UDP segmentation.
Padding (TCTL.PSP) must be enabled in TCP segmentation mode, since the last frame might be shorter
than 60 bytes, resulting in a bad frame if PSP is disabled.
The offloading of these mechanisms to the software device driver and the 82576 saves significant CPU
cycles. Note that the software device driver shares the additional tasks to support these options.
7.2.4.1
Assumptions
The following assumptions apply to the TCP segmentation implementation in the 82576:
• The RS bit operation is not changed.
• Interrupts are set after data in buffers pointed to by individual descriptors is transferred (DMA'd) to
hardware.
7.2.4.2
Transmission Process
The transmission process for regular (non-TCP segmentation) packets involves:
• The protocol stack receives from an application a block of data that is to be transmitted.
• The protocol stack calculates the number of packets required to transmit this block based on the
MTU size of the media and required packet headers.
For each packet of the data block:
• Ethernet, IP and TCP/UDP headers are prepared by the stack.
• The stack interfaces with the software device driver and commands it to send the individual packet.
• The software device driver gets the frame and interfaces with the hardware.
• The hardware reads the packet from host memory (via DMA transfers).
• The software device driver returns ownership of the packet to the Network Operating System (NOS)
when hardware has completed the DMA transfer of the frame (indicated by an interrupt).
The transmission process for the 82576 TCP segmentation offload implementation involves:
• The protocol stack receives from an application a block of data that is to be transmitted.
• The stack interfaces to the software device driver and passes the block down with the appropriate
header information.
• The software device driver sets up the interface to the hardware (via descriptors) for the TCP
segmentation context.
Hardware DMA's (transfers) the packet data and performs the Ethernet packet segmentation and
transmission based on offset and payload length parameters in the TCP/IP context descriptor including:
• Packet encapsulation
• Header generation and field updates including IPv4, IPv6, and TCP/UDP checksum generation
• The software device driver returns ownership of the block of data to the NOS when hardware has
completed the DMA transfer of the entire data block (indicated by an interrupt).
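The effect of segmentation on a single large send can be modeled in a few lines. The sketch below is a software model with a hypothetical function name; it shows only how the payload is divided into MSS-sized pieces and how the TCP sequence number advances, not the header updates that hardware performs.

```python
def tso_segments(paylen: int, mss: int, first_seq: int = 0):
    """Return (sequence_number, payload_length) for each frame of a large send."""
    if mss <= 0:
        raise ValueError("MSS must be positive")
    if paylen < 0:
        raise ValueError("payload length must not be negative")
    segments = []
    offset = 0
    while offset < paylen:
        length = min(mss, paylen - offset)   # the last segment is typically shorter
        segments.append(((first_seq + offset) % (1 << 32), length))
        offset += length
    return segments
```

A 4000-byte payload with an MSS of 1460 produces three frames carrying 1460, 1460, and 1080 bytes.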
7.2.4.2.1
TCP Segmentation Data Fetch Control
To perform TCP Segmentation in the 82576, the DMA must be able to fit at least one packet of the
segmented payload into available space in the on-chip Packet Buffer. The DMA does various
comparisons between the remaining payload and the Packet Buffer available space, fetching additional
payload and sending additional packets as space permits.
To enable interleaving between descriptor queues at Ethernet frame resolution inside TSO
requests, the frame header pointed to by the so-called header descriptors is reread from
system memory by hardware for every LSO segment. Only the header's descriptors, rather than the
header's content, are stored in an internal cache.
To limit the internal cache dimensions, software is required to spread the header over a
maximum of 4 descriptors, while still being allowed to mix header and data in the last header buffer.
This limitation covers the headers up to and including Layer 4, and applies to both IPv4 and IPv6.
7.2.4.2.2
TCP Segmentation Write-Back Modes
As the TCP segmentation mode uses the buffers that contain the header of the packet multiple times,
there are some limitations on the usage of the different combinations of write-back and buffer release
methods in order to guarantee the header buffers' availability until the entire packet is processed. These
limitations are described in the table below.
Table 7-34. Write Back options for large send

WTHRESH: 0
RS: Set in EOP descriptors only
HEAD Write Back Enable: Disable
Hardware Behavior: Hardware writes back descriptors with RS bit set one at a time.
Software Expected Behavior for TSO packets: Software can retake ownership of all descriptors up to
last descriptor with DD bit set.

WTHRESH: 0
RS: Set in any descriptors
HEAD Write Back Enable: Disable
Hardware Behavior: Hardware writes back descriptors with RS bit set one at a time.
Software Expected Behavior for TSO packets: Software can retake ownership of entire packets (EOP bit
set) up to last descriptor with DD bit set.

WTHRESH: 0
RS: Not set at all
HEAD Write Back Enable: Disable
Hardware Behavior: Hardware does not write back any descriptor (since RS bit is not set).
Software Expected Behavior for TSO packets: Software should poll the TDH register. The TDH register
reflects the last descriptor that software can take ownership of. (See footnote 1.)

WTHRESH: >0
RS: Don't care
HEAD Write Back Enable: Disable
Hardware Behavior: Hardware writes back all the descriptors in bursts and set all the DD bits.
Software Expected Behavior for TSO packets: Software can retake ownership of entire packets up to last
descriptor with both DD and EOP bits set.

WTHRESH: Don't care
RS: Not set at all
HEAD Write Back Enable: Enable
Hardware Behavior: Hardware writes back the head pointer only at EITR expire event reflecting the last
descriptor that software can take ownership of.
Software Expected Behavior for TSO packets: Software may poll the TDH register or use the head value
written back at EITR expire event. The TDH register reflects the last descriptor that software can take
ownership of.

WTHRESH: Don't care
RS: Set in EOP descriptors only
HEAD Write Back Enable: Enable
Hardware Behavior: Hardware writes back the Head pointer per each descriptor with RS bit set. (See
footnote 2.)
Software Expected Behavior for TSO packets: Software can retake ownership of all descriptors up to the
descriptor pointed by the head pointer read from system memory (by interrupt or polling).

WTHRESH: Don't care
RS: Set in any descriptors
HEAD Write Back Enable: Enable
Hardware Behavior: Hardware writes back the Head pointer per each descriptor with RS bit set.
Software Expected Behavior for TSO packets: This mode is illegal since software won't access the
descriptor, it cannot tell when the pointer passed the EOP descriptor.

1. Note that polling of the TDH register is a valid method only when the RS bit is never set, otherwise race conditions between software
and hardware accesses to the descriptor ring can occur.
2. At EITR expire event, the Hardware writes back the head pointer reflecting the last descriptor that software can take ownership of.
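The combinations in Table 7-34 can be captured as a lookup that a driver might use to validate its configuration. The function name, the `rs_policy` values ("eop", "any", "none"), and the returned strings are all hypothetical labels chosen for this sketch.

```python
def tso_completion_rule(wthresh: int, rs_policy: str, head_wb: bool) -> str:
    """Map a write-back configuration to how software learns about completion.

    rs_policy is "eop" (RS only in EOP descriptors), "any" (RS in any
    descriptor) or "none" (RS never set).
    """
    if rs_policy not in ("eop", "any", "none"):
        raise ValueError("unknown RS policy")
    if head_wb:
        if rs_policy == "any":
            # Software does not read descriptors, so it cannot tell whether
            # the head has passed an EOP descriptor.
            raise ValueError("illegal combination for TSO packets")
        if rs_policy == "eop":
            return "head value in host memory"
        return "poll TDH or head value written at EITR expiry"
    if wthresh > 0:
        return "descriptors with both DD and EOP set"
    if rs_policy == "none":
        return "poll TDH"
    if rs_policy == "eop":
        return "descriptors with DD set"
    return "EOP descriptors up to last DD"
```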
7.2.4.3
TCP Segmentation Performance
Performance improvements for a hardware implementation of TCP Segmentation off-load include:
• The stack does not need to partition the block to fit the MTU size, saving CPU cycles.
• The stack only computes one Ethernet, IP, and TCP header per segment, saving CPU cycles.
• The Stack interfaces with the device driver only once per block transfer, instead of once per frame.
• Larger PCI bursts are used which improves bus efficiency (such as lowering transaction overhead).
• Interrupts are easily reduced to one per TCP message instead of one per packet.
• Fewer I/O accesses are required to command the hardware.
7.2.4.4
Packet Format
Typical TCP/IP transmit window size is 8760 bytes (about 6 full size frames). Today the average size on
corporate Intranets is 12-14 KB, and normally the maximum window size allowed is 64 KB (unless
Window Scaling, RFC 1323, is specified). A TCP message can be as large as 256 KB and is generally
fragmented across multiple pages in host memory. The 82576 partitions the data packet into standard
Ethernet frames prior to transmission. The 82576 supports calculating the Ethernet, IP, TCP, and UDP
headers, including checksum, on a frame-by-frame basis.
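As a rough illustration of the "about six full-size frames" figure, the sketch below counts how many frames a message needs for a given MSS. The MSS of 1460 bytes is an assumption (a 1500-byte MTU minus 20-byte IPv4 and 20-byte TCP headers without options); the function name is hypothetical and not part of any driver API.

```python
def frames_needed(payload_len: int, mss: int) -> int:
    """Return the number of frames needed to carry payload_len bytes.

    Assumes every frame except possibly the last carries exactly mss bytes.
    """
    if payload_len < 0 or mss <= 0:
        raise ValueError("payload_len must be >= 0 and mss must be > 0")
    # Integer ceiling division: the last frame may be shorter than mss.
    return (payload_len + mss - 1) // mss
```

With the assumed MSS, an 8760-byte window maps to exactly six frames, and a 256 KB message maps to 180 frames, the last of which is shorter than the others.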
Table 7-35. TCP/IP or UDP/IP Packet Format Sent by Host
| L2/L3/L4 headers | Data |
| Ethernet | IPv4/IPv6 | TCP/UDP | DATA (full TCP message) |
Table 7-36. TCP/IP or UDP/IP Packet Format Sent by 82576
| L2/L3/L4 header (updated) | Data (first MSS) | FCS | ... | L2/L3/L4 header (updated) | Data (Next MSS) | FCS | ... |
Frame formats supported by the 82576 include:
• Ethernet 802.3
• IEEE 802.1Q VLAN (Ethernet 802.3ac)
• Ethernet Type 2
• Ethernet SNAP
• IPv4 headers with options
• IPv4 headers without options with one AH/ESP IPsec header
• IPv6 headers with extensions
• TCP with options
• UDP with options.
VLAN tag insertion might be handled by hardware.
Note:
UDP (unlike TCP) is not a “reliable protocol”, and fragmentation is not supported at the UDP
level. UDP messages that are larger than the MTU size of the given network medium are
normally fragmented at the IP layer. This is different from TCP, where large TCP messages
can be fragmented at either the IP or TCP layers depending on the software
implementation.
The 82576 has the ability to segment UDP traffic (in addition to TCP traffic). However,
because UDP packets are generally fragmented at the IP layer, the 82576's “TCP
Segmentation” feature is not normally well suited to handling UDP traffic.
7.2.4.5
TCP/UDP Segmentation Indication
Software indicates a TCP/UDP Segmentation transmission context to the hardware by setting up a TCP/
IP Context Transmit Descriptor (see Section 7.2.2). The purpose of this descriptor is to provide
information to the hardware to be used during the TCP segmentation off-load process.
Setting the TSE bit in the TUCMD field to 1b indicates that this descriptor refers to the TCP
Segmentation context (as opposed to the normal checksum off loading context). This causes the
checksum off loading, packet length, header length, and maximum segment size parameters to be
loaded from the descriptor into the device.
The TCP Segmentation prototype header is taken from the packet data itself. Software must identify
the type of packet that is being sent (IPv4/IPv6, TCP/UDP, other), calculate appropriate checksum off
loading values for the desired checksum, and calculate the length of the header that is prepended.
The header might be up to 240 bytes in length.
Once the TCP Segmentation context has been set, the next descriptor provides the initial data to
transfer. The first descriptor(s) must point to a packet of the type indicated. Furthermore, the data it
points to might need to be modified by software as it serves as the prototype header for all packets
within the TCP Segmentation context. The following sections describe the supported packet types and
the various updates which are performed by hardware. This should be used as a guide to determine
what must be modified in the original packet header to make it a suitable prototype header.
The following summarizes the fields considered by the driver for modification in constructing the
prototype header.
IP Header
For IPv4 headers:
• Identification Field should be set as appropriate for first packet of send (if not already)
• Header Checksum should be zeroed out unless some adjustment is needed by the driver
TCP Header
• Sequence Number should be set as appropriate for first packet of send (if not already)
• PSH and FIN flags should be set as appropriate for the LAST packet of send
• TCP Checksum should be set to the partial pseudo-header sum as follows (there is a more detailed
discussion of this in Section 7.2.4.6):
Table 7-37. TCP Partial Pseudo-Header Sum for IPv4
| IP Source Address |
| IP Destination Address |
| Zero | Layer 4 Protocol ID | Zero |
Table 7-38. TCP Partial Pseudo-Header Sum for IPv6
| IPv6 Source Address |
| IPv6 Final Destination Address |
| Zero |
| Zero | Next Header |
UDP Header
• Checksum should be set as in TCP header, above
The following sections describe the updating process performed by the hardware for each frame sent
using the TCP Segmentation capability.
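The partial pseudo-header sum in Tables 7-37 and 7-38 can be sketched as follows. This is a minimal illustration, not driver code: the function names are hypothetical, and the sketch returns the folded 16-bit one's complement sum without a final inversion, which is an assumption because this section does not state the exact encoding expected in the checksum field. The length field is left at zero, as the tables show.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """Fold a byte string into a 16-bit one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def partial_pseudo_header_sum(src: str, dst: str, protocol: int) -> int:
    """Sum the pseudo-header with its length field left as zero.

    Works for IPv4 (Table 7-37) and IPv6 (Table 7-38) address strings.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip.version != dst_ip.version:
        raise ValueError("source and destination must be the same IP version")
    if src_ip.version == 4:
        # zero byte, Layer 4 protocol ID, zero 16-bit length
        tail = struct.pack("!BBH", 0, protocol, 0)
    else:
        # zero 32-bit length, three zero bytes, Next Header
        tail = struct.pack("!I3xB", 0, protocol)
    return ones_complement_sum(src_ip.packed + dst_ip.packed + tail)
```

For example, `partial_pseudo_header_sum("192.168.0.1", "192.168.0.2", 6)` sums the two addresses and the TCP protocol ID (6) only; hardware then accounts for the per-frame length when it computes each frame's checksum.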
7.2.4.6
Transmit Checksum Offloading with TCP/UDP Segmentation
The 82576 supports checksum off-loading as a component of the TCP Segmentation off-load feature
and as a standalone capability. Section 7.2.5 describes the interface for controlling the checksum offloading feature. This section describes the feature as it relates to TCP Segmentation.
The 82576 supports IP and TCP header options in the checksum computation for packets that are
derived from the TCP Segmentation feature.
Note:
The 82576 is capable of computing one level of IP header checksum and one TCP/UDP
header and payload checksum. In case of multiple IP headers, the driver needs to compute
all but one IP header checksum. The 82576 calculates checksums on the fly on a frame-by-frame basis and inserts the result in the IP/TCP/UDP headers of each frame.
The TCP and UDP checksums are the result of performing the checksum on all bytes of the payload
and the pseudo header.
The following specific types of checksum are supported by the hardware in the context of the TCP
Segmentation off-load feature:
• IPv4 checksum
• TCP checksum
Each packet that is sent via the TCP segmentation off-load feature optionally includes the IPv4
checksum and the TCP checksum.
All checksum calculations use a 16-bit wide one's complement checksum. The checksum word is
calculated on the outgoing data.
Table 7-39. Supported Transmit Checksum Capabilities
| Packet Type | Hardware IP Checksum Calculation | Hardware TCP/UDP Checksum Calculation |
| IPv4 packets | Yes | Yes |
| IPv6 packets (no IP checksum in IPv6) | NA | Yes |
| Packet is greater than 1518/1522/1526 bytes (LPE=1b) | Yes | Yes |
| Packet has 802.3ac tag | Yes | Yes |
| Packet has IP options (IP header is longer than 20 bytes) | Yes | Yes |
| Packet has TCP or UDP options | Yes | Yes |
| IP header's protocol field contains a protocol # other than TCP or UDP | Yes | No |
The table below summarizes the conditions of when checksum off loading can/should be calculated.
Table 7-40. Conditions for Checksum Off Loading
| Packet Type | IPv4 | TCP/UDP | Reason |
| Non TSO | Yes | No | IP Raw packet (non TCP/UDP protocol) |
| Non TSO | Yes | Yes | TCP segment or UDP datagram with checksum off-load |
| Non TSO | No | No | Non-IP packet or checksum not offloaded |
| TSO | Yes | Yes | For TSO, checksum off-load must be done |
7.2.4.7
IP/TCP/UDP Header Updating
IP/TCP or IP/UDP header is updated for each outgoing frame based on the IP/TCP header prototype
which hardware DMA's from the first descriptor(s). The checksum fields and other header information
are later updated on a frame-by-frame basis. The updating process is performed concurrently with the
packet data fetch.
The following sections define what fields are modified by hardware during the TCP Segmentation
process by the 82576.
Note:
Software must set the PAYLEN and HDRLEN values of the context descriptors correctly. Otherwise,
a Large Send failure due to either under-run or over-run might cause hardware to send
bad packets or even cause TX hardware to hang. A Large Send failure is indicated in the
TSCTFC statistic register.
7.2.4.7.1
TCP/IP/UDP Header for the First Frames
The hardware makes the following changes to the headers of the first packet that is derived from each
TCP segmentation context.
MAC Header (for SNAP)
• Type/Len field = MSS + MACLEN + IPLEN + L4LEN - 14
IPv4 Header
• IP Total Length = MSS + L4LEN + IPLEN
• IP Checksum
IPv6 Header
• Payload Length = MSS + L4LEN + IPV6_HDR_extension (see footnote 1)
TCP Header
• Sequence Number: The value is the Sequence Number of the first TCP byte in this frame.
• The flag values of the first frame are set by ANDing the flag word in the pseudo header with
DTXTCPFLGL.TCP_flg_first_seg. The default value of DTXTCPFLGL.TCP_flg_first_seg is set so
that the FIN flag and the PSH flag are cleared in the first frame.
• TCP Checksum
7.2.4.7.2
TCP/IP/UDP Headers for the Subsequent Frames
The hardware makes the following changes to the headers for subsequent packets that are derived as
part of a TCP segmentation context:
Number of bytes left for transmission = PAYLEN - (N * MSS). Where N is the number of frames that
have been transmitted.
MAC Header (for SNAP Packets)
• Type/Len field = MSS + MACLEN + IPLEN + L4LEN - 14
IPv4 Header
• IP Identification: incremented from last value (wrap around)
• IP Total Length = MSS + L4LEN + IPLEN
• IP Checksum
IPv6 Header
• Payload Length = MSS + L4LEN + IPV6_HDR_extension (see footnote 2)
TCP Header
• Sequence Number update: Add previous TCP payload size to the previous sequence number value.
This is equivalent to adding the MSS to the previous sequence number.
• The flag values of the subsequent frames are set by ANDing the flag word in the pseudo header
with DTXTCPFLGL.TCP_Flg_mid_seg. The default value of DTXTCPFLGL.TCP_Flg_mid_seg
is set so that the FIN flag and the PSH flag are cleared in these frames.
1. IPV6_HDR_extension is calculated as IPLEN - 40 bytes.
2. IPV6_HDR_extension is calculated as IPLEN - 40 bytes.
• TCP Checksum
UDP Header
• UDP Length = MSS + L4LEN
• UDP Checksum
7.2.4.7.3
TCP/IP/UDP Headers for the Last Frame
The hardware makes the following changes to the headers for the last frame of a TCP segmentation
context:
Last frame payload bytes = PAYLEN - (N * MSS)
MAC Header (for SNAP Packets)
• Type/Len field = Last frame payload bytes + MACLEN + IPLEN + L4LEN - 14
IPv4 Header
• IP Total Length = last frame payload bytes + L4LEN + IPLEN
• IP Identification: incremented from last value (wrap around based on 16 bit-width)
• IP Checksum
IPv6 Header
• Payload Length = last frame payload bytes + L4LEN + IPV6_HDR_extension (IPLEN - 40 bytes)
TCP Header
• Sequence Number update: Add previous TCP payload size to the previous sequence number value.
This is equivalent to adding the MSS to the previous sequence number.
• The flag values of the last frame are set by ANDing the flag word in the pseudo header with
DTXTCPFLGH.TCP_Flg_lst_seg. The default value of DTXTCPFLGH.TCP_Flg_lst_seg is set so
that the FIN flag and the PSH flag are set in the last frame.
• TCP Checksum
UDP Header
• UDP Length = last frame payload bytes + L4LEN
• UDP Checksum
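The length and sequence-number rules for first, subsequent, and last frames can be summarized in a short model. This is a sketch under stated assumptions: the `FrameFields` type and `tso_frame_fields` function are hypothetical names, PAYLEN is taken to be the payload length without headers (consistent with "Last frame payload bytes = PAYLEN - (N * MSS)"), and the flag handling reflects only the default DTXTCPFLGL/DTXTCPFLGH behavior described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameFields:
    payload_bytes: int   # TCP/UDP payload carried by this frame
    ip_length: int       # IPv4 Total Length or IPv6 Payload Length
    seq_offset: int      # value added to the prototype sequence number
    keeps_fin_psh: bool  # True only for the last frame (default masks)


def tso_frame_fields(paylen: int, mss: int, iplen: int, l4len: int,
                     ipv6: bool = False) -> List[FrameFields]:
    """Model the per-frame header values produced for one TSO context."""
    if paylen <= 0 or mss <= 0:
        raise ValueError("paylen and mss must be positive")
    # IPv6 Payload Length excludes the 40-byte base header (IPLEN - 40).
    ip_overhead = l4len + (iplen - 40 if ipv6 else iplen)
    frames = []
    sent = 0
    while sent < paylen:
        chunk = min(mss, paylen - sent)  # last frame may be shorter
        is_last = sent + chunk == paylen
        frames.append(FrameFields(chunk, chunk + ip_overhead, sent, is_last))
        sent += chunk
    return frames
```

For a 4000-byte payload with MSS 1460, IPLEN 20, and L4LEN 20, the model yields three frames with IPv4 Total Length values 1500, 1500, and 1120, and sequence offsets 0, 1460, and 2920.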
7.2.4.8
IP/TCP/UDP Checksum Offloading
The 82576 performs checksum off loading as part of the TCP segmentation off-load feature.
The following checksums are supported under TCP segmentation:
• IPv4 checksum
• TCP checksum
See Section 7.2.5 for description of checksum off loading of a single-send packet.
7.2.4.9
Data Flow
The flow used by the 82576 to perform TCP segmentation is as follows:
1. Get a descriptor with a request for a TSO off-load of a TCP packet.
2. First segment processing:
a. Fetch all the buffers containing the header as calculated by the MACLEN, IPLEN, and L4LEN fields. Save the addresses and lengths of the buffers containing the header (up to 4 buffers). The header content is not saved.
b. Fetch data up to the MSS from subsequent buffers and calculate the adequate checksum(s).
c. Update the header accordingly and update the internal state of the packet (next data to fetch and IP SN).
d. Send the packet to the network.
e. If the total packet was sent, go to step 4; else continue.
3. Next segments:
a. Wait for next arbitration of this queue.
b. Fetch all the buffers containing the header from the saved addresses. Subsequent reads of the header might be done with a no snoop attribute.
c. Fetch data up to the MSS or end of packet from subsequent buffers and calculate the adequate checksum(s).
d. Update the header accordingly and update the internal state of the packet (next data to fetch and IP SN).
e. If the total packet was sent, the request is done; else restart from step 3.
4. Release all buffers (update head pointer).
Note:
Descriptors are fetched in a parallel process according to the consumption of the buffers.
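The flow above can be modeled in a few lines to show why the header is re-read for every segment: only the addresses and lengths of the header buffers are saved, not their content. This is a simplified sketch with hypothetical names; it treats host memory as a byte string, represents each saved header buffer as an (offset, length) pair, and deliberately omits the per-frame header updates and checksum calculation.

```python
from typing import List, Sequence, Tuple


def segment(memory: bytes, header_bufs: Sequence[Tuple[int, int]],
            payload: bytes, mss: int) -> List[bytes]:
    """Split payload into frames, re-fetching the header for each one."""
    if len(header_bufs) > 4:
        raise ValueError("the header may span at most 4 buffers")
    if mss <= 0:
        raise ValueError("mss must be positive")
    frames = []
    sent = 0
    while sent < len(payload):
        # Step 2a/3b: header content is not cached, so gather it again
        # from the saved (offset, length) pairs.
        header = b"".join(memory[off:off + n] for off, n in header_bufs)
        # Step 2b/3c: fetch data up to the MSS or the end of the packet.
        chunk = payload[sent:sent + mss]
        frames.append(header + chunk)
        sent += len(chunk)
    return frames
```

In this model, a 10-byte payload with an MSS of 4 produces three frames, each starting with the same gathered header bytes.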
7.2.5
Checksum Offloading in Non-Segmentation Mode
The previous section on TCP Segmentation off-load describes the IP/TCP/UDP checksum off loading
mechanism used in conjunction with TCP Segmentation. The same underlying mechanism can also be
applied as a standalone feature. The main difference in normal packet mode (non-TCP Segmentation) is
that only the checksum fields in the IP/TCP/UDP headers need to be updated.
Before taking advantage of the 82576's enhanced checksum off-load capability, a checksum context
must be initialized. For the normal transmit checksum off-load feature this is performed by providing
the device with a TCP/IP Context Descriptor with TUCMD.TSE=0b. Setting TSE=0b indicates that the
normal checksum context is being set, as opposed to the segmentation context. For additional details
on contexts, refer to Section 7.2.2.4.
Note:
Enabling the checksum off loading capability without first initializing the appropriate
checksum context leads to unpredictable results. CRC appending (CMD.IFCS) must be
enabled in TCP/IP checksum mode, since CRC must be inserted by hardware after the
checksums have been calculated.
As mentioned in Section 7.2.2, Transmit Descriptors, it is not necessary to set a new context for each
new packet. In many cases, the same checksum context can be used for a majority of the packet
stream. In this case, some performance can be gained by only changing the context on an as needed
basis or electing to use the off-load feature only for a particular traffic type, thereby avoiding all context
descriptors except for the initial one.
Each checksum operates independently. Insertion of the IP and TCP checksums for each packet is
enabled through the Transmit Data Descriptor POPTS.IXSM and POPTS.TXSM fields, respectively.
7.2.5.1
IP Checksum
Three fields in the Transmit Context Descriptor set the context of the IP checksum off loading feature:
• TUCMD.IPv4
• IPLEN
• MACLEN
TUCMD.IPv4=1b specifies that the packet type for this context is IPv4, and that the IP header
checksum should be inserted. TUCMD.IPv4=0b indicates that the packet type is IPv6 (or some other
protocol) and that the IP header checksum should not be inserted.
MACLEN specifies the byte offset from the start of the DMA'd data to the first byte to be included in the
checksum, the start of the IP header. The minimal allowed value for this field is 12. Note that the
maximum value for this field is 127. This is adequate for typical applications.
Note:
The MACLEN+IPLEN value needs to be less than the total DMA length for a packet. If this is
not the case, the results are unpredictable.
IPLEN specifies the IP header length. Maximum allowed value for this field is 511 Bytes.
MACLEN+IPLEN specifies where the IP checksum should stop. This is limited to the first 127+511 bytes
of the packet and must be less than or equal to the total length of a given packet. If this is not the case,
the result is unpredictable.
Note:
For IPsec packets offloaded by hardware in Tx, it is assumed that the IPLEN provided by
software in the Tx context descriptor is the sum of the IP header length and the IPsec
header length. Thus, for the IPv4 header checksum off-load, hardware can no longer rely
on the IPLEN field provided by software in the Tx context descriptor, and relies instead on the
fact that no IPv4 options are present in the packet. Consequently, for IPsec off-load packets
hardware always computes the IP header checksum over a fixed 20 bytes.
The 16-bit IPv4 Header Checksum is placed at the two bytes starting at MACLEN+10.
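A software model of the IP checksum insertion described above is sketched below. The function name is hypothetical and the code only mirrors the stated rules: the checksum covers IPLEN bytes starting at offset MACLEN and is written to the two bytes at MACLEN+10. The range checks on MACLEN and IPLEN follow the limits given in this section; requiring IPLEN to be a multiple of 4 is an added assumption based on the IPv4 header format.

```python
import struct


def insert_ipv4_header_checksum(frame: bytearray, maclen: int,
                                iplen: int) -> int:
    """Compute the IPv4 header checksum and store it at MACLEN+10."""
    if not 12 <= maclen <= 127:
        raise ValueError("MACLEN must be in the range 12..127")
    if not 20 <= iplen <= 511 or iplen % 4:
        raise ValueError("IPLEN must be 20..511 and a multiple of 4")
    if maclen + iplen > len(frame):
        raise ValueError("MACLEN+IPLEN exceeds the packet length")
    csum_off = maclen + 10
    frame[csum_off:csum_off + 2] = b"\x00\x00"  # zero the field first
    header = bytes(frame[maclen:maclen + iplen])
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    checksum = ~total & 0xFFFF
    struct.pack_into("!H", frame, csum_off, checksum)
    return checksum
```

With a 14-byte Ethernet header (MACLEN = 14) and a 20-byte IPv4 header without options (IPLEN = 20), the checksum lands at byte offsets 24 and 25 of the frame.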
As mentioned in Section 7.2.2.2, Transmit Contexts, it is not necessary to set a new context for each
new packet. In many cases, the same checksum context can be used for a majority of the packet
stream. In this case, some performance can be gained by only changing the context on an as needed
basis or electing to use the off-load feature only for a particular traffic type, thereby avoiding all context
descriptors except for the initial one.
7.2.5.2
TCP Checksum
Three fields in the Transmit Context Descriptor set the context of the TCP checksum off loading feature:
• MACLEN
• IPLEN
• TUCMD.L4T
TUCMD.L4T=01b specifies that the packet type is TCP, and that the 16-bit TCP header checksum should
be inserted at byte offset MACLEN+IPLEN+16. TUCMD.L4T=00b indicates that the packet is UDP and
that the 16-bit checksum should be inserted starting at byte offset MACLEN+IPLEN+6.
IPLEN+MACLEN specifies the byte offset from the start of the DMA'd data to the first byte to be
included in the checksum, the start of the TCP header. The minimal allowed value for this sum is 32/42
for UDP or TCP respectively.
Note:
The IPLEN+MACLEN+L4LEN value needs to be less than the total DMA length for a packet.
If this is not the case, the results are unpredictable.
The TCP/UDP checksum always continues to the last byte of the DMA data.
Note:
For non-TSO, software still needs to calculate a full checksum for the TCP/UDP pseudo-header. This checksum of the pseudo-header should be placed in the packet data buffer at
the appropriate offset for the checksum calculation.
7.2.5.3
SCTP CRC Offloading
For SCTP packets, a CRC32 checksum offload is provided.
Three fields in the Transmit Context Descriptor set the context of the SCTP checksum off loading
feature:
• MACLEN
• IPLEN
• TUCMD.L4T
TUCMD.L4T=10b specifies that the packet type is SCTP, and that the 32-bit SCTP CRC should be
inserted at byte offset MACLEN+IPLEN+8.
IPLEN+MACLEN specifies the byte offset from the start of the DMA'd data to the first byte to be
included in the checksum, the start of the SCTP header. The minimal allowed value for this sum is 26.
The SCTP CRC calculation always continues to the last byte of the DMA data.
The SCTP total L3 payload size (PAYLEN - IPLEN - MACLEN) should be a multiple of 4 bytes (SCTP
padding not supported).
Note:
TSO is not available for SCTP packets.
Software must initialize the SCTP CRC field to zero (0x00000000).
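The checksum insertion offsets given in Sections 7.2.5.2 and 7.2.5.3 can be collected into one lookup. This is an illustrative sketch: the constant and function names are hypothetical, and the two-bit encodings for UDP (00b) and TCP (01b) are an assumption that extends the single-bit values in the text to the width implied by the SCTP value 10b.

```python
from typing import Tuple

# TUCMD.L4T encodings (UDP/TCP two-bit forms are assumed, see lead-in).
L4T_UDP = 0b00
L4T_TCP = 0b01
L4T_SCTP = 0b10

# Offset of the checksum/CRC field within the L4 header, and its width.
_L4_CHECKSUM_FIELD = {
    L4T_UDP: (6, 2),    # 16-bit UDP checksum
    L4T_TCP: (16, 2),   # 16-bit TCP checksum
    L4T_SCTP: (8, 4),   # 32-bit SCTP CRC
}


def l4_checksum_location(maclen: int, iplen: int, l4t: int) -> Tuple[int, int]:
    """Return (byte offset from start of packet, width in bytes)."""
    if l4t not in _L4_CHECKSUM_FIELD:
        raise ValueError("unsupported L4T value")
    field_offset, width = _L4_CHECKSUM_FIELD[l4t]
    return maclen + iplen + field_offset, width
```

For an untagged Ethernet frame with an option-less IPv4 header (MACLEN = 14, IPLEN = 20), the TCP checksum sits at byte offset 50, the UDP checksum at 40, and the SCTP CRC at 42.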
7.2.5.4
Checksum Supported Per Packet Types
The following table summarizes which checksum is supported per packet type.
Note:
TSO is not supported for packet types for which IP checksum & TCP checksum can not be
calculated.
Table 7-41. Checksum Per Packet Type
| Packet Type | Hardware IP Checksum Calculation | Hardware TCP/UDP/SCTP Checksum Calculation |
| IPv4 packets | Yes | Yes |
| IPv6 packets | No (n/a) | Yes |
| IPv6 packet with next header options: | | |
| • Hop-by-Hop options | No (n/a) | Yes |
| • Destinations options | No (n/a) | Yes |
| • Routing (w len 0b) | No (n/a) | Yes |
| • Routing (w len >0b) | No (n/a) | No |
| • Fragment | No (n/a) | No |
| • Home option | No (n/a) | No |
| • Security Option (AH/ESP) | Yes (1) | Yes (1) |
| IPv4 tunnels: | | |
| • IPv4 packet in an IPv4 tunnel | Either IP or TCP/SCTP (2) | Either IP or TCP/SCTP (2) |
| • IPv6 packet in an IPv4 tunnel | Either IP or TCP/SCTP (2) | Either IP or TCP/SCTP (2) |
| IPv6 tunnels: | | |
| • IPv4 packet in an IPv6 tunnel | No | Yes |
| • IPv6 packet in an IPv6 tunnel | No | Yes |
| Packet is an IPv4 fragment | Yes | No |
| Packet is greater than 1518/1522/1526 bytes (LPE=1b) | Yes | Yes |
| Packet has 802.3ac tag | Yes | Yes |
| IPv4 packet has IP options and no IPsec header (IP header is longer than 20 bytes) | Yes | Yes |
| IPv4 packet has IPsec header without IP options | Yes (1) | Yes (1) |
| Packet has TCP or UDP options | Yes | Yes |
| IP header's protocol field contains protocol # other than TCP or UDP | Yes | No |
1. Only offloaded flows
2. For the tunneled case, the driver might do only the TCP checksum or Ipv4 checksum. If TCP checksum is desired, the driver should
define the IP header length as the combined length of both IP headers in the packet. If an IPv4 checksum is required, the IP header
length should be set to the Ipv4 header length.
7.2.6
Multiple Transmit Queues
The number of transmit queues is increased to 16, to match the expected number of processors on
most server platforms and to support the new virtualization mode.
If there are more CPUs than queues, then one queue might be used to service more than one CPU.
For the transmission process, each thread might set a queue in the host memory of the CPU it is tied to.
7.2.6.1
Bandwidth Allocation to Virtual Machines / Transmit Queues
When operated in either VMDq2 or SR-IOV mode, the 82576 has the ability to control the Tx bandwidth
used by each Virtual Machine (VM). Since in these virtualization modes each Tx Queue is owned by a
separate VM (or a separate set of VMs), bandwidth allocation to VMs is performed by assigning
bandwidth shares to Tx Queues. A rate-controller is internally associated to a Tx Queue to maintain its
allocated bandwidth share.
The bandwidth share represents the minimum percentage of the link’s bandwidth that is guaranteed to
be granted to the VM. Bandwidth unused by a VM is re-distributed among others according to their
relative bandwidth shares.
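The redistribution rule can be illustrated with a small arithmetic model. This is not the hardware algorithm and does not use the VMBAC registers: the function name, the percentage units, and the iterative scheme are all assumptions chosen only to show how bandwidth left unused by one VM is split among the others in proportion to their shares.

```python
from typing import Dict


def allocate_bandwidth(shares: Dict[str, float],
                       demands: Dict[str, float]) -> Dict[str, float]:
    """Grant link bandwidth (in percent) by share, redistributing leftovers."""
    grant = {vm: 0.0 for vm in shares}
    active = {vm for vm in shares if demands[vm] > 0}
    remaining = 100.0
    while active and remaining > 1e-9:
        total_share = sum(shares[vm] for vm in active)
        satisfied = set()
        distributed = 0.0
        for vm in active:
            # Offer each VM a slice proportional to its relative share.
            offer = remaining * shares[vm] / total_share
            need = demands[vm] - grant[vm]
            take = min(offer, need)
            grant[vm] += take
            distributed += take
            if need <= offer + 1e-9:
                satisfied.add(vm)
        remaining -= distributed
        if not satisfied:
            break  # every active VM consumed its full offer
        active -= satisfied
    return grant
```

With shares of 50, 30, and 20 percent, a first VM that only needs 10 percent leaves 40 percent unused; the other two, if saturated, split that 40 percent in a 30:20 ratio and end up with 54 and 36 percent.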
If the PCIe bandwidth available for transmission falls below half of the link’s bandwidth, the bandwidth
allocation to VMs scheme degenerates into a scheme close to a packet-based round-robin arbitration
between the VMs.
If the link is operated at 10Mbps, bandwidth allocation to VMs must be disabled as in non-virtualized
contexts, and Tx Queues are served in a packet-based round-robin manner.
A VM can be operated in a “Bandwidth Takeover” mode, where it takes over for itself all bandwidth left
unused by others. When several VMs are operated in this mode, unused bandwidth left by others is
equally distributed among them, in a packet-based round-robin manner.
The bandwidth share scheme is configured by the following set of registers:
• VMBACS, to control the general operation of the bandwidth allocation to VMs feature.
• VMBAMMW, to set the maximum amount of Tx payload compensation a VM can accumulate in case
it temporarily does not use its allocated bandwidth.
• VMBASEL, to select the VM / Tx Queue for which a bandwidth share is configured via the VMBAC
register.
• VMBAC, to set the minimum rate allocated to a VM.
7.3
Interrupts
7.3.1
Mapping of Interrupt Causes
The 82576 supports the following interrupt modes:
• PCI legacy interrupts or MSI - selected when GPIE.Multiple_MSIX is 0b
• MSI-X in non-IOV mode - selected when GPIE.Multiple_MSIX is 1b and the VFE bit in PCIe SR-IOV
control register is cleared.
• MSI-X in IOV mode - selected when GPIE.Multiple_MSIX is 1b and the VFE bit in PCIe SR-IOV
control register is set.
Note:
If only one MSI-X vector is allocated by the operating system, then the driver might use the
non MSI-X mapping method even in MSI-X mode.
Mapping of interrupt causes is different in each of the above modes and is described below.
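The mode selection listed above reduces to two bits. The sketch below is a minimal illustration with hypothetical names; it takes the GPIE.Multiple_MSIX bit and the VFE bit of the PCIe SR-IOV control register as booleans and does not model register access.

```python
from enum import Enum


class InterruptMode(Enum):
    LEGACY_OR_MSI = "PCI legacy interrupts or MSI"
    MSIX_NON_IOV = "MSI-X in non-IOV mode"
    MSIX_IOV = "MSI-X in IOV mode"


def select_interrupt_mode(multiple_msix: bool, vfe: bool) -> InterruptMode:
    """Map GPIE.Multiple_MSIX and the SR-IOV VFE bit to an interrupt mode."""
    if not multiple_msix:
        # VFE does not affect the selection when Multiple_MSIX is 0b.
        return InterruptMode.LEGACY_OR_MSI
    return InterruptMode.MSIX_IOV if vfe else InterruptMode.MSIX_NON_IOV
```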
7.3.1.1
Legacy and MSI Interrupt Modes
In legacy and MSI modes, an interrupt cause is reflected by setting a bit in the EICR register. This
section describes the mapping of interrupt causes (a specific Rx queue event or an LSC event) to bits in
the EICR.
Mapping of queue-related causes is accomplished through the IVAR register. Each possible queue
interrupt cause (each Rx or Tx queue) is allocated an entry in the IVAR, and each entry in the IVAR
identifies one bit in the EICR register among the bits allocated to queue interrupt causes. It is possible
to map multiple interrupt causes into the same EICR bit.
In this mode, causes can be mapped to the first 16 bits of the EICR register.
Interrupt causes related to non-queue causes are mapped into the ICR legacy register; each cause is
allocated a separate bit. The sum of all causes is reflected in the Other bit in EICR. Figure 7-13 below
describes the allocation process.
The following configuration and parameters are involved:
• The IVAR[7:0] entries map 16 Tx queues and 16 Rx queues into EICR[15:0] bits
• The IVAR_MISC that maps non-queue causes is not used
• The EICR[30] bit is allocated to the TCP timer interrupt cause.
• The EICR[31] bit is allocated to the other interrupt causes summarized in the ICR reg.
• A single interrupt vector is provided.
Figure 7-13.
Cause Mapping in Legacy Mode
The Table below maps the different interrupt causes into the IVAR registers.
Table 7-42. Cause Allocation in the IVAR Registers - MSI and Legacy Mode
| Interrupt | Entry | Description |
| Rx_i | i*4 (i = 0..7) | Receive queue i: associates an interrupt occurring in Rx queue i with a corresponding bit in the EICR register. |
| Tx_i | i*4+1 (i = 0..7) | Transmit queue i: associates an interrupt occurring in Tx queue i with a corresponding bit in the EICR register. |
| Rx_i | (i-8)*4+2 (i = 8..15) | Receive queue i: associates an interrupt occurring in Rx queue i with a corresponding bit in the EICR register. |
| Tx_i | (i-8)*4+3 (i = 8..15) | Transmit queue i: associates an interrupt occurring in Tx queue i with a corresponding bit in the EICR register. |
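The entry formulas in the table above can be expressed as a single function. The names are hypothetical. The helper that splits an entry into a register index and a slot assumes that each of the eight IVAR registers holds four consecutive entries; that layout is inferred from the formulas and is not stated in this section.

```python
from typing import Tuple


def ivar_entry(queue: int, is_tx: bool) -> int:
    """Return the IVAR entry number for Rx or Tx queue 0..15."""
    if not 0 <= queue <= 15:
        raise ValueError("queue must be in the range 0..15")
    if queue < 8:
        return queue * 4 + (1 if is_tx else 0)
    return (queue - 8) * 4 + (3 if is_tx else 2)


def ivar_register_and_slot(entry: int) -> Tuple[int, int]:
    """Split an entry into (IVAR register index, slot); assumed layout."""
    if not 0 <= entry <= 31:
        raise ValueError("queue entries are in the range 0..31")
    return entry // 4, entry % 4
```

Under this assumed layout, queues i and i+8 share register i: Rx queue 3 uses slot 0, Tx queue 3 slot 1, Rx queue 11 slot 2, and Tx queue 11 slot 3 of the same register.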
7.3.1.2
MSI-X Mode — Non-IOV Mode
In a non Single Root - IOV setup (SR-IOV capability is not exposed in the PCIe configuration space), the
82576 can request up to 25 Vectors.
In MSI-X mode, an interrupt cause is mapped into an MSI-X vector. This section describes the mapping
of interrupt causes (a specific Rx queue event or other events) to MSI-X vectors.
Mapping is accomplished through the IVAR register. Each possible cause for an interrupt is allocated an
entry in the IVAR, and each entry in the IVAR identifies one MSI-X vector. It is possible to map multiple
interrupt causes into the same MSI-X vector.
The EICR also reflects interrupt vectors. The EICR bits allocated for queue causes reflect the MSI-X
vector (bit 2 is set when MSI-X vector 2 is used). Interrupt causes related to non-queue causes are
mapped into the ICR (as in the legacy case). The MSI-X vector for all such causes is reflected in the
EICR.
The following configuration and parameters are involved:
• The IVAR[7:0] entries map 16 Tx queues, 16 Rx queues, a TCP timer, and other events to up to 23
interrupt vectors
• The IVAR_MISC register maps a TCP timer and other events to 2 MSI-X vectors
Figure 7-14 describes the allocation process.
Figure 7-14.
Cause Mapping in MSI-X Mode
Table 7-43 below defines which interrupt cause is represented by each entry in the MSI-X Allocation
registers.
In non SR-IOV mode, the software has access to 34 mapping entries to map each cause to one of the
25 MSI-X vectors.
Table 7-43. Cause Allocation in the IVAR Registers - Non-IOV Mode
| Interrupt | Entry | Description |
| Rx_i | i*4 (i = 0..7) | Receive queue i: associates an interrupt occurring in Rx queue i with a corresponding entry in the MSI-X Allocation registers. |
| Tx_i | i*4+1 (i = 0..7) | Transmit queue i: associates an interrupt occurring in Tx queue i with a corresponding entry in the MSI-X Allocation registers. |
| Rx_i | (i-8)*4+2 (i = 8..15) | Receive queue i: associates an interrupt occurring in Rx queue i with a corresponding entry in the MSI-X Allocation registers. |
| Tx_i | (i-8)*4+3 (i = 8..15) | Transmit queue i: associates an interrupt occurring in Tx queue i with a corresponding entry in the MSI-X Allocation registers. |
| TCP timer | 32 | TCP Timer: associates an interrupt issued by the TCP timer with a corresponding entry in the MSI-X Allocation registers. |
| Other cause | 33 | Other causes: associates an interrupt issued by the "other causes" with a corresponding entry in the MSI-X Allocation registers. |
7.3.1.3
MSI-X Interrupts in SR-IOV Mode
Each of the VF functions in PCI-SIG SR-IOV mode is allocated 3 MSI-X vectors. The PF can request up to
10 vectors.
Interrupt allocation for the physical function (PF) is done as in the MSI-X non-IOV case. However, the PF
should not assign interrupt vectors to queues not assigned to it. The IVAR_MISC register allocates non-queue interrupts as in the non-IOV case with a single change: the entry assigned to “other” causes
also handles interrupts on the mailbox.
Although the PF is allocated up to 10 vectors, these vectors share the internal interrupts with the VFs.
See Section 7.3.3.1 for details of the sharing of the internal interrupts.
Each of the VFs in IOV mode is allocated separate IVAR registers (called VTIVAR), translating its queue-related interrupt causes into MSI-X vectors for this virtual function. The IVAR register has one entry per
Tx or Rx queue. A VTIVAR_MISC register is provided to map the mailbox interrupt into an MSI-X vector.
The PF can allocate interrupt causes not used by the VFs to one of its own vectors.
The EICR of each VF or of the PF reflects the status of the MSI-X vectors allocated to this function.
Figure 7-15.
Cause Mapping of a VF in MSI-X Mode (IOV)
Table 7-44 below, defines for a given VM (not PF) which interrupt cause is represented by each entry in
the MSI-X Allocation registers.
In IOV mode, the software has access to 5 mapping entries to map each cause to one of 3 MSI-X vectors.
The 3 VM vectors (per each VM) can be allocated to one or more causes (2 Q traffic interrupt, Mail Box
interrupt).
Table 7-44. Cause Allocation for a VF in the VTIVAR Registers - IOV Mode
| Interrupt | Entry | Description |
| Rx Queue i (i=0...1) | i*2 | Receive queue i: associates an interrupt occurring in Rx queue i with a corresponding entry in the MSI-X Allocation registers. |
| Tx Queue i (i=0...1) | i*2+1 | Transmit queue i: associates an interrupt occurring in Tx queue i with a corresponding entry in the MSI-X Allocation registers. |
7.3.2
Registers
The interrupt logic consists of the registers listed in the tables below, plus the registers associated with
MSI/MSI-X signaling. The first table describes the use of the registers in legacy mode and the second
describes their use with the extended interrupt functionality.
Table 7-45. Interrupt Registers - Legacy Mode
| Register | Acronym | Function |
| Interrupt Cause | ICR | Records interrupt conditions. |
| Interrupt Cause Set | ICS | Allows software to set bits in the ICR. |
| Interrupt Mask Set/Read | IMS | Sets or reads bits in the interrupt mask. |
| Interrupt Mask Clear | IMC | Clears bits in the interrupt mask. |
| Interrupt Acknowledge auto-mask | IAM | Under some conditions, the content of this register is copied to the mask register following read or write of ICR. |
Table 7-46. Interrupt Registers - Extended Mode
| Register | Acronym | Function |
| Extended Interrupt Cause | EICR | Records interrupt causes from receive and transmit queues. An interrupt is signaled when unmasked bits in this register are set. |
| Extended Interrupt Cause Set | EICS | Allows software to set bits in the Interrupt Cause register. |
| Extended Interrupt Mask Set/Read | EIMS | Sets or reads bits in the interrupt mask. |
| Extended Interrupt Mask Clear | EIMC | Clears bits in the interrupt mask. |
| Extended Interrupt Auto Clear | EIAC | Allows bits in the EICR to be cleared automatically following an MSI-X interrupt without a read or write of the EICR. |
| Extended Interrupt Acknowledge auto-mask | EIAM | This register is used to decide which masks are cleared in the extended mask register following read or write of EICR or which masks are set following a write to EICS. In MSI-X mode, this register also controls which bits in EIMC are cleared automatically following an MSI-X interrupt. |
| Interrupt Cause | ICR | Records interrupt conditions for special conditions; a single interrupt from all the conditions of ICR is reflected in the "other" field of the EICR. |
| Interrupt Cause Set | ICS | Allows software to set bits in the ICR. |
| Interrupt Mask Set/Read | IMS | Sets or reads bits in the other interrupt mask. |
| Interrupt Mask Clear | IMC | Clears bits in the Other interrupt mask. |
| Interrupt Acknowledge auto-mask | IAM | Under some conditions, the content of this register is copied to the mask register following read or write of ICR. |
| General Purpose Interrupt Enable | GPIE | Controls different behaviors of the interrupt mechanism. |
7.3.2.1 Interrupt Cause Register (ICR)
7.3.2.1.1 Legacy Mode
In legacy mode, ICR is the sole interrupt cause register. Upon reception of an interrupt, the
interrupt handling routine can read this register to find out the causes of the interrupt.
7.3.2.1.2 Advanced Mode
In advanced mode, this register captures the interrupt causes not directly captured by the EICR. These
are infrequent management interrupts and error conditions.
Note that when EICR is used in advanced mode, the Rx/Tx related bits in ICR should be masked.
ICR bits are cleared on register read. If GPIE.NSICR = 0b, the clear on read occurs only if no bit
is set in the IMS, or if at least one bit is set in the IMS and there is a true interrupt as reflected in
ICR.INTA.
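The clear-on-read condition can be written as a small predicate. This is an illustrative sketch only: the function name is hypothetical, and `inta` stands for the state of ICR.INTA, which is passed in rather than derived here.

```python
def icr_after_read(icr: int, ims: int, nsicr: bool, inta: bool) -> int:
    """Return the ICR content left behind after software reads the register.

    nsicr models GPIE.NSICR; inta models ICR.INTA (a true interrupt is asserted).
    """
    if nsicr:
        return 0      # non-selective mode: every read clears ICR
    if ims == 0:
        return 0      # no mask bit set: the read clears ICR
    if inta:
        return 0      # mask bits set and a true interrupt: the read clears ICR
    return icr        # otherwise the causes are preserved
```

With NSICR cleared, a read performed while mask bits are set but no interrupt is asserted leaves the recorded causes intact.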
7.3.2.2 Interrupt Cause Set Register (ICS)
This register allows software to set bits in the ICR by writing 1b to the corresponding bits in
ICS. It is typically used to rearm interrupts that software did not have time to handle in the current
interrupt routine.
7.3.2.3 Interrupt Mask Set/Read Register (IMS)
An interrupt is enabled if its corresponding mask bit in this register is set to 1b, and disabled if its
corresponding mask bit is set to 0b. A PCIe interrupt is generated whenever one of the bits in this
register is set, and the corresponding interrupt condition occurs. The occurrence of an interrupt
condition is reflected by having a bit set in the Interrupt Cause Register.
Reading this register returns which bits have an interrupt mask set.
A particular interrupt might be enabled by writing a 1b to the corresponding mask bit in this register.
Any bits written with a 0b are unchanged. Thus, if software desires to disable a particular interrupt
condition that had been previously enabled, it must write to the Interrupt Mask Clear Register (see
below), rather than writing a 0b to a bit in this register.
7.3.2.4 Interrupt Mask Clear Register (IMC)
Software blocks interrupts by clearing the corresponding mask bit. This is accomplished by writing a 1b
to the corresponding bit in this register. Bits written with 0b are unchanged (their mask status does not
change).
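The write-1-to-set (IMS) and write-1-to-clear (IMC) semantics described in the two sections above can be modeled as follows. This is a behavioral sketch of a 32-bit mask, not a register-access implementation; the class and method names are assumptions made for illustration.

```python
class InterruptMask:
    """Toy model of the interrupt mask as seen through IMS and IMC."""

    def __init__(self) -> None:
        self.mask = 0

    def write_ims(self, value: int) -> None:
        # Bits written as 1b are set; bits written as 0b are unchanged.
        self.mask |= value & 0xFFFFFFFF

    def write_imc(self, value: int) -> None:
        # Bits written as 1b are cleared; bits written as 0b are unchanged.
        self.mask &= ~value & 0xFFFFFFFF

    def read_ims(self) -> int:
        # Reading IMS returns which bits have an interrupt mask set.
        return self.mask
```

Because neither register needs a read-modify-write sequence, writing 0 to IMS never disables an interrupt; disabling always goes through IMC.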
7.3.2.5 Interrupt Acknowledge Auto-mask Register (IAM)
An ICR read or write has the side effect of writing the contents of this register to the mask register. If
GPIE.NSICR = 0b, the copy of this register to the mask register occurs only if at least one bit is set
in the mask register and there is a true interrupt as reflected in ICR.INTA.
7.3.2.6 Extended Interrupt Cause Registers (EICR)
7.3.2.6.1 MSI/INT-A Mode
This register records the interrupt causes, providing software with information on the interrupt
source.
The interrupt causes include:
1. The receive and transmit queues: each queue (either Tx or Rx) can be mapped to one of the 16
interrupt cause bits (RTxQ) available in this register, according to the mapping in the IVAR
registers.
2. Indication for the TCP timer interrupt.
3. Legacy and other indications: set when any interrupt in the Interrupt Cause register is active.
Writing 1b to a bit clears the corresponding bit in this register. Most systems have write-buffering that
minimizes overhead, but this might require a read operation to guarantee that the write has been
flushed from posted buffers. Reading this register auto-clears all bits.
7.3.2.6.2 MSI-X Mode
This register records the interrupt vectors currently emitted. In this mode only the first 25 bits are
valid.
For all the subsequent registers, in MSI-X mode, each bit controls the behavior of one vector.
Bits in this register can be configured to auto-clear when the MSI-X interrupt message is sent, in order
to minimize driver overhead when using MSI-X interrupt signaling.
7.3.2.7 Extended Interrupt Cause Set Register (EICS)
This register allows software to set bits in the EICR by writing 1b to the corresponding bits in
EICS. It is typically used to rearm interrupts that software did not have time to handle in the current
interrupt routine.
7.3.2.8 Extended Interrupt Mask Set and Read Register (EIMS) & Extended Interrupt Mask Clear Register (EIMC)
Interrupts appear on PCIe only if the interrupt cause bit is a one and the corresponding interrupt mask
bit is a one. Software blocks assertion of an interrupt by clearing the corresponding bit in the mask
register. The cause bit stores the interrupt event regardless of the state of the mask bit. Separate clear
(EIMC) and set (EIMS) registers make this register more “thread safe” by avoiding a read-modify-write
operation on the mask register. The mask bit is set for each bit written to a one in the set register
(EIMS) and cleared for each bit written to a one in the clear register (EIMC). Reading the set register
(EIMS) returns the current mask register value.
7.3.2.9 Extended Interrupt Auto Clear Enable Register (EIAC)
Each bit in this register enables clearing of the corresponding bit in EICR following interrupt generation.
When a bit is set, the corresponding bit in EICR is automatically cleared following an interrupt. This
feature should only be used in MSI-X mode.
When used in conjunction with MSI-X interrupt vectors, this feature allows interrupt cause recognition
and selective interrupt cause clearing without requiring software to read or write the EICR register; therefore,
the penalty related to a PCIe read or write transaction is avoided.
The process of resetting interrupt cause bits is described in Section 7.3.4.
7.3.2.10 Extended Interrupt Auto Mask Enable Register (EIAM)
Each bit set in this register enables clearing of the corresponding bit in the extended mask register
following read or write-to-clear to EICR. It also enables setting of the corresponding bit in the extended
mask register following a write-to-set to EICS.
This mode is provided in case MSI-X is not used, and therefore auto-clear through EIAC register is not
available.
In MSI-X mode, the driver software might set the bits of this register to select mask bits that must be
reset during interrupt processing. In this mode, each bit in this register enables clearing of the
corresponding bit in EIMC following interrupt generation.
7.3.2.11 GPIE
There are a few bits in the GPIE register that define the behavior of the interrupt mechanism. The
setting of these bits is different in each mode of operation. The following table describes the
recommended setting of these bits in the different modes:
Table 7-47. Settings for Different Interrupt Modes
Field | Bit(s) | Initial Value | Description | INT-x/MSI + Legacy | INT-x/MSI + Extend | MSI-X Multi vector | MSI-X Single vector
NSICR | 0 | 0b | Non Selective Interrupt clear on read: When set, every read of ICR clears it. When this bit is cleared, an ICR read causes it to be cleared only if an actual interrupt was asserted or IMS = 0b. | 0b (see note 1) | 1b | 1b | 1b
Multiple_MSIX | 4 | 0b | Multiple_MSIX - multiple vectors: 0b = non-MSIX, or MSI-X with 1 vector; IVAR maps Rx/Tx causes to 16 EICR bits, but MSIX[0] is asserted for all. 1b = MSIX mode; IVAR maps Rx/Tx causes to 25 EICR bits. | 0b | 0b | 1b | 0b
EIAME | 30 | 0b | EIAME: When set, upon firing of an MSI-X message, mask bits set in EIAM associated with this message are cleared. Otherwise, EIAM is used only upon read or write of EICR/EICS registers. | 0b | 0b | 1b | 1b
PBA_support | 31 | 0b | PBA support: When set, setting one of the extended interrupts masks via EIMS causes the PBA bit of the associated MSI-X vector to be cleared. Otherwise, the 82576 behaves in a way supporting legacy INT-x interrupts. Should be cleared when working in INT-x or MSI mode and set in MSI-X mode. | 0b | 0b | 1b | 1b
1. In systems where interrupt sharing is not expected, the NSICR bit can also be set by legacy drivers.
Because this register affects the way the hardware interprets writes to the other interrupt control registers, it
should be set to the right mode before any access to the other registers.
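The recommended settings of Table 7-47 can be collected into per-mode GPIE values. The sketch below uses only the bit positions given in the table (0, 4, 30, 31); the constant names and the dictionary keys are hypothetical labels introduced here, and all other GPIE fields are assumed to be zero.

```python
# Bit positions taken from Table 7-47; names are illustrative.
GPIE_NSICR = 1 << 0
GPIE_MULTIPLE_MSIX = 1 << 4
GPIE_EIAME = 1 << 30
GPIE_PBA_SUPPORT = 1 << 31

# Recommended GPIE bits for each interrupt mode listed in the table.
RECOMMENDED_GPIE = {
    "intx_msi_legacy": 0,
    "intx_msi_extended": GPIE_NSICR,
    "msix_multi_vector": GPIE_NSICR | GPIE_MULTIPLE_MSIX | GPIE_EIAME | GPIE_PBA_SUPPORT,
    "msix_single_vector": GPIE_NSICR | GPIE_EIAME | GPIE_PBA_SUPPORT,
}
```

A driver following the text above would write the value for its chosen mode before touching any other interrupt control register.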
7.3.3 MSI-X and Vectors
MSI-X defines a separate optional extension to basic MSI functionality. Compared to MSI, MSI-X
supports a larger maximum number of vectors per function, the ability for software to control aliasing
when fewer vectors are allocated than requested, plus the ability for each vector to use an independent
address and data value, specified by a table that resides in Memory Space. However, most of the other
characteristics of MSI-X are identical to those of MSI. For more information on MSI-X, refer to the PCI
Local Bus Specification, Revision 3.0.
MSI-X maps each of the Intel® 82576 GbE Controller interrupt causes into an interrupt vector that is
conveyed by the 82576 as a posted-write PCIe transaction. Mapping of an interrupt cause into an MSI-X
vector is determined by system software (a device driver) through a translation table stored in the
MSI-X Allocation registers. Each entry of the allocation registers defines the vector for a single interrupt
cause.
There are 34 extended interrupt causes in the 82576:
1. 32 traffic causes: 16 Tx, 16 Rx.
2. TCP timer.
3. Other causes: summarizes legacy interrupts into one extended cause.
The way the 82576 exposes causes to the software is determined by the IOV mode. See Section 7.3.1
for details.
7.3.3.1 Usage of Spare MSI-X Vectors by Physical Function
The total number of available MSI-X vectors is 34. The PF should not request vectors that may later be
allocated to VFs. For example, if the driver knows that at most 6 VFs will be enabled, it can request up
to 34 - 3*6 = 16 vectors.
In any case, the PF can request 10 vectors, even if all the VFs are allocated. However, the number of
internal interrupts is only 25. Thus, when VFs are enabled, the PF should release all the internal
interrupt resources allocated to the VFs.
The following table describes the PF vectors available according to the number of VFs enabled, assuming
the PF requests up to 10 vectors. The available vectors can be referenced in the IVAR registers and
indicate which EITR registers are available for the PF.
Table 7-48. Internal Vectors Available to the PF
Enabled VFs | Available vectors
0-5 | 0-9
6 | 0-7
7 | 0-3
8 | 0
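The vector budget described in this section can be sketched as two helpers: one that reproduces the request limit from the 34 - 3*N example, and one that looks up Table 7-48 verbatim. Both function names are hypothetical, and the lookup copies the table values rather than deriving them.

```python
def max_pf_msix_request(max_enabled_vfs: int) -> int:
    """Upper bound on MSI-X vectors the PF may request, leaving 3 per possible VF."""
    if not 0 <= max_enabled_vfs <= 8:
        raise ValueError("the 82576 supports 0 to 8 VFs")
    return 34 - 3 * max_enabled_vfs


def pf_available_vectors(enabled_vfs: int) -> range:
    """Internal vectors usable by the PF, copied from Table 7-48."""
    if not 0 <= enabled_vfs <= 8:
        raise ValueError("the 82576 supports 0 to 8 VFs")
    last_vector = 9 if enabled_vfs <= 5 else {6: 7, 7: 3, 8: 0}[enabled_vfs]
    return range(0, last_vector + 1)
```

With all 8 VFs enabled the request limit is still 10, which matches the statement that the PF can always request 10 vectors, while only internal vector 0 remains available to it.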
7.3.3.2 Interrupt Moderation
The 82576 implements interrupt moderation to reduce the number of interrupts software processes.
The moderation scheme is based on the EITR (Interrupt Throttle Register; see Section 8.8.12).
Whenever an interrupt event happens, the corresponding bit in the EICR is activated. However, an
interrupt message is not sent out on the PCIe interface until the EITR counter assigned to that EICR bit
has counted down to zero. As soon as the interrupt is issued, the EITR counter is reloaded with its initial
value and the process repeats again.
The flow follows the diagram below:
Figure 7-16. Interrupt Throttle Flow Diagram
For cases where the 82576 is connected to a small number of clients, it is desirable to fire off the
interrupt as soon as possible with minimum latency. For these cases, when the EITR counter counts
down to zero and no interrupt event has happened, then the EITR counter is not reset but stays at zero.
Thus, the next interrupt event triggers an interrupt immediately. That scenario is illustrated as “Case B”
below.
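The moderation behavior described above can be illustrated with a small simulation. This is a simplified model under stated assumptions: time is in arbitrary integer units, the EITR counter starts at zero, and an event arriving at the exact moment an interrupt fires is covered by that interrupt. The function name is hypothetical.

```python
def moderated_interrupt_times(event_times, interval):
    """Return the times at which interrupts are sent for the given cause events."""
    fired = []
    ready_at = 0        # time at which the EITR counter reaches (or stays at) zero
    last_fire = None    # time of the most recently scheduled interrupt
    for t in sorted(event_times):
        if last_fire is not None and t <= last_fire:
            continue    # event is coalesced into an interrupt already scheduled
        fire = max(t, ready_at)     # wait for the counter unless it is already zero
        fired.append(fire)
        ready_at = fire + interval  # counter is reloaded when the interrupt is issued
        last_fire = fire
    return fired
```

Under heavy load the events at 10, 20, and 30 share one interrupt at time 125, while the isolated event at 500 finds the counter at zero and interrupts immediately.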
Figure 7-17. Case A: Heavy Load, Interrupts Moderated
Figure 7-18. Light Load, Interrupts Immediately on Packet Receive
7.3.3.2.1 More on Using EITR
There is an EITR register for each MSI-X vector. See also Section 8.8.12.
EITR provides a guaranteed inter-interrupt delay between interrupts asserted by the 82576, regardless
of network traffic conditions. To independently validate configuration settings, software can use the
following equation to convert the inter-interrupt interval value to the common interrupts/sec
performance metric:
interrupts/sec = (1 × 10^-6 sec × interval)^-1
A counter counts in units of 1 × 10^-6 sec. After counting “interval” units, an interrupt is sent to
the software. The above equation gives the number of interrupts per second. The equation below gives the
inter-interrupt interval, that is, the time between consecutive interrupts in units of 1 × 10^-6 sec.
For example, if the interval is programmed to 125 (decimal), the 82576 guarantees the processor does
not receive an interrupt for 125 µs from the last interrupt. The maximum observable interrupt rate from
the 82576 should then never exceed 8000 interrupts/sec.
Inversely, the inter-interrupt interval value can be calculated as:
inter-interrupt interval = (1 × 10^-6 sec × interrupts/sec)^-1
The optimal performance setting for this register is very system and configuration specific. An initial
suggested range is 2 to 175 (0x02 to 0xAF).
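The two conversions above can be checked numerically. This is a minimal sketch assuming the interval is expressed in units of 1 µs as stated; the function names are hypothetical.

```python
def interrupts_per_sec(interval: int) -> float:
    """Maximum interrupt rate for an interval given in 1 µs units."""
    return 1_000_000 / interval


def interval_for_rate(rate: float) -> int:
    """Interval (in 1 µs units) that caps the interrupt rate at `rate` per second."""
    return round(1_000_000 / rate)
```

An interval of 125 caps the rate at 8000 interrupts/sec, and the ends of the suggested range, 2 and 175, correspond to 500000 and roughly 5714 interrupts/sec.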
Note: Setting EITR to a non-zero value can cause an interrupt cause Rx/Tx statistics miscount.
7.3.4 Clearing Interrupt Causes
The 82576 has three methods available to clear EICR bits: auto-clear, clear-on-write, and clear-on-read.
ICR bits can only be cleared with clear-on-write or clear-on-read.
7.3.4.1 Auto-Clear
In systems that support MSI-X, the interrupt vector allows the interrupt service routine to know the
interrupt cause without reading the EICR. With interrupt moderation active, software load from
spurious interrupts is minimized. In this case, the software overhead of an I/O read or write can be
avoided by setting the appropriate EICR bits to auto-clear mode, which is done by setting the corresponding
bits in the Extended Interrupt Auto-clear Enable Register (EIAC).
When auto-clear is enabled for an interrupt cause, the EICR bit is set when a cause event mapped to
this vector occurs. When the EITR Counter reaches zero, the MSI-X message is sent on PCIe. Then the
EICR bit is cleared and enabled to be set by a new cause event. The vector in the MSI-X message
signals software the cause of the interrupt to be serviced.
It is possible that, in the time between the EICR bit being cleared and the interrupt service routine servicing
the cause (for example, checking the transmit and receive queues), another cause event occurs that is
then serviced by this ISR call, yet the EICR bit remains set. This results in a “spurious interrupt”.
Software can detect this case, for example if there are no entries that require service in the transmit
and receive queues, and exit knowing that the interrupt has been automatically cleared. The use of
interrupt moderation through the EITR register limits the extra software overhead that can be caused
by these spurious interrupts.
7.3.4.2 Write to Clear
If the driver configures itself in MSI-X mode without using the “auto-clear”
feature, it can clear the EICR bits by writing to the EICR register. Any bits written with 1b are cleared.
Any bits written with 0b remain unchanged.
7.3.4.3 Read to Clear
The EICR and ICR registers are cleared on a read.
Note that the driver should never do a read-to-clear of the EICR when in MSI-X mode, since this might
clear interrupt cause events which are processed by a different interrupt handler (assuming multiple
vectors).
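The three clearing methods of Section 7.3.4 can be contrasted in a toy model of the EICR. This is a behavioral sketch only; the class, its method names, and the way an MSI-X message is represented (a bit mask of the vector sent) are assumptions made for illustration.

```python
class Eicr:
    """Toy model of EICR with auto-clear, write-to-clear, and read-to-clear."""

    def __init__(self, eiac: int = 0) -> None:
        self.value = 0
        self.eiac = eiac            # bits enabled for auto-clear

    def set_cause(self, bits: int) -> None:
        self.value |= bits          # hardware records a cause event

    def msix_sent(self, vector_bits: int) -> None:
        # Auto-clear: only bits enabled in EIAC are cleared when the message is sent.
        self.value &= ~(vector_bits & self.eiac)

    def write(self, bits: int) -> None:
        self.value &= ~bits         # write-to-clear: 1b clears, 0b leaves unchanged

    def read(self) -> int:
        value, self.value = self.value, 0   # read-to-clear: all bits are cleared
        return value
```

The model also shows why read-to-clear is discouraged with multiple MSI-X vectors: a single read wipes bits that belong to other handlers.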
7.3.5 Rate Controlled Low Latency Interrupts (LLI)
There are some types of network traffic for which latency is a critical issue. For these types of traffic,
interrupt moderation hurts performance by increasing latency between the time a packet is received by
hardware and the time it is handed to the host operating system. This traffic can be identified by the
5-tuple value, in conjunction with control bits and a specific size. In addition, packets with specific Ethernet
types, TCP flags, or a specific VLAN priority might generate an immediate interrupt.
Low latency interrupts share the filters used by the queueing mechanism described in Section 7.1.1.
Each of these filters, in addition to the queueing action, might also indicate that matching packets
generate an immediate interrupt.
If a received packet matches one of these filters, hardware should interrupt immediately, overriding the
interrupt moderation by the EITR counter.
Each time a Low Latency Interrupt is fired, the EITR interval is loaded and down-counting starts again.
The logic of the low latency interrupt mechanism is as follows:
• There are eight 5-tuple filters. The content of each filter is described in Section 7.1.1.5. The
immediate interrupt action of each filter can be enabled or disabled. If one of the filters detects a
matching packet, an immediate interrupt is issued.
• When VLAN priority filtering is enabled, VLAN packets must trigger an immediate interrupt when
the VLAN priority is equal to or above the VLAN priority threshold. This is regardless of the status of
the 5-tuple filters.
• The SYN packets filter defined in Section 7.1.1.6 and the Ethernet type filters defined in
Section 7.1.1.4 might also be used to indicate low latency interrupt conditions.
Note: Immediate interrupts are available only when using advanced receive descriptors and not
for legacy descriptors.
Packets that are dropped or have errors do not cause a Low Latency Interrupt.
7.3.5.1 Rate Control Mechanism
In a network with a lot of latency-sensitive traffic, Low Latency Interrupts can defeat the interrupt
throttling capability by flooding the host with more interrupts than it can handle.
To mitigate this, the Intel® 82576 GbE Controller supports a credit-based mechanism to
control the rate of the Low Latency Interrupts.
Rules:
• The default value of each counter is 0b (no moderation). This also preserves backward
compatibility.
• The counter increments at a configurable rate and saturates at the maximum value (31d).
  - The configurable rate granularity is 4 µs (250K interrupts/sec down to 250K/32 ~ 8K interrupts/sec).
• An LLI might be issued as long as the counter value is strictly positive (> zero).
  - The credit counter allows bursts of low latency interrupts, but the average interrupt rate is not
more than the configured rate.
• Each time a Low Latency Interrupt is fired, the credit counter decrements by one.
• Once the counter reaches zero, a low latency interrupt cannot be fired.
  - Hardware must wait for the next EITR expiration or for the next increment of this counter (if the EITR
expiration happens first, the counter does not decrement).
The following fields manage rate control of LLI:
• The LL Interval field in the GPIE register controls the rate of credits.
• The 5-bit LL Counter field in the EITR register contains the credits.
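The credit rules listed above can be sketched as a small counter model. This is an illustration of the moderated case only: the class and method names are hypothetical, `tick()` stands for one credit increment at the configured LL Interval, and the "no moderation" default is not modeled.

```python
class LliCredits:
    """Toy model of the 5-bit LL Counter credit mechanism."""

    MAX_CREDITS = 31    # the counter saturates at 31

    def __init__(self, credits: int = 0) -> None:
        self.credits = credits

    def tick(self) -> None:
        # One credit is added per configured interval, up to the maximum.
        self.credits = min(self.credits + 1, self.MAX_CREDITS)

    def try_fire(self) -> bool:
        # An LLI is allowed only while the counter is strictly positive.
        if self.credits > 0:
            self.credits -= 1
            return True
        return False
```

Accumulated credits permit a burst of up to 31 back-to-back LLIs, after which further LLIs are held off until a new credit arrives or the EITR timer expires.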
7.3.6 TCP Timer Interrupt
7.3.6.1 Introduction
In order to implement TCP timers for IOAT, software needs to take action periodically (every 10
milliseconds). Today, the driver must rely on software-based timers, whose granularity can change from
platform to platform. This software timer generates a software NIC interrupt, which then allows the
driver to perform timer functions as part of its usual DPC, avoiding cache thrash and enabling
parallelization. The timer interval is system-specific.
It would be more accurate and more efficient for this periodic timer to be implemented in hardware.
The driver would program a timeout value (usual value of 10 ms), and each time the timer expires,
hardware sets a specific bit in the EICR. When an interrupt occurs (due to normal interrupt moderation
schemes), software reads the EICR and discovers that it needs to process timer events during that DPC.
The timeout should be programmable by the driver, and the driver should be able to disable the timer
interrupt if it is not needed.
7.3.6.2 Description
A stand-alone down-counter is implemented. An interrupt is issued each time the value of the counter
reaches zero.
Software is responsible for setting the initial value of the timer in the Duration field. Kick-starting is
done by writing 1b to the KickStart bit.
Following kick-starting, an internal counter is set to the value defined by the Duration field. The
counter is then decremented by one each millisecond. When the counter reaches zero, an interrupt is issued
(see the EICR register, Section 8.8.1). The counter restarts counting from its initial value if the Loop field is
set.
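The down-counter behavior can be sketched as a millisecond-step simulation. This is an illustrative model, not a register description: the function name and parameters are hypothetical, with `duration_ms` standing for the Duration field and `loop` for the Loop field, and the timer is assumed to be kick-started at time zero.

```python
def tcp_timer_interrupts(duration_ms: int, loop: bool, run_ms: int) -> list:
    """Return the millisecond ticks at which the timer interrupt is issued."""
    fired = []
    counter = duration_ms           # loaded from Duration on kick-start
    running = duration_ms > 0
    for now in range(1, run_ms + 1):
        if not running:
            break
        counter -= 1                # decremented once per millisecond
        if counter == 0:
            fired.append(now)       # interrupt issued when the counter reaches zero
            if loop:
                counter = duration_ms   # restart from the initial value
            else:
                running = False
    return fired
```

With the typical 10 ms duration and Loop set, the model issues an interrupt every 10 ms; with Loop cleared it issues a single interrupt.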
7.4 802.1q VLAN Support
The 82576 provides several specific mechanisms to support 802.1q VLANs:
• Optional adding (for transmits) and stripping (for receives) of IEEE 802.1q VLAN tags.
• Optional ability to filter packets belonging to certain 802.1q VLANs.
7.4.1 802.1q VLAN Packet Format
The following table compares an untagged 802.3 Ethernet packet with an 802.1q VLAN tagged packet.
Table 7-49. Comparing Packets
802.3 Packet | #Octets | 802.1q VLAN Packet | #Octets
DA | 6 | DA | 6
SA | 6 | SA | 6
Type/Length | 2 | 802.1q Tag | 4
Data | 46-1500 | Type/Length | 2
CRC | 4 | Data | 46-1500
 |  | CRC* | 4
Note: The CRC for the 802.1q tagged frame is re-computed, so that it covers the entire tagged
frame including the 802.1q tag header. Also, max frame size for an 802.1q VLAN packet is
1522 octets as opposed to 1518 octets for a normal 802.3z Ethernet packet.
7.4.2 802.1q Tagged Frames
For 802.1q, the Tag Header field consists of four octets comprising the Tag Protocol Identifier (TPID)
and Tag Control Information (TCI), each taking 2 octets. The first 16 bits of the tag header make up
the TPID. It contains the “protocol type” which identifies the packet as a valid 802.1q tagged packet.
The two octets making up the TCI contain three fields:
• User Priority (UP)
• Canonical Form Indicator (CFI). Should be 0b for transmits. For receives, the device has the
capability to filter out packets that have this bit set. See the CFIEN and CFI bits in the RCTL
described in Section 8.10.1.
• VLAN Identifier (VID)
Bit ordering is shown below.
Table 7-50. TCI Bit Ordering
Octet 1 | Octet 2
UP | VID
7.4.3 Transmitting and Receiving 802.1q Packets
7.4.3.1 Adding 802.1q Tags on Transmits
Software might command the 82576 to insert an 802.1q VLAN tag on a per packet or per flow basis. If
CTRL.VME is set to 1b, and the VLE bit in the transmit descriptor is set to 1b, then the 82576 inserts a
VLAN tag into the packet that it transmits over the wire. The Tag Protocol Identifier (TPID) field of the
802.1q tag comes from the VET register. 802.1q tag insertion is done in different ways for legacy and
advanced Tx descriptors:
• Legacy Transmit Descriptors: The Tag Control Information (TCI) of the 802.1q tag comes from the
VLAN field (see Figure 7-9) of the descriptor. Refer to Table 7-26 for more information regarding
hardware insertion of tags for transmits.
• Advanced Transmit Descriptor: The Tag Control Information (TCI) of the 802.1q tag comes from the
VLAN Tag field (see Table 7.2.2.2.1) of the advanced context descriptor. The IDX field of the
advanced Tx descriptor should be set to the appropriate context.
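The effect of tag insertion on the frame bytes can be illustrated in software. The sketch below is a model of the resulting packet layout, not of the hardware data path. It assumes the standard IEEE 802.1Q TCI layout (3-bit UP, 1-bit CFI, 12-bit VID) and the helper names are hypothetical; the TPID is passed in, as it would come from the VET register.

```python
import struct


def make_tci(up: int, cfi: int, vid: int) -> int:
    """Pack User Priority, CFI, and VLAN ID into a 16-bit TCI."""
    if not (0 <= up <= 7 and cfi in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    return (up << 13) | (cfi << 12) | vid


def insert_vlan_tag(frame: bytes, tpid: int, tci: int) -> bytes:
    """Insert the 4-octet 802.1q tag after DA (6 octets) and SA (6 octets)."""
    tag = struct.pack("!HH", tpid, tci)     # network byte order: TPID, then TCI
    return frame[:12] + tag + frame[12:]
```

Inserting a tag with TPID 0x8100, priority 5, and VID 100 into an untagged frame lengthens it by four octets and places the bytes 81 00 A0 64 immediately after the source address. The CRC, which hardware recomputes, is not modeled here.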
7.4.3.2 Stripping 802.1q Tags on Receives
Software might instruct the 82576 to strip 802.1q VLAN tags from received packets. If the CTRL.VME
bit is set to 1b, and the incoming packet is an 802.1q VLAN packet (its Ethernet Type field matches the
VET), then the 82576 strips the 4-byte VLAN tag from the packet and stores the TCI in the VLAN Tag
field (see Figure 7-5 and Section 7.1.10.2) of the receive descriptor.
The 82576 also sets the VP bit in the receive descriptor to indicate that the packet had a VLAN tag that
was stripped. If the CTRL.VME bit is not set, 802.1q packets can still be received if they pass the
receive filter, but the VLAN tag is not stripped and the VP bit is not set. Refer to Table 7-19 for more
information regarding receive packet filtering.
7.4.4 802.1q VLAN Packet Filtering
VLAN filtering is enabled by setting the RCTL.VFE bit to 1b. If enabled, hardware compares the type
field of the incoming packet to a 16-bit field in the VLAN Ether Type (VET) register. If the VLAN type
field in the incoming packet matches the VET register, the packet is then compared against the VLAN
Filter Table Array for acceptance.
The 82576 provides exact VLAN filtering for VLAN tags for host traffic and VLAN tags for manageability
traffic.
The Virtual LAN ID field indexes a 4096-bit vector. If the indexed bit in the vector is one, there is a
Virtual LAN match. Software might set the entire bit vector to ones if the node does not implement
802.1q filtering. The VLAN Filter Table Array registers are described in detail in
Section 8.10.19.
In summary, the 4096-bit vector is comprised of 128, 32-bit registers. The VLAN Identifier (VID) field
consists of 12 bits. The upper 7 bits of this field are decoded to determine the 32-bit register in the
VLAN Filter Table Array to address and the lower 5 bits determine which of the 32 bits in the register to
evaluate for matching.
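The index decoding described above can be written out directly. This is a minimal sketch in which the filter table is represented as a plain list of 128 integers; the function names are hypothetical.

```python
def vfta_location(vid: int) -> tuple:
    """Split a 12-bit VID into (register index, bit index) for the VLAN Filter Table Array."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID is a 12-bit field")
    return vid >> 5, vid & 0x1F     # upper 7 bits select the register, lower 5 the bit


def vlan_match(vfta: list, vid: int) -> bool:
    """Return True if the bit indexed by vid is set in the 128 x 32-bit table."""
    register, bit = vfta_location(vid)
    return bool((vfta[register] >> bit) & 1)
```

For example, VID 100 selects bit 4 of register 3, since 100 = 3 * 32 + 4.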
The MC configures the 82576 with eight different manageability VIDs via the Management VLAN TAG
Value [7:0] - MAVTV[7:0] registers and enables each filter in the MFVAL register.
Two other bits in the Receive Control register (see Section 8.10.1), CFIEN and CFI, are also used in
conjunction with 802.1q VLAN filtering operations. CFIEN enables the comparison of the value of the
CFI bit in the 802.1q packet to the Receive Control register CFI bit as acceptance criteria for the packet.
Note: The VFE bit does not affect whether the VLAN tag is stripped. It only affects whether the
VLAN packet passes the receive filter.
The following table lists reception actions per control bit settings.
Figure 7-19. Packet Reception Decision Table
Is packet 802.1q? | CTRL.VME | RCTL.VFE | Action
No | X | X | Normal packet reception.
Yes | 0b | 0b | Receive a VLAN packet if it passes the standard MAC address filters (only). Leave the packet as received in the data buffer. VP bit in receive descriptor is cleared.
Yes | 0b | 1b | Receive a VLAN packet if it passes the standard filters and the VLAN filter table. Leave the packet as received in the data buffer (the VLAN tag would not be stripped). VP bit in receive descriptor is cleared.
Yes | 1b | 0b | Receive a VLAN packet if it passes the standard filters (only). Strip off the VLAN information (four bytes) from the incoming packet and store in the descriptor. Sets VP bit in receive descriptor.
Yes | 1b | 1b | Receive a VLAN packet if it passes the standard filters and the VLAN filter table. Strip off the VLAN information (four bytes) from the incoming packet and store in the descriptor. Sets VP bit in receive descriptor.
Note: A packet is defined as a VLAN/802.1q packet if its type field matches the VET.
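The decision table can be condensed into a short function. This is a sketch of the table's logic only: the function name and its boolean inputs are hypothetical, the filter results are passed in rather than computed, and "normal packet reception" for untagged packets is modeled simply as passing the standard filters.

```python
def vlan_rx_decision(is_8021q: bool, vme: bool, vfe: bool,
                     passes_standard_filters: bool, in_vlan_filter_table: bool):
    """Return (accepted, tag_stripped, vp_bit) for a received packet."""
    if not is_8021q:
        # CTRL.VME and RCTL.VFE are "don't care" for untagged packets.
        return passes_standard_filters, False, False
    # RCTL.VFE adds the VLAN filter table as an acceptance criterion.
    accepted = passes_standard_filters and (in_vlan_filter_table or not vfe)
    # CTRL.VME controls stripping; VP is set exactly when the tag was stripped.
    stripped = accepted and vme
    return accepted, stripped, stripped
```

The function makes the independence of the two bits explicit: VFE only affects acceptance, and VME only affects stripping and the VP bit.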
7.4.5 Double VLAN Support
The 82576 supports a mode where all received and sent packets have at least one VLAN tag in addition
to the regular tagging, which might optionally be added. This mode is used for systems where the
switches add an additional tag containing switching information.
This mode is activated by setting the CTRL_EXT.EXTENDED_VLAN bit. The default of this bit is set
according to bit 1 in word 24h/14h of the EEPROM for ports 0 and 1, respectively.
The type of the VLAN tag used for the additional VLAN is defined in the VET.VET_EXT field.
7.4.5.1 Transmit Behavior
It is expected that the driver includes the external VLAN header as part of the transmit data structure.
The software may post the internal VLAN header as part of the transmit data structure or embedded in
the transmit descriptor (see Section 7.2.2 for details). The 82576 does not relate to the external VLAN
header other than the capability of “skipping” it for parsing of inner fields.
Note: The VLAN header in a packet that carries a single VLAN header is treated as the external
VLAN.
The 82576 expects that any transmitted packet has at least the external VLAN added by the
software. For those packets where an external VLAN is not present, any offload that relates
to fields inner to the EtherType may not be provided.
7.4.5.2 Receive Behavior
When a port of the 82576 is working in this mode, the 82576 assumes that all packets received by this
port have at least one VLAN, including packets received or sent on the manageability interface.
One exception to this rule is flow control PAUSE packets, which are not expected to have any VLAN.
Other packets may contain no VLAN; however, a received packet that does not contain the first VLAN is
forwarded to the host, but filtering and offloads are not applied to this packet.
See the table below for the supported receive processing when the device is set to “Double VLAN”
mode.
Stripping of VLAN is done on the second VLAN if it exists. All the filtering functions of the 82576 ignore
the first VLAN in this mode.
The presence of a first VLAN tag is indicated in the RDESC.STATUS.VEXT bit.
Queue assignment of the Rx packets is not affected by the external VLAN header. It may depend on the
internal VLAN, MAC address, or any upper layer content as described in Section 7.1.1.
Table 7-51. Receive Processing in Double VLAN Mode
VLAN Headers | Status.VEXT | Status.VP | Packet Parsing | Rx offload functions
External and internal | 1 | 1 | + | +
Internal only | Not supported
V-Ext | 1 | 0 | + | +
None (see note 1) | 0 | 0 | + (flow control only) | -
1. A few examples of packets that may not carry any VLAN header: flow control, LACP, LLDP, GMRP, and 802.1x packets.
7.5
Configurable LED Outputs
The 82576 implements 4 output drivers intended for driving external LED circuits per port. Each LAN
device provides an independent set of LED outputs - these pins and their function are bound to a
specific LAN device. Each of the four LED outputs can be individually configured to select the particular
event, state, or activity, which is indicated on that output. In addition, each LED can be individually
configured for output polarity as well as for blinking versus non-blinking (steady-state) indication.
The configuration for LED outputs is specified via the LEDCTL register. Furthermore, the hardware-default
configuration for all the LED outputs can be specified via EEPROM fields, thereby supporting
LED displays configurable to a particular OEM preference.
Each of the 4 LEDs might be configured to use one of a variety of sources for output indication. The
MODE bits control the LED source as described in Table 7-52.
The IVRT bits allow the LED source to be inverted before being output or observed by the blink-control
logic. LED outputs are assumed to normally be connected to the negative side (cathode) of an external
LED.
The BLINK bits control whether the LED should be blinked (on for 200ms, then off for 200ms) while the
LED source is asserted. The blink control might be especially useful for ensuring that certain events,
such as ACTIVITY indication, cause LED transitions that are sufficiently visible to the human eye.
Note:
When LED Blink mode is enabled the appropriate LED Invert bit should be set to 0b.
The LINK/ACTIVITY source functions slightly differently from the others when BLINK is
enabled. The LED is off if there is no LINK, on if there is LINK and no ACTIVITY, and blinking
if there is LINK and ACTIVITY.
The dynamic LED modes (FILTER_ACTIVITY, LINK/ACTIVITY, COLLISION, ACTIVITY, PAUSED) should be
used with LED Blink mode enabled.
7.5.1
MODE Encoding for LED Outputs
Table 7-52 lists the MODE encoding used to select the desired LED signal source for each LED output.
Table 7-52.
Mode Encoding for LED Outputs
Mode  | Selected Mode   | Source Indication
0000b | LINK_10/1000    | Asserted when either 10 or 1000 Mb/s link is established and maintained.
0001b | LINK_100/1000   | Asserted when either 100 or 1000 Mb/s link is established and maintained.
0010b | LINK_UP         | Asserted when any speed link is established and maintained.
0011b | FILTER_ACTIVITY | Asserted when link is established and packets are being transmitted or received that passed MAC filtering.
0100b | LINK/ACTIVITY   | Asserted when link is established and when there is no transmit or receive activity.
0101b | LINK_10         | Asserted when a 10 Mb/s link is established and maintained.
0110b | LINK_100        | Asserted when a 100 Mb/s link is established and maintained.
0111b | LINK_1000       | Asserted when a 1000 Mb/s link is established and maintained.
1000b | SDP_MODE        | LED activation is a reflection of the SDP signal. SDP0, SDP1, SDP2, SDP3 are reflected to LED0, LED1, LED2, LED3 respectively.
1001b | FULL_DUPLEX     | Asserted when the link is configured for full duplex operation (de-asserted in half-duplex).
1010b | COLLISION       | Asserted when a collision is observed.
1011b | ACTIVITY        | Asserted when link is established and packets are being transmitted or received.
1100b | BUS_SIZE        | Asserted when the 82576 detects a 1-lane PCIe connection.
1101b | PAUSED          | Asserted when the 82576’s transmitter is flow controlled.
1110b | LED_ON          | Always high (Asserted)
1111b | LED_OFF         | Always low (De-asserted)
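The MODE encodings and the LINK/ACTIVITY blink behavior described in this section can be sketched in software as follows. This is an illustrative model only: the names LedMode, DYNAMIC_MODES, and link_activity_blink_state are hypothetical, and the sketch does not describe the bit layout of the LEDCTL register.

```python
from enum import IntEnum


class LedMode(IntEnum):
    """4-bit MODE encodings, copied from Table 7-52."""
    LINK_10_1000 = 0b0000
    LINK_100_1000 = 0b0001
    LINK_UP = 0b0010
    FILTER_ACTIVITY = 0b0011
    LINK_ACTIVITY = 0b0100
    LINK_10 = 0b0101
    LINK_100 = 0b0110
    LINK_1000 = 0b0111
    SDP_MODE = 0b1000
    FULL_DUPLEX = 0b1001
    COLLISION = 0b1010
    ACTIVITY = 0b1011
    BUS_SIZE = 0b1100
    PAUSED = 0b1101
    LED_ON = 0b1110
    LED_OFF = 0b1111


# Dynamic modes that the text says should be used with LED Blink mode enabled.
DYNAMIC_MODES = frozenset({
    LedMode.FILTER_ACTIVITY,
    LedMode.LINK_ACTIVITY,
    LedMode.COLLISION,
    LedMode.ACTIVITY,
    LedMode.PAUSED,
})


def link_activity_blink_state(link: bool, activity: bool) -> str:
    """LED state for the LINK/ACTIVITY source when BLINK is enabled."""
    if not link:
        return "off"
    return "blinking" if activity else "on"
```

For example, link_activity_blink_state(True, True) returns "blinking", matching the special case called out in the note above.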
7.6
Memory Error Correction and Detection
The 82576 main internal memories are protected by error correcting code or parity code. The larger
memories are protected by an error correcting code (ECC) that can detect two errors and correct one
error. The smaller memories are protected either by an error correcting code (ECC) that corrects one
error or by a parity bit that can detect one error.
Correctable errors are silently corrected and are counted in the RPBECCSTS.Corr_err_cnt,
TPBECCSTS.Corr_err_cnt, SWPBECCSTS.Corr_err_cnt, IPPBECCSTS.Corr_err_cnt,
RDHESTS.Corr_err_cnt, TDHESTS.Corr_err_cnt, PRBESTS.Corr_err_cnt, PWBESTS.Corr_err_cnt or
PMSIXESTS.Corr_err_cnt fields according to the memory in which the error was found.
Some of the uncorrectable errors are counted in the RPBECCSTS.Uncorr_err_cnt,
TPBECCSTS.Uncorr_err_cnt, SWPBECCSTS.Uncorr_err_cnt, IPPBECCSTS.Uncorr_err_cnt,
RDHESTS.Uncorr_err_cnt or TDHESTS.Uncorr_err_cnt fields according to the memory in which the
error was found. The 82576 reacts to uncorrectable error detection according to the location in which
the error was found:
• If the error was detected in receive packet data in the main Rx packet buffer, the packet is sent to
the host with the RXE bit set in the receive descriptor. This packet should be discarded by the host.
This is considered a non-fatal error.
• If the error was detected in transmit packet data in the main Tx packet buffer, the packet is sent
to the network with a wrong FCS so that the link partner can discard it. This is also considered a
non-fatal error.
• If the error was detected in the descriptors attached to receive or transmit packets in the descriptor
handler cache memory, or a parity error was detected in one of the internal control memories, the
consistency of the receive/transmit flow can no longer be guaranteed. In this case the traffic is
stopped, an interrupt is raised, and the memory in which the error was detected is indicated in
the PEIND register. The flow stop can be released only by software reset (CTRL.RST). This is
considered a fatal error.
• If an error is detected in the IPsec Rx SA table, the traffic is stopped, an interrupt is raised, and
the memory in which the error was detected is indicated in the PEIND register. The flow stop can be
released only by software reset (CTRL.RST). This is considered a fatal error.
• The interrupt causes used to indicate an error are ICR[23:22], according to the severity of the error.
Note:
Once an interrupt indicating a memory error was asserted, the PEIND register must be read
before a new interrupt can be asserted.
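The reactions to uncorrectable errors listed above can be summarized as a small lookup, shown here as a sketch. The names ErrorLocation, Reaction, and reaction_to_uncorrectable_error are hypothetical and exist only for illustration; the behavior strings paraphrase the list above and nothing here reads real registers.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ErrorLocation(Enum):
    """Where an uncorrectable error was found (hypothetical names)."""
    RX_PACKET_BUFFER = auto()
    TX_PACKET_BUFFER = auto()
    DESCRIPTOR_CACHE = auto()
    CONTROL_MEMORY_PARITY = auto()
    IPSEC_RX_SA_TABLE = auto()


@dataclass(frozen=True)
class Reaction:
    fatal: bool
    action: str
    recovery: Optional[str]  # None when no recovery step is needed


# Non-fatal cases: only the affected packet is marked bad.
_NON_FATAL = {
    ErrorLocation.RX_PACKET_BUFFER:
        "packet sent to host with RXE set in the receive descriptor",
    ErrorLocation.TX_PACKET_BUFFER:
        "packet sent to the network with a wrong FCS",
}


def reaction_to_uncorrectable_error(location: ErrorLocation) -> Reaction:
    """Map an error location to the reaction described in the text."""
    if location in _NON_FATAL:
        return Reaction(False, _NON_FATAL[location], None)
    # Descriptor cache, control memory parity, and IPsec Rx SA table errors
    # are all fatal and share the same reaction.
    return Reaction(
        True,
        "traffic stopped, interrupt raised, memory indicated in PEIND",
        "software reset (CTRL.RST)",
    )
```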
The reaction mechanism of the 82576 to uncorrectable errors is enabled for each of the memories
using the PEINDM register. Parity error detection is enabled using the PEINDM.Parity_en field.
ECC error correction is enabled for each memory using the ECC Enable field
in the RPBECCSTS, TPBECCSTS, SWPBECCSTS, IPPBECCSTS, RDHESTS, TDHESTS, PRBESTS,
PWBESTS or PMSIXESTS registers.
7.7
DCA
7.7.1
Description
Direct Cache Access (DCA) is a method to improve network I/O performance by placing some posted
inbound writes directly within CPU cache. Through research and experiments, DCA has been shown to
reduce CPU Cache miss rates significantly.
DCA provides a mechanism where the posted write data from an I/O device, such as an Ethernet NIC,
can be placed into CPU cache with a hardware pre-fetch. This mechanism is initialized upon a power
good reset. A device driver for the I/O device configures the I/O device for DCA and sets up the
appropriate CPU ID and bus ID for the device to send data. The device then encapsulates that
information in the TAG field of the PCIe TLP headers to trigger a hardware pre-fetch by the MCH/IOH to
the CPU cache.
DCA implementation is controlled by separate registers (RXCTL and TXCTL) for each receive and
transmit queue. In addition, a DCA Enable bit can be found in the DCA_CTRL register, and a DCA_ID
register can be found for each port, in order to make the function, device, and bus numbers visible to
the driver.
The RXCTL and TXCTL registers can be written by software on the fly and can be changed at any time.
When software changes the register contents, hardware applies the changes only after all the previous
packets in progress for DCA have been completed.
However, in order to implement DCA, the 82576 has to be aware of the Crystal Beach version used. The
software driver must initialize the 82576 to make it aware of the Crystal Beach version. A new register
named DCA_CTRL is used in order to properly define the system configuration.
There are 2 modes for DCA implementation:
1. Legacy DCA: The DCA target ID is derived from CPU ID (similar to Goshen)
2. DCA 1.0: The DCA target ID is derived from APIC ID.
The software driver selects one of these modes through the DCA_mode register.
The details of both modes are described below.
7.7.2
Details of Implementation
7.7.2.1
PCIe Message Format for DCA
Figure 7-20 shows the format of the PCIe message for DCA.
Figure 7-20.
PCIe Message Format for DCA
The DCA preferences field has the following formats.
Table 7-53.
Legacy DCA Systems
Bits | Name           | Description
0    | DCA indication | 0b: DCA disabled; 1b: DCA enabled
4:1  | DCA Target ID  | The DCA Target ID specifies the target cache for the data.
7:5  | Reserved       | Reserved
Table 7-54.
DCA 1.0 Systems
Bits | Name          | Description
7:0  | DCA target ID | 0000.0000b: DCA is disabled. Other: Target Core Id derived from APIC Id. The method for this is described in DCA Platform Architecture Specification, section 7.3.1 (Anacapa reference number 16802)
Note:
All functions within the 82576 have to adhere to the “tag encoding” rules for DCA writes.
Even if a given function is not capable of DCA, but other functions are capable of DCA,
memory writes from the non-DCA function must set the Tag field to “00000000”.
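The two DCA preferences formats in Tables 7-53 and 7-54 can be illustrated with a short encoding sketch. The function names are hypothetical, and returning an all-zero tag when legacy DCA is disabled is a choice made here to match the note above (non-DCA writes use a Tag field of 00000000); it is not a statement about how the hardware builds the field.

```python
def legacy_dca_tag(enabled: bool, target_id: int) -> int:
    """Legacy format: bit 0 = DCA indication, bits 4:1 = DCA Target ID,
    bits 7:5 reserved (left at zero)."""
    if not 0 <= target_id <= 0xF:
        raise ValueError("legacy DCA Target ID is a 4-bit value")
    if not enabled:
        return 0x00  # assumption: disabled writes carry an all-zero tag
    return (target_id << 1) | 0x1


def decode_legacy_dca_tag(tag: int) -> tuple:
    """Return (enabled, target_id) from a legacy-format tag byte."""
    return bool(tag & 0x1), (tag >> 1) & 0xF


def dca10_tag(target_id: int) -> int:
    """DCA 1.0 format: bits 7:0 = target ID, where 0 means DCA is disabled."""
    if not 0 <= target_id <= 0xFF:
        raise ValueError("DCA 1.0 target ID is an 8-bit value")
    return target_id


def dca10_enabled(tag: int) -> bool:
    """In DCA 1.0 systems only the all-zero value disables DCA."""
    return (tag & 0xFF) != 0
```

For instance, a legacy target ID of 1010b with DCA enabled encodes as 0x15, while the same system with DCA disabled produces 0x00.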
7.8
Transmit Rate Limiting (TRL)
A rate-scheduler enforces its rate limitation on a packet-by-packet basis by computing the next time
the entity it controls can be served, spacing the packets from each other according to the limited rate
to be achieved. The output of a rate-scheduler is whether the entity can currently be served or not. This
can be viewed as if an oscillating “on/off switch” controlled by the rate-scheduler were appended at the
exit of each entity it controls.
Rate control is defined in terms of maximum payload rate, and not in terms of maximum packet rate. This
means that whenever a rate controlled packet is sent, the next time a new packet can be sent out of
the same rate controlled queue is relative to the size of the last packet sent. The minimum
spacing in time between two starts of packets sent from the same rate controlled queue is recalculated
in hardware on every packet, using the following formula:
MIFS = PL x RF
Where:
• PL (Packet Length) is the Layer 2 length (without preamble and IPG) in bytes of the previous packet
sent out of that rate controller. It is an integer ranging from 64 to 9K (at least 14 bits).
• RF (Rate Factor) = 1Gb/s / Target-Rate is the ratio between the nominal link rate and the target
maximum rate to achieve for that rate controlled queue. It is a decimal number ranging from 1 to
1,000 (1 Mb/s minimum target rate), with at least 10 bits before the binary point and 14 bits after,
as required for the maximum PL by which it is multiplied.
• MIFS (Minimum Inter Frame Space) is the minimum delay, in byte units, between the start of
two Ethernet frames issued from the same rate controlled queue. It is an integer ranging from 76 to
9,216,012 (at least 24 bits). Despite the 8-byte resolution provided at the internal data path,
byte-level resolution is required here to maintain acceptable rate resolution (at the 1% level) for
small packets and high rates.
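As a worked illustration of MIFS = PL x RF, the sketch below stores RF as a fixed-point value with 14 fractional bits, as the text requires, and evaluates only the product as written in the formula. The function names and the round-to-nearest choice are assumptions made for this example, not a description of the hardware arithmetic.

```python
RF_FRACTION_BITS = 14      # bits after the binary point, per the text
LINK_RATE_MBPS = 1000      # nominal 1 Gb/s link


def rate_factor_fixed(target_rate_mbps: int,
                      link_rate_mbps: int = LINK_RATE_MBPS) -> int:
    """RF = link rate / target rate, as a fixed-point integer.

    Rounded to nearest (an assumption of this sketch).
    """
    if not 1 <= target_rate_mbps <= link_rate_mbps:
        raise ValueError("target rate must be between 1 Mb/s and the link rate")
    scaled = link_rate_mbps << RF_FRACTION_BITS
    return (scaled + target_rate_mbps // 2) // target_rate_mbps


def mifs_bytes(packet_length: int, rf_fixed: int) -> int:
    """MIFS = PL x RF, in bytes, truncating the fractional part."""
    return (packet_length * rf_fixed) >> RF_FRACTION_BITS
```

With a 100 Mb/s target on a 1 Gb/s link, RF is 10, so a 1500-byte packet yields mifs_bytes(1500, rate_factor_fixed(100)) equal to 15000 byte times before the next packet from that queue can start.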
Note:
It might be that a pipeline implementation causes the MIFS calculated on a transmitted
packet to be enforced only on the subsequent transmitted packet.
Note:
Rate-Factor is defined here relative to a link speed of 1Gb/s. However, for validation
purposes only, rate-schedulers may be operated over a link running at 100Mb/s. In this case,
the Rate-Factor must be configured relative to the actual link speed, replacing 1Gb/s by 100Mb/s
in its defining formula above.
TimeStamps - A Rate-Scheduling Table contains the accumulated MIFS intervals for each rate
controlled descriptor queue separately, stored as an absolute TimeStamp (TS) relative to an
internal free running timer. The TS value points to the time in the future at which the next data read
request can be sent for that queue, that is, the time at which the TRL switch is switched on again.
Each time a TimeStamp is updated:
TimeStamp(new) = TimeStamp(old) + MIFS
When a descriptor queue starts to be rate controlled, the first MIFS interval value is equal to 0 (TS
equal to the current timer value), without taking into account the last packet sent prior to rate control.
When the stored TS value becomes equal to or smaller than the current free running timer value, it
means that the switch is “on” and that the queue starts accumulating compensation times from the
past (referred to as a negative TS). When the stored TS value is strictly greater than the current free
running timer value, it means that the switch is “off” (referred to as a positive TS).
(CurrentTime) < TimeStamp <--> switch is “off”
(CurrentTime) >= TimeStamp <--> switch is “on”
MMW - The ability to accumulate negative compensation times saturates at a Max Memory Window
(MMW) time backward. MMW size is configured for each traffic class via the MMW_SIZE field of the
TRLMMW register, and is expressed in 1KB units of payload, ranging from 0 up to 2K units (at least 11
bits). The MMW_SIZE configured in KB units of payload has to be converted to a time interval, MMW_TIME,
expressed in KB, before a new timestamp is checked for saturation. It is computed for each queue
according to its associated Rate-Factor (RF), using the following formula:
MMW_TIME = MMW_SIZE x RF
Note:
MMW_TIME is rounded by default to a 1KB precision level, and it must be at least 31-bits
long. Hence, the timestamp byte-level values stored must be at least 32-bits long for
handling properly the wrap around case, and 29-bits are required for the internal free
running timer clocked once every 8-bytes.
When updating a TimeStamp, the following condition is checked first:
TimeStamp(old) + MIFS >= (CurrentTime) - MMW_TIME
If the condition holds, the TimeStamp is updated according to the non-saturated formula:
TimeStamp(new) = TimeStamp(old) + MIFS
Otherwise, saturation is enforced by assigning:
TimeStamp(new) = (CurrentTime) - MMW_TIME + MIFS
A non-null Max Memory Window introduces some flexibility in the way controlled rates are enforced. It is
required to avoid overall throughput losses and unfairness caused by rate controlled packets being
over-delayed as a consequence of packets inserted in between. Between two rate-limited packets spaced by at
least the MIFS interval, non-rate-limited packets, or rate-limited packets from other rate controlled
queues, can be inserted. When a rate controlled packet has been delayed by more time than
required for rate control, the next MIFS accumulates from the last time the queue was “switched
on” by the Rate-Scheduling Table, and not from the current time. Refer to Figure 7-21 for a visualization of
the effect of MMW.
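The TimeStamp update rule, including MMW saturation, can be sketched as follows. This is a simplified model under stated assumptions: all values are in byte times, Python's unbounded integers stand in for the hardware counters (so timer wrap-around is not modeled), and the function names are hypothetical.

```python
def switch_is_on(current_time: int, timestamp: int) -> bool:
    """The TRL switch is "on" once the timer has reached the TimeStamp."""
    return current_time >= timestamp


def update_timestamp(ts_old: int, mifs: int,
                     current_time: int, mmw_time: int) -> int:
    """Return TimeStamp(new) for a queue after a packet is sent.

    If TimeStamp(old) + MIFS has not fallen further behind the timer than
    MMW_TIME, the plain accumulation is used; otherwise the result is
    saturated to (CurrentTime - MMW_TIME + MIFS).
    """
    candidate = ts_old + mifs
    oldest_allowed = current_time - mmw_time
    if candidate >= oldest_allowed:
        return candidate
    return oldest_allowed + mifs
```

With mmw_time set to 0, a queue that was delayed cannot accumulate any credit from the past: update_timestamp(1000, 100, 5000, 0) gives 5100, that is, the current time plus MIFS.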
Caution:
MMW_SIZE set to zero must be supported as well.
Figure 7-21.
Minimum Inter Frame Spacing for Rate Controlled Frames (Shown in Orange)
7.9
Next Generation Security
7.9.1
MACSec
MACSec (or MACsec, 802.1AE) is a MAC level encryption/authentication scheme defined in IEEE
802.1AE that uses symmetric cryptography. 802.1AE defines AES-GCM with a 128-bit key as a mandatory
cipher suite, which can be processed by the LAN controller. A MACSec-ready switch is required in
order to complete the ecosystem and make use of MACSec functionality.
The MACSec implementation supports the following:
• GCM AES 128-bit offload engines in the Tx and Rx data paths that support GbE wire speed.
• Both Host and MC traffic can be processed by the GCM AES engines.
• Support a single CA (secure Connectivity Association)
— Single SC (Secure Connection) on transmit data path.
— Single SC on receive data path.
— Each SC supports 2 SA (Security Association) for seamless re-keying.
• Both the MC and the host can act as the Key agreement entity (KaY in 802.1AE spec terminology), that is,
control and access the offloading engine (SecY in 802.1AE spec terminology).
— Arbitration semaphores indicate whether the MC or the host acts as the KaY.
— Tamper resistance - When the MC acts as KaY it can disable accesses from the host to the SecY’s
address space. When the host acts as the KaY no protection is provided.
• Provide statistic counters as listed in Chapter 8.0, Programming Interface.
• Support replay protection with replay window equal to zero.
• Receive memory structure
— New MACSec off load receive status indication in the receive descriptors. MACSec offload must
not be used with the “legacy receive” format but rather use the “extended Receive descriptor”
format.
— MACSec Header/tag can be posted to the KaY for debug.
• Support VLAN header location according to IEEE 802.1AE (first header inner to the MACSec tag).
Note:
The 82576 does not support the End Station mode of operation (ES bit in the TCI field of the SecTag header is
set) in transmit or in receive. The ES bit is never set in transmit packets and is
incorrectly handled if received. Throughout this document, references to the MC can be
replaced by the ME if the latter is the KaY in addition to the host. The ME and MC cannot act
as a KaY together and no switching mechanism between them is possible.
7.9.1.1
Packet Format
MACSec defines frame encapsulation format as shown below.
Table 7-55.
Legacy Frame Format
MAC DA, SA | VLAN (optional) | Legacy Type/Len | LLC Data (might include IP/TCP and higher level payload) | CRC
User Data: the fields between MAC DA, SA and CRC.
Table 7-56.
MACSec Encapsulation
MAC DA, SA | MACSec header (SecTag) | User Data (optional encrypted) | MACSec ICV (tag) | CRC
Note:
An 802.3 packet with SNAP encapsulation will be decrypted or authenticated by the MACSec
engine only if the SNAP header is part of the MACSec user data.
7.9.1.2
MACSec Header (SecTag) Format
Table 7-57.
Sectag Format
MACSec Ethertype | TCI and AN | SL     | PN      | SCI (optional)
2 bytes          | 1 byte     | 1 byte | 4 bytes | 8 bytes
7.9.1.2.1
MACSec Ethertype
The MACsec Ethertype comprises octet 1 and octet 2 of the SecTAG. It is included to allow:
a. Coexistence of MACsec capable systems in the same environment as other systems
b. Incremental deployment of MACsec capable systems
c. Peer SecY’s to communicate using the same media as other communicating entities
d. Concurrent operation of Key Agreement protocols that are independent of the MACsec protocol
and the Current Cipher Suite
e. Operation of other protocols and entities that make use of the service provided by the SecY’s
Uncontrolled Port to communicate independently of the Key Agreement state
Table 7-58.
MACSec Ethertype
Tag Type             | Name             | Value
802.1AE Security TAG | MACSec EtherType | 88-E5
7.9.1.2.2
TCI and AN
Table 7-59.
TCI and AN Description
Bit(s) | Description
7      | Version number (V). The LAN controller supports only version 0. Packets with other version values are discarded by the controller.
6      | End Station (ES). When set, the sender is an end station, so the SCI is redundant and the SC bit must be clear. Currently should always be 0b.
5      | Secure Channel (SC). Equals 1b when the SCI field is active. If the ES bit is set, SC must be cleared. Currently should always be 1b.
4      | Single Copy Broadcast (SCB). Cleared to 0b unless the SC supports EPON. Should always be 0b.
3      | Encryption (E). Set to 1b when the user data is encrypted (see the note below).
2      | Changed Text (C). Set to 1b if the data portion is modified by the integrity algorithm, for example, if a non-default integrity algorithm is used or if the packet is encrypted (see the note below).
1:0    | Association Number (AN). 2-bit value defined by the control channel to uniquely identify the SA (keys, etc.).
Note:
The combination of the E bit equal to 1b and the C bit equal to 0b is reserved for KaY packets. The
MACSec logic ignores these packets on the receive path and transfers them to the KaY as is (no
MACSec processing and no MACSec header strip). The 82576 never issues a packet in which
the E bit is clear and the C bit is set, although it can tolerate such packets on receive. See
Section 7.9.1.4 for details of the handling of received packets with the C bit set.
7.9.1.2.3
Short Length
Table 7-60.
Short Length (SL) Field Description
Bit(s) | Description
7:6    | Reserved 0b.
5:0    | Short Length (SL). Number of octets in the secure data field, from the end of the SecTag to the beginning of the ICV, if it is less than 48 octets; otherwise the SL value is 0b.
7.9.1.2.4
Packet Number (PN)
The MACSec engine increments the PN for each packet on the transmit side. The PN is used to generate the
initial value (IV) for the crypto engines. When the KaY is establishing a new SA it should set the initial
value of PN to one. See more details on PN exhaustion in Section 7.9.1.5.1.
7.9.1.2.5
Secure Channel Identifier (SCI)
The SCI is composed of the MAC address and port number as shown in the table below. If the SC bit in
TCI is not set the SCI is not encoded in the SecTag.
Table 7-61.
SCI Field Description
Byte 0 to Byte 5   | Byte 6 to Byte 7
Source MAC Address | Port Number
7.9.1.2.6
Initial Value (IV) Calculation
The IV is the Initial Value used by the Tx and Rx authentication engines. The IV is generated from the
PN and SCI as described in the 802.1AE spec.
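The SecTag layout from Section 7.9.1.2 (Tables 7-57 through 7-61) can be sketched as a small software parser. This is an illustration only: the SecTag class and parse_sectag function are hypothetical names, multi-byte fields are assumed to be in network byte order, and the sketch performs no IV calculation or cryptographic processing.

```python
import struct
from dataclasses import dataclass
from typing import Optional

MACSEC_ETHERTYPE = 0x88E5  # Table 7-58


@dataclass(frozen=True)
class SecTag:
    version: int             # TCI bit 7 (V)
    es: bool                 # TCI bit 6 (End Station)
    sc: bool                 # TCI bit 5 (SCI present)
    scb: bool                # TCI bit 4 (Single Copy Broadcast)
    e: bool                  # TCI bit 3 (Encryption)
    c: bool                  # TCI bit 2 (Changed Text)
    an: int                  # bits 1:0 (Association Number)
    sl: int                  # Short Length, bits 5:0 of the SL byte
    pn: int                  # 4-byte Packet Number
    sci_mac: Optional[bytes] = None   # SCI bytes 0-5
    sci_port: Optional[int] = None    # SCI bytes 6-7


def parse_sectag(data: bytes) -> SecTag:
    """Decode a SecTag that starts at the MACSec Ethertype."""
    if len(data) < 8:
        raise ValueError("SecTag needs at least 8 bytes")
    ethertype, tci_an, sl_byte, pn = struct.unpack_from("!HBBI", data, 0)
    if ethertype != MACSEC_ETHERTYPE:
        raise ValueError("not a MACSec Ethertype")
    sc = bool(tci_an & 0x20)
    sci_mac = sci_port = None
    if sc:  # the SCI is encoded only when the SC bit is set
        if len(data) < 16:
            raise ValueError("SC bit set but SCI is truncated")
        sci_mac = bytes(data[8:14])
        (sci_port,) = struct.unpack_from("!H", data, 14)
    return SecTag(
        version=(tci_an >> 7) & 1,
        es=bool(tci_an & 0x40),
        sc=sc,
        scb=bool(tci_an & 0x10),
        e=bool(tci_an & 0x08),
        c=bool(tci_an & 0x04),
        an=tci_an & 0x03,
        sl=sl_byte & 0x3F,
        pn=pn,
        sci_mac=sci_mac,
        sci_port=sci_port,
    )
```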
7.9.1.3
MACSec Management – KaY (Key Agreement Entity)
The KaY management is done by the host or the BMC. See Chapter 10.0 for details on the transfer of
ownership between these two entities.
The ownership of the MACSec management is as follows:
1. Initialization at power up or after wake on LAN
• In most cases the MC wakes before the host, thus:
— If the MC is capable of being a KaY, it establishes an SC (authentication and key exchange).
— If the MC is not capable of being a KaY, the only way for it to communicate is through a VLAN. This
means that the switch must support settings that allow a specific VLAN to bypass MACSec.
• When the host is awake
— If the MC acted as KaY, the host should authenticate itself and transfer its ability to authenticate to
the MC in order for the MC to transfer ownership of the MACSec hardware. At this stage the system
works in proxy mode, where the host manages the secured channel while the MC piggybacks on
it.
— If the MC was not the KaY, the host takes ownership of the MACSec hardware and establishes an
SC (authentication and key exchange). The MC remains on a separate VLAN and all host traffic
should have a VLAN tag.
2. Host at SX state - MC active
— If the MC is not KaY capable, the SC should be reset by a link reset or by sending a Logoff packet
(1af), and the MC can return to the VLAN solution (or remain in it).
— If the MC is KaY capable, the host should notify the MC that it is retiring KaY ownership and the MC should
retake it. Alternatively, the MC should identify cases where communication is broken due to
lack of KaY maintenance by the host and retake ownership.
3. Host and MC at SX
— The active KaY should reset the secured channel by a link reset or by sending a Logoff packet (1af)
in order to enable WoL packets in the clear.
7.9.1.4
Receive Flow
The 82576 might receive packets that contain MACSec encapsulation as well as packets that do not
include MACSec encapsulation concurrently. This section describes the incoming packet classification.
Note:
This flow assumes the Rx mode is set to strict.
• Examine the user data for a SecTAG.
— If there is no SecTag, process the packet with the SECP bit cleared in the descriptor
• Validate frames with a SecTAG
— The MPDU comprises at least 17 octets
— Octets 1 and 2 compose the MACsec Ethertype (0x88E5)
— The V bit in the TCI is clear
— If the ES or the SCB bit in the TCI is set, then the SC bit is cleared
— Bits 7 and 8 of octet 4 of the SecTAG are clear (SL <= 0x3F)
— If the C and SC bits in the TCI are clear, the MPDU comprises 24 octets plus the number of
octets indicated by the SL field if that is non-zero and at least 72 octets otherwise
— If the C bit is clear and the SC bit set, then the MPDU comprises 32 octets plus the number of
octets indicated by the SL field if that is non-zero and at least 80 octets otherwise
— If the C bit is set and the SC bit clear, then the MPDU comprises 8 octets plus the minimum
length of the ICV as determined by the Cipher Suite in use at the receiving SecY, plus the
number of octets indicated by the SL field if that is non-zero and at least 48 additional octets
otherwise
— If the C and SC bits are both set, the frame comprises at least 16 octets plus the minimum
length of the ICV as determined by the Cipher Suite in use at the receiving SecY, plus the
number of octets indicated by the SL field if that is non-zero and at least 48 additional octets
otherwise
• Extract and decode the SecTAG as specified in Section 7.9.1.2.
• Extract the User Data and ICV as specified in Section 7.9.1.1.
• Assign the frame to an SA
— If valid SCI use it to identify the SC
— Select SA according to AN value
— If no valid SC or no valid SA found drop packet
— If SCI is omitted use default SC
— Select SA according to AN value
— If no valid SC (or more than one SC active) or no valid SA found drop packet
• Perform a preliminary replay check against the last validated PN
• Provide the validation function with:
— The SA Key (SAK)
— The SCI for the SC used by the SecY to transmit
— The PN
— The SecTAG
— The sequence of octets that compose the Secure Data
— The ICV
• Receive the following parameters from the Cipher Suite validation operation
— A Valid indication, if the integrity check was valid and the User Data could be recovered
— The sequence of octets that compose the User Data
• Update the replay check
• Issue an indication to the Controlled Port with the DA, SA, and priority of the frame as received
from the Receive De-multiplexer, and the User Data provided by the validation operation
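The "Validate frames with a SecTAG" checks in the flow above can be sketched as a single predicate. Several points are assumptions of this sketch rather than statements about the hardware: the function name mpdu_is_valid is hypothetical, the MPDU is taken to start at the MACSec Ethertype, the default ICV length of 16 octets corresponds to the GCM-AES-128 cipher suite, and a non-zero SL is treated as an exact length when the C bit is clear and as a minimum length when the C bit is set.

```python
def mpdu_is_valid(mpdu: bytes, icv_len: int = 16) -> bool:
    """Apply the SecTAG validation rules to an MPDU (SecTAG + data + ICV)."""
    if len(mpdu) < 17:                      # at least 17 octets
        return False
    if mpdu[0:2] != b"\x88\xe5":            # MACsec Ethertype
        return False
    tci = mpdu[2]
    v = bool(tci & 0x80)
    es = bool(tci & 0x40)
    sc = bool(tci & 0x20)
    scb = bool(tci & 0x10)
    c = bool(tci & 0x04)
    if v:                                   # V bit must be clear
        return False
    if (es or scb) and sc:                  # ES or SCB set requires SC clear
        return False
    if mpdu[3] & 0xC0:                      # two reserved SL bits must be clear
        return False
    sl = mpdu[3] & 0x3F
    header = 16 if sc else 8                # SecTAG with or without the SCI
    icv = icv_len if c else 16              # C clear: fixed 24/32-octet overhead
    base = header + icv
    if sl == 0:
        return len(mpdu) >= base + 48       # 48 or more octets of secure data
    if c:
        return len(mpdu) >= base + sl       # ICV length is only a minimum
    return len(mpdu) == base + sl
```

For example, an MPDU with the SC and C bits set, SL equal to 0, a 16-octet SecTAG, 48 octets of data, and a 16-octet ICV is 80 octets long and passes; removing one data octet makes it fail.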
Note:
All the references to clauses are to the IEEE P802.1AE/D5.1 document from January 19,
2006.
7.9.1.4.1
MACSec Receive Modes
There are 4 modes of operation for MACSec Rx, selected by the LSECRXCTRL.LSRXEN field:
1. Bypass (LSRXEN = 00) - in this mode, MACSec is not off-loaded. There is no authentication or
decrypting of the incoming traffic. The MACSec header and trailer are not removed and these
packets are forwarded to the host or the MC according to the regular L2 MAC filtering. The packet is
considered as untagged (no VLAN filtering). No further offloads are done on MACSec packets.
2. Check (LSRXEN = 01) - in this mode, incoming packets with matching key are decrypted and
authenticated according to the MACSec tag. The MACSec header and trailer might be removed from
these packets and the packets are forwarded to the host or the MC according to the regular L2
filtering. Additional offloads are possible on MACSec packets assuming the packet was decrypted.
The header is not removed from KaY packets. In this mode the hardware applies a less strict policy than
strict mode when deciding whether to forward or drop packets. Since this mode is mainly intended for debug
purposes or to overcome first-generation standard inconsistencies, most packets are still
forwarded to higher layers with a suitable error code. The only case where packets are dropped is when
the C bit is set and the packet failed authentication. In cases where the hardware failed to locate a key but still
forwards the packet, the SecTag is not removed if bit 6 of LSECRXCTRL is set, while the ICV is not
included in the packet.
3. Strict (LSRXEN = 10) - in this mode, incoming packets with matching key are decrypted and
authenticated according to the MACSec tag. The MACSec header and trailer might be removed from
these packets and the packets are forwarded to the host only if the decrypting or authentication
was successful. Additional offloads are possible on MACSec packets. The header is not removed
from KaY packets.
Note:
Setting RCTL.SBP (Store bad packets) might override this mode, as all packets are
forwarded to the host - regardless of the MACSec offload success
4. Disable (LSRXEN = 11) - in this mode, MACSec is not offloaded and MACSec packets are dropped.
There is no authentication or decrypting of the incoming traffic.
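The forwarding policy of the four receive modes can be summarized with a short decision function, given here as a sketch for a packet that carries MACSec encapsulation. The names LsRxEn and forward_macsec_packet are hypothetical. The sketch leaves out the special handling of KaY packets and of header/trailer stripping, and it treats RCTL.SBP as always overriding strict mode, although the text only says it might.

```python
from enum import IntEnum


class LsRxEn(IntEnum):
    """LSECRXCTRL.LSRXEN encodings as listed above."""
    BYPASS = 0b00
    CHECK = 0b01
    STRICT = 0b10
    DISABLE = 0b11


def forward_macsec_packet(mode: LsRxEn, auth_ok: bool, c_bit: bool,
                          sbp: bool = False) -> bool:
    """Return True if a MACSec packet is forwarded, False if it is dropped."""
    if mode is LsRxEn.BYPASS:
        return True                 # no offload, regular L2 filtering applies
    if mode is LsRxEn.DISABLE:
        return False                # MACSec packets are dropped
    if mode is LsRxEn.CHECK:
        # Dropped only when the C bit is set and authentication failed.
        return not (c_bit and not auth_ok)
    # STRICT: forwarded only on success, unless store-bad-packets is set.
    return auth_ok or sbp
```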
7.9.1.4.2
Receive SA Exhausting – Re-Keying
The seamless re-keying mechanism is explained in the following example.
The KaY establishes an SC and sets SA0 as the active SA by writing the key in the MACSec RX Key register, writing
the AN in LSECRXSA[0], and setting the SA Valid bit in the same register; this clears the Frame Received
bit. When the first packet arrives on SA0, the Frame Received bit is set automatically. Only at this time
can (and should) the KaY initiate SA1 in the same manner as for SA0. When a frame of SA1 arrives, SA0 retires
and can be used for the next SA.
7.9.1.4.3
Receive SA Context and Identification
Upon arrival of a secured frame the context of the SecTag is verified. This context of the SecTag is
described in Section 7.9.1.2. In order to process the secured frame it should be associated with one of
the SA keys. The identification is done by comparing the SCI data with MACSec RX SC registers to
ensure that the frame belongs to the SC. The incoming frame AN field is compared to the AN field of the
Link RX SA register of the SC in order to select an SA. The PN field of the selected SA (register MACSec RX SA PN)
is compared to the incoming PN, which should be equal to or greater than the MACSec RX SA PN
value; otherwise the frame is dropped. On a match, the selected SA key is used for the secured frame
processing.
7.9.1.4.4
Receive Statistic Counters
A detailed list and description of the MACSec RX statistics counters can be found in Section 8.0,
Programming Interface.
7.9.1.5
Transmit Flow
The 82576 might transmit packets that contain MACSec encapsulation as well as packets that do not
include MACSec encapsulation concurrently. This section describes the transmit packet classification,
transmit descriptors and statistic counters.
Note:
Since flow control (PAUSE) packets are part of the MAC service they should not go through
the MACSec logic.
1. Assign the frame to an SA by adding the AN according to SA Select bit in LSECTXSA register.
2. Assign the nextPN variable for that SA to be used as the value of the PN in the SecTAG based on the
value in the appropriate (according to SA) LSECTXPN register.
3. Encode the octets of the SecTAG according to the setting in LSECTXCTRL register.
4. Provide the protection function of the Current Cipher Suite with:
a. The SA Key (SAK).
b. The SCI for the SC used by the SecY to transmit.
c. The PN.
d. The SecTAG.
e. The sequence of octets that compose the User Data.
5. Receive the following parameters from the Cipher Suite protection operation:
a. The sequence of octets that compose the Secure Data.
b. The ICV.
6. Issue a request to the Transmit Multiplexer with the destination and source MAC addresses, and
priority of the frame as received from the Controlled Port, and an MPDU comprising the octets of
the SecTAG, Secure Data, and the ICV concatenated in that order.
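As a rough illustration of step 6, the MPDU handed to the Transmit Multiplexer is the SecTAG, the Secure Data, and the ICV concatenated in that order. The helper below is a hypothetical software sketch of that concatenation only; it does not encode the SecTAG or run the cipher suite, and the function name and buffer interface are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Concatenates SecTAG, Secure Data and ICV, in that order, into out.
 * Returns the MPDU length, or 0 if out is too small to hold it.
 */
static size_t build_mpdu(uint8_t *out, size_t out_len,
                         const uint8_t *sectag, size_t sectag_len,
                         const uint8_t *secure, size_t secure_len,
                         const uint8_t *icv, size_t icv_len)
{
    size_t total = sectag_len + secure_len + icv_len;

    if (total > out_len)
        return 0;

    memcpy(out, sectag, sectag_len);
    memcpy(out + sectag_len, secure, secure_len);
    memcpy(out + sectag_len + secure_len, icv, icv_len);
    return total;
}
```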
7.9.1.5.1
Transmit SA Exhausting – Re-keying
The 82576 supports a single SC on the transmit data path with a seamless re-keying mechanism. The SC
might act with one of two optional SAs. The SA is selected statically by the Active SA field in the
LSECTXSA register. Once the KaY entity (either software or firmware, as defined by the MACSec
Ownership field in the FWSM register) changes the setting of the SA Select field in the LSECTXSA
register, the Active SA field takes the same value on a packet boundary. The next packet that is
processed by the transmit MACSec engine uses the updated SA.
The KaY should switch between the two SAs before the Packet Number (PN) is exhausted. To
protect against such an event, the hardware generates a “MACSec Packet Number” interrupt to the KaY when
the PN reaches the exhaustion threshold defined in the LSECTXCTRL register. The exhaustion
threshold should be set to a level that lets the KaY switch between SAs before the PN can
be exhausted. If the KaY is slower than it should be, the PN might increment beyond the planned value.
The hardware guarantees that the PN never repeats itself, even if the KaY is “slow”. Once the PN
reaches a value of 0xFF…FFEF, the hardware clears the Enable Tx MACSec field in the LSECTXCTRL
register to 00b. By clearing the Enable Tx MACSec field, the hardware disables MACSec off-load before the
PN can wrap around and repeat itself.
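The PN exhaustion behavior can be modeled in software as follows. This is a minimal sketch, assuming a 32-bit PN and assuming the threshold and limit are checked after each increment; the structure, the function names, and the `limit` parameter (which stands in for the hardware's fixed disable value) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the transmit PN handling. */
struct tx_pn_model {
    uint32_t next_pn;     /* PN to place in the next SecTAG */
    uint32_t threshold;   /* exhaustion threshold (models the LSECTXCTRL setting) */
    uint32_t limit;       /* PN value at which off-load is disabled */
    bool     tx_enabled;  /* models a non-zero Enable Tx MACSec field */
    bool     irq_raised;  /* models the "MACSec Packet Number" interrupt */
};

/* Takes one PN for a frame. Returns false once off-load has been disabled. */
static bool tx_pn_consume(struct tx_pn_model *m, uint32_t *pn_out)
{
    if (!m->tx_enabled)
        return false;

    *pn_out = m->next_pn++;

    if (m->next_pn >= m->threshold)
        m->irq_raised = true;      /* ask the KaY to switch SAs */
    if (m->next_pn >= m->limit)
        m->tx_enabled = false;     /* stop before the PN can wrap and repeat */
    return true;
}

/* Models the KaY switching to the other SA, which starts with a fresh PN. */
static void tx_rekey(struct tx_pn_model *m, uint32_t first_pn)
{
    m->next_pn = first_pn;
    m->irq_raised = false;
    m->tx_enabled = true;
}
```

In this model, a threshold set well below the limit gives the KaY a window of `limit - threshold` frames in which to complete the switch.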
Note:
Potential race conditions are possible as follows. The LAN controller might fetch a transmit
packet (indicated as TxPacketN) from host memory (host or MC packet) while the KaY
changes the setting of the Tx SA Index.
TxPacketN might use the new Tx SA Index if the index was updated before
TxPacketN propagated to the transmit MACSec engine. This race is not critical since the
receiving node should be able to process the previous SA as well as the new SA in the re-keying transition period.
7.9.1.5.2
Transmit SA Context
Upon transmission of a secured frame the SA associated data is inserted into the SecTag field of the
frame. The SecTag data is composed from the MACSec Tx registers. The SCI value is taken from
MACSec TX SCI Low and High registers unless instructed to omit SCI. The AN value is taken from the
active MACSec TX SA and the PN from the appropriate MACSec TX SA PN.
7.9.1.5.3
Transmit Statistic Counters
A detailed list and description of the MACSec TX statistics counters can be found in Section 8.0,
Programming Interface.
7.9.1.6
Manageability Engine/Host Relations
The LAN controller supports a single CA for all the traffic that it handles. At any given time, the host
might be active or inactive, and so might the BMC. It is expected that when only the MC is enabled, it
acts as the KaY controlling the secured channel. The host can act as the KaY when it is functional and
the control switch was executed. The following sections describe the semaphore between the MC and the
host that controls the MACSec setting, and its tamper resistance (protection) mechanism.
7.9.1.6.1
Key and Tamper Protection
MACSec gives the network administrator a means to protect the network infrastructure from hostile or
unauthorized devices. Since the local host operating system might itself be compromised, the hardware
protects vital MACSec context from software access. There are two levels of protection:
• Disable host read access to the MACSec keys (keys are write-only).
• Disable host access to MACSec logic while the firmware manages the MACSec Secure Channel (SC).
7.9.1.6.2
Key Protection
The MACSec keys are protected against read accesses at all times. Neither software nor firmware can
read back the keys that the hardware uses for transmit and receive activity. Instead, the
hardware lets software and firmware read a signature that can be used to verify proper
programming of the device. The signature is a simple byte XOR operation of the Tx and Rx keys,
readable in the LSECTXSUM and LSECRXSUM fields in the LSECTXCAP and LSECRXCAP registers.
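A driver could compute the expected signature locally and compare it with the value read back from the device. The sketch below assumes the signature is the XOR of all bytes of a key folded into a single byte; the exact fold and field width should be checked against the LSECTXSUM and LSECRXSUM field descriptions, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR-folds every byte of the key into one signature byte (assumed scheme). */
static uint8_t key_xor_signature(const uint8_t *key, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum ^= key[i];
    return sum;
}
```

Because only the XOR of the bytes is exposed, the signature confirms that a key was written as intended without revealing the key itself.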
7.9.1.6.3
Tamper Protection
In a scenario where the host failed authentication and therefore cannot act as the KaY, the MC disables
host access to the network and manages the MACSec channel while the host operating system is already up
and running. In such cases, the hardware provides the required hooks to protect MACSec connectivity
against hostile software. The MC firmware can disable write accesses generated by the host CPU (on
the PCI interface) by setting the Lock MACSec Logic bit (bit 0) in the LSWFW register.
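The effect of the lock bit can be modeled as below. This is a hedged software model, not firmware code: the macro and function names are hypothetical, the register is represented by a plain variable rather than a memory-mapped address, and only the bit position (bit 0) comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lock MACSec Logic is bit 0 of LSWFW; the macro name is hypothetical. */
#define LSWFW_LOCK_MACSEC_LOGIC (1u << 0)

/* Returns the LSWFW value with the lock bit set (done by MC firmware). */
static uint32_t lswfw_lock(uint32_t lswfw)
{
    return lswfw | LSWFW_LOCK_MACSEC_LOGIC;
}

static bool lswfw_is_locked(uint32_t lswfw)
{
    return (lswfw & LSWFW_LOCK_MACSEC_LOGIC) != 0;
}

/* Models a host write to a MACSec register: ignored while the logic is locked. */
static void host_write_macsec_reg(uint32_t lswfw, uint32_t *reg, uint32_t value)
{
    if (!lswfw_is_locked(lswfw))
        *reg = value;
}
```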
7.9.1.6.4
MACSec Control Switch Between Firmware and Software
The stages to switch MACSec control ownership between MC and the host are described in
Chapter 10.0. The owner after the switch procedure must assume all KaY needed responsibility.
7.9.1.7
Manageability Flow
7.9.1.7.1
Initialization
In the manageability case, the main difference in initialization is that in some cases the host is
off. In such cases, the BMC should perform the authentication, MACSec setup, and SGT acquisition on its own.
When the host is on, it is the host's responsibility to acquire the SGT values for the BMC.
It is assumed that the BMC uses only one SGT on the TX side, so no table is needed, only one SGT
register. On the RX side, the table holds one vector for the BMC, in the same manner as an additional
queue. When the host is off, it is the BMC's responsibility to also initialize the HW tables for the host
entry (disable traffic in both directions).
7.9.1.7.2
Operation flow
It is assumed that the manageability traffic is assigned only one SGT. The SGT value that the
HW adds to the CMD TAG is stored in register CTSTXCTL. On the receive side, the CTSRXMNGT table
is used for filtering traffic.
7.9.1.8
Switching Ownership Between Host and Manageability
Since it is assumed that CTS will never be activated without MACSec the CTS ownership is tightly
coupled with MACSec ownership. In other words the entity that owns the MACSec logic also owns the
CTS tagging.
7.9.2
IPSec Support
Note:
This section defines the hardware requirements for the IPsec off-load ability included in the 82576.
IPsec off-load is the ability to handle a certain number of the total IPsec flows in hardware,
while the remaining flows are still handled by the operating system. It is the operating system's
responsibility to submit the most loaded flows to hardware, in order to gain the maximum benefit from
IPsec off-load in terms of CPU utilization savings. The establishment of the IPsec Security Associations
between peers is outside the scope of this document, since it is always handled by the operating system.
In general, the requirements on the driver or on the operating system for enabling IPsec off-load are not
detailed here.
When an IPsec flow is handled in software, since the packet might be encrypted and the integrity check
field already valid (IPv4 options might be present in the packet together with IPsec headers), the 82576
processes it as it does any other unsupported Layer 4 protocol, without performing any
Layer 4 offload on it.
7.9.2.1
Related RFCs and Other References
• RFC4106 — The Use of Galois/Counter Mode (GCM) in IPsec Encapsulating Security Payload (ESP)
• RFC4302 — IP Authentication Header (AH)
• RFC4303 — IP Encapsulating Security Payload (ESP)
• RFC4543 — The Use of Galois Message Authentication Code (GMAC) in IPsec ESP and AH
• GCM spec — McGrew, D. and J. Viega, “The Galois/Counter Mode of Operation (GCM)”, Submission
to NIST. http://csrc.nist.gov/CryptoToolkit/modes/proposedmodes/gcm/gcm-spec.pdf, January
2004.
7.9.2.2
Hardware Features List
7.9.2.2.1
Main Features
• off-load IPsec for up to 256 Security Associations (SA) for each side separately, Tx and Rx.
— On-chip storage for both Tx and Rx SA tables
— Tx SA index is conveyed to hardware via Tx context descriptor
— Rx SA lookup is a deterministic search according to a search key made of SPI, destination IP
address, and IP version type (IPv6 or IPv4)
— Performance in Rx: update the whole Rx SA table in less than 1 ms, while receiving back-to-back 64-byte packets
• IPsec protocols:
— IP Authentication Header (AH) protocol for authentication
— IP Encapsulating Security Payload (ESP) for authentication only
— IP Encapsulating Security Payload (ESP) for both authentication and encryption, only if using
the same key for both
• Crypto engines:
— For AH or ESP authentication only use AES-128-GMAC (128-bit key)
— For ESP encryption and authentication use AES-128-GCM (128-bit key)
• IPsec encapsulation mode: Transport mode
— In Tx, packets are provided by software already encapsulated with a valid IPsec header (for AH
with blank ICV inside), and
• for ESP single send, with a valid ESP trailer and ESP ICV (blank ICV)
• for ESP large send, without ESP trailer and without ESP ICV
— In Rx, packets are provided to software encapsulated with their IPsec header and for ESP with
the ESP trailer and ESP ICV,
• where up to 255 bytes of incoming ESP padding is supported, for peers that
prefer hiding the packet length
• IP versions:
— IPv4 packets that do not include any IP option
— IPv6 packets that do not include any extension header (other than AH/ESP extension header)
• Rx statuses reported to software via Rx descriptor:
— Packet type: AH/ESP
— IPsec off-load done (SA match)
• One Rx error reported to software via Rx descriptor in the following precedence order: No error,
Invalid IPsec protocol, Packet length error, Authentication failed
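The Rx SA lookup listed under the main features uses a search key made of the SPI, the destination IP address, and the IP version. The following sketch models that lookup as a linear scan over a 256-entry table. The hardware search method is not described here, so the scan, the structure layout, and all names are assumptions used only to show what constitutes a match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPSEC_RX_SA_ENTRIES 256  /* Rx SA count from the feature list */

/* Hypothetical search key: SPI, destination IP address, IP version. */
struct ipsec_rx_key {
    uint32_t spi;
    uint8_t  dst_ip[16];  /* an IPv4 address uses the first 4 bytes */
    bool     is_ipv6;
};

struct ipsec_rx_entry {
    bool                valid;
    struct ipsec_rx_key key;
};

static bool ipsec_key_equal(const struct ipsec_rx_key *a,
                            const struct ipsec_rx_key *b)
{
    size_t n = a->is_ipv6 ? 16 : 4;  /* compare only the bytes in use */

    return a->spi == b->spi && a->is_ipv6 == b->is_ipv6 &&
           memcmp(a->dst_ip, b->dst_ip, n) == 0;
}

/* Returns the index of the matching SA, or -1 when no SA matches. */
static int ipsec_rx_lookup(const struct ipsec_rx_entry *tbl,
                           const struct ipsec_rx_key *key)
{
    for (int i = 0; i < IPSEC_RX_SA_ENTRIES; i++)
        if (tbl[i].valid && ipsec_key_equal(&tbl[i].key, key))
            return i;
    return -1;
}
```

A packet whose key matches no entry is not off-loaded, which corresponds to the "IPsec off-load done (SA match)" status not being reported in the Rx descriptor.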
7.9.2.2.2
Cross Features
• w/ segmentation: full coexistence (TCP/UDP packets only)
— increment IPsec Sequence Number (SN) and Initialization Vector (IV) on each additional
segment
• w/ checksum off-load: full coexistence (Tx and Rx)
— IP header checksum
— TCP/UDP checksum
• w/ IP fragment: no IPsec offload don