EEH Overview - Linux Foundation Events

IBM Linux Technology Center
September 16, 2011
EEH Overview
Gavin Shan
Linux Technology Center, IBM, China
shangw@cn.ibm.com
May 8, 2012
© 2006 IBM Corporation
Agenda
What's EEH?
High-level Overview
How the EEH Core Works
Further Development Work
IBM Confidential
What's EEH?



EEH stands for Extended Error Handling.
Isolates PCI errors within IO domains without affecting the rest of the system
- Enhanced system reliability and availability
A feature available on Power platforms only.
What is EEH?
RTAS (Run-Time Abstraction Services) in the firmware provides PCI error
related services to the OS.


The EEH core in the OS handles the error by either
– Taking appropriate corrective actions
– Or resetting the IO domain responsible for the error.
RTAS services are documented in PAPR (Power Architecture Platform
Requirements, www.power.org).

High-level Overview
[Diagram: a device driver's read on PCI config space or MMIO space leads to "Error Detected". The error passes to the EEH core, which works through RTAS (Open Firmware). Results: errors cleared, or PCI devices removed.]
Partitionable Endpoint (PE)

PE is an I/O error and recovery domain made up of
– A single or multi-function IO Adapter or
– A function of a multi-function IO Adapter or
– Multiple IOAs, possibly including upstream switches and bridges
Partitionable Endpoint (PE) is defined in PAPR (Power Architecture Platform
Requirements).


RTAS-compliant firmware supports EEH-related operations at PE granularity.
Partitionable Endpoint (PE)
[Diagram: CPUs, two PCI Host Bridges (PHBs), a PCI bridge, a PCIE-PCI bridge and three PCI IOAs, with two groups of devices outlined as PEs.]
EEH RTAS Calls
RTAS (Run-Time Abstraction Services) calls are intended to insulate the OS
from having to know about, and directly manipulate, the registers of the
platform hardware.

[Diagram labels, first path: Guest, Hypercall, QEMU ("QEMU emulates RTAS call", "Return control to guest"). Second path: Guest, Hypervisor, RTAS, OPAL, Open Firmware.]
EEH RTAS Calls
OF (Open Firmware) uses the device tree to pass information to the Linux
kernel. Device tree nodes are usually called OF nodes. The OF node “/rtas”
includes many properties that designate EEH calls.

- “ibm,set-eeh-option”
Enable/Disable EEH for PE, or enable/disable MMIO/DMA for PE.
- “ibm,set-slot-reset”
Reset PE
- “ibm,read-slot-reset-state2”, “ibm,read-slot-reset-state”
Query the state of PE
- “ibm,slot-error-detail”
Retrieve error log
- “ibm,get-config-addr-info2”, “ibm,get-config-addr-info”
Retrieve PE address
- “ibm,configure-pe”, “ibm,configure-bridge”
Configure the PCI bridges in the PE
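
The list above could be held in a small lookup table. This is a minimal user-space sketch, not kernel code: the type `eeh_rtas_call`, the array `eeh_rtas_calls` and the helper `eeh_rtas_purpose()` are hypothetical names introduced for illustration. Only the property strings and their descriptions come from the slide, and only the first-listed name of each pair is included.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: an RTAS property name and what the call is for. */
struct eeh_rtas_call {
    const char *property;
    const char *purpose;
};

/* Property names and purposes as listed on the slide. */
const struct eeh_rtas_call eeh_rtas_calls[] = {
    { "ibm,set-eeh-option",         "Enable/disable EEH, MMIO or DMA for PE" },
    { "ibm,set-slot-reset",         "Reset PE" },
    { "ibm,read-slot-reset-state2", "Query the state of PE" },
    { "ibm,slot-error-detail",      "Retrieve error log" },
    { "ibm,get-config-addr-info2",  "Retrieve PE address" },
    { "ibm,configure-pe",           "Configure the PCI bridges in the PE" },
};

/* Return the purpose for a property name, or NULL if it is not in the table. */
const char *eeh_rtas_purpose(const char *property)
{
    size_t n = sizeof(eeh_rtas_calls) / sizeof(eeh_rtas_calls[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(eeh_rtas_calls[i].property, property) == 0)
            return eeh_rtas_calls[i].purpose;
    }
    return NULL;
}
```

A real kernel would resolve each property to a firmware token rather than a description string; the table here only shows the name-to-service mapping.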
PCI Access with EEH Errors
EEH traps PCI config and MMIO read errors, and continues with normal
operation if the corresponding PE is not in the frozen state.

A PCI config or MMIO read that returns all 0xFF's is the criterion for
trapping into EEH.


A PE in frozen state drops all PCI config and MMIO writes quietly.
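
The all-0xFF's criterion can be sketched as a width-aware check. This is a minimal illustration under assumptions: the function name `eeh_read_looks_failed()` and the choice of 1, 2 and 4 byte access widths are not from the slides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when a value read back from PCI config or MMIO space is
 * all 0xFF bytes for the given access width (1, 2 or 4 bytes).
 * Hypothetical helper; an unsupported width is reported as "not failed".
 */
bool eeh_read_looks_failed(uint32_t val, size_t size)
{
    switch (size) {
    case 1:
        return (val & 0xFFu) == 0xFFu;
    case 2:
        return (val & 0xFFFFu) == 0xFFFFu;
    case 4:
        return val == 0xFFFFFFFFu;
    default:
        return false;
    }
}
```

All 0xFF's can also be legitimate data, which is why the check only triggers a look at the PE's state rather than being treated as an error by itself.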
EEH Device
The EEH device, “struct eeh_dev”, tracks the EEH-related information for each
PCI device.

An OF node, “struct device_node”, represents a device tree node.

For PCI-based OF nodes, “struct pci_dn” is introduced to store the PCI
related information (e.g. bus number, slot, etc.)

[Diagram: how the structures point to each other]
- device_node: void *data (points to the pci_dn)
- pci_dn: struct eeh_dev *edev
- eeh_dev: struct device_node *dn, struct pci_dev *pdev
- pci_dev: device, archdata.edev (points to the eeh_dev)
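
The links between these structures can be sketched with simplified stand-ins. This is not the kernel's definition of any of these types: only the link fields named on the slide are kept, the member name `dev` and the type name `dev_archdata` are assumptions, and the two `sketch_*` helpers are hypothetical.

```c
#include <stddef.h>

struct eeh_dev;

/* PCI-related data attached to a PCI-based OF node. */
struct pci_dn {
    struct eeh_dev *edev;
};

/* Device tree (OF) node. */
struct device_node {
    void *data;                 /* for PCI-based OF nodes: a struct pci_dn */
};

/* Assumed container for the "archdata.edev" label on the slide. */
struct dev_archdata {
    struct eeh_dev *edev;
};

struct device {
    struct dev_archdata archdata;
};

struct pci_dev {
    struct device dev;          /* assumed member name */
};

/* EEH device: links back to both the OF node and the PCI device. */
struct eeh_dev {
    struct device_node *dn;
    struct pci_dev *pdev;
};

/* Hypothetical helper: OF node -> pci_dn -> EEH device. */
struct eeh_dev *sketch_node_to_edev(const struct device_node *dn)
{
    const struct pci_dn *pdn = dn ? dn->data : NULL;

    return pdn ? pdn->edev : NULL;
}

/* Hypothetical helper: PCI device -> archdata -> EEH device. */
struct eeh_dev *sketch_pdev_to_edev(const struct pci_dev *pdev)
{
    return pdev ? pdev->dev.archdata.edev : NULL;
}
```

Either starting point, an OF node or a PCI device, reaches the same EEH device, which is what lets both the config-space and the MMIO paths converge.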
PCI Address Cache

The PCI address cache stores PCI devices in a red-black (RB) tree.

It helps find the EEH device associated with an MMIO address.
Each node in the RB tree stores an MMIO physical window and its associated
PCI device.
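
The lookup from an MMIO address to its device can be sketched as a range search. To keep the example short, it swaps the RB tree for a sorted array with binary search; the idea of matching an address against non-overlapping windows is the same. The names `addr_range`, `dev_id` and `addr_cache_lookup()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One cached MMIO window and the device that owns it. */
struct addr_range {
    uint64_t start;   /* first physical address of the window */
    uint64_t end;     /* last physical address of the window, inclusive */
    int dev_id;       /* hypothetical handle for the owning PCI device */
};

/*
 * Find the device whose window contains addr.
 * ranges must be sorted by start and must not overlap.
 * Returns the dev_id, or -1 when no window matches.
 */
int addr_cache_lookup(const struct addr_range *ranges, size_t n, uint64_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < ranges[mid].start)
            hi = mid;                    /* search the lower half */
        else if (addr > ranges[mid].end)
            lo = mid + 1;                /* search the upper half */
        else
            return ranges[mid].dev_id;   /* addr falls inside this window */
    }
    return -1;
}
```

A tree suits the real cache better than an array because devices are added and removed at run time, while the comparison at each step is the same window test.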

How EEH Talks to Device Drivers (DD)

The EEH core uses EEH handlers registered by device drivers.
The EEH core and device drivers communicate and handle errors through the
EEH handlers.

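struct pci_error_handlers {
        /* PCI bus error detected on this device */
        pci_ers_result_t (*error_detected)(struct pci_dev *dev,
                                           enum pci_channel_state error);
        /* MMIO has been re-enabled, but not DMA */
        pci_ers_result_t (*mmio_enabled)(struct pci_dev *dev);
        /* PCI Express link has been reset */
        pci_ers_result_t (*link_reset)(struct pci_dev *dev);
        /* PCI slot has been reset */
        pci_ers_result_t (*slot_reset)(struct pci_dev *dev);
        /* Device driver may resume normal operations */
        void (*resume)(struct pci_dev *dev);
};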
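
How a driver might fill in these handlers can be sketched in user space. The types below are stand-ins so the example compiles on its own; they mirror the kernel's names but are not its definitions. The `demo_*` functions, the `io_stopped` field and the choice of return values are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types used by the handlers. */
enum pci_channel_state {
    pci_channel_io_normal = 1,
    pci_channel_io_frozen = 2,
    pci_channel_io_perm_failure = 3
};

typedef enum {
    PCI_ERS_RESULT_NONE = 1,
    PCI_ERS_RESULT_CAN_RECOVER = 2,
    PCI_ERS_RESULT_NEED_RESET = 3,
    PCI_ERS_RESULT_DISCONNECT = 4,
    PCI_ERS_RESULT_RECOVERED = 5
} pci_ers_result_t;

struct pci_dev {
    bool io_stopped;            /* hypothetical driver state */
};

struct pci_error_handlers {
    pci_ers_result_t (*error_detected)(struct pci_dev *dev,
                                       enum pci_channel_state error);
    pci_ers_result_t (*mmio_enabled)(struct pci_dev *dev);
    pci_ers_result_t (*link_reset)(struct pci_dev *dev);
    pci_ers_result_t (*slot_reset)(struct pci_dev *dev);
    void (*resume)(struct pci_dev *dev);
};

/* Stop IO, then tell the EEH core what the driver wants. */
pci_ers_result_t demo_error_detected(struct pci_dev *dev,
                                     enum pci_channel_state error)
{
    dev->io_stopped = true;
    if (error == pci_channel_io_perm_failure)
        return PCI_ERS_RESULT_DISCONNECT;   /* device is gone for good */
    return PCI_ERS_RESULT_NEED_RESET;       /* ask for a reset */
}

/* Called after the reset; a real driver would reinitialize the device. */
pci_ers_result_t demo_slot_reset(struct pci_dev *dev)
{
    (void)dev;
    return PCI_ERS_RESULT_RECOVERED;
}

/* Recovery finished: restart normal IO. */
void demo_resume(struct pci_dev *dev)
{
    dev->io_stopped = false;
}

/* Handlers left unset stay NULL and are simply not called. */
const struct pci_error_handlers demo_err_handlers = {
    .error_detected = demo_error_detected,
    .slot_reset     = demo_slot_reset,
    .resume         = demo_resume,
};
```

The return value of `error_detected` is the driver's "desire" that the EEH core collects before deciding between recovery, reset and removal.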
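How the EEH Core Works
[Flow chart, reconstructed from its labels]
- A read on MMIO space returns 0xFF's: PCI address cache, then PCI device, then retrieve the EEH device (struct eeh_dev).
- A read on PCI config space returns 0xFF's: OF node, then retrieve the EEH device (struct eeh_dev).
- Retrieve the PE sensitive EEH device.
- Check the PE's state. Good PE state: exit. Frozen PE: continue (…..).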
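
The detection path can be sketched as a single decision function. This is an illustration under assumptions: the firmware query is replaced by a callback, and the names `pe_state`, `pe_state_query_fn`, `eeh_should_handle_sketch()` and `demo_query()` are hypothetical. The demo query simply pretends that PE address 7 is frozen.

```c
#include <stdbool.h>
#include <stdint.h>

enum pe_state { PE_STATE_GOOD, PE_STATE_FROZEN };

/* Hypothetical callback standing in for the firmware query of a PE's state. */
typedef enum pe_state (*pe_state_query_fn)(int pe_addr);

/*
 * Decide whether error handling should continue for a 32-bit read.
 * Returns false when the value is not all 0xFF's, or when the PE turns
 * out to be in a good state; returns true for a frozen PE.
 */
bool eeh_should_handle_sketch(uint32_t val, int pe_addr, pe_state_query_fn query)
{
    if (val != 0xFFFFFFFFu)
        return false;                       /* ordinary data, nothing to do */
    return query(pe_addr) == PE_STATE_FROZEN;
}

/* Demo query for the example only: PE address 7 is reported as frozen. */
enum pe_state demo_query(int pe_addr)
{
    return pe_addr == 7 ? PE_STATE_FROZEN : PE_STATE_GOOD;
}
```

The state query is only made after the cheap all-0xFF's test, so normal reads never pay for a firmware call.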
How the EEH Core Works
[Flow chart, continued]
- Send an EEH event.
- Dump the stack.
- Start a new kernel thread to process the EEH event (…..).
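How the EEH Core Works
[Flow chart, reconstructed from its labels; branch targets are inferred from the label order]
- Retrieve the associated PCI bus of the problematic PE.
- Enable MMIO for the PE and collect the temporary failure log.
- Report the error to all child PCI devices of the PCI bus and collect the DD's desire.
- Check the DD's desire:
  - Disconnect: remove the PCI bus and the associated PCI devices.
  - None: PE reset.
  - Can recover: enable MMIO, then enable DMA.
- On failure: remove the PCI bus and the associated PCI devices.
- Otherwise: notify the DD to resume.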
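
The recovery decision can be sketched as a small function. This is a simplified reading of the flow chart, not the kernel's implementation: the enum and function names are hypothetical, and attaching the "Failure" branch to both the reset step and the enable steps is an assumption.

```c
#include <stdbool.h>

/* What the device drivers asked for, as labelled on the flow chart. */
enum dd_desire { DD_NONE, DD_CAN_RECOVER, DD_DISCONNECT };

/* Final result of the recovery attempt. */
enum recovery_outcome { RECOVERY_RESUMED, RECOVERY_REMOVED };

/*
 * desire:  the collected desire of the device drivers.
 * step_ok: whether the chosen recovery step succeeded (the PE reset for
 *          DD_NONE, or enabling MMIO and DMA for DD_CAN_RECOVER).
 */
enum recovery_outcome eeh_recover_sketch(enum dd_desire desire, bool step_ok)
{
    switch (desire) {
    case DD_DISCONNECT:
        /* Remove the PCI bus and the associated PCI devices. */
        return RECOVERY_REMOVED;
    case DD_NONE:           /* PE reset */
    case DD_CAN_RECOVER:    /* enable MMIO, then enable DMA */
        /* Success: notify the DD to resume. Failure: remove the devices. */
        return step_ok ? RECOVERY_RESUMED : RECOVERY_REMOVED;
    }
    return RECOVERY_REMOVED;    /* unknown desire: take the safe path */
}
```

Removal is the fallback on every failing path, which matches the two results named in the high-level overview: errors cleared, or PCI devices removed.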
Further Development Work
Explicit PE support. This will introduce a data structure, “eeh_pe”, to
represent a PE so that the EEH core becomes more data-centric. The EEH core
needs some rework accordingly.


EEH support for the P7IOC-based powernv platform.

EEH emulation for KVM-based guests.
Legal Statement
This work represents the view of the author and does not necessarily
represent the view of IBM.
IBM, IBM (logo), AIX, POWER, POWER6, POWER7 and PowerVM are
trademarks or registered trademarks of International Business Machines
Corporation in the United States and/or other countries.
Linux is a registered trademark of Linus Torvalds.
Other company, product and service names may be trademarks or service
marks of others.
Thanks & Questions