mirror_ubuntu-kernels/drivers/pci
Lukas Wunner a97396c6eb PCI: pciehp: Ignore Link Down/Up caused by DPC
Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and attempts to re-enable it when instructed by the DPC driver.

A slot which is both DPC- and hotplug-capable is currently powered off by
pciehp once DPC is triggered (due to the link change) and powered back up
on successful recovery.  That's undesirable:  The slot should remain powered
so the hotplugged device remains bound to its driver.  DPC notifies the
driver of the error and of successful recovery in pcie_do_recovery() and
the driver may then restore the device to working state.

Moreover, Sinan points out that turning off slot power by pciehp may foil
recovery by DPC:  Power off/on is a cold reset that runs concurrently with
DPC's warm reset.  Sathyanarayanan reports extended delays or failure in
link retraining by DPC if pciehp brings down the slot.

Fix by detecting whether a Link Down event is caused by DPC and awaiting
recovery if so.  On successful recovery, ignore both the Link Down and the
subsequent Link Up event.

Afterwards, check whether the link is down to detect surprise-removal or
another DPC event immediately after DPC recovery.  Ensure that the
corresponding DLLSC event is not ignored by synthesizing it and invoking
irq_wake_thread() to trigger a re-run of pciehp_ist().
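
The re-check after recovery can be pictured with a small user-space model.
This is only a sketch, not the driver code:  struct model_ctrl,
model_wake_thread() and model_recheck_link() are hypothetical names, the
wake-up is reduced to a counter, and MODEL_EVT_DLLSC merely stands in for
the Data Link Layer State Changed event bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the Data Link Layer State Changed event bit. */
#define MODEL_EVT_DLLSC 0x0100u

/* Hypothetical model of the controller state; not the kernel structure. */
struct model_ctrl {
	atomic_uint pending_events; /* events the IRQ thread still has to handle */
	unsigned int wakeups;       /* number of requested IRQ thread re-runs */
};

/* Stand-in for waking the IRQ thread (the commit uses irq_wake_thread()). */
static void model_wake_thread(struct model_ctrl *c)
{
	c->wakeups++;
}

/*
 * Called after a DPC-caused Link Down/Up pair has been ignored.  If the
 * link is down again, synthesize a DLLSC event so that it is not lost
 * and ask for another run of the IRQ thread.
 */
static void model_recheck_link(struct model_ctrl *c, bool link_active)
{
	assert(c);

	if (link_active)
		return;

	atomic_fetch_or(&c->pending_events, MODEL_EVT_DLLSC);
	model_wake_thread(c);
}
```

The atomic OR is used in the sketch so that an event which the interrupt
handler queues at the same time is preserved rather than overwritten.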

The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
dpc_handler(), race against each other.  If pciehp is faster than DPC, it
will wait until DPC recovery completes.

Recovery consists of two steps:  The first step (waiting for link
disablement) is recognizable by pciehp through a set DPC Trigger Status
bit.  The second step (waiting for link retraining) is recognizable through
a newly introduced PCI_DPC_RECOVERING flag.

If DPC is faster than pciehp, neither of the two flags will be set and
pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag.
The flag is zero if DPC didn't occur at all, hence DLLSC events are not
ignored by default.
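
The three pieces of state described above can be modelled in user space as
follows.  This is a sketch under assumptions, not the driver code:  struct
model_dpc and both functions are hypothetical, and clearing the recovered
flag when it is read is an assumption made here so that a later, unrelated
DLLSC event is handled normally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state pciehp inspects; not kernel code. */
struct model_dpc {
	bool trigger_status; /* DPC Trigger Status bit: link disablement pending */
	bool recovering;     /* stands in for PCI_DPC_RECOVERING: retraining pending */
	bool recovered;      /* stands in for PCI_DPC_RECOVERED: recovery succeeded */
};

/* True while either recovery step is still running, i.e. pciehp must wait. */
static bool model_dpc_in_progress(const struct model_dpc *d)
{
	assert(d);
	return d->trigger_status || d->recovering;
}

/*
 * Decide whether a DLLSC event may be ignored once waiting is over.
 * Assumption: the recovered flag is consumed (cleared) by this check.
 * If DPC never occurred, the flag is false and the event is handled.
 */
static bool model_ignore_dllsc(struct model_dpc *d)
{
	bool recovered;

	assert(d);
	recovered = d->recovered;
	d->recovered = false;
	return recovered;
}
```

With all three members false (DPC never occurred), model_ignore_dllsc()
returns false, which matches the default of not ignoring DLLSC events.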

pciehp waits up to 4 seconds before assuming that DPC recovery failed and
bringing down the slot.  This timeout is not taken from the spec (it
doesn't mandate one) but based on a report from Yicong Yang that DPC may
take a bit more than 3 seconds on HiSilicon's Kunpeng platform.

The timeout is necessary because the DPC Trigger Status bit may never
clear:  On Root Ports which support RP Extensions for DPC, the DPC driver
polls the DPC RP Busy bit for up to 1 second before giving up on DPC
recovery.  Without the timeout, pciehp would then wait indefinitely for DPC
to complete.
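
The effect of the timeout can be illustrated with a bounded wait over
simulated time.  This is a sketch only:  a polling loop stands in for
whatever blocking wait the driver actually uses, and model_done_fn,
model_done_at() and model_wait_for_dpc() are hypothetical names.  Only the
4 second limit is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* From the commit message: pciehp waits up to 4 seconds. */
#define MODEL_TIMEOUT_MS 4000u

/* Hypothetical predicate: has DPC recovery completed at time now_ms? */
typedef bool (*model_done_fn)(unsigned int now_ms, void *ctx);

/* Example predicate: recovery completes at the time stored in *ctx. */
static bool model_done_at(unsigned int now_ms, void *ctx)
{
	const unsigned int *done_ms = ctx;

	return now_ms >= *done_ms;
}

/*
 * Poll done() every step_ms of simulated time.  Returns true if recovery
 * completed within MODEL_TIMEOUT_MS, false if the wait gave up.
 */
static bool model_wait_for_dpc(model_done_fn done, void *ctx,
			       unsigned int step_ms)
{
	assert(done);
	assert(step_ms > 0);

	for (unsigned int now = 0; now <= MODEL_TIMEOUT_MS; now += step_ms) {
		if (done(now, ctx))
			return true;
	}
	return false;
}
```

A recovery that takes slightly more than 3 seconds still falls inside the
limit, whereas a recovery that never completes makes the wait return false
instead of blocking forever.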

This commit draws inspiration from previous attempts to synchronize DPC
with pciehp:

By Sinan Kaya, August 2018:
https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/

By Ethan Zhao, October 2020:
https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/

By Kuppuswamy Sathyanarayanan, March 2021:
https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/

Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de
Reported-by: Sinan Kaya <okaya@kernel.org>
Reported-by: Ethan Zhao <haifeng.zhao@intel.com>
Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <kbusch@kernel.org>
2021-06-16 17:16:57 -05:00
controller pci-v5.13-changes 2021-05-05 13:24:11 -07:00
endpoint Merge branch 'remotes/lorenzo/pci/endpoint' 2021-05-04 10:43:27 -05:00
hotplug PCI: pciehp: Ignore Link Down/Up caused by DPC 2021-06-16 17:16:57 -05:00
pcie PCI: pciehp: Ignore Link Down/Up caused by DPC 2021-06-16 17:16:57 -05:00
switch PCI: switchtec: Add missing __iomem tag to fix sparse warnings 2020-07-31 11:23:45 -05:00
access.c Merge branch 'pci/misc' 2020-08-05 18:24:16 -05:00
ats.c PCI: Fix kernel-doc errors 2021-03-11 17:37:20 -06:00
bus.c PCI: Add device even if driver attach failed 2020-07-07 17:33:41 -05:00
ecam.c PCI: Unify ECAM constants in native PCI Express drivers 2020-12-10 14:55:49 -06:00
host-bridge.c
iov.c PCI/IOV: Add sysfs MSI-X vector assignment interface 2021-04-04 10:26:30 +03:00
irq.c PCI: Remove unused pci_lost_interrupt() 2020-07-29 14:25:18 -05:00
Kconfig pci-v5.10-changes 2020-10-22 12:41:00 -07:00
Makefile PCI: Apply CONFIG_PCI_DEBUG to entire drivers/pci hierarchy 2021-02-09 15:10:20 -06:00
mmap.c
msi.c PCI/MSI: Document the various ways of ending up with NO_MSI 2021-04-20 14:11:22 +01:00
of.c PCI: Fix kernel-doc errors 2021-03-11 17:37:20 -06:00
p2pdma.c RDMA 5.11 pull request 2020-12-16 13:42:26 -08:00
pci-acpi.c PCI/ACPI: Fix acpi_pci_set_power_state() debug message 2021-04-01 14:54:43 -05:00
pci-bridge-emul.c PCI: pci-bridge-emul: Fix array overruns, improve safety 2021-02-17 17:25:31 -06:00
pci-bridge-emul.h
pci-driver.c Merge branch 'pci/misc' 2020-12-15 15:11:08 -06:00
pci-label.c PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in "show" functions 2021-04-29 10:07:31 -05:00
pci-mid.c PCI: intel-mid: Convert to new X86 CPU match macros 2020-03-24 21:35:06 +01:00
pci-pf-stub.c PCI/IOV: Simplify pci-pf-stub with module_pci_driver() 2020-09-17 12:40:20 -05:00
pci-stub.c
pci-sysfs.c pci-v5.13-changes 2021-05-05 13:24:11 -07:00
pci.c pci-v5.13-changes 2021-05-05 13:24:11 -07:00
pci.h PCI: pciehp: Ignore Link Down/Up caused by DPC 2021-06-16 17:16:57 -05:00
probe.c Merge branch 'remotes/lorenzo/pci/msi' 2021-05-04 10:43:30 -05:00
proc.c PCI: Revoke mappings like devmem 2021-02-11 15:59:19 +01:00
quirks.c Merge branch 'remotes/lorenzo/pci/msi' 2021-05-04 10:43:30 -05:00
remove.c PCI/sysfs: Convert "reset" to static attribute 2021-04-27 17:53:20 -05:00
rom.c PCI: Use ioremap(), not phys_to_virt() for platform ROM 2020-03-30 09:52:23 -05:00
search.c PCI: Remove WARN_ON(in_interrupt()) 2021-02-10 16:46:29 -06:00
setup-bus.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
setup-irq.c
setup-res.c PCI: Decline to resize resources if boot config must be preserved 2021-01-12 16:39:52 -06:00
slot.c Merge branch 'pci/misc' 2020-12-15 15:11:08 -06:00
syscall.c PCI: Align checking of syscall user config accessors 2021-01-27 10:41:59 -06:00
vc.c PCI: Fix kerneldoc warnings 2020-08-05 18:23:14 -05:00
vpd.c Merge branch 'pci/sysfs' 2021-05-04 10:43:23 -05:00
xen-pcifront.c swiotlb: remove swiotlb_nr_tbl 2021-03-19 04:58:25 +00:00