We used to first parse all the _HPP and _HPX tables before using the
information to program registers of PCIe devices. Up through HPX Type 2,
there was only one structure of each type, so we could cheat and store it
on the stack.
With HPX Type 3 we get an arbitrary number of entries, so the above model
doesn't scale that well. Instead of parsing all tables at once, parse and
program each entry separately. For _HPP and _HPX Types 0 through 2, this
is functionally equivalent. The change enables the upcoming _HPX Type 3 to
integrate more easily.
Link: https://lore.kernel.org/lkml/20190208162414.3996-3-mr.nuke.me@gmail.com
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
[bhelgaas: fix build errors]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The "Enhanced Allocation (EA) for Memory and I/O Resources" ECN, approved
23 October 2014, sec 6.9.1.2, specifies a second DW in the capability for
type 1 (bridge) functions to describe fixed secondary and subordinate bus
numbers. This ECN was included in the PCIe r4.0 spec, but sec 6.9.1.2 was
omitted, presumably by mistake.
Read fixed bus numbers from the EA capability for bridges.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
[bhelgaas: add pci_ea_fixed_busnrs() return value]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Two functions allocate a host bridge: devm_pci_alloc_host_bridge() and
pci_alloc_host_bridge(). At the moment, only the unmanaged one initializes
the PCIe feature bits, which prevents from using features such as hotplug
or AER on some systems, when booting with device tree. Make the
initialization code common.
Fixes: 02bfeb4842 ("PCI/portdrv: Simplify PCIe feature permission checking")
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.17+
If a multi-function device's bandwidth is already limited when it is
enumerated, a message is logged only for function 0. By contrast, when
downtraining occurs after enumeration, a message is logged for all
functions. That's because the former uses pcie_report_downtraining(),
whereas the latter uses __pcie_print_link_status() (which doesn't filter
functions != 0). I am seeing this happen on a MacBookPro9,1 with a GPU
(function 0) and an integrated HDA controller (function 1).
Avoid this incongruence by calling pcie_report_downtraining() in both
cases.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alexandru Gagniuc <alex.gagniuc@dellteam.com>
- Probe bridge window attributes only once at enumeration-time to fix
device accesses during rescan (Bjorn Helgaas)
- Return BAR size (not "size -1 ") from pci_size() to simplify code (Du
Changbin)
- Use config header type (not class code) identify bridges more reliably
(Honghui Zhang)
- Work around Intel Denverton incorrect Trace Hub BAR size reporting
(Alexander Shishkin)
* pci/enumeration:
x86/PCI: Fixup RTIT_BAR of Intel Denverton Trace Hub
PCI: Rely on config space header type, not class code
PCI: Make pci_size() return real BAR size
PCI: Probe bridge window attributes once at enumeration-time
- Use Latency Tolerance Reporting if already enabled by platform (Bjorn
Helgaas)
- Save/restore LTR info for suspend/resume (Bjorn Helgaas)
* pci/aspm:
PCI/ASPM: Save LTR Capability for suspend/resume
PCI/ASPM: Use LTR if already enabled by platform
RussianNeuroMancer reported that the Intel 7265 wifi on a Dell Venue 11 Pro
7140 table stopped working after wakeup from suspend and bisected the
problem to 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't
have LTR"). David Ward reported the same problem on a Dell Latitude 7350.
After af8bb9f898 ("PCI/ACPI: Request LTR control from platform before
using it"), we don't enable LTR unless the platform has granted LTR control
to us. In addition, we don't notice if the platform had already enabled
LTR itself.
After 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have
LTR"), we avoid using LTR if we don't think the path to the device has LTR
enabled.
The combination means that if the platform itself enables LTR but declines
to give the OS control over LTR, we unnecessarily avoided using ASPM L1.2.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469
Fixes: 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR")
Fixes: af8bb9f898 ("PCI/ACPI: Request LTR control from platform before using it")
Reported-by: RussianNeuroMancer <russianneuromancer@ya.ru>
Reported-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.18+
As per Figure 6-3 in PCIe r4.0, sec 6.2.6, ERR_ messages will be forwarded
from the secondary interface to the primary interface, if the SERR# Enable
bit in the Bridge Control register is set.
It seems clear that an ACPI hotplug parameter method (_HPP or _HPX) that
tells us to "enable SERR in the command register" (ACPI v6.2, sec 6.2.8,
6.2.9.1) refers to PCI_COMMAND_SERR, which enables reporting of errors by
the function itself.
For bridges, we also interpreted that to mean we should enable
PCI_BRIDGE_CTL_SERR, which enables *forwarding* of errors by the bridge.
But we didn't enable PCI_BRIDGE_CTL_SERR anywhere else, which means we
never enabled it for non-ACPI systems or ACPI systems that didn't supply
hotplug parameters.
That means errors reported below bridges were often never forwarded up to a
Root Port where they could be signaled via AER.
Enable PCI_BRIDGE_CTL_SERR for all bridges so we can get better error
reporting for downstream devices.
Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI configuration space header type tells us whether the device is a
bridge, a CardBus bridge, or a normal device, and defines the layout of the
rest of the header (PCI r3.0 sec 6.1, PCIe r4.0 sec 7.5.1.1.9).
When we rely on the header format, e.g., when we're dealing with bridge
windows, we should check the header type, not the class code. The class
code is loosely related to the header type, but is often incorrect and the
spec doesn't actually require it to be related to the header format.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[bhelgaas: changelog, keep the PCI_CLASS_BRIDGE_HOST check]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, the pci_size() function actually returns 'size-1'. Make it
return real size to avoid confusion.
Signed-off-by: Du Changbin <changbin.du@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bridge_check_ranges() determines whether a bridge supports the optional
I/O and prefetchable memory windows and sets the flag bits in the bridge
resources. This *could* be done once during enumeration except that the
resource allocation code completely clears the flag bits, e.g., in the
pci_assign_unassigned_bridge_resources() path.
The problem with pci_bridge_check_ranges() in the resource allocation path
is that we may allocate resources after devices have been claimed by
drivers, and pci_bridge_check_ranges() *changes* the window registers to
determine whether they're writable. This may break concurrent accesses to
devices behind the bridge.
Add a new pci_read_bridge_windows() to determine whether a bridge supports
the optional windows, call it once during enumeration, remember the
results, and change pci_bridge_check_ranges() so it doesn't touch the
bridge windows but sets the flag bits based on those remembered results.
Link: https://lore.kernel.org/linux-pci/1506151482-113560-1-git-send-email-wangzhou1@hisilicon.com
Link: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
Reported-by: Yandong Xu <xuyandong2@huawei.com>
Tested-by: Yandong Xu <xuyandong2@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Ofer Hayut <ofer@lightbitslabs.com>
Cc: Roy Shterman <roys@lightbitslabs.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
A malicious PCI device may use DMA to attack the system. An external
Thunderbolt port is a convenient point to attach such a device. The OS
may use IOMMU to defend against DMA attacks.
Some BIOSes mark these externally facing root ports with this
ACPI _DSD [1]:
Name (_DSD, Package () {
ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"),
Package () {
Package () {"ExternalFacingPort", 1},
Package () {"UID", 0 }
}
})
If we find such a root port, mark it and all its children as untrusted.
The rest of the OS may use this information to enable DMA protection
against malicious devices. For instance the device may be put behind an
IOMMU to keep it from accessing memory outside of what the driver has
allocated for it.
While at it, add a comment on top of prp_guids array explaining the
possible caveat resulting when these GUIDs are treated equivalent.
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
- Cache VF config space size to optimize enumeration of many VFs
(KarimAllah Ahmed)
- Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)
* pci/virtualization:
PCI/IOV: Remove unnecessary include of <linux/pci-ats.h>
PCI/IOV: Use VF0 cached config space size for other VFs
Cache the config space size from VF0 and use it for all other VFs instead
of reading it from the config space of each VF. We assume that it will be
the same across all associated VFs.
This is an optimization when enabling SR-IOV on a device with many VFs.
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
[bhelgaas: use CONFIG_PCI_IOV (not CONFIG_PCI_ATS)]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The few callers can just use dma_set_max_seg_size ()directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The two callers can just use dma_set_seg_boundary() directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for
an active link so pciehp and dpc drivers don't need to duplicate this.
For devices that don't support data link layer active reporting, wait the
fixed time recommended by the PCIe spec.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides
that it returns pointer of bitmap type ("unsigned long *") instead of the
opaque "void *".
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Set the eetlp_prefix_path on PCIE_EXP_TYPE_RC_END devices to allow PASID
to be enabled on them. This fixes IOMMUv2 initialization on AMD Carrizo
APUs.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201079
Fixes: 7ce3f912ae ("PCI: Enable PASID only if entire path supports End-End TLP prefixes")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- To avoid bus errors, enable PASID only if entire path supports End-End
TLP prefixes (Sinan Kaya)
- Unify slot and bus reset functions and remove hotplug knowledge from
callers (Sinan Kaya)
- Add Function-Level Reset quirks for Intel and Samsung NVMe devices to
fix guest reboot issues (Alex Williamson)
- Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD Controller
(Bjorn Helgaas)
* pci/virtualization:
PCI: Add function 1 DMA alias quirk for Marvell 88SS9183
PCI: Delay after FLR of Intel DC P3700 NVMe
PCI: Disable Samsung SM961/PM961 NVMe before FLR
PCI: Export pcie_has_flr()
PCI: Rename pci_try_reset_bus() to pci_reset_bus()
PCI: Deprecate pci_reset_bus() and pci_reset_slot() functions
PCI: Unify try slot and bus reset API
PCI: Hide pci_reset_bridge_secondary_bus() from drivers
IB/hfi1: Use pci_try_reset_bus() for initiating PCI Secondary Bus Reset
PCI: Handle error return from pci_reset_bridge_secondary_bus()
PCI/IOV: Tidy pci_sriov_set_totalvfs()
PCI: Enable PASID only if entire path supports End-End TLP prefixes
# Conflicts:
# drivers/pci/hotplug/pciehp_hpc.c
- Clean up devm_of_pci_get_host_bridge_resources() resource allocation
(Jan Kiszka)
- Fixup resizable BARs after suspend/resume (Christian König)
- Make "pci=earlydump" generic (Sinan Kaya)
- Fix ROM BAR access routines to stay in bounds and check for signature
correctly (Rex Zhu)
* pci/resource:
PCI: Make pci_get_rom_size() static
PCI: Add check code for last image indicator not set
PCI: Avoid accessing memory outside the ROM BAR
PCI: Make early dump functionality generic
PCI: Cleanup PCI_REBAR_CTRL_BAR_SHIFT handling
PCI: Restore resized BAR state on resume
PCI: Clean up resource allocation in devm_of_pci_get_host_bridge_resources()
# Conflicts:
# Documentation/admin-guide/kernel-parameters.txt
- Work around IDT switch ACS Source Validation erratum (James
Puthukattukaran)
- Emit diagnostics for all cases of PCIe Link downtraining (Links
operating slower than they're capable of) (Alexandru Gagniuc)
- Skip VFs when configuring Max Payload Size (Myron Stowe)
- Reduce Root Port Max Payload Size if necessary when hot-adding a device
below it (Myron Stowe)
* pci/enumeration:
PCI: Match Root Port's MPS to endpoint's MPSS as necessary
PCI: Skip MPS logic for Virtual Functions (VFs)
PCI: Check for PCIe Link downtraining
PCI: Workaround IDT switch ACS Source Validation erratum
- Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy
Shevchenko)
- Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas)
* pci/aspm:
PCI: Remove unnecessary include of <linux/pci-aspm.h>
iwlwifi: Remove unnecessary include of <linux/pci-aspm.h>
ath9k: Remove unnecessary include of <linux/pci-aspm.h>
igb: Remove unnecessary include of <linux/pci-aspm.h>
PCI/ASPM: Convert to use sysfs_match_string() helper
- Decode AER errors with names similar to "lspci" (Tyler Baicar)
- Expose AER statistics in sysfs (Rajat Jain)
- Clear AER status bits selectively based on the type of recovery (Oza
Pawandeep)
- Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru
Gagniuc)
- Don't clear AER status bits if we're using the "Firmware-First"
strategy where firmware owns the registers (Alexandru Gagniuc)
* pci/aer:
PCI/AER: Don't clear AER bits if error handling is Firmware-First
PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definition
PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
PCI/AER: Clear device status bits during ERR_COR handling
PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
PCI/AER: Factor out ERR_NONFATAL status bit clearing
PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
PCI/AER: Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST
PCI/AER: Add sysfs attributes for rootport cumulative stats
PCI/AER: Add sysfs attributes to provide AER stats and breakdown
PCI/AER: Define aer_stats structure for AER capable devices
PCI/AER: Move internal declarations to drivers/pci/pci.h
PCI/AER: Adopt lspci names for AER error decoding
PCI/AER: Expose internal API for obtaining AER information
# Conflicts:
# drivers/pci/pci.h
In commit 27d868b5e6 ("PCI: Set MPS to match upstream bridge"), we made
sure every device's MPS setting matches its upstream bridge, making it more
likely that a hot-added device will work in a system with an optimized MPS
configuration.
Recently I've started encountering systems where the endpoint device's MPSS
capability is less than its Root Port's current MPS value, thus the
endpoint is not capable of matching its upstream bridge's MPS setting (see:
bugzilla via "Link:" below). This leaves the system vulnerable - the
upstream Root Port could respond with larger TLPs than the device can
handle, and the device will consider them to be 'Malformed'.
One could use the "pci=pcie_bus_safe" kernel parameter to work around the
issue, but that forces a user to supply a kernel parameter to get the
system to function reliably and may end up limiting MPS settings of other
unrelated, sub-topologies which could benefit from maintaining their larger
values.
Augment Keith's approach to include tuning down a Root Port's MPS setting
when its hot-added endpoint device is not capable of matching it.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jon Mason <jdmason@kudzu.us>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sinan Kaya <okaya@kernel.org>
Cc: Dongdong Liu <liudongdong3@huawei.com>
PCIe r4.0, sec 9.3.5.4, "Device Control Register", shows both
Max_Payload_Size (MPS) and Max_Read_request_Size (MRRS) to be 'RsvdP' for
VFs. Just prior to the table it states:
"PF and VF functionality is defined in Section 7.5.3.4 except where
noted in Table 9-16. For VF fields marked 'RsvdP', the PF setting
applies to the VF."
All of which implies that with respect to Max_Payload_Size Supported
(MPSS), MPS, and MRRS values, we should not be paying any attention to the
VF's fields, but rather only to the PF's. Only looking at the PF's fields
also logically makes sense as it's the sole physical interface to the PCIe
bus.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
Fixes: 27d868b5e6 ("PCI: Set MPS to match upstream bridge")
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # 4.3+
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sinan Kaya <okaya@kernel.org>
Cc: Dongdong Liu <liudongdong3@huawei.com>
Cc: Jon Mason <jdmason@kudzu.us>
When both ends of a PCIe Link are capable of a higher bandwidth than is
currently in use, the Link is said to be "downtrained". A downtrained Link
may indicate hardware or configuration problems in the system, but it's
hard to identify such Links from userspace.
Refactor pcie_print_link_status() so it continues to always print PCIe
bandwidth information, as several NIC drivers desire.
Add a new internal __pcie_print_link_status() to emit a message only when a
device's bandwidth is constrained by the fabric and call it from the PCI
core for all devices, which identifies all downtrained Links. It also
emits messages for a few cases that are technically not downtrained, such
as a x4 device in an open-ended x1 slot.
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
[bhelgaas: changelog, move __pcie_print_link_status() declaration to
drivers/pci/, rename pcie_check_upstream_link() to
pcie_report_downtraining()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Several PCI core files include pci-aspm.h even though they don't need
anything provided by that file. Remove the unnecessary includes of it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
When a PCI device is detected, pdev->is_added is set to 1 and proc and
sysfs entries are created.
When the device is removed, pdev->is_added is checked for one and then
device is detached with clearing of proc and sys entries and at end,
pdev->is_added is set to 0.
is_added and is_busmaster are bit fields in pci_dev structure sharing same
memory location.
A strange issue was observed with multiple removal and rescan of a PCIe
NVMe device using sysfs commands where is_added flag was observed as zero
instead of one while removing device and proc,sys entries are not cleared.
This causes issue in later device addition with warning message
"proc_dir_entry" already registered.
Debugging revealed a race condition between the PCI core setting the
is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue
setting the is_busmaster bit in pci_set_master(). As these fields are not
handled atomically, that clears the is_added bit.
Move the is_added bit to a separate private flag variable and use atomic
functions to set and retrieve the device addition state. This avoids the
race because is_added no longer shares a memory location with is_busmaster.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283
Signed-off-by: Hari Vyas <hari.vyas@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Define a structure to hold the AER statistics. There are 2 groups of
statistics: dev_* counters that are to be collected for all AER capable
devices and rootport_* counters that are collected for all (AER capable)
rootports only. Allocate and free this structure when device is added or
released (thus counters survive the lifetime of the device).
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some IDT switches incorrectly flag an ACS Source Validation error on
completions for config read requests even though PCIe r4.0, sec 6.12.1.1,
says that completions are never affected by ACS Source Validation. Here's
the text of IDT 89H32H8G3-YC, erratum #36:
Item #36 - Downstream port applies ACS Source Validation to Completions
Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that
completions are never affected by ACS Source Validation. However,
completions received by a downstream port of the PCIe switch from a
device that has not yet captured a PCIe bus number are incorrectly
dropped by ACS Source Validation by the switch downstream port.
Workaround: Issue a CfgWr1 to the downstream device before issuing the
first CfgRd1 to the device. This allows the downstream device to capture
its bus number; ACS Source Validation no longer stops completions from
being forwarded by the downstream port. It has been observed that
Microsoft Windows implements this workaround already; however, some
versions of Linux and other operating systems may not.
When doing the first config read to probe for a device, if the device is
behind an IDT switch with this erratum:
1. Disable ACS Source Validation if enabled
2. Wait for device to become ready to accept config accesses (by using
the Config Request Retry Status mechanism)
3. Do a config write to the endpoint
4. Enable ACS Source Validation (if it was enabled to begin with)
The workaround suggested by IDT is basically only step 3, but we don't know
when the device is ready to accept config requests. That means we need to
do config reads until we receive a non-Config Request Retry Status, which
means we need to disable ACS SV temporarily.
Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
[bhelgaas: changelog, clean up whitespace, fold in unused variable fix
from Anders Roxell <anders.roxell@linaro.org>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
A PCIe endpoint carries the process address space identifier (PASID) in
the TLP prefix as part of the memory read/write transaction. The address
information in the TLP is relevant only for a given PASID context.
An IOMMU takes PASID value and the address information from the
TLP to look up the physical address in the system.
PASID is an End-End TLP Prefix (PCIe r4.0, sec 6.20). Sec 2.2.10.2 says
It is an error to receive a TLP with an End-End TLP Prefix by a
Receiver that does not support End-End TLP Prefixes. A TLP in
violation of this rule is handled as a Malformed TLP. This is a
reported error associated with the Receiving Port (see Section 6.2).
Prevent error condition by proactively requiring End-End TLP prefix to be
supported on the entire data path between the endpoint and the root port
before enabling PASID.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move early dump functionality into common code so that it is available for
all architectures. No need to carry arch-specific reads around as the read
hooks are already initialized by the time pci_setup_device() is getting
called during scan.
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
- fix use-before-set error in ibmphp (Dan Carpenter)
- fix pciehp timeouts caused by Command Completed errata (Bjorn Helgaas)
- fix refcounting in pnv_php hotplug (Julia Lawall)
- clear pciehp Presence Detect and Data Link Layer Status Changed on
resume so we don't miss hotplug events (Mika Westerberg)
- only request pciehp control if we support it, so platform can use ACPI
hotplug otherwise (Mika Westerberg)
- convert SHPC to be builtin only (Mika Westerberg)
- request SHPC control via _OSC if we support it (Mika Westerberg)
- simplify SHPC handoff from firmware (Mika Westerberg)
* pci/hotplug:
PCI: Improve "partially hidden behind bridge" log message
PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
PCI: Move resource distribution for single bridge outside loop
PCI: Account for all bridges on bus when distributing bus numbers
ACPI / hotplug / PCI: Drop unnecessary parentheses
ACPI / hotplug / PCI: Mark stale PCI devices disconnected
ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
PCI: hotplug: Add hotplug_is_native()
PCI: shpchp: Add shpchp_is_native()
PCI: shpchp: Fix AMD POGO identification
PCI: shpchp: Use dev_printk() for OSHP-related messages
PCI: shpchp: Remove get_hp_hw_control_from_firmware() wrapper
PCI: shpchp: Remove acpi_get_hp_hw_control_from_firmware() flags
PCI: shpchp: Rely on previous _OSC results
PCI: shpchp: Request SHPC control via _OSC when adding host bridge
PCI: shpchp: Convert SHPC to be builtin only
PCI: pciehp: Make pciehp_is_native() stricter
PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplug
PCI: pciehp: Request control of native hotplug only if supported
PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
PCI: pnv_php: Add missing of_node_put()
PCI: pciehp: Add quirk for Command Completed errata
PCI: Add Qualcomm vendor ID
PCI: ibmphp: Fix use-before-set in get_max_bus_speed()
# Conflicts:
# drivers/acpi/pci_root.c
- neaten pci=earlydump output (Andy Shevchenko)
- avoid errors when extended config space inaccessible (Gilles Buloz)
- prevent sysfs disable of device while driver attached (Christoph
Hellwig)
- use core interface to report PCIe link properties in bnx2x, bnxt_en,
cxgb4, ixgbe (Bjorn Helgaas)
- remove unused pcie_get_minimum_link() (Bjorn Helgaas)
* pci/enumeration:
PCI: Remove unused pcie_get_minimum_link()
ixgbe: Report PCIe link properties with pcie_print_link_status()
cxgb4: Report PCIe link properties with pcie_print_link_status()
bnxt_en: Report PCIe link properties with pcie_print_link_status()
bnx2x: Report PCIe link properties with pcie_print_link_status()
PCI: Prevent sysfs disable of device while driver is attached
PCI: Check whether bridges allow access to extended config space
x86/PCI: Make pci=earlydump output neat
pci_scan_child_bus_extend() complains when we assign an unreachable
secondary bus number to a bridge. For example, given the topology below:
+-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
+-01.0-[04-39]--
\-02.0-[3a]----00.0
it logs the following messages:
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
These messages are incorrect (0000:02 is a bus, not a bridge) and
confusing. Make the message more understandable:
pci 0000:02:02.0: devices behind bridge are unusable because [bus 3a] cannot be assigned for them
Also, remove the reference to CardBus, because this issue affects all
varieties of PCI, not just CardBus.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It is not immediately clear what the two functions actually return so
add kernel-doc comment explaining it a bit better.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
When distributing extra bus number space to hotplug bridges for future
extension, we don't account for the fact that there might be non-hotplug
bridges on the bus after the hotplug bridges. For example:
01:00.0 --+- 02:00.0 (HotPlug-) -- Thunderbolt host controller
+- 02:01.0 (HotPlug+)
\- 02:02.0 (HotPlug-) -- xHCI host controller
pci_scan_child_bus_extend() is supposed to distribute the remaining bus
numbers to the hotplug bridge at 02:01.0, but only after accounting for all
bridges on bus 02. Since we don't check whether there's another
non-hotplug bridge after the hotplug bridge 02:01.0, it may not leave space
for the non-hotplug bridge:
pci 0000:00:1b.0: PCI bridge to [bus 01-39] (Root Port)
pci 0000:01:00.0: PCI bridge to [bus 02-39]
...
pci 0000:02:00.0: PCI bridge to [bus 03]
pci 0000:02:01.0: PCI bridge to [bus 04]
pci_bus 0000:04: [bus 04-39] extended by 0x35
pci_bus 0000:04: bus scan returning with max=39
pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
pci_bus 0000:3a: scanning bus
pci_bus 0000:3a: bus scan returning with max=3a
pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
pci_bus 0000:02: bus scan returning with max=3a
pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
The resulting 'lspci -t' output looks like this:
+-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
^^ +-01.0-[04-39]--
\-02.0-[3a]----00.0
^^
The xHCI host controller behind 02:02.0 is not usable because it would have
to be assigned bus 3a, which is not accessible through 00:1b.0.
To fix this, reserve at least one bus for each bridge while scanning
already configured bridges. Then use this information in the second
scan to correct the available extra bus space for hotplug bridges.
After this change the 'lspci -t' output is what is expected:
+-1b.0-[01-39]----00.0-[02-39]--+-00.0-[03]----00.0
+-01.0-[04-38]--
\-02.0-[39]----00.0
The xHCI controller is now on bus 39, where it is usable.
Fixes: 1c02ea8100 ("PCI: Distribute available buses to hotplug-capable bridges")
Reported-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@vger.kernel.org
The SHPC driver now must be builtin (it cannot be a module). If it is
present, request SHPC control immediately when adding the ACPI host bridge.
This is similar to how we handle native PCIe hotplug via pciehp.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rename host->native_hotplug to host->native_pcie_hotplug to make room for a
similar flag for SHPC hotplug.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix a memory leak by freeing the PCI resource list in
devm_pci_release_host_bridge_dev().
Fixes: 5c3f18cce0 ("PCI: Add devm_pci_alloc_host_bridge() interface")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Even if a device supports extended config space, i.e., it is a PCI-X Mode 2
or a PCI Express device, the extended space may not be accessible if
there's a conventional PCI bus in the path to it.
We currently figure that out in pci_cfg_space_size() by reading the first
dword of extended config space. On most platforms that returns ~0 data if
the space is inaccessible, but it may set error bits in PCI status
registers, and on some platforms it causes exceptions that we currently
don't recover from.
For example, a PCIe-to-conventional PCI bridge treats config transactions
with a non-zero Extended Register Address as an Unsupported Request on PCIe
and a received Master-Abort on the destination bus (see PCI Express to
PCI/PCI-X Bridge spec, r1.0, sec 4.1.3).
A sample case is a LS1043A CPU (NXP QorIQ Layerscape) platform with the
following bus topology:
LS1043 PCIe Root Port
-> PEX8112 PCIe-to-PCI bridge (doesn't support ext cfg on PCI side)
-> PMC slot connector (for legacy PMC modules)
With a PMC module topology as follows:
PMC connector
-> PCI-to-PCIe bridge
-> PCIe switch (4 ports)
-> 4 PCIe devices (one on each port)
The PCIe devices on the PMC module support extended config space, but we
can't reach it because the PEX8112 can't generate accesses to the extended
space on its secondary bus. Attempts to access it cause Unsupported
Request errors, which result in synchronous aborts on this platform.
To avoid these errors, check whether bridges are capable of generating
extended config space addresses on their secondary interfaces. If they
can't, we restrict devices below the bridge to only the 256-byte
PCI-compatible config space.
Signed-off-by: Gilles Buloz <gilles.buloz@kontron.com>
[bhelgaas: changelog, rework patch so bus_flags testing is all in
pci_bridge_child_ext_cfg_accessible()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per the PCI Firmware spec r3.2, sec 4.5, an ACPI-based OS should use _OSC
to request control of Latency Tolerance Reporting (LTR) before using it.
Request control of LTR, and if the platform does not grant control, don't
use it.
N.B. If the hardware supports LTR and the ASPM L1.2 substate but the BIOS
doesn't support LTR in _OSC, we previously would enable ASPM L1.2. This
patch will prevent us from enabling ASPM L1.2 in that case. It does not
prevent us from enabling PCI-PM L1.2, since that doesn't depend on LTR.
See PCIe r40, sec 5.5.1, for the L1 PM substate entry conditions.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlrHeY8UHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxhLRAAndV/0NDyWZU0eZNM6twri2SEFnF7
E4ar+YthxDxxJG4TLJbIA12jc5NgHZy4WuttDa6Jb99KreBXIHJFlNi/V/tme6zf
+yXUuxWae7wJzBiaay57VqLGSc80gt/LTgjLa1siwQqjTbO3wSXR6JJXNaE9FtQ4
/jL61t8bD1Peb5cWTpt9p0hrnKI0/pHwASdReyFS4F/HDKdvpof7BxE/OU3HSxxA
XKC2v6RjY4S93vkzvApDXQ+vhKquVRK7/ojyTXQUO/GIzcARprO7H4k62N4ar0x/
qbXLkR8IMkwA8ecsNmcL92ftb/cXoHfd+wdK8WpijqzF4kW4SdteVWbIhUzI0gbr
0gjDYIzjplvH3pZGv/qvx+8sFtAP95OdPjuAAW2qJ9TCVfmiS8naNFCvcxg87RhD
gjyQD3If1X7F8wy309lhq7VNyRexTHgIMgTXHyFvuZMzn/Qe1huL2XCwDcEAg/OX
AvU2iuSE5tWAh7gIUMF/aWi3uoeJUyyoru5ZR//gqdFfx9YxpSimO1UDXnpPi8SR
Iz/jzHJc0aWGYdQ9l6HiSbJF3P/QQcWYs9igt0A7BRGB05SPdWCh7sSO70FJa8ME
f4WID5/qEiaH26kiSRX4cUqpc8Amk8bT0DXw2OT57qy3JM0ZdV5ENQX11pSpr9hv
uLEf0DU7AEmdvzQ=
=T++R
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- move pci_uevent_ers() out of pci.h (Michael Ellerman)
- skip ASPM common clock warning if BIOS already configured it (Sinan
Kaya)
- fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)
- remove last user of pci_get_bus_and_slot() and the function itself
(Sinan Kaya)
- add decoding for 16 GT/s link speed (Jay Fang)
- add interfaces to get max link speed and width (Tal Gilboa)
- add pcie_bandwidth_capable() to compute max supported link bandwidth
(Tal Gilboa)
- add pcie_bandwidth_available() to compute bandwidth available to
device (Tal Gilboa)
- add pcie_print_link_status() to log link speed and whether it's
limited (Tal Gilboa)
- use PCI core interfaces to report when device performance may be
limited by its slot instead of doing it in each driver (Tal Gilboa)
- fix possible cpqphp NULL pointer dereference (Shawn Lin)
- rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
hotplug (Mika Westerberg)
- add support for PCI I/O port space that's neither directly accessible
via CPU in/out instructions nor directly mapped into CPU physical
memory space. This is fairly intrusive and includes minor changes to
interfaces used for I/O space on most platforms (Zhichang Yuan, John
Garry)
- add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
John Garry)
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
- don't set up INTx if MSI or MSI-X is enabled to align cris, frv,
ia64, and mn10300 with x86 (Bjorn Helgaas)
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
- use generic pci_mmap_resource_range() instead of powerpc and xtensa
arch-specific versions (David Woodhouse)
- support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)
- remove System and Video ROM reservations on sparc (Bjorn Helgaas)
- probe for device reset support during enumeration instead of runtime
(Bjorn Helgaas)
- add ACS quirk for Ampere (née APM) root ports (Feng Kan)
- add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
Vincent-Cross)
- protect device restore with device lock (Sinan Kaya)
- handle failure of FLR gracefully (Sinan Kaya)
- handle CRS (config retry status) after device resets (Sinan Kaya)
- skip various config reads for SR-IOV VFs as an optimization
(KarimAllah Ahmed)
- consolidate VPD code in vpd.c (Bjorn Helgaas)
- add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)
- add DT support for R-Car r8a7743 (Biju Das)
- fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host
bridge driver that causes a general protection fault (Dexuan Cui)
- fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV
(Dexuan Cui)
- fix Hyper-V host bridge hang when ejecting a VF before setting up MSI
(Dexuan Cui)
- make several structures static (Fengguang Wu)
- increase number of MSI IRQs supported by Synopsys DesignWare bridges
from 32 to 256 (Gustavo Pimentel)
- implemented multiplexed IRQ domain API and remove obsolete MSI IRQ
API from DesignWare drivers (Gustavo Pimentel)
- add Tegra power management support (Manikanta Maddireddy)
- add Tegra loadable module support (Manikanta Maddireddy)
- handle 64-bit BARs correctly in endpoint support (Niklas Cassel)
- support optional regulator for HiSilicon STB (Shawn Guo)
- use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla)
- support power supplies for Qualcomm msm8996 (Srinivas Kandagatla)
* tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits)
MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
HISI LPC: Add ACPI support
ACPI / scan: Do not enumerate Indirect IO host children
ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
of: Add missing I/O range exception for indirect-IO devices
PCI: Apply the new generic I/O management on PCI IO hosts
PCI: Add fwnode handler as input param of pci_register_io_range()
PCI: Remove __weak tag from pci_register_io_range()
MAINTAINERS: Add missing /drivers/pci/cadence directory entry
fm10k: Report PCIe link properties with pcie_print_link_status()
net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
net/mlx5: Report PCIe link properties with pcie_print_link_status()
net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
misc: pci_endpoint_test: Handle 64-bit BARs properly
PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly
PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing
PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar
...
- probe for device reset support during enumeration instead of runtime
(Bjorn Helgaas)
- add ACS quirk for Ampere (née APM) root ports (Feng Kan)
- add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
Vincent-Cross)
- protect device restore with device lock (Sinan Kaya)
- handle failure of FLR gracefully (Sinan Kaya)
- handle CRS (config retry status) after device resets (Sinan Kaya)
- skip various config reads for SR-IOV VFs as an optimization (KarimAllah
Ahmed)
* pci/virtualization:
PCI/IOV: Add missing prototypes for powerpc pcibios interfaces
PCI/IOV: Use VF0 cached config registers for other VFs
PCI/IOV: Skip BAR sizing for VFs
PCI/IOV: Skip INTx config reads for VFs
PCI: Wait for device to become ready after secondary bus reset
PCI: Add a return type for pci_reset_bridge_secondary_bus()
PCI: Wait for device to become ready after a power management reset
PCI: Rename pci_flr_wait() to pci_dev_wait() and make it generic
PCI: Handle FLR failure and allow other reset types
PCI: Protect restore with device lock to be consistent
PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
PCI: Add ACS quirk for Ampere root ports
PCI: Remove redundant probes for device reset support
PCI: Probe for device reset support during enumeration
Conflicts:
include/linux/pci.h
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
* pci/portdrv:
PCI/DPC: Rename from pcie-dpc.c to dpc.c
PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS
PCI/AER: Use cached AER Capability offset
PCI/portdrv: Rename and reverse sense of pcie_ports_auto
PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h>
PCI/portdrv: Simplify PCIe feature permission checking
PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
PCI/portdrv: Remove pcie_port_bus_type link order dependency
PCI/portdrv: Disable port driver in compat mode
PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
PCI/PM: Move pcie_clear_root_pme_status() to core
PCI/portdrv: Merge pcieport_if.h into portdrv.h
PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/
Conflicts:
drivers/pci/pcie/Makefile
drivers/pci/pcie/portdrv.h
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
* pci/misc:
PCI: Always define the of_node helpers
PCI: Tidy comments
PCI: Tidy Makefiles
mcb: Add Altera PCI ID to mcb-pci
PCI: Add Altera vendor ID
PCI: Report quirks that take more than 10ms
PCI: Report quirk timings with pci_info() instead of pr_debug()
PCI: Fix NULL pointer dereference in of_pci_bus_find_domain_nr()
rapidio/tsi721: use PCI_EXP_DEVCTL2_COMP_TIMEOUT macro
Cache some config data from VF0 and use it for all other VFs instead of
reading it from the config space of each VF. We assume these items are the
same across all associated VFs:
Revision ID
Class Code
Subsystem Vendor ID
Subsystem ID
This is an optimization when enabling SR-IOV on a device with many VFs.
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
[bhelgaas: changelog, simplify comments, remove unused "device", test
CONFIG_PCI_IOV instead of CONFIG_PCI_ATS, rename functions]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the
platform firmware or the OS, so the host bridge driver may have to request
permission from the platform before using them. On ACPI systems, this is
done by negotiate_os_control() in acpi_pci_root_add().
The PCIe port driver later uses pcie_port_platform_notify() and
pcie_port_acpi_setup() to figure out whether it can use these features.
But all we need is a single bit for each service, so these interfaces are
needlessly complicated.
Simplify this by adding bits in the struct pci_host_bridge to show when the
OS has permission to use each feature:
+ unsigned int native_aer:1; /* OS may use PCIe AER */
+ unsigned int native_hotplug:1; /* OS may use PCIe hotplug */
+ unsigned int native_pme:1; /* OS may use PCIe PME */
These are set when we create a host bridge, and the host bridge driver can
clear the bits corresponding to any feature the platform doesn't want us to
use.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PCIe 4.0 defines the 16.0 GT/s link speed. Links can run at that speed
without any Linux changes, but previously their sysfs "max_link_speed" and
"current_link_speed" files contained "Unknown speed", not the expected
"16.0 GT/s".
Add decoding for the new 16 GT/s link speed.
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
[bhelgaas: add PCI_EXP_LNKCAP2_SLS_16_0GB]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
Per PCIe r4.0, sec 9.3.4.1.11, the BAR registers in VF config space are all
RO Zero, so skip sizing them.
This is an optimization when enabling SR-IOV on a device with many VFs.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Remove pointless comments that tell us the file name, remove blank line
comments, follow multi-line comment conventions. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to have a
function 0. Therefore, Linux scans for devices at function 0 (devfn
0/8/16/...) and only scans for other functions if function 0 has its
Multi-Function Device bit set or ARI or SR-IOV indicate there are more
functions.
The Jailhouse hypervisor may pass individual functions of a multi-function
device to a guest without passing function 0, which means a Linux guest
won't find them.
Change Linux PCI probing so it scans all function numbers when running as a
guest over Jailhouse.
This is technically prohibited by the spec, so it is possible that PCI
devices without the Multi-Function Device bit set may have unexpected
behavior in response to this probe.
Originally-by: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: jailhouse-dev@googlegroups.com
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Link: https://lkml.kernel.org/r/06e279b2a3e06cf6689ab3975f8ab592bba02362.1520408357.git.jan.kiszka@siemens.com
Per PCIe r4.0, sec 9.2.1.4, VFs can not implement INTX, and their Interrupt
Line and Interrupt Pin registers must be RO Zero. Some devices have
thousands of VFs, so skip reading the registers as an optimization.
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
[bhelgaas: changelog, comment]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Previously we called pci_probe_reset_function() in this path:
pci_sysfs_init # late_initcall
for_each_pci_dev(dev)
pci_create_sysfs_dev_files(dev)
pci_create_capabilities_sysfs(dev)
pci_probe_reset_function
pci_dev_specific_reset
pcie_has_flr
pcie_capability_read_dword
pci_sysfs_init() is a late_initcall, and a driver may have already claimed
one of these devices and enabled runtime power management for it, so the
device could already be in D3 by the time we get to pci_sysfs_init().
The device itself should respond to the config read even while it's in
D3hot, but if an upstream bridge is also in D3hot, the read won't even
reach the device because the bridge won't forward it downstream to the
device. If the bridge is a PCIe port, it should complete the read as an
Unsupported Request, which may be reported to the CPU as an exception or as
invalid data.
Avoid this case by probing for reset support from pci_init_capabilities(),
before a driver can claim the device. The device may be in D3hot, but any
bridges leading to it should be in D0, so the device's config space should
be fully accessible at that point.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/spdx:
PCI: Add SPDX GPL-2.0+ to replace implicit GPL v2 or later statement
PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
PCI: Add SPDX GPL-2.0 to replace COPYING boilerplate
PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate
PCI: Add SPDX GPL-2.0 when no license was specified
* lorenzo/pci/cadence:
PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller
PCI: endpoint: Fix EPF device name to support multi-function devices
PCI: endpoint: Add the function number as argument to EPC ops
PCI: cadence: Add host driver for Cadence PCIe controller
dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller
PCI: Add vendor ID for Cadence
PCI: Add generic function to probe PCI host controllers
PCI: generic: fix missing call of pci_free_resource_list()
PCI: OF: Add generic function to parse and allocate PCI resources
PCI: Regroup all PCI related entries into drivers/pci/Makefile
Conflicts:
drivers/pci/of.c
include/linux/pci.h
* pci/misc:
PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build
PCI: Add wrappers for dev_printk()
PCI: Remove unnecessary messages for memory allocation failures
PCI: Add #defines for Completion Timeout Disable feature
hinic: Replace PCI pool old API
net: e100: Replace PCI pool old API
block: DAC960: Replace PCI pool old API
MAINTAINERS: Include more PCI files
PCI: Remove unneeded kallsyms include
powerpc/pci: Unroll two pass loop when scanning bridges
powerpc/pci: Use for_each_pci_bridge() helper
* pci/enumeration:
RDMA/qedr: Use pci_enable_atomic_ops_to_root()
PCI: Add pci_enable_atomic_ops_to_root()
PCI: Make PCI_SCAN_ALL_PCIE_DEVS work for Root as well as Downstream Ports
This patchs moves generic source code from
drivers/pci/host/pci-host-common.c into drivers/pci/probe.c.
Indeed the extracted lines of code were duplicated by many host
controller drivers. Regrouping them into a generic function gives a
change to properly share this code without introducing a useless
dependency to PCI_HOST_COMMON, which selects PCI_ECAM when not needed by
most host controller drivers.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@free-electrons.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.
Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add PCI-specific dev_printk() wrappers and use them to simplify the code
slightly. No functional change intended.
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
[bhelgaas: squash into one patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe Downstream Ports normally have only a Device 0 below them. To
optimize enumeration, we don't scan for other devices *unless* the
PCI_SCAN_ALL_PCIE_DEVS flag is set by set by quirks or the
"pci=pcie_scan_all" kernel parameter.
Previously PCI_SCAN_ALL_PCIE_DEVS only affected scanning below Switch
Downstream Ports, not Root Ports.
But the "Nemo" system, also known as the AmigaOne X1000, has a PA Semi Root
Port whose link leads to an AMD/ATI SB600 South Bridge. The Root Port is a
PCIe device, of course, but the SB600 contains only conventional PCI
devices with no visible PCIe port.
Simplify and restructure only_one_child() so that we scan for all possible
devices below Root Ports as well as Switch Downstream Ports when
PCI_SCAN_ALL_PCIE_DEVS is set.
This is enough to make Nemo work with "pci=pcie_scan_all". We would also
like to add a quirk to set PCI_SCAN_ALL_PCIE_DEVS automatically on Nemo so
users wouldn't have to use the "pci=pcie_scan_all" parameter, but we don't
have that yet.
Link: https://lkml.kernel.org/r/CAErSpo55Q8Q=5p6_+uu7ahnw+53ibVDNRXxrzRV9QnUr_9EUfw@mail.gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198057
Reported-and-Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enable Latency Tolerance Reporting (LTR). Note that LTR must be enabled in
the Root Port first, and must not be enabled in any downstream device
unless the Root Port and all intermediate Switches also support LTR.
See PCIe r3.1, sec 6.18.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
System BIOS sometimes allocates extra bus space for hotplug-capable PCIe
root/downstream ports. This space is needed if the device plugged to the
port will have more hotplug-capable downstream ports. A good example of
this is Thunderbolt. Each Thunderbolt device contains a PCIe switch and
one or more hotplug-capable PCIe downstream ports where the daisy chain
can be extended.
Currently Linux only allocates minimal bus space to make sure all the
enumerated devices barely fit there. The BIOS reserved extra space is
not taken into consideration at all. Because of this we run out of bus
space pretty quickly when more PCIe devices are attached to hotplug
downstream ports in order to extend the chain.
Modify the PCI core so we distribute the available BIOS allocated bus space
equally between hotplug-capable bridges to make sure there is enough bus
space for extending the hierarchy later on.
Update kernel docs of the affected functions.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
One can ask more buses to be reserved for hotplug bridges by passing
pci=hpbussize=N in the kernel command line. If the parent bus does not
have enough bus space available we incorrectly create child bus with the
requested number of subordinate buses.
In the example below hpbussize is set to one more than we have available
buses in the root port:
pci 0000:07:00.0: [8086:1578] type 01 class 0x060400
pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 0
pci 0000:07:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 1
pci_bus 0000:08: busn_res: can not insert [bus 08-ff] under [bus 07-3f] (conflicts with (null) [bus 07-3f])
pci_bus 0000:08: scanning bus
...
pci_bus 0000:0a: bus scan returning with max=40
pci_bus 0000:0a: busn_res: [bus 0a-ff] end is updated to 40
pci_bus 0000:0a: [bus 0a-40] partially hidden behind bridge 0000:07 [bus 07-3f]
pci_bus 0000:08: bus scan returning with max=40
pci_bus 0000:08: busn_res: [bus 08-ff] end is updated to 40
Instead of allowing this, limit the subordinate number to be less than or
equal the maximum subordinate number allocated for the parent bus (if it
has any).
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: remove irrelevant dmesg messages]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The current scanning code is really hard to understand because it calls
the same function in a loop where pass value is changed without any
comments explaining it:
for (pass = 0; pass < 2; pass++)
for_each_pci_bridge(dev, bus)
max = pci_scan_bridge(bus, dev, max, pass);
Unfamiliar reader cannot tell easily what is the purpose of this loop
without looking at internals of pci_scan_bridge().
In order to make this bit easier to understand, open-code the loop in
pci_scan_child_bus() and pci_hp_add_bridge() with added comments.
No functional changes intended.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There is not much point of having a file with a single function in it.
Instead we can just move pci_hp_add_bridge() to drivers/pci/probe.c and
make it available always when PCI core is enabled.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: convert printk to dev_err()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The following pattern is often used:
list_for_each_entry(dev, &bus->devices, bus_list) {
if (pci_is_bridge(dev)) {
...
}
}
Add a for_each_pci_bridge() helper to make that code easier to write and
read by reducing indentation level. It also saves one or few lines of code
in each occurrence.
Convert PCI core parts here at the same time.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[bhelgaas: fold in http://lkml.kernel.org/r/20171013165352.25550-1-andriy.shevchenko@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ
vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq
64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh
XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc
0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6
B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ
nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam
c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz
4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf
h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE
GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ
EG76ridTolUFVV+wzJD9
=9cLP
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add enhanced Downstream Port Containment support, which prints more
details about Root Port Programmed I/O errors (Dongdong Liu)
- add Layerscape ls1088a and ls2088a support (Hou Zhiqiang)
- add MediaTek MT2712 and MT7622 support (Ryder Lee)
- add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang)
- add Qualcom IPQ8074 support (Varadarajan Narayanan)
- add R-Car r8a7743/5 device tree support (Biju Das)
- add Rockchip per-lane PHY support for better power management (Shawn
Lin)
- fix IRQ mapping for hot-added devices by replacing the
pci_fixup_irqs() boot-time design with a host bridge hook called at
probe-time (Lorenzo Pieralisi, Matthew Minter)
- fix race when enabling two devices that results in upstream bridge
not being enabled correctly (Srinath Mannam)
- fix pciehp power fault infinite loop (Keith Busch)
- fix SHPC bridge MSI hotplug events by enabling bus mastering
(Aleksandr Bezzubikov)
- fix a VFIO issue by correcting PCIe capability sizes (Alex
Williamson)
- fix an INTD issue on Xilinx and possibly other drivers by unifying
INTx IRQ domain support (Paul Burton)
- avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg
Roedel)
- allow APM X-Gene device assignment to guests by adding an ACS quirk
(Feng Kan)
- fix driver crashes by disabling Extended Tags on Broadcom HT2100
(Extended Tags support is required for PCIe Receivers but not
Requesters, and we now enable them by default when Requesters support
them) (Sinan Kaya)
- fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs
use the real Requester ID (not a phantom RID) (Robin Murphy)
- prevent assignment of Intel VMD children to guests (which may be
supported eventually, but isn't yet) by not associating an IOMMU with
them (Jon Derrick)
- fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott
Bauer)
- fix a Function-Level Reset issue with Intel 750 NVMe by waiting
longer (up to 60sec instead of 1sec) for device to become ready
(Sinan Kaya)
- fix a Function-Level Reset issue on iProc Stingray by working around
hardware defects in the CRS implementation (Oza Pawandeep)
- fix an issue with Intel NVMe P3700 after an iProc reset by adding a
delay during shutdown (Oza Pawandeep)
- fix a Microsoft Hyper-V lockdep issue by polling instead of blocking
in compose_msi_msg() (Stephen Hemminger)
- fix a wireless LAN driver timeout by clearing DesignWare MSI
interrupt status after it is handled, not before (Faiz Abbas)
- fix DesignWare ATU enable checking (Jisheng Zhang)
- reduce Layerscape dependencies on the bootloader by doing more
initialization in the driver (Hou Zhiqiang)
- improve Intel VMD performance allowing allocation of more IRQ vectors
than present CPUs (Keith Busch)
- improve endpoint framework support for initial DMA mask, different
BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon
Vijay Abraham I, Stan Drozd)
- rework CRS support to add periodic messages while we poll during
enumeration and after Function-Level Reset and prepare for possible
other uses of CRS (Sinan Kaya)
- clean up Root Port AER handling by removing unnecessary code and
moving error handler methods to struct pcie_port_service_driver
(Christoph Hellwig)
- clean up error handling paths in various drivers (Bjorn Andersson,
Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen,
Lorenzo Pieralisi, Sergei Shtylyov)
- clean up SR-IOV resource handling by disabling VF decoding before
updating the corresponding resource structs (Gavin Shan)
- clean up DesignWare-based drivers by unifying quirks to update Class
Code and Interrupt Pin and related handling of write-protected
registers (Hou Zhiqiang)
- clean up by adding empty generic pcibios_align_resource() and
pcibios_fixup_bus() and removing empty arch-specific implementations
(Palmer Dabbelt)
- request exclusive reset control for several drivers to allow cleanup
elsewhere (Philipp Zabel)
- constify various structures (Arvind Yadav, Bhumika Goyal)
- convert from full_name() to %pOF (Rob Herring)
- remove unused variables from iProc, HiSi, Altera, Keystone (Shawn
Lin)
* tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits)
PCI: xgene: Clean up whitespace
PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset
PCI: xgene: Fix platform_get_irq() error handling
PCI: xilinx-nwl: Fix platform_get_irq() error handling
PCI: rockchip: Fix platform_get_irq() error handling
PCI: altera: Fix platform_get_irq() error handling
PCI: spear13xx: Fix platform_get_irq() error handling
PCI: artpec6: Fix platform_get_irq() error handling
PCI: armada8k: Fix platform_get_irq() error handling
PCI: dra7xx: Fix platform_get_irq() error handling
PCI: exynos: Fix platform_get_irq() error handling
PCI: iproc: Clean up whitespace
PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP
PCI: iproc: Add 500ms delay during device shutdown
PCI: Fix typos and whitespace errors
PCI: Remove unused "res" variable from pci_resource_io()
PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag()
PCI/AER: Reformat AER register definitions
iommu/vt-d: Prevent VMD child devices from being remapping targets
x86/PCI: Use is_vmd() rather than relying on the domain number
...
Add a print statement in pci_bus_wait_crs() so that user observes the
progress of device polling instead of silently waiting for timeout to be
reached.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: check for timeout first so we don't print "waiting, giving up",
always print time we've slept (not the actual timeout, print a "ready"
message if we've printed a "waiting" message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Configuration Request Retry Status (CRS) was previously hidden inside
pci_bus_read_dev_vendor_id(). We want to add support for CRS in other
situations, such as waiting for a device to become ready after a Function
Level Reset.
Move CRS handling into pci_bus_wait_crs() so it can be called from other
places.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: pass pointer, not value, to pci_bus_wait_crs() so caller gets
correct Vendor ID]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add pci_bus_crs_vendor_id() to determine whether data returned for a config
read of the Vendor ID indicates a Configuration Request Retry Status (CRS)
response.
Per PCIe r3.1, sec 2.3.2, this data is only returned if:
- CRS Software Visibility is enabled,
- a config read includes both bytes of the Vendor ID, and
- the read receives a CRS completion
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: changelog, change name to pci_bus_crs_vendor_id(), make static
in probe.c, use it in pci_bus_read_dev_vendor_id()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
While waiting for a device to become ready (i.e., to return a non-CRS
completion to a read of its Vendor ID), if we got a valid response to the
very last read before timing out, we printed a warning and gave up on the
device even though it was actually ready.
For a typical 60s timeout, we wait about 65s (it's not exact because of the
exponential backoff), but we treated devices that became ready between 33s
and 65s as though they failed.
Move the Device ID read later so we check whether the device is ready
before checking for a timeout.
Thanks to Sinan Kaya <okaya@codeaurora.org>, reorder reads so we always
check device presence after sleep, since it's pointless to sleep unless we
recheck afterwards.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When bit4 is set in the PCIe Device Control register, it indicates
whether the device is permitted to use relaxed ordering.
On some platforms using relaxed ordering can have performance issues or
due to erratum can cause data-corruption. In such cases devices must avoid
using relaxed ordering.
The patch adds a new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING to indicate that
Relaxed Ordering (RO) attribute should not be used for Transaction Layer
Packets (TLP) targeted towards these affected root complexes.
This patch checks if there is any node in the hierarchy that indicates that
using relaxed ordering is not safe. In such cases the patch turns off the
relaxed ordering by clearing the capability for this device.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Multiple architectures define this as an empty function, and I'm adding
another one as part of the RISC-V port. Add a __weak version of
pcibios_fixup_bus() and delete the now-obselete ones in a handful of
ports.
The only functional change should be that microblaze used to export
pcibios_fixup_bus(). None of the other architectures exports this, so I
just dropped it.
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r3.1, sec 2.2.6.2 and 7.8.4, a Requester may not use 8-bit Tags
unless its Extended Tag Field Enable is set, but all Receivers/Completers
must handle 8-bit Tags correctly regardless of their Extended Tag Field
Enable.
Some devices do not handle 8-bit Tags as Completers, so add a quirk for
them. If we find such a device, we disable Extended Tags for the entire
hierarchy to make peer-to-peer DMA possible.
The Broadcom HT2100 seems to have issues with handling 8-bit tags. Mark it
as broken.
The pci_walk_bus() in the quirk handles devices we've enumerated in the
past, and pci_configure_device() handles devices we enumerate in the
future.
Fixes: 60db3a4d8c ("PCI: Enable PCIe Extended Tags if supported")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1467674
Reported-and-tested-by: Wim ten Have <wim.ten.have@oracle.com>
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: changelog, tweak messages, rename bit and quirk]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/irq-fixups:
arm64: PCI: Drop DT IRQ allocation from pcibios_alloc_irq()
PCI: xilinx-nwl: Move to struct pci_host_bridge IRQ mapping functions
PCI: rockchip: Move to struct pci_host_bridge IRQ mapping functions
PCI: xgene: Move to struct pci_host_bridge IRQ mapping functions
PCI: altera: Drop pci_fixup_irqs()
PCI: versatile: Drop pci_fixup_irqs()
PCI: generic: Drop pci_fixup_irqs()
PCI: faraday: Drop pci_fixup_irqs()
PCI: designware: Drop pci_fixup_irqs()
PCI: iproc: Drop pci_fixup_irqs()
PCI: rcar: Drop pci_fixup_irqs()
PCI: xilinx: Drop pci_fixup_irqs()
PCI: tegra: Drop pci_fixup_irqs()
ARM/PCI: Remove pci_fixup_irqs() call for bios32 host controllers
PCI: Add a call to pci_assign_irq() in pci_device_probe()
OF/PCI: Update of_irq_parse_and_map_pci() comment
PCI: Add pci_assign_irq() function and have pci_fixup_irqs() use it
PCI: Add IRQ mapping function pointers to pci_host_bridge struct
PCI: Build setup-irq.o on all arches
PCI: Remove pci_scan_root_bus_msi()
PCI: xilinx-nwl: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: rockchip: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: generic: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: xgene: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: xilinx: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: altera: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: versatile: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: iproc: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: rcar: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: aardvark: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: designware: Convert PCI scan API to pci_scan_root_bus_bridge()
ARM/PCI: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: Make pci_register_host_bridge() PCI core internal
PCI: Add pci_scan_root_bus_bridge() interface
PCI: tegra: Fix host bridge memory leakage
PCI: faraday: Fix host bridge memory leakage
PCI: Add devm_pci_alloc_host_bridge() interface
PCI: Add pci_free_host_bridge() interface
PCI: Initialize bridge release function at bridge allocation
PCI: faraday: Convert IRQ masking to raw PCI config accessors
PCI: iproc: Convert link check to raw PCI config accessors
PCI: xilinx-nwl: Remove nwl_pcie_enable_msi() unused bus parameter
The pci_scan_root_bus_bridge() function allows passing a parameterized
struct pci_host_bridge and scanning the resulting PCI bus; since the struct
msi_controller is part of the struct pci_host_bridge and the struct
pci_host_bridge can now be passed to pci_scan_root_bus_bridge() explicitly,
there is no need for a scan interface with a MSI controller parameter.
With all PCI host controller drivers and platform code relying on
pci_scan_root_bus_msi() converted over to pci_scan_root_bus_bridge() the
pci_scan_root_bus_msi() becomes obsolete and can be removed.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
John reported that an Intel QuickAssist crypto accelerator didn't work in a
Dell PowerEdge R730. The problem seems to be that we enabled ECRC when the
device doesn't support it:
85:00.0 Co-processor [0b40]: Intel Corporation DH895XCC Series QAT [8086:0435]
Capabilities: [100 v1] Advanced Error Reporting
AERCap: First Error Pointer: 00, GenCap- CGenEn+ ChkCap- ChkEn+
1302fcf0d0 ("PCI: Configure *all* devices, not just hot-added ones")
exposed the problem because it applies settings from the _HPX method to all
devices, not just hot-added ones. The R730 supplies an _HPX method that
allows the kernel to enable ECRC.
Only enable ECRC if the device advertises support for it.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1571798
Fixes: 1302fcf0d0 ("PCI: Configure *all* devices, not just hot-added ones")
Reported-by: John Mazzie <john_mazzie@dell.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
With the introduction of pci_scan_root_bus_bridge() there is no need to
export pci_register_host_bridge() to other kernel subsystems other than the
PCI compilation unit that needs it.
Make pci_register_host_bridge() static to its compilation unit and convert
the existing drivers usage over to pci_scan_root_bus_bridge().
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
The current pci_scan_root_bus() interface is made up of two main code
paths:
- pci_create_root_bus()
- pci_scan_child_bus()
pci_create_root_bus() is a wrapper function that allows to create a struct
pci_host_bridge structure, initialize it with the passed parameters and
register it with the kernel.
As the struct pci_host_bridge require additional struct members,
pci_create_root_bus() parameters list has grown in time, making it unwieldy
to add further parameters to it in case the struct pci_host_bridge gains
more members fields to augment its functionality.
Since PCI core code provides functions to allocate struct pci_host_bridge,
instead of forcing the pci_create_root_bus() interface to add new
parameters to cater for new struct pci_host_bridge functionality, it is
more suitable to add an interface in PCI core code to scan a PCI bus
straight from a struct pci_host_bridge created and customized by each
specific PCI host controller driver.
Add a pci_scan_root_bus_bridge() function to allow PCI host controller
drivers to create and initialize struct pci_host_bridge and scan the
resulting bus.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Struct pci_host_bridge can be allocated by PCI host bridge drivers which
usually allocate and map memory through devm managed interfaces.
Add a devm version for the pci_alloc_host_bridge() interface to simplify
PCI host controller driver porting and simplify the driver failure paths.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Commit a52d1443bb ("PCI: Export host bridge registration interface")
exported the pci_alloc_host_bridge() interface so that PCI host controllers
drivers can make use of it.
Introduce pci_alloc_host_bridge() kernel counterpart to free the host
bridge data structures, pci_free_host_bridge(), export it and update kernel
functions releasing host bridge objects allocated memory to make use of it.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
The introduction of pci_register_host_bridge() kernel interface allows PCI
host controller drivers to create the struct pci_host_bridge object,
initialize it and register it with the kernel so that its corresponding PCI
bus can be scanned and its devices probed.
The host bridge device release function pci_release_host_bridge_dev() is a
static function common for all struct pci_host_bridge allocated objects, so
in its current form cannot be used by PCI host bridge controllers drivers
to initialize the allocated struct pci_host_bridge, which leaves struct
pci_host_bridge devices release function uninitialized.
Since pci_release_host_bridge_dev() is a function common to all PCI host
bridge objects, initialize it in pci_alloc_host_bridge() (ie common host
bridge allocation interface) so that all struct pci_host_bridge objects
have their release function initialized by default at allocation time,
removing the need for exporting the common pci_release_host_bridge_dev()
function to other compilation units.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
The test for INTx masking via PCI_COMMAND_INTX_DISABLE performed in
pci_intx_mask_supported() should be done before the device can be used.
This is to avoid writing PCI_COMMAND while the driver owns the device, in
case that has any effect on MSI/MSI-X interrupts.
Move the content of pci_intx_mask_supported() to pci_intx_mask_broken() and
call it from pci_setup_device().
The test result can be queried at any time later using the same
pci_intx_mask_supported() interface as before (though with changed
implementation), so callers (uio, vfio) should be unaffected.
Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org>
[bhelgaas: changelog, remove quirk check, remove locking, move
dev->broken_intx_masking assignment to caller]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
This includes:
* Some code optimizations for the Intel VT-d driver
* Code to switch off a previously enabled Intel IOMMU
* Support for 'struct iommu_device' for OMAP, Rockchip and
Mediatek IOMMUs
* Some header optimizations for IOMMU core code headers and a
few fixes that became necessary in other parts of the kernel
because of that
* ACPI/IORT updates and fixes
* Some Exynos IOMMU optimizations
* Code updates for the IOMMU dma-api code to bring it closer to
use per-cpu iova caches
* New command-line option to set default domain type allocated
by the iommu core code
* Another command line option to allow the Intel IOMMU switched
off in a tboot environment
* ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using
an IDENTITY domain in conjunction with DMA ops, Support for
SMR masking, Support for 16-bit ASIDs (was previously broken)
* Various other small fixes and improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZEY4XAAoJECvwRC2XARrjth0QAKV56zjnFclv39aDo6eCq9CT
51+XT4bPY5VKQ2+Jx76TBNObHmGK+8KEMHfT9khpWJtFCDyy25SGckLry1nYqmZs
tSTsbj4sOeCyKzOLITlRN9/OzKXkjKAxYuq+sQZZFDFYf3kCM/eag0dGAU6aVLNp
tkIal3CSpGjCQ9M5JohrtQ1mwiGqCIkMIgvnBjRw+bfpLnQNG+VL6VU2G3RAkV2b
5Vbdoy+P7ZQnJSZr/bibYL2BaQs2diR4gOppT5YbsfniMq4QYSjheu1xBboGX8b7
sx8yuPi4370irSan0BDvlvdQdjBKIRiDjfGEKDhRwPhtvN6JREGakhEOC8MySQ37
mP96B72Lmd+a7DEl5udOL7tQILA0DcUCX0aOyF714khnZuFU5tVlCotb/36xeJ+T
FPc3RbEVQ90m8dYU6MNJ+ahtb/ZapxGTRfisIigB6wlnZa0Evabp9EJSce6oJMkm
whbBhDubeEU18n9XAaofMbu+P2LAzq8cxiRMlsDvT4mIy7jO86jjCmhpu1Tfn2GY
4wrEQZdWOMvhUsIhObXA0aC3BzC506uvnKPW3qy041RaxBuelWiBi29qzYbhxzkr
DLDpWbUZNYPyFJjttpavyQb2/XRduBTJdVP1pQpkJNDsW5jLiBkpSqm9xNADapRY
vLSYRX0JCIquaD+PAuxn
=3aE8
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- code optimizations for the Intel VT-d driver
- ability to switch off a previously enabled Intel IOMMU
- support for 'struct iommu_device' for OMAP, Rockchip and Mediatek
IOMMUs
- header optimizations for IOMMU core code headers and a few fixes that
became necessary in other parts of the kernel because of that
- ACPI/IORT updates and fixes
- Exynos IOMMU optimizations
- updates for the IOMMU dma-api code to bring it closer to use per-cpu
iova caches
- new command-line option to set default domain type allocated by the
iommu core code
- another command line option to allow the Intel IOMMU switched off in
a tboot environment
- ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an
IDENTITY domain in conjunction with DMA ops, Support for SMR masking,
Support for 16-bit ASIDs (was previously broken)
- various other small fixes and improvements
* tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits)
soc/qbman: Move dma-mapping.h include to qman_priv.h
soc/qbman: Fix implicit header dependency now causing build fails
iommu: Remove trace-events include from iommu.h
iommu: Remove pci.h include from trace/events/iommu.h
arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
ACPI/IORT: Fix CONFIG_IOMMU_API dependency
iommu/vt-d: Don't print the failure message when booting non-kdump kernel
iommu: Move report_iommu_fault() to iommu.c
iommu: Include device.h in iommu.h
x86, iommu/vt-d: Add an option to disable Intel IOMMU force on
iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed
iommu/arm-smmu: Correct sid to mask
iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code
omap3isp: Remove iommu_group related code
iommu/omap: Add iommu-group support
iommu/omap: Make use of 'struct iommu_device'
iommu/omap: Store iommu_dev pointer in arch_data
iommu/omap: Move data structures to omap-iommu.h
iommu/omap: Drop legacy-style device support
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZEHmsAAoJEFmIoMA60/r88SgQAJbFddueb0+DfJ+USDud4b/Z
akfS+G1UAm+TgtMyh1wM49dHzFssp36uWJxtWI+bPqBzuy94PMCbz7JVUV28gX9G
tFhFuc5YH94I/3y85rbZnolb6uZN9MhLjzTFqDC9ilW6HFqmwK4t4wlHSCjQN1St
svLYvs2G6n6/VK3Fre7/wOvdZ1erG4Qod+kn5Tx3K5TQydmRlaSBfK+DRANuDBkM
KzGO7Bkc/Cx8hb9pHmaey/wxmNrrgmVjTtWrEnb2tEq833zP4h6GhUIJEKodMSi5
gXPNZgKlu3n5L592M0UCh4EoHejzkv9wrcsoDm+djmsc5Zg2Howq4kAdHP8k4hUG
0gt8n0ni9vhJN56jikrGi7cAdHCKSNnx2Ue/qTCbX0ncB3XUMuJxJwCsgW/6wa9f
oU7tRtTS03UltnKoFAcyYclS4TaSY4SA4ySaK6Hi+cRkdVFDdyHQYbHHNSU7MsA+
IS2tXvGoIdSYyrZMHSRcl2rRTfYQUkmPEvBF3LvqZr32M4mJMmUNAPLZaly373ZE
iwq0ZJlrLeM0cqdFIG3S60RtJyQk/HBN1NMqrYHArWOxvWIgNd5F8NCsTTxY3wU3
IxgBIuUFcbVwVkqEHGs8K5AvB3oghqdnA3eGOV79799eMtLn3LOvyIlpHMSw9WUq
ags00JtMLitfNPBH3eSl
=eE4D
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add framework for supporting PCIe devices in Endpoint mode (Kishon
Vijay Abraham I)
- use non-postable PCI config space mappings when possible (Lorenzo
Pieralisi)
- clean up and unify mmap of PCI BARs (David Woodhouse)
- export and unify Function Level Reset support (Christoph Hellwig)
- avoid FLR for Intel 82579 NICs (Sasha Neftin)
- add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)
- short-circuit config access failures for disconnected devices (Keith
Busch)
- remove D3 sleep delay when possible (Adrian Hunter)
- freeze PME scan before suspending devices (Lukas Wunner)
- stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)
- disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)
- add arch-specific alignment control to improve device passthrough by
avoiding multiple BARs in a page (Yongji Xie)
- add sysfs sriov_drivers_autoprobe to control VF driver binding
(Bodong Wang)
- allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)
- fix crashes when unbinding host controllers that don't support
removal (Brian Norris)
- add driver for MicroSemi Switchtec management interface (Logan
Gunthorpe)
- add driver for Faraday Technology FTPCI100 host bridge (Linus
Walleij)
- add i.MX7D support (Andrey Smirnov)
- use generic MSI support for Aardvark (Thomas Petazzoni)
- make Rockchip driver modular (Brian Norris)
- advertise 128-byte Read Completion Boundary support for Rockchip
(Shawn Lin)
- advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)
- convert atomic_t to refcount_t in HV driver (Elena Reshetova)
- add CPU IRQ affinity in HV driver (K. Y. Srinivasan)
- fix PCI bus removal in HV driver (Long Li)
- add support for ThunderX2 DMA alias topology (Jayachandran C)
- add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)
- add ITE 8893 bridge DMA alias quirk (Jarod Wilson)
- restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
(Manish Jaggi)
* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
PCI: Don't allow unbinding host controllers that aren't prepared
ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
MAINTAINERS: Add PCI Endpoint maintainer
Documentation: PCI: Add userguide for PCI endpoint test function
tools: PCI: Add sample test script to invoke pcitest
tools: PCI: Add a userspace tool to test PCI endpoint
Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
misc: Add host side PCI driver for PCI test function device
PCI: Add device IDs for DRA74x and DRA72x
dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
PCI: dwc: dra7xx: Workaround for errata id i870
dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
PCI: dwc: dra7xx: Add EP mode support
PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
dt-bindings: PCI: Add DT bindings for PCI designware EP mode
PCI: dwc: designware: Add EP mode support
Documentation: PCI: Add binding documentation for pci-test endpoint function
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: imx6: Fix spelling mistake: "contol" -> "control"
...
* pci/resource:
PCI: Don't resize resources when realigning all devices in system
PCI: Don't reassign resources that are already aligned
PCI: Factor pci_reassigndev_resource_alignment()
powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
PCI: Add pcibios_default_alignment() for arch-specific alignment control
PCI: Fix calculation of bridge window's size and alignment
PCI: Ignore requested alignment for IOV BARs
PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant
Configuring DMA ops at probe time will allow deferring device probe when
the IOMMU isn't available yet. The dma_configure for the device is
now called from the generic device_attach callback just before the
bus/driver probe is called. This way, configuring the DMA ops for the
device would be called at the same place for all bus_types, hence the
deferred probing mechanism should work for all buses as well.
pci_bus_add_devices (platform/amba)(_device_create/driver_register)
| |
pci_bus_add_device (device_add/driver_register)
| |
device_attach device_initial_probe
| |
__device_attach_driver __device_attach_driver
|
driver_probe_device
|
really_probe
|
dma_configure
Similarly on the device/driver_unregister path __device_release_driver is
called which inturn calls dma_deconfigure.
This patch changes the dma ops configuration to probe time for
both OF and ACPI based platform/amba/pci bus devices.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci part)
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
A 64-bit value is not needed since a PCI ROM address consists in 32 bits.
This fixes a clang warning about "implicit conversion from 'unsigned long'
to 'u32'".
Also remove now unnecessary casts to u32 from __pci_read_base() and
pci_std_update_resource().
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Local variables 'l' and 'sz' are uninitialized. Normally, they would
be initialized by pci_read_config_dword() but when an error occurs,
some drivers immediately return an error code, which leaves the
argument uninitialized.
Provide a safe initial value to make the code more robust.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Detect on probe whether a PCI device is part of a Thunderbolt controller.
Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234
on such devices. Detect presence of this VSEC and cache it in a newly
added is_thunderbolt bit in struct pci_dev.
Also, add a helper to check whether a given PCI device is situated on a
Thunderbolt daisy chain (i.e., below a PCI device with is_thunderbolt
set).
The necessity arises from the following:
* If an external Thunderbolt GPU is connected to a dual GPU laptop,
that GPU is currently registered with vga_switcheroo even though it
can neither drive the laptop's panel nor be powered off by the
platform. To vga_switcheroo it will appear as if two discrete
GPUs are present. As a result, when the external GPU is runtime
suspended, vga_switcheroo will cut power to the internal discrete GPU
which may not be runtime suspended at all at this moment. The
solution is to not register external GPUs with vga_switcheroo, which
necessitates a way to recognize if they're on a Thunderbolt daisy
chain.
* Dual GPU MacBook Pros introduced 2011+ can no longer switch external
DisplayPort ports between GPUs. (They're no longer just used for DP
but have become combined DP/Thunderbolt ports.) The driver to switch
the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence
of a Thunderbolt controller and, if found, keep external ports
permanently switched to the discrete GPU.
v2: Make kerneldoc for pci_is_thunderbolt_attached() more precise,
drop portion of commit message pertaining to separate series.
(Bjorn Helgaas)
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Amir Levy <amir.jer.levy@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: http://patchwork.freedesktop.org/patch/msgid/0ab165a4a35c0b60f29d4c306c653ead14fcd8f9.1489145162.git.lukas@wunner.de
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYrvYlAAoJEFmIoMA60/r8FuQQAMDpia3kacyCAJpa+zjmyMNF
1slytaoIvP37dFq9XF1em031lwGNr5sahZ7nP1EKgALz4odZUzait7BUABcfviIn
Uesz2E1s/miMo4/0X1j9DqY9xV649DmmSIgk1yn3kvCkH/+Ix27dexu47auGzPEb
H/sEfd1RZidjZ5EWaG0ww5FrHcuge+JHtcH6vFQtWsTOspcx++IhaVIGjC0JCpqK
DnlQKilsJ38KUkvuDcxWjtFKxAc8De9jvCR4kX96OvbHahfAWwBO4AtUv7U3JpJN
2nyQk+I5kRagbfBucaXZISUtWM7h4peLiL+TGkvKg8eOVlOCedjYlrZW4SWkbAN+
0qwcHRQ8lwhNmgp3VYq7pmnugIvW4P2Fh3uqaplCAIwlpODxWPDQP7HLM2kyzmvq
gPGi0R4Yo2PdIXqfbilrzbFVeyqkIFECr287a6+5PekC0DxsqZvOG0uA1mWKLIaH
pRQMT0FO2SCCSOpcxRExeIj+XxhXlDVOrIBP6eMiFXAMgzUAyU8fLSZVMtXAvsTS
02hVDOc/Fq2jKlCSoJRIiRp5aj1QDFS/DjBhOnW7pXuvUTCrfYBXY5NCdT9UV3Q7
W6qHWkizRmRDGxUzqSODRt5aU7VOKbWvZnp10eJyKt5s2Iawe6We5V1NX+u18UIS
Scc1nbuPTL6u1n8PsaBG
=4Owc
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- add ASPM L1 substate support
- enable PCIe Extended Tags when supported
- configure PCIe MPS settings on iProc, Versatile, X-Gene, and Xilinx
- increase VPD access timeout
- add ACS quirks for Intel Union Point, Qualcomm QDF2400 and QDF2432
- use new pci_irq_alloc_vectors() in more drivers
- fix MSI affinity memory leak
- remove unused MSI interfaces and update documentation
- remove unused AER .link_reset() callback
- avoid pci_lock / p->pi_lock deadlock seen with perf
- serialize sysfs enable/disable num_vfs operations
- move DesignWare IP from drivers/pci/host/ to drivers/pci/dwc/ and
refactor so we can support both hosts and endpoints
- add DT ECAM-like support for HiSilicon Hip06/Hip07 controllers
- add Rockchip system power management support
- add Thunder-X cn81xx and cn83xx support
- add Exynos 5440 PCIe PHY support
* tag 'pci-v4.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (93 commits)
PCI: dwc: Remove dependency of designware on CONFIG_PCI
PCI: dwc: Add CONFIG_PCIE_DW_HOST to enable PCI dwc host
PCI: dwc: Split pcie-designware.c into host and core files
PCI: dwc: designware: Fix style errors in pcie-designware.c
PCI: dwc: designware: Parse "num-lanes" property in dw_pcie_setup_rc()
PCI: dwc: all: Split struct pcie_port into host-only and core structures
PCI: dwc: designware: Get device pointer at the start of dw_pcie_host_init()
PCI: dwc: all: Rename cfg_read/cfg_write to read/write
PCI: dwc: all: Use platform_set_drvdata() to save private data
PCI: dwc: designware: Move register defines to designware header file
PCI: dwc: Use PTR_ERR_OR_ZERO to simplify code
PCI: dra7xx: Group PHY API invocations
PCI: dra7xx: Enable MSI and legacy interrupts simultaneously
PCI: dra7xx: Add support to force RC to work in GEN1 mode
PCI: dra7xx: Simplify probe code with devm_gpiod_get_optional()
PCI: Move DesignWare IP support to new drivers/pci/dwc/ directory
PCI: exynos: Support the PHY generic framework
Documentation: binding: Modify the exynos5440 PCIe binding
phy: phy-exynos-pcie: Add support for Exynos PCIe PHY
Documentation: samsung-phy: Add exynos-pcie-phy binding
...
Every PCIe device can generate 5-bit transaction Tags, which allow up to 32
concurrent requests. Some devices can generate 8-bit Extended Tags, which
allow up to 256 concurrent requests.
Per the ECN mentioned below, all PCIe Receivers are expected to support
Extended Tags, so devices are allowed (but not required) to enable them by
default.
If a device supports Extended Tags but does not enable them by default,
enable them. This allows the device to have up to 256 outstanding
transactions at a time, which may improve performance.
[bhelgaas: changelog, check for PCIe device]
Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Extended_Tag_Enable_Default_05Sept2008_final.pdf
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary
interface and a PCI Express secondary interface. The PCIe interface is a
Downstream Port that originates a Link. See the "PCI Express to PCI/PCI-X
Bridge Specification", rev 1.0, sections 1.2 and A.6.
The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below
the bridge:
00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a]
01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a]
02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a]
03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a]
01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by
lspci). As we traverse a PCIe hierarchy, device connections alternate
between PCIe Links and internal Switch logic. Previously we did not
recognize that 01:00.0 had a secondary link, so we thought the 02:00.0
Upstream Port *did* have a secondary link. In fact, it's the other way
around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic
on its secondary side.
When we thought 02:00.0 had a secondary link, the pci_scan_slot() ->
only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0
was the only possible downstream device. But 03:00.0 doesn't exist, so we
didn't look for any other devices on bus 03.
Booting with "pci=pcie_scan_all" is a workaround, but we don't want users
to have to do that.
Recognize that PCI-to-PCIe bridges originate links on their secondary
interfaces.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361
Fixes: d0751b98df ("PCI: Add dev->has_secondary_link to track downstream PCIe links")
Tested-by: Blake Moore <blake.moore@men.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.2+
Previously we didn't check the type of device before trying to apply Type 1
(PCI-X) or Type 2 (PCIe) Setting Records from _HPX.
We don't support PCI-X Setting Records, so this was harmless, but the
warning was useless.
We do support PCIe Setting Records, and we didn't check whether a device
was PCIe before applying settings. I don't think anything bad happened on
non-PCIe devices because pcie_capability_clear_and_set_word(),
pcie_cap_has_lnkctl(), etc., would fail before doing any harm. But it's
ugly to depend on those internals.
Check the device type before attempting to apply Type 1 and Type 2 Setting
Records (Type 0 records are applicable to PCI, PCI-X, and PCIe devices).
A side benefit is that this prevents useless "not supported" warnings when
a BIOS supplies a Type 1 (PCI-X) Setting Record and we try to apply it to
every single device:
pci 0000:00:00.0: PCI-X settings not supported
After this patch, we'll get the warning only when a BIOS supplies a Type 1
record and we have a PCI-X device to which it should be applied.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYUt1vAAoJEFmIoMA60/r8abgP/3R+5Lsk5/kfAHk5/2Mtqbvg
mZ0eDUpY9GbUeMjSq84Nr2H8u7d+1AJCCu8KtDJYZCmjZpnSp2SuE2PS5JoGC7zC
fintD24jlIF4/J5+HeVXXmbfr3xATxvpTuiSLEi8sLBRJ3KRIswhMSwoPwOyeTQw
v/EclWKPGYcI5Zp0oigY9/Jd3q3lQ17KXppi/0dDoLh7PNOFvEHItXWzmf++u/NP
iYT9R1xmzEsy0/HRd6hiwPT2xA8YsAXxgobhHooUgh1FWmZ02Tg1WjgDemOW4lVh
kNIUcsLczh7wZCceogrrJ+pwb9+NyyIyKuHPv6OG3ieyz1IZdznaj1fAE5HJYiPo
eVS7cP1S6DyV3Y5qFj5F2dSRS7T4GXdXG5mNhmeCpUHs0vfzSCG36jLmhTy8UIxs
1rCf5oFa+uU9q0okfH8VtcGOXqWjGgyxTSGGfF71HUMLnPbsci2fxC2cO6svzIX7
wDY0uxOzpyMIYMuQR6iz7VqvAwEaZ+7pfMIrWWdDcQ9/5tCNJ49cLuKaThPL4bVu
juiGBQtnTLg8tjrhjDL9tQiJpuVIweVXyyQ1fvZoVXkMLlhVCF2ttirvwFUit2PB
84OlevQZ+9QdE/qalrWbv4qzhesuiwu0avkzjGoqg6tWTF0epu2AHI2vqy6UBYEG
tcfJPEcz1019PKZNSvWy
=ut0k
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"PCI changes:
- add support for PCI on ARM64 boxes with ACPI. We already had this
for theoretical spec-compliant hardware; now we're adding quirks
for the actual hardware (Cavium, HiSilicon, Qualcomm, X-Gene)
- add runtime PM support for hotplug ports
- enable runtime suspend for Intel UHCI that uses platform-specific
wakeup signaling
- add yet another host bridge registration interface. We hope this is
extensible enough to subsume the others
- expose device revision in sysfs for DRM
- to avoid device conflicts, make sure any VF BAR updates are done
before enabling the VF
- avoid unnecessary link retrains for ASPM
- allow INTx masking on Mellanox devices that support it
- allow access to non-standard VPD for Chelsio devices
- update Broadcom iProc support for PAXB v2, PAXC v2, inbound DMA,
etc
- update Rockchip support for max-link-speed
- add NVIDIA Tegra210 support
- add Layerscape LS1046a support
- update R-Car compatibility strings
- add Qualcomm MSM8996 support
- remove some uninformative bootup messages"
* tag 'pci-v4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (115 commits)
PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
PCI: Expand "VPD access disabled" quirk message
PCI: pciehp: Remove loading message
PCI: hotplug: Remove hotplug core message
PCI: Remove service driver load/unload messages
PCI/AER: Log AER IRQ when claiming Root Port
PCI/AER: Log errors with PCI device, not PCIe service device
PCI/AER: Remove unused version macros
PCI/PME: Log PME IRQ when claiming Root Port
PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
PCI: Move config space size macros to pci_regs.h
x86/platform/intel-mid: Constify mid_pci_platform_pm
PCI/ASPM: Don't retrain link if ASPM not possible
PCI: iproc: Skip check for legacy IRQ on PAXC buses
PCI: pciehp: Leave power indicator on when enabling already-enabled slot
PCI: pciehp: Prioritize data-link event over presence detect
PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
PCI: rcar: Use gen2 fallback compatibility last
PCI: rcar-gen2: Use gen2 fallback compatibility last
PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
..
Allow PCI host bridge drivers to use the new host bridge interfaces to
register their host bridge.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Provide a way to allocate driver-specific data along with a PCI host bridge
structure. The bridge's ->private field points to this data.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Make the existing pci_host_bridge structure a proper device that is usable
by PCI host drivers in a more standard way. In addition to the existing
pci_scan_bus(), pci_scan_root_bus(), pci_scan_root_bus_msi(), and
pci_create_root_bus() interfaces, this unfortunately means having to add
yet another interface doing basically the same thing, and add some extra
code in the initial step.
However, this time it's more likely to be extensible enough that we won't
have to do another one again in the future, and we should be able to reduce
code much more as a result.
The main idea is to pull the allocation of 'struct pci_host_bridge' out of
the registration, and let individual host drivers and architecture code
fill the members before calling the registration function.
There are a number of things we can do based on this:
* Use a single memory allocation for the driver-specific structure
and the generic PCI host bridge
* consolidate the contents of driver-specific structures by moving
them into pci_host_bridge
* Add a consistent interface for removing a PCI host bridge again
when unloading a host driver module
* Replace the architecture specific __weak pcibios_*() functions with
callbacks in a pci_host_bridge device
* Move common boilerplate code from host drivers into the generic
function, based on contents of the structure
* Extend pci_host_bridge with additional members when needed without
having to add arguments to pci_scan_*().
* Move members of struct pci_bus into pci_host_bridge to avoid
having lots of identical copies.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
On DT based systems, the of_dma_configure() API implements DMA
configuration for a given device. On ACPI systems an API equivalent to
of_dma_configure() is missing which implies that it is currently not
possible to set-up DMA operations for devices through the ACPI generic
kernel layer.
This patch fills the gap by introducing acpi_dma_configure/deconfigure()
calls that for now are just wrappers around arch_setup_dma_ops() and
arch_teardown_dma_ops() and also updates ACPI and PCI core code to use
the newly introduced acpi_dma_configure/acpi_dma_deconfigure functions.
Since acpi_dma_configure() is used to configure DMA operations, the
function initializes the dma/coherent_dma masks to sane default values
if the current masks are uninitialized (also to keep the default values
consistent with DT systems) to make sure the device has a complete
default DMA set-up.
The DMA range size passed to arch_setup_dma_ops() is sized according
to the device coherent_dma_mask (starting at address 0x0), mirroring the
DT probing path behaviour when a dma-ranges property is not provided
for the device being probed; this changes the current arch_setup_dma_ops()
call parameters in the ACPI probing case, but since arch_setup_dma_ops()
is a NOP on all architectures but ARM/ARM64 this patch does not change
the current kernel behaviour on them.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> [pci]
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Tomasz Nowicki <tn@semihalf.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Tomasz Nowicki <tn@semihalf.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Per PCIe spec r3.0, sec 2.3.1.1, the Read Completion Boundary (RCB)
determines the naturally aligned address boundaries on which a Read Request
may be serviced with multiple Completions:
- For a Root Complex, RCB is 64 bytes or 128 bytes
This value is reported in the Link Control Register
Note: Bridges and Endpoints may implement a corresponding command bit
which may be set by system software to indicate the RCB value for the
Root Complex, allowing the Bridge/Endpoint to optimize its behavior
when the Root Complex’s RCB is 128 bytes.
- For all other system elements, RCB is 128 bytes
Per sec 7.8.7, if a Root Port only supports a 64-byte RCB, the RCB of all
downstream devices must be clear, indicating an RCB of 64 bytes. If the
Root Port supports a 128-byte RCB, we may optionally set the RCB of
downstream devices so they know they can generate larger Completions.
Some BIOSes supply an _HPX that tells us to set RCB, even though the Root
Port doesn't have RCB set, which may lead to Malformed TLP errors if the
Endpoint generates completions larger than the Root Port can handle.
The IBM x3850 X6 with BIOS version -[A8E120CUS-1.30]- 08/22/2016 supplies
such an _HPX and a Mellanox MT27500 ConnectX-3 device fails to initialize:
mlx4_core 0000:41:00.0: command 0xfff timed out (go bit not cleared)
mlx4_core 0000:41:00.0: device is going to be reset
mlx4_core 0000:41:00.0: Failed to obtain HW semaphore, aborting
mlx4_core 0000:41:00.0: Fail to reset HCA
------------[ cut here ]------------
kernel BUG at drivers/net/ethernet/mellanox/mlx4/catas.c:193!
After 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
and 7a1562d4f2 ("PCI: Apply _HPX Link Control settings to all devices
with a link"), we apply _HPX settings to *all* devices, not just those
hot-added after boot.
Before 7a1562d4f2, we didn't touch the Mellanox RCB, and the device
worked. After 7a1562d4f2, we set its RCB to 128, and it failed.
Set the RCB to 128 iff the Root Port supports a 128-byte RCB. Otherwise,
set RCB to 64 bytes. This effectively ignores what _HPX tells us about
RCB.
Note that this change only affects _HPX handling. If we have no _HPX, this
does nothing with RCB.
[bhelgaas: changelog, clear RCB if not set for Root Port]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Fixes: 7a1562d4f2 ("PCI: Apply _HPX Link Control settings to all devices with a link")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187781
Tested-by: Frank Danapfel <fdanapfe@redhat.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
CC: stable@vger.kernel.org # v3.18+
Save the position of the error reporting capability so it doesn't need to
be rediscovered during error handling.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Lukas Wunner <lukas@wunner.de>
Add Precision Time Measurement (PTM) support (see PCIe r3.1, sec 6.22).
Enable PTM on PTM Root devices and switch ports. This does not enable PTM
on endpoints.
There currently are no PTM-capable devices on the market, but it is
expected to be supported by the Intel Apollo Lake platform.
[bhelgaas: complete rework]
Signed-off-by: Jonathan Yong <jonathan.yong@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/aspm:
PCI/ASPM: Remove redundant check of pcie_set_clkpm
* pci/dpc:
PCI: Remove DPC tristate module option
PCI: Bind DPC to Root Ports as well as Downstream Ports
PCI: Fix whitespace in struct dpc_dev
PCI: Convert Downstream Port Containment driver to use devm_* functions
* pci/hotplug:
PCI: Allow additional bus numbers for hotplug bridges
* pci/misc:
PCI: Include <asm/dma.h> for isa_dma_bridge_buggy
PCI: Make bus_attr_resource_alignment static
MAINTAINERS: Add file patterns for PCI device tree bindings
PCI: Fix comment typo
* pci/msi:
PCI/MSI: irqchip: Fix PCI_MSI dependencies
* pci/pm:
PCI: pciehp: Ignore interrupts during D3cold
PCI: Document connection between pci_power_t and hardware PM capability
PCI: Add runtime PM support for PCIe ports
ACPI / hotplug / PCI: Runtime resume bridge before rescan
PCI: Power on bridges before scanning new devices
PCI: Put PCIe ports into D3 during suspend
PCI: Don't clear d3cold_allowed for PCIe ports
PCI / PM: Enforce type casting for pci_power_t
* pci/virtualization:
PCI: Add ACS quirk for Solarflare SFC9220
PCI: Add DMA alias quirk for Adaptec 3805
PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset
PCI: Add function 1 DMA alias quirk for Marvell 88SE9182
A user may hot add a switch requiring more than one bus to enumerate. This
previously required a system reboot if BIOS did not sufficiently pad the
bus resource, which they frequently don't do.
Add a kernel parameter so a user can specify the minimum number of bus
numbers to reserve for a hotplug bridge's subordinate buses so rebooting
won't be necessary.
The default is 1, which is equivalent to previous behavior.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When a PCI device is removed through sysfs interface, the upstream bridge
(PCIe port) can be runtime suspended if it was the last device on that bus.
Now, if the bridge is in D3 we cannot find devices below the bridge
anymore. For example following fails to find the removed device again:
# echo 1 > /sys/bus/pci/devices/0000:00:01.0/0000:01:00.0/remove
# echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan
Where 0000:00:01.0 is the bridge device.
In order to be able to rescan devices below the bridge add
pm_runtime_get_sync()/pm_runtime_put() calls to pci_scan_bridge(). This
should keep bridges powered on while their children devices are being
scanned.
Reported-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Instead of assigning bus->domain_nr inside pci_bus_assign_domain_nr(),
return the domain and let the caller do the assignment. Rename
pci_bus_assign_domain_nr() to pci_bus_find_domain_nr() to reflect this.
No functional change intended.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
* pci/hotplug:
PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
* pci/resource:
PCI: Disable all BAR sizing for devices with non-compliant BARs
x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs
PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
b84106b4e2 ("PCI: Disable IO/MEM decoding for devices with non-compliant
BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with
the PCI spec. But it didn't do anything for expansion ROM BARs, so we
still try to size them, resulting in warnings like this on Broadwell-EP:
pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref]
Move the non-compliant BAR check from __pci_read_base() up to
pci_read_bases() so it applies to the expansion ROM BAR as well as
to BARs 0-5.
Note that direct callers of __pci_read_base(), like sriov_init(), will now
bypass this check. We haven't had reports of devices with broken SR-IOV
BARs yet.
[bhelgaas: changelog]
Fixes: b84106b4e2 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andi Kleen <ak@linux.intel.com>
Solve IOMMU support issues with PCIe non-transparent bridges that use
Requester ID look-up tables (RID-LUT), e.g., the PEX8733.
The NTB connects devices in two independent PCI domains. Devices separated
by the NTB are not able to discover each other. A PCI packet being
forwared from one domain to another has to have its RID modified so it
appears on correct bus and completions are forwarded back to the original
domain through the NTB. The RID is translated using a preprogrammed table
(LUT) and the PCI packet propagates upstream away from the NTB. If the
destination system has IOMMU enabled, the packet will be discarded because
the new RID is unknown to the IOMMU. Adding a DMA alias for the new RID
allows IOMMU to properly recognize the packet.
Each device behind the NTB has a unique RID assigned in the RID-LUT. The
current DMA alias implementation supports only a single alias, so it's not
possible to support mutiple devices behind the NTB when IOMMU is enabled.
Enable all possible aliases on a given bus (256) that are stored in a
bitset. Alias devfn is directly translated to a bit number. The bitset is
not allocated for devices that have no need for DMA aliases.
More details can be found in the following article:
http://www.plxtech.com/files/pdf/technical/expresslane/RTC_Enabling%20MulitHostSystemDesigns.pdf
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
* pci/resource:
PCI: Simplify pci_create_attr() control flow
PCI: Don't leak memory if sysfs_create_bin_file() fails
PCI: Simplify sysfs ROM cleanup
PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
ia64/PCI: Use ioremap() instead of open-coded equivalent
ia64/PCI: Use temporary struct resource * to avoid repetition
PCI: Clean up pci_map_rom() whitespace
PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
PCI: Set ROM shadow location in arch code, not in PCI core
PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
PCI: Don't assign or reassign immutable resources
PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
PCI: Disable IO/MEM decoding for devices with non-compliant BARs
* pci/host-hv:
PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs
PCI: Look up IRQ domain by fwnode_handle
PCI: Add fwnode_handle to x86 pci_sysdata
* pci/aer:
PCI/AER: Log aer_inject error injections
PCI/AER: Log actual error causes in aer_inject
PCI/AER: Use dev_warn() in aer_inject
PCI/AER: Fix aer_inject error codes
* pci/enumeration:
PCI: Fix broken URL for Dell biosdevname
* pci/kconfig:
PCI: Cleanup pci/pcie/Kconfig whitespace
PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
PCI: Include pci/pcie/Kconfig directly from pci/Kconfig
* pci/misc:
PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
PCI: Add QEMU top-level IDs for (sub)vendor & device
unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h
PCI: Move pci_dma_* helpers to common code
frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration
* pci/virtualization:
PCI: Wait for up to 1000ms after FLR reset
PCI: Support SR-IOV on any function type
* pci/vpd:
PCI: Prevent VPD access for buggy devices
PCI: Sleep rather than busy-wait for VPD access completion
PCI: Fold struct pci_vpd_pci22 into struct pci_vpd
PCI: Rename VPD symbols to remove unnecessary "pci22"
PCI: Remove struct pci_vpd_ops.release function pointer
PCI: Move pci_vpd_release() from header file to pci/access.c
PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code
PCI: Determine actual VPD size on first access
PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy
PCI: Allow access to VPD attributes with size 0
PCI: Update VPD definitions
Add pci_ops.{add,remove}_bus() callbacks, which will be called on every
newly created bus and when a bus is being removed, respectively. This can
be used by drivers to implement driver-specific initialization and teardown
of the bus, in addition to the architecture-specifics implemented by the
pcibios_add_bus() and the pcibios_remove_bus() functions.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There's only one kind of VPD, so we don't need to qualify it as "the
version described by PCI spec rev 2.2."
Rename the following symbols to remove unnecessary "pci22":
PCI_VPD_PCI22_SIZE -> PCI_VPD_MAX_SIZE
pci_vpd_pci22_size() -> pci_vpd_size()
pci_vpd_pci22_wait() -> pci_vpd_wait()
pci_vpd_pci22_read() -> pci_vpd_read()
pci_vpd_pci22_write() -> pci_vpd_write()
pci_vpd_pci22_ops -> pci_vpd_ops
pci_vpd_pci22_init() -> pci_vpd_init()
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
The PCI config header (first 64 bytes of each device's config space) is
defined by the PCI spec so generic software can identify the device and
manage its usage of I/O, memory, and IRQ resources.
Some non-spec-compliant devices put registers other than BARs where the
BARs should be. When the PCI core sizes these "BARs", the reads and writes
it does may have unwanted side effects, and the "BAR" may appear to
describe non-sensical address space.
Add a flag bit to mark non-compliant devices so we don't touch their BARs.
Turn off IO/MEM decoding to prevent the devices from consuming address
space, since we can't read the BARs to find out what that address space
would be.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
CC: stable@vger.kernel.org
If pci_host_bridge_msi_domain() can't find an IRQ domain through the OF
tree, try to look it up directly through the fwnode_handle.
[bhelgaas: changelog]
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
include/asm-generic/pci-bridge.h is now empty, so remove every #include of
it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com> (arm64)
The PCI flag management constants and functions were previously declared in
include/asm-generic/pci-bridge.h. But they are not specific to bridges,
and arches did not include pci-bridge.h consistently.
Move the following interfaces and related constants to include/linux/pci.h
and remove pci-bridge.h:
pci_set_flags()
pci_add_flags()
pci_clear_flags()
pci_has_flag()
This fixes these warnings when building for some arches:
drivers/pci/host/pcie-designware.c:562:20: error: 'PCI_PROBE_ONLY' undeclared (first use in this function)
drivers/pci/host/pcie-designware.c:562:7: error: implicit declaration of function 'pci_has_flag' [-Werror=implicit-function-declaration]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch introduces pci_msi_register_fwnode_provider() for irqchip
to register a callback, to provide a way to determine appropriate MSI
domain for a pci device.
It also introduces pci_host_bridge_acpi_msi_domain(), which returns
the MSI domain of the specified PCI host bridge with DOMAIN_BUS_PCI_MSI
bus token. Then, it is assigned to pci device.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* pci/aspm:
PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show()
* pci/hotplug:
PCI: pciehp: Always protect pciehp_disable_slot() with hotplug mutex
* pci/misc:
x86/PCI: Simplify pci_bios_{read,write}
PCI: Simplify config space size computation
PCI: Limit config space size for Netronome NFP6000 family
PCI: Add Netronome vendor and device IDs
PCI: Support PCIe devices with short cfg_size
x86/PCI: Clarify AMD Fam10h config access restrictions comment
PCI: Print warnings for all invalid expansion ROM headers
PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask
* pci/msi:
PCI/MSI: Remove empty pci_msi_init_pci_dev()
PCI/MSI: Initialize MSI capability for all architectures
Restructure the logic so we return the config space size as soon as we know
it. This reduces indentation, removes negations, and removes gotos.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4a7cc83167 ("genirq/MSI: Move msi_list from struct pci_dev to struct
device") removed the contents of pci_msi_init_pci_dev(). All
implementation of it are now empty, so remove it completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't
support MSI") moved dev->msi_cap and dev->msix_cap initialization from the
pci_init_capabilities() path (used on all architectures) to the
pci_setup_device() path (not used on Open Firmware architectures).
This broke MSI or MSI-X on Open Firmware machines. 4d9aac397a
("powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case")
fixed it for PowerPC but not for SPARC.
Set up MSI and MSI-X (initialize msi_cap and msix_cap and disable MSI and
MSI-X) in pci_init_capabilities() so all architectures do it the same way.
This reverts 4d9aac397a since this patch fixes the problem generically
for both PowerPC and SPARC.
[bhelgaas: changelog, make pci_msi_setup_pci_dev() static]
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* acpi-smbus:
Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook"
ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code
* acpi-ec:
ACPI-EC: Drop unnecessary check made before calling acpi_ec_delete_query()
* acpi-pci:
PCI: Fix OF logic in pci_dma_configure()
This patch fixes a bug introduced by previous commit,
which incorrectly checkes the of_node of the end-point device.
Instead, it should check the of_node of the host bridge.
Fixes: 50230713b6 ("PCI: OF: Move of_pci_dma_configure() to pci_dma_configure()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Support for the ACPI _CCA configuration object intended to tell
the OS whether or not a bus master device supports hardware
managed cache coherency and a new set of functions to allow
drivers to check the cache coherency support for devices in a
platform firmware interface agnostic way (Suravee Suthikulpanit,
Jeremy Linton).
- ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X
(Aaron Lu, Hans de Goede).
- Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers
(Jon Medhurst, Nicolas Pitre).
- kfree()-related fixup for the recently introduced CPPC cpufreq
frontend (Markus Elfring).
- intel_pstate fix reducing kernel log noise on systems where
P-states are managed by hardware (Prarit Bhargava).
- intel_pstate maintainers information update (Srinivas Pandruvada).
- cpufreq core optimization related to the handling of delayed work
items used by governors (Viresh Kumar).
- Locking fixes and cleanups of the Operating Performance Points
(OPP) framework (Viresh Kumar).
- Generic power domains framework cleanups (Lina Iyer).
- cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan,
Thomas Renninger).
- turbostat tool updates (Len Brown).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWQ96OAAoJEILEb/54YlRxyYYQALJ1HXu76SvYX1re2aawOw6Y
WgzF3Ly7JX034E1VvA2xP6wgkWpBRBDcpnRDeltNA4dYXPBDei/eTcRZTLX12N3g
AfFRGjGWTtLJfpNPecNMmUyF5xHjgDgMIQRabY+Is5NfP5STkPHJeqULnEpvTtx8
bd0lnC5jc4vuZiPEh1xVb+ClYDqWS8YQPyFJVjV/BaIf8Qwe5+oRX36byMBaKc9D
ZgmvmCk5n/HLQQ1uQsqe4xnhFLHN2rypt2BLvFrOtlnSz9VNNpQyB+OIW1mgCD4f
LhpKIwjP8NhZNQUq8HFu7nDlm8ciQtWmeMPB5NdGQ+OESu7yfKAOzQ+3U6Gl2Gaf
66zVGyV6SOJJwfDVJ3qKTtroWps9QV7ZClOJ+zJGgiujwU+tJ3pDQyZM9pa7CL3C
s7ZAUsI6IigSBjD3nJVOyG4DO0a8KQFCIE1mDmyqId45Qz8xJoOrYP33/ZnDuOdo
2OtL/emyfWsz9ixbHVfwIhb7EC6aoaUxQrhSWmNraaQS43YfioZR7h4we8gwenph
X4E1KY4SdML+uFf2VKIcd45NM3IBprCxx5UgFAJdrqe8+otqPNF2AVosG4iqhg/b
k4nxwuIvw2a8Fm77U9ytyXDYMItU/wIlAHMbnmgx+oTwRv6AbZ07MHkyfuQLYuhD
tq5Y14qSiTS7prNacx98
=XZiP
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI updates from Rafael Wysocki:
"The only new feature in this batch is support for the ACPI _CCA device
configuration object, which it a pre-requisite for future ACPI PCI
support on ARM64, but should not affect the other architectures.
The rest is fixes and cleanups, mostly in cpufreq (including
intel_pstate), the Operating Performace Points (OPP) framework and
tools (cpupower and turbostat).
Specifics:
- Support for the ACPI _CCA configuration object intended to tell the
OS whether or not a bus master device supports hardware managed
cache coherency and a new set of functions to allow drivers to
check the cache coherency support for devices in a platform
firmware interface agnostic way (Suravee Suthikulpanit, Jeremy
Linton).
- ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X
(Aaron Lu, Hans de Goede).
- Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers
(Jon Medhurst, Nicolas Pitre).
- kfree()-related fixup for the recently introduced CPPC cpufreq
frontend (Markus Elfring).
- intel_pstate fix reducing kernel log noise on systems where
P-states are managed by hardware (Prarit Bhargava).
- intel_pstate maintainers information update (Srinivas Pandruvada).
- cpufreq core optimization related to the handling of delayed work
items used by governors (Viresh Kumar).
- Locking fixes and cleanups of the Operating Performance Points
(OPP) framework (Viresh Kumar).
- Generic power domains framework cleanups (Lina Iyer).
- cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan, Thomas
Renninger).
- turbostat tool updates (Len Brown)"
* tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
PCI: ACPI: Add support for PCI device DMA coherency
PCI: OF: Move of_pci_dma_configure() to pci_dma_configure()
of/pci: Fix pci_get_host_bridge_device leak
device property: ACPI: Remove unused DMA APIs
device property: ACPI: Make use of the new DMA Attribute APIs
device property: Adding DMA Attribute APIs for Generic Devices
ACPI: Adding DMA Attribute APIs for ACPI Device
device property: Introducing enum dev_dma_attr
ACPI: Honor ACPI _CCA attribute setting
cpufreq: CPPC: Delete an unnecessary check before the function call kfree()
PM / OPP: Add opp_rcu_lockdep_assert() to _find_device_opp()
PM / OPP: Hold dev_opp_list_lock for writers
PM / OPP: Protect updates to list_dev with mutex
PM / OPP: Propagate error properly from dev_pm_opp_set_sharing_cpus()
cpufreq: s5pv210-cpufreq: fix wrong do_div() usage
MAINTAINERS: update for intel P-state driver
Creating a common structure initialization pattern for struct option
cpupower: Enable disabled Cstates if they are below max latency
cpupower: Remove debug message when using cpupower idle-set -D switch
cpupower: cpupower monitor reports uninitialized values for offline cpus
...
This patch adds support for setting up PCI device DMA coherency from
ACPI _CCA object that should normally be specified in the DSDT node
of its PCI host bridge.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch move of_pci_dma_configure() to a more generic
pci_dma_configure(), which can be extended by non-OF code (e.g. ACPI).
This has no functional change.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Acked-by: Rob Herring <robh+dt@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/aer:
PCI/AER: Clear error status registers during enumeration and restore
* pci/hotplug:
PCI: pciehp: Queue power work requests in dedicated function
* pci/misc:
PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum
x86/PCI: Make pci_subsys_init() static
PCI: Add builtin_pci_driver() to avoid registration boilerplate
PCI: Remove unnecessary "if" statement
* pci/msi:
x86/PCI: Don't alloc pcibios-irq when MSI is enabled
PCI/MSI: Export all remapped MSIs to sysfs attributes
PCI: Disable MSI on SiS 761
* pci/resource:
sparc/PCI: Add mem64 resource parsing for root bus
PCI: Expand Enhanced Allocation BAR output
PCI: Make Enhanced Allocation bitmasks more obvious
PCI: Handle Enhanced Allocation capability for SR-IOV devices
PCI: Add support for Enhanced Allocation devices
PCI: Add Enhanced Allocation register entries
PCI: Handle IORESOURCE_PCI_FIXED when assigning resources
PCI: Handle IORESOURCE_PCI_FIXED when sizing resources
PCI: Clear IORESOURCE_UNSET when reverting to firmware-assigned address
* pci/virtualization:
PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
PCI: Wait 1 second between disabling VFs and clearing NumVFs
PCI: Reorder pcibios_sriov_disable()
PCI: Remove VFs in reverse order if virtfn_add() fails
PCI: Remove redundant validation of SR-IOV offset/stride registers
PCI: Set SR-IOV NumVFs to zero after enumeration
PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs
PCI: Don't try to restore VF BARs
Add support for devices using Enhanced Allocation entries instead of BARs.
This allows the kernel to parse the EA Extended Capability structure in PCI
config space and claim the BAR-equivalent resources.
See https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_Allocation_23_Oct_2014_Final.pdf
[bhelgaas: add spec URL, s/pci_ea_set_flags/pci_ea_flags/, consolidate
declarations, print unknown property in hex to match spec]
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
[david.daney@cavium.com: Add more support/checking for Entry Properties,
allow EA behind bridges, rewrite some error messages.]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
So far, we've always considered that for a given PCI device, its
MSI controller was either set by the architecture-specific
pcibios hook, or simply inherited from the host bridge.
This doesn't cover things like firmware-defined topologies like
msi-map (DT) or IORT (ACPI), which can provide information about
which MSI controller to use on a per-device basis.
This patch adds the necessary hook into the MSI code to allow this
feature, and provides the msi-map functionnality as a first
implementation.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, we have considered that the MSI domain for a device was
either set via the architecture-dependent pcibios implementation
or inherited from the host bridge.
As we're about to break that assumption, add pci_dev_msi_domain
which is the equivalent of pci_host_bridge_msi_domain, but for
a single device.
Other than moving things around a bit, this patch on its own
has no effect.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
SR-IOV creates a virtual bus where bus->self is NULL. When we add VFs and
scan for an MSI domain, pci_set_bus_msi_domain() dereferences bus->self,
which causes a kernel NULL pointer dereference oops.
Scan up to the parent bus until we find a real bridge where we can get the
MSI domain.
[bhelgaas: changelog]
Fixes: 44aa0c657e ("PCI/MSI: Add hooks to populate the msi_domain field")
Tested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
AER errors might be recorded when powering-on devices. These errors can be
ignored, so firmware usually clears them before the OS enumerates devices.
However, firmware is not involved when devices are added via hotplug, so
the OS may discover power-up errors that should be ignored. The same may
happen when powering up devices when resuming after suspend.
Clear the AER error status registers during enumeration and resume.
[bhelgaas: changelog, remove repetitive comments]
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Revert dff22d2054 ("PCI: Call pci_read_bridge_bases() from core instead
of arch code").
Reading PCI bridge windows is not arch-specific in itself, but there is PCI
core code that doesn't work correctly if we read them too early. For
example, Hannes found this case on an ARM Freescale i.mx6 board:
pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
0x204100 of space, and mem windows are megabyte-aligned.
Bus sizing can increase a bridge window size, but never *decrease* it (see
d65245c329 ("PCI: don't shrink bridge resources")). Prior to
dff22d2054, ARM didn't read bridge windows at all, so the "original size"
was zero, and we assigned a 3MB window.
After dff22d2054, we read the bridge windows before sizing the bus. The
firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
we never decrease the size, we kept 16MB even though we only needed 3MB.
But 16MB doesn't fit in the host bridge aperture, so we failed to assign
space for the window and the downstream devices.
I think this is a defect in the PCI core: we shouldn't rely on the firmware
to assign sensible windows.
Ray reported a similar problem, also on ARM, with Broadcom iProc.
Issues like this are too hard to fix right now, so revert dff22d2054.
Reported-by: Hannes <oe5hpm@gmail.com>
Reported-by: Ray Jui <rjui@broadcom.com>
Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map. This facility is used by the pmem driver to
enable pfn_to_page() operations on the page frames returned by DAX
('direct_access' in 'struct block_device_operations'). For now, the
'memmap' allocation for these "device" pages comes from "System
RAM". Support for allocating the memmap from device memory will
arrive in a later kernel.
2/ Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3. Completion of
the conversion is targeted for v4.4.
3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
5/ Miscellaneous updates and fixes to libnvdimm including support
for issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
slsw6DkrWT60CRE42nbK
=o57/
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq departement provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI") changed the location of the code that initialises
dev->msi_cap/msix_cap and then disables MSI/MSI-X interrupts at PCI
probe time in devices that have this flag set. It moved the code from
pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(),
called by pci_setup_device().
The pseries PCI probing code does not call pci_setup_device(), so since
the aforementioned commit the function pci_msi_setup_pci_dev() is not
called and MSI/MSI-X interrupts are left enabled. Additionally because
dev->msi_cap/msix_cap are not initialised no driver can ever enable
MSI/MSI-X.
To fix this, the pseries PCI probe should manually call
pci_msi_setup_pci_dev(), so this patch makes it non-static.
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Update change log to mention dev->msi_cap/msix_cap]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Firmware typically configures the PCIe fabric with a consistent Max Payload
Size setting based on the devices present at boot. A hot-added device
typically has the power-on default MPS setting (128 bytes), which may not
match the fabric.
The previous Linux default, in the absence of any "pci=pcie_bus_*" options,
was PCIE_BUS_TUNE_OFF, in which we never touch MPS, even for hot-added
devices.
Add a new default setting, PCIE_BUS_DEFAULT, in which we make sure every
device's MPS setting matches the upstream bridge. This makes it more
likely that a hot-added device will work in a system with optimized MPS
configuration.
Note that if we hot-add a device that only supports 128-byte MPS, it still
likely won't work because we don't reconfigure the rest of the fabric.
Booting with "pci=pcie_bus_peer2peer" is a workaround for this because it
sets MPS to 128 for everything.
[bhelgaas: changelog, new default, rework for pci_configure_device() path]
Tested-by: Keith Busch <keith.busch@intel.com>
Tested-by: Jordan Hargrave <jharg93@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Previously we checked for invalid MPS settings, i.e., a device with MPS
different than its upstream bridge, in pcie_bus_detect_mps(). We only did
this if the arch or hotplug driver called pcie_bus_configure_settings(),
and then only if PCIe bus tuning was disabled (PCIE_BUS_TUNE_OFF).
Move the MPS checking code to pci_configure_device(), so we do it in the
pci_device_add() path for every device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a pci_scan_root_bus_msi() interface so an arch can specify the MSI
controller up front. This removes the need for a pcibios callback to set
the MSI controller later.
This is not exported because I'd like to replace the variety of "scan root
bus" interfaces with a single, more extensible interface that can handle
the MSI controller, domain, pci_ops, resources, etc. I hope this interface
is temporary.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
We should not assume any particular hardware topology. Commit d0751b98df
("PCI: Add dev->has_secondary_link to track downstream PCIe links") relied
on the assumption that every PCIe hierarchy is rooted at a Root Port. But
we can't rely on any assumption about what hardware we will find; we just
have to deal with the world as it is.
On some platforms, PCIe devices (endpoints, switch upstream ports, etc.)
appear directly on the root bus, and there is no Root Port in the PCI bus
hierarchy. For example, Meelis observed these top-level devices on a
Sparc V245:
0000:02:00.0 PCI bridge to [bus 03-0d] Switch Upstream Port
0001:02:00.0 PCI bridge to [bus 03] PCIe to PCI/PCI-X Bridge
These devices *look* like they have links going upstream, but there really
are no upstream devices.
In set_pcie_port_type(), we used the parent device to figure out which side
of a switch port has a link, so if the parent device did not exist, we
dereferenced a NULL parent pointer.
Check whether the parent device exists before dereferencing it.
Meelis observed this oops on Sparc V245 and T2000. Ben Herrenschmidt says
this is also possible on IBM PowerVM guests on PowerPC.
[bhelgaas: changelog, comment]
Link: http://lkml.kernel.org/r/alpine.LRH.2.20.1508122118210.18637@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
* pci/hotplug:
PCI: pciehp: Remove ignored MRL sensor interrupt events
PCI: pciehp: Remove unused interrupt events
PCI: pciehp: Handle invalid data when reading from non-existent devices
PCI: Hold pci_slot_mutex while searching bus->slots list
PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
PCI: pciehp: Simplify pcie_poll_cmd()
PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot
* pci/iommu:
PCI: Remove pci_ats_enabled()
PCI: Stop caching ATS Invalidate Queue Depth
PCI: Move ATS declarations to linux/pci.h so they're all together
PCI: Clean up ATS error handling
PCI: Use pci_physfn() rather than looking up physfn by hand
PCI: Inline the ATS setup code into pci_ats_init()
PCI: Rationalize pci_ats_queue_depth() error checking
PCI: Reduce size of ATS structure elements
PCI: Embed ATS info directly into struct pci_dev
PCI: Allocate ATS struct during enumeration
iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth
* pci/irq:
PCI: Kill off set_irq_flags() usage
* pci/virtualization:
PCI: Add ACS quirks for Intel I219-LM/V
Previously, we allocated pci_ats structures when an IOMMU driver called
pci_enable_ats(). An SR-IOV VF shares the STU setting with its PF, so when
enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
didn't already have one. We held the sriov->lock to serialize threads
concurrently enabling ATS on several VFS so only one would allocate the PF
pci_ats.
Gregor reported a deadlock here:
pci_enable_sriov
sriov_enable
virtfn_add
mutex_lock(dev->sriov->lock) # acquire sriov->lock
pci_device_add
device_add
BUS_NOTIFY_ADD_DEVICE notifier chain
iommu_bus_notifier
amd_iommu_add_device # iommu_ops.add_device
init_iommu_group
iommu_group_get_for_dev
iommu_group_add_device
__iommu_attach_device
amd_iommu_attach_device # iommu_ops.attach_device
attach_device
pci_enable_ats
mutex_lock(dev->sriov->lock) # deadlock
There's no reason to delay allocating the pci_ats struct, and if we
allocate it for each device at enumeration-time, there's no need for
locking in pci_enable_ats().
Allocate pci_ats struct during enumeration, when we initialize other
capabilities.
Note that this implementation requires ATS to be enabled on the PF first,
before on any of the VFs because the PF controls the STU for all the VFs.
Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
Reported-by: Gregor Dick <gdick@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Quoting Arnd:
I was thinking the opposite approach and basically removing all uses
of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
them.and we can probably replace them all with hardcoded
ioremap_cached() calls in the cases they are actually useful.
All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable, however ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().
Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* pci/irq:
PCI/MSI: Free legacy IRQ when enabling MSI/MSI-X
PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed
PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()
PCI: Add pcibios_alloc_irq() and pcibios_free_irq()
* pci/misc:
PCI: Remove unused "pci_probe" flags
PCI: Add VPD function 0 quirk for Intel Ethernet devices
PCI: Add dev_flags bit to access VPD through function 0
PCI / ACPI: Fix pci_acpi_optimize_delay() comment
PCI: Remove a broken link in quirks.c
PCI: Remove useless redundant code
PCI: Simplify pci_find_(ext_)capability() return value checks
PCI: Move PCI_FIND_CAP_TTL to pci.h and use it in quirks
PCI: Add pcie_downstream_port() (true for Root and Switch Downstream Ports)
PCI: Fix pcie_port_device_resume() comment
PCI: Shift PCI_CLASS_NOT_DEFINED consistently with other classes
PCI: Revert aeb30016fe ("PCI: add Intel USB specific reset method")
PCI: Fix TI816X class code quirk
PCI: Fix generic NCR 53c810 class code quirk
PCI: Use PCI_CLASS_SERIAL_USB instead of bare number
PCI: Add quirk for Intersil/Techwell TW686[4589] AV capture cards
PCI: Remove Intel Cherrytrail D3 delays
* pci/resource:
PCI: Call pci_read_bridge_bases() from core instead of arch code
* pci/virtualization:
PCI: Restore ACS configuration as part of pci_restore_state()
Previously, pci_setup_device() and similar functions searched the
pci_bus->slots list without any locking. It was possible for another
thread to update the list while we searched it.
Add pci_dev_assign_slot() to search the list while holding pci_slot_mutex.
[bhelgaas: changelog, fold in CONFIG_SYSFS fix]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In order to populate the PCI host bridge msi_domain, use the
"msi-parent" attribute to lookup a corresponding irq domain.
If found, this is our MSI domain.
This gets plugged into the core PCI code.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Ma Jun <majun258@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Duc Dang <dhdang@apm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1438091186-10244-6-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In order to be able to populate the device msi_domain field,
add the necessary hooks to propagate the host bridge msi_domain
across secondary busses to devices.
So far, nobody populates the initial msi_domain.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Ma Jun <majun258@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Duc Dang <dhdang@apm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1438091186-10244-5-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When we scan a PCI bus, we read PCI-PCI bridge window registers with
pci_read_bridge_bases() so we can validate the resource hierarchy. Most
architectures call pci_read_bridge_bases() from pcibios_fixup_bus(), but
PCI-PCI bridges are not arch-specific, so this doesn't need to be in
arch-specific code.
Call pci_read_bridge_bases() directly from the PCI core instead of from
arch code.
For alpha and mips, we now call pci_read_bridge_bases() always; previously
we only called it if PCI_PROBE_ONLY was set.
[bhelgaas: changelog]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: James E.J. Bottomley <jejb@parisc-linux.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: David Howells <dhowells@redhat.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Tony Luck <tony.luck@intel.com>
CC: David S. Miller <davem@davemloft.net>
CC: Ingo Molnar <mingo@redhat.com>
CC: Guenter Roeck <linux@roeck-us.net>
CC: Michal Simek <monstr@monstr.eu>
CC: Chris Zankel <chris@zankel.net>
The PCI class in dev->class is a three-byte value comprising a base class,
sub-class, and interface type. PCI_CLASS_NOT_DEFINED includes the base
class and sub-class, but not the interface type, so it should be shifted to
make space for the interface. It happens that PCI_CLASS_NOT_DEFINED is
zero, so it doesn't matter in the end, but we should still use it
consistently with other class definitions.
Treat PCI_CLASS_NOT_DEFINED as a base class/sub-class value that should
appear in bits 8-23 of dev->class.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No one uses pci_scan_bus_parented() any more, remove it.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
David Ahern reported that d63e2e1f3d ("sparc/PCI: Clip bridge windows
to fit in upstream windows") fails to boot on sparc/T5-8:
pci 0000:06:00.0: reg 0x184: can't handle BAR above 4GB (bus address 0x110204000)
The problem is that sparc64 assumed that dma_addr_t only needed to hold DMA
addresses, i.e., bus addresses returned via the DMA API (dma_map_single(),
etc.), while the PCI core assumed dma_addr_t could hold *any* bus address,
including raw BAR values. On sparc64, all DMA addresses fit in 32 bits, so
dma_addr_t is a 32-bit type. However, BAR values can be 64 bits wide, so
they don't fit in a dma_addr_t. d63e2e1f3d added new checking that
tripped over this mismatch.
Add pci_bus_addr_t, which is wide enough to hold any PCI bus address,
including both raw BAR values and DMA addresses. This will be 64 bits
on 64-bit platforms and on platforms with a 64-bit dma_addr_t. Then
dma_addr_t only needs to be wide enough to hold addresses from the DMA API.
[bhelgaas: changelog, bugzilla, Kconfig to ensure pci_bus_addr_t is at
least as wide as dma_addr_t, documentation]
Fixes: d63e2e1f3d ("sparc/PCI: Clip bridge windows to fit in upstream windows")
Fixes: 23b13bc76f ("PCI: Fail safely if we can't handle BARs larger than 4GB")
Link: http://lkml.kernel.org/r/CAE9FiQU1gJY1LYrxs+ma5LCTEEe4xmtjRG0aXJ9K_Tsu+m9Wuw@mail.gmail.com
Link: http://lkml.kernel.org/r/1427857069-6789-1-git-send-email-yinghai@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=96231
Reported-by: David Ahern <david.ahern@oracle.com>
Tested-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
CC: stable@vger.kernel.org # v3.19+
Previously we assumed that PCIe Root Ports and Downstream Ports had Links
on their secondary side. That is true in most systems, but it is possible
to connect a switch with either an Upstream or a Downstream Port leading
downstream.
Instead of relying on the component type to identify devices that have
links leading downstream, use the "dev->has_secondary_link" field.
[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCIe Port is an interface to a Link. A Root Port is a PCI-PCI bridge in
a Root Complex and has a Link on its secondary (downstream) side. For
other Ports, the Link may be on either the upstream (closer to the Root
Complex) or downstream side of the Port.
The usual topology has a Root Port connected to an Upstream Port. We
previously assumed this was the only possible topology, and that a
Downstream Port's Link was always on its downstream side, like this:
+---------------------+
+------+ | Downstream |
| Root | | Upstream Port +--Link--
| Port +--Link--+ Port |
+------+ | Downstream |
| Port +--Link--
+---------------------+
But systems do exist (see URL below) where the Root Port is connected to a
Downstream Port. In this case, a Downstream Port's Link may be on either
the upstream or downstream side:
+---------------------+
+------+ | Upstream |
| Root | | Downstream Port +--Link--
| Port +--Link--+ Port |
+------+ | Downstream |
| Port +--Link--
+---------------------+
We can't use the Port type to determine which side the Link is on, so add a
bit in struct pci_dev to keep track.
A Root Port's Link is always on the Port's secondary side. A component
(Endpoint or Port) on the other end of the Link obviously has the Link on
its upstream side. If that component is a Port, it is part of a Switch or
a Bridge. A Bridge has a PCI or PCI-X bus on its secondary side, not a
Link. The internal bus of a Switch connects the Port to another Port whose
Link is on the downstream side.
[bhelgaas: changelog, comment, cache "type", use if/else]
Link: http://lkml.kernel.org/r/54EB81B2.4050904@pobox.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=94361
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If we enable MSI, then kexec a new kernel, the new kernel may receive MSIs
it is not prepared for. Commit d5dea7d95c ("PCI: msi: Disable msi
interrupts when we initialize a pci device") prevents this, but only if the
new kernel is built with CONFIG_PCI_MSI=y.
Move the "disable MSI" functionality from drivers/pci/msi.c to a new
pci_msi_setup_pci_dev() in drivers/pci/probe.c so we can disable MSIs when
we enumerate devices even if the kernel doesn't include full MSI support.
[bhelgaas: changelog, disable MSIs in pci_setup_device(), put
pci_msi_setup_pci_dev() at its final destination]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Export the following symbols so they can be referenced by a PCI host bridge
driver compiled as a kernel loadable module:
pci_common_swizzle
pci_create_root_bus
pci_stop_root_bus
pci_remove_root_bus
pci_assign_unassigned_bus_resources
pci_fixup_irqs
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Previously, pci_scan_root_bus() created a root PCI bus, enumerated the
devices on it, and called pci_bus_add_devices(), which made the devices
available for drivers to claim them.
Most callers assigned resources to devices after pci_scan_root_bus()
returns, which may be after drivers have claimed the devices. This is
incorrect; the PCI core should not change device resources while a driver
is managing the device.
Remove pci_bus_add_devices() from pci_scan_root_bus() and do it after any
resource assignment in the callers.
Note that ARM's pci_common_init_dev() already called pci_bus_add_devices()
after pci_scan_root_bus(), so we only need to remove the first call:
pci_common_init_dev
pcibios_init_hw
pci_scan_root_bus
pci_bus_add_devices # first call
pci_bus_assign_resources
pci_bus_add_devices # second call
[bhelgaas: changelog, drop "root_bus" var in alpha common_init_pci(),
return failure earlier in mn10300, add "return" in x86 pcibios_scan_root(),
return early if xtensa platform_pcibios_fixup() fails]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
CC: Matt Turner <mattst88@gmail.com>
CC: David Howells <dhowells@redhat.com>
CC: Tony Luck <tony.luck@intel.com>
CC: Michal Simek <monstr@monstr.eu>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Chris Metcalf <cmetcalf@ezchip.com>
CC: Chris Zankel <chris@zankel.net>
CC: Max Filippov <jcmvbkbc@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Previously, pci_scan_bus() created a root PCI bus, enumerated the devices
on it, and called pci_bus_add_devices(), which made the devices available
for drivers to claim them.
Most callers assigned resources to devices after pci_scan_bus() returns,
which may be after drivers have claimed the devices. This is incorrect;
the PCI core should not change device resources while a driver is managing
the device.
Remove pci_bus_add_devices() from pci_scan_bus() and do it after any
resource assignment in the callers.
[bhelgaas: changelog, check for failure in mcf_pci_init()]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Guan Xuetao <gxt@mprc.pku.edu.cn>
CC: Richard Henderson <rth@twiddle.net>
CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
CC: Matt Turner <mattst88@gmail.com>
If there is a DT node available for the root bridge's parent device, use
the DMA configuration from that device node. For example, Keystone PCI
devices would require dma_pfn_offset to be set correctly in the device
structure of the PCI device in order to have the correct DMA mask. The DT
node will have dma-ranges defined for this. Also support using the DT
property dma-coherent to allow coherent DMA operation by the PCI device.
Use the new helper function of_pci_dma_configure() to update the device DMA
configuration. This fixes DMA on systems where DMA addresses are a
constant offset from CPU physical addresses.
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> (AMD Seattle)
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: Grant Likely <grant.likely@linaro.org>
CC: Rob Herring <robh+dt@kernel.org>
CC: Russell King <linux@arm.linux.org.uk>
CC: Arnd Bergmann <arnd@arndb.de>
Use common resource list management data structure and interfaces
instead of private implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As a consequence of restoring the detection of invalid BARs, add a new
informational printk like the following when such occurrences are
encountered.
pci ssss:bb:dd.f: [Firmware Bug]: reg 0xXX: invalid BAR (can't size)
Reported-by: William Unruh <unruh@physics.ubc.ca>
Reported-by: Martin Lucina <martin@lucina.net>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Matthew Wilcox <willy@linux.intel.com>
Aaron reported that a 32-bit x86 kernel with Physical Address Extension
(PAE) support complains about bridge prefetchable memory windows above 4GB:
pci_bus 0000:00: root bus resource [mem 0x380000000000-0x383fffffffff]
...
pci 0000:03:00.0: reg 0x10: [mem 0x383fffc00000-0x383fffdfffff 64bit pref]
pci 0000:03:00.0: reg 0x20: [mem 0x383fffe04000-0x383fffe07fff 64bit pref]
pci 0000:03:00.1: reg 0x10: [mem 0x383fffa00000-0x383fffbfffff 64bit pref]
pci 0000:03:00.1: reg 0x20: [mem 0x383fffe00000-0x383fffe03fff 64bit pref]
pci 0000:00:02.2: PCI bridge to [bus 03-04]
pci 0000:00:02.2: bridge window [io 0x1000-0x1fff]
pci 0000:00:02.2: bridge window [mem 0x91900000-0x91cfffff]
pci 0000:00:02.2: can't handle 64-bit address space for bridge
In this kernel, unsigned long is 32 bits and dma_addr_t is 64 bits.
Previously we used "unsigned long" to hold the bridge window address. But
this is a bus address, so we should use dma_addr_t instead.
Use dma_addr_t to hold the bridge window base and limit.
The question of whether the CPU can actually *address* the window is
separate and depends on what the physical address space of the CPU is and
whether the host bridge does any address translation.
[bhelgaas: fix "shift count > width of type", changelog, stable tag]
Fixes: d56dbf5bab ("PCI: Allocate 64-bit BARs above 4G when possible")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=88131
Reported-by: Aaron Ma <mapengyu@gmail.com>
Tested-by: Aaron Ma <mapengyu@gmail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.14+
Previously we applied _HPX type 2 record Link Control register settings
only to bridges with a subordinate bus. But it's better to apply them to
all devices with a link because if the subordinate bus has not been
allocated yet, we won't apply settings to the device.
Use pcie_cap_has_lnkctl() to determine whether the device has a Link
Control register instead of looking at dev->subordinate.
[bhelgaas: changelog]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The functions pci_dev_put(), pci_pme_wakeup_bus(), and put_device() return
immediately if their argument is NULL. Thus the test before the call is
not needed.
Remove these unnecessary tests.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
__pci_read_base() disables decoding while sizing device BARs. We can't
print while decoding is disabled, which leads to some rather messy exit
logic.
Coalesce the sizing logic to minimize the time decoding is disabled. This
lets us print errors where they're detected.
The refactoring also takes advantage of the symmetry of obtaining the BAR's
extent (pci_size) and storing the result as the 'region' for both the
32-bit and 64-bit BARs, consolidating both cases.
No functional change intended.
[bhelgaas: move pci_size() up, per Thomas Petazzoni, Thierry Reding, Kevin Hilman]
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Matthew Wilcox <willy@linux.intel.com>
Commit 6ac665c63d ("PCI: rewrite PCI BAR reading code") masked off
low-order bits from 'l', but not from 'sz'. Both are passed to pci_size(),
which compares 'base == maxbase' to check for read-only BARs. The masking
of 'l' means that comparison will never be 'true', so the check for
read-only BARs no longer works.
Resolve this by also masking off the low-order bits of 'sz' before passing
it into pci_size() as 'maxbase'. With this change, pci_size() will once
again catch the problems that have been encountered to date:
- AGP aperture BAR of AMD-7xx host bridges: if the AGP window is
disabled, this BAR is read-only and read as 0x00000008 [1]
- BARs 0-4 of ALi IDE controllers can be non-zero and read-only [1]
- Intel Sandy Bridge - Thermal Management Controller [8086:0103];
BAR 0 returning 0xfed98004 [2]
- Intel Xeon E5 v3/Core i7 Power Control Unit [8086:2fc0];
Bar 0 returning 0x00001a [3]
Link: [1] https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/drivers/pci/probe.c?id=1307ef6621991f1c4bc3cec1b5a4ebd6fd3d66b9 ("PCI: probing read-only BARs" (pre-git))
Link: [2] https://bugzilla.kernel.org/show_bug.cgi?id=43331
Link: [3] https://bugzilla.kernel.org/show_bug.cgi?id=85991
Reported-by: William Unruh <unruh@physics.ubc.ca>
Reported-by: Martin Lucina <martin@lucina.net>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Matthew Wilcox <willy@linux.intel.com>
CC: stable@vger.kernel.org # v2.6.27+
* pci/host-generic:
arm64: Add architectural support for PCI
PCI: Add pci_remap_iospace() to map bus I/O resources
of/pci: Add support for parsing PCI host bridge resources from DT
of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
PCI: Add generic domain handling
of/pci: Fix the conversion of IO ranges into IO resources
of/pci: Move of_pci_range_to_resource() to of/address.c
ARM: Define PCI_IOBASE as the base of virtual PCI IO space
of/pci: Add pci_register_io_range() and pci_pio_to_address()
asm-generic/io.h: Fix ioport_map() for !CONFIG_GENERIC_IOMAP
Conflicts:
drivers/pci/host/pci-tegra.c
The handling of PCI domains (or PCI segments in ACPI speak) is usually a
straightforward affair but its implementation is currently left to the
architectural code, with pci_domain_nr(b) querying the value of the domain
associated with bus b.
This patch introduces CONFIG_PCI_DOMAINS_GENERIC as an option that can be
selected if an architecture wants a simple implementation where the value
of the domain associated with a bus is stored in struct pci_bus.
The architectures that select CONFIG_PCI_DOMAINS_GENERIC will then have to
implement pci_bus_assign_domain_nr() as a way of setting the domain number
associated with a root bus. All child buses except the root bus will
inherit the domain_nr value from their parent.
Signed-off-by: Catalin Marinas <Catalin.Marinas@arm.com>
[Renamed pci_set_domain_nr() to pci_bus_assign_domain_nr()]
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arnd Bergmann <arnd@arndb.de>
This reverts commit 1820ffdccb ("PCI: Make sure bus number resources stay
within their parents bounds") because it breaks some systems with LSI Logic
FC949ES Fibre Channel Adapters, apparently by exposing a defect in those
adapters.
Dirk tested a Tyan VX50 (B4985) with this device that worked like this
prior to 1820ffdccb:
bus: [bus 00-7f] on node 0 link 1
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-07])
pci 0000:00:0e.0: PCI bridge to [bus 0a]
pci_bus 0000:0a: busn_res: can not insert [bus 0a] under [bus 00-07] (conflicts with (null) [bus 00-07])
pci 0000:0a:00.0: [1000:0646] type 00 class 0x0c0400 (FC adapter)
Note that the root bridge [bus 00-07] aperture is wrong; this is a BIOS
defect in the PCI0 _CRS method. But prior to 1820ffdccb, we didn't
enforce that aperture, and the FC adapter worked fine at 0a:00.0.
After 1820ffdccb, we notice that 00:0e.0's aperture is not contained in
the root bridge's aperture, so we reconfigure it so it *is* contained:
pci 0000:00:0e.0: bridge configuration invalid ([bus 0a-0a]), reconfiguring
pci 0000:00:0e.0: PCI bridge to [bus 06-07]
This effectively moves the FC device from 0a:00.0 to 07:00.0, which should
be legal. But when we enumerate bus 06, the FC device doesn't respond, so
we don't find anything. This is probably a defect in the FC device.
Possible fixes (due to Yinghai):
1) Add a quirk to fix the _CRS information based on what amd_bus.c read
from the hardware
2) Reset the FC device after we change its bus number
3) Revert 1820ffdccb
Fix 1 would be relatively easy, but it does sweep the LSI FC issue under
the rug. We might want to reconfigure bus numbers in the future for some
other reason, e.g., hotplug, and then we could trip over this again.
For that reason, I like fix 2, but we don't know whether it actually works,
and we don't have a patch for it yet.
This revert is fix 3, which also sweeps the LSI FC issue under the rug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=84281
Reported-by: Dirk Gouders <dirk@gouders.net>
Tested-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+
CC: Yinghai Lu <yinghai@kernel.org>
This reverts commit fc1b253141 ("PCI: Don't scan random busses in
pci_scan_bridge()") because it breaks CardBus on some machines.
David tested a Dell Latitude D505 that worked like this prior to
fc1b253141:
pci 0000:00:1e.0: PCI bridge to [bus 01]
pci 0000:01:01.0: CardBus bridge to [bus 02-05]
Note that the 01:01.0 CardBus bridge has a bus number aperture of
[bus 02-05], but those buses are all outside the 00:1e.0 PCI bridge bus
number aperture, so accesses to buses 02-05 never reach CardBus. This is
later patched up by yenta_fixup_parent_bridge(), which changes the
subordinate bus number of the 00:1e.0 PCI bridge:
pci_bus 0000:01: Raising subordinate bus# of parent bus (#01) from #01 to #05
With fc1b253141, pci_scan_bridge() fails immediately when it notices that
we can't allocate a valid secondary bus number for the CardBus bridge, and
CardBus doesn't work at all:
pci 0000:01:01.0: can't allocate child bus 01 from [bus 01]
I'd prefer to fix this by integrating the yenta_fixup_parent_bridge() logic
into pci_scan_bridge() so we fix the bus number apertures up front. But
I don't think we can do that before v3.17, so I'm going to revert this to
avoid the problem while we're working on the long-term fix.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=83441
Link: http://lkml.kernel.org/r/1409303414-5196-1-git-send-email-david.henningsson@canonical.com
Reported-by: David Henningsson <david.henningsson@canonical.com>
Tested-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+
There's not really a good way to determine whether firmware has already
configured a device with _HPP/_HPX settings. On legacy systems, the BIOS
has probably configured everything, but on UEFI systems it is not required
to do so.
Per the PCI Firmware Specification, rev 3.1, sec 3.5, if PCI_COMMAND_IO or
PCI_COMMAND_MEMORY is set, we can assume firmware has set the corresponding
BARs and maybe we can assume it has configured the rest of the device. And
if a bridge has PCI_COMMAND_PARITY or PCI_COMMAND_SERR set, we can assume
firmware has configured the bridge. But we can't tell much about devices
without BARs.
I think it should be safe to apply _HPP and _HPX settings anyway, even if
firmware has already configured the device, so configure everything we
find.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Linux manages MPS and MRRS settings to keep them consistent across the PCIe
fabric. BIOS doesn't participate in this Linux management, so ignore that
part of any _HPX settings it supplies.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
We currently apply _HPP settings only to:
- non-bridge devices, and
- PCI-to-PCI bridges
i.e., we do not apply them to PCI-to-ISA bridges and the like. It has been
that way since _HPP support was added by 40abb96c51 ("pciehp: Fix
programming hotplug parameters"), but I don't think there's any reason to
exclude these other bridges.
Apply _HPP settings to hot-added PCI devices of any type.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Do not clear PCI_COMMAND_SERR or PCI_COMMAND_PARITY based on _HPP. The
spec (ACPI rev 5.0, sec 6.2.7) says that when "Enable SERR" is set to 1,
we should enable SERR in the command register. It says nothing about
*disabling* SERR or PERR; in fact, the example in 6.2.7.1 says we should
leave PERR alone unless "Enable PERR" is 1.
For hot-added devices, this probably doesn't matter because they power up
with these bits cleared. But in addition to hot-plugged devices, the spec
allows the platform to use _HPP for "configuration of PCI devices not
configured by the BIOS at system boot," and it may make a difference for
devices present at boot.
This change means that if BIOS enables SERR or PERR on a device, and it
supplies _HPP or _HPX with the SERR or PERR bits *cleared*, we will now
leave SERR or PERR reporting enabled on that device instead of disabling it
as we previously did.
See also 40abb96c51 ("pciehp: Fix programming hotplug parameters"), where
this code was first added.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
The ACPI _HPP method was defined before PCIe existed, so its documentation
only mentions PCI. The _HPX Type 0 setting record is essentially identical
to _HPP, but the spec (ACPI rev 5.0, sec 6.2.8.1) says it should be applied
to PCI, PCI-X, and PCIe devices, with settings being ignored if they are
not applicable.
Some platforms with both conventional PCI and PCIe devices provide only
_HPP (not _HPX), so treat _HPP the same way as an _HPX Type 0 record and
apply it to PCIe devices as well as PCI and PCI-X.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
All pci_configure_slot() uses have been removed, so remove the definition
as well.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Some platforms can tell the OS how to configure PCI devices, e.g., how to
set cache line size, error reporting enables, etc. ACPI defines _HPP and
_HPX methods for this purpose.
This configuration was previously done by some of the hotplug drivers using
pci_configure_slot(). But not all hotplug drivers did this, and per the
spec (ACPI rev 5.0, sec 6.2.7), we can also do it for "devices not
configured by the BIOS at system boot."
Move this configuration into the PCI core by adding pci_configure_device()
and calling it from pci_device_add(), so we do this for all devices as we
enumerate them.
This is based on pci_configure_slot(), which is used by hotplug drivers.
I omitted:
- pcie_bus_configure_settings() because it configures MPS and MRRS, which
requires global knowledge of the fabric and must be done later, and
- configuration of subordinate devices; that will happen when we call
pci_device_add() for those devices.
Because pci_configure_slot() was only done by hotplug drivers, this initial
version of pci_configure_device() only configures hot-added devices,
ignoring anything added during boot.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Move pci_configure_slot() and related functions from
drivers/pci/hotplug/pcihp_slot to drivers/pci/probe.c.
This is to prepare for doing device configuration during the normal
enumeration process instead of just after hot-add.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r3.0, sec 2.3.2, an endpoint may respond to a Configuration
Request with a Completion with Configuration Request Retry Status (CRS).
This terminates the Configuration Request.
When the CRS Software Visibility feature is disabled (as it is by default),
a Root Complex must handle a CRS Completion by re-issuing the Configuration
Request. This is invisible to software. From the CPU's point of view, an
endpoint that always responds with CRS causes a hang because the Root
Complex never supplies data to complete the CPU read.
When CRS Software Visibility is enabled, a Root Complex that receives a CRS
Completion for a read of the Vendor ID must return data of 0x0001. The
Vendor ID of 0x0001 indicates to software that the endpoint is not ready.
We now have more devices that require CRS Software Visibility. For
example, a PLX 8713 NT bridge may respond with CRS until it has been
configured via I2C, and the I2C configuration is completely independent of
PCI enumeration.
Enable CRS Software Visibility if it is supported. This allows a system
with such a device to work (though the PCI core times out waiting for it to
become ready, and we have to rescan the bus after it is ready).
This essentially reverts ad7edfe049 ("[PCI] Do not enable CRS Software
Visibility by default"). The failures that led to ad7edfe049 should be
addressed by 89665a6a71 ("PCI: Check only the Vendor ID to identify
Configuration Request Retry").
[bhelgaas: changelog]
Link: http://lkml.kernel.org/r/20071029061532.5d10dfc6@snowcone
Link: http://lkml.kernel.org/r/alpine.LFD.0.9999.0712271023090.21557@woody.linux-foundation.org
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r3.0, sec 2.3.2, if a Root Complex
- has Configuration Request Retry Status Software Visibility enabled,
- issues a Configuration Read of both bytes of the Vendor ID, and
- receives a Completion with Configuration Request Retry Status (CRS),
it must complete the request to the host by fabricating data of 0x0001 for
the Vendor ID and 0xff for any additional bytes in the request.
Linux issues a single config read for the four bytes containing the Vendor
ID and the Device ID. Previously we checked all four bytes for 0xffff0001
to identify CRS.
However, it is only the Vendor ID that really indicates CRS, because it's
sufficient to read only those two bytes. Checking the Device ID verifies
spec compliance but doesn't add any information.
Some Root Complexes appear to indicate CRS by returning 0x0001 for the
Vendor ID along with the actual the Device ID. Previously we interpreted
that as a valid Vendor/Device ID pair, although 0x0001 is reserved and
cannot be a valid Vendor ID.
[bhelgaas: changelog]
Link: http://lkml.kernel.org/r/4729FC36.3040000@gmail.com
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.
No functional change.
[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix various whitespace errors.
No functional change.
[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move EXPORT_SYMBOL so it immediately follows the function or variable.
No functional change.
[bhelgaas: squash similar changes, fix hotplug, probe, rom, search, too]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/misc:
PCI: Fix return value from pci_user_{read,write}_config_*()
PCI: Turn pcibios_penalize_isa_irq() into a weak function
PCI: Test for std config alias when testing extended config space
* pci/hotplug:
PCI: cpqphp: Fix possible null pointer dereference
NVMe: Implement PCIe reset notification callback
PCI: Notify driver before and after device reset
* pci/pci_is_bridge:
pcmcia: Use pci_is_bridge() to simplify code
PCI: pciehp: Use pci_is_bridge() to simplify code
PCI: acpiphp: Use pci_is_bridge() to simplify code
PCI: cpcihp: Use pci_is_bridge() to simplify code
PCI: shpchp: Use pci_is_bridge() to simplify code
PCI: rpaphp: Use pci_is_bridge() to simplify code
sparc/PCI: Use pci_is_bridge() to simplify code
powerpc/PCI: Use pci_is_bridge() to simplify code
ia64/PCI: Use pci_is_bridge() to simplify code
x86/PCI: Use pci_is_bridge() to simplify code
PCI: Use pci_is_bridge() to simplify code
PCI: Add new pci_is_bridge() interface
PCI: Rename pci_is_bridge() to pci_has_subordinate()
* pci/virtualization:
PCI: Introduce new device binding path using pci_dev.driver_override
Conflicts:
drivers/pci/pci-sysfs.c
The driver_override field allows us to specify the driver for a device
rather than relying on the driver to provide a positive match of the
device. This shortcuts the existing process of looking up the vendor and
device ID, adding them to the driver new_id, binding the device, then
removing the ID, but it also provides a couple advantages.
First, the above existing process allows the driver to bind to any device
matching the new_id for the window where it's enabled. This is often not
desired, such as the case of trying to bind a single device to a meta
driver like pci-stub or vfio-pci. Using driver_override we can do this
deterministically using:
echo pci-stub > /sys/bus/pci/devices/0000:03:00.0/driver_override
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
Previously we could not invoke drivers_probe after adding a device to
new_id for a driver as we get non-deterministic behavior whether the driver
we intend or the standard driver will claim the device. Now it becomes a
deterministic process, only the driver matching driver_override will probe
the device.
To return the device to the standard driver, we simply clear the
driver_override and reprobe the device:
echo > /sys/bus/pci/devices/0000:03:00.0/driver_override
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
Another advantage to this approach is that we can specify a driver override
to force a specific binding or prevent any binding. For instance when an
IOMMU group is exposed to userspace through VFIO we require that all
devices within that group are owned by VFIO. However, devices can be
hot-added into an IOMMU group, in which case we want to prevent the device
from binding to any driver (override driver = "none") or perhaps have it
automatically bind to vfio-pci. With driver_override it's a simple matter
for this field to be set internally when the device is first discovered to
prevent driver matches.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a PCI-to-PCIe bridge is stacked on a PCIe-to-PCI bridge, we can have
PCIe endpoints masked by a conventional PCI bus. This makes the extended
config space of the PCIe endpoint inaccessible. The PCIe-to-PCI bridge is
supposed to handle any type 1 configuration transactions where the extended
config offset bits are non-zero as an Unsupported Request rather than
forward it to the secondary interface. As noted here, there are a couple
known offenders to this rule. These bridges drop the extended offset bits,
resulting in the conventional config space being aliased many times across
the extended config space. For Intel NICs, this alias often seems to
expose a bogus SR-IOV cap.
Stacking bridges may seem like an uncommon scenario, but note that any
conventional PCI slot in a modern PC is already the secondary interface of
an onboard PCIe-to-PCI bridge. The user need only add a PCI-to-PCIe
adapter and PCIe device to encounter this problem.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use pci_is_bridge() to simplify code. No functional change.
Requires: 326c1cdae7 PCI: Rename pci_is_bridge() to pci_has_subordinate()
Requires: 1c86438c94 PCI: Add new pci_is_bridge() interface
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* dma-api:
iommu/exynos: Remove unnecessary "&" from function pointers
DMA-API: Update dma_pool_create ()and dma_pool_alloc() descriptions
DMA-API: Fix duplicated word in DMA-API-HOWTO.txt
DMA-API: Capitalize "CPU" consistently
sh/PCI: Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory()
DMA-API: Change dma_declare_coherent_memory() CPU address to phys_addr_t
DMA-API: Clarify physical/bus address distinction
* pci/virtualization:
PCI: Mark RTL8110SC INTx masking as broken
* pci/msi:
PCI/MSI: Remove pci_enable_msi_block()
* pci/misc:
PCI: Remove pcibios_add_platform_entries()
s390/pci: use pdev->dev.groups for attribute creation
PCI: Move Open Firmware devspec attribute to PCI common code
* pci/resource:
PCI: Add resource allocation comments
PCI: Simplify __pci_assign_resource() coding style
PCI: Change pbus_size_mem() return values to be more conventional
PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources
PCI: Support BAR sizes up to 8GB
resources: Clarify sanity check message
PCI: Don't add disabled subtractive decode bus resources
PCI: Don't print anything while decoding is disabled
PCI: Don't set BAR to zero if dma_addr_t is too small
PCI: Don't convert BAR address to resource if dma_addr_t is too small
PCI: Reject BAR above 4GB if dma_addr_t is too small
PCI: Fail safely if we can't handle BARs larger than 4GB
x86/gart: Tidy messages and add bridge device info
x86/gart: Replace printk() with pr_info()
x86/PCI: Move pcibios_assign_resources() annotation to definition
x86/PCI: Mark ATI SBx00 HPET BAR as IORESOURCE_PCI_FIXED
x86/PCI: Don't try to move IORESOURCE_PCI_FIXED resources
x86/PCI: Fix Broadcom CNB20LE unintended sign extension
For a subtractive decode bridge, we previously added and printed all
resources of the primary bus, even if they were not valid. In the example
below, the bridge 00:1c.3 has no windows enabled, so there are no valid
resources on bus 02. But since 02:00.0 is subtractive decode bridge, we
add and print all those invalid resources, which don't really make sense:
pci 0000:00:1c.3: PCI bridge to [bus 02-03]
pci 0000:02:00.0: PCI bridge to [bus 03] (subtractive decode)
pci 0000:02:00.0: bridge window [??? 0x00000000 flags 0x0] (subtractive decode)
Add and print the subtractively-decoded resources only if they are valid.
There's an example in the dmesg log attached to the bugzilla below (but
this patch doesn't fix the bug reported there).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=73141
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the console is a PCI device, and we try to print to it while its
decoding is disabled, the system will hang. This particular printk hasn't
caused a problem yet, but it could, so this fixes it.
See also 0ff9514b57 ("PCI: Don't print anything while decoding is
disabled").
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If a BAR is above 4GB and our dma_addr_t is too small, don't clear the BAR
to zero: that doesn't disable the BAR, and it makes it more likely that the
BAR will conflict with things if we turn on the memory enable bit (as we
will at "out:" if the device was already enabled at the handoff).
We should also print the BAR info and its original size so we can follow
the process when we try to assign space to it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If dma_addr_t is too small to represent the BAR value,
pcibios_bus_to_resource() will fail, so just remember the BAR size directly
in the resource. The resource is already marked UNSET, so we know the
address isn't valid anyway.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We can only handle BARs above 4GB if dma_addr_t (not resource_size_t) is 64
bits wide. If we have a 64-bit resource_size_t and a 32-bit dma_addr_t,
we can't deal with BARs above 4GB.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We can only handle BARs larger than 4GB if both dma_addr_t and
resource_size_t are 64 bits wide. If dma_addr_t is 32 bits, we can't
represent all the bus addresses, and if resource_size_t is 32 bits, we
can't represent all the CPU addresses.
Previously we cleared res->flags (at "fail:") for resources that were too
large. That means we think the BAR doesn't exist at all, which in turn
means that we could enable the device even though we can't keep track of
where the BAR is and we can't make sure it doesn't overlap something else.
This preserves the type flags (MEM/IO) so we can keep from enabling the
device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If "pcie_bus_config == PCIE_BUS_PERFORMANCE", we don't initialize "smpss",
so we pass a pointer to garbage into pcie_bus_configure_set(), where we
compute "mps" based on the garbage. We then pass the garbage "mps" to
pcie_write_mps(), which ignores it in the PCIE_BUS_PERFORMANCE case.
Coverity isn't smart enough to deduce that we ignore the garbage (it's a
lot to expect from a human, too), so initialize "smpss" to a safe value in
all cases.
Found by Coverity (CID 146454).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI functions used to be marked __devinit. When CONFIG_HOTPLUG was
not set, these functions were discarded after boot. A few callers of these
__devinit functions were marked __ref to indicate that they could safely
call the __devinit functions even though the callers were not __devinit.
But CONFIG_HOTPLUG and __devinit are now gone, and the need for the __ref
annotations is also gone, so remove them. Relevant historical commits:
54b956b903 Remove __dev* markings from init.h
a8e4b9c101 PCI: add generic pci_hp_add_bridge()
0ab2b57f8d PCI: fix section mismatch warning in pci_scan_child_bus
451124a7cc PCI: fix 4x section mismatch warnings
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/resource: (26 commits)
Revert "[PATCH] Insert GART region into resource map"
PCI: Log IDE resource quirk in dmesg
PCI: Change pci_bus_alloc_resource() type_mask to unsigned long
PCI: Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region()
resources: Set type in __request_region()
PCI: Don't check resource_size() in pci_bus_alloc_resource()
s390/PCI: Use generic pci_enable_resources()
tile PCI RC: Use default pcibios_enable_device()
sparc/PCI: Use default pcibios_enable_device() (Leon only)
sh/PCI: Use default pcibios_enable_device()
microblaze/PCI: Use default pcibios_enable_device()
alpha/PCI: Use default pcibios_enable_device()
PCI: Add "weak" generic pcibios_enable_device() implementation
PCI: Don't enable decoding if BAR hasn't been assigned an address
PCI: Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit
PCI: Don't try to claim IORESOURCE_UNSET resources
PCI: Check IORESOURCE_UNSET before updating BAR
PCI: Don't clear IORESOURCE_UNSET when updating BAR
PCI: Mark resources as IORESOURCE_UNSET if we can't assign them
PCI: Remove pci_find_parent_resource() use for allocation
...
Make a note in dmesg when we overwrite legacy IDE BAR info. We previously
logged something like this:
pci 0000:00:1f.1: reg 0x10: [io 0x0000-0x0007]
and then silently overwrote the resource. There's an example in the
bugzilla below. This doesn't fix the bugzilla; it just makes what's going
on more obvious.
No functional change; merely adds some dev_info() calls.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=48451
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If we don't support 64-bit addresses, i.e., CONFIG_PHYS_ADDR_T_64BIT is not
set, we can't deal with BARs above 4GB. In this case we already pretend
the BAR contained zero; this patch also sets IORESOURCE_UNSET so we can try
to reallocate it later.
I don't think this is exactly correct: what we care about here are *bus*
addresses, not CPU addresses, so the tests of sizeof(resource_size_t)
probably should be on sizeof(dma_addr_t) instead. But this is what's been
in -next, so we'll fix that later.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When assigning a new bus number in pci_scan_bridge we check whether
max+1 is free by calling pci_find_bus. If it does already exist then we
assume that we are rescanning and that this is the right bus to scan.
This is fragile. If max+1 lies outside of bus->busn_res.end then we will
rescan some random bus from somewhere else in the hierachy. This patch
checks for this case and prints a warning.
[bhelgaas: add parent/child bus number info to dev_warn()]
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_scan_child_bus can (potentially) return a bus number higher than the
subordinate value of the child bus. Possible reasons are that bus numbers
are reserved for SR-IOV or for CardBus (SR-IOV is done without checks and
the CardBus checks are sketchy at best).
We clamp the returned value to the actual subordinate value and print a
warning if too many bus numbers are reserved.
[bhelgaas: whitespace]
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The function has no effect.
If pcibios_assign_all_busses() is not set then the function does nothing.
If it is set then in pci_scan_bridge we are always in the branch where
we assign the bus numbers ourselves and the subordinate values of all
parent busses will be set to 0xff since that is what they inherited from
their parent bus and ultimately from the root bus.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Right now we use 0xff for busn_res.end when probing and later reduce it to
the value that is actually used. This does not work if a parent bridge has
already a lower subordinate value. For example during hotplug of a new
bridge below an already-configured bridge the following message is printed
from pci_bus_insert_busn_res():
pci_bus 0000:06: busn_res: can not insert [bus 06-ff] under [bus 05-9b] (conflicts with (null) [bus 05-9b])
This patch clamps the bus range to that of the parent and also ensures that
we do not exceed the parents range when assigning the final subordinate
value.
We also check that busses configured by the firmware fit into their parents
bounds.
[bhelgaas: reword dev_warn() and fix printk format warning]
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If a conflict happens during insert_resource_conflict() and all conflicts
fit within the newly inserted resource then they will become children of
the new resource. This is almost certainly not what we want for bus
numbers.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Right now the CardBus code in pci_scan_bridge() is executed during both
passes. Since we always allocate the bus number ourselves it makes sense
to put it into the second pass.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Initially when we encountered a bus that was already present we skipped
it. Since 74710ded8e 'PCI: always scan child buses' we continue
scanning in order to allow user triggered rescans of already existing
busses.
The old comment suggested that the reason for continuing the scan is a
bug in the i450NX chipset. This is not the case.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch fixes two small issues:
- If pci_add_new_bus() fails, max must not be incremented. Otherwise
an incorrect value is returned from pci_scan_bridge().
- If the bus is already present, max must be incremented. I think
that this case should only be hit if we trigger a manual rescan of a
CardBus bridge.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Revert commit ef83b0781a "PCI: Remove from bus_list and release
resources in pci_release_dev()" that made some nasty race conditions
become possible. For example, if a Thunderbolt link is unplugged
and then replugged immediately, the pci_release_dev() resulting from
the hot-remove code path may be racing with the hot-add code path
which after that commit causes various kinds of breakage to happen
(up to and including a hard crash of the whole system).
Moreover, the problem that commit ef83b0781a attempted to address
cannot happen any more after commit 8a4c5c329d "PCI: Check parent
kobject in pci_destroy_dev()", because pci_destroy_dev() will now
return immediately if it has already been executed for the given
device.
Note, however, that the invocation of msi_remove_pci_irq_vectors()
removed by commit ef83b0781a from pci_free_resources() along with
the other changes made by it is not added back because of subsequent
code changes depending on that modification.
Fixes: ef83b0781a (PCI: Remove from bus_list and release resources in pci_release_dev())
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are multiple PCI device addition and removal code paths that may be
run concurrently with the generic PCI bus rescan and device removal that
can be triggered via sysfs. If that happens, it may lead to multiple
different, potentially dangerous race conditions.
The most straightforward way to address those problems is to run
the code in question under the same lock that is used by the
generic rescan/remove code in pci-sysfs.c. To prepare for those
changes, move the definition of the global PCI remove/rescan lock
to probe.c and provide global wrappers, pci_lock_rescan_remove()
and pci_unlock_rescan_remove(), allowing drivers to manipulate
that lock. Also provide pci_stop_and_remove_bus_device_locked()
for the callers of pci_stop_and_remove_bus_device() who only need
to hold the rescan/remove lock around it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Using 'make namespacecheck' identify code which should be declared static.
Checked for users in other driver/archs as well. Compile tested only.
This stops exporting the following interfaces to modules:
pci_target_state()
pci_load_saved_state()
[bhelgaas: retained pci_find_next_ext_capability() and pci_cfg_space_size()]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
My philosophy is unused code is dead code. And dead code is subject to bit
rot and is a likely source of bugs. Use it or lose it.
This removes this unused and deprecated interface:
alloc_pci_dev()
[bhelgaas: split to separate patch]
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/resource:
PCI: Allocate 64-bit BARs above 4G when possible
PCI: Enforce bus address limits in resource allocation
PCI: Split out bridge window override of minimum allocation address
agp/ati: Use PCI_COMMAND instead of hard-coded 4
agp/intel: Use CPU physical address, not bus address, for ioremap()
agp/intel: Use pci_bus_address() to get GTTADR bus address
agp/intel: Use pci_bus_address() to get MMADR bus address
agp/intel: Support 64-bit GMADR
agp/intel: Rename gtt_bus_addr to gtt_phys_addr
drm/i915: Rename gtt_bus_addr to gtt_phys_addr
agp: Use pci_resource_start() to get CPU physical address for BAR
agp: Support 64-bit APBASE
PCI: Add pci_bus_address() to get bus address of a BAR
PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev
PCI: Change pci_bus_region addresses to dma_addr_t
These interfaces:
pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)
took a pci_dev, but they really depend only on the pci_bus. And we want to
use them in resource allocation paths where we have the bus but not a
device, so this patch converts them to take the pci_bus instead of the
pci_dev:
pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)
In fact, with standard PCI-PCI bridges, they only depend on the host
bridge, because that's the only place address translation occurs, but
we aren't going that far yet.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously we removed the pci_dev from the bus_list and released its
resources in pci_destroy_dev(). But that's too early: it's possible to
call pci_destroy_dev() twice for the same device (e.g., via sysfs), and
that will cause an oops when we try to remove it from bus_list the second
time.
We should remove it from the bus_list only when the last reference to the
pci_dev has been released, i.e., in pci_release_dev().
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4f535093cf ("PCI: Put pci_dev in device tree as early as possible")
moved pci_proc_attach_device() from pci_bus_add_device() to
pci_device_add().
This moves it back to pci_bus_add_device(), essentially reverting that
part of 4f535093cf. This makes it symmetric with pci_stop_dev(),
where we call pci_proc_detach_device() and pci_remove_sysfs_dev_files()
and set dev->is_added = 0.
[bhelgaas: changelog, create sysfs then attach proc for symmetry]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.
Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/misc:
PCI: Remove unused PCI_MSIX_FLAGS_BIRMASK definition
PCI: acpiphp_ibm: Convert to dynamic debug
PCI: acpiphp: Convert to dynamic debug
PCI: Remove Intel Haswell D3 delays
PCI: Pass type, width, and prefetchability for window alignment
PCI: Document reason for using pci_is_root_bus()
PCI: Use pci_is_root_bus() to check for root bus
PCI: Remove unused "is_pcie" from pci_dev structure
PCI: Update pci_find_slot() description in pci.txt
[SCSI] qla2xxx: Use standard PCIe Capability Link register field names
PCI: Fix comment typo, remove unnecessary !! in pci_is_pcie()
PCI: Drop "setting latency timer" messages
No one uses "is_pcie" now; remove this obsolete member.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use pci_is_pcie() instead of pci_find_capability() to simplify code.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This branch contains mostly additions and changes to platform enablement
and SoC-level drivers. Since there's sometimes a dependency on device-tree
changes, there's also a fair amount of those in this branch.
Pieces worth mentioning are:
- Mbus driver for Marvell platforms, allowing kernel configuration
and resource allocation of on-chip peripherals.
- Enablement of the mbus infrastructure from Marvell PCI-e drivers.
- Preparation of MSI support for Marvell platforms.
- Addition of new PCI-e host controller driver for Tegra platforms
- Some churn caused by sharing of macro names between i.MX 6Q and 6DL
platforms in the device tree sources and header files.
- Various suspend/PM updates for Tegra, including LP1 support.
- Versatile Express support for MCPM, part of big little support.
- Allwinner platform support for A20 and A31 SoCs (dual and quad Cortex-A7)
- OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
The code that touches other architectures are patches moving
MSI arch-specific functions over to weak symbols and removal of
ARCH_SUPPORTS_MSI, acked by PCI maintainers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSKhYmAAoJEIwa5zzehBx322AP/1ONYs8o8f7/Gzq6lZvTN6T3
0pBTApg6Jfioi3lwKvUAEIcsW82YKQ+UZkbW66GQH6+Ri4aZJKZHuz0+JPU67OJ4
LtSLuzVWrymy2VOOUvAnS/SXkOZw/pHhU4cLNHn1dMndhUL1Uqp9/XwuiHEQyFsP
uOkpcBtIu0EWElov0PKKZ5SWBg8JJs2vy5ydiViGelWHCrZvDDZkWzIsDcBQxJLQ
juzT4+JE+KOu7vKmfw78o6iHoCS2TBRAN9YUCajRb8Wl+out1hrTahHnDWaZ5Mce
EskcQNkJROqFbjD4k3ABN4XGTv2VDmrztIwFe0SEQ7Dz/9ypCrBGT69uI9xIqTXr
GwVRIwAUFTpMupK0gy93z1ajV3N0CXV79out9+jQNUQybYE+czp8QOyhmuc1tZx0
8fn9jlBQe9Vy6yrs39gEcE7nUwrayeyQ+6UvqqwsE2pWZabNAnCMSPX5+QIu+T/3
tQ7+jYmfFeserp1sIDOHOnxfhtW9EI6U9d1h/DUCwrsuFdkL9ha4M/vh9Pwgye98
tBdz0T4yE39AJQwwFWRkv1jcQKcGu6WqJanmvS4KRBksGwuLWxy+ewOnkz2ifS25
ZYSyxAryZRBvQRqlOK11rXPfRcbGcY0MG9lkKX96rGcyWEizgE1DdjxXD8HoIleN
R8heV6GX5OzlFLGX2tKK
=fJ5x
-----END PGP SIGNATURE-----
Merge tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform changes from Olof Johansson:
"This branch contains mostly additions and changes to platform
enablement and SoC-level drivers. Since there's sometimes a
dependency on device-tree changes, there's also a fair amount of
those in this branch.
Pieces worth mentioning are:
- Mbus driver for Marvell platforms, allowing kernel configuration
and resource allocation of on-chip peripherals.
- Enablement of the mbus infrastructure from Marvell PCI-e drivers.
- Preparation of MSI support for Marvell platforms.
- Addition of new PCI-e host controller driver for Tegra platforms
- Some churn caused by sharing of macro names between i.MX 6Q and 6DL
platforms in the device tree sources and header files.
- Various suspend/PM updates for Tegra, including LP1 support.
- Versatile Express support for MCPM, part of big little support.
- Allwinner platform support for A20 and A31 SoCs (dual and quad
Cortex-A7)
- OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
The code that touches other architectures are patches moving MSI
arch-specific functions over to weak symbols and removal of
ARCH_SUPPORTS_MSI, acked by PCI maintainers"
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (266 commits)
tegra-cpuidle: provide stub when !CONFIG_CPU_IDLE
PCI: tegra: replace devm_request_and_ioremap by devm_ioremap_resource
ARM: tegra: Drop ARCH_SUPPORTS_MSI and sort list
ARM: dts: vf610-twr: enable i2c0 device
ARM: dts: i.MX51: Add one more I2C2 pinmux entry
ARM: dts: i.MX51: Move pins configuration under "iomuxc" label
ARM: dtsi: imx6qdl-sabresd: Add USB OTG vbus pin to pinctrl_hog
ARM: dtsi: imx6qdl-sabresd: Add USB host 1 VBUS regulator
ARM: dts: imx27-phytec-phycore-som: Enable AUDMUX
ARM: dts: i.MX27: Disable AUDMUX in the template
ARM: dts: wandboard: Add support for SDIO bcm4329
ARM: i.MX5 clocks: Remove optional clock setup (CKIH1) from i.MX51 template
ARM: dts: imx53-qsb: Make USBH1 functional
ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module
ARM i.MX6Q: dts: Enable SPI NOR flash on Phytec phyFLEX-i.MX6 Ouad module
ARM: dts: imx6qdl-sabresd: Add touchscreen support
ARM: imx: add ocram clock for imx53
ARM: dts: imx: ocram size is different between imx6q and imx6dl
ARM: dts: imx27-phytec-phycore-som: Fix regulator settings
ARM: dts: i.MX27: Remove clock name from CPU node
...
Pull networking changes from David Miller:
"Noteworthy changes this time around:
1) Multicast rejoin support for team driver, from Jiri Pirko.
2) Centralize and simplify TCP RTT measurement handling in order to
reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
both timestamps and local RTT measurements are available prefer
the later because there are broken middleware devices which
scramble the timestamp.
From Yuchung Cheng.
3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
memory consumed to queue up unsend user data. From Eric Dumazet.
4) Add a "physical port ID" abstraction for network devices, from
Jiri Pirko.
5) Add a "suppress" operation to influence fib_rules lookups, from
Stefan Tomanek.
6) Add a networking development FAQ, from Paul Gortmaker.
7) Extend the information provided by tcp_probe and add ipv6 support,
from Daniel Borkmann.
8) Use RCU locking more extensively in openvswitch data paths, from
Pravin B Shelar.
9) Add SCTP support to openvswitch, from Joe Stringer.
10) Add EF10 chip support to SFC driver, from Ben Hutchings.
11) Add new SYNPROXY netfilter target, from Patrick McHardy.
12) Compute a rate approximation for sending in TCP sockets, and use
this to more intelligently coalesce TSO frames. Furthermore, add
a new packet scheduler which takes advantage of this estimate when
available. From Eric Dumazet.
13) Allow AF_PACKET fanouts with random selection, from Daniel
Borkmann.
14) Add ipv6 support to vxlan driver, from Cong Wang"
Resolved conflicts as per discussion.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
openvswitch: Fix alignment of struct sw_flow_key.
netfilter: Fix build errors with xt_socket.c
tcp: Add missing braces to do_tcp_setsockopt
caif: Add missing braces to multiline if in cfctrl_linkup_request
bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
vxlan: Fix kernel panic on device delete.
net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
icplus: Use netif_running to determine device state
ethernet/arc/arc_emac: Fix huge delays in large file copies
tuntap: orphan frags before trying to set tx timestamp
tuntap: purge socket error queue on detach
qlcnic: use standard NAPI weights
ipv6:introduce function to find route for redirect
bnx2x: VF RSS support - VF side
bnx2x: VF RSS support - PF side
vxlan: Notify drivers for listening UDP port changes
net: usbnet: update addr_assign_type if appropriate
driver/net: enic: update enic maintainers and driver
driver/net: enic: Exposing symbols for Cisco's low latency driver
...
* pci/misc:
PCI: Remove pcie_cap_has_devctl()
PCI: Support PCIe Capability Slot registers only for ports with slots
PCI: Remove PCIe Capability version checks
PCI: Allow PCIe Capability link-related register access for switches
PCI: Add offsets of PCIe capability registers
PCI: Tidy bitmasks and spacing of PCIe capability definitions
PCI: Remove obsolete comment reference to pci_pcie_cap2()
PCI: Clarify PCI_EXP_TYPE_PCI_BRIDGE comment
PCI: Rename PCIe capability definitions to follow convention
PCI: Disable decoding for BAR sizing only when it was actually enabled
PCI: Add comment about needing pci_msi_off() even when CONFIG_PCI_MSI=n
PCI: Add pcibios_pm_ops for optional arch-specific hibernate functionality