Commit Graph

11835 Commits

Author SHA1 Message Date
Bjorn Helgaas
a80b04dffe Merge branch 'pci/controller/vmd'
- Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so
  pci_ops.read() will never sleep, even on PREEMPT_RT where spinlock_t
  becomes a sleepable lock (Ryo Takakura)

* pci/controller/vmd:
  PCI: vmd: Make vmd_dev::cfg_lock a raw_spinlock_t type
2025-03-27 13:14:51 -05:00
Bjorn Helgaas
6547faa1bc Merge branch 'pci/controller/qcom'
- Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as
  RESERVED (Manivannan Sadhasivam)

- Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry
  Baryshkov)

- Make DT iommu property required for SA8775P and prohibited for SDX55
  (Dmitry Baryshkov)

- Add DT iommu and DMA-related properties for Qualcomm SM8450 (Dmitry
  Baryshkov)

- Consolidate DMA vs non-DMA cases in DT (Dmitry Baryshkov)

- Add endpoint DT properties for SAR2130P and enable endpoint mode in
  driver (Dmitry Baryshkov)

* pci/controller/qcom:
  PCI: qcom-ep: Enable EP mode support for SAR2130P
  dt-bindings: PCI: qcom-ep: Add SAR2130P compatible
  dt-bindings: PCI: qcom-ep: Consolidate DMA vs non-DMA cases
  dt-bindings: PCI: qcom-ep: Enable DMA for SM8450
  dt-bindings: PCI: qcom-ep: Describe optional IOMMU
  dt-bindings: PCI: qcom-ep: Describe optional dma-coherent property
  PCI: qcom-ep: Mark BAR0/BAR2 as 64bit BARs and BAR1/BAR3 as RESERVED
2025-03-27 13:14:51 -05:00
Bjorn Helgaas
d7f6f07ece Merge branch 'pci/controller/mediatek'
- Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo Bianconi)

- Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and program
  host bridge memory aperture to this syscon node (Lorenzo Bianconi)

* pci/controller/mediatek:
  PCI: mediatek-gen3: Fix inconsistent indentation
  PCI: mediatek-gen3: Configure PBUS_CSR registers for EN7581 SoC
  dt-bindings: PCI: mediatek-gen3: Add mediatek,pbus-csr phandle array property
  PCI: mediatek-gen3: Remove leftover mac_reset assert for Airoha EN7581 SoC
2025-03-27 13:14:50 -05:00
Bjorn Helgaas
5edeea2d7b Merge branch 'pci/controller/layerscape'
- Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg")
  arg_count to fix probe failure on LS1043A (Ioana Ciornei)

* pci/controller/layerscape:
  PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args()
2025-03-27 13:14:50 -05:00
Bjorn Helgaas
f2d4def0e9 Merge branch 'pci/controller/j721e'
- Correct the 'link down' interrupt bit for J784S4 (Siddharth Vadapalli)

* pci/controller/j721e:
  PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4
2025-03-27 13:14:50 -05:00
Bjorn Helgaas
ad49cd490e Merge branch 'pci/controller/imx6'
- Identify the second controller on i.MX8MQ based on devicetree
  'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu)

- Use devm_clk_bulk_get_all() to fetch clocks to simplify the code (Richard
  Zhu)

* pci/controller/imx6:
  PCI: imx6: Use devm_clk_bulk_get_all() to fetch clocks
  PCI: imx6: Identify controller via 'linux,pci-domain', not address
2025-03-27 13:14:49 -05:00
Bjorn Helgaas
8c6dadf8af Merge branch 'pci/controller/hyperv'
- Correct comment to say that invalidations from a PF driver are delivered
  to the VF endpoint driver, not by the VF driver (Easwar Hariharan)

* pci/controller/hyperv:
  PCI: hv: Correct a comment
2025-03-27 13:14:49 -05:00
Bjorn Helgaas
58746a573a Merge branch 'pci/controller/histb'
- Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe
  JAILLET)

* pci/controller/histb:
  PCI: histb: Fix an error handling path in histb_pcie_probe()
2025-03-27 13:14:49 -05:00
Bjorn Helgaas
ba4751ae1a Merge branch 'pci/controller/dwc'
- Move struct dwc_pcie_vsec_id to include/linux/pcie-dwc.h, where it can be
  shared by debugfs, perf, sysfs, etc (Manivannan Sadhasivam)

- Add dw_pcie_find_vsec_capability() to locate Vendor Specific Extended
  Capabilities (Shradha Todi)

- Add debugfs-based Silicon Debug, Error Injection, Statistical Counter
  support for DWC (Shradha Todi)

- Add debugfs property to expose LTSSM status of DWC PCIe link (Hans Zhang)

- Add Rockchip Vendor ID and Vendor Specific ID of RAS DES Capability so
  the DWC debugfs features work for Rockchip as well (Niklas Cassel)

* pci/controller/dwc:
  PCI: dw-rockchip: Hide broken ATS capability for RK3588 running in EP mode
  PCI: dwc: ep: Add dw_pcie_ep_hide_ext_capability()
  PCI: dwc: ep: Return -ENOMEM for allocation failures
  PCI: dwc: Add Rockchip to the RAS DES allowed vendor list
  PCI: Add Rockchip Vendor ID
  PCI: dwc: Add debugfs property to provide LTSSM status of the PCIe link
  PCI: dwc: Add debugfs based Statistical Counter support for DWC
  PCI: dwc: Add debugfs based Error Injection support for DWC
  PCI: dwc: Add debugfs based Silicon Debug support for DWC
  PCI: dwc: Add helper to find the Vendor Specific Extended Capability (VSEC)
  perf/dwc_pcie: Move common DWC struct definitions to 'pcie-dwc.h'
2025-03-27 13:14:49 -05:00
Bjorn Helgaas
479e4a014b Merge branch 'pci/controller/cadence'
- Correct MSG TLP generation so endpoint can generate INTx messages (Hans
  Zhang)

* pci/controller/cadence:
  PCI: cadence-ep: Fix the driver to send MSG TLP for INTx without data payload
2025-03-27 13:14:48 -05:00
Bjorn Helgaas
b79789646e Merge branch 'pci/controller/brcmstb'
- Add missing of_node refcount release after of_parse_phandle() (Stanimir
  Varbanov)

- Add BCM2712 MSI-X DT binding and interrupt controller drivers (Stanimir
  Varbanov)

- Add brcmstb softdep on irq_bcm2712_mip MIP MSI-X interrupt controller
  driver to ensure that it is loaded first (Stanimir Varbanov)

- Add struct brcm_pcie pointer to pcie_cfg_data so we can reference the
  pcie_cfg_data directly instead of copying it to brcm_pcie (Stanimir
  Varbanov)

- Expand inbound window map to 64GB so it can accommodate BCM2712 (Stanimir
  Varbanov)

- Add BCM2712 support and DT updates (Stanimir Varbanov)

- Apply link speed restriction before bringing link up, not after (Jim
  Quinlan)

- Update Max Link Speed in Link Capabilities via the internal writable
  register, not the read-only config register (Jim Quinlan)

- Handle regulator_bulk_get() error to avoid panic when we call
  regulator_bulk_free() later (Jim Quinlan)

- Disable regulators only when removing the bus immediately below a Root
  Port because we don't support regulators deeper in the hierarchy (Jim
  Quinlan)

- Consistently use config access index/data register offsets from the
  SoC-specific pcie_offsets[] table (Jim Quinlan)

- Update MDIO register fields that reduced CMD from 12 bits to 1 and
  widened PORT from 4 bits to 5 and split it into two parts (Jim Quinlan)

- Make const read-only arrays static (Colin Ian King)

* pci/controller/brcmstb:
  PCI: brcmstb: Make const read-only arrays static
  PCI: brcmstb: Make irq_domain_set_info() parameter cast explicit
  PCI: brcmstb: Make two changes in MDIO register fields
  PCI: brcmstb: Use same constant table for config space access
  PCI: brcmstb: Fix potential premature regulator disabling
  PCI: brcmstb: Fix error path after a call to regulator_bulk_get()
  PCI: brcmstb: Do not assume that register field starts at LSB
  PCI: brcmstb: Use internal register to change link capability
  PCI: brcmstb: Set generation limit before PCIe link up
  PCI: brcmstb: Add BCM2712 support
  PCI: brcmstb: Expand inbound window size up to 64GB
  PCI: brcmstb: Reuse pcie_cfg_data structure
  PCI: brcmstb: Add a softdep to MIP MSI-X driver
  irqchip: Add Broadcom BCM2712 MSI-X interrupt controller
  dt-bindings: PCI: brcmstb: Update bindings for PCIe on BCM2712
  dt-bindings: interrupt-controller: Add BCM2712 MSI-X bindings
  PCI: brcmstb: Fix missing of_node_put() in brcm_pcie_probe()
2025-03-27 13:14:48 -05:00
Bjorn Helgaas
c51638f15e Merge branch 'pci/controller/amd-mdb'
- Add DT binding and driver for AMD MDB (Multimedia DMA Bridge)
  (Thippeswamy Havalige)

* pci/controller/amd-mdb:
  PCI: amd-mdb: Add AMD MDB Root Port driver
  dt-bindings: PCI: amd-mdb: Add AMD Versal2 MDB PCIe Root Port Bridge
  dt-bindings: PCI: dwc: Add AMD Versal2 MDB SLCR support
2025-03-27 13:14:48 -05:00
Bjorn Helgaas
17dbd3f621 Merge branch 'pci/controller/altera'
- Add DT binding for Agilex family (P-Tile, F-Tile, R-Tile) (Matthew
  Gerlach)

- Add driver support for Agilex family (P-Tile, F-Tile, R-Tile) (D M,
  Sharath Kumar)

* pci/controller/altera:
  PCI: altera: Add Agilex support
  dt-bindings: PCI: altera: Add binding for Agilex
2025-03-27 13:14:47 -05:00
Bjorn Helgaas
8085db1d07 Merge branch 'pci/scoped-cleanup'
- Use for_each_available_child_of_node_scoped() to simplify apple, kirin,
  mediatek, mt7621, tegra drivers (Zhang Zekun)

* pci/scoped-cleanup:
  PCI: tegra: Use helper function for_each_child_of_node_scoped()
  PCI: apple: Use helper function for_each_child_of_node_scoped()
  PCI: mt7621: Use helper function for_each_available_child_of_node_scoped()
  PCI: mediatek: Use helper function for_each_available_child_of_node_scoped()
  PCI: kirin: Tidy up _probe() related function with dev_err_probe()
  PCI: kirin: Use helper function for_each_available_child_of_node_scoped()
2025-03-27 13:14:47 -05:00
Bjorn Helgaas
f775c8a4bb Merge branch 'pci/epf-mhi'
- Update SA8775P device ID (Mrinmay Sarkar)

* pci/epf-mhi:
  PCI: epf-mhi: Update device ID for SA8775P
2025-03-27 13:14:47 -05:00
Bjorn Helgaas
cc28c0e5e7 Merge branch 'pci/endpoint-test'
- Fix endpoint BAR testing so the test can skip disabled BARs instead of
  reporting them as failures (Niklas Cassel)

- Verify that pci_endpoint interrupt tests set the correct IRQ type
  (Kunihiko Hayashi)

- Fix interpretation of pci_endpoint_test_bars_read_bar() error returns
  (Niklas Cassel)

- Fix potential string truncation in pci_endpoint_test_probe() (Niklas
  Cassel)

- Increase endpoint test BAR size variable to accommodate BARs larger than
  INT_MAX (Niklas Cassel)

- Release IRQs to avoid leak in pci_endpoint interrupt tests (Kunihiko
  Hayashi)

- Log the correct IRQ type when pci_endpoint IRQ request test fails
  (Kunihiko Hayashi)

- Remove pci_endpoint_test irq_type and no_msi globals; instead use
  test->irq_type (Kunihiko Hayashi)

- Remove unnecessary use of managed IRQ functions in pci_endpoint_test
  (Kunihiko Hayashi)

- Add and use IRQ_TYPE_* defines in pci_endpoint_test (Niklas Cassel)

- Add struct pci_epc_features.intx_capable and note that RK3568 and RK3588
  can't raise INTx interrupts (Niklas Cassel)

- Expose supported IRQ types in CAPS so pci_endpoint_test can set
  appropriate type (Niklas Cassel)

- Add PCITEST_IRQ_TYPE_AUTO to pci_endpoint_test for cases where the IRQ
  type doesn't matter (Niklas Cassel)

* pci/endpoint-test:
  misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO
  PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
  PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
  PCI: endpoint: Add intx_capable to epc_features struct
  selftests: pci_endpoint: Use IRQ_TYPE_* defines from UAPI header
  misc: pci_endpoint_test: Use IRQ_TYPE_* defines from UAPI header
  PCI: endpoint: pcitest: Add IRQ_TYPE_* defines to UAPI header
  misc: pci_endpoint_test: Do not use managed IRQ functions
  misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi'
  misc: pci_endpoint_test: Fix 'irq_type' to convey the correct type
  misc: pci_endpoint_test: Fix displaying 'irq_type' after 'request_irq' error
  misc: pci_endpoint_test: Avoid issue of interrupts remaining after request_irq error
  misc: pci_endpoint_test: Handle BAR sizes larger than INT_MAX
  misc: pci_endpoint_test: Give disabled BARs a distinct error code
  misc: pci_endpoint_test: Fix potential truncation in pci_endpoint_test_probe()
  misc: pci_endpoint_test: Fix pci_endpoint_test_bars_read_bar() error handling
  selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test
  selftests: pci_endpoint: Skip disabled BARs
2025-03-27 13:14:46 -05:00
Bjorn Helgaas
a113afb84a Merge branch 'pci/endpoint'
- Convert PCI device data so pci-epf-test works correctly on big-endian
  endpoint systems (Niklas Cassel)

- Add BAR_RESIZABLE type to endpoint framework (Niklas Cassel)

- Add pci_epc_bar_size_to_rebar_cap() to convert a size to the Resizable
  BAR Capability so endpoint drivers can configure what the Capability
  register advertises (Niklas Cassel)

- Add DWC core support for EPF drivers to set BAR_RESIZABLE type and size
  via dw_pcie_ep_set_bar() (Niklas Cassel)

- Describe TI AM65x (keystone) BARs 2 and 5 as Resizable, not Fixed
  (Niklas Cassel)

- Reduce TI AM65x (keystone) BAR alignment requirement from 1MB to 64KB
  (Niklas Cassel)

- Describe Rockchip rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas
  Cassel)

- Drop unused devm_pci_epc_destroy() (Zijun Hu)

- Fix pci-epf-test double free that causes an oops if the host reboots and
  PERST# deassertion restarts endpoint BAR allocation (Christian Bruel)

- Drop dw_pcie_ep_find_ext_capability() and use
  dw_pcie_find_ext_capability() instead (Niklas Cassel)

* pci/endpoint:
  PCI: dwc: ep: Remove superfluous function dw_pcie_ep_find_ext_capability()
  PCI: endpoint: pci-epf-test: Fix double free that causes kernel to oops
  PCI: endpoint: Remove unused devm_pci_epc_destroy()
  PCI: dw-rockchip: Describe Resizable BARs as Resizable BARs
  PCI: keystone: Specify correct alignment requirement
  PCI: keystone: Describe Resizable BARs as Resizable BARs
  PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs
  PCI: dwc: ep: Move dw_pcie_ep_find_ext_capability()
  PCI: endpoint: Add pci_epc_bar_size_to_rebar_cap()
  PCI: endpoint: Allow EPF drivers to configure the size of Resizable BARs
  PCI: endpoint: pci-epf-test: Handle endianness properly
2025-03-27 13:14:46 -05:00
Bjorn Helgaas
a1aed6b34f Merge branch 'pci/devtree-create'
- Add device_add_of_node() to set dev->of_node and dev->fwnode only if they
  haven't been set already (Herve Codina)

- Allow of_pci_set_address() to set the DT address property for root bus
  nodes, where there is no PCI bridge to supply the PCI bus/device/function
  part of the property (Herve Codina)

- Create DT nodes for PCI host bridges to enable loading device tree
  overlays to create platform devices for PCI devices that have several
  features that require multiple drivers (Herve Codina)

* pci/devtree-create:
  PCI: of: Create device tree PCI host bridge node
  PCI: of_property: Constify parameter in of_pci_get_addr_flags()
  PCI: of_property: Add support for NULL pdev in of_pci_set_address()
  PCI: of: Use device_{add,remove}_of_node() to attach of_node to existing device
  driver core: Introduce device_{add,remove}_of_node()
2025-03-27 13:14:45 -05:00
Bjorn Helgaas
38d42a6612 Merge branch 'pci/resource'
- Use pci_resource_n() to simplify BAR/window resource lookup (Ilpo
  Järvinen)

- Fix typo that repeatedly distributed resources to a bridge instead of
  iterating over subordinate bridges, which resulted in too little space to
  assign some BARs (Kai-Heng Feng)

- Relax bridge window tail sizing for optional resources, e.g., IOV BARs,
  to avoid failures when removing and re-adding devices (Ilpo Järvinen)

- Fix a double counting error for I/O resources, as we previously did for
  memory resources (Ilpo Järvinen)

- Use resource_set_{range,size}() helpers in more places (Ilpo Järvinen)

- Add pci_resource_is_iov() to identify IOV resources (Ilpo Järvinen)

- Add pci_resource_num() to look up the BAR number from the resource
  pointer (Ilpo Järvinen)

- Add restore_dev_resource() to simplify code that resources saved device
  resources (Ilpo Järvinen)

- Allow drivers to enable devices even if we haven't assigned optional IOV
  resources to them (Ilpo Järvinen)

- Improve debug output during resource reallocation (Ilpo Järvinen)

- Rework handling of optional resources (IOV BARs, ROMs) to reduce failures
  if we can't allocate them (Ilpo Järvinen)

- Move declarations of pci_rescan_bus_bridge_resize(),
  pci_reassign_bridge_resources(), and CardBus-related sizes from
  include/linux/pci.h to drivers/pci/pci.h since they're not used outside
  the PCI core (Ilpo Järvinen)

- Make pci_setup_bridge() static (Ilpo Järvinen)

- Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory)

- Fix s390 mmio_read/write syscalls, which didn't cause page faults in some
  cases, which broke vfio-pci lazy mapping on first access (Niklas
  Schnelle)

- Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was
  disabled only for s390 (Niklas Schnelle)

- Support mmap of PCI resources on s390 except for ISM devices (Niklas
  Schnelle)

* pci/resource:
  s390/pci: Support mmap() of PCI resources except for ISM devices
  s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAP
  s390/pci: Fix s390_mmio_read/write syscall page fault handling
  PCI: Fix NULL dereference in SR-IOV VF creation error path
  PCI: Move cardbus IO size declarations into pci/pci.h
  PCI: Make pci_setup_bridge() static
  PCI: Move resource reassignment func declarations into pci/pci.h
  PCI: Move pci_rescan_bus_bridge_resize() declaration to pci/pci.h
  PCI: Fix BAR resizing when VF BARs are assigned
  PCI: Do not claim to release resource falsely
  PCI: Increase Resizable BAR support from 512 GB to 128 TB
  PCI: Rework optional resource handling
  PCI: Perform reset_resource() and build fail list in sync
  PCI: Use res->parent to check if resource is assigned
  PCI: Add debug print when releasing resources before retry
  PCI: Indicate optional resource assignment failures
  PCI: Always have realloc_head in __assign_resources_sorted()
  PCI: Extend enable to check for any optional resource
  PCI: Add restore_dev_resource()
  PCI: Remove incorrect comment from pci_reassign_resource()
  PCI: Consolidate assignment loop next round preparation
  PCI: Rename retval to ret
  PCI: Use while loop and break instead of gotos
  PCI: Refactor pdev_sort_resources() & __dev_sort_resources()
  PCI: Converge return paths in __assign_resources_sorted()
  PCI: Add dev & res local variables to resource assignment funcs
  PCI: Add pci_resource_num() helper
  PCI: Check resource_size() separately
  PCI: Add pci_resource_is_iov() to identify IOV resources
  PCI: Use resource_set_{range,size}() helpers
  PCI: Use SZ_* instead of literals in setup-bus.c
  PCI: Fix old_size lower bound in calculate_iosize() too
  PCI: Allow relaxed bridge window tail sizing for optional resources
  PCI: Simplify size1 assignment logic
  PCI: Use min_align, not unrelated add_align, for size0
  PCI: Remove add_align overwrite unrelated to size0
  PCI: Use downstream bridges for distributing resources
  PCI: Cleanup dev->resource + resno to use pci_resource_n()
2025-03-27 13:14:45 -05:00
Bjorn Helgaas
a7a8e7996c Merge branch 'pci/reset'
- Log debug messages about reset methods being used (Bjorn Helgaas)

- Avoid reset when it has been disabled via sysfs (Nishanth Aravamudan)

* pci/reset:
  PCI: Avoid reset when disabled via sysfs
  PCI: Log debug messages about reset method
2025-03-27 13:14:45 -05:00
Bjorn Helgaas
55d25a101d Merge branch 'pci/pwrctrl'
- Create pwrctrl devices in pci_scan_device() to make it more symmetric
  with pci_pwrctrl_unregister() and make pwrctrl devices for PCI bridges
  possible (Manivannan Sadhasivam)

- Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc. can
  still access devices after pci_stop_dev() (Manivannan Sadhasivam)

- If there's a pwrctrl device for a PCI device, skip scanning it because
  the pwrctrl core will rescan the bus after the device is powered on
  (Manivannan Sadhasivam)

- Add a pwrctrl driver for PCI slots based on voltage regulators described
  via devicetree (Manivannan Sadhasivam)

* pci/pwrctrl:
  PCI/pwrctrl: Add pwrctrl driver for PCI slots
  dt-bindings: vendor-prefixes: Document the 'pciclass' prefix
  PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created
  PCI/pwrctrl: Move pci_pwrctrl_unregister() to pci_destroy_dev()
  PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()
2025-03-27 13:14:44 -05:00
Bjorn Helgaas
e91c25c6fc Merge branch 'pci/pm'
- Allow PCI bridges to go to D3Hot on all non-x86 systems (Manivannan
  Sadhasivam)

* pci/pm:
  PCI: Allow PCI bridges to go to D3Hot on all non-x86
2025-03-27 13:14:44 -05:00
Bjorn Helgaas
655ea930fe Merge branch 'pci/hotplug'
- Drop shpchp module init/exit logging (Ilpo Järvinen)

- Replace shpchp dbg() with ctrl_dbg() and remove unused dbg(), err(),
  info(), warn() wrappers (Ilpo Järvinen)

- Drop 'shpchp_debug' module parameter in favor of standard dynamic
  debugging (Ilpo Järvinen)

- Drop unused .get_power(), .set_power() function pointers (Guilherme
  Giacomo Simoes)

- Drop superfluous pci_hotplug_slot_list (Lukas Wunner)

- Drop superfluous try_module_get() calls (Lukas Wunner)

- Drop superfluous NULL pointer checks (Lukas Wunner)

- Pass struct hotplug_slot pointers directly to avoid backpointer
  dereferencing in has_*_file() (Lukas Wunner)

- Inline pci_hp_{create,remove}_module_link() to reduce exported symbols
  (Lukas Wunner)

- Disable hotplug interrupts in portdrv only when pciehp is not enabled to
  prevent issuing two hotplug commands too close together (Feng Tang)

- Skip pciehp 'device replaced' check if the device has been removed to
  address a common deadlock when resuming after a device was removed during
  system sleep (Lukas Wunner)

- Don't enable pciehp hotplug interupt when resuming in poll mode (Ilpo
  Järvinen)

* pci/hotplug:
  PCI: pciehp: Don't enable HPIE when resuming in poll mode
  PCI: pciehp: Avoid unnecessary device replacement check
  PCI/portdrv: Only disable pciehp interrupts early when needed
  PCI: hotplug: Inline pci_hp_{create,remove}_module_link()
  PCI: hotplug: Avoid backpointer dereferencing in has_*_file()
  PCI: hotplug: Drop superfluous NULL pointer checks in has_*_file()
  PCI: hotplug: Drop superfluous try_module_get() calls
  PCI: hotplug: Drop superfluous pci_hotplug_slot_list
  PCI: cpcihp: Remove unused .get_power() and .set_power()
  PCI: shpchp: Remove 'shpchp_debug' module parameter
  PCI: shpchp: Remove unused logging wrappers
  PCI: shpchp: Change dbg() -> ctrl_dbg()
  PCI: shpchp: Remove logging from module init/exit functions
2025-03-27 13:14:44 -05:00
Bjorn Helgaas
e9e224dadd Merge branch 'pci/enumeration'
- Enable Configuration RRS SV early instead of during child bus scanning
  (Bjorn Helgaas)

- Cache offset of Resizable BAR capability to avoid redundant searches for
  it (Bjorn Helgaas)

- Fix reference leaks in pci_register_host_bridge() and
  pci_alloc_child_bus() (Ma Ke)

- Drop put_device() in pci_register_host_bridge() left over from converting
  device_register() to device_add() (Dan Carpenter)

* pci/enumeration:
  PCI: Remove stray put_device() in pci_register_host_bridge()
  PCI: Fix reference leak in pci_alloc_child_bus()
  PCI: Fix reference leak in pci_register_host_bridge()
  PCI: Cache offset of Resizable BAR capability
  PCI: Enable Configuration RRS SV early
2025-03-27 13:14:43 -05:00
Bjorn Helgaas
651aa9052c Merge branch 'pci/doe'
- Rename DOE 'protocol' to 'feature' to follow spec terminology (Alistair
  Francis)

- Expose supported DOE features via sysfs (Alistair Francis)

- Allow DOE support to be enabled even if CXL isn't enabled (Alistair
  Francis)

* pci/doe:
  PCI/DOE: Allow enabling DOE without CXL
  PCI/DOE: Expose DOE features via sysfs
  PCI/DOE: Rename Discovery Response Data Object Contents to type
  PCI/DOE: Rename DOE protocol to feature
2025-03-27 13:14:43 -05:00
Bjorn Helgaas
67b9f18202 Merge branch 'pci/devres'
- Enlarge the devres table[] to accommodate bridge windows, ROM, IOV BARs,
  etc (Philipp Stanner)

- Validate BAR index in devres interfaces (Philipp Stanner)

* pci/devres:
  PCI: Check BAR index for validity
  PCI: Fix wrong length of devres array
2025-03-27 13:14:43 -05:00
Bjorn Helgaas
4d1a2a9244 Merge branch 'pci/bwctrl'
- Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the
  set_pcie_cooling_state.sh test case (Yi Lai)

- Fix the pcie_bwctrl_select_speed() return value in cases where a
  non-compliant device doesn't advertise valid supported speeds (Ilpo
  Järvinen)

- Avoid a NULL pointer dereference when we run out of bus numbers to assign
  for a bridge secondary bus (Lukas Wunner)

* pci/bwctrl:
  PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion
  PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
  selftests/pcie_bwctrl: Add 'set_pcie_speed.sh' to TEST_PROGS
2025-03-27 13:14:42 -05:00
Bjorn Helgaas
2cde6eb252 Merge branch 'pci/aspm'
- Delay pcie_link_state deallocation to avoid dangling pointers that cause
  invalid references during hot-unplug (Daniel Stodden)

* pci/aspm:
  PCI/ASPM: Fix link state exit during switch upstream function removal
2025-03-27 13:14:42 -05:00
Bjorn Helgaas
5a04a18b1a Merge branch 'pci/aer'
- Implement local aer_printk() since AER is the only place that prints a
  message with level depending on the error severity (Ilpo Järvinen)

* pci/aer:
  PCI/ERR: Handle TLP Log in Flit mode
  PCI: Track Flit Mode Status & print it with link status
  PCI/AER: Descope pci_printk() to aer_printk()
2025-03-27 13:14:42 -05:00
Ioana Ciornei
4c8c0ffd41 PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args()
The arg_count parameter to syscon_regmap_lookup_by_phandle_args()
represents the number of argument cells following the phandle. In this
case, the number of arguments should be 1 instead of 2 since the dt
property looks like this:

  fsl,pcie-scfg = <&scfg 0>;

Without this fix, layerscape-pcie fails with the following message on
LS1043A:

  OF: /soc/pcie@3500000: phandle scfg@1570000 needs 2, found 1
  layerscape-pcie 3500000.pcie: No syscfg phandle specified
  layerscape-pcie 3500000.pcie: probe with driver layerscape-pcie failed with error -22

Link: https://lore.kernel.org/r/20250327151949.2765193-1-ioana.ciornei@nxp.com
Fixes: 149fc35734 ("PCI: layerscape: Use syscon_regmap_lookup_by_phandle_args")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Roy Zang <Roy.Zang@nxp.com>
Cc: stable@vger.kernel.org
2025-03-27 13:11:14 -05:00
Linus Torvalds
336b4dae6d IOMMU Updates for Linux v6.15
Including:
 
 	- Core: IOMMUFD dependencies from Jason:
 	  - Change the iommufd fault handle into an always present hwpt handle in
 	    the domain
 	  - Give iommufd its own SW_MSI implementation along with some IRQ layer
 	    rework
 	  - Improvements to the handle attach API
 
 	- Core: Fixes for probe-issues from Robin
 
 	- Intel VT-d changes:
 	  - Checking for SVA support in domain allocation and attach paths
 	  - Move PCI ATS and PRI configuration into probe paths
 	  - Fix a pentential hang on reboot -f
 	  - Miscellaneous cleanups
 
 	- AMD-Vi changes:
 	  - Support for up to 2k IRQs per PCI device function
 	  - Set of smaller fixes
 
 	- ARM-SMMU changes:
 	  - SMMUv2 devicetree binding updates for Qualcomm implementations
 	    (QCS8300 GPU and MSM8937)
 	  - Clean up SMMUv2 runtime PM implementation to help with wider rework of
 	    pm_runtime_put_autosuspend()
 
 	- Rockchip driver changes:
 	  - Driver adjustments for recent DT probing changes
 
 	- S390 IOMMU changes:
 	  - Support for IOMMU passthrough
 
 	- Apple Dart changes:
 	  - Driver adjustments to meet ISP device requirements
 	  - Null-ptr deref fix
 	  - Disable subpage protection for DART 1
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmfkFSkACgkQK/BELZcB
 GuOVUg/9En4QlF0eeg4E1stszGcOOXn9IkKiGP9iqKpL/WqVVitoagn7nwq0rBFF
 PdqqcGbSPNHctiUursB18DyENOAmRUE8mGG29po9rAfq5wxZ2yA6eLmpFAf9QDCW
 vY3nz2oen1lUrzbPwVI1IR67L6BrtqEUz/1syf295RMr2yXkdmBXb12lbL1rdjv0
 7YfWwlDtMOVsewlZmCuDAAfWHtd7cO/ulDcYwq+GOiSXojVmKw8624EFlFbM9E9f
 tkceE4mjGQnN/CpZzUc48f1DgIW82PVVTek95Cznbk9ACDJQ4VYYwDMx2WszfMzu
 D1CXLiQu0vrxQgFdHn02bw3T4yvXQ4HeawIaTF+1J1gEHbrj6MoVQ88zaO3GnKFk
 yxYI5sfOs5iL6zSMkig4OytXuRmRqDVJhgrF1i4XIM3GXrRKf9EiRZ/1eXb+1gab
 3Q5k7+u4+7vWbaKLsxOdIBNzz02hb7LVndPIFlWmSnr1YY0LFBfwmgbfTAsSINZo
 22Ni8aE2C42lYOZxJ67m3khLpqTZaTERtQIab/gUd5FcuEjxKuHBIzj6SZeC/0Q9
 Y6rzklHxJxxtQLrUJp16IzjpiMJbWH0H2jKstRberOpxk3L5aLrJNd8M1B+CokmP
 LZeKgqgxV5F+VlR2Bap5AXL9ZEuqjCVDIEy79j8XzAzxobDVnVM=
 =a8Pd
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu updates from Joerg Roedel:
 "Core iommufd dependencies from Jason:
   - Change the iommufd fault handle into an always present hwpt handle
     in the domain
   - Give iommufd its own SW_MSI implementation along with some IRQ
     layer rework
   - Improvements to the handle attach API

  Core fixes for probe-issues from Robin

  Intel VT-d changes:
   - Checking for SVA support in domain allocation and attach paths
   - Move PCI ATS and PRI configuration into probe paths
   - Fix a pentential hang on reboot -f
   - Miscellaneous cleanups

  AMD-Vi changes:
   - Support for up to 2k IRQs per PCI device function
   - Set of smaller fixes

  ARM-SMMU changes:
   - SMMUv2 devicetree binding updates for Qualcomm implementations
     (QCS8300 GPU and MSM8937)
   - Clean up SMMUv2 runtime PM implementation to help with wider rework
     of pm_runtime_put_autosuspend()

  Rockchip driver changes:
   - Driver adjustments for recent DT probing changes

  S390 IOMMU changes:
   - Support for IOMMU passthrough

  Apple Dart changes:
   - Driver adjustments to meet ISP device requirements
   - Null-ptr deref fix
   - Disable subpage protection for DART 1"

* tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (54 commits)
  iommu/vt-d: Fix possible circular locking dependency
  iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changes
  iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabled
  iommu: apple-dart: fix potential null pointer deref
  iommu/rockchip: Retire global dma_dev workaround
  iommu/rockchip: Register in a sensible order
  iommu/rockchip: Allocate per-device data sensibly
  iommu/mediatek-v1: Support COMPILE_TEST
  iommu/amd: Enable support for up to 2K interrupts per function
  iommu/amd: Rename DTE_INTTABLEN* and MAX_IRQS_PER_TABLE macro
  iommu/amd: Replace slab cache allocator with page allocator
  iommu/amd: Introduce generic function to set multibit feature value
  iommu: Don't warn prematurely about dodgy probes
  iommu/arm-smmu: Set rpm auto_suspend once during probe
  dt-bindings: arm-smmu: Document QCS8300 GPU SMMU
  iommu: Get DT/ACPI parsing into the proper probe path
  iommu: Keep dev->iommu state consistent
  iommu: Resolve ops in iommu_init_device()
  iommu: Handle race with default domain setup
  iommu: Unexport iommu_fwspec_free()
  ...
2025-03-26 20:10:09 -07:00
Linus Torvalds
87dc996d56 A urgent fix for the XEN related PCI/MSI changes:
XEN used a global variable to disable the masking of MSI interrupts as
   XEN handles that on the hypervisor side. This turned out to be a problem
   with VMD as the PCI devices behind a VMD bridge are not always handled
   by the hypervisor and then require masking by guest.
 
   To solve this the global variable was replaced by a interrupt domain
   specific flag, which is set by the generic XEN PCI/MSI domain, but not by
   VMD or any other domain in the system.
 
   So far, so good. But the implementation (and the reviewer) missed the
   fact, that accessing the domain flag cannot be done directly because
   there are at least two situations, where this fails. Legacy architectures
   are not providing interrupt domains at all. The new MSI parent domains do
   not require to have a domain info pointer. Both cases result in a
   unconditional NULL pointer derefence.
 
   The PCI/MSI code already has a function to query the MSI domain specific
   flag in a safe way, which handles all possible cases of PCI/MSI backends.
 
   So the fix it simply to replace the open coded checks by invoking the
   safe helper to query the flag.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfkUA4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVymD/4xvcTnjjBQgesBWtkZ/wyvOx+zAoZD
 s6ikEnYVzG6jjiob9SXrttJS/fZDuFujmntvq1lp+Lw1MbK/k9qyeKpMUkjVw4V5
 gHgPUZUkKapk2Jf3vYy23cb8CuRp9nANi7r//kzefN/pyEmfuqme3AYNJjusyEyn
 QfhLGE046GI2PLLS+4FKUjk+/aPEk/Pywy3moaSpDVd8Kmsus5GTmxt/AWcwGfcw
 00TSRKstpew+RVLBb7NvSar0D43vtvP/8If6qkIpEttVhB6/4zTREB1z6iPjUSf0
 2+90iubCdSm8wO+y0JDmA13cUct2XqW19M75imSLBIsJAlVtgWgOi2IJF787vDPJ
 3bN2HeScCbl8+0e4Uv+NOi2c906k43MtC4XTrIq0nZjAuRyF9FEtdK+IrcywxuKH
 nuaHtlseGWg97kU49yDTi1k+tRvIpmsibKM4pKZoUZKNMfE2soCFjzj40O7hTcKJ
 LaN/fcDdPGnn+GMPywwIWc7e6kOIYnCqakom/x7e+KWpqfujnM3KLiyOy9iys/oL
 xXfKnMbbmL0CSdUrC7EJn20KFRA2ovAJ+ouzJRkFhCdERCVidY29O3VrstacbevS
 zQKb302PbFraGHa/MiuzyF0BuWuM0T0d12+2wEoVJpN/GTAbgO2vVbk2gvHEReXJ
 eMDn02f5lF659w==
 =ZPMY
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI irq fix from Thomas Gleixner:
 "An urgent fix for the XEN related PCI/MSI changes:

  XEN used a global variable to disable the masking of MSI interrupts as
  XEN handles that on the hypervisor side. This turned out to be a
  problem with VMD as the PCI devices behind a VMD bridge are not always
  handled by the hypervisor and then require masking by guest.

  To solve this the global variable was replaced by a interrupt domain
  specific flag, which is set by the generic XEN PCI/MSI domain, but not
  by VMD or any other domain in the system.

  So far, so good. But the implementation (and the reviewer) missed the
  fact, that accessing the domain flag cannot be done directly because
  there are at least two situations, where this fails.

  Legacy architectures are not providing interrupt domains at all. The
  new MSI parent domains do not require to have a domain info pointer.
  Both cases result in a unconditional NULL pointer derefence.

  The PCI/MSI code already has a function to query the MSI domain
  specific flag in a safe way, which handles all possible cases of
  PCI/MSI backends.

  So the fix it simply to replace the open coded checks by invoking the
  safe helper to query the flag"

* tag 'irq-urgent-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
2025-03-26 13:20:22 -07:00
Thomas Gleixner
3ece3e8e59 PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
The conversion of the XEN specific global variable pci_msi_ignore_mask to a
MSI domain flag, missed the facts that:

    1) Legacy architectures do not provide a interrupt domain
    2) Parent MSI domains do not necessarily have a domain info attached
   
Both cases result in an unconditional NULL pointer dereference. This was
unfortunatly missed in review and testing revealed it late.

Cure this by using the existing pci_msi_domain_supports() helper, which
handles all possible cases correctly.

Fixes: c3164d2e0d ("PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag")
Reported-by: Daniel Gomez <da.gomez@kernel.org>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Tested-by: Daniel Gomez <da.gomez@kernel.org>
Link: https://lore.kernel.org/all/87iknwyp2o.ffs@tglx
Closes: https://lore.kernel.org/all/qn7fzggcj6qe6r6gdbwcz23pzdz2jx64aldccmsuheabhmjgrt@tawf5nfwuvw7
2025-03-26 15:28:43 +01:00
Siddharth Vadapalli
d66b5b3362
PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4
Commit e49ad66781 ("PCI: j721e: Add TI J784S4 PCIe configuration")
assigned the value of .linkdown_irq_regfield for the J784S4 SoC as the
"LINK_DOWN" macro corresponding to BIT(1), and as a result, the Link
Down interrupts on J784S4 SoC are missed.

According to the Technical Reference Manual and Register Documentation
for the J784S4 SoC[1], BIT(1) corresponds to "ENABLE_SYS_EN_PCIE_DPA_1",
which is not the correct field for the link-state interrupt. Instead, it
is BIT(10) of the "PCIE_INTD_ENABLE_REG_SYS_2" register that corresponds
to the link-state field named as "ENABLE_SYS_EN_PCIE_LINK_STATE".

Thus, set .linkdown_irq_regfield to the macro "J7200_LINK_DOWN", which
expands to BIT(10) and was first defined for the J7200 SoC. Other SoCs
already reuse this macro since it accurately represents the "link-state"
field in their respective "PCIE_INTD_ENABLE_REG_SYS_2" register.

1: https://www.ti.com/lit/zip/spruj52

Fixes: e49ad66781 ("PCI: j721e: Add TI J784S4 PCIe configuration")
Cc: stable@vger.kernel.org
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
[kwilczynski: commit log, add a missing .linkdown_irq_regfield member
set to the J7200_LINK_DOWN macro to struct j7200_pcie_ep_data]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250305132018.2260771-1-s-vadapalli@ti.com
2025-03-26 07:06:12 +00:00
Niklas Cassel
7c3b54cf64
PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
Expose the supported IRQ types in the CAPS register.

This way, the host side driver (drivers/misc/pci_endpoint_test.c) can
know which IRQ types that the endpoint supports.

The host side driver will make use of this information in a follow-up
commit.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250310111016.859445-15-cassel@kernel.org
2025-03-26 06:11:52 +00:00
Niklas Cassel
e55c67837a
PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
Neither RK3568 or RK3588 supports INTx interrupts.

Since epc_features is zero initialized, this is strictly not needed.
However, setting intx_capable explicitly to false makes it more clear
that neither RK3568 or RK3588 supports INTx interrupts.

No functional change.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250310111016.859445-14-cassel@kernel.org
2025-03-26 06:11:49 +00:00
Linus Torvalds
7d20aa5c32 Power management updates for 6.15-rc1
- Manage sysfs attributes and boost frequencies efficiently from
    cpufreq core to reduce boilerplate code in drivers (Viresh Kumar).
 
  - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
    Dhananjay Ugwekar, Imran Shaik, zuoqian).
 
  - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
    Bai).
 
  - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
 
  - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng).
 
  - Optimize the amd-pstate driver to avoid cases where call paths end
    up calling the same writes multiple times and needlessly caching
    variables through code reorganization, locking overhaul and tracing
    adjustments (Mario Limonciello, Dhananjay Ugwekar).
 
  - Make it possible to avoid enabling capacity-aware scheduling (CAS) in
    the intel_pstate driver and relocate a check for out-of-band (OOB)
    platform handling in it to make it detect OOB before checking HWP
    availability (Rafael Wysocki).
 
  - Fix dbs_update() to avoid inadvertent conversions of negative integer
    values to unsigned int which causes CPU frequency selection to be
    inaccurate in some cases when the "conservative" cpufreq governor is
    in use (Jie Zhan).
 
  - Update the handling of the most recent idle intervals in the menu
    cpuidle governor to prevent useful information from being discarded
    by it in some cases and improve the prediction accuracy (Rafael
    Wysocki).
 
  - Make it possible to tell the intel_idle driver to ignore its built-in
    table of idle states for the given processor, clean up the handling
    of auto-demotion disabling on Baytrail and Cherrytrail chips in it,
    and update its MAINTAINERS entry (David Arcari, Artem Bityutskiy,
    Rafael Wysocki).
 
  - Make some cpuidle drivers use for_each_present_cpu() instead of
    for_each_possible_cpu() during initialization to avoid issues
    occurring when nosmp or maxcpus=0 are used (Jacky Bai).
 
  - Clean up the Energy Model handling code somewhat (Rafael Wysocki).
 
  - Use kfree_rcu() to simplify the handling of runtime Energy Model
    updates (Li RongQing).
 
  - Add an entry for the Energy Model framework to MAINTAINERS as
    properly maintained (Lukasz Luba).
 
  - Address RCU-related sparse warnings in the Energy Model code (Rafael
    Wysocki).
 
  - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
    when DEVFREQ is set without CPUFREQ so it can be used on a wider
    range of systems (Jeson Gao).
 
  - Unify error handling during runtime suspend and runtime resume in the
    core to help drivers to implement more consistent runtime PM error
    handling (Rafael Wysocki).
 
  - Drop a redundant check from pm_runtime_force_resume() and rearrange
    documentation related to __pm_runtime_disable() (Rafael Wysocki).
 
  - Rework the handling of the "smart suspend" driver flag in the PM core
    to avoid issues hat may occur when drivers using it depend on some
    other drivers and clean up the related PM core code (Rafael Wysocki,
    Colin Ian King).
 
  - Fix the handling of devices with the power.direct_complete flag set
    if device_suspend() returns an error for at least one device to avoid
    situations in which some of them may not be resumed (Rafael Wysocki).
 
  - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
    possible deadlock that may occur if the "compressor" hibernation
    module parameter is accessed during the registration of a new
    ieee80211 device (Lizhi Xu).
 
  - Suppress sleeping parent warning in device_pm_add() in the case when
    new children are added under a device with the power.direct_complete
    set after it has been processed by device_resume() (Xu Yang).
 
  - Remove needless return in three void functions related to system
    wakeup (Zijun Hu).
 
  - Replace deprecated kmap_atomic() with kmap_local_page() in the
    hibernation core code (David Reaver).
 
  - Remove unused helper functions related to system sleep (David Alan
    Gilbert).
 
  - Clean up s2idle_enter() so it does not lock and unlock CPU offline
    in vain and update comments in it (Ulf Hansson).
 
  - Clean up broken white space in dpm_wait_for_children() (Geert
    Uytterhoeven).
 
  - Update the cpupower utility to fix lib version-ing in it and memory
    leaks in error legs, remove hard-coded values, and implement CPU
    physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
    Khan, Yiwei Lin, Zhongqiu Han).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmfhhTYSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO16/gIAKuRiG1fFgUcUSXC1iFu42vrB/1i4wpA
 02GICACqM3K6/5jd3ct/WOU28GUgDs+xcmqH7CnMaM6y9nXEWjWarmSfFekAO+0q
 TPtQ7xTy0hBCB3he1P2uLKBJBin4Wn47U9/rvs4J7mQd5zDxTINKIiVoHg2lEE+s
 HAeSoNRb2sp5IZDm9+/LfhHNYRP1mJ97cbZlymqctGB3xgDL7qMLid/1+gFPHAQS
 4/LXj3IgyU8DpA/j5nhtpaAqjN5g2QxIUfQgADRIcESK99Y/7aAMs1/G0WhJKaay
 9yx+4/xmkGvVCZQx1DphksFLISEzltY0SFWLsoppPzBTGVEW2GQQsNI=
 =LqVy
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These are dominated by cpufreq updates which in turn are dominated by
  updates related to boost support in the core and drivers and
  amd-pstate driver optimizations.

  Apart from the above, there are some cpuidle updates including a
  rework of the most recent idle intervals handling in the venerable
  menu governor that leads to significant improvements in some
  performance benchmarks, as the governor is now more likely to predict
  a shorter idle duration in some cases, and there are updates of the
  core device power management code, mostly related to system suspend
  and resume, that should help to avoid potential issues arising when
  the drivers of devices depending on one another want to use different
  optimizations.

  There is also a usual collection of assorted fixes and cleanups,
  including removal of some unused code.

  Specifics:

   - Manage sysfs attributes and boost frequencies efficiently from
     cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)

   - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
     Dhananjay Ugwekar, Imran Shaik, zuoqian)

   - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
     Bai)

   - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)

   - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)

   - Optimize the amd-pstate driver to avoid cases where call paths end
     up calling the same writes multiple times and needlessly caching
     variables through code reorganization, locking overhaul and tracing
     adjustments (Mario Limonciello, Dhananjay Ugwekar)

   - Make it possible to avoid enabling capacity-aware scheduling (CAS)
     in the intel_pstate driver and relocate a check for out-of-band
     (OOB) platform handling in it to make it detect OOB before checking
     HWP availability (Rafael Wysocki)

   - Fix dbs_update() to avoid inadvertent conversions of negative
     integer values to unsigned int which causes CPU frequency selection
     to be inaccurate in some cases when the "conservative" cpufreq
     governor is in use (Jie Zhan)

   - Update the handling of the most recent idle intervals in the menu
     cpuidle governor to prevent useful information from being discarded
     by it in some cases and improve the prediction accuracy (Rafael
     Wysocki)

   - Make it possible to tell the intel_idle driver to ignore its
     built-in table of idle states for the given processor, clean up the
     handling of auto-demotion disabling on Baytrail and Cherrytrail
     chips in it, and update its MAINTAINERS entry (David Arcari, Artem
     Bityutskiy, Rafael Wysocki)

   - Make some cpuidle drivers use for_each_present_cpu() instead of
     for_each_possible_cpu() during initialization to avoid issues
     occurring when nosmp or maxcpus=0 are used (Jacky Bai)

   - Clean up the Energy Model handling code somewhat (Rafael Wysocki)

   - Use kfree_rcu() to simplify the handling of runtime Energy Model
     updates (Li RongQing)

   - Add an entry for the Energy Model framework to MAINTAINERS as
     properly maintained (Lukasz Luba)

   - Address RCU-related sparse warnings in the Energy Model code
     (Rafael Wysocki)

   - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
     when DEVFREQ is set without CPUFREQ so it can be used on a wider
     range of systems (Jeson Gao)

   - Unify error handling during runtime suspend and runtime resume in
     the core to help drivers to implement more consistent runtime PM
     error handling (Rafael Wysocki)

   - Drop a redundant check from pm_runtime_force_resume() and rearrange
     documentation related to __pm_runtime_disable() (Rafael Wysocki)

   - Rework the handling of the "smart suspend" driver flag in the PM
     core to avoid issues hat may occur when drivers using it depend on
     some other drivers and clean up the related PM core code (Rafael
     Wysocki, Colin Ian King)

   - Fix the handling of devices with the power.direct_complete flag set
     if device_suspend() returns an error for at least one device to
     avoid situations in which some of them may not be resumed (Rafael
     Wysocki)

   - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
     possible deadlock that may occur if the "compressor" hibernation
     module parameter is accessed during the registration of a new
     ieee80211 device (Lizhi Xu)

   - Suppress sleeping parent warning in device_pm_add() in the case
     when new children are added under a device with the
     power.direct_complete set after it has been processed by
     device_resume() (Xu Yang)

   - Remove needless return in three void functions related to system
     wakeup (Zijun Hu)

   - Replace deprecated kmap_atomic() with kmap_local_page() in the
     hibernation core code (David Reaver)

   - Remove unused helper functions related to system sleep (David Alan
     Gilbert)

   - Clean up s2idle_enter() so it does not lock and unlock CPU offline
     in vain and update comments in it (Ulf Hansson)

   - Clean up broken white space in dpm_wait_for_children() (Geert
     Uytterhoeven)

   - Update the cpupower utility to fix lib version-ing in it and memory
     leaks in error legs, remove hard-coded values, and implement CPU
     physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
     Khan, Yiwei Lin, Zhongqiu Han)"

* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
  PM: sleep: Fix bit masking operation
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  PM: sleep: Fix handling devices with direct_complete set on errors
  cpuidle: Init cpuidle only for present CPUs
  PM: clk: Remove unused pm_clk_remove()
  PM: sleep: core: Fix indentation in dpm_wait_for_children()
  PM: s2idle: Extend comment in s2idle_enter()
  PM: s2idle: Drop redundant locks when entering s2idle
  PM: sleep: Remove unused pm_generic_ wrappers
  cpufreq: tegra186: Share policy per cluster
  cpupower: Make lib versioning scheme more obvious and fix version link
  PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
  PM: EM: Address RCU-related sparse warnings
  cpupower: Implement CPU physical core querying
  pm: cpupower: remove hard-coded topology depth values
  pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
  ...
2025-03-25 15:00:18 -07:00
Linus Torvalds
dce3ab4c57 xen: branch for v6.15-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZ9/gEwAKCRCAXGG7T9hj
 vlxhAQCRzSCNI8wwvENnuc2OnRyWKy8gq7C5WAOIOJdJ3U+scQEAwKGhPJLwE4IS
 /JDh5PRJgZ4rdMYatuDfldEcSAfRRgw=
 =dF6Z
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - cleanup: remove an used function

 - add support for a XenServer specific virtual PCI device

 - fix the handling of a sparse Xen hypervisor symbol table

 - avoid warnings when building the kernel with gcc 15

 - fix use of devices behind a VMD bridge when running as a Xen PV dom0

* tag 'for-linus-6.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag
  PCI: vmd: Disable MSI remapping bypass under Xen
  xen/pci: Do not register devices with segments >= 0x10000
  xen/pciback: Remove unused pcistub_get_pci_dev
  xenfs/xensyms: respect hypervisor's "next" indication
  xen/mcelog: Add __nonstring annotations for unterminated strings
  xen: Add support for XenServer 6.1 platform device
2025-03-25 14:33:32 -07:00
Linus Torvalds
36f5f026df Updates for MSI interrupts
- Switch the MSI descriptor locking to guards
 
   - Replace the broken PCI/TPH implementation, which lacks any form of
     serialization against concurrent modifications with a properly
     serialized mechanism in the PCI/MSI core code.
 
   - Replace the MSI descriptor abuse in the SCSI/UFS Qualcom driver with
     dedicated driver internal storage.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmff5JcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYocp6D/9IUXX/8HsBVZwCBY1/SYAC8fnTUNGG
 fMxPTX0BSRNOaChvGkyIllEEkNbh+WDMh1qkV94fusJCRvQVaDnwJZ9zoTMJekiM
 ZEBXdqTiDyqUG5LzFeoYQFu/XBth59HMkaDqCxJEy+YbjnTNamX8QS8A/FZlPNH0
 uRs4m7oQxABL6E/JkjOkC0UOqddZ5BDwbLgXoGq+uDSLTicO9yR6EsigVacoreap
 zmkaUPRmL9NM8onrP2/9LduWwgcxn9u2wvd0uOCoIHBUKrlIpHkuNjieQ4R8QCP0
 b7INNTEnwv815O11EoU+AVoL+sTYjUVOAUzfeNejlUspwvjBASm5lUkyB2TS5nq7
 TtCxFyIS7/eBXwrKeV1w9ZcGjcso0DRJPvBQvhhL+q00Mtlrbpy7NUIF5S/8I60q
 QWLN1h1iKISl9aT5p7K1v1lnGyPTb942PNmRc9Cd8Si3QMt6be3RkAJekPWU78Wd
 Q9fXiMS4vyZhtt1S0vKU1L80Ezg3JVL6tWWpsSdZhMBM8wYiw1PI9NYtYEO2lzP/
 hDHOu4EaItPg5A6Z+IDOj13pddtQsnssiWJmcDvarnJKBOfmAbwOcNGpwcwCbm62
 bephtPc4h7En6/d4lsi8yzK6dSpf7yHLxTRrp9lomMaydEjfsNWcAs/nN2Y+wbr4
 UMuZDSrlTGHP0Q==
 =Vl8Q
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI irq updates from Thomas Gleixner:

 - Switch the MSI descriptor locking to guards

 - Replace the broken PCI/TPH implementation, which lacks any form of
   serialization against concurrent modifications with a properly
   serialized mechanism in the PCI/MSI core code

 - Replace the MSI descriptor abuse in the SCSI/UFS Qualcom driver with
   dedicated driver internal storage

* tag 'irq-msi-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi: Rename msi_[un]lock_descs()
  scsi: ufs: qcom: Remove the MSI descriptor abuse
  PCI/TPH: Replace the broken MSI-X control word update
  PCI/MSI: Provide a sane mechanism for TPH
  PCI: hv: Switch MSI descriptor locking to guard()
  PCI/MSI: Switch to MSI descriptor locking to guard()
  NTB/msi: Switch MSI descriptor locking to lock guard()
  soc: ti: ti_sci_inta_msi: Switch MSI descriptor locking to guard()
  genirq/msi: Use lock guards for MSI descriptor locking
  cleanup: Provide retain_ptr()
  genirq/msi: Make a few functions static
2025-03-25 09:15:17 -07:00
Linus Torvalds
e34c38057a [ Merge note: this pull request depends on you having merged
two locking commits in the locking tree,
 	      part of the locking-core-2025-03-22 pull request. ]
 
 x86 CPU features support:
   - Generate the <asm/cpufeaturemasks.h> header based on build config
     (H. Peter Anvin, Xin Li)
   - x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
   - Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
   - Enable modifying CPU bug flags with '{clear,set}puid='
     (Brendan Jackman)
   - Utilize CPU-type for CPU matching (Pawan Gupta)
   - Warn about unmet CPU feature dependencies (Sohil Mehta)
   - Prepare for new Intel Family numbers (Sohil Mehta)
 
 Percpu code:
   - Standardize & reorganize the x86 percpu layout and
     related cleanups (Brian Gerst)
   - Convert the stackprotector canary to a regular percpu
     variable (Brian Gerst)
   - Add a percpu subsection for cache hot data (Brian Gerst)
   - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
   - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)
 
 MM:
   - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction
     (Rik van Riel)
   - Rework ROX cache to avoid writable copy (Mike Rapoport)
   - PAT: restore large ROX pages after fragmentation
     (Kirill A. Shutemov, Mike Rapoport)
   - Make memremap(MEMREMAP_WB) map memory as encrypted by default
     (Kirill A. Shutemov)
   - Robustify page table initialization (Kirill A. Shutemov)
   - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
   - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
     (Matthew Wilcox)
 
 KASLR:
   - x86/kaslr: Reduce KASLR entropy on most x86 systems,
     to support PCI BAR space beyond the 10TiB region
     (CONFIG_PCI_P2PDMA=y) (Balbir Singh)
 
 CPU bugs:
   - Implement FineIBT-BHI mitigation (Peter Zijlstra)
   - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
   - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta)
   - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta)
 
 System calls:
   - Break up entry/common.c (Brian Gerst)
   - Move sysctls into arch/x86 (Joel Granados)
 
 Intel LAM support updates: (Maciej Wieczor-Retman)
   - selftests/lam: Move cpu_has_la57() to use cpuinfo flag
   - selftests/lam: Skip test if LAM is disabled
   - selftests/lam: Test get_user() LAM pointer handling
 
 AMD SMN access updates:
   - Add SMN offsets to exclusive region access (Mario Limonciello)
   - Add support for debugfs access to SMN registers (Mario Limonciello)
   - Have HSMP use SMN through AMD_NODE (Yazen Ghannam)
 
 Power management updates: (Patryk Wlazlyn)
   - Allow calling mwait_play_dead with an arbitrary hint
   - ACPI/processor_idle: Add FFH state handling
   - intel_idle: Provide the default enter_dead() handler
   - Eliminate mwait_play_dead_cpuid_hint()
 
 Bootup:
 
 Build system:
   - Raise the minimum GCC version to 8.1 (Brian Gerst)
   - Raise the minimum LLVM version to 15.0.0
     (Nathan Chancellor)
 
 Kconfig: (Arnd Bergmann)
   - Add cmpxchg8b support back to Geode CPUs
   - Drop 32-bit "bigsmp" machine support
   - Rework CONFIG_GENERIC_CPU compiler flags
   - Drop configuration options for early 64-bit CPUs
   - Remove CONFIG_HIGHMEM64G support
   - Drop CONFIG_SWIOTLB for PAE
   - Drop support for CONFIG_HIGHPTE
   - Document CONFIG_X86_INTEL_MID as 64-bit-only
   - Remove old STA2x11 support
   - Only allow CONFIG_EISA for 32-bit
 
 Headers:
   - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers
     (Thomas Huth)
 
 Assembly code & machine code patching:
   - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf)
   - x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
   - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
   - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
   - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
   - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
     (Uros Bizjak)
   - Use named operands in inline asm (Uros Bizjak)
   - Improve performance by using asm_inline() for atomic locking instructions
     (Uros Bizjak)
 
 Earlyprintk:
   - Harden early_serial (Peter Zijlstra)
 
 NMI handler:
   - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()
     (Waiman Long)
 
 Miscellaneous fixes and cleanups:
 
   - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel,
     Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst,
     Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin,
     Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport,
     Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker,
     Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra,
     Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
     Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak,
     Vitaly Kuznetsov, Xin Li, liuye.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfenkQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g1FRAAi6OFTSn/5aeLMI0IMNBxJ6ddQiFc3imd
 7+C/vU5nul4CyDs8mKyj/+f/DDrbkG9lKz3VG631Yl237lXHjD8XWcVMeC/1z/q0
 3zInDIloE9/nBHRPkF6F7fARBLBZ0LFgaBsGrCo7mwpGybiQdqGcqcxllvTbtXaw
 OHta4q6ok+lBDNlfc0v6H4cRnzhmmlKu6Ng0j6UI3V7uFhi3vtxas32ltDQtzorq
 2+jbV6/+kbrrv+xPC+jlzOFhTEKRupNPQXmvyQteoQg6G3kqAKMDvBthGXd1rHuX
 Qa+BoDIifE/2NiVeRwNrhoqYH/pHCzUzDREW5IW8+ca+4XNKuzAC6EuC8CeCzyK1
 q8ZjZjooQW4zEeVFeJYllHONzJYfxfSH5CLsnbcuhq99yfGlrQhF1qL72/Omn1w/
 DfPJM8Zt5zyKvLqUg3Md+fkVCO2wyDNhB61QPzRgHF+yD+rvuDpoqvUWir+w7cSn
 fwEDVZGXlFx6dumtSrqRaTd1nvFt80s8yP2ll09DMvGQ8D/yruS7hndGAmmJVCSW
 NAfd8pSjq5v2+ux2UR92/Cc3VF3SjaUqHBOp/Nq9rESya18ZVa3cJpHhVYYtPIVf
 THW0h07RIkGVKs1uq+5ekLCr/8uAZg58UPIqmhTuW0ttymRHCNfohR45FQZzy+0M
 tJj1oc2TIZw=
 =Dcb3
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Ingo Molnar:
 "x86 CPU features support:
   - Generate the <asm/cpufeaturemasks.h> header based on build config
     (H. Peter Anvin, Xin Li)
   - x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
   - Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
   - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan
     Jackman)
   - Utilize CPU-type for CPU matching (Pawan Gupta)
   - Warn about unmet CPU feature dependencies (Sohil Mehta)
   - Prepare for new Intel Family numbers (Sohil Mehta)

  Percpu code:
   - Standardize & reorganize the x86 percpu layout and related cleanups
     (Brian Gerst)
   - Convert the stackprotector canary to a regular percpu variable
     (Brian Gerst)
   - Add a percpu subsection for cache hot data (Brian Gerst)
   - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
   - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)

  MM:
   - Add support for broadcast TLB invalidation using AMD's INVLPGB
     instruction (Rik van Riel)
   - Rework ROX cache to avoid writable copy (Mike Rapoport)
   - PAT: restore large ROX pages after fragmentation (Kirill A.
     Shutemov, Mike Rapoport)
   - Make memremap(MEMREMAP_WB) map memory as encrypted by default
     (Kirill A. Shutemov)
   - Robustify page table initialization (Kirill A. Shutemov)
   - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
   - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
     (Matthew Wilcox)

  KASLR:
   - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI
     BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir
     Singh)

  CPU bugs:
   - Implement FineIBT-BHI mitigation (Peter Zijlstra)
   - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
   - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan
     Gupta)
   - RFDS: Exclude P-only parts from the RFDS affected list (Pawan
     Gupta)

  System calls:
   - Break up entry/common.c (Brian Gerst)
   - Move sysctls into arch/x86 (Joel Granados)

  Intel LAM support updates: (Maciej Wieczor-Retman)
   - selftests/lam: Move cpu_has_la57() to use cpuinfo flag
   - selftests/lam: Skip test if LAM is disabled
   - selftests/lam: Test get_user() LAM pointer handling

  AMD SMN access updates:
   - Add SMN offsets to exclusive region access (Mario Limonciello)
   - Add support for debugfs access to SMN registers (Mario Limonciello)
   - Have HSMP use SMN through AMD_NODE (Yazen Ghannam)

  Power management updates: (Patryk Wlazlyn)
   - Allow calling mwait_play_dead with an arbitrary hint
   - ACPI/processor_idle: Add FFH state handling
   - intel_idle: Provide the default enter_dead() handler
   - Eliminate mwait_play_dead_cpuid_hint()

  Build system:
   - Raise the minimum GCC version to 8.1 (Brian Gerst)
   - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor)

  Kconfig: (Arnd Bergmann)
   - Add cmpxchg8b support back to Geode CPUs
   - Drop 32-bit "bigsmp" machine support
   - Rework CONFIG_GENERIC_CPU compiler flags
   - Drop configuration options for early 64-bit CPUs
   - Remove CONFIG_HIGHMEM64G support
   - Drop CONFIG_SWIOTLB for PAE
   - Drop support for CONFIG_HIGHPTE
   - Document CONFIG_X86_INTEL_MID as 64-bit-only
   - Remove old STA2x11 support
   - Only allow CONFIG_EISA for 32-bit

  Headers:
   - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI
     headers (Thomas Huth)

  Assembly code & machine code patching:
   - x86/alternatives: Simplify alternative_call() interface (Josh
     Poimboeuf)
   - x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
   - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
   - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
   - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
   - x86/kexec: Merge x86_32 and x86_64 code using macros from
     <asm/asm.h> (Uros Bizjak)
   - Use named operands in inline asm (Uros Bizjak)
   - Improve performance by using asm_inline() for atomic locking
     instructions (Uros Bizjak)

  Earlyprintk:
   - Harden early_serial (Peter Zijlstra)

  NMI handler:
   - Add an emergency handler in nmi_desc & use it in
     nmi_shootdown_cpus() (Waiman Long)

  Miscellaneous fixes and cleanups:
   - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem
     Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan
     Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar,
     Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej
     Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter
     Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
     Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly
     Kuznetsov, Xin Li, liuye"

* tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits)
  zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault
  x86/asm: Make asm export of __ref_stack_chk_guard unconditional
  x86/mm: Only do broadcast flush from reclaim if pages were unmapped
  perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones
  perf/x86/intel, x86/cpu: Simplify Intel PMU initialization
  x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers
  x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers
  x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions
  x86/asm: Use asm_inline() instead of asm() in clwb()
  x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>
  x86/hweight: Use asm_inline() instead of asm()
  x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
  x86/hweight: Use named operands in inline asm()
  x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP
  x86/head/64: Avoid Clang < 17 stack protector in startup code
  x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
  x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro
  x86/cpu/intel: Limit the non-architectural constant_tsc model checks
  x86/mm/pat: Replace Intel x86_model checks with VFM ones
  x86/cpu/intel: Fix fast string initialization for extended Families
  ...
2025-03-24 22:06:11 -07:00
Frank Li
07ae413e16 PCI: intel-gw: Remove intel_pcie_cpu_addr()
Remove intel_pcie_cpu_addr(), the .cpu_addr_fixup() method, because the dwc
core driver already handles address translation based on the devicetree
description.

[bhelgaas: this does require a minor dts change, but maintainer
Lei Chuan Hua <lchuanhua@maxlinear.com> confirms that the driver is only
used internally to Maxlinear and internal users will update dts:
https://lore.kernel.org/r/BY3PR19MB507667CE7531D863E1E5F8AEBDD82@BY3PR19MB5076.namprd19.prod.outlook.com]

Link: https://lore.kernel.org/r/20250305-intel-v1-1-40db3a685490@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:35 -05:00
Frank Li
b9812179f6 PCI: imx6: Remove imx_pcie_cpu_addr_fixup()
Remove imx_pcie_cpu_addr_fixup, the .cpu_addr_fixup() method, because the
dwc core driver already handles address translation based on the devicetree
description.

Link: https://lore.kernel.org/r/20250315201548.858189-14-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
2025-03-24 14:58:34 -05:00
Frank Li
befc86a0b3 PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup()
We know the parent_bus_offset, either computed from a DT reg property (the
offset is the CPU physical addr - the 'config'/'addr_space' address on the
parent bus) or from a .cpu_addr_fixup() (which may have used a host bridge
window offset).

Apply that parent_bus_offset instead of calling .cpu_addr_fixup() when
programming the ATU.

This assumes all intermediate addresses are at the same offset from the CPU
physical addresses.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20250315201548.858189-13-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:34 -05:00
Frank Li
f3e1dccba0 PCI: dwc: ep: Ensure proper iteration over outbound map windows
Most systems' PCIe outbound map windows have non-zero physical addresses,
but the possibility of encountering zero increased after following commit
("PCI: dwc: Use parent_bus_offset").

'ep->outbound_addr[n]', representing 'parent_bus_address', might be 0 on
some hardware, which trims high address bits through bus fabric before
sending to the PCIe controller.

Replace the iteration logic with 'for_each_set_bit()' to ensure only
allocated map windows are iterated when determining the ATU index from a
given address.

Link: https://lore.kernel.org/r/20250315201548.858189-12-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:34 -05:00
Frank Li
f28b3c9c42 PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset
Endpoint
  ┌───────────────────────────────────────────────┐
  │                             pcie-ep@5f010000  │
  │                             ┌────────────────┐│
  │                             │   Endpoint     ││
  │                             │   PCIe         ││
  │                             │   Controller   ││
  │           bus@5f000000      │             ┌────────►
  │           ┌──────────┐      │             │  ││dynamically
  │           │          │ Outbound Transfer  │  ││allocated
  │┌─────┐    │  Bus     ┼─────►│ ATU  ───────┘  ││PCI Addr
  ││     │    │  Fabric  │Bus   │                ││
  ││ CPU ├───►│          │Addr  │                ││
  ││     │CPU │          │0x8000_0000            ││
  │└─────┘Addr└──────────┘      │                ││
  │       0x7000_0000           └────────────────┘│
  └───────────────────────────────────────────────┘

  bus@5f000000 {
          compatible = "simple-bus";
          ranges = <0x80000000 0x0 0x70000000 0x10000000>;

          pcie-ep@5f010000 {
                  reg = <0x80000000 0x10000000>;
                  reg-names ="addr_space";
                  ...
          };
          ...
  };

In the diagram above, CPU writes data to outbound window address
0x7000_0000, and the bus fabric maps it to 0x8000_0000. The ATU uses
bus address 0x8000_0000 as input address and maps to some PCI address
dynamically allocated by a PCI device driver on the host side.

The pcie-ep@5f010000 'reg[addr_space]' is the parent bus address, which is
the input of PCIe controller, including the ATU.

Set parent_bus_offset, the offset from the CPU address to the PCIe
controller input address using dw_pcie_init_parent_bus_offset().  The
parent_bus_offset is not used yet, so no functional change intended.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20250315201548.858189-11-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:34 -05:00
Bjorn Helgaas
d7ae671eba PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources()
Consolidate devicetree resource handling in dw_pcie_ep_get_resources().
No functional change intended.

Link: https://lore.kernel.org/r/20250315201548.858189-10-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-03-24 14:58:34 -05:00
Bjorn Helgaas
92eb132ad1 PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init()
Move devm_pci_epc_create() to the beginning of dw_pcie_ep_init().

devm_pci_epc_create() is generic code that doesn't depend on any DWC
resource, so moving it earlier keeps all the subsequent devicetree-related
code together.

Link: https://lore.kernel.org/r/20250315201548.858189-9-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-03-24 14:58:34 -05:00
Frank Li
7db02f725d PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset
The 'ranges' property of a PCI controller's parent can indicate address
translation information. Most system use 1:1 map between CPU physical and
PCI controller input addresses.

But some hardware, like i.MX8QXP, doesn't use 1:1 map.  See below diagram:

              ┌─────────┐                    ┌────────────┐
   ┌─────┐    │         │ IA: 0x8ff8_0000    │            │
   │ CPU ├───►│   ┌────►├─────────────────┐  │ PCI        │
   └─────┘    │   │     │ IA: 0x8ff0_0000 │  │            │
    CPU Addr  │   │  ┌─►├─────────────┐   │  │ Controller │
  0x7ff8_0000─┼───┘  │  │             │   │  │            │
              │      │  │             │   │  │            │   PCI Addr
  0x7ff0_0000─┼──────┘  │             │   └──► IOSpace   ─┼────────────►
              │         │             │      │            │    0
  0x7000_0000─┼────────►├─────────┐   │      │            │
              └─────────┘         │   └──────► CfgSpace  ─┼────────────►
               Bus Fabric         │          │            │    0
                                  │          │            │
                                  └──────────► MemSpace  ─┼────────────►
                          IA: 0x8000_0000    │            │  0x8000_0000
                                             └────────────┘

  bus@5f000000 {
          compatible = "simple-bus";
          #address-cells = <1>;
          #size-cells = <1>;
          ranges = <0x80000000 0x0 0x70000000 0x10000000>;

          pcie@5f010000 {
                  compatible = "fsl,imx8q-pcie";
                  reg = <0x5f010000 0x10000>, <0x8ff00000 0x80000>;
                  reg-names = "dbi", "config";
                  ...
          };
  };

Intermediate address (IA) here means the PCIe controller input address.
The pcie@5f010000 'reg[config]' address is the parent bus (PCIe controller
input) address of CfgSpace.

The ATU in MemSpace is not explicitly described via devicetree, so we
assume the offset from CPU address to intermediate MemSpace address is the
same as that for CfgSpace.

We could use bus@5f000000 'ranges' for the same purpose.

Set parent_bus_offset using dw_pcie_init_parent_bus_offset().  The
parent_bus_offset is not used yet, so no functional change intended.

Link: https://lore.kernel.org/r/20250315201548.858189-8-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:34 -05:00
Frank Li
3b69e1d381 PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug
dw_pcie_parent_bus_offset() looks up the parent bus address of a PCI
controller 'reg' property in devicetree.  If implemented, .cpu_addr_fixup()
is a hard-coded way to get the parent bus address corresponding to a CPU
physical address.

Add debug code to compare the address from .cpu_addr_fixup() with the
address from devicetree.  If they match, warn that .cpu_addr_fixup() is
redundant and should be removed; if they differ, warn that something is
wrong with the devicetree.

If .cpu_addr_fixup() is not implemented, the parent bus address should be
identical to the CPU physical address because we previously ignored the
parent bus address from devicetree.  If the devicetree has a different
parent bus address, warn about it being broken.

[bhelgaas: split debug to separate patch for easier future revert, commit
log]

Link: https://lore.kernel.org/r/20250315201548.858189-7-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[bhelgaas: squash Ioana Ciornei <ioana.ciornei@nxp.com> fix for NULL
pointer deref when driver doesn't supply dw_pcie_ops, e.g., layerscape-pcie
https://lore.kernel.org/r/20250319134339.3114817-1-ioana.ciornei@nxp.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 14:58:01 -05:00
Frank Li
9de3f3cd47 PCI: dwc: Add dw_pcie_parent_bus_offset()
Return the offset from CPU physical address to the parent bus address of
the specified element of the devicetree 'reg' property.

[bhelgaas: cpu_phy_addr -> cpu_phys_addr, return offset, split
.cpu_addr_fixup() checking and debug to separate patch]

Link: https://lore.kernel.org/r/20250315201548.858189-6-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-24 13:13:02 -05:00
Lukas Wunner
667f053b05
PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion
When BIOS neglects to assign bus numbers to PCI bridges, the kernel
attempts to correct that during PCI device enumeration.  If it runs out
of bus numbers, no pci_bus is allocated and the "subordinate" pointer in
the bridge's pci_dev remains NULL.

The PCIe bandwidth controller erroneously does not check for a NULL
subordinate pointer and dereferences it on probe.

Bandwidth control of unusable devices below the bridge is of questionable
utility, so simply error out instead.  This mirrors what PCIe hotplug does
since commit 62e4492c30 ("PCI: Prevent NULL dereference during pciehp
probe").

The PCI core emits a message with KERN_INFO severity if it has run out of
bus numbers.  PCIe hotplug emits an additional message with KERN_ERR
severity to inform the user that hotplug functionality is disabled at the
bridge.  A similar message for bandwidth control does not seem merited,
given that its only purpose so far is to expose an up-to-date link speed
in sysfs and throttle the link speed on certain laptops with limited
Thermal Design Power.  So error out silently.

User-visible messages:

  pci 0000:16:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
  [...]
  pci_bus 0000:45: busn_res: [bus 45-74] end is updated to 74
  pci 0000:16:02.0: devices behind bridge are unusable because [bus 45-74] cannot be assigned for them
  [...]
  pcieport 0000:16:02.0: pciehp: Hotplug bridge without secondary bus, ignoring
  [...]
  BUG: kernel NULL pointer dereference
  RIP: pcie_update_link_speed
  pcie_bwnotif_enable
  pcie_bwnotif_probe
  pcie_port_probe_service
  really_probe

Fixes: 665745f274 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Reported-by: Wouter Bijlsma <wouter@wouterbijlsma.nl>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219906
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Tested-by: Wouter Bijlsma <wouter@wouterbijlsma.nl>
Cc: stable@vger.kernel.org # v6.13+
Link: https://lore.kernel.org/r/3b6c8d973aedc48860640a9d75d20528336f1f3c.1742669372.git.lukas@wunner.de
2025-03-23 13:37:24 +00:00
Thippeswamy Havalige
9e141923cf
PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant
Update the CPM5 check to include CPM5_HOST1 variant. Previously, only
CPM5 was considered when mapping the "cpm_csr" register.

With this change, CPM5_HOST1 is also supported, ensuring proper
resource mapping for this variant.

Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250317124136.1317723-1-thippeswamy.havalige@amd.com
2025-03-23 12:40:22 +00:00
Colin Ian King
2d72d81cac
PCI: brcmstb: Make const read-only arrays static
Don't populate the const read-only arrays "data" and "regs" on the
stack at run time, instead make them static.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
[kwilczynski: commit log, wrap overly long line to 80 columns]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250317143456.477901-1-colin.i.king@gmail.com
2025-03-23 12:34:38 +00:00
Thippeswamy Havalige
5f3de23d85
PCI: amd-mdb: Add AMD MDB Root Port driver
Add support for AMD MDB (Multimedia DMA Bridge) IP core as Root Port.

The Versal2 devices include MDB Module. The integrated block for MDB
along with the integrated bridge can function as PCIe Root Port
controller at Gen5 32-GT/s operation per lane.

Bridge supports error and INTx interrupts and are handled using platform
specific interrupt line in Versal2.

Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250228093351.923615-4-thippeswamy.havalige@amd.com
[bhelgaas: only present on ARM64-based SoCs; squash Kconfig dependency on
ARM64 from Geert Uytterhoeven <geert+renesas@glider.be>:
https://lore.kernel.org/r/eaef1dea7edcf146aa377d5e5c5c85a76ff56bae.1742306383.git.geert+renesas@glider.be]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log, code comments and error messages clean-up,
drop redundant "depends on PCI" from Kconfig, expose the error code
as part of error messages where appropriatie, change "depends on"
expression to match existing style from other drivers]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-23 05:50:59 +00:00
Alistair Francis
6fc6ded50f PCI/DOE: Allow enabling DOE without CXL
PCIe devices (not CXL) can support DOE as well, so allow DOE to be enabled
even if CXL isn't.

Link: https://lore.kernel.org/r/20250306075211.1855177-4-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-21 16:36:34 -05:00
Alistair Francis
2311ab1820 PCI/DOE: Expose DOE features via sysfs
PCIe r6.0 added support for Data Object Exchange (DOE).  When DOE is
supported, the DOE Discovery Feature must be implemented per PCIe r6.1, sec
6.30.1.1. DOE allows a requester to obtain information about the other DOE
features supported by the device.

The kernel already queries the DOE features supported and caches the
values.  Expose the values in sysfs to allow user space to determine which
DOE features are supported by the PCIe device.

By exposing the information to userspace, tools like lspci can relay the
information to users. By listing all of the supported features we can allow
userspace to parse the list, which might include vendor specific features
as well as yet to be supported features.

As the DOE Discovery feature must always be supported we treat it as a
special named attribute case. This allows the usual PCI attribute_group
handling to correctly create the doe_features directory when registering
pci_doe_sysfs_group (otherwise it doesn't and sysfs_add_file_to_group()
will seg fault).

After this patch is supported you can see something like this when
attaching a DOE device:

  $ ls /sys/devices/pci0000:00/0000:00:02.0//doe*
  0001:01        0001:02        doe_discovery

Link: https://lore.kernel.org/r/20250306075211.1855177-3-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
[bhelgaas: drop pci_doe_sysfs_init() stub return, make
DEVICE_ATTR_RO(doe_discovery) static]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-21 16:36:01 -05:00
Niklas Schnelle
888bd8322d s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAP
The ability to map PCI resources to user-space is controlled by global
defines. For vfio there is VFIO_PCI_MMAP which is only disabled on s390 and
controls mapping of PCI resources using vfio-pci with a fallback option via
the pread()/pwrite() interface.

For the PCI core there is ARCH_GENERIC_PCI_MMAP_RESOURCE which enables a
generic implementation for mapping PCI resources plus the newer sysfs
interface. Then there is HAVE_PCI_MMAP which can be used with custom
definitions of pci_mmap_resource_range() and the historical /proc/bus/pci
interface. Both mechanisms are all or nothing.

For s390 mapping PCI resources is possible and useful for testing and
certain applications such as QEMU's vfio-pci based user-space NVMe driver.
For certain devices, however access to PCI resources via mappings to
user-space is not possible and these must be excluded from the general PCI
resource mapping mechanisms.

Introduce pdev->non_mappable_bars to indicate that a PCI device's BARs can
not be accessed via mappings to user-space. In the future this enables
per-device restrictions of PCI resource mapping.

For now, set this flag for all PCI devices on s390 in line with the
existing, general disable of PCI resource mapping. As s390 is the only user
of the VFI_PCI_MMAP Kconfig options this can already be replaced with a
check of this new flag. Also add similar checks in the other code protected
by HAVE_PCI_MMAP respectively ARCH_GENERIC_PCI_MMAP in preparation for
enabling these for supported devices.

Link: https://lore.kernel.org/lkml/20250212132808.08dcf03c.alex.williamson@redhat.com/
Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-2-c5c0f1d26efd@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-21 14:54:16 -05:00
Shay Drory
04d50d953a PCI: Fix NULL dereference in SR-IOV VF creation error path
Clean up when virtfn setup fails to prevent NULL pointer dereference
during device removal. The kernel oops below occurred due to incorrect
error handling flow when pci_setup_device() fails.

Add pci_iov_scan_device(), which handles virtfn allocation and setup and
cleans up if pci_setup_device() fails, so pci_iov_add_virtfn() doesn't need
to call pci_stop_and_remove_bus_device().  This prevents accessing
partially initialized virtfn devices during removal.

  BUG: kernel NULL pointer dereference, address: 00000000000000d0
  RIP: 0010:device_del+0x3d/0x3d0
  Call Trace:
   pci_remove_bus_device+0x7c/0x100
   pci_iov_add_virtfn+0xfa/0x200
   sriov_enable+0x208/0x420
   mlx5_core_sriov_configure+0x6a/0x160 [mlx5_core]
   sriov_numvfs_store+0xae/0x1a0

Link: https://lore.kernel.org/r/20250310084524.599225-1-shayd@nvidia.com
Fixes: e3f30d563a ("PCI: Make pci_destroy_dev() concurrent safe")
Signed-off-by: Shay Drory <shayd@nvidia.com>
[bhelgaas: commit log, return ERR_PTR(-ENOMEM) directly]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Keith Busch <kbusch@kernel.org>
2025-03-21 14:54:16 -05:00
Ilpo Järvinen
026e4bffb0 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
pcie_bwctrl_select_speed() should take __fls() of the speed bit, not return
it as a raw value. Instead of directly returning 2.5GT/s speed bit, simply
assign the fallback speed (2.5GT/s) into supported_speeds variable to share
the normal return path that calls pcie_supported_speeds2target_speed() to
calculate __fls().

This code path is not very likely to execute because
pcie_get_supported_speeds() should provide valid ->supported_speeds but a
spec violating device could fail to synthesize any speed in
pcie_get_supported_speeds(). It could also happen in case the
supported_speeds intersection is empty (also a violation of the current
PCIe specs).

Link: https://lore.kernel.org/r/20250321163103.5145-1-ilpo.jarvinen@linux.intel.com
Fixes: de9a6c8d5d ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-21 12:11:38 -05:00
Ilpo Järvinen
527664f738 PCI: pciehp: Don't enable HPIE when resuming in poll mode
PCIe hotplug can operate in poll mode without interrupt handlers using a
polling kthread only.  eb34da60ed ("PCI: pciehp: Disable hotplug
interrupt during suspend") failed to consider that and enables HPIE
(Hot-Plug Interrupt Enable) unconditionally when resuming the Port.

Only set HPIE if non-poll mode is in use. This makes
pcie_enable_interrupt() match how pcie_enable_notification() already
handles HPIE.

Link: https://lore.kernel.org/r/20250321162114.3939-1-ilpo.jarvinen@linux.intel.com
Fixes: eb34da60ed ("PCI: pciehp: Disable hotplug interrupt during suspend")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2025-03-21 11:33:49 -05:00
Roger Pau Monne
c3164d2e0d PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag
Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both
MSI and MSI-X entries globally, regardless of the IRQ chip they are using.
Only Xen sets the pci_msi_ignore_mask when routing physical interrupts over
event channels, to prevent PCI code from attempting to toggle the maskbit,
as it's Xen that controls the bit.

However, the pci_msi_ignore_mask being global will affect devices that use
MSI interrupts but are not routing those interrupts over event channels
(not using the Xen pIRQ chip).  One example is devices behind a VMD PCI
bridge.  In that scenario the VMD bridge configures MSI(-X) using the
normal IRQ chip (the pIRQ one in the Xen case), and devices behind the
bridge configure the MSI entries using indexes into the VMD bridge MSI
table.  The VMD bridge then demultiplexes such interrupts and delivers to
the destination device(s).  Having pci_msi_ignore_mask set in that scenario
prevents (un)masking of MSI entries for devices behind the VMD bridge.

Move the signaling of no entry masking into the MSI domain flags, as that
allows setting it on a per-domain basis.  Set it for the Xen MSI domain
that uses the pIRQ chip, while leaving it unset for the rest of the
cases.

Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and
with Xen dropping usage the variable is unneeded.

This fixes using devices behind a VMD bridge on Xen PV hardware domains.

Albeit Devices behind a VMD bridge are not known to Xen, that doesn't mean
Linux cannot use them.  By inhibiting the usage of
VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal of the pci_msi_ignore_mask
bodge devices behind a VMD bridge do work fine when use from a Linux Xen
hardware domain.  That's the whole point of the series.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Message-ID: <20250219092059.90850-4-roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2025-03-21 08:57:57 +01:00
Roger Pau Monne
6c4d5aadf5 PCI: vmd: Disable MSI remapping bypass under Xen
MSI remapping bypass (directly configuring MSI entries for devices on the
VMD bus) won't work under Xen, as Xen is not aware of devices in such bus,
and hence cannot configure the entries using the pIRQ interface in the PV
case, and in the PVH case traps won't be setup for MSI entries for such
devices.

Until Xen is aware of devices in the VMD bus prevent the
VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as
any kind of Xen guest.

The MSI remapping bypass is an optional feature of VMD bridges, and hence
when running under Xen it will be masked and devices will be forced to
redirect its interrupts from the VMD bridge.  That mode of operation must
always be supported by VMD bridges and works when Xen is not aware of
devices behind the VMD bridge.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Message-ID: <20250219092059.90850-3-roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2025-03-21 08:57:52 +01:00
Ilpo Järvinen
cc7a371b0b PCI: Move cardbus IO size declarations into pci/pci.h
For some reason, cardbus related io/mem size declarations are in
linux/pci.h, whereas non-cardbus sizes are already in pci/pci.h.
Move all them into one place in pci/pci.h.

Link: https://lore.kernel.org/r/20250311174701.3586-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20 16:44:26 -05:00
Ilpo Järvinen
2f255e299c PCI: Make pci_setup_bridge() static
pci_setup_bridge() is only used within setup-bus.c. Therefore, make it a
static function.

Link: https://lore.kernel.org/r/20250311174701.3586-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20 16:44:26 -05:00
Ilpo Järvinen
7d4bcc0f26 PCI: Move resource reassignment func declarations into pci/pci.h
Neither pci_reassign_bridge_resources() nor pci_reassign_resource() is used
outside of the PCI subsystem. They seem to be naturally static functions
but since resource fitting/assignment is split between setup-bus.c and
setup-res.c, they fall into different sides of the divide and need to be
declared.

Move the declarations of pci_reassign_bridge_resources() and
pci_reassign_resource() into pci/pci.h to keep them internal to PCI
subsystem.

Link: https://lore.kernel.org/r/20250311174701.3586-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20 16:44:25 -05:00
Ilpo Järvinen
95c4e6d42c PCI: Move pci_rescan_bus_bridge_resize() declaration to pci/pci.h
pci_rescan_bus_bridge_resize() is only used by code inside PCI subsystem.
The comment also falsely advertises it to be for hotplug drivers, yet the
only caller is from sysfs store function. Move the function declaration
into pci/pci.h.

Link: https://lore.kernel.org/r/20250311174701.3586-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20 16:44:25 -05:00
Ilpo Järvinen
9ec19bfa78 PCI: Fix BAR resizing when VF BARs are assigned
__resource_resize_store() attempts to release all resources of the device
before attempting the resize. The loop, however, only covers standard BARs
(< PCI_STD_NUM_BARS). If a device has VF BARs that are assigned,
pci_reassign_bridge_resources() finds the bridge window still has some
assigned child resources and returns -NOENT which makes
pci_resize_resource() to detect an error and abort the resize.

Change the release loop to cover all resources up to VF BARs which allows
the resize operation to release the bridge windows and attempt to assigned
them again with the different size.

If SR-IOV is enabled, disallow resize as it requires releasing also IOV
resources.

Link: https://lore.kernel.org/r/20250320142837.8027-1-ilpo.jarvinen@linux.intel.com
Fixes: 91fa127794 ("PCI: Expose PCIe Resizable BAR support via sysfs")
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2025-03-20 16:44:11 -05:00
Manivannan Sadhasivam
a5fb3ff632 PCI: Allow PCI bridges to go to D3Hot on all non-x86
Currently, pci_bridge_d3_possible() encodes a variety of decision factors
when deciding whether a given bridge can be put into D3. A particular one
of note is for "recent enough PCIe ports." Per Rafael [0]:

  "There were hardware issues related to PM on x86 platforms predating
   the introduction of Connected Standby in Windows.  For instance,
   programming a port into D3hot by writing to its PMCSR might cause the
   PCIe link behind it to go down and the only way to revive it was to
   power cycle the Root Complex.  And similar."

Thus, this function contains a DMI-based check for post-2015 BIOS.

The above factors (Windows, x86) don't really apply to non-x86 systems, and
also, many such systems don't have BIOS or DMI. However, we'd like to be
able to suspend bridges on non-x86 systems too.

Restrict the "recent enough" check to x86. If we find further
incompatibilities, it probably makes sense to expand on the deny-list
approach (i.e., bridge_d3_blacklist or similar).

Link: https://lore.kernel.org/r/20250320110604.v6.1.Id0a0e78ab0421b6bce51c4b0b87e6aebdfc69ec7@changeid
Link: https://lore.kernel.org/linux-pci/CAJZ5v0j_6jeMAQ7eFkZBe5Yi+USGzysxAgfemYh=-zq4h5W+Qg@mail.gmail.com/ [0]
Link: https://lore.kernel.org/linux-pci/20240227225442.GA249898@bhelgaas/ [1]
Link: https://lore.kernel.org/linux-pci/20240828210705.GA37859@bhelgaas/ [2]
[Brian: rewrite to !X86 based on Rafael's suggestions]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-20 16:33:33 -05:00
Ryo Takakura
18056a4866 PCI: vmd: Make vmd_dev::cfg_lock a raw_spinlock_t type
The access to the PCI config space via pci_ops::read and pci_ops::write is
a low-level hardware access. The functions can be accessed with disabled
interrupts even on PREEMPT_RT. The pci_lock is a raw_spinlock_t for this
purpose.

A spinlock_t becomes a sleeping lock on PREEMPT_RT, so it cannot be
acquired with disabled interrupts. The vmd_dev::cfg_lock is accessed in
the same context as the pci_lock.

Make vmd_dev::cfg_lock a raw_spinlock_t type so it can be used with
interrupts disabled.

This was reported as:

  BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
  Call Trace:
   rt_spin_lock+0x4e/0x130
   vmd_pci_read+0x8d/0x100 [vmd]
   pci_user_read_config_byte+0x6f/0xe0
   pci_read_config+0xfe/0x290
   sysfs_kf_bin_read+0x68/0x90

Signed-off-by: Ryo Takakura <ryotkkr98@gmail.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Acked-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
[bigeasy: reword commit message]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/r/20250218080830.ufw3IgyX@linutronix.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: add back report info from
https://lore.kernel.org/lkml/20241218115951.83062-1-ryotkkr98@gmail.com/]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-18 20:05:06 -05:00
Alistair Popple
82ba975e4c mm: allow compound zone device pages
Zone device pages are used to represent various type of device memory
managed by device drivers.  Currently compound zone device pages are not
supported.  This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.

A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.

Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.

A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head page. 
For zone device pages page->compound_head is shared with page->pgmap.

The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections.  Therefore pgmap is the same for both
head and tail pages and can be moved into the folio and we can use the
standard scheme to find compound_head from a tail page.

Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:39 -07:00
Alistair Popple
b7e2823787 mm/mm_init: move p2pdma page refcount initialisation to p2pdma
Currently ZONE_DEVICE page reference counts are initialised by core memory
management code in __init_zone_device_page() as part of the memremap()
call which driver modules make to obtain ZONE_DEVICE pages.  This
initialises page refcounts to 1 before returning them to the driver.

This was presumably done because it drivers had a reference of sorts on
the page.  It also ensured the page could always be mapped with
vm_insert_page() for example and would never get freed (ie.  have a zero
refcount), freeing drivers of manipulating page reference counts.

However it complicates figuring out whether or not a page is free from the
mm perspective because it is no longer possible to just look at the
refcount.  Instead the page type must be known and if GUP is used a
secondary pgmap reference is also sometimes needed.

To simplify this it is desirable to remove the page reference count for
the driver, so core mm can just use the refcount without having to account
for page type or do other types of tracking.  This is possible because
drivers can always assume the page is valid as core kernel will never
offline or remove the struct page.

This means it is now up to drivers to initialise the page refcount as
required.  P2PDMA uses vm_insert_page() to map the page, and that requires
a non-zero reference count when initialising the page so set that when the
page is first mapped.

Link: https://lkml.kernel.org/r/6aedb0ac2886dcc4503cb705273db5b3863a0b66.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:38 -07:00
Bjorn Helgaas
2ce107e064 PCI: dwc: Consolidate devicetree handling in dw_pcie_host_get_resources()
Consolidate devicetree resource handling in dw_pcie_host_get_resources().
No functional change intended.

Link: https://lore.kernel.org/r/20250315201548.858189-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-03-17 14:18:53 -05:00
Frank Li
84f37c43d5 PCI: dwc: Call devm_pci_alloc_host_bridge() early in dw_pcie_host_init()
Move devm_pci_alloc_host_bridge() to the beginning of dw_pcie_host_init().

devm_pci_alloc_host_bridge() is generic code that doesn't depend on any DWC
resource, so moving it earlier keeps all the subsequent devicetree-related
code together.

[bhelgaas: reorder earlier in series]

Link: https://lore.kernel.org/r/20250315201548.858189-4-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-17 14:18:53 -05:00
Frank Li
513ef9c496 PCI: dwc: Rename cpu_addr to parent_bus_addr for ATU configuration
Rename 'cpu_addr' to 'parent_bus_addr' in the DesignWare ATU configuration.

The ATU translates parent bus addresses to PCI addresses, which are often
the same as CPU addresses but can differ in systems where the bus fabric
translates addresses before passing them to the PCIe controller. This
renaming clarifies the purpose and avoids confusion.

Link: https://lore.kernel.org/r/20250315201548.858189-3-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-17 14:18:53 -05:00
Frank Li
8f4a489b37 PCI: dwc: Use resource start as ioremap() input in dw_pcie_pme_turn_off()
The msg_res region translates writes into PCIe Message TLPs. Previously we
mapped this region using atu.cpu_addr, the input address programmed into
the ATU.

"cpu_addr" is a misnomer because when a bus fabric translates addresses
between the CPU and the ATU, the ATU input address is different from the
CPU address.  A future patch will rename "cpu_addr" and correct the value
to be the ATU input address instead of the CPU physical address.

Map the msg_res region before writing to it using the msg_res resource
start, a CPU physical address.

Link: https://lore.kernel.org/r/20250315201548.858189-2-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-17 14:18:14 -05:00
Christophe JAILLET
b36fb50701
PCI: histb: Fix an error handling path in histb_pcie_probe()
If an error occurs after a successful phy_init() call, then phy_exit()
should be called.

Add the missing call, as already done in the remove function.

Fixes: bbd11bddb3 ("PCI: hisi: Add HiSilicon STB SoC PCIe controller driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[kwilczynski: remove unnecessary hipcie->phy NULL check from
histb_pcie_probe() and squash a patch that removes similar NULL
check for hipcie-phy from histb_pcie_remove() from
https://lore.kernel.org/linux-pci/c369b5d25e17a44984ae5a889ccc28a59a0737f7.1742058005.git.christophe.jaillet@wanadoo.fr]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/8301fc15cdea5d2dac21f57613e8e6922fb1ad95.1740854531.git.christophe.jaillet@wanadoo.fr
2025-03-16 11:49:19 +00:00
Richard Zhu
f6a1fdfc78 PCI: imx6: Use devm_clk_bulk_get_all() to fetch clocks
Use devm_clk_bulk_get_all() helper to simplify clock handle code.

No functional changes intended.

Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[kwilczynski: commit log, refactor to use dev_err_probe()]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20250226025628.1681206-1-hongxing.zhu@nxp.com
2025-03-15 13:13:09 -05:00
Richard Zhu
81d1d214e1 PCI: imx6: Identify controller via 'linux,pci-domain', not address
Instead of testing the controller register address to distinguish
controller 1 from controller 0 on i.MX8MQ platforms, use the PCI domain
number, which comes from the devicetree 'linux,pci-domain' property.

All relevant devicetrees should already supply 'linux,pci-domain', which
was added by c0b70f05c8 ("arm64: dts: imx8mq: use_dt_domains for pci
node").

Instead of being set directly in imx_pcie_probe(), pci->dbi_base will be
set by the DWC core in dw_pcie_get_resources().

No functional changes intended.

Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250226024256.1678103-3-hongxing.zhu@nxp.com
2025-03-15 13:12:00 -05:00
Niklas Cassel
1f5a69f1b3
PCI: dw-rockchip: Hide broken ATS capability for RK3588 running in EP mode
When running the RK3588 in Endpoint mode, with an Intel host with IOMMU
enabled, the host side prints:

  DMAR: VT-d detected Invalidation Time-out Error: SID 0

When running the RK3588 in Endpoint mode, with an AMD host with IOMMU
enabled, the host side prints:

  iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=63:00.0 address=0x42b5b01a0]

Rockchip has confirmed that the ATS support for RK3588 only works when
running the PCIe controller in Root Complex (RC) mode, see:

  https://lore.kernel.org/linux-pci/93cdce39-1ae6-4939-a3fc-db10be7564e5@rock-chips.com

Usually, to handle these issues, we add a quirk for the PCI vendor and
device ID in drivers/pci/quirks.c with quirk_no_ats(). That is because
we cannot usually modify the capabilities on the EP side. In this case,
we can modify the capabilities on the EP side.

Thus, hide the broken ATS capability on RK3588 when running in EP mode.

That way, we don't need any quirk on the host side, and we see no errors
on the host side, and we can run pci_endpoint_test successfully, with
the IOMMU enabled on the host side.

Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
[kwilczynski: commit log, tidy up code comments and error message]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250310094826.842681-6-cassel@kernel.org
2025-03-14 16:13:19 +00:00
Niklas Cassel
e3d6957f17
PCI: dwc: ep: Add dw_pcie_ep_hide_ext_capability()
Add dw_pcie_ep_hide_ext_capability() which can be used by an endpoint
controller driver to hide a capability.

This can be useful to hide a capability that is buggy, such that the
host side does not try to enable the buggy capability.

Suggested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250310094826.842681-5-cassel@kernel.org
2025-03-14 16:13:19 +00:00
Dan Carpenter
8189aa56db
PCI: dwc: ep: Return -ENOMEM for allocation failures
If the bitmap or memory allocations fail, then dw_pcie_ep_init_registers()
will incorrectly return a success.

Return -ENOMEM instead.

Fixes: 869bc52534 ("PCI: dwc: ep: Fix DBI access failure for drivers requiring refclk from host")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/36dcb6fc-f292-4dd5-bd45-a8c6f9dc3df7@stanley.mountain
2025-03-14 16:13:03 +00:00
Philipp Stanner
b1a7f99967 PCI: Check BAR index for validity
Many functions in PCI use accessor macros such as pci_resource_len(),
which take a BAR index. That index, however, is never checked for
validity, potentially resulting in undefined behavior by overflowing the
array pci_dev.resource in the macro pci_resource_n().

Since many users of those macros directly assign the accessed value to
an unsigned integer, the macros cannot be changed easily anymore to
return -EINVAL for invalid indexes. Consequently, the problem has to be
mitigated in higher layers.

Add pci_bar_index_valid(). Use it where appropriate.

Link: https://lore.kernel.org/r/20250312080634.13731-4-phasta@kernel.org
Closes: https://lore.kernel.org/all/adb53b1f-29e1-3d14-0e61-351fd2d3ff0d@linux.intel.com/
Reported-by: Bingbu Cao <bingbu.cao@linux.intel.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
[kwilczynski: correct if-statement condition the pci_bar_index_is_valid()
helper function uses, tidy up code comments]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: fix typo]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-14 10:35:12 -05:00
Lukas Wunner
e3260237aa PCI: pciehp: Avoid unnecessary device replacement check
Hot-removal of nested PCI hotplug ports suffers from a long-standing race
condition which can lead to a deadlock:  A parent hotplug port acquires
pci_lock_rescan_remove(), then waits for pciehp to unbind from a child
hotplug port.  Meanwhile that child hotplug port tries to acquire
pci_lock_rescan_remove() as well in order to remove its own children.

The deadlock only occurs if the parent acquires pci_lock_rescan_remove()
first, not if the child happens to acquire it first.

Several workarounds to avoid the issue have been proposed and discarded
over the years, e.g.:

https://lore.kernel.org/r/4c882e25194ba8282b78fe963fec8faae7cf23eb.1529173804.git.lukas@wunner.de/

A proper fix is being worked on, but needs more time as it is nontrivial
and necessarily intrusive.

Recent commit 9d573d1954 ("PCI: pciehp: Detect device replacement during
system sleep") provokes more frequent occurrence of the deadlock when
removing more than one Thunderbolt device during system sleep.  The commit
sought to detect device replacement, but also triggered on device removal.
Differentiating reliably between replacement and removal is impossible
because pci_get_dsn() returns 0 both if the device was removed, as well as
if it was replaced with one lacking a Device Serial Number.

Avoid the more frequent occurrence of the deadlock by checking whether the
hotplug port itself was hot-removed.  If so, there's no sense in checking
whether its child device was replaced.

This works because the ->resume_noirq() callback is invoked in top-down
order for the entire hierarchy:  A parent hotplug port detecting device
replacement (or removal) marks all children as removed using
pci_dev_set_disconnected() and a child hotplug port can then reliably
detect being removed.

Link: https://lore.kernel.org/r/02f166e24c87d6cde4085865cce9adfdfd969688.1741674172.git.lukas@wunner.de
Fixes: 9d573d1954 ("PCI: pciehp: Detect device replacement during system sleep")
Reported-by: Kenneth Crudup <kenny@panix.com>
Closes: https://lore.kernel.org/r/83d9302a-f743-43e4-9de2-2dd66d91ab5b@panix.com/
Reported-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Closes: https://lore.kernel.org/r/20240926125909.2362244-1-acelan.kao@canonical.com/
Tested-by: Kenneth Crudup <kenny@panix.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org # v6.11+
2025-03-14 10:17:17 -05:00
Philipp Stanner
f09d3937d4
PCI: Fix wrong length of devres array
The array for the iomapping cookie addresses has a length of
PCI_STD_NUM_BARS. This constant, however, only describes standard BARs;
while PCI can allow for additional, special BARs.

The total number of PCI resources is described by constant
PCI_NUM_RESOURCES, which is also used in, e.g., pci_select_bars().

Thus, the devres array has so far been too small.

Change the length of the devres array to PCI_NUM_RESOURCES.

Link: https://lore.kernel.org/r/20250312080634.13731-3-phasta@kernel.org
Fixes: bbaff68bf4 ("PCI: Add managed partial-BAR request and map infrastructure")
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org	# v6.11+
2025-03-14 09:22:24 +00:00
Thomas Gleixner
79273d0a40 PCI/TPH: Replace the broken MSI-X control word update
The driver walks the MSI descriptors to test whether a descriptor exists
for a given index. That's just abuse of the MSI internals.

The same test can be done with a single function call by looking up whether
there is a Linux interrupt number assigned at the index.

What's worse is that the function is completely unserialized against
modifications of the MSI-X control by operations issued from the interrupt
core. It also brings the PCI/MSI-X internal cached control word out of
sync.

Remove the trainwreck and invoke the function provided by the PCI/MSI core
to update it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/all/20250313130321.898592817@linutronix.de
2025-03-13 18:58:00 +01:00
Thomas Gleixner
b9db8df433 PCI/MSI: Provide a sane mechanism for TPH
The PCI/TPH driver fiddles with the MSI-X control word of an active
interrupt completely unserialized against concurrent operations issued
from the interrupt core. It also brings the PCI/MSI-X internal cached
control word out of sync.

Provide a function, which has the required serialization and keeps the
control word cache in sync.

Unfortunately this requires to look up and lock the interrupt descriptor,
which should be only done in the interrupt core code. But confining this
particular oddity in the PCI/MSI core is the lesser of all evil. A
interrupt core implementation would require a larger pile of infrastructure
and indirections for dubious value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/all/20250313130321.822790423@linutronix.de
2025-03-13 18:58:00 +01:00
Thomas Gleixner
50410bad27 PCI: hv: Switch MSI descriptor locking to guard()
Convert the code to use the new guard(msi_descs_lock).

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/all/20250313130321.758905320@linutronix.de
2025-03-13 18:58:00 +01:00
Thomas Gleixner
1bc7e262a2 PCI/MSI: Switch to MSI descriptor locking to guard()
Convert the code to use the new guard(msi_descs_lock).

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/all/20250313130321.695027112@linutronix.de
2025-03-13 18:58:00 +01:00
Thippeswamy Havalige
ad3b7174d4
PCI: xilinx-cpm: Add support for Versal Net CPM5NC Root Port controller
The Versal Net ACAP (Adaptive Compute Acceleration Platform) devices
incorporate the Coherency and PCIe Gen5 Module, specifically the
Next-Generation Compact Module (CPM5NC).

The integrated CPM5NC block, along with the built-in bridge, can function
as a PCIe Root Port and supports the PCIe Gen5 protocol with data transfer
rates of up to 32 GT/s, and is capable of supporting up to a x16 lane-width
configuration.

Bridge errors are managed using a specific interrupt line designed for
CPM5N. The INTx interrupt support is not available.

Currently in this commit platform specific bridge errors support is not
added.

Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
[kwilczynski: commit log, squashed patch to fix an if-statement condition
to ensure that xilinx_cpm_pcie_init_port() does not run on the CPM5NC_HOST
variant from https://lore.kernel.org/linux-pci/20250311072402.1049990-1-thippeswamy.havalige@amd.com]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250224155025.782179-4-thippeswamy.havalige@amd.com
2025-03-11 14:34:48 +00:00
Thippeswamy Havalige
57b0302240
PCI: xilinx-cpm: Fix IRQ domain leak in error path of probe
The IRQ domain allocated for the PCIe controller is not freed if
resource_list_first_type() returns NULL, leading to a resource leak.

This fix ensures properly cleaning up the allocated IRQ domain in
the error path.

Fixes: 49e427e6bd ("Merge branch 'pci/host-probe-refactor'")
Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com>
[kwilczynski: added missing Fixes: tag, refactored to use one of the goto labels]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250224155025.782179-2-thippeswamy.havalige@amd.com
2025-03-11 14:34:36 +00:00
Robin Murphy
bcb81ac6ae iommu: Get DT/ACPI parsing into the proper probe path
In hindsight, there were some crucial subtleties overlooked when moving
{of,acpi}_dma_configure() to driver probe time to allow waiting for
IOMMU drivers with -EPROBE_DEFER, and these have become an
ever-increasing source of problems. The IOMMU API has some fundamental
assumptions that iommu_probe_device() is called for every device added
to the system, in the order in which they are added. Calling it in a
random order or not at all dependent on driver binding leads to
malformed groups, a potential lack of isolation for devices with no
driver, and all manner of unexpected concurrency and race conditions.
We've attempted to mitigate the latter with point-fix bodges like
iommu_probe_device_lock, but it's a losing battle and the time has come
to bite the bullet and address the true source of the problem instead.

The crux of the matter is that the firmware parsing actually serves two
distinct purposes; one is identifying the IOMMU instance associated with
a device so we can check its availability, the second is actually
telling that instance about the relevant firmware-provided data for the
device. However the latter also depends on the former, and at the time
there was no good place to defer and retry that separately from the
availability check we also wanted for client driver probe.

Nowadays, though, we have a proper notion of multiple IOMMU instances in
the core API itself, and each one gets a chance to probe its own devices
upon registration, so we can finally make that work as intended for
DT/IORT/VIOT platforms too. All we need is for iommu_probe_device() to
be able to run the iommu_fwspec machinery currently buried deep in the
wrong end of {of,acpi}_dma_configure(). Luckily it turns out to be
surprisingly straightforward to bootstrap this transformation by pretty
much just calling the same path twice. At client driver probe time,
dev->driver is obviously set; conversely at device_add(), or a
subsequent bus_iommu_probe(), any device waiting for an IOMMU really
should *not* have a driver already, so we can use that as a condition to
disambiguate the two cases, and avoid recursing back into the IOMMU core
at the wrong times.

Obviously this isn't the nicest thing, but for now it gives us a
functional baseline to then unpick the layers in between without many
more awkward cross-subsystem patches. There are some minor side-effects
like dma_range_map potentially being created earlier, and some debug
prints being repeated, but these aren't significantly detrimental. Let's
make things work first, then deal with making them nice.

With the basic flow finally in the right order again, the next step is
probably turning the bus->dma_configure paths inside-out, since all we
really need from bus code is its notion of which device and input ID(s)
to parse the common firmware properties with...

Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci-driver.c
Acked-by: Rob Herring (Arm) <robh@kernel.org> # of/device.c
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/e3b191e6fd6ca9a1e84c5e5e40044faf97abb874.1740753261.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-03-11 14:05:43 +01:00
Dan Carpenter
6e8d06e509 PCI: Remove stray put_device() in pci_register_host_bridge()
This put_device() was accidentally left over from when we changed the code
from using device_register() to calling device_add().  Delete it.

Link: https://lore.kernel.org/r/55b24870-89fb-4c91-b85d-744e35db53c2@stanley.mountain
Fixes: 9885440b16 ("PCI: Fix pci_host_bridge struct device release/free handling")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-10 13:41:48 -05:00
Ma Ke
1f2768b6a3 PCI: Fix reference leak in pci_alloc_child_bus()
If device_register(&child->dev) fails, call put_device() to explicitly
release child->dev, per the comment at device_register().

Found by code review.

Link: https://lore.kernel.org/r/20250202062357.872971-1-make24@iscas.ac.cn
Fixes: 4f535093cf ("PCI: Put pci_dev in device tree as early as possible")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org
2025-03-10 13:41:48 -05:00
Ma Ke
804443c1f2 PCI: Fix reference leak in pci_register_host_bridge()
If device_register() fails, call put_device() to give up the reference to
avoid a memory leak, per the comment at device_register().

Found by code review.

Link: https://lore.kernel.org/r/20250225021440.3130264-1-make24@iscas.ac.cn
Fixes: 37d6a0a6f4 ("PCI: Add pci_register_host_bridge() interface")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
[bhelgaas: squash Dan Carpenter's double free fix from
https://lore.kernel.org/r/db806a6c-a91b-4e5a-a84b-6b7e01bdac85@stanley.mountain]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2025-03-10 13:41:48 -05:00
Bjorn Helgaas
a7eb9124d9 PCI: Cache offset of Resizable BAR capability
Previously most resizable BAR interfaces (pci_rebar_get_possible_sizes(),
pci_rebar_set_size(), etc) as well as pci_restore_state() searched config
space for a Resizable BAR capability.  Most devices don't have such a
capability, so this is wasted effort, especially for pci_restore_state().

Search for a Resizable BAR capability once at enumeration-time and cache
the offset so we don't have to search every time we need it.  No functional
change intended.

Link: https://lore.kernel.org/r/20250215000301.175097-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-10 13:41:47 -05:00
Bjorn Helgaas
3f8c4959fc PCI: Enable Configuration RRS SV early
Following a reset, a Function may respond to Config Requests with Request
Retry Status (RRS) Completion Status to indicate that it is temporarily
unable to process the Request, but will be able to process the Request in
the future (PCIe r6.0, sec 2.3.1).

If the Configuration RRS Software Visibility feature is enabled and a Root
Complex receives RRS for a config read of the Vendor ID, the Root Complex
completes the Request to the host by returning PCI_VENDOR_ID_PCI_SIG,
0x0001 (sec 2.3.2).

The Config RRS SV feature applies only to Root Ports and is not directly
related to pci_scan_bridge_extend().  Move the RRS SV enable to
set_pcie_port_type() where we handle other PCIe-specific configuration.

Link: https://lore.kernel.org/r/20250303210217.199504-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-10 13:41:39 -05:00
Bjorn Helgaas
f4e026f454 PCI: Fix typos
Fix typos and whitespace errors.

Link: https://lore.kernel.org/r/20250307231715.438518-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-08 15:08:45 -06:00
Niklas Cassel
a60a708420
PCI: dwc: ep: Remove superfluous function dw_pcie_ep_find_ext_capability()
Remove the superfluous function dw_pcie_ep_find_ext_capability(),
as it is virtually identical to dw_pcie_find_ext_capability().

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250221202646.395252-3-cassel@kernel.org
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:47:35 +00:00
Christian Bruel
934e9d137d
PCI: endpoint: pci-epf-test: Fix double free that causes kernel to oops
Fix a kernel oops found while testing the stm32_pcie Endpoint driver
with handling of PERST# deassertion:

During EP initialization, pci_epf_test_alloc_space() allocates all BARs,
which are further freed if epc_set_bar() fails (for instance, due to no
free inbound window).

However, when pci_epc_set_bar() fails, the error path:

  pci_epc_set_bar() ->
    pci_epf_free_space()

does not clear the previous assignment to epf_test->reg[bar].

Then, if the host reboots, the PERST# deassertion restarts the BAR
allocation sequence with the same allocation failure (no free inbound
window), creating a double free situation since epf_test->reg[bar] was
deallocated and is still non-NULL.

Thus, make sure that pci_epf_alloc_space() and pci_epf_free_space()
invocations are symmetric, and as such, set epf_test->reg[bar] to NULL
when memory is freed.

Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Link: https://lore.kernel.org/r/20250124123043.96112-1-christian.bruel@foss.st.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:47:35 +00:00
Zijun Hu
22a01177c3
PCI: endpoint: Remove unused devm_pci_epc_destroy()
The static function devm_pci_epc_match() is only invoked within the
devm_pci_epc_destroy(). However, since it was initially introduced,
this new API has had no callers.

Thus, remove both the unused API and the static function.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250217-remove_api-v2-1-b169c9117045@quicinc.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:47:31 +00:00
Niklas Cassel
aba2b17810
PCI: dw-rockchip: Describe Resizable BARs as Resizable BARs
Looking at section "11.4.4.29 USP_PCIE_RESBAR Registers Summary" in the
Technical Reference Manual (TRM) for RK3588, we can see that none of the
BARs are Fixed BARs, but actually Resizable BARs.

I couldn't find any reference in the TRM for RK3568, but looking at the
downstream PCIe endpoint driver, both RK3568 and RK3588 are treated as
the same, so the BARs on RK3568 must also be Resizable BARs.

Now when we actually have support for Resizable BARs, let's configure
these BARs as such.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250131182949.465530-16-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:47:31 +00:00
Niklas Cassel
a2fa5f9614
PCI: keystone: Specify correct alignment requirement
The support for a specific iATU alignment was added in
commit 2a9a801620 ("PCI: endpoint: Add support to specify alignment for
buffers allocated to BARs").

This commit specifically mentions both that the alignment by each DWC
based EP driver should match CX_ATU_MIN_REGION_SIZE, and that AM65x
specifically has a 64 KB alignment.

This also matches the CX_ATU_MIN_REGION_SIZE value specified in the
section "12.2.2.4.7 PCIe Subsystem Address Translation" of the Technical
Reference Manual (TRM) for AM65x:

  https://www.ti.com/lit/ug/spruid7e/spruid7e.pdf

This higher value, 1 MB, was obviously an ugly hack used to be able to
handle Resizable BARs which have a minimum size of 1 MB.

Now when we actually have support for Resizable BARs, let's configure the
iATU alignment requirement to the actual requirement.
(BARs described as Resizable will still get aligned to 1 MB.)

Cc: stable+noautosel@kernel.org # Depends on PCI endpoint Resizable BARs series
Fixes: 23284ad677 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250131182949.465530-15-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:47:25 +00:00
Niklas Cassel
6a6b66f7e6
PCI: keystone: Describe Resizable BARs as Resizable BARs
Looking at section "12.2.2.4.15 PCIe Subsystem BAR Configuration" in the
following Technical Reference Manual (TRM) for AM65x:

  https://www.ti.com/lit/ug/spruid7e/spruid7e.pdf

We can see that BAR2 and BAR5 are not Fixed BARs, but actually Resizable
BARs.

Now when we actually have support for Resizable BARs, let's configure
these BARs as such.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250131182949.465530-14-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:43:11 +00:00
Niklas Cassel
3a3d4cabe6
PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs
The DWC databook specifies three different BARn_SIZING_SCHEME_N as:

  - Fixed Mask (0)
  - Programmable Mask (1)
  - Resizable BAR (2)

Each of these sizing schemes have different instructions for how to
initialize the BAR.

The DWC driver currently does not support resizable BARs.

Instead, in order to somewhat support resizable BARs, the DWC EP driver
currently has an ugly hack that force sets a resizable BAR to 1 MB, if
such a BAR is detected.

Additionally, this hack only works if the DWC glue driver also has lied
in their EPC features, and claimed that the resizable BAR is a 1 MB fixed
size BAR.

This is unintuitive (as you somehow need to know that you need to lie in
your EPC features), but other than that it is overly restrictive, since a
resizable BAR is capable of supporting sizes different than 1 MB.

Add proper support for resizable BARs in the DWC EP driver.

Note that the pci_epc_set_bar() API takes a struct pci_epf_bar which tells
the EPC driver how it wants to configure the BAR.

struct pci_epf_bar only has a single size struct member.

This means that an EPC driver will only be able to set a single supported
size. This is perfectly fine, as we do not need the complexity of allowing
a host to change the size of the BAR. If someone ever wants to support
resizing a resizable BAR, the pci_epc_set_bar() API can be extended in the
future.

With these changes, we allow an EPF driver to configure the size of
Resizable BARs, rather than forcing them to a 1 MB size.

This means that an EPC driver does not need to lie in EPC features, and an
EPF driver will be able to set an arbitrary size (not be forced to a 1 MB
size), just like BAR_PROGRAMMABLE.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250131182949.465530-13-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:43:08 +00:00
Niklas Cassel
30a172db9f
PCI: dwc: ep: Move dw_pcie_ep_find_ext_capability()
Move dw_pcie_ep_find_ext_capability() so that it is located next to
dw_pcie_ep_find_capability().

Additionally, a follow-up commit requires this to be defined earlier
in order to avoid a forward declaration.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250131182949.465530-12-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:43:07 +00:00
Niklas Cassel
4eb208424c
PCI: endpoint: Add pci_epc_bar_size_to_rebar_cap()
Add a helper function to convert a size to the representation used by the
Resizable BAR Capability Register.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250131182949.465530-11-cassel@kernel.org
[mani: squashed the change that added PCIe spec reference to comments
from https://lore.kernel.org/linux-pci/20250219171454.2903059-2-cassel@kernel.org]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:43:05 +00:00
Niklas Cassel
52132f3a63
PCI: endpoint: Allow EPF drivers to configure the size of Resizable BARs
A resizable BAR is different from a normal BAR in a few ways:

  - The minimum size of a resizable BAR is 1 MB.
  - Each BAR that is resizable has a Capability and Control register in
    the Resizable BAR Capability structure.

These registers contain the supported sizes and the currently selected
size of a resizable BAR.

The supported sizes is a bitmap of the supported sizes. The selected size
is a single value that is equal to one of the supported sizes.

A resizable BAR thus has to be configured differently than a
BAR_PROGRAMMABLE BAR, which usually sets the BAR size/mask in a vendor
specific way.

The PCI endpoint framework currently does not support resizable BARs.

Add a BAR type BAR_RESIZABLE, so that an EPC driver can support resizable
BARs properly.

Note that the pci_epc_set_bar() API takes a struct pci_epf_bar which tells
the EPC driver how it wants to configure the BAR.

struct pci_epf_bar only has a single size struct member.

This means that an EPC driver will only be able to set a single supported
size. This is perfectly fine, as we do not need the complexity of allowing
a host to change the size of the BAR. If someone ever wants to support
resizing a resizable BAR, the pci_epc_set_bar() API can be extended in the
future.

With these changes, we allow an EPF driver to configure the size of
Resizable BARs, rather than forcing them to a 1 MB size.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250131182949.465530-10-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:43:02 +00:00
Niklas Cassel
3c936e0ec0
PCI: endpoint: pci-epf-test: Handle endianness properly
The struct pci_epf_test_reg is the actual data in pci-epf-test's test_reg
BAR (usually BAR0), which the host uses to send commands (etc.), and which
pci-epf-test uses to send back status codes.

pci-epf-test currently reads and writes this data without any endianness
conversion functions, which means that pci-epf-test is completely broken
on big-endian endpoint systems.

PCI devices are inherently little-endian, and the data stored in the PCI
BARs should be in little-endian.

Use endianness conversion functions when reading and writing data to
struct pci_epf_test_reg so that pci-epf-test will behave correctly on
big-endian endpoint systems.

Fixes: 349e7a85b2 ("PCI: endpoint: functions: Add an EP function to test PCI")
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250127161242.104651-2-cassel@kernel.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 14:42:38 +00:00
Ilpo Järvinen
e4cb29386f PCI: Do not claim to release resource falsely
pci_release_resource() will print "... releasing" regardless of the
resource being assigned or not. Move the print after the res->parent check
to avoid claiming the kernel would be releasing an unassigned resource.

Likely, none of the current callers pass a resource that is unassigned so
this change is mostly to correct the non-sensical order than to remove
errorneous printouts.

Link: https://lore.kernel.org/r/20250307140922.5776-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-07 11:26:19 -06:00
Zhiyuan Dai
5af473941b PCI: Increase Resizable BAR support from 512 GB to 128 TB
Per PCIe r6.0, sec 7.8.6.2, devices can advertise Resizable BAR sizes up to
128 TB in the Resizable BAR Capability register.  Larger sizes can be
advertised via the Capability register, but that requires an API change.

Update pci_rebar_get_possible_sizes() and pbus_size_mem() to increase the
sizes we currently support from 512 GB to 128 TB.

Link: https://lore.kernel.org/r/20250307053535.44918-1-daizhiyuan@phytium.com.cn
Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-07 11:23:34 -06:00
Alistair Francis
f810d17762 PCI/DOE: Rename Discovery Response Data Object Contents to type
PCIe r6.1, sec 6.30.1.1, describes a "Vendor ID", a "Data Object Type" and
"Next Index" as the fields in the DOE Discovery Response Data Object. The
DOE driver currently uses both the terms 'type' and 'prot' for the second
element.

Rename all uses of the DOE Discovery Response Data Object to use 'type' as
the second element of the object header, instead of type/prot as it
currently is.

Link: https://lore.kernel.org/r/20250306075211.1855177-2-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-06 12:48:16 -06:00
Alistair Francis
b4db6be0ce PCI/DOE: Rename DOE protocol to feature
DOE r1.1 replaced all occurrences of "protocol" with the term "feature" or
"Data Object Type".  PCIe r6.1 incorporated that change.

Rename the existing terms protocol with feature.

Link: https://lore.kernel.org/r/20250306075211.1855177-1-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2025-03-06 12:47:57 -06:00
D M, Sharath Kumar
60f2ee5f14
PCI: altera: Add Agilex support
Add PCIe Root Port controller support for the Agilex family of chips.

The Agilex PCIe Hard IP has three variants that are mostly software
compatible, except for a couple register offsets. The P-Tile variant
supports Gen3/Gen4 1x16. The F-Tile variant supports Gen3/Gen4 4x4,
4x8, and 4x16. The R-Tile variant improves on the F-Tile variant by
adding Gen5 support.

To simplify the implementation of pci_ops read/write functions,
ep_{read/write}_cfg() callbacks were added to struct altera_pci_ops
to easily distinguish between hardware variants.

Signed-off-by: D M, Sharath Kumar <sharath.kumar.d.m@intel.com>
Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250221170452.875419-3-matthew.gerlach@linux.intel.com
[kwilczynski: tidy code comments]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:52:48 +00:00
Zhang Zekun
bffc72387a
PCI: tegra: Use helper function for_each_child_of_node_scoped()
The for_each_available_child_of_node_scoped() helper provides
a scope-based clean-up functionality to put the device_node
automatically, and as such, there is no need to call of_node_put()
directly.

Thus, use this helper to simplify the code.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240831040413.126417-7-zhangzekun11@huawei.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:31:45 +00:00
Zhang Zekun
f60b4e06a9
PCI: apple: Use helper function for_each_child_of_node_scoped()
The for_each_available_child_of_node_scoped() helper provides
a scope-based clean-up functionality to put the device_node
automatically, and as such, there is no need to call of_node_put()
directly.

Thus, use this helper to simplify the code.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240831040413.126417-6-zhangzekun11@huawei.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:31:45 +00:00
Zhang Zekun
8905f8b6f5
PCI: mt7621: Use helper function for_each_available_child_of_node_scoped()
The for_each_available_child_of_node_scoped() helper provides
a scope-based clean-up functionality to put the device_node
automatically, and as such, there is no need to call of_node_put()
directly.

Thus, use this helper to simplify the code.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240831040413.126417-5-zhangzekun11@huawei.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:31:45 +00:00
Zhang Zekun
a51adf82f8
PCI: mediatek: Use helper function for_each_available_child_of_node_scoped()
The for_each_available_child_of_node_scoped() helper provides
a scope-based clean-up functionality to put the device_node
automatically, and as such, there is no need to call of_node_put()
directly.

Thus, use this helper to simplify the code.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240831040413.126417-4-zhangzekun11@huawei.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:31:45 +00:00
Zhang Zekun
d523347854
PCI: kirin: Tidy up _probe() related function with dev_err_probe()
The combination of dev_err() and the returned error code could be
replaced by dev_err_probe() in driver's probe function.

Thus, convert the code to use dev_err_probe() to make code simpler.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240831040413.126417-3-zhangzekun11@huawei.com
[kwilczynski: commit log, return -ETIMEDOUT from hi3660_pcie_phy_start()
rather than -EINVAL for when the PIPE clock fails to become stable,
drop redundant dev->of_node NULL check]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 09:30:15 +00:00
Shawn Lin
20bbb083bb
PCI: Add Rockchip Vendor ID
Move PCI_VENDOR_ID_ROCKCHIP from pci_endpoint_test.c to pci_ids.h and
reuse it in pcie-rockchip-host.c.

Link: https://lore.kernel.org/r/20250218092120.2322784-2-cassel@kernel.org
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 08:55:54 +00:00
Hans Zhang
f0f3044d22
PCI: dwc: Add debugfs property to provide LTSSM status of the PCIe link
Add the debugfs property to provide a view of the current link's LTSSM
status from the Root Port device.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Niklas Cassel <cassel@kernel.org>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250223141848.231232-1-18255117159@163.com
[kwilczynski: commit log, refactor dw_ltssm_sts_string() to avoid
compilation errors on platforms that do not set CONFIG_PCIE_DW_HOST]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 08:55:54 +00:00
Shradha Todi
27491ac2cc
PCI: dwc: Add debugfs based Statistical Counter support for DWC
Add support to provide Statistical Counter interface to userspace.

This set of debug registers are part of the RAS DES feature present in
DesignWare PCIe controllers.

Co-developed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Shradha Todi <shradha.t@samsung.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250221131548.59616-6-shradha.t@samsung.com
[kwilczynski: commit log, tidy up code comments, update documentation,
squashed patch that checks if the event counter is supported from
https://lore.kernel.org/linux-pci/20250225171239.19574-3-manivannan.sadhasivam@linaro.org]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 08:55:53 +00:00
Shradha Todi
d20ee8e2db
PCI: dwc: Add debugfs based Error Injection support for DWC
Add support to provide Error Injection interface to userspace.

This set of debug registers are part of the RAS DES feature present in
DesignWare PCIe controllers.

Signed-off-by: Shradha Todi <shradha.t@samsung.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250221131548.59616-5-shradha.t@samsung.com
[kwilczynski: commit log, tidy up code comments, update documentation,
change debugfs property name from "duplicate_dllp" to "duplicate_tlp"]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 08:55:53 +00:00
Shradha Todi
4fbfa17f9a
PCI: dwc: Add debugfs based Silicon Debug support for DWC
Add support to provide Silicon Debug interface to userspace.

This set of debug registers are part of the RAS DES feature present in
DesignWare PCIe controllers.

Co-developed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Shradha Todi <shradha.t@samsung.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250221131548.59616-4-shradha.t@samsung.com
[kwilczynski: commit log, tidy up Kconfig and drop "default y", tidy up
code comments, squashed patch that fixes a NULL pointer dereference when
debugfs is already unavailable during clean-up from
https://lore.kernel.org/linux-pci/20250225171239.19574-2-manivannan.sadhasivam@linaro.org,
refactor dwc_pcie_debugfs_init() to not return errors, squashed patch that
changes how lack of the RAS DES capability is handled from
https://lore.kernel.org/linux-pci/20250304151814.6xu7cbpwpqrvcad5@thinkpad]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-06 08:55:47 +00:00
Charles Han
98e87cc501
PCI: mediatek-gen3: Fix inconsistent indentation
Fix the following inconsistent indentation warning:

  drivers/pci/controller/pcie-mediatek-gen3.c:922 mtk_pcie_parse_port() warn: inconsistent indenting

Found using Smatch.  No functional changes intended.

Signed-off-by: Charles Han <hanchunchao@inspur.com>
Link: https://lore.kernel.org/r/20250305070022.4668-1-hanchunchao@inspur.com
[kwilczynski: commit log, refactor if-statement around num_lanes to
make it more readable, wrap overly long lines to fit 80 colums]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-05 20:30:51 +00:00
Zhang Zekun
9a0f3c50bd
PCI: kirin: Use helper function for_each_available_child_of_node_scoped()
The for_each_available_child_of_node_scoped() helper provides
a scope-based clean-up functionality to put the device_node
automatically, and as such, there is no need to call of_node_put()
directly.

Thus, use this helper to simplify the code.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20240831040413.126417-2-zhangzekun11@huawei.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-05 10:44:44 +00:00
Nishanth Aravamudan
479380efe1 PCI: Avoid reset when disabled via sysfs
After d88f521da3 ("PCI: Allow userspace to query and set device reset
mechanism"), userspace can disable reset of specific PCI devices by writing
an empty string to the sysfs reset_method file.

However, pci_slot_resettable() does not check pci_reset_supported(), which
means that pci_reset_function() will still reset the device even if
userspace has disabled all the reset methods.

I was able to reproduce this issue with a vfio device passed to a qemu
guest, where I had disabled PCI reset via sysfs.

Add an explicit check of pci_reset_supported() in both
pci_slot_resettable() and pci_bus_resettable() to ensure both the reset
status and reset execution are bypassed if an administrator disables it for
a device.

Link: https://lore.kernel.org/r/20250207205600.1846178-1-naravamudan@nvidia.com
Fixes: d88f521da3 ("PCI: Allow userspace to query and set device reset mechanism")
Signed-off-by: Nishanth Aravamudan <naravamudan@nvidia.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Amey Narkhede <ameynarkhede03@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: Kevin Tian <kevin.tian@intel.com>
2025-03-04 17:23:21 -06:00
Feng Tang
9d7db4db19 PCI/portdrv: Only disable pciehp interrupts early when needed
Firmware developers reported that Linux issues two PCIe hotplug commands in
very short intervals on an ARM server, which doesn't comply with the PCIe
spec.  According to PCIe r6.1, sec 6.7.3.2, if the Command Completed event
is supported, software must wait for a command to complete or wait at
least 1 second before sending a new command.

In the failure case, the first PCIe hotplug command is from
get_port_device_capability(), which sends a command to disable PCIe hotplug
interrupts without waiting for its completion, and the second command comes
from pcie_enable_notification() of pciehp driver, which enables hotplug
interrupts again.

Fix this by only disabling the hotplug interrupts when the pciehp driver is
not enabled.

Link: https://lore.kernel.org/r/20250303023630.78397-1-feng.tang@linux.alibaba.com
Fixes: 2bd50dd800 ("PCI: PCIe: Disable PCIe port services during port initialization")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2025-03-04 17:17:01 -06:00
Lukas Wunner
cc973ef13f PCI: hotplug: Inline pci_hp_{create,remove}_module_link()
For no apparent reason, the pci_hp_{create,remove}_module_link() helpers
live in slot.c, even though they're only called from two functions in
pci_hotplug_core.c.

Inline the helpers to reduce code size and number of exported symbols.

Link: https://lore.kernel.org/r/c207f03cfe32ae9002d9b453001a1dd63d9ab3fb.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04 17:00:12 -06:00
Lukas Wunner
34bd6141a6 PCI: hotplug: Avoid backpointer dereferencing in has_*_file()
The PCI hotplug core contains five has_*_file() functions to determine
whether a certain sysfs file shall be added (or removed) for a given
hotplug slot.

The functions receive a struct pci_slot pointer which they have to
dereference back to a struct hotplug_slot.

Avoid by passing them a struct hotplug_slot pointer directly.

Link: https://lore.kernel.org/r/5b2f5b4ac45285953d00fd7637732a93fd40d26e.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04 17:00:12 -06:00
Lukas Wunner
62460bcb5a PCI: hotplug: Drop superfluous NULL pointer checks in has_*_file()
The PCI hotplug core contains five has_*_file() functions to determine
whether a certain sysfs file shall be added (or removed) for a given
hotplug slot.

The functions perform NULL pointer checks for the hotplug_slot and its
hotplug_slot_ops.  However the callers already perform these checks:

  pci_hp_register()
    __pci_hp_register()
      __pci_hp_initialize()

  pci_hp_deregister()
    pci_hp_del()

The only way to actually trigger these checks is to call pci_hp_add()
without having called pci_hp_initialize().

Amend pci_hp_add() to catch that and drop the now superfluous NULL
pointer checks in has_*_file().

Drop the same superfluous checks from pci_hp_create_module_link(),
which is (only) called from pci_hp_add().

Link: https://lore.kernel.org/r/37d1928edf8c3201a8b10794f1db3142e16e02b9.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04 17:00:12 -06:00
Lukas Wunner
666550a806 PCI: hotplug: Drop superfluous try_module_get() calls
In December 2002, historic commit

  https://git.kernel.org/tglx/history/c/bec7aa00ffe5
  ("[PATCH] more module warning fixes")

amended the PCI hotplug core to acquire a reference on the hotplug
driver module when a sysfs attribute is accessed.  That was necessary
because back in the day, sysfs code did not take any precautions to
prevent module unloading when an attribute was accessed.

Soon after in July 2003, historic commit

  https://git.kernel.org/tglx/history/c/1cf6d20f6078
  ("[PATCH] SYSFS: add module referencing to sysfs attribute files.")

addressed that deficiency.  But the commit neglected to remove the now
unnecessary reference acquisition from the PCI hotplug core.

The commit acquired a module reference for the entire duration between
open() and close() of a sysfs attribute.  This made it impossible to
unload a module while attributes were kept open by user space.
That's possible today:

When a hotplug driver module is unloaded, it removes sysfs attributes of
all its hotplug slots by calling pci_hp_del().  This will wait for any
concurrent user space operation to finish:

  pci_hp_del()
    fs_remove_slot()
      sysfs_remove_file()
        sysfs_remove_file_ns()
          kernfs_remove_by_name_ns()
            __kernfs_remove()
              kernfs_drain()

A user space operation such as read() briefly acquires a reference on
the attribute with kernfs_get_active().  kernfs_drain() waits until all
such references are released before allowing attribute removal.  Once
the attribute is removed, any subsequent user space operation on a still
open attribute file will return -ENODEV.

Thus, reference acquisition by the PCI hotplug core is still unnecessary
today.  So drop it at long last.

Link: https://lore.kernel.org/r/ed950fa2722967be4491146c7b867c1e7be11d37.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04 17:00:12 -06:00
Lukas Wunner
5c8265fa63 PCI: hotplug: Drop superfluous pci_hotplug_slot_list
The PCI hotplug core keeps a list of all registered slots.  Its sole
purpose is to WARN() on slot removal if another slot is using the same
name.

But this can never happen because already on slot creation, an error is
returned and multiple messages are emitted if a slot's name is
duplicated:

  pci_hp_register()
    __pci_hp_register()
      __pci_hp_initialize()
        pci_create_slot()
          kobject_init_and_add()
            kobject_add_varg()
              kobject_add_internal()
                create_dir()
                  sysfs_create_dir_ns()
                    kernfs_create_dir_ns()
                      sysfs_warn_dup()
                        pr_warn("cannot create duplicate filename ...")
                pr_err("%s failed for %s with -EEXIST, ...");

Drop the superfluous list.

Link: https://lore.kernel.org/r/603735bc50eb370bc7f1c358441ac671360bab25.1740501868.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04 17:00:12 -06:00
Bjorn Helgaas
800ce277f4 PCI: Log debug messages about reset method
Log pci_dbg() messages about the reset methods we attempt and any errors
(-ENOTTY means "try the next method").

Set CONFIG_DYNAMIC_DEBUG=y and enable by booting with
dyndbg="file drivers/pci/* +p" or enable at runtime:

  # echo "file drivers/pci/* +p" > /sys/kernel/debug/dynamic_debug/control

Link: https://lore.kernel.org/r/20250303204220.197172-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
2025-03-04 16:44:43 -06:00
Jim Quinlan
174cfcf13d
PCI: brcmstb: Make irq_domain_set_info() parameter cast explicit
Make the cast to the irq_hw_number_t type for the parameter passed to
irq_domain_set_info() function explicit.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-9-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 16:03:10 +00:00
Jim Quinlan
a9ec9fb738
PCI: brcmstb: Make two changes in MDIO register fields
The hardware has been updated with two changes to the MDIO packet
format.

The CMD field used to be 12 bits and now is only 1 bit. This change
is backwards compatible because the field's starting bit position is
unchanged, and the only commands we've used have values 0 and 1.

The PORT field's width has been changed from 4 bits to 5 bits. When
written, the new bit is not contiguous with the other four. However,
this change is backwards compatible because the driver never used
anything other than 0 for the port field's value.

Thus, update the existing code to handle new changes to the hardware
in a backwards-compatible manner.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-8-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 16:03:04 +00:00
Jim Quinlan
42fd45be82
PCI: brcmstb: Use same constant table for config space access
The constants EXT_CFG_DATA and EXT_CFG_INDEX vary by SoC, where one of
the map_bus methods used these constants, and the other used a different
set of constants.

Thankfully, there was no problem because the SoCs that used the latter
map_bus method all had the same register constants.

Thus, remove redundant constants and adjust the code to use the correct
constants accordingly.

While at it, update the value of EXT_CFG_DATA to use the 4k-page based
configuration space access system, which is what the second map_bus
method was already using.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250214173944.47506-7-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 16:01:43 +00:00
Jim Quinlan
b7de1b60ec
PCI: brcmstb: Fix potential premature regulator disabling
The platform supports enabling and disabling regulators only on
ports below the Root Complex.

Thus, we need to verify this both when adding and removing the bus,
otherwise regulators may be disabled prematurely when a bus further
down the topology is removed.

Fixes: 9e6be018b2 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-6-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 16:01:30 +00:00
Jim Quinlan
3651ad5249
PCI: brcmstb: Fix error path after a call to regulator_bulk_get()
If the regulator_bulk_get() returns an error and no regulators
are created, we need to set their number to zero.

If we don't do this and the PCIe link up fails, a call to the
regulator_bulk_free() will result in a kernel panic.

While at it, print the error value, as we cannot return an error
upwards as the kernel will WARN() on an error from add_bus().

Fixes: 9e6be018b2 ("PCI: brcmstb: Enable child bus device regulators from DT")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250214173944.47506-5-james.quinlan@broadcom.com
[kwilczynski: commit log, use comma in the message to match style with
other similar messages]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 16:00:20 +00:00
Jim Quinlan
b5e441793e
PCI: brcmstb: Do not assume that register field starts at LSB
When setting the LNKCAP and LNKCTL2 register fields, it was assumed
that the field started at the LSB of the register.

Although the masks do indeed start at the LSB, and this will probably
not change, it is prudent to use a method that makes no assumption
about the mask's placement in the register.

Thus, use the u{16,32}p_replace_bits() helpers since they are already
wildly used in this driver.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-4-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 15:58:17 +00:00
Jim Quinlan
0c97321e11
PCI: brcmstb: Use internal register to change link capability
The driver has been mistakenly writing to a read-only (RO)
configuration space register (PCI_EXP_LNKCAP) to change the
PCIe link capability.

Although harmless in this case, the proper write destination
is an internal register that is reflected by PCI_EXP_LNKCAP.

Thus, fix the brcm_pcie_set_gen() function to correctly update
the link capability.

Fixes: c045213703 ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-3-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 15:54:46 +00:00
Jim Quinlan
72d36589c6
PCI: brcmstb: Set generation limit before PCIe link up
When the user elects to limit the PCIe generation via the appropriate
devicetree property, apply the settings before the PCIe link up, not
after.

Fixes: c045213703 ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214173944.47506-2-james.quinlan@broadcom.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 15:54:15 +00:00
Stanimir Varbanov
377bced88c
PCI: brcmstb: Add BCM2712 support
Add a bare minimum amount of changes in order to support PCIe Root
Complex hardware IP found on RPi5. The PCIe controller on BCM2712
is based on BCM7712 and as such it inherits register offsets, PERST#
assertion, bridge_reset ops, and inbound windows count.

Although, the implementation for BCM2712 needs a workaround related to
the control of the bridge_reset where turning off of the Root Port must
not shutdown the bridge_reset and this must be avoided. To implement
this workaround a quirks field is introduced in pcie_cfg_data struct.

The controller also needs adjustment of PHY PLL setup to use a 54MHz
input refclk. The default input reference clock for the PHY PLL is
100Mhz, except for some devices where it is 54Mhz like BCM2712C1 and
BCM2712D0.

To implement those adjustments introduce a new .post_setup op in
pcie_cfg_data and call it at the end of brcm_pcie_setup function.

The BCM2712 .post_setup callback implements the required MDIO writes
that switch the PLL refclk and also change PHY PM clock period.

Without this RPi5 PCIex1 is unable to enumerate endpoint devices on
the expansion connector.

Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-8-svarbanov@suse.de
[commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 15:54:08 +00:00
Hans Zhang
3ac47fbf4f
PCI: cadence-ep: Fix the driver to send MSG TLP for INTx without data payload
Per the Cadence's "PCIe Controller IP for AX14" user guide, Version
1.04, Section 9.1.7.1, "AXI Subordinate to PCIe Address Translation
Registers", Table 9.4, the bit 16 of the AXI Subordinate Address
(axi_s_awaddr) when set corresponds to MSG with data, and when not set,
to MSG without data.

However, the driver is currently doing the opposite and due to this,
the INTx is never received on the host.

So, fix the driver to reflect the documentation and also make INTx work.

Fixes: 37dddf14f1 ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller")
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250214165724.184599-1-18255117159@163.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-04 11:22:25 +00:00
Shradha Todi
efaf16de43
PCI: dwc: Add helper to find the Vendor Specific Extended Capability (VSEC)
The new dw_pcie_find_vsec_capability() helper will be used within
different DWC APIs to find the VSEC capabilities like PTM, RAS, etc.

Co-developed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Shradha Todi <shradha.t@samsung.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250221131548.59616-3-shradha.t@samsung.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-03 20:02:58 +00:00
Lorenzo Bianconi
249b782980
PCI: mediatek-gen3: Configure PBUS_CSR registers for EN7581 SoC
Configure PBus base address and address mask to allow the hw
to detect if a given address is accessible on PCIe controller.

Fixes: f6ab898356 ("PCI: mediatek-gen3: Add Airoha EN7581 support")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/20250225-en7581-pcie-pbus-csr-v4-2-24324382424a@kernel.org
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-03 19:11:44 +00:00
Herve Codina
1f34072441 PCI: of: Create device tree PCI host bridge node
PCI devices device tree nodes can be already created. This was introduced
by commit 407d1a5192 ("PCI: Create device tree node for bridge").

In order to have device tree nodes related to PCI devices attached on their
PCI root bus (the PCI bus handled by the PCI host bridge), a PCI root bus
device tree node is needed. This root bus node will be used as the parent
node of the first level devices scanned on the bus. On device tree based
systems, this PCI root bus device tree node is set to the node of the
related PCI host bridge. The PCI host bridge node is available in the
device tree used to describe the hardware passed at boot.

On non device tree based system (such as ACPI), a device tree node for the
PCI host bridge or for the root bus does not exist. Indeed, the PCI host
bridge is not described in a device tree used at boot simply because no
device tree is passed at boot.

The device tree PCI host bridge node creation needs to be done at runtime.
This is done in the same way as for the creation of the PCI device nodes.
I.e. node and properties are created based on computed information done by
the PCI core. Also, as is done on device tree based systems, this PCI host
bridge node is used for the PCI root bus.

With this done, hardware available in a PCI device that doesn't follow the
PCI model consisting in one PCI function handled by one driver can be
described by a device tree overlay loaded by the PCI device driver on non
device tree based systems. Those PCI devices provide a single PCI function
that includes several functionalities that require different drivers. The
device tree overlay describes the internal devices and their relationships.
It allows to load drivers needed by those different devices in order to
have functionalities handled.

Link: https://lore.kernel.org/r/20250224141356.36325-6-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
2025-02-28 15:13:51 -06:00
Herve Codina
3dc8adeeef PCI: of_property: Constify parameter in of_pci_get_addr_flags()
The res parameter has no reason to be a pointer to an un-const struct
resource. Indeed, struct resource is not supposed to be modified by the
function.

Constify the res parameter.

Link: https://lore.kernel.org/r/20250224141356.36325-5-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
2025-02-28 15:13:07 -06:00
Herve Codina
c5785a165f PCI: of_property: Add support for NULL pdev in of_pci_set_address()
The pdev (pointer to a struct pci_dev) parameter of of_pci_set_address()
cannot be NULL.

In order to use of_pci_set_address() when creating the PCI root bus node,
it needs to support a NULL pdev parameter. Indeed, in the case of the PCI
root bus node creation, no pdev is available and of_pci_set_address() will
be used with the bridge windows.

Allow to call of_pci_set_address() with a NULL pdev.

Link: https://lore.kernel.org/r/20250224141356.36325-4-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
2025-02-28 15:13:07 -06:00
Herve Codina
e2267841fe PCI: of: Use device_{add,remove}_of_node() to attach of_node to existing device
The commit 407d1a5192 ("PCI: Create device tree node for bridge")
creates of_node for PCI devices. The newly created of_node is attached
to an existing device. This is done setting directly pdev->dev.of_node
in the code.

Even if pdev->dev.of_node cannot be previously set, this doesn't handle
the fwnode field of the struct device. Indeed, this field needs to be
set if it hasn't already been set.

device_{add,remove}_of_node() have been introduced to handle this case.

Use them instead of the direct setting.

Link: https://lore.kernel.org/r/20250224141356.36325-3-herve.codina@bootlin.com
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
2025-02-28 15:13:07 -06:00
Stanimir Varbanov
25a98c7270
PCI: brcmstb: Expand inbound window size up to 64GB
The BCM2712 memory map can support up to 64GB of system memory, thus
expand the inbound window size in calculation helper function.

The change is safe for the currently supported SoCs that have smaller
inbound window sizes.

Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-28 18:40:08 +00:00
Stanimir Varbanov
10dbedad3c
PCI: brcmstb: Reuse pcie_cfg_data structure
Instead of copying fields from the pcie_cfg_data structure to
brcm_pcie, reference it directly.

Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelil <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-6-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-28 18:40:01 +00:00
Stanimir Varbanov
2294059118
PCI: brcmstb: Add a softdep to MIP MSI-X driver
Then the brcmstb PCIe driver and MIP MSI-X interrupt controller
drivers are built as modules there could be a race in probing.

To avoid this, add a softdep to MIP driver to guarantee that
MIP driver will be load first.

Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-28 18:38:01 +00:00
Yury Norov
7a610694fa PCI: hv: Switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
Calling cpumask_next_wrap_old() with starting CPU == nr_cpu_ids
is effectively the same as request to find first CPU, starting
from a given one and wrapping around if needed.

cpumask_next_wrap() is a proper replacement for that.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-02-24 16:37:23 -05:00
Yury Norov
dc5bb9b769 cpumask: deprecate cpumask_next_wrap()
The next patch aligns implementation of cpumask_next_wrap() with the
find_next_bit_wrap(), and it changes function signature.

To make the transition smooth, this patch deprecates current
implementation by adding an _old suffix. The following patches switch
current users to the new implementation one by one.

No functional changes were intended.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-02-24 16:37:22 -05:00
Dmitry Baryshkov
42c812d070
PCI: qcom-ep: Enable EP mode support for SAR2130P
Enable PCIe Endpoint mode support for the Qualcomm SAR2130P platform.

This is needed, as it is not possible to use a compatible fallback on
any other platform since SAR2130P uses slightly different set of clocks.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20250221-sar2130p-pci-v3-6-61a0fdfb75b4@linaro.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24 19:05:00 +00:00
Stanimir Varbanov
2df181e1ae
PCI: brcmstb: Fix missing of_node_put() in brcm_pcie_probe()
A call to of_parse_phandle() is incrementing the refcount, and as such,
the of_node_put() must be called when the reference is no longer needed.

Thus, refactor the existing code and add a missing of_node_put() call
following the check to ensure that "msi_np" matches "pcie->np" and after
MSI initialization, but only if the MSI support is enabled system-wide.

Cc: stable@vger.kernel.org # v5.10+
Fixes: 40ca1bf580 ("PCI: brcmstb: Add MSI support")
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250122222955.1752778-1-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24 18:50:54 +00:00
Manivannan Sadhasivam
9d52691f89
PCI: qcom-ep: Mark BAR0/BAR2 as 64bit BARs and BAR1/BAR3 as RESERVED
On all Qcom endpoint SoCs, BAR0/BAR2 are 64bit BARs by default and
software cannot change the type.

So, mark the those BARs as 64bit BARs and also mark the successive
BAR1/BAR3 as RESERVED BARs so that the EPF drivers cannot use them.

Cc: stable+noautosel@kernel.org # depends on patch introducing only_64bit flag
Fixes: f55fee56a6 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241231130224.38206-3-manivannan.sadhasivam@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-24 18:29:05 +00:00
Balbir Singh
7ffb791423 x86/kaslr: Reduce KASLR entropy on most x86 systems
When CONFIG_PCI_P2PDMA=y (which is basically enabled on all
large x86 distros), it maps the PFN's via a ZONE_DEVICE
mapping using devm_memremap_pages(). The mapped virtual
address range corresponds to the pci_resource_start()
of the BAR address and size corresponding to the BAR length.

When KASLR is enabled, the direct map range of the kernel is
reduced to the size of physical memory plus additional padding.
If the BAR address is beyond this limit, PCI peer to peer DMA
mappings fail.

Fix this by not shrinking the size of the direct map when
CONFIG_PCI_P2PDMA=y.

This reduces the total available entropy, but it's better than
the current work around of having to disable KASLR completely.

[ mingo: Clarified the changelog to point out the broad impact ... ]

Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/Kconfig
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com
--
 arch/x86/mm/kaslr.c | 10 ++++++++--
 drivers/pci/Kconfig |  6 ++++++
 2 files changed, 14 insertions(+), 2 deletions(-)
2025-02-22 12:25:57 +01:00
Guilherme Giacomo Simoes
8ff4574cf7
PCI: cpcihp: Remove unused .get_power() and .set_power()
The .get_power() and .set_power() function pointers in struct
cpci_hp_controller_ops were declared but never implemented by any
driver.

Thus, to improve code readability and reduce resource usage,
remove these pointers and the code that has never been used.

Link: https://lore.kernel.org/r/20250217185638.398925-1-trintaeoitogc@gmail.com
Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-22 08:03:02 +00:00
Ilpo Järvinen
7e077e6707 PCI/ERR: Handle TLP Log in Flit mode
Flit mode introduced in PCIe r6.0 alters how the TLP Header Log is
presented through AER and DPC Capability registers. The TLP Prefix Log
Register is not present with Flit mode, and the register becomes an
extension of the TLP Header Log (PCIe r6.1 secs 7.8.4.12 & 7.9.14.13).

Adapt pcie_read_tlp_log() and struct pcie_tlp_log to read and store the
extended TLP Header Log when the Link is in Flit mode. As the Prefix Log
and Extended TLP Header are not present at the same time, a C union can be
used.

Determining whether the error occurred while the Link was in Flit mode is a
bit complicated. In case of AER, the Advanced Error Capabilities and
Control Register directly tells whether the error was logged in Flit mode
or not (PCIe r6.1 sec 7.8.4.7). The DPC Capability (PCIe r6.1 sec 7.9.14),
unfortunately, does not contain the same information.

Unlike AER, the DPC Capability does not provide a way to discern whether
the error was logged in Flit mode (this is confirmed by PCI WG to be an
oversight in the spec). DPC will bring the Link down immediately following
an error, which makes it impossible to acquire the Flit Mode Status
directly from the Link Status 2 register because Flit Mode Status is only
set in certain Link states (PCIe r6.1 sec 7.5.3.20). As a workaround, use
the flit_mode value stored into the struct pci_bus.

Link: https://lore.kernel.org/r/20250207161836.2755-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21 17:31:45 -06:00
Ilpo Järvinen
79c731e20d PCI: Track Flit Mode Status & print it with link status
PCIe r6.0 added Flit mode, which mainly alters HW behavior, but there are
some OS visible changes. The OS visible changes include differences in the
layout of some capabilities and interpretation of the TLP headers (in
diagnostics situations).

To be able to determine which mode the PCIe Link is using, store the Flit
Mode Status (PCIe r6.1 sec 7.5.3.20) information in addition to the Link
speed into struct pci_bus in pcie_update_link_speed().

Link: https://lore.kernel.org/r/20250207161836.2755-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: use unsigned int:1 instead of bool, update flit_mode setting]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21 17:31:35 -06:00
Ilpo Järvinen
fab874e125 PCI/AER: Descope pci_printk() to aer_printk()
include/linux/pci.h provides low-level pci_printk() interface that is
only used by AER because it needs to print the same message with
different levels depending on the error severity. No other PCI code
uses that functionality and calls pci_<level>() logging functions
directly with the appropriate level.

Descope pci_printk() into AER as aer_printk().

Link: https://lore.kernel.org/r/20241216161012.1774-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: retain pci_printk() for now since shpchp still uses it and I
moved those patches to a different branch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21 17:17:08 -06:00
Tushar Dave
9cf8a952d5 PCI/ACS: Fix 'pci=config_acs=' parameter
Commit 47c8846a49 ("PCI: Extend ACS configurability") introduced bugs
that fail to configure ACS ctrl to the value specified by the kernel
parameter. Essentially there are two bugs:

1) When ACS is configured for multiple PCI devices using 'config_acs'
   kernel parameter, it results into error "PCI: Can't parse ACS command
   line parameter". This is due to a bug that doesn't preserve the ACS
   mask, but instead overwrites the mask with value 0.

   For example, using 'config_acs' to configure ACS ctrl for multiple BDFs
   fails:

      Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
      PCI: Can't parse ACS command line parameter
      pci 0020:02:00.0: ACS mask  = 0x007f
      pci 0020:02:00.0: ACS flags = 0x007b
      pci 0020:02:00.0: Configured ACS to 0x007b

   After this fix:

      Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
      pci 0020:02:00.0: ACS mask  = 0x007f
      pci 0020:02:00.0: ACS flags = 0x007b
      pci 0020:02:00.0: ACS control = 0x005f
      pci 0020:02:00.0: ACS fw_ctrl = 0x0053
      pci 0020:02:00.0: Configured ACS to 0x007b
      pci 0039:00:00.0: ACS mask  = 0x0070
      pci 0039:00:00.0: ACS flags = 0x0050
      pci 0039:00:00.0: ACS control = 0x001d
      pci 0039:00:00.0: ACS fw_ctrl = 0x0000
      pci 0039:00:00.0: Configured ACS to 0x0050

2) In the bit manipulation logic, we copy the bit from the firmware
   settings when mask bit 0.

   For example, 'disable_acs_redir' fails to clear all three ACS P2P redir
   bits due to the wrong bit fiddling:

      Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
      pci 0020:02:00.0: ACS mask  = 0x002c
      pci 0020:02:00.0: ACS flags = 0xffd3
      pci 0020:02:00.0: Configured ACS to 0xfffb
      pci 0030:02:00.0: ACS mask  = 0x002c
      pci 0030:02:00.0: ACS flags = 0xffd3
      pci 0030:02:00.0: Configured ACS to 0xffdf
      pci 0039:00:00.0: ACS mask  = 0x002c
      pci 0039:00:00.0: ACS flags = 0xffd3
      pci 0039:00:00.0: Configured ACS to 0xffd3

   After this fix:

      Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
      pci 0020:02:00.0: ACS mask  = 0x002c
      pci 0020:02:00.0: ACS flags = 0xffd3
      pci 0020:02:00.0: ACS control = 0x007f
      pci 0020:02:00.0: ACS fw_ctrl = 0x007b
      pci 0020:02:00.0: Configured ACS to 0x0053
      pci 0030:02:00.0: ACS mask  = 0x002c
      pci 0030:02:00.0: ACS flags = 0xffd3
      pci 0030:02:00.0: ACS control = 0x005f
      pci 0030:02:00.0: ACS fw_ctrl = 0x005f
      pci 0030:02:00.0: Configured ACS to 0x0053
      pci 0039:00:00.0: ACS mask  = 0x002c
      pci 0039:00:00.0: ACS flags = 0xffd3
      pci 0039:00:00.0: ACS control = 0x001d
      pci 0039:00:00.0: ACS fw_ctrl = 0x0000
      pci 0039:00:00.0: Configured ACS to 0x0000

Link: https://lore.kernel.org/r/20250207030338.456887-1-tdave@nvidia.com
Fixes: 47c8846a49 ("PCI: Extend ACS configurability")
Signed-off-by: Tushar Dave <tdave@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2025-02-21 16:16:55 -06:00
Mrinmay Sarkar
4f13dd9e2b
PCI: epf-mhi: Update device ID for SA8775P
Update device ID for the Qcom SA8775P SoC.

Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
Link: https://lore.kernel.org/r/20241205065422.2515086-3-quic_msarkar@quicinc.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21 05:55:09 +00:00
Lorenzo Bianconi
b6d7bb0d3b
PCI: mediatek-gen3: Remove leftover mac_reset assert for Airoha EN7581 SoC
Remove a leftover assert for mac_reset line in mtk_pcie_en7581_power_up().

This is not harmful since EN7581 does not requires mac_reset and
mac_reset is not defined in EN7581 device tree.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://lore.kernel.org/r/20250201-pcie-en7581-remove-mac_reset-v2-1-a06786cdc683@kernel.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21 05:46:00 +00:00
Manivannan Sadhasivam
75996c92f4
PCI/pwrctrl: Add pwrctrl driver for PCI slots
This driver is used to control the power state of the devices attached to
the PCI slots. Currently, it controls the voltage rails of the PCI slots
defined in the devicetree node of the root port.

The voltage rails for PCI slots are documented in the DT-schema:

  https://github.com/devicetree-org/dt-schema/blob/v2024.11/dtschema/schemas/pci/pci-bus-common.yaml#L153

Since this driver has to work with different kind of slots (PCIe
x1/x4/x8/x16, Mini PCIe, PCI, etc.), the driver is thus using the
of_regulator_bulk_get_all() API to obtain the voltage regulators defined
in the DT node, instead of hardcoding them.

As such, the DT node of the root port should define the relevant supply
properties corresponding to the voltage rails of the PCI slot.

Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-5-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-21 01:03:39 +00:00
Easwar Hariharan
25a3c220a2
PCI: hv: Correct a comment
The VF driver controls an endpoint attached to the PCI HyperV
controller. An invalidation sent by the PF driver in the host
would be delivered *to* the endpoint driver by the controller
driver.

Thus, correct the wording within the comment.

Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250207190716.89995-1-eahariha@linux.microsoft.com
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20 14:57:28 +00:00
Manivannan Sadhasivam
2489eeb777
PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created
The pwrctrl core will rescan the bus once the device is powered on. So
there is no need to continue scanning for the device further.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-3-827473c8fbf4@linaro.org
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20 10:59:10 +00:00
Manivannan Sadhasivam
2d923930f2
PCI/pwrctrl: Move pci_pwrctrl_unregister() to pci_destroy_dev()
The PCI core will try to access the devices even after pci_stop_dev()
for things like Data Object Exchange (DOE), ASPM, etc.

So, move pci_pwrctrl_unregister() to the near end of pci_destroy_dev()
to make sure that the devices are powered down only after the PCI core
is done with them.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-2-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20 10:59:10 +00:00
Manivannan Sadhasivam
957f40d039
PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()
Current way of creating pwrctrl devices requires iterating through the
child devicetree nodes of the PCI bridge in pci_pwrctrl_create_devices().
Even though it works, it creates confusion as there is no symmetry between
this and pci_pwrctrl_unregister() function that removes the pwrctrl
devices.

So to make these two functions symmetric, move the creation of pwrctrl
devices to pci_scan_device(). During the scan of each device in a slot,
the devicetree node (if exists) for the PCI device will be checked. If it
has the supplies populated, then the pwrctrl device will be created.

Since the PCI device scan happens so early, there would be no "struct
pci_dev" available for the device. So the host bridge is used as the
parent of all pwrctrl devices.

One nice side effect of this move is that, it is now possible to have
pwrctrl devices for PCI bridges as well (to control the supplies of PCI
slots).

Suggested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250116-pci-pwrctrl-slot-v3-1-827473c8fbf4@linaro.org
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20 10:59:02 +00:00
Daniel Stodden
cbf937dcad
PCI/ASPM: Fix link state exit during switch upstream function removal
Before 456d8aa37d ("PCI/ASPM: Disable ASPM on MFD function removal to
avoid use-after-free"), we would free the ASPM link only after the last
function on the bus pertaining to the given link was removed.

That was too late. If function 0 is removed before sibling function,
link->downstream would point to free'd memory after.

After above change, we freed the ASPM parent link state upon any function
removal on the bus pertaining to a given link.

That is too early. If the link is to a PCIe switch with MFD on the upstream
port, then removing functions other than 0 first would free a link which
still remains parent_link to the remaining downstream ports.

The resulting GPFs are especially frequent during hot-unplug, because
pciehp removes devices on the link bus in reverse order.

On that switch, function 0 is the virtual P2P bridge to the internal bus.
Free exactly when function 0 is removed -- before the parent link is
obsolete, but after all subordinate links are gone.

Link: https://lore.kernel.org/r/e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com
Fixes: 456d8aa37d ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")
Signed-off-by: Daniel Stodden <dns@arista.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-02-20 10:05:56 +00:00
Ilpo Järvinen
588021b286 PCI: shpchp: Remove 'shpchp_debug' module parameter
The "shpchp_debug" module parameter is used to enable debug logging.  The
generic ability to turn on/off debug prints dynamically covers this use
case already so there is no need for module specific debug handling.  The
ctrl_dbg() wrapper also uses a low-level pci_printk() despite always using
KERN_DEBUG level.

Remove "shpchp_debug" parameter and convert ctrl_dbg() to use pci_dbg().

From now on, shpchp can be debugged using the normal dynamic debugger by
setting CONFIG_DYNAMIC_DEBUG=y and then either adding to kernel cmdline:

  dyndbg="file drivers/pci/hotplug/shpchp* +p"

or using this command on a running kernel:

  echo 'file drivers/pci/hotplug/shpchp* +p' > /sys/kernel/debug/dynamic_debug/control

Link: https://lore.kernel.org/r/20250217095550.2789-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19 16:54:42 -06:00
Ilpo Järvinen
b52ce0b6d5 PCI: shpchp: Remove unused logging wrappers
The shpchp hotplug driver defines logging wrapper with generic names which
are just duplicates of existing generic printk() wrappers. They are also
unused so remove them.

Link: https://lore.kernel.org/r/20250217095550.2789-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19 16:54:42 -06:00
Ilpo Järvinen
4999822008 PCI: shpchp: Change dbg() -> ctrl_dbg()
Convert the last user of dbg() to use ctrl_dbg().

Link: https://lore.kernel.org/r/20241216161012.1774-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19 16:54:37 -06:00
Ilpo Järvinen
7d5f1e615e PCI: shpchp: Remove logging from module init/exit functions
The logging in shpchp module init/exit functions is not very useful.
Remove it.

Link: https://lore.kernel.org/r/20241216161012.1774-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-19 16:51:44 -06:00
Rafael J. Wysocki
bca84a7b93 PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally
A recent discussion has revealed that using DPM_FLAG_SMART_SUSPEND
unconditionally is generally problematic because it may lead to
situations in which the device's runtime PM information is internally
inconsistent or does not reflect its real state [1].

For this reason, change the handling of DPM_FLAG_SMART_SUSPEND so that
it is only taken into account if it is consistently set by the drivers
of all devices having any PM callbacks throughout dependency graphs in
accordance with the following rules:

 - The "smart suspend" feature is only enabled for devices whose drivers
   ask for it (that is, set DPM_FLAG_SMART_SUSPEND) and for devices
   without PM callbacks unless they have never had runtime PM enabled.

 - The "smart suspend" feature is not enabled for a device if it has not
   been enabled for the device's parent unless the parent does not take
   children into account or it has never had runtime PM enabled.

 - The "smart suspend" feature is not enabled for a device if it has not
   been enabled for one of the device's suppliers taking runtime PM into
   account unless that supplier has never had runtime PM enabled.

Namely, introduce a new device PM flag called smart_suspend that is only
set if the above conditions are met and update all DPM_FLAG_SMART_SUSPEND
users to check power.smart_suspend instead of directly checking the
latter.

At the same time, drop the power.set_active flage introduced recently
in commit 3775fc538f ("PM: sleep: core: Synchronize runtime PM status
of parents and children") because it is now sufficient to check
power.smart_suspend along with the dev_pm_skip_resume() return value
to decide whether or not pm_runtime_set_active() needs to be called
for the device.

Link: https://lore.kernel.org/linux-pm/CAPDyKFroyU3YDSfw_Y6k3giVfajg3NQGwNWeteJWqpW29BojhQ@mail.gmail.com/ [1]
Fixes: 7585946243 ("PM: sleep: core: Restrict power.set_active propagation")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci
Link: https://patch.msgid.link/1914558.tdWV9SEqCh@rjwysocki.net
2025-02-19 13:22:12 +01:00
Ilpo Järvinen
2499f53484 PCI: Rework optional resource handling
Remove and rescan cycle can result in failure to assign a bridge window if
it becomes larger than before the remove. The bridge window size will
include space for disabled Expansion ROM, which can causes the bridge
window to not fit anymore into the same address space slot on rescan if the
Expansion ROM resource was not assigned before the remove. In addition, the
optional resource handling is not internally consistent.

The resource fitting logic supports three main types of optional resources:

  - IOV BARs
  - Expansion ROMs
  - Bridge window size variation due to optional resources

In addition to the above, resizable BARs beyond their current size will
require handling optional variation in resource sizes within the resource
fitting algorithm (not yet done by the resource fitting code).

There are multiple inconsistencies related to optional resource handling:

a) The allocation failure of disabled expansion ROM requires special case
   inside assign_requested_resources_sorted().

b) The optionality of disabled expansion ROM is not considered during
   bridge window sizing in pbus_size_mem().

c) Setting resource size to zero for optional resource in pbus_size_mem()
   is problematic because it makes also the alignment invalid, which is
   checked by pdev_sort_resources().

   Optional IOV resources have their size set to zero by pbus_size_mem()
   but the information about size is stored externally in struct pci_sriov
   and complex call-chain trickery in pci_resource_alignment() ensures IOV
   resources return a valid alignment despite having zero resource size. A
   solution that is specific to IOV resources makes it hard to use the same
   solution for other types of resources such as expansion ROM.

Simply changing pbus_size_mem() is not sufficient to fully address the main
issue because it would introduce disparity between bridge window sizing and
resource allocation. Due to size-based ordering of the resource list during
assignment loop, an Expansion ROM resource could steal space from some
other resource and make the other resource not fit if the Expansion ROM is
larger than the other resource. Thus, the resource assignment functions
need to be changed as well.

Make optional resource handling more straightforward. Use
pci_resource_is_optional() to determine if a resource is optional in both
bridge window sizing and assignment failure classification to ensure they
always align. Indicate with a parameter to
assign_requested_resources_sorted() whether it should attempt to allocate
optional resources or not.

Always try first to assign all resources (also when realloc_head is not
provided). This is required for calls from
pci_assign_unassigned_root_bus_resources() that provide realloc_head only
with some of its iterations.

Non-bridge-window optional resources in realloc_head now have add_size 0.
This condition has to be detected in reassign_resources_sorted() before
reassigning them (which would fail as there is no size change).  Removing
add_size=0 optional resources entirely from realloc_head might eventually
be doable but further rework in __assign_resources_sorted() is needed first
to support such a change.

Link: https://lore.kernel.org/r/20241216175632.4175-26-ilpo.jarvinen@linux.intel.com
Reported-by: Jia Yao <jia.yao@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219547
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jia Yao <jia.yao@intel.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
96336ec702 PCI: Perform reset_resource() and build fail list in sync
Resetting a resource is problematic as it prevents attempting to allocate
the resource later, unless something in between restores the resource.
Similarly, if fail_head does not contain all resources that were reset,
those resources cannot be restored later.

The entire reset/restore cycle adds complexity and leaving resources in the
reset state causes issues to other code such as for checks done in
pci_enable_resources(). Take a small step towards not resetting resources
by delaying reset until the end of resource assignment and build failure
list (fail_head) in sync with the reset to avoid leaving behind resources
that cannot be restored (for the case where the caller provides fail_head
in the first place to allow restore somewhere in the callchain, as is not
all callers pass non-NULL fail_head).

Leave the Expansion ROM check temporarily in place while building the
failure list until an upcoming change that reworks optional resource
handling.

Ideally, whole resource reset could be removed but doing that in one step
would be non-tractable due to complexity of all related code.

Link: https://lore.kernel.org/r/20241216175632.4175-25-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
e89df6d2be PCI: Use res->parent to check if resource is assigned
reassign_resources_sorted() uses resource_size() to select between
pci_assign_resource() and pci_reassign_resource(). Due to twisted way
bridge window sizing in pbus_size_mem() sets resource sizes to 0, it works
to match into IOV resources but that is going to be changed by an upcoming
change.

Replace resource_size() check with res->parent check that is the true
dividing line in between whether assign or reassign function should be used
for the resource.

Link: https://lore.kernel.org/r/20241216175632.4175-24-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
8884b5637b PCI: Add debug print when releasing resources before retry
PCI resource fitting is somewhat hard to track because it performs many
actions without logging them. In the case inside
__assign_resources_sorted(), the resources are released before resource
assignment is going to be retried in a different order. That is just one
level of retries the resource fitting performs overall so tracking it
through repeated assignments or failures of a resource gets messy rather
quickly.

Simply announce the release explicitly using pci_dbg() so it is clear what
is going on with each resource.

Link: https://lore.kernel.org/r/20241216175632.4175-23-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
07854e08cd PCI: Indicate optional resource assignment failures
Add pci_dbg() to note that an assignment failure was for an optional
resource and reword existing message about resource resize to say the
change was optional.

Link: https://lore.kernel.org/r/20241216175632.4175-22-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
b3281eb5de PCI: Always have realloc_head in __assign_resources_sorted()
Add a dummy list to always have a non-NULL realloc head in
__assign_resources_sorted() as it allows only checking list_empty().

In future, it would be good to ensure all callers provide a valid
realloc_head but that is relatively complex to do in practice and not
necessary for the subsequent optional resource handling fix.

Link: https://lore.kernel.org/r/20241216175632.4175-21-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
9caf4ea2fd PCI: Extend enable to check for any optional resource
pci_enable_resources() checks if device's io and mem resources are all
assigned and disallows enable if any resource failed to assign (*) but
makes an exception for the case of disabled extension ROM. There are other
optional resources, however.

Add pci_resource_is_optional() and use it instead of
pci_resource_is_disabled_rom() to cover also IOV resources that are also
optional as per pbus_size_mem().

As there will be more users of pci_resource_is_optional() inside
setup-bus.c in changes coming up after this one, the function is placed
there.

(*) In practice, resource fitting code calls reset_resource() for any
resource it fails to assign which clears resource's ->flags causing
pci_enable_resources() to never detect failed resource assignments.
This seems undesirable internal logic inconsistency, effectively
reset_resource() prevents pci_enable_resources() from functioning as
intended. This is one step of many that will be needed towards removing
reset_resource().

Link: https://lore.kernel.org/r/20241216175632.4175-20-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
4e362abe48 PCI: Add restore_dev_resource()
Resource fitting needs to restore the saved dev resources in a few places.
Add a restore_dev_resource() helper for that.

Link: https://lore.kernel.org/r/20241216175632.4175-19-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
c8098ad8fb PCI: Remove incorrect comment from pci_reassign_resource()
Commit a4ac9fea01 ("PCI : Calculate right add_size") removed including
min_align into new_size in pci_reassign_resource() which is the correct
thing to do. However, it also added a snakeoil comment that the resource
would already be aligned with min_align which is incorrect.

A resource that is assigned earlier is aligned with the old alignment, NOT
with the new requested alignment (min_align) until later deep within the
reassignment callchain. Thus, remove the incorrect comment.

Link: https://lore.kernel.org/r/20241216175632.4175-18-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
Cc: Yinghai Lu <yinghai@kernel.org>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
ca9097f9ce PCI: Consolidate assignment loop next round preparation
pci_assign_unassigned_root_bus_resources() and
pci_assign_unassigned_bridge_resources() have a loop that may perform
several rounds to assign resources. The code to prepare for the next round
is identical.

Consolidate the code that prepares for the next assignment round into
pci_prepare_next_assign_round().

Link: https://lore.kernel.org/r/20241216175632.4175-17-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:54 -06:00
Ilpo Järvinen
54181c1364 PCI: Rename retval to ret
Rename 'retval' to 'ret' in pci_assign_unassigned_bridge_resources().

Link: https://lore.kernel.org/r/20241216175632.4175-16-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
acba174d2e PCI: Use while loop and break instead of gotos
pci_assign_unassigned_root_bus_resources() and
pci_assign_unassigned_bridge_resources() contain ad hoc loops using
backwards goto and gotos out of the loop. Replace them with while loops
and break statements.

While reindenting the loop bodies, add braces & remove parenthesis.

Link: https://lore.kernel.org/r/20241216175632.4175-15-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
0aa089cdde PCI: Refactor pdev_sort_resources() & __dev_sort_resources()
Reduce level of call nesting by calling pdev_sort_resources() directly
and by moving the tests done inside __dev_sort_resources() into
pdev_resources_assignable() helper.

Link: https://lore.kernel.org/r/20241216175632.4175-14-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
22fb2eda54 PCI: Converge return paths in __assign_resources_sorted()
All return paths want to free head list in __assign_resources_sorted(), so
add a label and use goto.

Link: https://lore.kernel.org/r/20241216175632.4175-13-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
9b54578bc0 PCI: Add dev & res local variables to resource assignment funcs
Many PCI resource allocation related functions process struct
pci_dev_resource items which hold the struct pci_dev and resource pointers.
Reduce the number of lines that need indirection by adding 'dev' and 'res'
local variable to hold the pointers.

Link: https://lore.kernel.org/r/20241216175632.4175-12-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
e4728eed24 PCI: Add pci_resource_num() helper
A few places in PCI code, mainly in setup-bus.c, need to reverse lookup the
index of a resource in pci_dev's resource array. Create pci_resource_num()
helper to avoid repeating the pointer arithmetic trick used to calculate
the index.

Link: https://lore.kernel.org/r/20241216175632.4175-11-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Ilpo Järvinen
2bd0c72117 PCI: Check resource_size() separately
Instead of chaining logic inside if () condition so that multiple lines are
required, make !resource_size() a separate check and use continue.

Link: https://lore.kernel.org/r/20241216175632.4175-10-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:53 -06:00
Michał Winiarski
cbd384389e PCI: Add pci_resource_is_iov() to identify IOV resources
There are multiple places where special handling is required for IOV
resources.

Extract the identification of IOV resources to pci_resource_is_iov() and
drop a few ifdefs.

Link: https://lore.kernel.org/r/20241216175632.4175-9-ilpo.jarvinen@linux.intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
ee4621b7e4 PCI: Use resource_set_{range,size}() helpers
A few sites that could use resource_set_range/size() in setup-bus.c were
not picked up earlier due to them no matching the usual pattern.  Convert
them now.

These are more cases similar to 783602c920 ("PCI: Use
resource_set_{range,size}() helpers").

Link: https://lore.kernel.org/r/20241216175632.4175-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: add 783602c920 history]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
8986e7e668 PCI: Use SZ_* instead of literals in setup-bus.c
Convert literals in setup-bus.c to SZ_* defines that make the size more
human readable.

As the code is now self-explanatory, eliminate comments about the size.

Link: https://lore.kernel.org/r/20241216175632.4175-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
ff61f380de PCI: Fix old_size lower bound in calculate_iosize() too
Commit 903534fa7d ("PCI: Fix resource double counting on remove &
rescan") fixed double counting of mem resources because of old_size being
applied too early.

Fix a similar counting bug on the io resource side.

Link: https://lore.kernel.org/r/20241216175632.4175-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
67f9085596 PCI: Allow relaxed bridge window tail sizing for optional resources
Commit 566f1dd528 ("PCI: Relax bridge window tail sizing rules")
relaxed the bridge window requirements for non-optional size (size0)
but pbus_size_mem() also handles optional sizes (IOV resources) using
size1. This can manifest, e.g., as a failure to resize a BAR back to
its original size after it was first shrunk when device has a VF BAR
resource because the bridge window (size1) is enlarged beyond what is
strictly required to fit the downstream resources.

Allow using relaxed bridge window tail sizing rules also with the optional
resources (size1) so that the remove/realloc cycle during BAR resize
(smaller and back to the original size) does not fail unexpectedly due to
increase in bridge window size demand.

Also move add_align calculation to more logical place next to size1
assignment as they are strongly related to each other.

Link: https://lore.kernel.org/r/20241216175632.4175-5-ilpo.jarvinen@linux.intel.com
Fixes: 566f1dd528 ("PCI: Relax bridge window tail sizing rules")
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
a55bf64b30 PCI: Simplify size1 assignment logic
In pbus_size_io() and pbus_size_mem(), a complex ?: operation is performed
to set size1.  Decompose this so it's easier to read.

In the case of pbus_size_mem(), simply initializing size1 to zero ensures
the size1 checks work as expected.

Link: https://lore.kernel.org/r/20241216175632.4175-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
1f82b7e84a PCI: Use min_align, not unrelated add_align, for size0
Commit 566f1dd528 ("PCI: Relax bridge window tail sizing rules")
relaxed bridge window tail alignment rule for the non-optional part
(size0, no add_size/add_align). The required alignment given for
pbus_upstream_space_available(), however, was add_align which relates
only to size1 alignment.

As pbus_upstream_space_available() only selects between normal and relaxed
tail alignment of the bridge window, the different alignment only makes
relaxed tail alignment to be used more often than what was intended, which
should be harmless because relaxed tail alignment itself should work in all
cases.

For consistency, change pbus_upstream_space_available() call to use
min_align which is the alignment that is going to be used for the bridge
window in the case where size0 sized allocation is attempted.

Link: https://lore.kernel.org/r/20241216175632.4175-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Ilpo Järvinen
d06cc1e380 PCI: Remove add_align overwrite unrelated to size0
Commit 566f1dd528 ("PCI: Relax bridge window tail sizing rules")
relaxed bridge window tail alignment rule for the non-optional part
(size0, no add_size/add_align). The change, however, also overwrote
add_align, which is only related to case where optional size1 related
entry is added into realloc head.

Correct this by removing the add_align overwrite.

Link: https://lore.kernel.org/r/20241216175632.4175-2-ilpo.jarvinen@linux.intel.com
Fixes: 566f1dd528 ("PCI: Relax bridge window tail sizing rules")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
2025-02-18 15:40:52 -06:00
Kai-Heng Feng
1a596ad00f PCI: Use downstream bridges for distributing resources
7180c1d086 ("PCI: Distribute available resources for root buses, too")
breaks BAR assignment on some devices:

  pci 0006:03:00.0: BAR 0 [mem 0x6300c0000000-0x6300c1ffffff 64bit pref]: assigned
  pci 0006:03:00.1: BAR 0 [mem 0x6300c2000000-0x6300c3ffffff 64bit pref]: assigned
  pci 0006:03:00.2: BAR 0 [mem size 0x00800000 64bit pref]: can't assign; no space
  pci 0006:03:00.0: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space
  pci 0006:03:00.1: VF BAR 0 [mem size 0x02000000 64bit pref]: can't assign; no space

The apertures of domain 0006 before 7180c1d086:

  6300c0000000-63ffffffffff : PCI Bus 0006:00
    6300c0000000-6300c9ffffff : PCI Bus 0006:01
      6300c0000000-6300c9ffffff : PCI Bus 0006:02        # 160MB
        6300c0000000-6300c8ffffff : PCI Bus 0006:03      #   144MB
          6300c0000000-6300c1ffffff : 0006:03:00.0       #     32MB
          6300c2000000-6300c3ffffff : 0006:03:00.1       #     32MB
          6300c4000000-6300c47fffff : 0006:03:00.2       #      8MB
          6300c4800000-6300c67fffff : 0006:03:00.0       #     32MB
          6300c6800000-6300c87fffff : 0006:03:00.1       #     32MB
        6300c9000000-6300c9bfffff : PCI Bus 0006:04      #    12MB
          6300c9000000-6300c9bfffff : PCI Bus 0006:05    #    12MB
            6300c9000000-6300c91fffff : PCI Bus 0006:06  #      2MB
            6300c9200000-6300c93fffff : PCI Bus 0006:07  #      2MB
            6300c9400000-6300c95fffff : PCI Bus 0006:08  #      2MB
            6300c9600000-6300c97fffff : PCI Bus 0006:09  #      2MB

After 7180c1d086:

  6300c0000000-63ffffffffff : PCI Bus 0006:00
    6300c0000000-6300c9ffffff : PCI Bus 0006:01
      6300c0000000-6300c9ffffff : PCI Bus 0006:02        # 160MB
        6300c0000000-6300c43fffff : PCI Bus 0006:03      #    68MB
          6300c0000000-6300c1ffffff : 0006:03:00.0       #      32MB
          6300c2000000-6300c3ffffff : 0006:03:00.1       #      32MB
              --- no space ---      : 0006:03:00.2       #       8MB
              --- no space ---      : 0006:03:00.0       #      32MB
              --- no space ---      : 0006:03:00.1       #      32MB
        6300c4400000-6300c4dfffff : PCI Bus 0006:04      #    10MB
          6300c4400000-6300c4dfffff : PCI Bus 0006:05    #      10MB
            6300c4400000-6300c45fffff : PCI Bus 0006:06  #        2MB
            6300c4600000-6300c47fffff : PCI Bus 0006:07  #        2MB
            6300c4800000-6300c49fffff : PCI Bus 0006:08  #        2MB
            6300c4a00000-6300c4bfffff : PCI Bus 0006:09  #        2MB

We can see that the window to 0006:03 gets shrunken too much and 0006:04
eats away the window for 0006:03:00.2.

The offending commit distributes the upstream bridge's resources multiple
times to every downstream bridge, hence makes the aperture smaller than
desired because calculation of io_per_b, mmio_per_b and mmio_pref_per_b
becomes incorrect.

Instead, distribute downstream bridges' own resources to resolve the issue.

Link: https://lore.kernel.org/r/20241204022457.51322-1-kaihengf@nvidia.com
Fixes: 7180c1d086 ("PCI: Distribute available resources for root buses, too")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219540
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Carol Soto <csoto@nvidia.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Chris Chiu <chris.chiu@canonical.com>
2025-02-18 14:46:55 -06:00
Linus Torvalds
78a632a208 pci-v6.14-fixes-3
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmev0oAUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxeNBAAunByLPVxbteqIXJ6IZHz1956SDfo
 batHJodeNCxeVYpjLca+1asqZ75hply3pOK+DUKyNUyEqkXfQDvSZCBfo70uA9ur
 eSeaZxBoNS7VnOvmw5w0kEWd2Sx0gEkIPuxegympOfTiWaN8bGoryPHAuGnO0Exz
 YTIv+TtLAag96OFswQGvSZyh8AfCg5QXl6vZ0W7Ex+O24o/LJ9sO9gALkhFXoc2+
 tdKBJSOQ3dfEVa0S6btNxsY5nsy7xp3CxLYqQTvT2kES0xUwK7QrafOdl0kEaVDB
 6JRkFhKaK5R6WDz5d4LnU0mzSrBc17cQD+C6StUd4dzQdzyuiP4X70R+kMiu71ff
 XMSBPSbLcP1iQ50O3IdkbV5OVnn7ap3LPP2v/PYcmoYNc3i81hvZap5RDaEnJ/ej
 xI+bKullZG787jl+qnweNsu9jar9HWCJOGwMCZIKiXoU8fJdsI+b+53FNxFqqWAR
 4DAsu9nGwlkoEFHaDFKeoG/qBrW65UAeIetUoh+dpQ19OaD7vUuv6Uk4aiwbDFx3
 jGNBOS1Cwx3aIhLNrg3wxSmajFHaPlI+yElS+PcP/J9cV1YDXV7U/ZKaYTVwU0TZ
 l3TY72q3/ckP7+9NaxExO6ow59cN1XNncBkAcOEoSCfFIFnSmvxC8v6Vpwc2ZNQN
 o3N9euTROFouZeA=
 =HYty
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

 - Update a BUILD_BUG_ON() usage that works on current compilers, but
   breaks compilation on gcc 5.3.1 (Alex Williamson)

 - Avoid use of FLR for Mediatek MT7922 WiFi; the device previously
   worked after a long timeout and fallback to SBR, but after a recent
   RRS change it doesn't work at all after FLR (Bjorn Helgaas)

* tag 'pci-v6.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: Avoid FLR for Mediatek MT7922 WiFi
  PCI: Fix BUILD_BUG_ON usage for old gcc
2025-02-14 16:49:07 -08:00
Ilpo Järvinen
addb30c5bd PCI: Cleanup dev->resource + resno to use pci_resource_n()
Replace pointer arithmetic in finding the correct resource entry with the
pci_resource_n() helper.

Link: https://lore.kernel.org/r/20250207162301.2842-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-14 14:39:31 -06:00
Bjorn Helgaas
81f64e925c PCI: Avoid FLR for Mediatek MT7922 WiFi
The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
does not work, and all subsequent config reads return ~0:

  pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint
  pciback 0000:01:00.0: not ready 65535ms after FLR; giving up

After an FLR, pci_dev_wait() waits for the device to become ready.  Prior
to d591f6804e ("PCI: Wait for device readiness with Configuration RRS"),
it polls PCI_COMMAND until it is something other that PCI_POSSIBLE_ERROR
(~0).  If it times out, pci_dev_wait() returns -ENOTTY and
__pci_reset_function_locked() tries the next available reset method.
Typically this is Secondary Bus Reset, which does work, so the MT7922 is
eventually usable.

After d591f6804e, if Configuration Request Retry Status Software
Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it
is something other than the special 0x0001 Vendor ID that indicates a
completion with RRS status.

When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001,
i.e., the config read was completed with RRS, or a valid Vendor ID.  On the
MT7922, it seems that all config reads after FLR return ~0 indefinitely.
When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's
a valid Vendor ID and the device is now ready, so it returns with success.

After pci_dev_wait() returns success, we restore config space and continue.
Since the MT7922 is not actually ready after the FLR, the restore fails and
the device is unusable.

We considered changing pci_dev_wait() to continue polling if a
PCI_VENDOR_ID read returns either 0x0001 or 0xffff.  This "works" as it did
before d591f6804e, although we have to wait for the timeout and then fall
back to SBR.  But it doesn't work for SR-IOV VFs, which *always* return
0xffff as the Vendor ID.

Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely.  This
will cause fallback to another reset method, such as SBR.

Link: https://lore.kernel.org/r/20250212193516.88741-1-helgaas@kernel.org
Fixes: d591f6804e ("PCI: Wait for device readiness with Configuration RRS")
Link: https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149
Link: https://lore.kernel.org/r/Z4pHll_6GX7OUBzQ@mail-itl
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Cc: stable@vger.kernel.org
2025-02-13 08:36:54 -06:00
Alex Williamson
472ff48e2c PCI: Fix BUILD_BUG_ON usage for old gcc
As reported in the below link, it seems older versions of gcc cannot
determine that the howmany variable is known for all callers.  Include
a test so that newer compilers can enforce this sanity check and older
compilers can still work.  Add __always_inline attribute to give the
compiler an even better chance to know the inputs.

Link: https://lore.kernel.org/r/20250212185337.293023-1-alex.williamson@redhat.com
Fixes: 4453f36086 ("PCI: Batch BAR sizing operations")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/all/20250209154512.GA18688@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Oleg Nesterov <oleg@redhat.com>
Tested-by: Mitchell Augustin <mitchell.augustin@canonical.com>
2025-02-12 16:16:21 -06:00
Robin Murphy
6f64b83d9f PCI/TPH: Restore TPH Requester Enable correctly
When we reenable TPH after changing a Steering Tag value, we need the
actual TPH Requester Enable value, not the ST Mode (which only happens to
work out by chance for non-extended TPH in interrupt vector mode).

Link: https://lore.kernel.org/r/13118098116d7bce07aa20b8c52e28c7d1847246.1738759933.git.robin.murphy@arm.com
Fixes: d2e8a34876 ("PCI/TPH: Add Steering Tag support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Huang <wei.huang2@amd.com>
2025-02-06 10:30:11 -06:00
Ilpo Järvinen
7507eb3e7b PCI/ASPM: Fix L1SS saving
Commit 1db806ec06 ("PCI/ASPM: Save parent L1SS config in
pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both the
Upstream Port and its upstream bridge when handling an Upstream Port, which
matches what the L1SS restore side does. However, parent->state_saved can
be set true at an earlier time when the upstream bridge saved other parts
of its state. Then later when attempting to save the L1SS config while
handling the Upstream Port, parent->state_saved is true in
pci_save_aspm_l1ss_state() resulting in early return and skipping saving
bridge's L1SS config because it is assumed to be already saved. Later on
restore, junk is written into L1SS config which causes issues with some
devices.

Remove parent->state_saved check and unconditionally save L1SS config also
for the upstream bridge from an Upstream Port which ought to be harmless
from correctness point of view. With the Upstream Port check now present,
saving the L1SS config more than once for the bridge is no longer a problem
(unlike when the parent->state_saved check got introduced into the fix
during its development).

Link: https://lore.kernel.org/r/20250131152913.2507-1-ilpo.jarvinen@linux.intel.com
Fixes: 1db806ec06 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731
Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
Reported by: Rafael J. Wysocki <rafael@kernel.org>
Closes: https://lore.kernel.org/r/CAJZ5v0iKmynOQ5vKSQbg1J_FmavwZE-nRONovOZ0mpMVauheWg@mail.gmail.com
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/r/d7246feb-4f3f-4d0c-bb64-89566b170671@molgen.mpg.de
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> # Dell XPS 13 9360
2025-02-06 09:54:38 -06:00
Takashi Iwai
d555ed45a5 PCI: Restore original INTX_DISABLE bit by pcim_intx()
pcim_intx() tries to restore the INTx bit at removal via devres, but there
is a chance that it restores a wrong value.

Because the value to be restored is blindly assumed to be the negative of
the enable argument, when a driver calls pcim_intx() unnecessarily for the
already enabled state, it'll restore to the disabled state in turn.  That
is, the function assumes the case like:

  // INTx == 1
  pcim_intx(pdev, 0); // old INTx value assumed to be 1 -> correct

but it might be like the following, too:

  // INTx == 0
  pcim_intx(pdev, 0); // old INTx value assumed to be 1 -> wrong

Also, when a driver calls pcim_intx() multiple times with different enable
argument values, the last one will win no matter what value it is.  This
can lead to inconsistency, e.g.

  // INTx == 1
  pcim_intx(pdev, 0); // OK
  ...
  pcim_intx(pdev, 1); // now old INTx wrongly assumed to be 0

This patch addresses those inconsistencies by saving the original INTx
state at the first pcim_intx() call.  For that, get_or_create_intx_devres()
is folded into pcim_intx() caller side; it allows us to simply check the
already allocated devres and record the original INTx along with the
devres_alloc() call.

Link: https://lore.kernel.org/r/20241031134300.10296-1-tiwai@suse.de
Fixes: 25216afc9d ("PCI: Add managed pcim_intx()")
Link: https://lore.kernel.org/87v7xk2ps5.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Cc: stable@vger.kernel.org	# v6.11+
2025-01-27 12:55:12 -06:00
Linus Torvalds
647d69605c pci-v6.14-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmeTr8wUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxrMw//TJXH+U6x5LhYvBPD/KZ20ecGHqaA
 eGXrbHAasYbU1CfW7HM0onR8NffOIGoYvQrtefjQAln0w6rTvyFO0xJKLP15vMfN
 hnj+y1WWtKwAkSpu10Cl9nTj8uYRNNSQeoy5kS+1diwuXdby/DlgQONO2APSe9zd
 KMPXJcqSfDJlM5zHrcqqtlxauO9KHInLCc/iutd85AKjvcjOoNHNeZE0pTC0C3gE
 sXYHDqJiS3zdEG6X6mWFo3OzI/Q/7NGlHJ2j0CQaObsgQ9yA7eWkez25ifwZcugc
 TPtjm8DhaDo9/zx0NV9c2dPauHRC6NYUjAflMPK7Aye/41BE1Ag5Ka+tMDgC2i/N
 TbfBxSeArhjnjY+eZwRhrJNNC58TtHTUs69TO7Dbmuwr7cp99MIEDAYI5V6LFAdk
 plKqn1h8FztW5QKRPCgmzy6KTE+WPytiGAGAQFxzIGYkV/QqyvFaVs8FIyOJUIFM
 aDSa6Xy5WLGxmPZ9hPapzEm4ws/HTRpFjNgi/d4rRG5RWMwAxZZa44s9eldhN1D/
 ZwEmF2rJ+U8S7Q+mXPHDlwcsHe5APCbiaTEp4X+e3LNe0i9oxhhaUWG6LrDDmTlQ
 tU5j5daHiBa0nTDL1lfaayJlYX/oJ+IYQrIYzGbnivZv4ZVdPnuWSsOsMOEiLhEt
 4QqCoanqf0mCn2A=
 =O62O
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Batch sizing of multiple BARs while memory decoding is disabled
     instead of disabling/enabling decoding for each BAR individually;
     this optimizes virtualized environments where toggling decoding
     enable is expensive (Alex Williamson)

   - Add host bridge .enable_device() and .disable_device() hooks for
     bridges that need to configure things like Requester ID to StreamID
     mapping when enabling devices (Frank Li)

   - Extend struct pci_ecam_ops with .enable_device() and
     .disable_device() hooks so drivers that use pci_host_common_probe()
     instead of their own .probe() have a way to set the
     .enable_device() callbacks (Marc Zyngier)

   - Drop 'No bus range found' message so we don't complain when DTs
     don't specify the default 'bus-range = <0x00 0xff>' (Bjorn Helgaas)

   - Rename the drivers/pci/of_property.c struct of_pci_range to
     of_pci_range_entry to avoid confusion with the global of_pci_range
     in include/linux/of_address.h (Bjorn Helgaas)

  Driver binding:

   - Update resource request API documentation to encourage callers to
     supply a driver name when requesting resources (Philipp Stanner)

   - Export pci_intx_unmanaged() and pcim_intx() (always managed) so
     callers of pci_intx() (which is sometimes managed) can explicitly
     choose the one they need (Philipp Stanner)

   - Convert drivers from pci_intx() to always-managed pcim_intx() or
     never-managed pci_intx_unmanaged(): amd_sfh, ata (ahci, ata_piix,
     pata_rdc, sata_sil24, sata_sis, sata_uli, sata_vsc), bnx2x, bna,
     ntb, qtnfmac, rtsx, tifm_7xx1, vfio, xen-pciback (Philipp Stanner)

   - Remove pci_intx_unmanaged() since pci_intx() is now always
     unmanaged and pcim_intx() is always managed (Philipp Stanner)

  Error handling:

   - Unexport pcie_read_tlp_log() to encourage drivers to use PCI core
     logging rather than building their own (Ilpo Järvinen)

   - Move TLP Log handling to its own file (Ilpo Järvinen)

   - Store number of supported End-End TLP Prefixes always so we can
     read the correct number of DWORDs from the TLP Prefix Log (Ilpo
     Järvinen)

   - Read TLP Prefixes in addition to the Header Log in
     pcie_read_tlp_log() (Ilpo Järvinen)

   - Add pcie_print_tlp_log() to consolidate printing of TLP Header and
     Prefix Log (Ilpo Järvinen)

   - Quirk the Intel Raptor Lake-P PIO log size to accommodate vendor
     BIOSes that don't configure it correctly (Takashi Iwai)

  ASPM:

   - Save parent L1 PM Substates config so when we restore it along with
     an endpoint's config, the parent info isn't junk (Jian-Hong Pan)

  Power management:

   - Avoid D3 for Root Ports on TUXEDO Sirius Gen1 with old BIOS because
     the system can't wake up from suspend (Werner Sembach)

  Endpoint framework:

   - Destroy the EPC device in devm_pci_epc_destroy(), which previously
     didn't call devres_release() (Zijun Hu)

   - Finish virtual EP removal in pci_epf_remove_vepf(), which
     previously caused a subsequent pci_epf_add_vepf() to fail with
     -EBUSY (Zijun Hu)

   - Write BAR_MASK before iATU registers in pci_epc_set_bar() so we
     don't depend on the BAR_MASK reset value being larger than the
     requested BAR size (Niklas Cassel)

   - Prevent changing BAR size/flags in pci_epc_set_bar() to prevent
     reads from bypassing the iATU if we reduced the BAR size (Niklas
     Cassel)

   - Verify address alignment when programming iATU so we don't attempt
     to write bits that are read-only because of the BAR size, which
     could lead to directing accesses to the wrong address (Niklas
     Cassel)

   - Implement artpec6 pci_epc_features so we can rely on all drivers
     supporting it so we can use it in EPC core code (Niklas Cassel)

   - Check for BARs of fixed size to prevent endpoint drivers from
     trying to change their size (Niklas Cassel)

   - Verify that requested BAR size is a power of two when endpoint
     driver sets the BAR (Niklas Cassel)

  Endpoint framework tests:

   - Clear pci-epf-test dma_chan_rx, not dma_chan_tx, after freeing
     dma_chan_rx (Mohamed Khalfella)

   - Correct the DMA MEMCPY test so it doesn't fail if the Endpoint
     supports both DMA_PRIVATE and DMA_MEMCPY (Manivannan Sadhasivam)

   - Add pci-epf-test and pci_endpoint_test support for capabilities
     (Niklas Cassel)

   - Add Endpoint test for consecutive BARs (Niklas Cassel)

   - Remove redundant comparison from Endpoint BAR test because a > 1MB
     BAR can always be exactly covered by iterating with a 1MB buffer
     (Hans Zhang)

   - Move and convert PCI Endpoint tests from tools/pci to Kselftests
     (Manivannan Sadhasivam)

  Apple PCIe controller driver:

   - Convert StreamID mapping configuration from a bus notifier to the
     .enable_device() and .disable_device() callbacks (Marc Zyngier)

  Freescale i.MX6 PCIe controller driver:

   - Add Requester ID to StreamID mapping configuration when enabling
     devices (Frank Li)

   - Use DWC core suspend/resume functions for imx6 (Frank Li)

   - Add suspend/resume support for i.MX8MQ, i.MX8Q, and i.MX95 (Richard
     Zhu)

   - Add DT compatible string 'fsl,imx8q-pcie-ep' and driver support for
     i.MX8Q series (i.MX8QM, i.MX8QXP, and i.MX8DXL) Endpoints (Frank
     Li)

   - Add DT binding for optional i.MX95 Refclk and driver support to
     enable it if the platform hasn't enabled it (Richard Zhu)

   - Configure PHY based on controller being in Root Complex or Endpoint
     mode (Frank Li)

   - Rely on dbi2 and iATU base addresses from DT via
     dw_pcie_get_resources() instead of hardcoding them (Richard Zhu)

   - Deassert apps_reset in imx_pcie_deassert_core_reset() since it is
     asserted in imx_pcie_assert_core_reset() (Richard Zhu)

   - Add missing reference clock enable or disable logic for IMX6SX,
     IMX7D, IMX8MM (Richard Zhu)

   - Remove redundant imx7d_pcie_init_phy() since
     imx7d_pcie_enable_ref_clk() does the same thing (Richard Zhu)

  Freescale Layerscape PCIe controller driver:

   - Simplify by using syscon_regmap_lookup_by_phandle_args() instead
     of syscon_regmap_lookup_by_phandle() followed by
     of_property_read_u32_array() (Krzysztof Kozlowski)

  Marvell MVEBU PCIe controller driver:

   - Add MODULE_DEVICE_TABLE() to enable module autoloading (Liao Chen)

  MediaTek PCIe Gen3 controller driver:

   - Use clk_bulk_prepare_enable() instead of separate
     clk_bulk_prepare() and clk_bulk_enable() (Lorenzo Bianconi)

   - Rearrange reset assert/deassert so they're both done in the
     *_power_up() callbacks (Lorenzo Bianconi)

   - Document that Airoha EN7581 requires PHY init and power-on before
     PHY reset deassert, unlike other MediaTek Gen3 controllers (Lorenzo
     Bianconi)

   - Move Airoha EN7581 post-reset delay from the en7581 clock .enable()
     method to mtk_pcie_en7581_power_up() (Lorenzo Bianconi)

   - Sleep instead of delay during Airoha EN7581 power-up, since this is
     a non-atomic context (Lorenzo Bianconi)

   - Skip PERST# assertion on Airoha EN7581 during probe and
     suspend/resume to avoid a hardware defect (Lorenzo Bianconi)

   - Enable async probe to reduce system startup time (Douglas Anderson)

  Microchip PolarFlare PCIe controller driver:

   - Set up the inbound address translation based on whether the
     platform allows coherent or non-coherent DMA (Daire McNamara)

   - Update DT binding such that platforms are DMA-coherent by default
     and must specify 'dma-noncoherent' if needed (Conor Dooley)

  Mobiveil PCIe controller driver:

   - Convert mobiveil-pcie.txt to YAML and update 'interrupt-names'
     and 'reg-names' (Frank Li)

  Qualcomm PCIe controller driver:

   - Add DT SM8550 and SM8650 optional 'global' interrupt for link
     events (Neil Armstrong)

   - Add DT 'compatible' strings for IPQ5424 PCIe controller (Manikanta
     Mylavarapu)

   - If 'global' IRQ is supported for detection of Link Up events, tell
     DWC core not to wait for link up (Krishna chaitanya chundru)

  Renesas R-Car PCIe controller driver:

   - Avoid passing stack buffer as resource name (King Dix)

  Rockchip PCIe controller driver:

   - Simplify clock and reset handling by using bulk interfaces (Anand
     Moon)

   - Pass typed rockchip_pcie (not void) pointer to
     rockchip_pcie_disable_clocks() (Anand Moon)

   - Return -ENOMEM, not success, when pci_epc_mem_alloc_addr() fails
     (Dan Carpenter)

  Rockchip DesignWare PCIe controller driver:

   - Use dll_link_up IRQ to detect Link Up and enumerate devices so
     users don't have to manually rescan (Niklas Cassel)

   - Tell DWC core not to wait for link up since the 'sys' interrupt is
     required and detects Link Up events (Niklas Cassel)

  Synopsys DesignWare PCIe controller driver:

   - Don't wait for link up in DWC core if driver can detect Link Up
     event (Krishna chaitanya chundru)

   - Update ICC and OPP votes after Link Up events (Krishna chaitanya
     chundru)

   - Always stop link in dw_pcie_suspend_noirq(), which is required at
     least for i.MX8QM to re-establish link on resume (Richard Zhu)

   - Drop racy and unnecessary LTSSM state check before sending
     PME_TURN_OFF message in dw_pcie_suspend_noirq() (Richard Zhu)

   - Add struct of_pci_range.parent_bus_addr for devices that need their
     immediate parent bus address, not the CPU address, e.g., to program
     an internal Address Translation Unit (iATU) (Frank Li)

  TI DRA7xx PCIe controller driver:

   - Simplify by using syscon_regmap_lookup_by_phandle_args() instead of
     syscon_regmap_lookup_by_phandle() followed by
     of_parse_phandle_with_fixed_args() or of_property_read_u32_index()
     (Krzysztof Kozlowski)

  Xilinx Versal CPM PCIe controller driver:

   - Add DT binding and driver support for Xilinx Versal CPM5
     (Thippeswamy Havalige)

  MicroSemi Switchtec management driver:

   - Add Microchip PCI100X device IDs (Rakesh Babu Saladi)

  Miscellaneous:

   - Move reset related sysfs code from pci.c to pci-sysfs.c where other
     similar code lives (Ilpo Järvinen)

   - Simplify reset_method_store() memory management by using __free()
     instead of explicit kfree() cleanup (Ilpo Järvinen)

   - Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM
     ACPI hotplug driver (Thomas Weißschuh)

   - Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong
     Zhang)

   - Correct documentation of the 'config_acs=' kernel parameter
     (Akihiko Odaki)"

* tag 'pci-v6.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (111 commits)
  PCI: Batch BAR sizing operations
  dt-bindings: PCI: microchip,pcie-host: Allow dma-noncoherent
  PCI: microchip: Set inbound address translation for coherent or non-coherent mode
  Documentation: Fix pci=config_acs= example
  PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT
  PCI: Don't include 'pm_wakeup.h' directly
  selftests: pci_endpoint: Migrate to Kselftest framework
  selftests: Move PCI Endpoint tests from tools/pci to Kselftests
  misc: pci_endpoint_test: Fix IOCTL return value
  dt-bindings: PCI: qcom: Document the IPQ5424 PCIe controller
  dt-bindings: PCI: qcom,pcie-sm8550: Document 'global' interrupt
  dt-bindings: PCI: mobiveil: Convert mobiveil-pcie.txt to YAML
  PCI: switchtec: Add Microchip PCI100X device IDs
  misc: pci_endpoint_test: Remove redundant 'remainder' test
  misc: pci_endpoint_test: Add consecutive BAR test
  misc: pci_endpoint_test: Add support for capabilities
  PCI: endpoint: pci-epf-test: Add support for capabilities
  PCI: endpoint: pci-epf-test: Fix check for DMA MEMCPY test
  PCI: endpoint: pci-epf-test: Set dma_chan_rx pointer to NULL on error
  PCI: dwc: Simplify config resource lookup
  ...
2025-01-25 16:03:40 -08:00
Bjorn Helgaas
10ff5bbfd4 Merge branch 'pci/misc'
- Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM ACPI
  hotplug driver (Thomas Weißschuh)

- Update PCI_EXP_LNKCAP_SLS comment (Lukas Wunner)

- Drop superfluous pm_wakeup.h include (Wolfram Sang)

- Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong Zhang)

- Correct documentation of the 'config_acs=' kernel parameter (Akihiko
  Odaki)

* pci/misc:
  Documentation: Fix pci=config_acs= example
  PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT
  PCI: Don't include 'pm_wakeup.h' directly
  PCI: Update code comment on PCI_EXP_LNKCAP_SLS for PCIe r3.0
  PCI/ACPI: Constify 'struct bin_attribute'
  PCI/P2PDMA: Constify 'struct bin_attribute'
  PCI/VPD: Constify 'struct bin_attribute'
  PCI/sysfs: Constify 'struct bin_attribute'
2025-01-23 13:05:06 -06:00
Bjorn Helgaas
4d3bf4e845 Merge branch 'pci/controller/xilinx-cpm'
- Add DT binding and driver support for Xilinx Versal CPM5 (Thippeswamy
  Havalige)

* pci/controller/xilinx-cpm:
  PCI: xilinx-cpm: Add support for Versal CPM5 Root Port Controller 1
  dt-bindings: PCI: xilinx-cpm: Add compatible string for CPM5 host1
2025-01-23 13:05:06 -06:00
Bjorn Helgaas
d3f0bec2a4 Merge branch 'pci/controller/rockchip'
- Add struct rockchip_pcie_ep kernel-doc to fix warnings (Damien Le Moal)

- Simplify clock and reset handling by using bulk interfaces (Anand Moon)

- Pass typed rockchip_pcie (not void) pointer to
  rockchip_pcie_disable_clocks() (Anand Moon)

- Return -ENOMEM, not success, when pci_epc_mem_alloc_addr() fails (Dan
  Carpenter)

* pci/controller/rockchip:
  PCI: rockchip-ep: Fix error code in rockchip_pcie_ep_init_ob_mem()
  PCI: rockchip: Refactor rockchip_pcie_disable_clocks() signature
  PCI: rockchip: Simplify reset control handling by using reset_control_bulk*() function
  PCI: rockchip: Simplify clock handling by using clk_bulk*() functions
  PCI: rockchip: Add missing fields descriptions for struct rockchip_pcie_ep
2025-01-23 13:05:05 -06:00
Bjorn Helgaas
a306f01eb5 Merge branch 'pci/controller/rcar-ep'
- Avoid passing stack buffer as resource name (King Dix)

* pci/controller/rcar-ep:
  PCI: rcar-ep: Fix incorrect variable used when calling devm_request_mem_region()
2025-01-23 13:05:05 -06:00
Bjorn Helgaas
1854b2e019 Merge branch 'pci/controller/mvebu'
- Add MODULE_DEVICE_TABLE() to enable module autoloading (Liao Chen)

* pci/controller/mvebu:
  PCI: mvebu: Enable module autoloading
2025-01-23 13:05:05 -06:00
Bjorn Helgaas
8d35c2b0eb Merge branch 'pci/controller/microchip'
- Set up the inbound address translation based on whether the platform
  allows coherent or non-coherent DMA (Daire McNamara)

- Update DT binding such that platforms are DMA-coherent by default and
  must specify 'dma-noncoherent' if needed (Conor Dooley)

* pci/controller/microchip:
  dt-bindings: PCI: microchip,pcie-host: Allow dma-noncoherent
  PCI: microchip: Set inbound address translation for coherent or non-coherent mode
2025-01-23 13:05:04 -06:00
Bjorn Helgaas
1276ad01a2 Merge branch 'pci/controller/mediatek'
- Use clk_bulk_prepare_enable() instead of separate clk_bulk_prepare() and
  clk_bulk_enable() (Lorenzo Bianconi)

- Rearrange reset assert/deassert so they're both done in the *_power_up()
  callbacks (Lorenzo Bianconi)

- Document that Airoha EN7581 requires PHY init and power-on before PHY
  reset deassert, unlike other MediaTek Gen3 controllers (Lorenzo Bianconi)

- Move Airoha EN7581 post-reset delay from the en7581 clock .enable()
  method to mtk_pcie_en7581_power_up() (Lorenzo Bianconi)

- Sleep instead of delay during Airoha EN7581 power-up, since this is a
  non-atomic context (Lorenzo Bianconi)

- Skip PERST# assertion on Airoha EN7581 during probe and suspend/resume to
  avoid a hardware defect (Lorenzo Bianconi)

- Enable async probe to reduce system startup time (Douglas Anderson)

* pci/controller/mediatek:
  PCI: mediatek-gen3: Enable async probe by default
  PCI: mediatek-gen3: Avoid PCIe resetting via PERST# for Airoha EN7581 SoC
  PCI: mediatek-gen3: Rely on msleep() in mtk_pcie_en7581_power_up()
  PCI: mediatek-gen3: Move reset delay in mtk_pcie_en7581_power_up()
  PCI: mediatek-gen3: Add comment about initialization order in mtk_pcie_en7581_power_up()
  PCI: mediatek-gen3: Move reset/assert callbacks in .power_up()
  PCI: mediatek-gen3: Rely on clk_bulk_prepare_enable() in mtk_pcie_en7581_power_up()
2025-01-23 13:05:04 -06:00
Bjorn Helgaas
09b7c1627b Merge branch 'pci/controller/layerscape'
- Simplify by using syscon_regmap_lookup_by_phandle_args() instead of
  syscon_regmap_lookup_by_phandle() followed by
  of_property_read_u32_array() (Krzysztof Kozlowski)

* pci/controller/layerscape:
  PCI: layerscape: Use syscon_regmap_lookup_by_phandle_args
2025-01-23 13:05:03 -06:00
Bjorn Helgaas
5b9c74b635 Merge branch 'pci/controller/imx6'
- Add DT compatible string 'fsl,imx8q-pcie-ep' and driver support for
  i.MX8Q series (i.MX8QM, i.MX8QXP, and i.MX8DXL) Endpoints (Frank Li)

- Add DT binding for optional i.MX95 Refclk and driver support to enable it
  if the platform hasn't enabled it (Richard Zhu)

- Configure PHY based on controller being in Root Complex or Endpoint mode
  (Frank Li)

- Rely on dbi2 and iATU base addresses from DT via dw_pcie_get_resources()
  instead of hardcoding them in imx6 (Richard Zhu)

- Skip controller_id computation for i.MX7D since it only has one
  controller (Richard Zhu)

- Deassert apps_reset in imx_pcie_deassert_core_reset() since it is
  asserted in imx_pcie_assert_core_reset() (Richard Zhu)

- Add missing reference clock enable or disable logic for IMX6SX, IMX7D,
  IMX8MM (Richard Zhu)

- Remove redundant imx7d_pcie_init_phy() since imx7d_pcie_enable_ref_clk()
  does the same thing (Richard Zhu)

* pci/controller/imx6:
  PCI: imx6: Clean up comments and whitespace
  PCI: imx6: Remove surplus imx7d_pcie_init_phy() function
  PCI: imx6: Add missing reference clock disable logic
  PCI: imx6: Deassert apps_reset in imx_pcie_deassert_core_reset()
  PCI: imx6: Skip controller_id generation logic for i.MX7D
  PCI: imx6: Fetch dbi2 and iATU base addesses from DT
  PCI: imx6: Configure PHY based on Root Complex or Endpoint mode
  PCI: imx6: Add Refclk for i.MX95 PCIe
  dt-bindings: PCI: fsl,imx6q-pcie: Add Refclk for i.MX95 RC
  PCI: imx6: Add i.MX8Q PCIe Endpoint (EP) support
  dt-bindings: PCI: fsl,imx6q-pcie-ep: Add compatible string fsl,imx8q-pcie-ep

# Conflicts:
#	drivers/pci/controller/dwc/pci-imx6.c
2025-01-23 13:05:03 -06:00
Bjorn Helgaas
349b434b7a Merge branch 'pci/controller/dwc'
- Fix potential string truncation in dw_pcie_edma_irq_verify() (Niklas
  Cassel)

- Don't wait for link up in DWC core if driver can detect Link Up event
  (Krishna chaitanya chundru)

- If qcom 'global' IRQ is supported for detection of Link Up events, tell
  DWC core not to wait for link up (Krishna chaitanya chundru)

- Update ICC and OPP votes after Link Up events (Krishna chaitanya chundru)

- Use dw-rockchip dll_link_up IRQ to detect Link Up and enumerate devices
  so users don't have to manually rescan (Niklas Cassel)

- In dw-rockchip, the 'sys' interrupt is required and detects Link Up
  events, so tell DWC core not to wait for link up (Niklas Cassel)

- Always stop link in dw_pcie_suspend_noirq(), which is required at least
  for i.MX8QM to re-establish link on resume (Richard Zhu)

- Drop racy and unnecessary LTSSM state check before sending PME_TURN_OFF
  message in dw_pcie_suspend_noirq() (Richard Zhu)

- Add stubs for dw_pcie_suspend_noirq() dw_pcie_resume_noirq() when
  CONFIG_PCIE_DW_HOST is not defined so drivers don't need #ifdefs (Bjorn
  Helgaas)

- Use DWC core suspend/resume functions for imx6 (Frank Li)

- Add imx6 suspend/resume support for i.MX8MQ, i.MX8Q, and i.MX95 (Richard
  Zhu)

- Add struct of_pci_range.parent_bus_addr for devices that need their
  immediate parent bus address, not the CPU address, e.g., to program an
  internal Address Translation Unit (iATU) (Frank Li)

* pci/controller/dwc:
  PCI: dwc: Simplify config resource lookup
  of: address: Add parent_bus_addr to struct of_pci_range
  PCI: imx6: Add i.MX8MQ, i.MX8Q and i.MX95 PM support
  PCI: imx6: Use DWC common suspend resume method
  PCI: dwc: Add dw_pcie_suspend_noirq(), dw_pcie_resume_noirq() stubs for !CONFIG_PCIE_DW_HOST
  PCI: dwc: Remove LTSSM state test in dw_pcie_suspend_noirq()
  PCI: dwc: Always stop link in the dw_pcie_suspend_noirq
  PCI: dw-rockchip: Don't wait for link since we can detect Link Up
  PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ
  PCI: qcom: Update ICC and OPP values after Link Up event
  PCI: qcom: Don't wait for link if we can detect Link Up
  PCI: dwc: Don't wait for link up if driver can detect Link Up event
  PCI: dwc: Fix potential truncation in dw_pcie_edma_irq_verify()

# Conflicts:
#	drivers/pci/controller/dwc/pci-imx6.c
2025-01-23 13:04:59 -06:00
Bjorn Helgaas
8ee6c61633 Merge branch 'pci/controller/dra7xx'
- Simplify by using syscon_regmap_lookup_by_phandle_args() instead of
  syscon_regmap_lookup_by_phandle() followed by
  of_parse_phandle_with_fixed_args() or of_property_read_u32_index()
  (Krzysztof Kozlowski)

* pci/controller/dra7xx:
  PCI: dra7xx: Use syscon_regmap_lookup_by_phandle_args
2025-01-23 13:04:54 -06:00
Bjorn Helgaas
ec3619abbf Merge branch 'pci/controller/iommu-map'
- Add host bridge .enable_device() and .disable_device() hooks for bridges
  that need to configure things like Requester ID to StreamID mapping when
  enabling devices (Frank Li)

- Add imx6 Requester ID to StreamID mapping configuration when enabling
  devices (Frank Li)

- Extend struct pci_ecam_ops with .enable_device() and .disable_device()
  hooks so drivers that use pci_host_common_probe() instead of their own
  .probe() have a way to set the .enable_device() callbacks (Marc Zyngier)

- Convert pcie-apple StreamID mapping configuration from a bus notifier to
  the .enable_device() and .disable_device() callbacks (Marc Zyngier)

* pci/controller/iommu-map:
  PCI: apple: Convert to {en,dis}able_device() callbacks
  PCI: host-generic: Allow {en,dis}able_device() to be provided via pci_ecam_ops
  PCI: imx6: Add IOMMU and ITS MSI support for i.MX95
  PCI: Add enable_device() and disable_device() callbacks for bridges
2025-01-23 13:04:53 -06:00
Bjorn Helgaas
e321b10b83 Merge branch 'pci/endpoint-test'
- Clear pci-epf-test dma_chan_rx, not dma_chan_tx, after freeing
  dma_chan_rx (Mohamed Khalfella)

- Correct the DMA MEMCPY test so it doesn't fail if the Endpoint supports
  both DMA_PRIVATE and DMA_MEMCPY (Manivannan Sadhasivam)

- Add pci-epf-test and pci_endpoint_test support for capabilities (Niklas
  Cassel)

- Add Endpoint test for consecutive BARs (Niklas Cassel)

- Remove redundant comparison from Endpoint BAR test because a > 1MB BAR
  can always be exactly covered by iterating with a 1MB buffer (Hans Zhang)

- Correct the PCI Endpoint test IOCTL return value (Manivannan Sadhasivam)

- Move PCI Endpoint tests from tools/pci to Kselftests (Manivannan
  Sadhasivam)

- Convert PCI Endpoint tests to the Kselftest framework (Manivannan
  Sadhasivam)

* pci/endpoint-test:
  selftests: pci_endpoint: Migrate to Kselftest framework
  selftests: Move PCI Endpoint tests from tools/pci to Kselftests
  misc: pci_endpoint_test: Fix IOCTL return value
  misc: pci_endpoint_test: Remove redundant 'remainder' test
  misc: pci_endpoint_test: Add consecutive BAR test
  misc: pci_endpoint_test: Add support for capabilities
  PCI: endpoint: pci-epf-test: Add support for capabilities
  PCI: endpoint: pci-epf-test: Fix check for DMA MEMCPY test
  PCI: endpoint: pci-epf-test: Set dma_chan_rx pointer to NULL on error
2025-01-23 13:04:53 -06:00
Bjorn Helgaas
74855f6697 Merge branch 'pci/endpoint'
- Destroy the EPC device in devm_pci_epc_destroy(), which previously didn't
  call devres_release() (Zijun Hu)

- Simplify pci_epc_get() with class_find_device_by_name() (Zijun Hu)

- Finish virtual EP removal in pci_epf_remove_vepf(), which previously
  caused a subsequent pci_epf_add_vepf() to fail with -EBUSY (Zijun Hu)

- Write BAR_MASK before iATU registers in pci_epc_set_bar() so we don't
  depend on the BAR_MASK reset value being larger than the requested BAR
  size (Niklas Cassel)

- Prevent changing BAR size/flags in pci_epc_set_bar() to prevent reads
  from bypassing the iATU if we reduced the BAR size (Niklas Cassel)

- Verify address alignment when programming iATU so we don't attempt to
  write bits that are read-only because of the BAR size, which could lead
  to directing accesses to the wrong address (Niklas Cassel)

- Implement artpec6 pci_epc_features so we can rely on all drivers
  supporting it so we can use it in EPC core code (Niklas Cassel)

- Check for BARs of fixed size to prevent endpoint drivers from trying to
  change their size (Niklas Cassel)

- Verify that requested BAR size is a power of two when endpoint driver
  sets the BAR (Niklas Cassel)

* pci/endpoint:
  PCI: endpoint: Verify that requested BAR size is a power of two
  PCI: endpoint: Add size check for fixed size BARs in pci_epc_set_bar()
  PCI: artpec6: Implement dw_pcie_ep operation get_features
  PCI: dwc: ep: Add 'address' alignment to 'size' check in dw_pcie_prog_ep_inbound_atu()
  PCI: dwc: ep: Prevent changing BAR size/flags in pci_epc_set_bar()
  PCI: dwc: ep: Write BAR_MASK before iATU registers in pci_epc_set_bar()
  PCI: endpoint: Finish virtual EP removal in pci_epf_remove_vepf()
  PCI: endpoint: Simplify pci_epc_get()
  PCI: endpoint: Destroy the EPC device in devm_pci_epc_destroy()
  PCI: endpoint: Replace magic number '6' by PCI_STD_NUM_BARS
2025-01-23 13:04:52 -06:00
Bjorn Helgaas
770b18a541 Merge branch 'pci/switchtec'
- Add Microchip PCI100X device IDs (Rakesh Babu Saladi)

* pci/switchtec:
  PCI: switchtec: Add Microchip PCI100X device IDs
2025-01-23 13:04:52 -06:00
Bjorn Helgaas
60b5cd4f82 Merge branch 'pci/pci-sysfs'
- Move reset related sysfs code from pci.c to pci-sysfs.c where other
  similar code lives (Ilpo Järvinen)

- Simplify reset_method_store() memory management by using __free() instead
  of explicit kfree() cleanup (Ilpo Järvinen)

- Drop unnecessary zero initializer (Ilpo Järvinen)

* pci/pci-sysfs:
  PCI/sysfs: Remove unnecessary zero in initializer
  PCI/sysfs: Use __free() in reset_method_store()
  PCI/sysfs: Move reset related sysfs code to correct file
2025-01-23 13:04:51 -06:00
Bjorn Helgaas
ccbd884f9e Merge branch 'pci/of'
- Unexport of_pci_parse_bus_range() since it's only used in of.c (Bjorn
  Helgaas)

- Drop 'No bus range found' message so we don't complain when DTs don't
  specify the default 'bus-range = <0x00 0xff>' (Bjorn Helgaas)

- Simplify devm_of_pci_get_host_bridge_resources() interface by dropping
  parameters that are always the same default values (Bjorn Helgaas)

- Update comment reference to of_pci_get_host_bridge_resources(), which no
  longer exists (Bjorn Helgaas)

- Rename the drivers/pci/of_property.c struct of_pci_range to
  of_pci_range_entry to avoid confusion with the global of_pci_range in
  include/linux/of_address.h (Bjorn Helgaas)

* pci/of:
  PCI: of_property: Rename struct of_pci_range to of_pci_range_entry
  sparc/PCI: Update reference to devm_of_pci_get_host_bridge_resources()
  PCI: of: Simplify devm_of_pci_get_host_bridge_resources() interface
  PCI: of: Drop 'No bus range found' message
  PCI: Unexport of_pci_parse_bus_range()
2025-01-23 13:04:51 -06:00
Bjorn Helgaas
f4a09274c5 Merge branch 'pci/err'
- Unexport pcie_read_tlp_log() to encourage drivers to use PCI core logging
  rather than building their own (Ilpo Järvinen)

- Move TLP Log handling to its own file (Ilpo Järvinen)

- Add #defines for TLP Header/Prefix log sizes (Ilpo Järvinen)

- Store number of supported End-End TLP Prefixes always so we can read the
  correct number of DWORDs from the TLP Prefix Log (Ilpo Järvinen)

- Read TLP Prefixes in addition to the Header Log in pcie_read_tlp_log()
  (Ilpo Järvinen)

- Add pcie_print_tlp_log() to consolidate printing of TLP Header and Prefix
  Log (Ilpo Järvinen)

* pci/err:
  PCI: Add pcie_print_tlp_log() to print TLP Header and Prefix Log
  PCI: Add TLP Prefix reading to pcie_read_tlp_log()
  PCI: Store number of supported End-End TLP Prefixes
  PCI: Use unsigned int i in pcie_read_tlp_log()
  PCI: Use same names in pcie_read_tlp_log() prototype and definition
  PCI: Add defines for TLP Header/Prefix log sizes
  PCI: Move TLP Log handling to its own file
  PCI: Don't expose pcie_read_tlp_log() outside PCI subsystem
2025-01-23 13:04:50 -06:00
Bjorn Helgaas
5a1d568ad1 Merge branch 'pci/enumeration'
- Batch sizing of multiple BARs while memory decoding is disabled instead
  of disabling/enabling decoding for each BAR individually; this optimizes
  virtualized environments where toggling decoding enable is expensive
  (Alex Williamson)

* pci/enumeration:
  PCI: Batch BAR sizing operations
2025-01-23 13:04:50 -06:00
Bjorn Helgaas
06fd49ef47 Merge branch 'pci/dpc'
- Quirk the Intel Raptor Lake-P PIO log size to accommodate vendor BIOSes
  that don't configure it correctly (Takashi Iwai)

* pci/dpc:
  PCI/DPC: Quirk PIO log size for Intel Raptor Lake-P
2025-01-23 13:04:50 -06:00
Bjorn Helgaas
a3eda0e333 Merge branch 'pci/devres'
- Update resource request API documentation to encourage callers to supply
  a driver name when requesting resources (Philipp Stanner)

- Export pci_intx_unmanaged() and pcim_intx() (always managed) so callers
  of pci_intx() (which is sometimes managed) can explicitly choose the one
  they need (Philipp Stanner)

- Convert drivers from pci_intx() to always-managed pcim_intx() or
  never-managed pci_intx_unmanaged(): amd_sfh, ata (ahci, ata_piix,
  pata_rdc, sata_sil24, sata_sis, sata_uli, sata_vsc), bnx2x, bna, ntb,
  qtnfmac, rtsx, tifm_7xx1, vfio, xen-pciback (Philipp Stanner)

- Remove pci_intx_unmanaged() since pci_intx() is now always unmanaged and
  pcim_intx() is always managed (Philipp Stanner)

* pci/devres:
  PCI: Remove devres from pci_intx()
  net/ethernet: Use never-managed version of pci_intx()
  HID: amd_sfh: Use always-managed version of pcim_intx()
  wifi: qtnfmac: use always-managed version of pcim_intx()
  ata: Use always-managed version of pci_intx()
  PCI/MSI: Use never-managed version of pci_intx()
  vfio/pci: Use never-managed version of pci_intx()
  misc: Use never-managed version of pci_intx()
  ntb: Use never-managed version of pci_intx()
  drivers/xen: Use never-managed version of pci_intx()
  PCI: Export pci_intx_unmanaged() and pcim_intx()
  PCI: Encourage resource request API users to supply driver name
2025-01-23 13:04:49 -06:00
Alex Williamson
4453f36086 PCI: Batch BAR sizing operations
Toggling memory enable is free on bare metal, but potentially expensive
in virtualized environments as the device MMIO spaces are added and
removed from the VM address space, including DMA mapping of those spaces
through the IOMMU where peer-to-peer is supported.  Currently memory
decode is disabled around sizing each individual BAR, even for SR-IOV
BARs while VF Enable is cleared.

This can be better optimized for virtual environments by sizing a set
of BARs at once, stashing the resulting mask into an array, while only
toggling memory enable once.  This also naturally improves the SR-IOV
path as the caller becomes responsible for any necessary decode disables
while sizing BARs, therefore SR-IOV BARs are sized relying only on the
VF Enable rather than toggling the PF memory enable in the command
register.

Link: https://lore.kernel.org/r/20250120182202.1878581-1-alex.williamson@redhat.com
Reported-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Link: https://lore.kernel.org/r/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Reviewed-by: Mitchell Augustin <mitchell.augustin@canonical.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-23 11:05:20 -06:00
Linus Torvalds
641b0c64b8 A pretty quiet cycle this time around. We have a bunch of new Qualcomm clk
drivers, per usual, and then a handful of drivers for other SoCs. Then the
 usual pile of cleanups is fairly small data fixes or converting DT bindings to
 YAML so they can be validated. No changes to the core framework besides an OF
 node refcount bump that never got decremented.
 
 New Drivers:
  - 5L35023 variant of Versa 3 clock generator
  - Various Qualcomm clk controllers: IPQ CMN PLL, SM6115 LPASS, SM750 global,
    tcsr, rpmh, and display. X Plus GPU and global. QCS615 rpmh and MSM8937 and
    MSM8940 RPM.
  - Qualcomm Pongo and Taycan Alpha PLLs
  - Qualcomm IPQ5424 NoC-related interconnect clks
  - Renesas RZ/G3E (R9A09G047) SoC clk driver
  - SAMA7D65 SoC clk driver
  - Samsung Exynos990 SoC clk driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmeQNfYRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSUuwRAAkKea3uRcSkTgHK3Ts0gmf8L2QS+dL47N
 OFmqhhdF0gYU60kzsaU0A6UGvaagq/rkB8nvZJ6G8/wV6T0jXHmxuCmZ7uRaErpt
 4KDjpS9qQ8sl5LXpuxh9LgfxcOOfAueWRpmF/5alHEtAQLXKHKV5CdcyYa71pj40
 +LfjoaW6xaqx+G3lqJhakY77zKiRzxWH86XQS5CHD3DITkv3B5/dV/nQlAb3P083
 7SzHXKbBpWpXH0y0pLTXZDTVCsHl90t1DO7JKt9Y1fOxtpLB/ROfLPOJ4cZyCQGH
 Y28ZWDA9jEEX/cz/R2qPY3mRUPrFp2ArsXsx1rKlPTabp4NZLs3d9tZiMI/irK/W
 GTkRKMUZlDD5w6jSYgmSTbTj2CsTsPXc8EzsNIFudl6WyzyxWHvnpUb+hdrR2B+0
 untNOkwcb8GzgucYrbK5s/Aw03CiyGTYZHGJxsnIr7uSYRxe8mlV/cIbDcn5+WWj
 rrOcPatLEnCeE1Eldm6cOzFsLMbBVP9HeNkms91y2AJDx4mWn8qyY0psX+HaNyBm
 1YZBVmo2PiZ84ZEhiK7WhPPMaDyR2ZSQS0/U5FaB56G9+rtuVYs8Z7KFS3nK27Rh
 oKWcdKDn1wUmtUhVggC+m4PueOH3dlM0ELaRNKzePx9rEimjWhzfy5GlOvPoaBAl
 MKOVgeLYa4c=
 =wK9g
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
 "A pretty quiet cycle this time around. We have a bunch of new Qualcomm
  clk drivers, per usual, and then a handful of drivers for other SoCs.
  Then the usual pile of cleanups is fairly small data fixes or
  converting DT bindings to YAML so they can be validated.

  No changes to the core framework besides an OF node refcount bump that
  never got decremented.

  New Drivers:

   - 5L35023 variant of Versa 3 clock generator

   - Various Qualcomm clk controllers: IPQ CMN PLL, SM6115 LPASS, SM750
     global, tcsr, rpmh, and display. X Plus GPU and global. QCS615 rpmh
     and MSM8937 and MSM8940 RPM.

   - Qualcomm Pongo and Taycan Alpha PLLs

   - Qualcomm IPQ5424 NoC-related interconnect clks

   - Renesas RZ/G3E (R9A09G047) SoC clk driver

   - SAMA7D65 SoC clk driver

   - Samsung Exynos990 SoC clk driver"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (159 commits)
  clk: analogbits: Fix incorrect calculation of vco rate delta
  clk: bcm: rpi: Add disp clock
  clk: bcm: rpi: Create helper to retrieve private data
  clk: bcm: rpi: Enable minimize for all firmware clocks
  clk: bcm: rpi: Allow cpufreq driver to also adjust gpu clocks
  clk: bcm: rpi: Add ISP to exported clocks
  clk: stm32f4: support spread spectrum clock generation
  clk: stm32f4: use FIELD helpers to access the PLLCFGR fields
  dt-bindings: clock: st,stm32-rcc: support spread spectrum clocking
  dt-bindings: clock: convert stm32 rcc bindings to json-schema
  clk: Use str_enable_disable-like helpers
  clk: clk-loongson2: Fix the number count of clk provider
  clk: clk-loongson2: Switch to use devm_clk_hw_register_fixed_rate_parent_data()
  clk: starfive: Make _clk_get become a common helper function
  clk: en7523: Add clock for eMMC for EN7581
  dt-bindings: clock: add ID for eMMC for EN7581
  dt-bindings: clock: drop NUM_CLOCKS define for EN7581
  clk: en7523: Rework clock handling for different clock numbers
  clk: thead: Fix cpu2vp_clk for TH1520 AP_SUBSYS clocks
  clk: thead: Add CLK_IGNORE_UNUSED to fix TH1520 boot
  ...
2025-01-22 10:54:18 -08:00
Daire McNamara
1390a33b3d PCI: microchip: Set inbound address translation for coherent or non-coherent mode
On Microchip PolarFire SoC the PCIe Root Port can be behind one of three
general purpose Fabric Interface Controller (FIC) buses that encapsulates
an AXI-S bus. Depending on which FIC(s) the Root Port is connected through
to CPU space, and what address translation is done by that FIC, the Root
Port driver's inbound address translation may vary.

For all current supported designs and all future expected designs, inbound
address translation done by a FIC on PolarFire SoC varies depending on
whether PolarFire SoC is operating in coherent DMA mode or noncoherent DMA
mode.

The setup of the outbound address translation tables in the Root Port
driver only needs to handle these two cases.

Setup the inbound address translation tables to one of two address
translations, depending on whether the Root Port is being used with
coherent DMA or noncoherent DMA.

Link: https://lore.kernel.org/r/20241011140043.1250030-3-daire.mcnamara@microchip.com
Fixes: 6f15a9c9f9 ("PCI: microchip: Add Microchip PolarFire PCIe controller driver")
Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com>
[bhelgaas: adapt for ac7f53b7e7 ("PCI: microchip: Add support for using
either Root Port 1 or 2")]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
2025-01-21 17:34:56 -06:00
Wolfram Sang
816875a468 PCI: Don't include 'pm_wakeup.h' directly
The header clearly states that it does not want to be included directly,
only via 'device.h'. The 'platform_device.h' works equally well.

Thus, remove the direct inclusion.

Link: https://lore.kernel.org/r/20241118072917.3853-12-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-21 17:29:07 -06:00
Linus Torvalds
4c551165e7 Updates for the interrupt subsystem:
- Consolidation of the machine_kexec_mask_interrupts() by providing a
     generic implementation and replacing the copy & pasta orgy in the
     relevant architectures.
 
   - Prevent unconditional operations on interrupt chips during kexec
     shutdown, which can trigger warnings in certain cases when the
     underlying interrupt has been shut down before.
 
   - Make the enforcement of interrupt handling in interrupt context
     unconditionally available, so that it actually works for non x86
     related interrupt chips. The earlier enablement for ARM GIC chips set
     the required chip flag, but did not notice that the check was hidden
     behind a config switch which is not selected by ARM[64].
 
   - Decrapify the handling of deferred interrupt affinity setting. Some
     interrupt chips require that affinity changes are made from the context
     of handling an interrupt to avoid certain race conditions. For x86 this
     was the default, but with interrupt remapping this requirement was
     lifted and a flag was introduced which tells the core code that
     affinity changes can be done in any context. Unrestricted affinity
     changes are the default for the majority of interrupt chips. RISCV has
     the requirement to add the deferred mode to one of it's interrupt
     controllers, but with the original implementation this would require to
     add the any context flag to all other RISC-V interrupt chips. That's
     backwards, so reverse the logic and require that chips, which need the
     deferred mode have to be marked accordingly. That avoids chasing the
     'sane' chips and marking them.
 
   - Add multi-node support to the Loongarch AVEC interrupt controller
     driver.
 
   - The usual tiny cleanups, fixes and improvements all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmePkVITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRbQD/9bHVph/V9Ekl7JAX3aY4gG4JbRhOc7
 dp1VAcHRhktRfoTztYRbjsbMu2nvZ58GKA8bkOS2jHSF/m3PbkIJfOhwk0YdIAoa
 +kdy5yDgqCGfkqW43DN4Cr+CnzGjWMitw67tFp3fhwehMDpDjdt2L28IjtanSS0f
 hO6FV7o65MWeJwxk4Isb2/nvkO+X23Lrp6RrWS8SXBnF9FFXxiPIg/fiOPTizhCh
 1W/bSGxLLb9WwsVzmlGAKVFlXDij0QGaIUug2fdVZ63OsELXD7tJrLSPG133yk92
 ppIa0s6BT4IBsfM00us4hG15PkLuJmP3yWWcoquG0rP8Wq58VOXiN6+rcJIyvB+5
 mWceTH6IKfZGoRQKwXC7BxeBAIb147reiJtb06meq1/8ADIvzafiNy0c8x9i/UaV
 QiyhPVENjaGCGDomZmJQqN7Yb02Wge1k8InQnodDrHxZNl/bX/B1Z8Bxd0n6hPHg
 NSJXYif2AxgaddpohsdygqRDbT6SNyQdj7YjJFY5qAGJ3yFyJ4JB6WTqkWW4o1vH
 3FVqdAnJmejAmmYSkah0Hkem2T5QASQmTWb93PLxiV6q+d0NM8stWAujjyVdIV/B
 W4Uj9mQ20cz54TjLtxqX+A1k6KcqOWRgh1l2QbUlFsgsOP3V8yz47yqYdR9qMWlO
 9kNEjI3sw+G/IQ==
 =q4rj
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:

 - Consolidate the machine_kexec_mask_interrupts() by providing a
   generic implementation and replacing the copy & pasta orgy in the
   relevant architectures.

 - Prevent unconditional operations on interrupt chips during kexec
   shutdown, which can trigger warnings in certain cases when the
   underlying interrupt has been shut down before.

 - Make the enforcement of interrupt handling in interrupt context
   unconditionally available, so that it actually works for non x86
   related interrupt chips. The earlier enablement for ARM GIC chips set
   the required chip flag, but did not notice that the check was hidden
   behind a config switch which is not selected by ARM[64].

 - Decrapify the handling of deferred interrupt affinity setting.

   Some interrupt chips require that affinity changes are made from the
   context of handling an interrupt to avoid certain race conditions.
   For x86 this was the default, but with interrupt remapping this
   requirement was lifted and a flag was introduced which tells the core
   code that affinity changes can be done in any context. Unrestricted
   affinity changes are the default for the majority of interrupt chips.

   RISCV has the requirement to add the deferred mode to one of it's
   interrupt controllers, but with the original implementation this
   would require to add the any context flag to all other RISC-V
   interrupt chips. That's backwards, so reverse the logic and require
   that chips, which need the deferred mode have to be marked
   accordingly. That avoids chasing the 'sane' chips and marking them.

 - Add multi-node support to the Loongarch AVEC interrupt controller
   driver.

 - The usual tiny cleanups, fixes and improvements all over the place.

* tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set()
  genirq/timings: Add kernel-doc for a function parameter
  genirq: Remove IRQ_MOVE_PCNTXT and related code
  x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  genirq: Provide IRQCHIP_MOVE_DEFERRED
  hexagon: Remove GENERIC_PENDING_IRQ leftover
  ARC: Remove GENERIC_PENDING_IRQ
  genirq: Remove handle_enforce_irqctx() wrapper
  genirq: Make handle_enforce_irqctx() unconditionally available
  irqchip/loongarch-avec: Add multi-nodes topology support
  irqchip/ts4800: Replace seq_printf() by seq_puts()
  irqchip/ti-sci-inta : Add module build support
  irqchip/ti-sci-intr: Add module build support
  irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function
  irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args
  genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown
  kexec: Consolidate machine_kexec_mask_interrupts() implementation
  genirq: Reuse irq_thread_fn() for forced thread case
  genirq: Move irq_thread_fn() further up in the code
2025-01-21 13:51:07 -08:00
Rakesh Babu Saladi
a3282f84b2 PCI: switchtec: Add Microchip PCI100X device IDs
Add Microchip parts to the Device ID table so the driver supports PCI100x
devices.

Add a new macro to quirk the Microchip Switchtec PCI100x parts to allow DMA
access via NTB to work when the IOMMU is turned on.

PCI100x family has 6 variants; each variant is designed for different
application usages, different port counts and lane counts:

  PCI1001 has 1 x4 upstream port and 3 x4 downstream ports
  PCI1002 has 1 x4 upstream port and 4 x2 downstream ports
  PCI1003 has 2 x4 upstream ports, 2 x2 upstream ports, and 2 x2
    downstream ports
  PCI1004 has 4 x4 upstream ports
  PCI1005 has 1 x4 upstream port and 6 x2 downstream ports
  PCI1006 has 6 x2 upstream ports and 2 x2 downstream ports

[Historical note: these parts use PCI_VENDOR_ID_EFAR (0x1055), from EFAR
Microsystems, which was acquired in 1996 by Standard Microsystems Corp,
which was acquired by Microchip Technology in 2012.  The PCI-SIG confirms
that Vendor ID 0x1055 is assigned to Microchip even though it's not
visible via https://pcisig.com/membership/member-companies]

Link: https://lore.kernel.org/r/20250120095524.243103-1-Saladi.Rakeshbabu@microchip.com
Signed-off-by: Rakesh Babu Saladi <Saladi.Rakeshbabu@microchip.com>
[bhelgaas: Vendor ID history]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Logan Gunthorpe <logang@deltatee.com>
2025-01-21 10:47:28 -06:00
Niklas Cassel
8a02612f85 PCI: endpoint: pci-epf-test: Add support for capabilities
The test BAR is on the EP side is allocated using pci_epf_alloc_space(),
which allocates the backing memory using dma_alloc_coherent(), which will
return zeroed memory regardless of __GFP_ZERO was set or not.

This means that running a new version of pci-endpoint-test.c (host side)
with an old version of pci-epf-test.c (EP side) will not see any
capabilities being set (as intended), so this is backwards compatible.

Additionally, the EP side always allocates at least 128 bytes for the test
BAR (excluding the MSI-X table), this means that adding another register at
offset 0x30 is still within the 128 available bytes.

For now, we only add the CAP_UNALIGNED_ACCESS capability.

Set CAP_UNALIGNED_ACCESS if the EPC driver can handle any address (because
it implements the .align_addr callback).

Link: https://lore.kernel.org/r/20241203063851.695733-5-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-21 09:44:14 -06:00
Manivannan Sadhasivam
235c2b197a PCI: endpoint: pci-epf-test: Fix check for DMA MEMCPY test
Currently, if DMA MEMCPY test is requested by the host, and if the endpoint
DMA controller supports DMA_PRIVATE, the test will fail. This is not
correct since there is no check for DMA_MEMCPY capability and the DMA
controller can support both DMA_PRIVATE and DMA_MEMCPY.

Fix the check and also reword the error message.

Link: https://lore.kernel.org/r/20250116171650.33585-2-manivannan.sadhasivam@linaro.org
Fixes: 8353813c88 ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Reported-by: Niklas Cassel <cassel@kernel.org>
Closes: https://lore.kernel.org/linux-pci/Z3QtEihbiKIGogWA@ryzen
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2025-01-21 09:44:13 -06:00
Mohamed Khalfella
b1b1f4b129 PCI: endpoint: pci-epf-test: Set dma_chan_rx pointer to NULL on error
If dma_chan_tx allocation fails, set dma_chan_rx to NULL after it is
freed.

Link: https://lore.kernel.org/r/20241227160841.92382-1-khalfella@gmail.com
Fixes: 8353813c88 ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Signed-off-by: Mohamed Khalfella <khalfella@gmail.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-21 09:44:13 -06:00
Linus Torvalds
1cbfb828e0 for-6.14/block-20250118
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmeL6hoQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgppw2EADQV8nDgLRggZR+il4U03yKHXcQEdAX1GrB
 Erowx+dasIJuh6kp3n6qRe9QD/pRqt1DKyLvXoWF8Qfuwq85j7oDnDDYxutNYT27
 hDgrLJriJ3VeKYtTu+andHWt8P29b5h57UayInDOUJurEPA6rXyFZ5YVIti8n21K
 uDOrQXiACG3qRWS2+p2f3UNhX0MkFNFdN/lxi13WMIJtRWF5bXAP+JOgIWCID4Ze
 QuSY6rQD4dp4Q6M2erpX6tn0YZb7Hvw3rPjsd91n6jvYfTUVLH375zg8jCBpi6Wi
 Syufbb8xcTtriVPTDRNu0ekjebkc8wD8ax/h86g0z9v3Ua4DlNmsx9eXrtv6r5nu
 YXqDODOad6stI0+owFquW2vas0gHmfNSfyfGdlk2g24PMtP5Yx0V6FIEvwIeqnje
 ghgxQvBuKUsdhqakByfNnc+XvXi3+RUJek8kvMeUSUQWT1IyMQqPOOk0yp9WdyWD
 bY1f2ECP5BR1b37zYOyawewsI5xTupHUswn5a4r4qtGn3O15rGDkX98Nab5aLCnR
 rW/DvX7+wT6gW9EwrRHiwjwfNDZbsJ9Ggu3lMhtUl5GUWdk58yTiVgKaHJLnlX9/
 CKFKfyyIR1Vl8+gYIpemyFhhcoN+dCSf06ISkrg0jeS0/tYwydaAaCBPL5J4kxZA
 h3Rtbh+Pgg==
 =EXYs
 -----END PGP SIGNATURE-----

Merge tag 'for-6.14/block-20250118' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull requests via Keith:
      - Target support for PCI-Endpoint transport (Damien)
      - TCP IO queue spreading fixes (Sagi, Chaitanya)
      - Target handling for "limited retry" flags (Guixen)
      - Poll type fix (Yongsoo)
      - Xarray storage error handling (Keisuke)
      - Host memory buffer free size fix on error (Francis)

 - MD pull requests via Song:
      - Reintroduce md-linear (Yu Kuai)
      - md-bitmap refactor and fix (Yu Kuai)
      - Replace kmap_atomic with kmap_local_page (David Reaver)

 - Quite a few queue freeze and debugfs deadlock fixes

   Ming introduced lockdep support for this in the 6.13 kernel, and it
   has (unsurprisingly) uncovered quite a few issues

 - Use const attributes for IO schedulers

 - Remove bio ioprio wrappers

 - Fixes for stacked device atomic write support

 - Refactor queue affinity helpers, in preparation for better supporting
   isolated CPUs

 - Cleanups of loop O_DIRECT handling

 - Cleanup of BLK_MQ_F_* flags

 - Add rotational support for null_blk

 - Various fixes and cleanups

* tag 'for-6.14/block-20250118' of git://git.kernel.dk/linux: (106 commits)
  block: Don't trim an atomic write
  block: Add common atomic writes enable flag
  md/md-linear: Fix a NULL vs IS_ERR() bug in linear_add()
  block: limit disk max sectors to (LLONG_MAX >> 9)
  block: Change blk_stack_atomic_writes_limits() unit_min check
  block: Ensure start sector is aligned for stacking atomic writes
  blk-mq: Move more error handling into blk_mq_submit_bio()
  block: Reorder the request allocation code in blk_mq_submit_bio()
  nvme: fix bogus kzalloc() return check in nvme_init_effects_log()
  md/md-bitmap: move bitmap_{start, end}write to md upper layer
  md/raid5: implement pers->bitmap_sector()
  md: add a new callback pers->bitmap_sector()
  md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
  md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
  md: Replace deprecated kmap_atomic() with kmap_local_page()
  md: reintroduce md-linear
  partitions: ldm: remove the initial kernel-doc notation
  blk-cgroup: rwstat: fix kernel-doc warnings in header file
  blk-cgroup: fix kernel-doc warnings in header file
  nbd: fix partial sending
  ...
2025-01-20 19:38:46 -08:00
Bjorn Helgaas
1108d677da PCI: dwc: Simplify config resource lookup
If platform_get_resource_byname("config") fails, return error immediately
and unindent the normal path.  No functional change intended.

Link: https://lore.kernel.org/r/20250117235119.712043-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-20 12:39:38 -06:00
Bjorn Helgaas
b881532991 PCI: imx6: Clean up comments and whitespace
For readability, fix typos and comments that needlessly exceed 80 columns.

Link: https://lore.kernel.org/r/20250118210727.795559-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-20 12:38:14 -06:00
Bjorn Helgaas
42d9972732 PCI: of_property: Rename struct of_pci_range to of_pci_range_entry
Previously there were two definitions of struct of_pci_range: one in
include/linux/of_address.h and another local to drivers/pci/of_property.c.

Rename the local struct of_pci_range to of_pci_range_entry to avoid
confusion.

Link: https://lore.kernel.org/r/20250117161037.643953-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
2025-01-18 15:18:38 -06:00
Richard Zhu
9d6b1bd6b3 PCI: imx6: Add i.MX8MQ, i.MX8Q and i.MX95 PM support
Add i.MX8MQ, i.MX8Q and i.MX95 PCIe suspend/resume support.

Link: https://lore.kernel.org/r/20241126075702.4099164-10-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-18 15:04:23 -06:00
Frank Li
a528d1a725 PCI: imx6: Use DWC common suspend resume method
Call common DWC suspend/resume function. Use DWC common iATU method to
send out PME_TURN_OFF message.

In old DWC implementations, PCIE_ATU_INHIBIT_PAYLOAD in iATU Ctrl2 register
is reserved, so the generic DWC implementation of sending the PME_Turn_Off
message using a dummy MMIO write cannot be used. Use the previous method to
kick off PME_TURN_OFF message for these platforms.

The System Reset Control (SRC) interface is used to toggle 'turnoff_reset'
to send PME_TURN_OFF and since the DWC implementation is used, it is not
needed now.

Replace the imx_pcie_stop_link() and imx_pcie_host_exit() by
dw_pcie_suspend_noirq() in imx_pcie_suspend_noirq().

Since dw_pcie_suspend_noirq() already does these, see below call stack:

  dw_pcie_suspend_noirq()
    dw_pcie_stop_link()
      imx_pcie_stop_link()
    pci->pp.ops->deinit()
      imx_pcie_host_exit()

Replace the imx_pcie_host_init(), dw_pcie_setup_rc() and
imx_pcie_start_link() by dw_pcie_resume_noirq() in imx_pcie_resume_noirq().

Since dw_pcie_resume_noirq() already does these, see below call stack:

  dw_pcie_resume_noirq()
    pci->pp.ops->init()
      imx_pcie_host_init()
    dw_pcie_setup_rc()
    dw_pcie_start_link()
      imx_pcie_start_link(;

Link: https://lore.kernel.org/r/20241126075702.4099164-9-hongxing.zhu@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-18 15:04:23 -06:00
Bjorn Helgaas
ec57335b81 PCI: dwc: Add dw_pcie_suspend_noirq(), dw_pcie_resume_noirq() stubs for !CONFIG_PCIE_DW_HOST
Previously pcie-designware.h declared dw_pcie_suspend_noirq() and
dw_pcie_resume_noirq() unconditionally, even though they were only
implemented when CONFIG_PCIE_DW_HOST was defined.

Add no-op stubs for them when CONFIG_PCIE_DW_HOST is not defined so
drivers that support both Root Complex and Endpoint modes don't need

Link: https://lore.kernel.org/r/20250117213810.GA656803@bhelgaas
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-18 15:03:40 -06:00
Philipp Stanner
dfa2f4d5f9 PCI: Remove devres from pci_intx()
pci_intx() is a hybrid function which can sometimes be managed through
devres. This hybrid nature is undesirable.

Since all users of pci_intx() have by now been ported either to
always-managed pcim_intx() or never-managed pci_intx_unmanaged(), the
devres functionality can be removed from pci_intx().

Consequently, pci_intx_unmanaged() is now redundant, because pci_intx()
itself is now unmanaged.

Remove the devres functionality from pci_intx(). Have all users of
pci_intx_unmanaged() call pci_intx(). Remove pci_intx_unmanaged().

Link: https://lore.kernel.org/r/20241209130632.132074-13-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
2025-01-18 14:38:49 -06:00
Philipp Stanner
b182cbaaa9 PCI/MSI: Use never-managed version of pci_intx()
pci_intx() is a hybrid function which can sometimes be managed through
devres. To remove this hybrid nature from pci_intx(), it is necessary to
port users to either an always-managed or a never-managed version.

MSI sets up its own separate devres callback implicitly in
pcim_setup_msi_release(). This callback ultimately uses pci_intx(), which
is problematic since the callback runs on driver detach.

That problem has last been described here:
https://lore.kernel.org/all/ee44ea7ac760e73edad3f20b30b4d2fff66c1a85.camel@redhat.com/

Replace the call to pci_intx() with one to the never-managed version
pci_intx_unmanaged().

Link: https://lore.kernel.org/r/20241209130632.132074-9-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2025-01-18 14:38:48 -06:00
Philipp Stanner
f546e8033d PCI: Export pci_intx_unmanaged() and pcim_intx()
pci_intx() is a hybrid function which sometimes performs devres operations,
depending on whether pcim_enable_device() has been used to enable the
pci_dev. This sometimes-managed nature of the function is problematic.
Notably, it causes the function to allocate under some circumstances which
makes it unusable from interrupt context.

Export pcim_intx() (which is always managed) and rename __pcim_intx()
(which is never managed) to pci_intx_unmanaged() and export it as well.

Then all callers of pci_intx() can be ported to the version they need,
depending whether they use pci_enable_device() or pcim_enable_device().

Link: https://lore.kernel.org/r/20241209130632.132074-3-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
2025-01-18 14:38:48 -06:00
Richard Zhu
112aba9a79 PCI: dwc: Remove LTSSM state test in dw_pcie_suspend_noirq()
It's safe to send PME_TURN_OFF message regardless of whether the link is up
or down, so don't test the LTSSM state before sending the PME_TURN_OFF
message.

Only print an error message when the LTSSM is not in DETECT or POLL. There
shouldn't be an error when no Endpoint is connected at all.

Link: https://lore.kernel.org/r/20241210081557.163555-3-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-18 13:38:34 -06:00
Richard Zhu
86a016e278 PCI: dwc: Always stop link in the dw_pcie_suspend_noirq
On the i.MX8QM, PCIe link can't be re-established again in
dw_pcie_resume_noirq(), if the LTSSM_EN bit is not cleared
properly in dw_pcie_suspend_noirq().

So, add dw_pcie_stop_link() to dw_pcie_suspend_noirq() to fix
this issue and to align the suspend/resume functions since there
is dw_pcie_start_link() in dw_pcie_resume_noirq() already.

Fixes: 4774faf854 ("PCI: dwc: Implement generic suspend/resume functionality")
Link: https://lore.kernel.org/r/20241210081557.163555-2-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-18 11:35:26 -06:00
Niklas Cassel
ec9fd499b9 PCI: dw-rockchip: Don't wait for link since we can detect Link Up
The Root Complex specific device tree binding for pcie-dw-rockchip has the
'sys' interrupt marked as required.

The driver requests the 'sys' IRQ unconditionally, and errors out if not
provided.

Thus, we can unconditionally set 'use_linkup_irq', so dw_pcie_host_init()
doesn't wait for the link to come up.

This will skip the wait for link up (since the bus will be enumerated once
the link up IRQ is triggered), which reduces the bootup time.

Link: https://lore.kernel.org/r/20250113-rockchip-no-wait-v1-1-25417f37b92f@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-18 11:35:25 -06:00
Niklas Cassel
0e0b45ab5d PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ
Most boards using the pcie-dw-rockchip PCIe controller lack standard
hotplug support.

Thus, when an endpoint is attached to the SoC, users have to rescan the bus
manually to enumerate the device. This can be avoided by using the
'dll_link_up' interrupt in the combined system interrupt 'sys'.

Once the 'dll_link_up' IRQ is received, the bus underneath the host bridge
is scanned to enumerate PCIe endpoint devices.

This implements the same functionality that was implemented in the DWC
based pcie-qcom driver in 4581403f67 ("PCI: qcom: Enumerate endpoints
based on Link up event in 'global_irq' interrupt").

The Root Complex specific device tree binding for pcie-dw-rockchip already
has the 'sys' interrupt marked as required, so there is no need to update
the device tree binding. This also means that we can request the 'sys' IRQ
unconditionally.

Link: https://lore.kernel.org/r/20241127145041.3531400-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log, squash Pei Xiao's redundant dev_err() fix from
https://lore.kernel.org/r/327718207d3cd72847c079ff9d56eb246744c182.1736126067.git.xiaopei01@kylinos.cn,
squash Niklas's #define change from https://lore.kernel.org/r/20250103095812.2408364-2-cassel@kernel.org]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-18 11:35:25 -06:00
Krishna chaitanya chundru
f0639013d3 PCI: qcom: Update ICC and OPP values after Link Up event
4581403f67 ("PCI: qcom: Enumerate endpoints based on Link up event in
'global_irq' interrupt") added the Link Up-based enumeration support, but
did not update the ICC/OPP vote once link is up. Before that, the update
happened during probe and the endpoints may or may not be enumerated at
that time, so the ICC/OPP vote was not guaranteed to be accurate.

With Link Up-based enumeration support, the driver can request the accurate
vote based on the PCIe link.

Call qcom_pcie_icc_opp_update() in qcom_pcie_global_irq_thread() after
enumerating the endpoints.

Fixes: 4581403f67 ("PCI: qcom: Enumerate endpoints based on Link up event in 'global_irq' interrupt")
Link: https://lore.kernel.org/r/20241123-remove_wait2-v5-3-b5f9e6b794c2@quicinc.com
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2025-01-18 11:35:18 -06:00
Krishna chaitanya chundru
36971d6c5a PCI: qcom: Don't wait for link if we can detect Link Up
If we have a 'global' IRQ for Link Up events, we need not wait for the
link to be up during PCI initialization, which reduces startup time.

Check for 'global' IRQ, and if present, set 'use_linkup_irq',
so dw_pcie_host_init() doesn't wait for the link to come up.

Link: https://lore.kernel.org/r/20241123-remove_wait2-v5-2-b5f9e6b794c2@quicinc.com
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2025-01-18 11:35:11 -06:00
Krishna chaitanya chundru
8d3bf19f1b PCI: dwc: Don't wait for link up if driver can detect Link Up event
If the driver can detect the Link Up event and enumerate downstream devices
at that time, we need not wait here.

Skip waiting for link to come up if the driver supports 'use_linkup_irq'.

Link: https://lore.kernel.org/r/20241123-remove_wait2-v5-1-b5f9e6b794c2@quicinc.com
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: wrap comment, update commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2025-01-18 11:35:02 -06:00
Niklas Cassel
574913f9e1 PCI: dwc: Fix potential truncation in dw_pcie_edma_irq_verify()
Increase the size of the string buffer to avoid potential truncation in
dw_pcie_edma_irq_verify().

This fixes the following build warning when compiling with W=1:

  drivers/pci/controller/dwc/pcie-designware.c: In function ‘dw_pcie_edma_detect’:
  drivers/pci/controller/dwc/pcie-designware.c:989:50: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 3 [-Wformat-truncation=]
    989 |                 snprintf(name, sizeof(name), "dma%d", pci->edma.nr_irqs);
        |                                                  ^~

Link: https://lore.kernel.org/r/20250104002119.2681246-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-18 11:34:48 -06:00
Krzysztof Kozlowski
ad9afd7503 PCI: dra7xx: Use syscon_regmap_lookup_by_phandle_args
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Except simpler code this annotates within one line that given
phandle has arguments, so grepping for code would be easier.

There is also no real benefit in printing errors on missing syscon
argument, because this is done just too late: runtime check on
static/build-time data.  Dtschema and Devicetree bindings offer the
static/build-time check for this already.

Link: https://lore.kernel.org/r/20250112-syscon-phandle-args-pci-v1-1-fcb6ebcc0afc@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-16 14:38:53 -06:00
Krzysztof Kozlowski
149fc35734 PCI: layerscape: Use syscon_regmap_lookup_by_phandle_args
Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over
syscon_regmap_lookup_by_phandle() combined with getting the syscon
argument.  Except simpler code this annotates within one line that given
phandle has arguments, so grepping for code would be easier.

Link: https://lore.kernel.org/r/20250112-syscon-phandle-args-pci-v1-2-fcb6ebcc0afc@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Roy Zang <Roy.Zang@nxp.com>
2025-01-16 14:38:36 -06:00
Richard Zhu
6dd24b0a85 PCI: imx6: Remove surplus imx7d_pcie_init_phy() function
This function essentially duplicates imx7d_pcie_enable_ref_clk(). So remove
it.

Link: https://lore.kernel.org/r/20241126075702.4099164-8-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-16 14:29:38 -06:00
Richard Zhu
93d883f890 PCI: imx6: Add missing reference clock disable logic
Ensure the *_enable_ref_clk() function is symmetric by addressing missing
disable parts on some platforms.

Fixes: d0a75c791f ("PCI: imx6: Factor out ref clock disable to match enable")
Link: https://lore.kernel.org/r/20241126075702.4099164-7-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-16 14:29:38 -06:00
Richard Zhu
ef61c7d8d0 PCI: imx6: Deassert apps_reset in imx_pcie_deassert_core_reset()
Since the apps_reset is asserted in imx_pcie_assert_core_reset(), it should
be deasserted in imx_pcie_deassert_core_reset().

Fixes: 9b3fe6796d ("PCI: imx6: Add code to support i.MX7D")
Link: https://lore.kernel.org/r/20241126075702.4099164-6-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-16 14:29:38 -06:00
Richard Zhu
f068ffdd03 PCI: imx6: Skip controller_id generation logic for i.MX7D
The i.MX7D only has one PCIe controller, so controller_id should always be
0. The previous code is incorrect although yielding the correct result.

Fix by removing "IMX7D" from the switch case branch.

Fixes: 2d8ed461db ("PCI: imx6: Add support for i.MX8MQ")
Link: https://lore.kernel.org/r/20241126075702.4099164-5-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-16 14:29:29 -06:00
Richard Zhu
7ab93e6a08 PCI: imx6: Fetch dbi2 and iATU base addesses from DT
Since dw_pcie_get_resources() gets the dbi2 and iATU base addresses from
DT, remove the code from the imx6 driver that does the same.

Upstream DTSes have not enabled Endpoint function. So nothing is broken for
old upstream DTBs.

Link: https://lore.kernel.org/r/20241126075702.4099164-4-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-16 14:22:07 -06:00
Frank Li
de22e20589 PCI: imx6: Configure PHY based on Root Complex or Endpoint mode
Pass PHY_MODE_PCIE_EP if the PCI controller operates in Endpoint (EP) mode,
and fix the Root Complex (RC) mode being hardcoded using a drvdata mode
check.

Fixes: 8026f2d8e8 ("PCI: imx6: Call common PHY API to set mode, speed, and submode")
Link: https://lore.kernel.org/r/20241119-pci_fixup_addr-v8-6-c4bfa5193288@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
2025-01-16 14:21:53 -06:00
Richard Zhu
137250911f PCI: imx6: Add Refclk for i.MX95 PCIe
Add "ref" clock to enable Refclk. To avoid breaking DT backwards
compatibility, the i.MX95 "ref" clock is optional. Use
devm_clk_get_optional() to fetch i.MX95 PCIe optional clocks in driver.

If using external clock, "ref" clock should point to external reference.

If using internal clock, CREF_EN in LAST_TO_REG controls reference output,
implemented in drivers/clk/imx/clk-imx95-blk-ctl.c.

Link: https://lore.kernel.org/r/20241126075702.4099164-3-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2025-01-16 14:20:37 -06:00
Frank Li
687aedb73a PCI: imx6: Add i.MX8Q PCIe Endpoint (EP) support
Add support for the i.MX8Q series (i.MX8QM, i.MX8QXP, and i.MX8DXL) PCIe
Endpoint (EP). On the i.MX8Q platforms, the PCI bus addresses differ
from the CPU addresses. However, the DesignWare (DWC) driver already
handles this in the common code.

Link: https://lore.kernel.org/r/20241119-pci_fixup_addr-v8-7-c4bfa5193288@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-16 14:18:14 -06:00
Ilpo Järvinen
f68ea779d9 PCI: Add pcie_print_tlp_log() to print TLP Header and Prefix Log
Add pcie_print_tlp_log() to print TLP Header and Prefix Log.  Print End-End
Prefixes only if they are non-zero.

Consolidate the few places which currently print TLP using custom
formatting.

Link: https://lore.kernel.org/r/20250114170840.1633-9-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-01-16 12:05:43 -06:00
Ilpo Järvinen
ad41ddeeac PCI: Add TLP Prefix reading to pcie_read_tlp_log()
pcie_read_tlp_log() handles only 4 Header Log DWORDs but TLP Prefix Log
(PCIe r6.1 secs 7.8.4.12 & 7.9.14.13) may also be present.

Generalize pcie_read_tlp_log() and struct pcie_tlp_log to also handle TLP
Prefix Log. The relevant registers are formatted identically in AER and DPC
Capability, but has these variations:

  a) The offsets of TLP Prefix Log registers vary.

  b) DPC RP PIO TLP Prefix Log register can be < 4 DWORDs.

  c) AER TLP Prefix Log Present (PCIe r6.1 sec 7.8.4.7) can indicate Prefix
     Log is not present.

Therefore callers must pass the offset of the TLP Prefix Log register and
the entire length to pcie_read_tlp_log() to be able to read the correct
number of TLP Prefix DWORDs from the correct offset.

Link: https://lore.kernel.org/r/20250114170840.1633-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash ternary fix from
https://lore.kernel.org/r/20250116172019.88116-1-colin.i.king@gmail.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-01-16 12:04:38 -06:00
Bjorn Helgaas
bbaa022917 PCI: of: Simplify devm_of_pci_get_host_bridge_resources() interface
Previously pci_parse_request_of_pci_ranges() supplied the default bus range
to devm_of_pci_get_host_bridge_resources(), but that function is static and
has no other callers, so there's no reason to complicate its interface by
passing the default bus range.

Drop the busno and bus_max parameters and use 0x0 and 0xff directly in
devm_of_pci_get_host_bridge_resources().

Link: https://lore.kernel.org/r/20250113231557.441289-4-helgaas@kernel.org
[bhelgaas: dev_warn() for invalid end of bus-range]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-15 15:26:51 -06:00
Bjorn Helgaas
adda86bf46 PCI: of: Drop 'No bus range found' message
The typical bus range for a host bridge is [bus 00-ff], and
devm_of_pci_get_host_bridge_resources() defaults to that unless DT contains
a "bus-range" property.

devm_of_pci_get_host_bridge_resources() previously emitted a message when
there was no "bus-range" property, but that seems unnecessary for this
common situation.  Remove the message.

Link: https://lore.kernel.org/r/20250113231557.441289-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2025-01-15 15:24:44 -06:00
Bjorn Helgaas
ddedc08e80 PCI: Unexport of_pci_parse_bus_range()
of_pci_parse_bus_range() is only used in drivers/pci/of.c, so make it
static and unexport it.

Link: https://lore.kernel.org/r/20250113231557.441289-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2025-01-15 15:24:39 -06:00
Marc Zyngier
d9f6642ab7 PCI: apple: Convert to {en,dis}able_device() callbacks
Now that the core host-bridge infrastructure is able to give us a callback
on each device being added or removed, convert the bus-notifier hack to it.

Link: https://lore.kernel.org/r/20241204150145.800408-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-15 14:52:50 -06:00
Marc Zyngier
477ac7b0a7 PCI: host-generic: Allow {en,dis}able_device() to be provided via pci_ecam_ops
In order to let host controller drivers using the host-generic
infrastructure use the {en,dis}able_device() callbacks that can be used to
configure sideband RID mapping hardware, provide these two callbacks as
part of the pci_ecam_ops structure.

Link: https://lore.kernel.org/r/20241204150145.800408-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-15 14:52:12 -06:00
Frank Li
ce4c430172 PCI: imx6: Add IOMMU and ITS MSI support for i.MX95
For the i.MX95, the configuration of a LUT is necessary to convert PCIe
Requester IDs (RIDs) to StreamIDs, which are used by both IOMMU and ITS.
This involves checking msi-map and iommu-map device tree properties to
ensure consistent mapping of Requester IDs to the same StreamIDs.

Subsequently, LUT-related registers are configured. If a msi-map isn't
detected, the platform relies on DWC built-in controller for MSIs that
do not need StreamIDs.

Implement PCI bus callback function to handle enable_device() and
disable_device() operations, setting up the LUT whenever a new PCI
device is enabled.

Known limitations:

  - If iommu-map exists in the device tree but the IOMMU controller is
    disabled, StreamIDs are programmed into the LUT. However, if a RID
    is out of range of the iommu-map, enabling the PCI device would
    result in a failure, although the PCI device can work without the
    IOMMU.

  - If msi-map exists in the device tree but the MSI controller is
    disabled, MSIs will not work. The DWC driver skips initializing the
    built-in MSI controller, falling back to legacy PCI INTx only.

Link: https://lore.kernel.org/r/20250114-imx95_lut-v9-2-39f58dbed03a@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: fix uninitialized "sid" in imx_pcie_enable_device()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
2025-01-15 14:47:24 -06:00
Thomas Gleixner
7d04319a05 x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to a opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows to cleanup the redundant core logic
significantly.

All interrupt chips, which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-15 21:38:53 +01:00
Dan Carpenter
7ca2887600
PCI: rockchip-ep: Fix error code in rockchip_pcie_ep_init_ob_mem()
Return -ENOMEM if pci_epc_mem_alloc_addr() fails.  Don't return success.

Link: https://lore.kernel.org/r/Z014ylYz_xrrgI4W@stanley.mountain
Fixes: 9456480194 ("PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2025-01-15 18:24:12 +00:00
Anand Moon
500969661b
PCI: rockchip: Refactor rockchip_pcie_disable_clocks() signature
Refactor rockchip_pcie_disable_clocks() to accept a struct rockchip_pcie
pointer instead of a void pointer thus improving type safety and code
readability by explicitly specifying the expected data type.

Link: https://lore.kernel.org/r/20241202151150.7393-4-linux.amoon@gmail.com
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-15 18:24:12 +00:00
Anand Moon
18715931a5
PCI: rockchip: Simplify reset control handling by using reset_control_bulk*() function
Currently, the driver acquires and asserts/deasserts the resets
individually thereby making the driver complex to read.

This can be simplified by using the reset_control_bulk() APIs.

Use devm_reset_control_bulk_get_exclusive() API to acquire all the resets
and use reset_control_bulk_{assert/deassert}() APIs to assert/deassert them
in bulk.

Following the recommendations in 'Rockchip RK3399 TRM v1.3 Part2':

  1. Split the reset controls into two groups as per section '17.5.8.1.1
     PCIe as Root Complex'.

  2. Deassert the 'Pipe, MGMT Sticky, MGMT, Core' resets in groups as per
     section '17.5.8.1.1 PCIe as Root Complex'. This is accomplished using
     the reset_control_bulk APIs.

Link: https://lore.kernel.org/r/20241202151150.7393-3-linux.amoon@gmail.com
Co-developed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: squash error handling fix from https://lore.kernel.org/r/7da6ac56-af55-4436-9597-6af24df8122c@stanley.mountain]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-15 18:23:28 +00:00
Frank Li
a3751212a8
PCI: Add enable_device() and disable_device() callbacks for bridges
Some PCI host bridges require special handling when enabling or
disabling PCI devices. For example, the i.MX95 platform has a lookup
table to map Requester IDs to StreamIDs, which the SMMU and MSI
controller use to identify the source of DMA accesses.

Without this mapping, DMA accesses may target unintended memory, which
would corrupt memory or read the wrong data.

Add a host bridge enable_device() hook the imx6 driver can use to
configure the Requester ID to StreamID mapping. The hardware table isn't
big enough to map all possible Requester IDs, so this hook may fail if
no table space is available. In that case, return failure from
pci_enable_device().

It might make more sense to make pci_set_master() decline to enable bus
mastering and return failure, but it currently doesn't have a way to return
failure.

Link: https://lore.kernel.org/r/20250114-imx95_lut-v9-1-39f58dbed03a@nxp.com
Tested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-15 18:06:22 +00:00
Takashi Iwai
b198499c7d
PCI/DPC: Quirk PIO log size for Intel Raptor Lake-P
Apparently the Raptor Lake-P reference firmware configures the PIO log size
correctly, but some vendor BIOSes, including at least ASUSTeK COMPUTER INC.
Zenbook UX3402VA_UX3402VA, do not.

Apply the quirk for Raptor Lake-P.  This prevents kernel complaints like:

  DPC: RP PIO log size 0 is invalid

and also enables the DPC driver to dump the RP PIO Log registers when DPC
is triggered.

Note that the bug report also mentions 8086:a76e, which has been already
added by 627c6db207 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake
Root Ports").

Link: https://lore.kernel.org/r/20250102164315.7562-1-tiwai@suse.de
Link: https://bugzilla.suse.com/show_bug.cgi?id=1234623
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15 17:59:55 +00:00
Ilpo Järvinen
2d54d23c60
PCI/sysfs: Remove unnecessary zero in initializer
Providing empty initializer for an array is enough to set its elements
to zero. Thus, remove the redundant 0 from the initializer.

Link: https://lore.kernel.org/r/20241028174046.1736-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15 03:54:44 +00:00
Ilpo Järvinen
b6e5c46d83
PCI/sysfs: Use __free() in reset_method_store()
Use __free() from  cleanup.h to handle freeing options in
reset_method_store() as it simplifies the code flow.

Link: https://lore.kernel.org/r/20241028174046.1736-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15 03:54:38 +00:00
Ilpo Järvinen
10269d5781
PCI/sysfs: Move reset related sysfs code to correct file
Most PCI sysfs code and structs are in a dedicated file but a few reset
related things remain in pci.c. Move also them to pci-sysfs.c and drop
pci_dev_reset_method_attr_is_visible() as it is 100% duplicate of
pci_dev_reset_attr_is_visible().

Link: https://lore.kernel.org/r/20241028174046.1736-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-15 03:54:22 +00:00
Ilpo Järvinen
e5321ae10e PCI: Store number of supported End-End TLP Prefixes
eetlp_prefix_path in the struct pci_dev tells if End-End TLP Prefixes
are supported by the path or not, and the value is only calculated if
CONFIG_PCI_PASID is set.

The Max End-End TLP Prefixes field in the Device Capabilities Register 2
also tells how many (1-4) End-End TLP Prefixes are supported (PCIe r6.2 sec
7.5.3.15). The number of supported End-End Prefixes is useful for reading
correct number of DWORDs from TLP Prefix Log register in AER capability
(PCIe r6.2 sec 7.8.4.12).

Replace eetlp_prefix_path with eetlp_prefix_max and determine the number of
supported End-End Prefixes regardless of CONFIG_PCI_PASID so that an
upcoming commit generalizing TLP Prefix Log register reading does not have
to read extra DWORDs for End-End Prefixes that never will be there.

The value stored into eetlp_prefix_max is directly derived from device's
Max End-End TLP Prefixes and does not consider limitations imposed by
bridges or the Root Port beyond supported/not supported flags. This is
intentional for two reasons:

  1) PCIe r6.2 spec sections 2.2.10.4 & 6.2.4.4 indicate that a TLP is
     malformed only if the number of prefixes exceed the number of Max
     End-End TLP Prefixes, which seems to be the case even if the device
     could never receive that many prefixes due to smaller maximum imposed
     by a bridge or the Root Port. If TLP parsing is later added, this
     distinction is significant in interpreting what is logged by the TLP
     Prefix Log registers and the value matching to the Malformed TLP
     threshold is going to be more useful.

  2) TLP Prefix handling happens autonomously on a low layer and the value
     in eetlp_prefix_max is not programmed anywhere by the kernel (i.e.,
     there is no limiter OS can control to prevent sending more than N TLP
     Prefixes).

Link: https://lore.kernel.org/r/20250114170840.1633-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14 17:47:39 -06:00
Ilpo Järvinen
27d41818b6 PCI: Use unsigned int i in pcie_read_tlp_log()
Loop variable i counting from 0 upwards does not need to be signed so make
it unsigned int.

Link: https://lore.kernel.org/r/20250114170840.1633-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-01-14 17:47:34 -06:00
Ilpo Järvinen
dbfd51ba4c PCI: Use same names in pcie_read_tlp_log() prototype and definition
pcie_read_tlp_log()'s prototype and function signature diverged due to
changes made while applying.

Make the parameters of pcie_read_tlp_log() named identically.

Link: https://lore.kernel.org/r/20250114170840.1633-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14 17:46:36 -06:00
Ilpo Järvinen
ede5d5dbef PCI: Add defines for TLP Header/Prefix log sizes
Add defines for AER and DPC capabilities TLP Header Logging register sizes
(PCIe r6.2, sec 7.8.4 / 7.9.14) and replace literals with them.

Link: https://lore.kernel.org/r/20250114170840.1633-4-ilpo.jarvinen@linux.intel.com
Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-01-14 17:39:04 -06:00
Ilpo Järvinen
a71c59269e PCI: Move TLP Log handling to its own file
TLP Log is a PCIe feature and is processed only by AER and DPC.
Configwise, DPC depends AER being enabled. In lack of better place, the TLP
Log handling code was initially placed into pci.c but it can be easily
placed in a separate file.

Move TLP Log handling code to its own file under pcie/ subdirectory and
include it only when AER is enabled.

Link: https://lore.kernel.org/r/20250114170840.1633-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14 17:38:18 -06:00
Ilpo Järvinen
013525583f PCI: Don't expose pcie_read_tlp_log() outside PCI subsystem
pcie_read_tlp_log() was exposed by the commit 0a5a46a6a6 ("PCI/AER:
Generalize TLP Header Log reading") with the intent that drivers could use
it, but the PCI maintainer later decided that drivers should be encouraged
to use PCI core diagnostic logging of generic AER registers rather than
building their own.

Drivers that currently implement their own diagnostic logging include ixgbe
(ixgbe_io_error_detected()) and iwlwifi (iwl_trans_pcie_dump_regs()).

Remove the unwanted EXPORT of pcie_read_tlp_log() and remove it from
include/linux/aer.h.

Link: https://lore.kernel.org/r/20250114170840.1633-2-ilpo.jarvinen@linux.intel.com
Link: https://lore.kernel.org/all/20240322193011.GA701027@bhelgaas/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14 17:24:27 -06:00
Linus Torvalds
7f5b6a8ec1 pci-v6.13-fixes-3
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmeGsZYUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwu6RAAreLLwBLB0bb2nHoGTL0OjfG+u9PU
 2lDkRD1A6g53YHMRLo67qC6TKD6+9e5N5xCxO+VjLBzE/KbPuD8cIacV+oYCv7r+
 BlIkQUnmmEOIrcO4wc+DuH3cYpRCDGerp8L8TI/+bkuVzGoc0nvSOuvfe++hqbDP
 fkt5jcFUSN5NYmXQmfe5bdh41YC/uHAUw0+PpP+bt1ygdG4jfsMXbUOE7NVUa79Y
 Ovg+SSrUKdnTtKJqjIkpvapZkeXl4Pn4vcxiIo1AX+KfMYpNX0CYm2ByN5UIICa7
 S0KE4Ac4Di7qg9jhFTTp8180sbgIxo6XCbIszRDyDvWJpVWFdZB5L1BNeZHgIDut
 4JsDXCH0u75IdmLLwK2yTWbNaQSYUQcZfuMLybjso7sMUGnubjQT7e4k5qvEsWZj
 pY+uCByowDMh8gkrkdSsGwf8pPX/4ka/bKgNyAstOkkXERHMV5NPWv7Ex0zfwjK6
 9QsJ8Wjfn0d3HqkHd+rD2cQWNYm2a3N5MhGZLK7vUQYulLa6MSwxEEjG4cv+E+aI
 T+MdhtX8mes45xfxzgMMY7JJBMEp3M9TQM6DjgtrWdig7sGDB93oNK2aCeg1zqvB
 Fvsn+izJCs2Ocks6+BWCIEaDYPAeTKnmat/7ZaUfLs/uw8nPS9RGE3/0OeiUJeGK
 EvT1mE6DYhVa1cY=
 =e6ty
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fix from Bjorn Helgaas:

 - Prevent bwctrl NULL pointer dereference that caused hangs on shutdown
   on ASUS ROG Strix SCAR 17 G733PYV (Lukas Wunner)

* tag 'pci-v6.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/bwctrl: Fix NULL pointer deref on unbind and bind
2025-01-14 11:32:14 -08:00
Liao Chen
26cdda5444
PCI: mvebu: Enable module autoloading
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based
on the alias from of_device_id table.

Link: https://lore.kernel.org/r/20240903132311.961646-1-liaochen4@huawei.com
Signed-off-by: Liao Chen <liaochen4@huawei.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
2025-01-14 01:36:39 +00:00
Anand Moon
abdd4c8ea7 PCI: rockchip: Simplify clock handling by using clk_bulk*() functions
Currently, the driver acquires clocks and prepare/enable/disable/unprepare
the clocks individually thereby making the driver complex to read.

The driver can be simplified by using the clk_bulk*() APIs.

Use:

  - devm_clk_bulk_get_all() API to acquire all the clocks
  - clk_bulk_prepare_enable() to prepare/enable clocks
  - clk_bulk_disable_unprepare() APIs to disable/unprepare them in bulk

Link: https://lore.kernel.org/r/20241202151150.7393-2-linux.amoon@gmail.com
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
[bhelgaas: squash error handling fix from https://lore.kernel.org/r/20250106153041.55267-1-linux.amoon@gmail.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 15:10:38 -06:00
Damien Le Moal
fd46bc0e0b PCI: rockchip: Add missing fields descriptions for struct rockchip_pcie_ep
When compiling the Rockchip endpoint driver with -W=1, the following
warnings can be seen in the output:

  drivers/pci/controller/pcie-rockchip-ep.c:59: warning: Function parameter or struct member 'perst_irq' not described in 'rockchip_pcie_ep'
  drivers/pci/controller/pcie-rockchip-ep.c:59: warning: Function parameter or struct member 'perst_asserted' not described in 'rockchip_pcie_ep'
  drivers/pci/controller/pcie-rockchip-ep.c:59: warning: Function parameter or struct member 'link_up' not described in 'rockchip_pcie_ep'
  drivers/pci/controller/pcie-rockchip-ep.c:59: warning: Function parameter or struct member 'link_training' not described in 'rockchip_pcie_ep'

Fix these warnings by adding the missing field descriptions in
struct rockchip_pcie_ep kernel-doc comment.

Fixes: a7137cbf6b ("PCI: rockchip-ep: Handle PERST# signal in EP mode")
Fixes: bd6e61df4b ("PCI: rockchip-ep: Improve link training")
Link: https://lore.kernel.org/r/20241216133404.540736-1-dlemoal@kernel.org
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 14:45:41 -06:00
King Dix
2d2da5a4c1
PCI: rcar-ep: Fix incorrect variable used when calling devm_request_mem_region()
The rcar_pcie_parse_outbound_ranges() uses the devm_request_mem_region()
macro to request a needed resource. A string variable that lives on the
stack is then used to store a dynamically computed resource name, which
is then passed on as one of the macro arguments. This can lead to
undefined behavior.

Depending on the current contents of the memory, the manifestations of
errors may vary. One possible output may be as follows:

  $ cat /proc/iomem
  30000000-37ffffff :
  38000000-3fffffff :

Sometimes, garbage may appear after the colon.

In very rare cases, if no NULL-terminator is found in memory, the system
might crash because the string iterator will overrun which can lead to
access of unmapped memory above the stack.

Thus, fix this by replacing outbound_name with the name of the previously
requested resource. With the changes applied, the output will be as
follows:

  $ cat /proc/iomem
  30000000-37ffffff : memory2
  38000000-3fffffff : memory3

Fixes: 2a6d0d63d9 ("PCI: rcar: Add endpoint mode support")
Link: https://lore.kernel.org/r/tencent_DBDCC19D60F361119E76919ADAB25EC13C06@qq.com
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: King Dix <kingdix10@qq.com>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 07:46:07 +00:00
Douglas Anderson
17bd5e4dc9
PCI: mediatek-gen3: Enable async probe by default
The mediatek-gen3 driver can run its probe routine fairly slow on some
hardware, which adds to the total time it takes for the system start up.

Thus, turn on async mode for the probe to avoid blocking the rest of the
system.

Link: https://lore.kernel.org/r/20241220145205.1.Ibf2563896c3b1fc133bb46d3fc96ad0041763922@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 07:12:44 +00:00
Lorenzo Bianconi
491cb9c508
PCI: mediatek-gen3: Avoid PCIe resetting via PERST# for Airoha EN7581 SoC
Airoha EN7581 has a hw bug asserting/releasing PERST# signal causing
occasional PCIe link down issues. In order to overcome the problem,
PERST# signal is not asserted/released during device probe or
suspend/resume phase and the PCIe block is reset using
en7523_reset_assert() and en7581_pci_enable().

Introduce flags field in the mtk_gen3_pcie_pdata struct in order to
specify per-SoC capabilities.

Link: https://lore.kernel.org/r/20250109-pcie-en7581-rst-fix-v4-1-4a45c89fb143@kernel.org
Tested-by: Hui Ma <hui.ma@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-01-13 07:09:42 +00:00
Lorenzo Bianconi
c98bee18d0
PCI: mediatek-gen3: Rely on msleep() in mtk_pcie_en7581_power_up()
Since mtk_pcie_en7581_power_up() runs in non-atomic context, rely on
msleep() routine instead of mdelay().

Link: https://lore.kernel.org/r/20250108-pcie-en7581-fixes-v6-5-21ac939a3b9b@kernel.org
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 07:07:40 +00:00
Lorenzo Bianconi
90d4e466c9
PCI: mediatek-gen3: Move reset delay in mtk_pcie_en7581_power_up()
Airoha EN7581 has a hw bug asserting/releasing PCIE_PE_RSTB signal
causing occasional PCIe link down issues. In order to overcome the
problem, PCIe block is reset using REG_PCI_CONTROL (0x88) and
REG_RESET_CONTROL (0x834) registers available in the clock module
running clk_bulk_prepare_enable() in mtk_pcie_en7581_power_up().

In order to make the code more readable, move the wait for the time
needed to complete the PCIe reset from en7581_pci_enable() to
mtk_pcie_en7581_power_up().

Reduce reset timeout from 250ms to the standard PCIE_T_PVPERL_MS value
(100ms) since it has no impact on the driver behavior.

Link: https://lore.kernel.org/r/20250108-pcie-en7581-fixes-v6-4-21ac939a3b9b@kernel.org
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
2025-01-13 07:07:18 +00:00
Lorenzo Bianconi
0c9d2d2ef0
PCI: mediatek-gen3: Add comment about initialization order in mtk_pcie_en7581_power_up()
Add a comment in mtk_pcie_en7581_power_up() to clarify, unlike the other
MediaTek Gen3 controllers, the Airoha EN7581 requires PHY initialization
and power-on before PHY reset deassert.

Link: https://lore.kernel.org/r/20250108-pcie-en7581-fixes-v6-3-21ac939a3b9b@kernel.org
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2025-01-13 07:07:09 +00:00
Lorenzo Bianconi
e4c7dfd953
PCI: mediatek-gen3: Move reset/assert callbacks in .power_up()
In order to make the code more readable, the reset_control_bulk_assert()
function for PHY reset lines is moved to make it pair with
reset_control_bulk_deassert() in mtk_pcie_power_up() and
mtk_pcie_en7581_power_up(). The same change is done for
reset_control_assert() used to assert MAC reset line.

Introduce PCIE_MTK_RESET_TIME_US macro for the time needed to
complete PCIe reset on MediaTek controller.

Link: https://lore.kernel.org/r/20250108-pcie-en7581-fixes-v6-2-21ac939a3b9b@kernel.org
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 07:05:01 +00:00
Lorenzo Bianconi
0e7a622da1
PCI: mediatek-gen3: Rely on clk_bulk_prepare_enable() in mtk_pcie_en7581_power_up()
Replace clk_bulk_prepare() and clk_bulk_enable() with
clk_bulk_prepare_enable() in mtk_pcie_en7581_power_up() routine.

Link: https://lore.kernel.org/r/20250108-pcie-en7581-fixes-v6-1-21ac939a3b9b@kernel.org
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-13 07:04:45 +00:00
Lukas Wunner
15b8968dcb PCI/bwctrl: Fix NULL pointer deref on unbind and bind
The interrupt handler for bandwidth notifications, pcie_bwnotif_irq(),
dereferences a "data" pointer.

On unbind, that pointer is set to NULL by pcie_bwnotif_remove().  However
the interrupt handler may still be invoked afterwards and will dereference
that NULL pointer.

That's because the interrupt is requested using a devm_*() helper and the
driver core releases devm_*() resources *after* calling ->remove().

pcie_bwnotif_remove() does clear the Link Bandwidth Management Interrupt
Enable and Link Autonomous Bandwidth Interrupt Enable bits in the Link
Control Register, but that won't prevent execution of pcie_bwnotif_irq():
The interrupt for bandwidth notifications may be shared with AER, DPC,
PME, and hotplug.  So pcie_bwnotif_irq() may be executed as long as the
interrupt is requested.

There's a similar race on bind:  pcie_bwnotif_probe() requests the
interrupt when the "data" pointer still points to NULL.  A NULL pointer
deref may thus likewise occur if AER, DPC, PME or hotplug raise an
interrupt in-between the bandwidth controller's call to devm_request_irq()
and assignment of the "data" pointer.

Drop the devm_*() usage and reorder requesting of the interrupt to fix the
issue.

While at it, drop a stray but harmless no_free_ptr() invocation when
assigning the "data" pointer in pcie_bwnotif_probe().

Ilpo points out that the locking on unbind and bind needs to be symmetric,
so move the call to pcie_bwnotif_disable() inside the critical section
protected by pcie_bwctrl_setspeed_rwsem and pcie_bwctrl_lbms_rwsem.

Evert reports a hang on shutdown of an ASUS ROG Strix SCAR 17 G733PYV.
The issue is no longer reproducible with the present commit.

Evert found that attaching a USB-C monitor prevented the hang.  The
machine contains an ASMedia USB 3.2 controller below a hotplug-capable
Root Port.  So one possible explanation is that the controller gets
hot-removed on shutdown unless something is connected.  And the ensuing
hotplug interrupt occurs exactly when the bandwidth controller is
unregistering.  The precise cause could not be determined because the
screen had already turned black when the hang occurred.

Link: https://lore.kernel.org/r/ae2b02c9cfbefff475b6e132b0aa962aaccbd7b2.1736162539.git.lukas@wunner.de
Fixes: 665745f274 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Reported-by: Evert Vorster <evorster@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219629
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Evert Vorster <evorster@gmail.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-07 14:24:06 -06:00
Linus Torvalds
feffd35a03 Fix bogus MSI IRQ setup warning on RISC-V.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmdxCpYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ivVA//YII6d9TjaEVcADmGnuM6wo4/QGkVmjqW
 q5sQUrzsk9Y7uOkru5OfzMpH1/SlXy/LxGTjPb8V4u8d77EK1lwudWyqfubB3MRo
 F/x34lybF7JE8w0sp3a1DIXQ807oPEs85onGEbNMSMiG8XU/l5xBSSX0IZYevgU/
 y0MEfFwV6fU8Wo9MFji8O7K1gW3hbWKfuKL7VlB7Ej3/QgpCAS1adpndysFLc5EE
 3b/bpjX6pEGW4BHUKGPSyJm0yuOKFbdZDg7c8whsix7PNsKI1kgQAxg2FxW9nLHj
 NDDReTw0A9eZUTh4d9OLy55+eGM1BevusEf5gomE5cR5b/4HZmmmC4SjE+k7pszN
 Qos+OfdaA2teRJWzBKnIxgqOPCc/h9CCS9ta89K9W4z4QrbpHHv5vYReE9ooQ9o/
 hH909mDhKQ/fdwIjJESkz3MVIxnqlXAqTA+VWEH1qXistGqQU4Tb2ywhPLzi1JUW
 vc/dANwef/7CQwfQf0RnPVK20+FATkOHwhw9a8Jwe/aPyrfAQShJi+1N0GUhlAYn
 n868CGMCJ+yk+RSgsqcxq3r7M4vtbnn7YxKPqlwkz9IOsnCpAUebYxm28SMo0RPH
 Xkbp4SwO5A7CIad1JF+U3am0DxTnpRj84ItftleYFfKfGmVhDnqxKfSs25mwdnee
 w46Edap72F0=
 =EuZb
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "Fix bogus MSI IRQ setup warning on RISC-V"

* tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Handle lack of irqdomain gracefully
2024-12-29 10:03:01 -08:00
Cristian Ciocaltea
10106d5c1f PCI: exynos: Switch to devm_clk_bulk_get_all_enabled()
The helper devm_clk_bulk_get_all_enable() missed to return the number of
clocks stored in the clk_bulk_data table referenced by the clks
argument and, therefore, will be dropped.

Use the newly introduced devm_clk_bulk_get_all_enabled() variant
instead, which is consistent with devm_clk_bulk_get_all() in terms of
the returned value:

 > 0 if one or more clocks have been stored
 = 0 if there are no clocks
 < 0 if an error occurred

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://lore.kernel.org/r/20241217-clk_bulk_ena_fix-v5-2-aafbbb245155@collabora.com
Acked-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-12-23 12:53:52 -08:00
Daniel Wagner
22d813bf00 PCI: hookup irq_get_affinity callback
struct bus_type has a new callback for retrieving the IRQ affinity for a
device. Hook this callback up for PCI based devices.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-2-27211e9c2cd5@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23 08:17:23 -07:00
Linus Torvalds
a99b4a369a PCI fixes for Linux 6.13
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEtJ9XYyOm/GqvHx8fGR2jT3jNOcFAmdmsgwACgkQfGR2jT3j
 NOcr7A//RB1SINcsO4gaw+hU5GypC2/P0LOsuufV0+6pGA3iN9XIafbiw8NeSENf
 7kH8IQYkdtkhRs7mPoSTzGhsupMuqbHJ4TjVVztJzGTl6KRo8O1WAF4XQN9jd+Mg
 aomkA4KuT3Ss4yzl549hJ7RNzkhot7O3y1xRA9P3PtdvvFSyTboB7hYmjP9DLQ8F
 EuWXIds2JfZrid6Ru01gSvuI+vFt8ZDw+F6fJCITQQvHzaUzy3GcQtl6MoKQklMc
 PtuNoZYaiLyjROb51cS1M4B631JI4bbQ8j5hELYidNDdcdXBHpeW7RMuDhlkeNSK
 MTt7Zu7kd2UsBuSpu0v7i9u79QTTgMn0te5i0gpfZW9T+YlU7gLp0XF1gajecvGz
 QY8xJnTOy8JEv15bx+P5mHKGmKYQCbo4zFMWaKgIlS6ge4i3ZXjX2C2n2GpS7bg6
 8zJ9lXbRUXbBeKf8XSJ0cehtgBIROZBLSrVF8WRLI0uHyVhLqMF7iSJ5FrbVz1u2
 34BTLYKuewU7T5gvl+BSfhF7CERUrRqVwPYiHiziJyD+R57dA7YCAAcZCAB/ChlB
 4pIDrTHidcHuFCknT9WS7ZJOzcle63ci/vQ0gs09ZLCNEcU5Ttj0klbm3tsYD0WX
 G2wxLHsC7+j0ycE739yWTdneMb9uKycm+/RkBdBKBJdCGXWdlSA=
 =zLos
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Krzysztof Wilczyński:
 "Two small patches that are important for fixing boot time hang on
  Intel JHL7540 'Titan Ridge' platforms equipped with a Thunderbolt
  controller.

  The boot time issue manifests itself when a PCI Express bandwidth
  control is unnecessarily enabled on the Thunderbolt controller
  downstream ports, which only supports a link speed of 2.5 GT/s in
  accordance with USB4 v2 specification (p. 671, sec. 11.2.1, "PCIe
  Physical Layer Logical Sub-block").

  As such, there is no need to enable bandwidth control on such
  downstream port links, which also works around the issue.

  Both patches were tested by the original reporter on the hardware on
  which the failure origin golly manifested itself. Both fixes were
  proven to resolve the reported boot hang issue, and both patches have
  been in linux-next this week with no reported problems"

* tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/bwctrl: Enable only if more than one speed is supported
  PCI: Honor Max Link Speed when determining supported speeds
2024-12-21 10:51:04 -08:00
Lukas Wunner
774c71c52a
PCI/bwctrl: Enable only if more than one speed is supported
If a PCIe port only supports a single speed, enabling bandwidth control
is pointless:  There's no need to monitor autonomous speed changes, nor
can the speed be changed.

Not enabling it saves a small amount of memory and compute resources,
but also fixes a boot hang reported by Niklas:  It occurs when enabling
bandwidth control on Downstream Ports of Intel JHL7540 "Titan Ridge 2018"
Thunderbolt controllers.  The ports only support 2.5 GT/s in accordance
with USB4 v2 sec 11.2.1, so the present commit works around the issue.

PCIe r6.2 sec 8.2.1 prescribes that:

   "A device must support 2.5 GT/s and is not permitted to skip support
    for any data rates between 2.5 GT/s and the highest supported rate."

Consequently, bandwidth control is currently only disabled if a port
doesn't support higher speeds than 2.5 GT/s.  However the Implementation
Note in PCIe r6.2 sec 7.5.3.18 cautions:

   "It is strongly encouraged that software primarily utilize the
    Supported Link Speeds Vector instead of the Max Link Speed field,
    so that software can determine the exact set of supported speeds on
    current and future hardware.  This can avoid software being confused
    if a future specification defines Links that do not require support
    for all slower speeds."

In other words, future revisions of the PCIe Base Spec may allow gaps
in the Supported Link Speeds Vector.  To be future-proof, don't just
check whether speeds above 2.5 GT/s are supported, but rather check
whether *more than one* speed is supported.

Fixes: 665745f274 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Closes: https://lore.kernel.org/r/db8e457fcd155436449b035e8791a8241b0df400.camel@kernel.org
Link: https://lore.kernel.org/r/3564908a9c99fc0d2a292473af7a94ebfc8f5820.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle <niks@kernel.org>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jonathan Cameron <Jonthan.Cameron@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-19 16:36:36 +00:00
Lukas Wunner
3202ca2215
PCI: Honor Max Link Speed when determining supported speeds
The Supported Link Speeds Vector in the Link Capabilities 2 Register
indicates the *supported* link speeds.  The Max Link Speed field in the
Link Capabilities Register indicates the *maximum* of those speeds.

pcie_get_supported_speeds() neglects to honor the Max Link Speed field and
will thus incorrectly deem higher speeds as supported.  Fix it.

One user-visible issue addressed here is an incorrect value in the sysfs
attribute "max_link_speed".

But the main motivation is a boot hang reported by Niklas:  Intel JHL7540
"Titan Ridge 2018" Thunderbolt controllers supports 2.5-8 GT/s speeds,
but indicate 2.5 GT/s as maximum.  Ilpo recalls seeing this on more
devices.  It can be explained by the controller's Downstream Ports
supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s
if the port interfaces to a PCIe Adapter, in accordance with USB4 v2
sec 11.2.1:

   "This section defines the functionality of an Internal PCIe Port that
    interfaces to a PCIe Adapter. [...]
    The Logical sub-block shall update the PCIe configuration registers
    with the following characteristics: [...]
    Max Link Speed field in the Link Capabilities Register set to 0001b
    (data rate of 2.5 GT/s only).
    Note: These settings do not represent actual throughput. Throughput
    is implementation specific and based on the USB4 Fabric performance."

The present commit is not sufficient on its own to fix Niklas' boot hang,
but it is a prerequisite:  A subsequent commit will fix the boot hang by
enabling bandwidth control only if more than one speed is supported.

The GENMASK() macro used herein specifies 0 as lowest bit, even though
the Supported Link Speeds Vector ends at bit 1.  This is done on purpose
to avoid a GENMASK(0, 1) macro if Max Link Speed is zero.  That macro
would be invalid as the lowest bit is greater than the highest bit.
Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
Endpoints in particular, so it does occur in practice.  (The Link
Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)

Fixes: d2bd39c045 ("PCI: Store all PCIe Supported Link Speeds")
Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org
Link: https://lore.kernel.org/r/fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle <niks@kernel.org>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-19 16:35:59 +00:00
Thippeswamy Havalige
4eea7596b8
PCI: xilinx-cpm: Add support for Versal CPM5 Root Port Controller 1
Add support for the Xilinx Versal CPM5 Root Port Controller 1. The key
difference between Controller 0 and Controller 1 lies in the
platform-specific error interrupt bits, which are located at different
register offsets.

To handle these differences, updated variant structure to hold the
following platform-specific details:

- Interrupt status register offset (ir_status)
- Interrupt enable register offset (ir_enable)
- Miscellaneous interrupt values (ir_misc_value)

The driver differentiates between Controller 0 and Controller 1 using the
compatible string in the device tree. This ensures that the appropriate
register offsets are used for each controller, allowing for correct
handling of platform-specific interrupts and initialization.

Link: https://lore.kernel.org/r/20240922061318.2653503-3-thippesw@amd.com
Signed-off-by: Thippeswamy Havalige <thippesw@amd.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-12-18 23:22:00 +00:00
Niklas Cassel
0e7faea188
PCI: endpoint: Verify that requested BAR size is a power of two
When allocating a BAR using pci_epf_alloc_space(), there are checks that
round up the size to a power of two.

However, there is no check in pci_epc_set_bar() which verifies that the
requested BAR size is a power of two.

Add a power of two check in pci_epc_set_bar(), so that we don't need to
add such a check in each and every PCI endpoint controller driver.

Link: https://lore.kernel.org/r/20241213143301.4158431-14-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-12-18 21:52:16 +00:00
Niklas Cassel
f015b53d63
PCI: endpoint: Add size check for fixed size BARs in pci_epc_set_bar()
A BAR of type BAR_FIXED has a fixed BAR size (the size cannot be changed).

When using pci_epf_alloc_space() to allocate backing memory for a BAR,
pci_epf_alloc_space() will always set the size to the fixed BAR size if
the BAR type is BAR_FIXED (and will give an error if you the requested size
is larger than the fixed BAR size).

However, some drivers might not call pci_epf_alloc_space() before calling
pci_epc_set_bar(), so add a check in pci_epc_set_bar() to ensure that an
EPF driver cannot set a size different from the fixed BAR size, if the BAR
type is BAR_FIXED.

The pci_epc_function_is_valid() check is removed because this check is now
done by pci_epc_get_features().

Link: https://lore.kernel.org/r/20241213143301.4158431-13-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-12-18 21:52:09 +00:00
Niklas Cassel
b61fef0813
PCI: artpec6: Implement dw_pcie_ep operation get_features
All non-DWC EPC drivers implement (struct pci_epc *)->ops->get_features().
All DWC EPC drivers implement (struct dw_pcie_ep *)->ops->get_features(),
except for pcie-artpec6.c.

epc_features has been required in pci-epf-test.c since commit 6613bc2301
("PCI: endpoint: Fix NULL pointer dereference for ->get_features()").

A follow-up commit will make further use of epc_features in EPC core code.

Implement epc_features in the only EPC driver where it is currently not
implemented.

Link: https://lore.kernel.org/r/20241213143301.4158431-12-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
2024-12-18 21:51:47 +00:00
Niklas Cassel
129f6af747
PCI: dwc: ep: Add 'address' alignment to 'size' check in dw_pcie_prog_ep_inbound_atu()
dw_pcie_prog_ep_inbound_atu() is used to program an inbound iATU in
"BAR Match Mode".

A memory address returned by e.g. kmalloc() is guaranteed to have natural
alignment (aligned to the size of the allocation). It is however not
guaranteed that pci_epc_set_bar() (and thus dw_pcie_prog_ep_inbound_atu())
is supplied an address that has natural alignment. (An EPF driver can send
in an arbitrary physical address to pci_epc_set_bar().)

The DWC Databook description for the LWR_TARGET_RW and LWR_TARGET_HW fields
in the IATU_LWR_TARGET_ADDR_OFF_INBOUND_i registers state that:
"Field size depends on log2(BAR_MASK+1) in BAR match mode."

I.e. only the upper bits are writable, and the number of writable bits is
dependent on the configured BAR_MASK.

Add a check to ensure that the physical address programmed in the iATU is
aligned to the size of the BAR (BAR_MASK+1), as without this, we can get
hard to debug errors, as we could write to bits that are read-only (without
getting a write error), which could cause the iATU to end up redirecting to
a physical address that is different from the address that we intended.

Link: https://lore.kernel.org/r/20241213143301.4158431-11-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-12-18 21:51:30 +00:00
Niklas Cassel
3708acbd5f
PCI: dwc: ep: Prevent changing BAR size/flags in pci_epc_set_bar()
In commit 4284c88fff ("PCI: designware-ep: Allow pci_epc_set_bar() update
inbound map address") set_bar() was modified to support dynamically
changing the backing physical address of a BAR that was already configured.

This means that set_bar() can be called twice, without ever calling
clear_bar() (as calling clear_bar() would clear the BAR's PCI address
assigned by the host).

This can only be done if the new BAR size/flags does not differ from the
existing BAR configuration. Add these missing checks.

If we allow set_bar() to set e.g. a new BAR size that differs from the
existing BAR size, the new address translation range will be smaller than
the BAR size already determined by the host, which would mean that a read
past the new BAR size would pass the iATU untranslated, which could allow
the host to read memory not belonging to the new struct pci_epf_bar.

While at it, add comments which clarifies the support for dynamically
changing the physical address of a BAR. (Which was also missing.)

Fixes: 4284c88fff ("PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address")
Link: https://lore.kernel.org/r/20241213143301.4158431-10-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: stable@vger.kernel.org
2024-12-18 21:51:19 +00:00
Niklas Cassel
33a6938e0c
PCI: dwc: ep: Write BAR_MASK before iATU registers in pci_epc_set_bar()
The "DesignWare Cores PCI Express Controller Register Descriptions,
Version 4.60a", section "1.21.70 IATU_LWR_TARGET_ADDR_OFF_INBOUND_i",
fields LWR_TARGET_RW and LWR_TARGET_HW both state that:
"Field size depends on log2(BAR_MASK+1) in BAR match mode."

I.e. only the upper bits are writable, and the number of writable bits is
dependent on the configured BAR_MASK.

If we do not write the BAR_MASK before writing the iATU registers, we are
relying the reset value of the BAR_MASK being larger than the requested
BAR size (which is supplied in the struct pci_epf_bar which is passed to
pci_epc_set_bar()). The reset value of the BAR_MASK is SoC dependent.

Thus, if the struct pci_epf_bar requests a BAR size that is larger than the
reset value of the BAR_MASK, the iATU will try to write to read-only bits,
which will cause the iATU to end up redirecting to a physical address that
is different from the address that was intended.

Thus, we should always write the iATU registers after writing the BAR_MASK.

Fixes: f8aed6ec62 ("PCI: dwc: designware: Add EP mode support")
Link: https://lore.kernel.org/r/20241213143301.4158431-9-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: stable@vger.kernel.org
2024-12-18 21:50:47 +00:00
Thomas Gleixner
a60b990798 PCI/MSI: Handle lack of irqdomain gracefully
Alexandre observed a warning emitted from pci_msi_setup_msi_irqs() on a
RISCV platform which does not provide PCI/MSI support:

 WARNING: CPU: 1 PID: 1 at drivers/pci/msi/msi.h:121 pci_msi_setup_msi_irqs+0x2c/0x32
 __pci_enable_msix_range+0x30c/0x596
 pci_msi_setup_msi_irqs+0x2c/0x32
 pci_alloc_irq_vectors_affinity+0xb8/0xe2

RISCV uses hierarchical interrupt domains and correctly does not implement
the legacy fallback. The warning triggers from the legacy fallback stub.

That warning is bogus as the PCI/MSI layer knows whether a PCI/MSI parent
domain is associated with the device or not. There is a check for MSI-X,
which has a legacy assumption. But that legacy fallback assumption is only
valid when legacy support is enabled, but otherwise the check should simply
return -ENOTSUPP.

Loongarch tripped over the same problem and blindly enabled legacy support
without implementing the legacy fallbacks. There are weak implementations
which return an error, so the problem was papered over.

Correct pci_msi_domain_supports() to evaluate the legacy mode and add
the missing supported check into the MSI enable path to complete it.

Fixes: d2a463b297 ("PCI/MSI: Reject multi-MSI early")
Reported-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/87ed2a8ow5.ffs@tglx
2024-12-16 10:59:47 +01:00
Jian-Hong Pan
1db806ec06 PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()
After 17423360a2 ("PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume"), pci_save_aspm_l1ss_state(dev) saves the L1SS state for
"dev", and pci_restore_aspm_l1ss_state(dev) restores the state for both
"dev" and its parent.

The problem is that unless pci_save_state() has been used in some other
path and has already saved the parent L1SS state, we will restore junk to
the parent, which means the L1 Substates likely won't work correctly.

Save the L1SS config for both the device and its parent in
pci_save_aspm_l1ss_state().  When restoring, we need both because L1SS must
be enabled at the parent (the Downstream Port) before being enabled at the
child (the Upstream Port).

Link: https://lore.kernel.org/r/20241115072200.37509-3-jhp@endlessos.org
Fixes: 17423360a2 ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218394
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
[bhelgaas: parallel save/restore structure, simplify commit log, patch at
https://lore.kernel.org/r/20241212230340.GA3267194@bhelgaas]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jian-Hong Pan <jhp@endlessos.org> # Asus B1400CEAE
2024-12-13 12:20:58 -06:00
Zijun Hu
3b9f942eb2 PCI: endpoint: Finish virtual EP removal in pci_epf_remove_vepf()
When removing a virtual Endpoint, pci_epf_remove_vepf() failed to clear
epf_vf->epf_pf, which caused a subsequent pci_epf_add_vepf() to incorrectly
return -EBUSY:

  pci_epf_add_vepf(epf_pf, epf_vf)      // add
  pci_epf_remove_vepf(epf_pf, epf_vf)   // remove
  pci_epf_add_vepf(epf_pf, epf_vf)      // add again, -EBUSY error

Fix by clearing epf_vf->epf_pf in pci_epf_remove_vepf().

Link: https://lore.kernel.org/r/20241210-pci-epc-core_fix-v3-3-4d86dd573e4b@quicinc.com
Fixes: 1cf362e907 ("PCI: endpoint: Add support to add virtual function in endpoint core")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Cc: stable@vger.kernel.org
2024-12-12 13:07:19 -06:00
Zijun Hu
e0be8511ff PCI: endpoint: Simplify pci_epc_get()
Simplify pci_epc_get() implementation by using class_find_device_by_name().

Link: https://lore.kernel.org/r/20241210-pci-epc-core_fix-v3-2-4d86dd573e4b@quicinc.com
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
2024-12-12 13:06:50 -06:00
Zijun Hu
d4929755e4 PCI: endpoint: Destroy the EPC device in devm_pci_epc_destroy()
The devm_pci_epc_destroy() comment says destroys the EPC device, but it
does not actually do that since devres_destroy() does not call
devm_pci_epc_release(), and it also can not fully undo what the API
devm_pci_epc_create() does, so it is faulty.

Fortunately, the faulty API has not been used by current kernel tree.  Use
devres_release() instead of devres_destroy() so the EPC device will be
released.

Link: https://lore.kernel.org/r/20241210-pci-epc-core_fix-v3-1-4d86dd573e4b@quicinc.com
Fixes: 5e8cb40338 ("PCI: endpoint: Add EP core layer to enable EP controller and EP functions")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-12-12 13:04:08 -06:00
Thomas Weißschuh
d5bdde0b38 PCI/ACPI: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-4-c32360f495a7@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-12-10 13:00:50 -06:00
Philipp Stanner
9dfc6850cf PCI: Encourage resource request API users to supply driver name
PCI region request functions have a @name parameter (sometimes called
"res_name"). It is used in a log message to inform drivers about request
collisions, e.g., when another driver has requested that region already.

This message is only useful when it contains the actual owner of the
region, i.e., which driver requested it. So far, a great many drivers
misuse the @name parameter and just pass pci_name(), which doesn't result
in useful debug information.

Rename "res_name" to "name".

Detail @name's purpose in the docstrings.

Link: https://lore.kernel.org/r/20241203100023.31152-2-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[bhelgaas: tweak comment wording to include "driver"]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-12-05 11:29:52 -06:00
Thomas Weißschuh
35fdb302ca PCI/P2PDMA: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-3-c32360f495a7@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2024-12-03 15:29:59 -06:00
Thomas Weißschuh
3c39919a5f PCI/VPD: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-2-c32360f495a7@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-12-03 15:25:41 -06:00
Thomas Weißschuh
0530ad489d PCI/sysfs: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-1-c32360f495a7@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-12-03 15:25:41 -06:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Linus Torvalds
0cb71708c5 pci-v6.13-fixes-1
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmdLqNcUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vzv8Q/+IMfZpw/zOojgVKa7uL8Xv44t+hhX
 DYMghaRJpg392D49DN/cTRST9PO6iWT3FI/EUC+kYDIyHucUZ7qEO5lL/8B/sfwi
 i9qZHODOwqYRy9VRTmBBgmPsq9sGeKZnCSo7kQ1JW4syiYN+SFaFxNG4eSUxPe4K
 yJ958TJqV0dd3YAT8ItcBxaJS/C2FuIvPfmvwjG1GFQwSwuEjeOTnqoN33Pwey/X
 P6p4xLwd+IcEvJ/o9ygm/4g+nT/FTgb91P+FCCUfcOsu0IsxGxei/dIR6JuDmmK/
 QwPM+RnnpXJG5hwm5tKv6I/sJND1EMktTDIFQZawX9/S8qF2KhjJupEB8/pxqFca
 V+kQopiVKysgZFXWtbCFf6rb07s3UO6HnwF6sUgoP7yD5xO4QcEKsbNvftn6kruV
 dfuFericd7MPNlfYwxLLctPWa8IOyJQkqCrClDZQQa3YxWx6FDB9+AWayBfd2lDj
 VTRcupotQX36gOYBZa7F1zN3A/gh/dJ6qdk68WCcWIzkzdmYDZ6iDM6/GiGHjwOk
 ri6b0FyazVSUu17V4Gw5xnqlH6dfCBznjluwdqAbNUXkEusNIf7OxlQxPaAKmtbm
 MTYn20op9OBv1VYgx8oj9iMZGBNKWkoHK2SJnS94xUSje08P0ubhpw2eTpi00Zx6
 7PzJNqBARBF+n6I=
 =bPnr
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fix from Bjorn Helgaas:

 - When removing a PCI device, only look up and remove a platform device
   if there is an associated device node for which there could be a
   platform device, to fix a merge window regression (Brian Norris)

* tag 'pci-v6.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/pwrctrl: Unregister platform device only if one actually exists
2024-11-30 18:23:05 -08:00
Brian Norris
5c8418cf40 PCI/pwrctrl: Unregister platform device only if one actually exists
If a PCI device has an associated device_node with power supplies,
pci_bus_add_device() creates platform devices for use by pwrctrl.  When the
PCI device is removed, pci_stop_dev() uses of_find_device_by_node() to
locate the related platform device, then unregisters it.

But when we remove a PCI device with no associated device node,
dev_of_node(dev) is NULL, and of_find_device_by_node(NULL) returns the
first device with "dev->of_node == NULL".  The result is that we (a)
mistakenly unregister a completely unrelated platform device, leading to
issues like the first trace below, and (b) dereference the NULL pointer
from dev_of_node() when clearing OF_POPULATED, as in the second trace.

Unregister a platform device only if there is one associated with this PCI
device.  This resolves issues seen when doing:

  # echo 1 > /sys/bus/pci/devices/.../remove

Sample issue from unregistering the wrong platform device:

  WARNING: CPU: 0 PID: 5095 at drivers/regulator/core.c:5885 regulator_unregister+0x140/0x160
  Call trace:
   regulator_unregister+0x140/0x160
   devm_rdev_release+0x1c/0x30
   release_nodes+0x68/0x100
   devres_release_all+0x98/0xf8
   device_unbind_cleanup+0x20/0x70
   device_release_driver_internal+0x1f4/0x240
   device_release_driver+0x20/0x40
   bus_remove_device+0xd8/0x170
   device_del+0x154/0x380
   device_unregister+0x28/0x88
   of_device_unregister+0x1c/0x30
   pci_stop_bus_device+0x154/0x1b0
   pci_stop_and_remove_bus_device_locked+0x28/0x48
   remove_store+0xa0/0xb8
   dev_attr_store+0x20/0x40
   sysfs_kf_write+0x4c/0x68

Later NULL pointer dereference for of_node_clear_flag(NULL, OF_POPULATED):

  Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
  Call trace:
   pci_stop_bus_device+0x190/0x1b0
   pci_stop_and_remove_bus_device_locked+0x28/0x48
   remove_store+0xa0/0xb8
   dev_attr_store+0x20/0x40
   sysfs_kf_write+0x4c/0x68

Link: https://lore.kernel.org/r/20241126210443.4052876-1-briannorris@chromium.org
Fixes: 681725afb6 ("PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent")
Reported-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Closes: https://lore.kernel.org/r/1732890621-19656-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Brian Norris <briannorris@chromium.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-30 11:41:25 -06:00
Linus Torvalds
55cb93fd24 Driver core changes for 6.13-rc1
Here is a small set of driver core changes for 6.13-rc1.
 
 Nothing major for this merge cycle, except for the 2 simple merge
 conflicts are here just to make life interesting.
 
 Included in here are:
   - sysfs core changes and preparations for more sysfs api cleanups that
     can come through all driver trees after -rc1 is out
   - fw_devlink fixes based on many reports and debugging sessions
   - list_for_each_reverse() removal, no one was using it!
   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.
   - minor bugfixes and changes full details in the shortlog
 
 As mentioned above, there is 2 merge conflicts with your tree, one is
 where the file is removed (easy enough to resolve), the second is a
 build time error, that has been found in linux-next and the fix can be
 seen here:
 	https://lore.kernel.org/r/20241107212645.41252436@canb.auug.org.au
 
 Other than that, the changes here have been in linux-next with no other
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ0lEog8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym+0ACgw6wN+LkLVIHWhxTq5DYHQ0QCxY8AoJrRIcKe
 78h0+OU3OXhOy8JGz62W
 =oI5S
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is a small set of driver core changes for 6.13-rc1.

  Nothing major for this merge cycle, except for the two simple merge
  conflicts are here just to make life interesting.

  Included in here are:

   - sysfs core changes and preparations for more sysfs api cleanups
     that can come through all driver trees after -rc1 is out

   - fw_devlink fixes based on many reports and debugging sessions

   - list_for_each_reverse() removal, no one was using it!

   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.

   - minor bugfixes and changes full details in the shortlog"

* tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
  Fix a potential abuse of seq_printf() format string in drivers
  cpu: Remove spurious NULL in attribute_group definition
  s390/con3215: Remove spurious NULL in attribute_group definition
  perf: arm-ni: Remove spurious NULL in attribute_group definition
  driver core: Constify bin_attribute definitions
  sysfs: attribute_group: allow registration of const bin_attribute
  firmware_loader: Fix possible resource leak in fw_log_firmware_info()
  drivers: core: fw_devlink: Fix excess parameter description in docstring
  driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
  cacheinfo: Use of_property_present() for non-boolean properties
  cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap()
  drivers: core: fw_devlink: Make the error message a bit more useful
  phy: tegra: xusb: Set fwnode for xusb port devices
  drm: display: Set fwnode for aux bus devices
  driver core: fw_devlink: Stop trying to optimize cycle detection logic
  driver core: Constify attribute arguments of binary attributes
  sysfs: bin_attribute: add const read/write callback variants
  sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR()
  sysfs: treewide: constify attribute callback of bin_attribute::llseek()
  sysfs: treewide: constify attribute callback of bin_attribute::mmap()
  ...
2024-11-29 11:43:29 -08:00
Linus Torvalds
1746db26f8 pci-v6.13-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmdE14wUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxMPRAAslaEhHZ06cU/I+BA0UrMJBbzOw+/
 XM2XUojxWaNMYSBPVXbtSBrfFMnox4G3hFBPK0T0HiWoc7wGx/TUVJk65ioqM8ug
 gS/U3NjSlqlnH8NHxKrb/2t0tlMvSll9WwumOD9pMFeMGFOS3fAgUk+fBqXFYsI/
 RsVRMavW9BucZ0yMHpgr0KGLPSt3HK/E1h0NLO+TN6dpFcoIq3XimKFyk1QQQgiR
 V3W21JMwjw+lDnUAsijU+RBYi5Fj6Rpqig/biRnzagVE6PJOci3ZJEBE7dGqm4LM
 UlgG6Ql/eK+bb3fPhcXxVmscj5XlEfbesX5PUzTmuj79Wq5l9hpy+0c654G79y8b
 rGiEVGM0NxmRdbuhWQUM2EsffqFlkFu7MN3gH0tP0Z0t3VTXfBcGrQJfqCcSCZG3
 5IwGdEE2kmGb5c3RApZrm+HCXdxhb3Nwc3P8c27eXDT4eqHWDJag4hzLETNBdIrn
 Rsbgry6zzAVA6lLT0uasUlWerq/I6OrueJvnEKRGKDtbw/JL6PLveR1Rvsc//cQD
 Tu4FcG81bldQTUOdHEgFyJgmSu77Gvfs5RZBV0cEtcCBc33uGJne08kOdGD4BwWJ
 dqN3wJFh5yX4jlMGmBDw0KmFIwKstfUCIoDE4Kjtal02CURhz5ZCDVGNPnSUKN0C
 hflVX0//cRkHc5g=
 =2Otz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Make pci_stop_dev() and pci_destroy_dev() safe so concurrent
     callers can't stop a device multiple times, even as we migrate from
     the global pci_rescan_remove_lock to finer-grained locking (Keith
     Busch)

   - Improve pci_walk_bus() implementation by making it recursive and
     moving locking up to avoid need for a 'locked' parameter (Keith
     Busch)

   - Unexport pci_walk_bus_locked(), which is only used internally by
     the PCI core (Keith Busch)

   - Detect some Thunderbolt chips that are built-in and hence
     'trustworthy' by a heuristic since the 'ExternalFacingPort' and
     'usb4-host-interface' ACPI properties are not quite enough (Esther
     Shimanovich)

  Resource management:

   - Use PCI bus addresses (not CPU addresses) in 'ranges' properties
     when building dynamic DT nodes so systems where PCI and CPU
     addresses differ work correctly (Andrea della Porta)

   - Tidy resource sizing and assignment with helpers to reduce
     redundancy (Ilpo Järvinen)

   - Improve pdev_sort_resources() 'bogus alignment' warning to be more
     specific (Ilpo Järvinen)

  Driver binding:

   - Convert driver .remove_new() callbacks to .remove() again to finish
     the conversion from returning 'int' to being 'void' (Sergio
     Paracuellos)

   - Export pcim_request_all_regions(), a managed interface to request
     all BARs (Philipp Stanner)

   - Replace pcim_iomap_regions_request_all() with
     pcim_request_all_regions(), and pcim_iomap_table()[n] with
     pcim_iomap(n), in the following drivers: ahci, crypto qat, crypto
     octeontx2, intel_th, iwlwifi, ntb idt, serial rp2, ALSA korg1212
     (Philipp Stanner)

   - Remove the now unused pcim_iomap_regions_request_all() (Philipp
     Stanner)

   - Export pcim_iounmap_region(), a managed interface to unmap and
     release a PCI BAR (Philipp Stanner)

   - Replace pcim_iomap_regions(mask) with pcim_iomap_region(n), and
     pcim_iounmap_regions(mask) with pcim_iounmap_region(n), in the
     following drivers: fpga dfl-pci, block mtip32xx, gpio-merrifield,
     cavium (Philipp Stanner)

  Error handling:

   - Add sysfs 'reset_subordinate' to reset the entire hierarchy below a
     bridge; previously Secondary Bus Reset could only be used when
     there was a single device below a bridge (Keith Busch)

   - Warn if we reset a running device where the driver didn't register
     pci_error_handlers notification callbacks (Keith Busch)

  ASPM:

   - Disable ASPM L1 before touching L1 PM Substates to follow the spec
     closer and avoid a CPU load timeout on some platforms (Ajay
     Agarwal)

   - Set devices below Intel VMD to D0 before enabling ASPM L1 Substates
     as required per spec for all L1 Substates changes (Jian-Hong Pan)

  Power management:

   - Enable starfive controller runtime PM before probing host bridge
     (Mayank Rana)

   - Enable runtime power management for host bridges (Krishna chaitanya
     chundru)

  Power control:

   - Use of_platform_device_create() instead of of_platform_populate()
     to create pwrctl platform devices so we can control it based on the
     child nodes (Manivannan Sadhasivam)

   - Create pwrctrl platform devices only if there's a relevant power
     supply property (Manivannan Sadhasivam)

   - Add device link from the pwrctl supplier to the PCI dev to ensure
     pwrctl drivers are probed before the PCI dev driver; this avoids a
     race where pwrctl could change device power state while the PCI
     driver was active (Manivannan Sadhasivam)

   - Find pwrctl device for removal with of_find_device_by_node()
     instead of searching all children of the parent (Manivannan
     Sadhasivam)

   - Rename 'pwrctl' to 'pwrctrl' to match new bandwidth controller
     ('bwctrl') and hotplug files (Bjorn Helgaas)

  Bandwidth control:

   - Add read/modify/write locking for Link Control 2, which is used to
     manage Link speed (Ilpo Järvinen)

   - Extract Link Bandwidth Management Status check into
     pcie_lbms_seen(), where it can be shared between the bandwidth
     controller and quirks that use it to help retrain failed links
     (Ilpo Järvinen)

   - Re-add Link Bandwidth notification support with updates to address
     the reasons it was previously reverted (Alexandru Gagniuc, Ilpo
     Järvinen)

   - Add pcie_set_target_speed() and related functionality so drivers
     can manage PCIe Link speed based on thermal or other constraints
     (Ilpo Järvinen)

   - Add a thermal cooling driver to throttle PCIe Links via the
     existing thermal management framework (Ilpo Järvinen)

   - Add a userspace selftest for the PCIe bandwidth controller (Ilpo
     Järvinen)

  PCI device hotplug:

   - Add hotplug controller driver for Marvell OCTEON multi-function
     device where function 0 has a management console interface to
     enable/disable and provision various personalities for the other
     functions (Shijith Thotton)

   - Retain a reference to the pci_bus for the lifetime of a pci_slot to
     avoid a use-after-free when the thunderbolt driver resets USB4 host
     routers on boot, causing hotplug remove/add of downstream docks or
     other devices (Lukas Wunner)

   - Remove unused cpcihp struct cpci_hp_controller_ops.hardware_test
     (Guilherme Giacomo Simoes)

   - Remove unused cpqphp struct ctrl_dbg.ctrl (Christophe JAILLET)

   - Use pci_bus_read_dev_vendor_id() instead of hand-coded presence
     detection in cpqphp (Ilpo Järvinen)

   - Simplify cpqphp enumeration, which is already simple-minded and
     doesn't handle devices below hot-added bridges (Ilpo Järvinen)

  Virtualization:

   - Add ACS quirk for Wangxun FF5xxx NICs, which don't advertise an ACS
     capability but do isolate functions as though PCI_ACS_RR and
     PCI_ACS_CR were set, so the functions can be in independent IOMMU
     groups (Mengyuan Lou)

  TLP Processing Hints (TPH):

   - Add and document TLP Processing Hints (TPH) support so drivers can
     enable and disable TPH and the kernel can save/restore TPH
     configuration (Wei Huang)

   - Add TPH Steering Tag support so drivers can retrieve Steering Tag
     values associated with specific CPUs via an ACPI _DSM to improve
     performance by directing DMA writes closer to their consumers (Wei
     Huang)

  Data Object Exchange (DOE):

   - Wait up to 1 second for DOE Busy bit to clear before writing a
     request to the mailbox to avoid failures if the mailbox is still
     busy from a previous transfer (Gregory Price)

  Endpoint framework:

   - Skip attempts to allocate from endpoint controller memory window if
     the requested size is larger than the window (Damien Le Moal)

   - Add and document pci_epc_mem_map() and pci_epc_mem_unmap() to
     handle controller-specific size and alignment constraints, and add
     test cases to the endpoint test driver (Damien Le Moal)

   - Implement dwc pci_epc_ops.align_addr() so pci_epc_mem_map() can
     observe DWC-specific alignment requirements (Damien Le Moal)

   - Synchronously cancel command handler work in endpoint test before
     cleaning up DMA and BARs (Damien Le Moal)

   - Respect endpoint page size in dw_pcie_ep_align_addr() (Niklas
     Cassel)

   - Use dw_pcie_ep_align_addr() in dw_pcie_ep_raise_msi_irq() and
     dw_pcie_ep_raise_msix_irq() instead of open coding the equivalent
     (Niklas Cassel)

   - Avoid NULL dereference if Modem Host Interface Endpoint lacks
     'mmio' DT property (Zhongqiu Han)

   - Release PCI domain ID of Endpoint controller parent (not controller
     itself) and before unregistering the controller, to avoid
     use-after-free (Zijun Hu)

   - Clear secondary (not primary) EPC in pci_epc_remove_epf() when
     removing the secondary controller associated with an NTB (Zijun Hu)

  Cadence PCIe controller driver:

   - Lower severity of 'phy-names' message (Bartosz Wawrzyniak)

  Freescale i.MX6 PCIe controller driver:

   - Fix suspend/resume support on i.MX6QDL, which has a hardware
     erratum that prevents use of L2 (Stefan Eichenberger)

  Intel VMD host bridge driver:

   - Add 0xb60b and 0xb06f Device IDs for client SKUs (Nirmal Patel)

  MediaTek PCIe Gen3 controller driver:

   - Update mediatek-gen3 DT binding to require the exact number of
     clocks for each SoC (Fei Shao)

   - Add support for DT 'max-link-speed' and 'num-lanes' properties to
     restrict the link speed and width (AngeloGioacchino Del Regno)

  Microchip PolarFlare PCIe controller driver:

   - Add DT and driver support for using either of the two PolarFire
     Root Ports (Conor Dooley)

  NVIDIA Tegra194 PCIe controller driver:

   - Move endpoint controller cleanups that depend on refclk from the
     host to the notifier that tells us the host has deasserted PERST#,
     when refclk should be valid (Manivannan Sadhasivam)

  Qualcomm PCIe controller driver:

   - Add qcom SAR2130P DT binding with an additional clock (Dmitry
     Baryshkov)

   - Enable MSI interrupts if 'global' IRQ is supported, since a
     previous commit unintentionally masked them (Manivannan Sadhasivam)

   - Move endpoint controller cleanups that depend on refclk from the
     host to the notifier that tells us the host has deasserted PERST#,
     when refclk should be valid (Manivannan Sadhasivam)

   - Add DT binding and driver support for IPQ9574, with Synopsys IP
     v5.80a and Qcom IP 1.27.0 (devi priya)

   - Move the OPP "operating-points-v2" table from the
     qcom,pcie-sm8450.yaml DT binding to qcom,pcie-common.yaml, where it
     can be used by other Qcom platforms (Qiang Yu)

   - Add 'global' SPI interrupt for events like link-up, link-down to
     qcom,pcie-x1e80100 DT binding so we can start enumeration when the
     link comes up (Qiang Yu)

   - Disable ASPM L0s for qcom,pcie-x1e80100 since the PHY is not tuned
     to support this (Qiang Yu)

   - Add ops_1_21_0 for SC8280X family SoC, which doesn't use the
     'iommu-map' DT property and doesn't need BDF-to-SID translation
     (Qiang Yu)

  Rockchip PCIe controller driver:

   - Define ROCKCHIP_PCIE_AT_SIZE_ALIGN to replace magic 256 endpoint
     .align value (Damien Le Moal)

   - When unmapping an endpoint window, compute the region index instead
     of searching for it, and verify that the address was mapped (Damien
     Le Moal)

   - When mapping an endpoint window, verify that the address hasn't
     been mapped already (Damien Le Moal)

   - Implement pci_epc_ops.align_addr() for rockchip-ep (Damien Le Moal)

   - Fix MSI IRQ data mapping to observe the alignment constraint, which
     fixes intermittent page faults in memcpy_toio() and memcpy_fromio()
     (Damien Le Moal)

   - Rename rockchip_pcie_parse_ep_dt() to
     rockchip_pcie_ep_get_resources() for consistency with similar DT
     interfaces (Damien Le Moal)

   - Skip the unnecessary link train in rockchip_pcie_ep_probe() and do
     it only in the endpoint start operation (Damien Le Moal)

   - Implement pci_epc_ops.stop_link() to disable link training and
     controller configuration (Damien Le Moal)

   - Attempt link training at 5 GT/s when both partners support it
     (Damien Le Moal)

   - Add a handler for PERST# signal so we can detect host-initiated
     resets and start link training after PERST# is deasserted (Damien
     Le Moal)

  Synopsys DesignWare PCIe controller driver:

   - Clear outbound address on unmap so dw_pcie_find_index() won't match
     an ATU index that was already unmapped (Damien Le Moal)

   - Use of_property_present() instead of of_property_read_bool() when
     testing for presence of non-boolean DT properties (Rob Herring)

   - Advertise 1MB size if endpoint supports Resizable BARs, which was
     inadvertently lost in v6.11 (Niklas Cassel)

  TI J721E PCIe driver:

   - Add PCIe support for J722S SoC (Siddharth Vadapalli)

   - Delay PCIE_T_PVPERL_MS (100 ms), not just PCIE_T_PERST_CLK_US (100
     us), before deasserting PERST# to ensure power and refclk are
     stable (Siddharth Vadapalli)

  TI Keystone PCIe controller driver:

   - Set the 'ti,keystone-pcie' mode so v3.65a devices work in Root
     Complex mode (Kishon Vijay Abraham I)

   - Try to avoid unrecoverable SError for attempts to issue config
     transactions when the link is down; this is racy but the best we
     can do (Kishon Vijay Abraham I)

  Miscellaneous:

   - Reorganize kerneldoc parameter names to match order in function
     signature (Julia Lawall)

   - Fix sysfs reset_method_store() memory leak (Todd Kjos)

   - Simplify pci_create_slot() (Ilpo Järvinen)

   - Fix incorrect printf format specifiers in pcitest (Luo Yifan)"

* tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (127 commits)
  PCI: rockchip-ep: Handle PERST# signal in EP mode
  PCI: rockchip-ep: Improve link training
  PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
  PCI: rockchip-ep: Refactor endpoint link training enable
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
  PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
  PCI: rockchip-ep: Fix MSI IRQ data mapping
  PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
  PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
  PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
  PCI: rockchip-ep: Use a macro to define EP controller .align feature
  PCI: rockchip-ep: Fix address translation unit programming
  PCI/pwrctrl: Rename pwrctrl functions and structures
  PCI/pwrctrl: Rename pwrctl files to pwrctrl
  PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
  PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
  PCI/pwrctl: Create pwrctl device only if at least one power supply is present
  PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices
  tools: PCI: Fix incorrect printf format specifiers
  ...
2024-11-26 18:05:44 -08:00
Bjorn Helgaas
10099266de Merge branch 'pci/typos'
- Fix typos and whitespace errors (Bjorn Helgaas)

* pci/typos:
  PCI: Fix typos
2024-11-25 13:41:00 -06:00
Bjorn Helgaas
2d56427989 Merge branch 'pci/misc'
- Reorganize kerneldoc parameter names to match order in function signature
  (Julia Lawall)

- Remove kerneldoc return value descriptions from hotplug registration
  interfaces that don't return anything (Ilpo Järvinen)

- Fix sysfs reset_method_store() memory leak (Todd Kjos)

- Simplify pci_create_slot() (Ilpo Järvinen)

- Fix incorrect printf format specifiers in pcitest (Luo Yifan)

* pci/misc:
  tools: PCI: Fix incorrect printf format specifiers
  PCI: Simplify pci_create_slot() logic
  PCI: Fix reset_method_store() memory leak
  PCI: hotplug: Remove "Returns" kerneldoc from void functions
  PCI: hotplug: Reorganize kerneldoc parameter names
2024-11-25 13:41:00 -06:00
Bjorn Helgaas
e6b95ef1af Merge branch 'pci/controller/vmd'
- Add 0xb60b and 0xb06f Device IDs for client SKUs (Nirmal Patel)

* pci/controller/vmd:
  PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs
2024-11-25 13:41:00 -06:00
Bjorn Helgaas
c60603ca1d Merge branch 'pci/controller/tegra194'
- Move endpoint controller cleanups that depend on refclk from the host to
  the notifier that tells us the host has deasserted PERST# (Manivannan
  Sadhasivam)

* pci/controller/tegra194:
  PCI: tegra194: Move controller cleanups to pex_ep_event_pex_rst_deassert()
2024-11-25 13:41:00 -06:00
Bjorn Helgaas
72ae381b00 Merge branch 'pci/controller/rockchip'
- Fix address translation unit programming (Damien Le Moal)

- Define ROCKCHIP_PCIE_AT_SIZE_ALIGN to replace magic 256 endpoint .align
  value (Damien Le Moal)

- When unmapping an endpoint window, compute the region index instead of
  searching for it, and verify that the address was mapped (Damien Le Moal)

- When mapping an endpoint window, verify that the address hasn't been
  mapped already (Damien Le Moal)

- Implement pci_epc_ops.align_addr() for rockchip-ep (Damien Le Moal)

- Fix MSI IRQ data mapping to observe the alignment constraint, which fixes
  intermittent page faults in memcpy_toio() and memcpy_fromio() (Damien Le
  Moal)

- Rename rockchip_pcie_parse_ep_dt() to rockchip_pcie_ep_get_resources()
  for consistency with similar DT interfaces (Damien Le Moal)

- Factor out memory allocations to tidy rockchip_pcie_ep_probe() (Damien Le
  Moal)

- Factor out MSI-X quirk to tidy rockchip_pcie_ep_probe() (Damien Le Moal)

- Skip the unnecessary link train in rockchip_pcie_ep_probe() and only in
  the endpoint start operation (Damien Le Moal)

- Implement pci_epc_ops.stop_link() to disable link training and controller
  configuration (Damien Le Moal)

- Attempt link training at 5 GT/s when both partners support it (Damien Le
  Moal)

- Add a handler for PERST# signal so we can detect host resets and start
  link training when exiting reset (Damien Le Moal)

* pci/controller/rockchip:
  PCI: rockchip-ep: Handle PERST# signal in EP mode
  PCI: rockchip-ep: Improve link training
  PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
  PCI: rockchip-ep: Refactor endpoint link training enable
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
  PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
  PCI: rockchip-ep: Fix MSI IRQ data mapping
  PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
  PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
  PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
  PCI: rockchip-ep: Use a macro to define EP controller .align feature
  PCI: rockchip-ep: Fix address translation unit programming
2024-11-25 13:40:59 -06:00
Bjorn Helgaas
f54ff407e6 Merge branch 'pci/controller/qcom'
- Enable MSI interrupts if 'global' IRQ is supported, since a previous
  commit unintentionally masked them (Manivannan Sadhasivam)

- Move endpoint controller cleanups that depend on refclk from the host to
  the notifier that tells us the host has deasserted PERST# (Manivannan
  Sadhasivam)

- Add DT binding and driver support for IPQ9574, with Synopsys IP v5.80a
  and Qcom IP 1.27.0 (devi priya)

- Move the OPP "operating-points-v2" table from the qcom,pcie-sm8450.yaml
  DT binding to qcom,pcie-common.yaml, where it can be used by other Qcom
  platforms (Qiang Yu)

- Add 'global' SPI interrupt for events like link-up, link-down to
  qcom,pcie-x1e80100 DT binding so we can start enumeration when the link
  comes up (Qiang Yu)

- Disable ASPM L0s for qcom,pcie-x1e80100 since the PHY is not tuned to
  support this (Qiang Yu)

- Add ops_1_21_0 for SC8280X family SoC, which doesn't use the 'iommu-map'
  DT property and doesn't need BDF-to-SID translation (Qiang Yu)

* pci/controller/qcom:
  PCI: qcom: Disable ASPM L0s for X1E80100
  PCI: qcom: Remove BDF2SID mapping config for SC8280X family SoC
  dt-bindings: PCI: qcom,pcie-x1e80100: Add 'global' interrupt
  dt-bindings: PCI: qcom: Move OPP table to qcom,pcie-common.yaml
  PCI: qcom: Add support for IPQ9574
  dt-bindings: PCI: qcom: Document the IPQ9574 PCIe controller
  PCI: qcom-ep: Move controller cleanups to qcom_pcie_perst_deassert()
  PCI: qcom: Enable MSI interrupts together with Link up if 'Global IRQ' is supported
2024-11-25 13:40:59 -06:00
Bjorn Helgaas
7b5d234e69 Merge branch 'pci/controller/microchip'
- Add DT and driver support for using either of the two PolarFire Root
  Ports (Conor Dooley)

* pci/controller/microchip:
  PCI: microchip: Add support for using either Root Port 1 or 2
  dt-bindings: PCI: microchip,pcie-host: Add reg for Root Port 2
2024-11-25 13:40:59 -06:00
Bjorn Helgaas
4268106135 Merge branch 'pci/controller/mediatek'
- Add support for DT 'max-link-speed' and 'num-lanes' properties to
  restrict the link speed and width (AngeloGioacchino Del Regno)

* pci/controller/mediatek:
  PCI: mediatek-gen3: Remove unneeded semicolon
  PCI: mediatek-gen3: Add support for restricting link width
  PCI: mediatek-gen3: Add support for setting max-link-speed limit
2024-11-25 13:40:58 -06:00
Bjorn Helgaas
c1787c3e41 Merge branch 'pci/controller/keystone'
- Set the 'ti,keystone-pcie' mode so v3.65a devices work in Root Complex
  mode (Kishon Vijay Abraham I)

- Try to avoid unrecoverable SError for attempts to issue config
  transactions when the link is down; this is racy but the best we can do
  (Kishon Vijay Abraham I)

* pci/controller/keystone:
  PCI: keystone: Add link up check to ks_pcie_other_map_bus()
  PCI: keystone: Set mode as Root Complex for "ti,keystone-pcie" compatible
2024-11-25 13:40:58 -06:00
Bjorn Helgaas
5c8bd7f277 Merge branch 'pci/controller/j721e'
- Add PCIe support for J722S SoC (Siddharth Vadapalli)

- Delay PCIE_T_PVPERL_MS (100 ms), not just PCIE_T_PERST_CLK_US (100 us),
  before deasserting PERST# to ensure power and refclk are stable
  (Siddharth Vadapalli)

* pci/controller/j721e:
  PCI: j721e: Deassert PERST# after a delay of PCIE_T_PVPERL_MS milliseconds
  PCI: j721e: Add PCIe support for J722S SoC
2024-11-25 13:40:58 -06:00
Bjorn Helgaas
7b86e0a589 Merge branch 'pci/controller/imx6'
- Fix suspend/resume support on i.MX6QDL, which has a hardware erratum that
  prevents use of L2 (Stefan Eichenberger)

* pci/controller/imx6:
  PCI: imx6: Fix suspend/resume support on i.MX6QDL
2024-11-25 13:40:57 -06:00
Bjorn Helgaas
2b4049d192 Merge branch 'pci/controller/dwc'
- Clear outbound address on unmap so dw_pcie_find_index() won't match an
  ATU index that was already unmapped (Damien Le Moal)

- Use of_property_present() instead of of_property_read_bool() when testing
  for presence of non-boolean DT properties (Rob Herring)

- Advertise 1MB size if endpoint supports Resizable BARs, which was
  inadvertently lost in v6.11 (Niklas Cassel)

* pci/controller/dwc:
  PCI: dwc: ep: Fix advertised resizable BAR size regression
  PCI: dwc: Use of_property_present() for non-boolean properties
  PCI: dwc: endpoint: Clear outbound address on unmap
2024-11-25 13:40:57 -06:00
Bjorn Helgaas
5b8d59ca27 Merge branch 'pci/controller/cadence'
- Lower severity of 'phy-names' message (Bartosz Wawrzyniak)

* pci/controller/cadence:
  PCI: cadence: Lower severity of message when phy-names property is absent in DTS
2024-11-25 13:40:57 -06:00
Bjorn Helgaas
bd43348872 Merge branch 'pci/endpoint'
- Add pci_epc_function_is_valid() to avoid repeating common validation
  checks (Damien Le Moal)

- Skip attempts to allocate from endpoint controller memory window if the
  requested size is larger than the window (Damien Le Moal)

- Add and document pci_epc_mem_map() and pci_epc_mem_unmap() to handle
  controller-specific size and alignment constraints, and add test cases to
  the endpoint test driver (Damien Le Moal)

- Implement dwc pci_epc_ops.align_addr() so pci_epc_mem_map() can observe
  DWC-specific alignment requirements (Damien Le Moal)

- Synchronously cancel command handler work in endpoint test before
  cleaning up DMA and BARs (Damien Le Moal)

- Respect endpoint page size in dw_pcie_ep_align_addr() (Niklas Cassel)

- Use dw_pcie_ep_align_addr() in dw_pcie_ep_raise_msi_irq() and
  dw_pcie_ep_raise_msix_irq() instead of open coding the equivalent (Niklas
  Cassel)

- Remove superfluous 'return' from pci_epf_test_clean_dma_chan() (Wang
  Jiang)

- Avoid NULL dereference if Modem Host Interface Endpoint lacks 'mmio' DT
  property (Zhongqiu Han)

- Release PCI domain ID of Endpoint controller parent (not controller
  itself) and before unregistering the controller, to avoid use-after-free
  (Zijun Hu)

- Clear secondary (not primary) EPC in pci_epc_remove_epf() when removing
  the secondary controller associated with an NTB (Zijun Hu)

- Fix pci_epc_map map_size kerneldoc (Rick Wertenbroek)

* pci/endpoint:
  PCI: endpoint: Fix pci_epc_map map_size kerneldoc string
  PCI: endpoint: Clear secondary (not primary) EPC in pci_epc_remove_epf()
  PCI: endpoint: Fix PCI domain ID release in pci_epc_destroy()
  PCI: endpoint: epf-mhi: Avoid NULL dereference if DT lacks 'mmio'
  PCI: endpoint: Remove surplus return statement from pci_epf_test_clean_dma_chan()
  PCI: dwc: ep: Use align addr function for dw_pcie_ep_raise_{msi,msix}_irq()
  PCI: endpoint: test: Synchronously cancel command handler work
  PCI: dwc: endpoint: Implement the pci_epc_ops::align_addr() operation
  PCI: endpoint: test: Use pci_epc_mem_map/unmap()
  PCI: endpoint: Update documentation
  PCI: endpoint: Introduce pci_epc_mem_map()/unmap()
  PCI: endpoint: Improve pci_epc_mem_alloc_addr()
  PCI: endpoint: Introduce pci_epc_function_is_valid()
2024-11-25 13:40:56 -06:00
Bjorn Helgaas
5cdd50dc10 Merge branch 'pci/virtualization'
- Add ACS quirk for Wangxun FF5xxx NICs, which don't advertise and ACS
  capability but do isolate functions as though PCI_ACS_RR and PCI_ACS_CR
  were set, so the functions can be in independent IOMMU groups (Mengyuan
  Lou)

* pci/virtualization:
  PCI: Add ACS quirk for Wangxun FF5xxx NICs
2024-11-25 13:40:56 -06:00
Bjorn Helgaas
ab02bafcec Merge branch 'pci/tph'
- Add and document TLP Processing Hints (TPH) support so drivers can enable
  and disable TPH and the kernel can save/restore TPH configuration (Wei
  Huang)

- Add TPH Steering Tag support so drivers can retrieve Steering Tag values
  associated with specific CPUs via an ACPI _DSM to direct DMA writes
  closer to their consumers (Wei Huang)

* pci/tph:
  PCI/TPH: Add TPH documentation
  PCI/TPH: Add Steering Tag support
  PCI: Add TLP Processing Hints (TPH) support
2024-11-25 13:40:55 -06:00
Bjorn Helgaas
efcbd9d397 Merge branch 'pci/thunderbolt'
- Detect some Thunderbolt chips that are built-in and hence 'trustworthy'
  by a heuristic since the 'ExternalFacingPort' and 'usb4-host-interface'
  ACPI properties are not quite enough (Esther Shimanovich)

* pci/thunderbolt:
  PCI: Detect and trust built-in Thunderbolt chips
2024-11-25 13:40:55 -06:00
Bjorn Helgaas
c03d361c20 Merge branch 'pci/resource'
- Add resource_set_size() to set resource size when start has already been
  set (Ilpo Järvinen)

- Add resource_set_range() helper to set both resource start and size (Ilpo
  Järvinen)

- Use IS_ALIGNED() and resource_size() in quirk_s3_64M() instead of
  open-coding them (Ilpo Järvinen)

- Add ALIGN_DOWN_IF_NONZERO() to avoid code duplication when distributing
  resources across devices (Ilpo Järvinen)

- Improve pdev_sort_resources() warning message to be more specific (Ilpo
  Järvinen)

* pci/resource:
  PCI: Improve pdev_sort_resources() warning message
  PCI: Add ALIGN_DOWN_IF_NONZERO() helper
  PCI: Use align and resource helpers, and SZ_* in quirk_s3_64M()
  PCI: Use resource_set_{range,size}() helpers
  resource: Add resource set range and size helpers
2024-11-25 13:40:55 -06:00
Bjorn Helgaas
d985e2beb9 Merge branch 'pci/reset'
- Add sysfs 'reset_subordinate' to reset hierarchy below bridge (Keith
  Busch)

- Warn if we reset a running device where driver didn't register
  pci_error_handlers notification callbacks (Keith Busch)

* pci/reset:
  PCI: Warn if a running device is unaware of reset
  PCI: Add 'reset_subordinate' to reset hierarchy below bridge
2024-11-25 13:40:54 -06:00
Bjorn Helgaas
ce1deca962 Merge branch 'pci/pwrctl'
- Use of_platform_device_create() instead of of_platform_populate() to
  create pwrctl platform devices so we can control it based on the child
  nodes (Manivannan Sadhasivam)

- Create pwrctrl platform devices only if there's a relevant power supply
  property (Manivannan Sadhasivam)

- Add device link from the pwrctl supplier to the PCI dev to ensure pwrctl
  drivers are probed before the PCI dev driver; this avoids a race where
  pwrctl could change device power state while the PCI driver was active
  (Manivannan Sadhasivam)

- Find pwrctl device for removal with of_find_device_by_node() instead of
  searching all children of the parent (Manivannan Sadhasivam)

- Rename 'pwrctl' to 'pwrctrl' to use the same 'ctrl' suffix as 'bwctrl'
  and other PCI files to reduce confusion (Bjorn Helgaas)

* pci/pwrctl:
  PCI/pwrctrl: Rename pwrctrl functions and structures
  PCI/pwrctrl: Rename pwrctl files to pwrctrl
  PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
  PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
  PCI/pwrctl: Create pwrctl device only if at least one power supply is present
  PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices

# Conflicts:
#	drivers/pci/bus.c
#	drivers/pci/remove.c
2024-11-25 13:40:54 -06:00
Bjorn Helgaas
95e93032ba Merge branch 'pci/pm'
- Enable starfive controller runtime PM before probing host bridge (Mayank
  Rana)

- Enable runtime power management for host bridges (Krishna chaitanya
  chundru)

* pci/pm:
  PCI: Enable runtime PM of the host bridge
  PCI: starfive: Enable controller runtime PM before probing host bridge
2024-11-25 13:40:46 -06:00
Bjorn Helgaas
2438a74571 Merge branch 'pci/of'
- Use PCI bus addresses (not CPU addresses) in 'ranges' properties when
  building dynamic DT nodes so systems where the PCI and CPU addresses
  space differ work correctly (Andrea della Porta)

* pci/of:
  PCI: of_property: Assign PCI instead of CPU bus address to dynamic PCI nodes
2024-11-25 13:40:46 -06:00
Bjorn Helgaas
5d756f3fa8 Merge branch 'pci/locking'
- Make pci_stop_dev() and pci_destroy_dev() concurrent safe (Keith Busch)

- Move __pci_walk_bus() mutex up into the caller, which avoids the need for
  a parameter to control locking (Keith Busch)

- Simplify __pci_walk_bus() by making it recursive (Keith Busch)

- Unexport pci_walk_bus_locked(), which is only used internally by the PCI
  core (Keith Busch)

* pci/locking:
  PCI: Unexport pci_walk_bus_locked()
  PCI: Convert __pci_walk_bus() to be recursive
  PCI: Move __pci_walk_bus() mutex to where we need it
  PCI: Make pci_destroy_dev() concurrent safe
  PCI: Make pci_stop_dev() concurrent safe
2024-11-25 13:40:45 -06:00
Bjorn Helgaas
665e4a3456 Merge branch 'pci/hotplug-octeon'
- Add hotplug controller driver for Marvell OCTEON multi-function device
  where function 0 has a management console interface to enable/disable and
  provision various personalities for the other functions (Shijith Thotton)

* pci/hotplug-octeon:
  PCI: hotplug: Add OCTEON PCI hotplug controller driver
2024-11-25 13:40:45 -06:00
Bjorn Helgaas
dcd12456b3 Merge branch 'pci/hotplug'
- Remove unused cpcihp struct cpci_hp_controller_ops.hardware_test
  (Guilherme Giacomo Simoes)

- Remove unused cpqphp struct ctrl_dbg.ctrl (Christophe JAILLET)

- Clean up cpqphp PCIBIOS_* return value confusion (Ilpo Järvinen)

- Use pci_bus_read_dev_vendor_id() instead of hand-coded presence detection
  in cpqphp (Ilpo Järvinen)

- Simplify cpqphp enumeration, which is already simple-minded and doesn't
  handle devices below hot-added bridges (Ilpo Järvinen)

- Retain a reference to the pci_bus for the lifetime of a pci_slot to avoid
  a use-after-free when the thunderbolt driver resets USB4 host routers on
  boot, causing hotplug remove/add of downstream docks or other devices
  (Lukas Wunner)

* pci/hotplug:
  PCI: Fix use-after-free of slot->bus on hot remove
  PCI: cpqphp: Simplify PCI_ScanBusForNonBridge()
  PCI: cpqphp: Use define to read class/revision dword
  PCI: cpqphp: Use pci_bus_read_dev_vendor_id() to detect presence
  PCI: cpqphp: Fix PCIBIOS_* return value confusion
  PCI: cpqphp: Remove unused struct ctrl_dbg.ctrl
  PCI: cpcihp: Remove unused struct cpci_hp_controller_ops.hardware_test
2024-11-25 13:40:44 -06:00
Bjorn Helgaas
77ac2e28f1 Merge branch 'pci/enumeration'
- Simplify pci_read_bridge_bases() logic (Ilpo Järvinen)

* pci/enumeration:
  PCI: Simplify pci_read_bridge_bases() logic
  PCI: Move struct pci_bus_resource into bus.c
  PCI: Remove unused PCI_SUBTRACTIVE_DECODE
2024-11-25 13:40:44 -06:00
Bjorn Helgaas
dd97612368 Merge branch 'pci/driver-remove'
- Convert driver .remove_new() callbacks to .remove() again to finish the
  conversion from returning 'int' to being 'void' (Sergio Paracuellos)

* pci/driver-remove:
  PCI: acpiphp_ampere_altra: Switch back to struct platform_driver::remove()
  PCI: controller: Switch back to struct platform_driver::remove()
2024-11-25 13:40:44 -06:00
Bjorn Helgaas
f326ce1693 Merge branch 'pci/devm'
- Export pcim_request_all_regions(), a managed interface to request all
  BARs (Philipp Stanner)

- Replace pcim_iomap_regions_request_all() with pcim_request_all_regions(),
  and pcim_iomap_table()[n] with pcim_iomap(n), in the following drivers:
  ahci, crypto qat, crypto octeontx2, intel_th, iwlwifi, ntb idt, serial
  rp2, ALSA korg1212 (Philipp Stanner)

- Remove the now unused pcim_iomap_regions_request_all() (Philipp Stanner)

- Export pcim_iounmap_region(), a managed interface to unmap and release a
  PCI BAR (Philipp Stanner)

- Replace pcim_iomap_regions(mask) with pcim_iomap_region(n), and
  pcim_iounmap_regions(mask) with pcim_iounmap_region(n), in the following
  drivers: fpga dfl-pci, block mtip32xx, gpio-merrifield, cavium (Philipp
  Stanner)

* pci/devm:
  ethernet: cavium: Replace deprecated PCI functions
  gpio: Replace deprecated PCI functions
  fpga/dfl-pci.c: Replace deprecated PCI functions
  PCI: Deprecate pcim_iounmap_regions()
  PCI: Make pcim_iounmap_region() a public function
  PCI: Remove pcim_iomap_regions_request_all()
  ALSA: korg1212: Replace deprecated PCI functions
  serial: rp2: Replace deprecated PCI functions
  ntb: idt: Replace deprecated PCI functions
  wifi: iwlwifi: replace deprecated PCI functions
  intel_th: pci: Replace deprecated PCI functions
  crypto: marvell - replace deprecated PCI functions
  crypto: qat - replace deprecated PCI functions
  ata: ahci: Replace deprecated PCI functions
  PCI: Make pcim_request_all_regions() a public function
2024-11-25 13:40:43 -06:00
Bjorn Helgaas
73bdd7304a Merge branch 'pci/doe'
- Wait up to 1 second for DOE Busy bit to clear before writing a request to
  the mailbox to avoid failures if the mailbox is still busy from a
  previous transfer (Gregory Price)

* pci/doe:
  PCI/DOE: Poll DOE Busy bit for up to 1 second in pci_doe_send_req()
2024-11-25 13:40:43 -06:00
Bjorn Helgaas
d957ff7aca Merge branch 'pci/bwctrl'
- Add read/modify/write locking for Link Control 2, which is used to manage
  Link speed (Ilpo Järvinen)

- Cache all supported Link speeds for use by the PCIe bandwidth controller
  (Ilpo Järvinen)

- Extract the Link Bandwidth Management Status check into pcie_lbms_seen(),
  where it can be shared between the bandwidth controller and quirks that
  use it to help retrain failed links (Ilpo Järvinen)

- Re-add Link Bandwidth notification support with updates to address the
  reasons it was previously reverted (Alexandru Gagniuc, Ilpo Järvinen)

- Add pcie_set_target_speed() and related functionality to manage PCIe Link
  speed based on thermal constraints (Ilpo Järvinen)

- Add a thermal cooling driver to throttle PCIe Links via the existing
  thermal management framework (Ilpo Järvinen)

- Add a userspace selftest for the PCIe bandwidth controller (Ilpo
  Järvinen)

- Drop duplicate pcie_get_speed_cap(), pcie_get_width_cap() declarations
  (Bjorn Helgaas)

* pci/bwctrl:
  PCI: Drop duplicate pcie_get_speed_cap(), pcie_get_width_cap() declarations
  selftests/pcie_bwctrl: Create selftests
  thermal: Add PCIe cooling driver
  PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
  PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
  PCI: Abstract LBMS seen check into pcie_lbms_seen()
  PCI: Refactor pcie_update_link_speed()
  PCI: Store all PCIe Supported Link Speeds
  PCI: Protect Link Control 2 Register with RMW locking
  Documentation PCI: Reformat RMW ops documentation
2024-11-25 13:40:43 -06:00
Damien Le Moal
a7137cbf6b PCI: rockchip-ep: Handle PERST# signal in EP mode
Currently, the Rockchip PCIe endpoint controller driver does not handle
the PERST# signal, which prevents detecting when link training should
actually be started or if the host resets the device. This however can
be supported using the controller reset_gpios property set as an input
GPIO for endpoint mode.

Modify the Rockchip PCI endpoint controller driver to get the reset_gpio
and its associated interrupt which is serviced using a threaded IRQ with
the function rockchip_pcie_ep_perst_irq_thread() as handler.

This handler function notifies a link down event corresponding to the RC
side asserting the PERST# signal using pci_epc_linkdown() when the gpio
is high. Once the gpio value goes down, corresponding to the RC
de-asserting the PERST# signal, link training is started. The polarity
of the gpio interrupt trigger is changed from high to low after the RC
asserted PERST#, and conversely changed from low to high after the RC
de-asserts PERST#.

Also, given that the host mode controller and the endpoint mode
controller use two different property names for the same PERST# signal
(ep_gpios property and reset_gpios property respectively), for clarity,
rename the ep_gpio field of struct rockchip_pcie to perst_gpio.

Link: https://lore.kernel.org/r/20241017015849.190271-14-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
[kwilczynski: make log messages consistent, add missing include]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
bd6e61df4b PCI: rockchip-ep: Improve link training
The Rockchip RK3399 TRM V1.3 Part2, Section 17.5.8.1.2, step 7,
describes the endpoint mode link training process clearly and states
that:

  Insure link training completion and success by observing link_st field
  in PCIe Client BASIC_STATUS1 register change to 2'b11. If both side
  support PCIe Gen2 speed, re-train can be Initiated by asserting the
  Retrain Link field in Link Control and Status Register. The software
  should insure the BASIC_STATUS0[negotiated_speed] changes to "1", that
  indicates re-train to Gen2 successfully.

This procedure is very similar to what is done for the root-port mode
in rockchip_pcie_host_init_port().

Implement this link training procedure for the endpoint mode as well.
Given that the RK3399 SoC does not have an interrupt signaling link
status changes, training is implemented as a delayed work which is
rescheduled until the link training completes or the endpoint controller
is stopped. The link training work is first scheduled in
rockchip_pcie_ep_start() when the endpoint function is started. Link
training completion is signaled to the function using pci_epc_linkup().
Accordingly, the linkup_notifier field of the Rockchip pci_epc_features
structure is changed to true.

Link: https://lore.kernel.org/r/20241017015849.190271-13-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
[kwilczynski: update log messages to make them consistent]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
00080d0887 PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
Define the EPC operation ->stop() for the Rockchip endpoint driver with
the function rockchip_pcie_ep_stop(). This function disables link
training and the controller configuration, as the reverse to what
the start operation defined with rockchip_pcie_ep_start() does.

Link: https://lore.kernel.org/r/20241017015849.190271-12-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-25 13:18:36 -06:00
Damien Le Moal
091022f5f9 PCI: rockchip-ep: Refactor endpoint link training enable
The function rockchip_pcie_init_port() enables link training for a
controller configured in EP mode. Enabling link training is again done
in rockchip_pcie_ep_probe() after that function executed
rockchip_pcie_init_port(). Enabling link training only needs to be done
once, and doing so at the probe stage before the controller is actually
started by the user serves no purpose.

Refactor this by removing the link training enablement from both
rockchip_pcie_init_port() and rockchip_pcie_ep_probe() and moving it to
the endpoint start operation defined with rockchip_pcie_ep_start().
Enabling the controller configuration using the PCIE_CLIENT_CONF_ENABLE
bit in the same PCIE_CLIENT_CONFIG register is also moved to
rockchip_pcie_ep_start() and both the controller configuration and link
training enable bits are set with a single call to
rockchip_pcie_write().

Link: https://lore.kernel.org/r/20241017015849.190271-11-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
8efda8aebe PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
Move the code in rockchip_pcie_ep_probe() to hide the MSI-X capability
to its own function, rockchip_pcie_ep_hide_broken_msix_cap().

No functional changes.

Link: https://lore.kernel.org/r/20241017015849.190271-10-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-25 13:18:36 -06:00
Damien Le Moal
9456480194 PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
Introduce the function rockchip_pcie_ep_init_ob_mem() allocate the
outbound memory regions and memory needed for IRQ handling.

These changes tidy up rockchip_pcie_ep_probe().

No functional change.

Link: https://lore.kernel.org/r/20241017015849.190271-9-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
2968534e63 PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
To be consistent with the usual "get_resources" naming of driver
functions that acquire controller resources like clocks, PHY etc,
rename the function rockchip_pcie_parse_ep_dt() to
rockchip_pcie_ep_get_resources().

No functional changes.

Link: https://lore.kernel.org/r/20241017015849.190271-8-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
9f737cca6c PCI: rockchip-ep: Fix MSI IRQ data mapping
The call to rockchip_pcie_prog_ep_ob_atu() used to map the PCI address
of MSI data to the memory window allocated on probe for IRQs is done
in rockchip_pcie_ep_send_msi_irq() assuming a fixed alignment to a
256B boundary of the PCI address.  This is not correct as the alignment
constraint for the RK3399 PCI mapping depends on the number of bits of
address changing in the mapped region. This leads to an unstable system
which sometimes work and sometimes does not (crashing on paging faults
when memcpy_toio() or memcpy_fromio() are used).

Similar to regular data mapping, the MSI data mapping must thus be
handled according to the information provided by
rockchip_pcie_ep_align_addr(). Modify rockchip_pcie_ep_send_msi_irq()
to use rockchip_pcie_ep_align_addr() to correctly program entry 0 of
the ATU for sending MSI IRQs.

Link: https://lore.kernel.org/r/20241017015849.190271-7-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
b21255326d PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
The Rockchip PCIe endpoint controller handles PCIe transfers addresses
by masking the lower bits of the programmed PCI address and using the
same number of lower bits from the CPU address space used for the
mapping. For a PCI mapping of size bytes starting from pci_addr, the
number of bits masked is the number of address bits changing in the
address range [pci_addr..pci_addr + size - 1], up to 20 bits, that is,
up to 1MB mappings.

This means that when preparing a PCI address mapping, an endpoint
function driver must use an offset into the allocated controller
memory region that is equal to the mask of the starting PCI address
over rockchip_pcie_ep_ob_atu_num_bits() bits. This offset also
determines the maximum size of the mapping given the starting PCI
address and the fixed 1MB controller memory window size.

Implement the ->align_addr() endpoint controller operation to allow the
mapping alignment to be transparently handled by endpoint function
drivers through the function pci_epc_mem_map().

Link: https://lore.kernel.org/r/20241017015849.190271-6-dlemoal@kernel.org
Co-developed-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
[kwilczynski: change local variable name for address offset]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:36 -06:00
Damien Le Moal
d8dbd21cfa PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
Add a check to verify that the outbound region to be used for mapping an
address is not already in use.

Link: https://lore.kernel.org/r/20241017015849.190271-5-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-25 13:18:35 -06:00
Damien Le Moal
57ed93fe79 PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
There is no need to loop over all regions to find the memory window used
to map an address. We can use rockchip_ob_region() to determine the
region index, together with a check that the address passed as argument
is the address used to create the mapping. Furthermore, the
ob_region_map bitmap should also be checked to ensure that we are not
attempting to unmap an address that is not mapped.

Link: https://lore.kernel.org/r/20241017015849.190271-4-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-25 13:18:35 -06:00
Damien Le Moal
739e25f51a PCI: rockchip-ep: Use a macro to define EP controller .align feature
Introduce the macro ROCKCHIP_PCIE_AT_SIZE_ALIGN to initialize the .align
field of the controller epc_features structure to 256. This is defined
as a shift using the macro ROCKCHIP_PCIE_AT_MIN_NUM_BITS (to avoid
using the "magic" value 8 directly).

Link: https://lore.kernel.org/r/20241017015849.190271-3-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-25 13:18:35 -06:00
Damien Le Moal
64f093c4d9 PCI: rockchip-ep: Fix address translation unit programming
The Rockchip PCIe endpoint controller handles PCIe transfers addresses
by masking the lower bits of the programmed PCI address and using the
same number of lower bits masked from the CPU address space used for the
mapping. For a PCI mapping of <size> bytes starting from <pci_addr>,
the number of bits masked is the number of address bits changing in the
address range [pci_addr..pci_addr + size - 1].

However, rockchip_pcie_prog_ep_ob_atu() calculates num_pass_bits only
using the size of the mapping, resulting in an incorrect number of mask
bits depending on the value of the PCI address to map.

Fix this by introducing the helper function
rockchip_pcie_ep_ob_atu_num_bits() to correctly calculate the number of
mask bits to use to program the address translation unit. The number of
mask bits is calculated depending on both the PCI address and size of
the mapping, and clamped between 8 and 20 using the macros
ROCKCHIP_PCIE_AT_MIN_NUM_BITS and ROCKCHIP_PCIE_AT_MAX_NUM_BITS. As
defined in the Rockchip RK3399 TRM V1.3 Part2, Sections 17.5.5.1.1 and
17.6.8.2.1, this clamping is necessary because:

  1) The lower 8 bits of the PCI address to be mapped by the outbound
     region are ignored. So a minimum of 8 address bits are needed and
     imply that the PCI address must be aligned to 256.

  2) The outbound memory regions are 1MB in size. So while we can specify
     up to 63-bits for the PCI address (num_bits filed uses bits 0 to 5 of
     the outbound address region 0 register), we must limit the number of
     valid address bits to 20 to match the memory window maximum size (1
     << 20 = 1MB).

Fixes: cf590b0783 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Link: https://lore.kernel.org/r/20241017015849.190271-2-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2024-11-25 13:18:35 -06:00
Bjorn Helgaas
3f925cd628
PCI/pwrctrl: Rename pwrctrl functions and structures
Rename pwrctrl functions and structures from "pwrctl" to "pwrctrl" to match
the similar file renames.

Link: https://lore.kernel.org/r/20241115214428.2061153-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
2024-11-21 16:01:27 +00:00
Bjorn Helgaas
b88cbaaa6f
PCI/pwrctrl: Rename pwrctl files to pwrctrl
To slightly reduce confusion between "pwrctl" (the power controller and
power sequencing framework) and "bwctrl" (the bandwidth controller),
rename "pwrctl" to "pwrctrl" so they use the same "ctrl" suffix.

Rename drivers/pci/pwrctl/ to drivers/pci/pwrctrl/, including the related
MAINTAINERS, include file (include/linux/pci-pwrctl.h), Makefile, and
Kconfig changes.

This is the minimal rename of files only.  A subsequent commit will rename
functions and data structures.

Link: https://lore.kernel.org/r/20241115214428.2061153-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
2024-11-21 16:01:24 +00:00
Manivannan Sadhasivam
681725afb6
PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
There is no need to iterate over all children of the pwrctl device parent
to remove the pwrctl device. Since the pwrctl device associated with the
PCI device can be found using of_find_device_by_node() API, use it directly
instead.

Any pwrctl devices lying around without getting associated with the PCI
devices will be removed once their parent device gets removed.

Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-5-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-11-21 16:01:20 +00:00
Manivannan Sadhasivam
b458ff7e81
PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
As per the kernel device driver model, a pwrctl device is the supplier for
the PCI device, but the device link that enforces the supplier-consumer
relationship was previously created by the pwrctl driver. Therefore, the
driver model didn't prevent probing PCI client drivers before probing the
corresponding pwrctl drivers. This may lead to a race condition if the PCI
device was already powered on by the bootloader (before the pwrctl driver).

If the bootloader did not power on the PCI device, this wouldn't create any
problem as the pwrctl driver will be the one powering on the device, so the
PCI client driver always gets probed afterward. But if the device was
already powered on, then the device will be seen by the PCI core and the
PCI client driver may get probed before its pwrctl driver. This creates a
race condition as the pwrctl driver may change the device power state while
the device is being accessed by the client driver.

One such issue was already reported on the Qcom X13s platform with the WLAN
device and fixed with a hack in the WCN pwrseq driver by a9aaf1ff88
("power: sequencing: request the WLAN enable GPIO as-is").

A cleaner way to fix the above mentioned race condition is to ensure that
the pwrctl drivers are always probed before the client drivers.

If the PCI device is associated with a pwrctl platform device with a power
supply, add a device link between the PCI device and the pwrctl device
before device_attach() in pci_bus_add_device().

Note that there is no need to explicitly remove the device link as that
will be taken care of by the driver core when the PCI device gets removed.

Fixes: 4565d2652a ("PCI/pwrctl: Add PCI power control core code")
Fixes: 8fb18619d9 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-3-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: squash fix from
https://lore.kernel.org/r/20241120062459.6371-1-manivannan.sadhasivam@linaro.org
for SPARCv9 issue reported by Jonathan Currier <dullfire@yahoo.com>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: wrap code to 80 columns]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org		# Depends on power supply check
2024-11-21 16:01:13 +00:00
Manivannan Sadhasivam
278dd091e9
PCI/pwrctl: Create pwrctl device only if at least one power supply is present
Currently, pwrctl devices are created if the corresponding PCI nodes are
defined in devicetree. But this is not correct, because not all PCI nodes
require pwrctl support. Pwrctl comes into the picture only when the device
requires kernel to manage its power state. This can be determined using the
power supply properties present in the devicetree node of the device.

Add of_pci_supply_present() to check whether the devicetree contains at
least one power supply property for a device. If one is present, create a
pwrctl device for that PCI node.

Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 8fb18619d9 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-2-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: rename of_pci_is_supply_present() to of_pci_supply_present() for
readability]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org		# Depends on of_platform_device_create() rework
2024-11-21 16:01:04 +00:00
Manivannan Sadhasivam
7582fe07f4
PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices
The of_platform_populate() API creates platform devices by descending
through the children of the parent node. But it provides no control over
the child nodes, which makes it difficult to add checks for the child
nodes in the future.

Use of_platform_device_create() and for_each_child_of_node_scoped() to make
it possible to add checks for each node before creating the platform
device.

Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-1-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-11-21 16:01:00 +00:00
Linus Torvalds
14d0e1a09f soc: driver updates for 6.12
Nothing particular important in the SoC driver updates, just the usual
 improvements to for drivers/soc and a couple of subsystems that don't
 fit anywhere else:
 
  - The largest set of updates is for Qualcomm SoC drivers, extending the
    set of supported features for additional SoCs in the QSEECOM, LLCC
    and socinfo drivers.a
 
  - The ti_sci firmware driver gains support for power managment
 
  - The drivers/reset subsystem sees a rework of the microchip
    sparx5 and amlogic reset drivers to support additional chips,
    plus a few minor updates on other platforms
 
  - The SCMI firmware interface driver gains support for two protocol
    extensions, allowing more flexible use of the shared memory area
    and new DT binding properties for configurability.
 
  - Mediatek SoC drivers gain support for power managment on the MT8188
    SoC and a new driver for DVFS.
 
  - The AMD/Xilinx ZynqMP SoC drivers gain support for system reboot
    and a few bugfixes
 
  - The Hisilicon Kunpeng HCCS driver gains support for configuring
    lanes through sysfs
 
 Finally, there are cleanups and minor fixes for drivers/soc, drivers/bus,
 and drivers/memory, including changing back the .remove_new callback
 to .remove, as well as a few other updates for freescale (powerpc)
 soc drivers, NXP i.MX soc drivers, cznic turris platform driver, memory
 controller drviers, TI OMAP SoC drivers, and Tegra firmware drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmc+DsUACgkQYKtH/8kJ
 UifNWRAA49Ife6ybk8jamM9Bd07kFmHdaad0ttgUtx7HMJBg51+JLNFwTVYM2p6b
 A1SWCsS+sxP1RBKuhgZrt+sDPAoDlYLQaF1WQB7cs4FXqYpc2Po8BmBili5BV635
 Zv/9C9ofsWiWg9pGy0rRFvHW0W48lBoQM61YZzQc85pyEod5RSgji/jUEzvBvhln
 V3hegw0myBecJ8b7jH9Fjre3gMSC65amlXemkDS/7FGXXA7V3BKmALglJj6BR4RD
 QtQgFOAe/XGmbOguMvZJvVbMnW8PbmS5k50ppixBPAultHflkdg4DdnIW59yUfK+
 Mr98sW8U/LirACX93uwSzBNY1m5cW+GP4DoemxIUIQAvXxR4HroLoJdHS+BfWH+H
 Pn9dgSZu/dUlxfzTYzvd0B5TUjDGkYubVtQ00PLOWFHNfhZSmCqGl5J5NjgINRCf
 mBwhvUBYXgvNrOaEnll2kt2ONbxT7WAJAcKdnXKDjG4nPDyXBLRYoE4gro4Iii7+
 1OA7NlInwW+XFfpIIJeYa+AOTgb0/MKdONG+CkUnn6Bc9+B7Xdg0w0VDlmsVbXae
 fRyaI6XKmyNtmFZM4+gUxIhzvOgYpOoMITQJHcHSYuzWQpsnkkRas9aTCyBSLAd4
 D59cQwqtmE9rCfp3A7heMeKCIRtfJzoWnW0bjJAPSccLyJP99rI=
 =xeCE
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC driver updates from Arnd Bergmann:
 "Nothing particular important in the SoC driver updates, just the usual
  improvements to for drivers/soc and a couple of subsystems that don't
  fit anywhere else:

   - The largest set of updates is for Qualcomm SoC drivers, extending
     the set of supported features for additional SoCs in the QSEECOM,
     LLCC and socinfo drivers.a

   - The ti_sci firmware driver gains support for power managment

   - The drivers/reset subsystem sees a rework of the microchip sparx5
     and amlogic reset drivers to support additional chips, plus a few
     minor updates on other platforms

   - The SCMI firmware interface driver gains support for two protocol
     extensions, allowing more flexible use of the shared memory area
     and new DT binding properties for configurability.

   - Mediatek SoC drivers gain support for power managment on the MT8188
     SoC and a new driver for DVFS.

   - The AMD/Xilinx ZynqMP SoC drivers gain support for system reboot
     and a few bugfixes

   - The Hisilicon Kunpeng HCCS driver gains support for configuring
     lanes through sysfs

  Finally, there are cleanups and minor fixes for drivers/{soc, bus,
  memory}, including changing back the .remove_new callback to .remove,
  as well as a few other updates for freescale (powerpc) soc drivers,
  NXP i.MX soc drivers, cznic turris platform driver, memory controller
  drviers, TI OMAP SoC drivers, and Tegra firmware drivers"

* tag 'soc-drivers-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (116 commits)
  soc: fsl: cpm1: qmc: Set the ret error code on platform_get_irq() failure
  soc: fsl: rcpm: fix missing of_node_put() in copy_ippdexpcr1_setting()
  soc: fsl: cpm1: tsa: switch to for_each_available_child_of_node_scoped()
  platform: cznic: turris-omnia-mcu: Rename variable holding GPIO line names
  platform: cznic: turris-omnia-mcu: Document the driver private data structure
  firmware: turris-mox-rwtm: Document the driver private data structure
  bus: Switch back to struct platform_driver::remove()
  soc: qcom: ice: Remove the device_link field in qcom_ice
  drm/msm/adreno: Setup SMMU aparture for per-process page table
  firmware: qcom: scm: Introduce CP_SMMU_APERTURE_ID
  firmware: arm_scpi: Check the DVFS OPP count returned by the firmware
  soc: qcom: socinfo: add IPQ5424/IPQ5404 SoC ID
  dt-bindings: arm: qcom,ids: add SoC ID for IPQ5424/IPQ5404
  soc: qcom: llcc: Flip the manual slice configuration condition
  dt-bindings: firmware: qcom,scm: Document sm8750 SCM
  firmware: qcom: uefisecapp: Allow X1E Devkit devices
  misc: lan966x_pci: Fix dtc warn 'Missing interrupt-parent'
  misc: lan966x_pci: Fix dtc warns 'missing or empty reg/ranges property'
  soc: qcom: llcc: Add LLCC configuration for the QCS8300 platform
  dt-bindings: cache: qcom,llcc: Document the QCS8300 LLCC
  ...
2024-11-20 15:40:54 -08:00
Linus Torvalds
e6de688e93 Devicetree updates for v6.13:
Bindings:
 
 - Enable dtc "interrupt_provider" warnings for binding examples.
   Fix the warnings in fsl,mu-msi and ti,sci-inta due to this.
 
 - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton,  and
   altr,fpga-passive-serial to DT schema format
 
 - Add some documentation on the different forms of YAML text blocks
   which are a constant source of review comments
 
 - Fix some schema errors in constraints for arrays
 
 - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
 
 DT core:
 
 - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
 
 - Add some warnings on deprecated address handling
 
 - Rework early_init_dt_scan() so the arch can pass in the phys address
   of the DTB as __pa() is not always valid to use. This fixes a warning
   for arm64 with kexec.
 
 - Add and use some new DT graph iterators for iterating over ports and
   endpoints
 
 - Rework reserved-memory handling to be sized dynamically for fixed
   regions
 
 - Optimize of_modalias() to avoid a strlen() call
 
 - Constify struct device_node and property pointers where ever possible
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmc7qaoACgkQ+vtdtY28
 YcN54g/+Ifz4hQTSWV+VBhihovMMPiQUdxZ+MfJfPnPcZ7NJzaTf+zqhZyS4wQou
 v0pdtyR0B1fCM/EvKaYD+1aTTAQFEIT5Dqac+9ePwqaYqSk+yCTxyzW9m+P3rTPV
 THo8SGRss7T+Rs+2WaUGxphTJItMGIRdbBvoqK+82EdKFXXKw2BSD8tlJTWwbTam
 9xkrpUzw7f4FvVY8vVhRyOd5i8/v+FH8D65DMIT6ME9zRn4MzKVzCg6udgYeCBld
 C2XbV+wnyewtjrN2IX+2uQ2mheb7yJu3AEI3iFR5x/sRrsSLpisxrUl38xOOpxrM
 XxYtHgE3omjagQ+y+L2PMthlKvhFrXVXIvhUH8xxje5z1Vyq3VMfiABkHlMpAnys
 5LY4xEhvqDkPNo65UmjMiHxGW/xtcKsmAZBOp+HLerZfCJIFvl380fi8mNg/Sjvz
 7ExCSpzCPsHASZg7QCTplU3BUtg+067Ch/k8Hsn/Og73Pqm3xH4IezQZKwweN9ZT
 LC6OQBI7C3Yt1hom9qgUcA4H4/aaPxTVV7i0DGuAKh8Lon6SaoX2yFpweUBgbsL/
 c9DIW4vbYBIGASxxUbHlNMKvPCKACKmpFXhsnH5Waj+VWSOwsJ8bjGpH8PfMKdFW
 dyJB/r94GqCGpCW7+FC1qGmXiQJGkCo89pKBVjSf4Kj45ht/76o=
 =NCYS
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "Bindings:

   - Enable dtc "interrupt_provider" warnings for binding examples. Fix
     the warnings in fsl,mu-msi and ti,sci-inta due to this.

   - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
     altr,fpga-passive-serial to DT schema format

   - Add some documentation on the different forms of YAML text blocks
     which are a constant source of review comments

   - Fix some schema errors in constraints for arrays

   - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462

  DT core:

   - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n

   - Add some warnings on deprecated address handling

   - Rework early_init_dt_scan() so the arch can pass in the phys
     address of the DTB as __pa() is not always valid to use. This fixes
     a warning for arm64 with kexec.

   - Add and use some new DT graph iterators for iterating over ports
     and endpoints

   - Rework reserved-memory handling to be sized dynamically for fixed
     regions

   - Optimize of_modalias() to avoid a strlen() call

   - Constify struct device_node and property pointers where ever
     possible"

* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
  of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
  dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
  of/address: Rework bus matching to avoid warnings
  of: WARN on deprecated #address-cells/#size-cells handling
  of/fdt: Don't use default address cell sizes for address translation
  dt-bindings: Enable dtc "interrupt_provider" warnings
  of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
  dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
  dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
  dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
  media: xilinx-tpg: use new of_graph functions
  fbdev: omapfb: use new of_graph functions
  gpu: drm: omapdrm: use new of_graph functions
  ASoC: audio-graph-card2: use new of_graph functions
  ASoC: audio-graph-card: use new of_graph functions
  ASoC: test-component: use new of_graph functions
  of: property: use new of_graph functions
  of: property: add of_graph_get_next_port_endpoint()
  of: property: add of_graph_get_next_port()
  of: module: remove strlen() call in of_modalias()
  ...
2024-11-20 13:19:25 -08:00
Zijun Hu
688d2eb4c6
PCI: endpoint: Clear secondary (not primary) EPC in pci_epc_remove_epf()
In addition to a primary endpoint controller, an endpoint function may be
associated with a secondary endpoint controller, epf->sec_epc, to provide
NTB (non-transparent bridge) functionality.

Previously, pci_epc_remove_epf() incorrectly cleared epf->epc instead of
epf->sec_epc when removing from the secondary endpoint controller.

Extend the epc->list_lock coverage and clear either epf->epc or
epf->sec_epc as indicated.

Link: https://lore.kernel.org/r/20241107-epc_rfc-v2-2-da5b6a99a66f@quicinc.com
Fixes: 63840ff532 ("PCI: endpoint: Add support to associate secondary EPC with EPF")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org
2024-11-18 17:18:19 +00:00
Zijun Hu
4acc902ed3
PCI: endpoint: Fix PCI domain ID release in pci_epc_destroy()
pci_epc_destroy() invokes pci_bus_release_domain_nr() to release the PCI
domain ID, but there are two issues:

  - 'epc->dev' is passed to pci_bus_release_domain_nr() which was already
    freed by device_unregister(), leading to a use-after-free issue.

  - Domain ID corresponds to the EPC device parent, so passing 'epc->dev'
    is also wrong.

Fix these issues by passing 'epc->dev.parent' to
pci_bus_release_domain_nr() and also do it before device_unregister().

Fixes: 0328947c50 ("PCI: endpoint: Assign PCI domain number for endpoint controllers")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241107-epc_rfc-v2-1-da5b6a99a66f@quicinc.com
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org
2024-11-18 17:17:05 +00:00
Niklas Cassel
118397c9ba
PCI: dwc: ep: Fix advertised resizable BAR size regression
The advertised resizable BAR size was fixed in commit 72e34b8593 ("PCI:
dwc: endpoint: Fix advertised resizable BAR size").

Commit 867ab111b2 ("PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown()
API to handle Link Down event") was included shortly after this, and
moved the code to another function. When the code was moved, this fix
was mistakenly lost.

According to the spec, it is illegal to not have a bit set in
PCI_REBAR_CAP, and 1 MB is the smallest size allowed.

So, set bit 4 in PCI_REBAR_CAP, so that we actually advertise support
for a 1 MB BAR size.

Fixes: 867ab111b2 ("PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event")
Link: https://lore.kernel.org/r/20241116005950.2480427-2-cassel@kernel.org
Link: https://lore.kernel.org/r/20240606-pci-deinit-v1-3-4395534520dc@linaro.org
Link: https://lore.kernel.org/r/20240307111520.3303774-1-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org
2024-11-16 18:21:28 +00:00
Rob Herring (Arm)
154fc1f642
PCI: dwc: Use of_property_present() for non-boolean properties
The use of of_property_read_bool() for non-boolean properties is
deprecated in favor of of_property_present() when testing for property
presence.

Link: https://lore.kernel.org/r/20241104190714.275977-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-16 18:21:06 +00:00
Zhongqiu Han
5089b3d874
PCI: endpoint: epf-mhi: Avoid NULL dereference if DT lacks 'mmio'
If platform_get_resource_byname() fails and returns NULL because DT lacks
an 'mmio' property for the MHI endpoint, dereferencing res->start will
cause a NULL pointer access. Add a check to prevent it.

Fixes: 1bf5f25324 ("PCI: endpoint: Add PCI Endpoint function driver for MHI bus")
Link: https://lore.kernel.org/r/20241105120735.1240728-1-quic_zhonhan@quicinc.com
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
[kwilczynski: error message update per the review feedback]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-11-16 18:20:11 +00:00
Wang Jiang
9b80bdb10a
PCI: endpoint: Remove surplus return statement from pci_epf_test_clean_dma_chan()
Remove a surplus return statement from the void function that has been
added in the commit commit 8353813c88 ("PCI: endpoint: Enable DMA
tests for endpoints with DMA capabilities").

Especially, as an empty return statements at the end of a void functions
serve little purpose.

This fixes the following checkpatch.pl script warning:

  WARNING: void function return statements are not generally useful
  #296: FILE: drivers/pci/endpoint/functions/pci-epf-test.c:296:
  +     return;
  +}

Link: https://lore.kernel.org/r/tencent_F250BEE2A65745A524E2EFE70CF615CA8F06@qq.com
Signed-off-by: Wang Jiang <jiangwang@kylinos.cn>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2024-11-16 18:20:11 +00:00
Niklas Cassel
3fafc38b77
PCI: dwc: ep: Use align addr function for dw_pcie_ep_raise_{msi,msix}_irq()
Use the dw_pcie_ep_align_addr() function to calculate the alignment in
dw_pcie_ep_raise_{msi,msix}_irq() instead of open coding the same.

Link: https://lore.kernel.org/r/20241017132052.4014605-6-cassel@kernel.org
Link: https://lore.kernel.org/r/20241104205144.409236-2-cassel@kernel.org
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
[kwilczynski: squashed patch that fixes memory map sizes]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-11-16 18:20:01 +00:00
Bjorn Helgaas
ba58eee1c5 PCI: Drop duplicate pcie_get_speed_cap(), pcie_get_width_cap() declarations
6cf57be0f7 ("PCI: Add pcie_get_speed_cap() to find max supported link
speed") and c70b65fb7f ("PCI: Add pcie_get_width_cap() to find max
supported link width") added declarations to drivers/pci/pci.h.

576c7218a1 ("PCI: Export pcie_get_speed_cap and pcie_get_width_cap")
subsequently added duplicates to include/linux/pci.h.

Remove the originals from drivers/pci/pci.h.  Both interfaces are used by
amdgpu, so they must be in include/linux/pci.h.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
2024-11-16 10:09:30 -06:00
Ilpo Järvinen
d278b09828 thermal: Add PCIe cooling driver
Add a thermal cooling driver to provide path to access PCIe bandwidth
controller using the usual thermal interfaces.

A cooling device is instantiated for controllable PCIe Ports from the
bwctrl service driver.

If registering the cooling device fails, allow bwctrl's probe to succeed
regardless. As cdev in that case contains IS_ERR() pseudo "pointer", clean
that up inside the probe function so the remove side doesn't need to
suddenly make an odd looking IS_ERR() check.

The thermal side state 0 means no throttling, i.e., maximum supported PCIe
Link Speed.

Link: https://lore.kernel.org/r/20241018144755.7875-9-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: dropped data->cdev test per
https://lore.kernel.org/r/ZzRm1SJTwEMRsAr8@wunner.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # From the cooling device interface perspective
2024-11-16 10:09:30 -06:00
Ilpo Järvinen
de9a6c8d5d PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
Currently, PCIe Link Speeds are adjusted by custom code rather than in a
common function provided in PCI core. The PCIe bandwidth controller
(bwctrl) introduces an in-kernel API, pcie_set_target_speed(), to set PCIe
Link Speed.

Convert Target Speed quirk to use the new API. The Target Speed quirk runs
very early when bwctrl is not yet probed for a Port and can also run later
when bwctrl is already setup for the Port, which requires the per port
mutex (set_speed_mutex) to be only taken if the bwctrl setup is already
complete.

The new API is also intended to be used in an upcoming commit that adds a
thermal cooling device to throttle PCIe bandwidth when thermal thresholds
are reached.

The PCIe bandwidth control procedure is as follows. The highest speed
supported by the Port and the PCIe device which is not higher than the
requested speed is selected and written into the Target Link Speed in the
Link Control 2 Register. Then bandwidth controller retrains the PCIe Link.

Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to
keep track PCIe Link Speed changes. While Bandwidth Notifications should
also be generated when bandwidth controller alters the PCIe Link Speed, a
few platforms do not deliver LMBS interrupt after Link Training as
expected. Thus, after changing the Link Speed, bandwidth controller makes
additional read for the Link Status Register to ensure cur_bus_speed is
consistent with the new PCIe Link Speed.

Link: https://lore.kernel.org/r/20241018144755.7875-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash devm_mutex_init() error checking from
https://lore.kernel.org/r/20241030163139.2111689-1-andriy.shevchenko@linux.intel.com,
drop export of pcie_set_target_speed()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-16 10:09:30 -06:00
Ilpo Järvinen
665745f274 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
This mostly reverts the commit b4c7d2076b ("PCI/LINK: Remove bandwidth
notification"). An upcoming commit extends this driver building PCIe
bandwidth controller on top of it.

PCIe bandwidth notifications were first added in the commit e8303bb7a7
("PCI/LINK: Report degraded links via link bandwidth notification") but
later had to be removed. The significant changes compared with the old
bandwidth notification driver include:

1) Don't print the notifications into kernel log, just keep the Link
   Speed cached in struct pci_bus updated. While somewhat unfortunate,
   the log spam was the source of complaints that eventually lead to
   the removal of the bandwidth notifications driver (see the links
   below for further information).

2) Besides the Link Bandwidth Management Interrupt, also enable Link
   Autonomous Bandwidth Interrupt to cover the other source of bandwidth
   changes.

3) Handle Link Speed updates robustly. Refresh the cached Link Speed
   when enabling Bandwidth Notification Interrupts, and solve the race
   between Link Speed read and LBMS/LABS update in
   pcie_bwnotif_irq_thread().

4) Use concurrency safe LNKCTL RMW operations.

5) The driver is now called PCIe bwctrl (bandwidth controller) instead
   of just bandwidth notifications because of increased scope and
   functionality within the driver.

6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain().
   Provide LBMS counting API for it.

7) Tweaks to variable/functions names for consistency and length reasons.

Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to
keep track PCIe Link Speed changes.

[bhelgaas: This is based on previous work by Alexandru Gagniuc
<mr.nuke.me@gmail.com>; see e8303bb7a7 ("PCI/LINK: Report degraded links
via link bandwidth notification")]

Link: https://lore.kernel.org/r/20241018144755.7875-7-ilpo.jarvinen@linux.intel.com
Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/
Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/
Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/
Suggested-by: Lukas Wunner <lukas@wunner.de> # Building bwctrl on top of bwnotif
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler:
https://lore.kernel.org/r/20241115165717.15233-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-16 10:09:04 -06:00
Mengyuan Lou
aa46a3736a PCI: Add ACS quirk for Wangxun FF5xxx NICs
Wangxun FF5xxx NICs are similar to SFxxx, RP1000 and RP2000 NICs.  They may
be multi-function devices, but they do not advertise an ACS capability.

But the hardware does isolate FF5xxx functions as though it had an ACS
capability and PCI_ACS_RR and PCI_ACS_CR were set in the ACS Control
register, i.e., all peer-to-peer traffic is directed upstream instead of
being routed internally.

Add ACS quirk for FF5xxx NICs in pci_quirk_wangxun_nic_acs() so the
functions can be in independent IOMMU groups.

Link: https://lore.kernel.org/r/E16053DB2B80E9A5+20241115024604.30493-1-mengyuanlou@net-swift.com
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-15 17:30:15 -06:00
Bjorn Helgaas
31457d4cea PCI: Fix typos
Fix typos and whitespace errors.

Link: https://lore.kernel.org/r/20241102174537.1362183-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-15 14:50:00 -06:00
Andrea della Porta
5e316d34b5 PCI: of_property: Assign PCI instead of CPU bus address to dynamic PCI nodes
When populating "ranges" property for a PCI bridge or endpoint,
of_pci_prop_ranges() incorrectly uses the CPU address of the resource.  In
such PCI nodes, the window should instead be in PCI address space. Call
pci_bus_address() on the resource in order to obtain the PCI bus address.

[Previous discussion at:
https://lore.kernel.org/all/8b4fa91380fc4754ea80f47330c613e4f6b6592c.1724159867.git.andrea.porta@suse.com/]

Link: https://lore.kernel.org/r/20241108094256.28933-1-andrea.porta@suse.com
Fixes: 407d1a5192 ("PCI: Create device tree node for bridge")
Tested-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2024-11-15 11:17:52 -06:00
Shijith Thotton
e434e54d3f PCI: hotplug: Add OCTEON PCI hotplug controller driver
Add a PCI hotplug controller driver for the OCTEON PCIe device. The OCTEON
PCIe device is a multi-function device where function 0 serves as the PCI
hotplug controller.

There is an out-of-band management console interface to firmware running on
function 0 whereby an administrator can disable functions to save power or
enable them with one of several personalities (virtio-net, virtio-crypto,
NVMe, etc) for the other functions.  Function 0 initiates hotplug events
handled by this driver when the other functions are enabled or disabled.

                 +--------------------------------+
                 |           Root Port            |
                 +--------------------------------+
                                 |
                                PCIe
                                 |
  +---------------------------------------------------------------+
  |              OCTEON PCIe Multifunction Device                 |
  +---------------------------------------------------------------+
               |                    |              |            |
               |                    |              |            |
  +---------------------+  +----------------+  +-----+  +----------------+
  |      Function 0     |  |   Function 1   |  | ... |  |   Function 7   |
  | (Hotplug controller)|  | (Hotplug slot) |  |     |  | (Hotplug slot) |
  +---------------------+  +----------------+  +-----+  +----------------+
               |
               |
  +-------------------------+
  |   Controller Firmware   |
  +-------------------------+

The hotplug controller driver enables hotplugging of non-controller
functions within the same device. During probing, the driver removes
the non-controller functions and registers them as PCI hotplug slots.
These slots are added back by the driver, only upon request from the
device firmware.

The controller uses MSI-X interrupts to notify the host of hotplug
events initiated by the OCTEON firmware. Additionally, the driver
allows users to enable or disable individual functions via sysfs slot
entries, as provided by the PCI hotplug framework.

Link: https://lore.kernel.org/r/20241111134523.2796699-1-sthotton@marvell.com
Co-developed-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
[bhelgaas: use pci_info() when possible]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-11-13 17:51:39 -06:00
Keith Busch
a3151e6daa PCI: Warn if a running device is unaware of reset
If a reset is issued to a running device with a driver that didn't register
the notification callbacks, the driver may be unaware of this event and
have an inconsistent view of the device's state. Log a warning of this
event because there's nothing else indicating the event occured, which
could be confusing when debugging such situations.

Link: https://lore.kernel.org/r/20241025222755.3756162-2-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-11-13 16:48:49 -06:00
Keith Busch
2fa046449a PCI: Add 'reset_subordinate' to reset hierarchy below bridge
The "bus" and "cxl_bus" reset methods reset a device by asserting Secondary
Bus Reset on the bridge leading to the device.  These only work if the
device is the only device below the bridge.

Add a sysfs 'reset_subordinate' attribute on bridges that can assert
Secondary Bus Reset regardless of how many devices are below the bridge.

This resets all the devices below a bridge in a single command, including
the locking and config space save/restore that reset methods normally do.

This may be the only way to reset devices that don't support other reset
methods (ACPI, FLR, PM reset, etc).

Link: https://lore.kernel.org/r/20241025222755.3756162-1-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: commit log, add capable(CAP_SYS_ADMIN) check]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
2024-11-13 16:43:34 -06:00