Commit Graph

99 Commits

Author SHA1 Message Date
Ce Sun
48cb9c3b21 drm/amdgpu: The interrupt source was not released
When the driver is unloaded, the interrupt source of
the rma device is not released, resulting in the failure
of hw_init when loading again using bad_page_threshold.

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-15 14:07:51 -04:00
Dave Airlie
7e2818386a amd-drm-next-6.17-2025-07-01:
amdgpu:
 - FAMS2 fixes
 - OLED fixes
 - Misc cleanups
 - AUX fixes
 - DMCUB updates
 - SR-IOV hibernation support
 - RAS updates
 - DP tunneling fixes
 - DML2 fixes
 - Backlight improvements
 - Suspend improvements
 - Use scaling for non-native modes on eDP
 - SDMA 4.4.x fixes
 - PCIe DPM fixes
 - SDMA 5.x fixes
 - Cleaner shader updates for GC 9.x
 - Remove fence slab
 - ISP genpd support
 - Parition handling rework
 - SDMA FW checks for userq support
 - Add missing firmware declaration
 - Fix leak in amdgpu_ctx_mgr_entity_fini()
 - Freesync fix
 - Ring reset refactoring
 - Legacy dpm verbosity changes
 
 amdkfd:
 - GWS fix
 - mtype fix for ext coherent system memory
 - MMU notifier fix
 - gfx7/8 fix
 
 radeon:
 - CS validation support for additional GL extensions
 - Bump driver version for new CS validation checks
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaGQ6hwAKCRC93/aFa7yZ
 2JVJAQC/IpZoSmW22VrtXjBiQ3yoYt61oOM/WJVXPV11FmisyQEAqQ9nc+/lY9SM
 s5/RskaQGlfCGqafnaSGuoILkD/YDQ4=
 =i8mT
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.17-2025-07-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.17-2025-07-01:

amdgpu:
- FAMS2 fixes
- OLED fixes
- Misc cleanups
- AUX fixes
- DMCUB updates
- SR-IOV hibernation support
- RAS updates
- DP tunneling fixes
- DML2 fixes
- Backlight improvements
- Suspend improvements
- Use scaling for non-native modes on eDP
- SDMA 4.4.x fixes
- PCIe DPM fixes
- SDMA 5.x fixes
- Cleaner shader updates for GC 9.x
- Remove fence slab
- ISP genpd support
- Parition handling rework
- SDMA FW checks for userq support
- Add missing firmware declaration
- Fix leak in amdgpu_ctx_mgr_entity_fini()
- Freesync fix
- Ring reset refactoring
- Legacy dpm verbosity changes

amdkfd:
- GWS fix
- mtype fix for ext coherent system memory
- MMU notifier fix
- gfx7/8 fix

radeon:
- CS validation support for additional GL extensions
- Bump driver version for new CS validation checks

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250701194707.32905-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-07-04 10:06:29 +10:00
Lijo Lazar
a3e510fd69 drm/amdgpu: Convert from DRM_* to dev_*
Convert from generic DRM_* to dev_* calls to have device context info.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-30 11:54:55 -04:00
Samuel Zhang
4108c2be12 drm/amdgpu: fix fence fallback timer expired error
IH is not working after switching a new gpu index for the first time.

During VM resume, QEMU programming of VF MSIX table (register GFXMSIX_VECT0_ADDR_LO)
may not work.The access could be blocked by nBIF protection as VF isn't in
exclusive access mode. Exclusive access is enabled now, disable/enable MSIX
so that QEMU reprograms MSIX table.

call amdgpu_restore_msix on resume to restore msix table.

Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-18 12:19:15 -04:00
Linus Torvalds
b08494a8f7 drm for 6.16-rc1
new drivers:
 - bring in the asahi uapi header standalone
 - nova-drm: stub driver
 
 rust dependencies (for nova-core):
 - auxiliary
   - bus abstractions
   - driver registration
   - sample driver
 - devres changes from driver-core
 - revocable changes
 
 core:
 - add Apple fourcc modifiers
 - add virtio capset definitions
 - extend EXPORT_SYNC_FILE for timeline syncobjs
 - convert to devm_platform_ioremap_resource
 - refactor shmem helper page pinning
 - DP powerup/down link helpers
 - remove disgusting turds
 - extended %p4cc in vsprintf.c to support fourcc prints
 - change vsprintf %p4cn to %p4chR, remove %p4cn
 - Add drm_file_err function
 - IN_FORMATS_ASYNC property
 - move sitronix from tiny to their own subdir
 
 rust:
 - add drm core infrastructure rust abstractions
   (device/driver, ioctl, file, gem)
 
 dma-buf:
 - adjust sg handling to not cache map on attach
 - allow setting dma-device for import
 - Add a helper to sort and deduplicate dma_fence arrays
 
 docs:
 - updated drm scheduler docs
 - fbdev todo update
 - fb rendering
 - actual brightness
 
 ttm:
 - fix delayed destroy resv object
 
 bridge:
 - add kunit tests
 - convert tc358775 to atomic
 - convert drivers to devm_drm_bridge_alloc
 - convert rk3066_hdmi to bridge driver
 
 scheduler:
 - add kunit tests
 
 panel:
 - refcount panels to improve lifetime handling
 - Powertip PH128800T004-ZZA01
 - NLT NL13676BC25-03F, Tianma TM070JDHG34-00
 - Himax HX8279/HX8279-D DDIC
 - Visionox G2647FB105
 - Sitronix ST7571
 - ZOTAC rotation quirk
 
 vkms:
 - allow attaching more displays
 
 i915:
 - xe3lpd display updates
 - vrr refactor
 - intel_display struct conversions
 - xe2hpd memory type identification
 - add link rate/count to i915_display_info
 - cleanup VGA plane handling
 - refactor HDCP GSC
 - fix SLPC wait boosting reference counting
 - add 20ms delay to engine reset
 - fix fence release on early probe errors
 
 xe:
 - SRIOV updates
 - BMG PCI ID update
 - support separate firmware for each GT
 - SVM fix, prelim SVM multi-device work
 - export fan speed
 - temp disable d3cold on BMG
 - backup VRAM in PM notifier instead of suspend/freeze
 - update xe_ttm_access_memory to use GPU for non-visible access
 - fix guc_info debugfs for VFs
 - use copy_from_user instead of __copy_from_user
 - append PCIe gen5 limitations to xe_firmware document
 
 amdgpu:
 - DSC cleanup
 - DC Scaling updates
 - Fused I2C-over-AUX updates
 - DMUB updates
 - Use drm_file_err in amdgpu
 - Enforce isolation updates
 - Use new dma_fence helpers
 - USERQ fixes
 - Documentation updates
 - SR-IOV updates
 - RAS updates
 - PSP 12 cleanups
 - GC 9.5 updates
 - SMU 13.x updates
 - VCN / JPEG SR-IOV updates
 
 amdkfd:
 - Update error messages for SDMA
 - Userptr updates
 - XNACK fixes
 
 radeon:
 - CIK doorbell cleanup
 
 nouveau:
 - add support for NVIDIA r570 GSP firmware
 - enable Hopper/Blackwell support
 
 nova-core:
 - fix task list
 - register definition infrastructure
 - move firmware into own rust module
 - register auxiliary device for nova-drm
 
 nova-drm:
 - initial driver skeleton
 
 msm:
 - GPU:
   - ACD (adaptive clock distribution) for X1-85
   - drop fictional address_space_size
   - improve GMU HFI response time out robustness
   - fix crash when throttling during boot
 - DPU:
   - use single CTL path for flushing on DPU 5.x+
   - improve SSPP allocation code for better sharing
   - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
   - Added SAR2130P support
   - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
 - DP:
   - switch to new audio helpers
   - better LTTPR handling
 - DSI:
   - Added support for SA8775P
   - Added SAR2130P support
 - HDMI:
   - Switched to use new helpers for ACR data
   - Fixed old standing issue of HPD not working in some cases
 
 amdxdna:
 - add dma-buf support
 - allow empty command submits
 
 renesas:
 - add dma-buf support
 - add zpos, alpha, blend support
 
 panthor:
 - fail properly for NO_MMAP bos
 - add SET_LABEL ioctl
 - debugfs BO dumping support
 
 imagination:
 - update DT bindings
 - support TI AM68 GPU
 
 hibmc:
 - improve interrupt handling and HPD support
 
 virtio:
 - add panic handler support
 
 rockchip:
 - add RK3588 support
 - add DP AUX bus panel support
 
 ivpu:
 - add heartbeat based hangcheck
 
 mediatek:
 - prepares support for MT8195/99 HDMIv2/DDCv2
 
 anx7625:
 - improve HPD
 
 tegra:
 - speed up firmware loading
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmg2aVAACgkQDHTzWXnE
 hr6DjhAApr2fZjugU3EmpsARdcIWgEd+X65R97ef7RlUGqBKm2joSwZGOhH0oBsG
 9WyO92Qzu6XMe8OibKqY4D2hir9UPz5v+uEWe3q9CzZGbNyAwyVRjVkaKpnI9upv
 1dmHFI7HgPu6qbz6RfPIfgALBLXvVXMaQ4+ZgN/cLtZFa+OLAV5ByqWsRPPXZFb0
 F/pQGQ4ursglfA+LH3SVPfnTN53lu93IlM5/Os9OQQGj+44w94zQ6DCm7CY1AugH
 n+RM/0Yv7WaoF1ByeOtq4FcrmLRrd+ozsvITbRZqhOx7zS/mhP8LRzAwgKWOYzSh
 puKunyQiSdHR7FSqSi8uyY3YumcLWNa/17LMKoTf+KqweJbKGE7RVBuFBn6WUdPb
 AYHZrSB4USAeyahdrrsU+q7ltu5urs5ckpbXsRurMiaUz/BLim1PIm3N5FDLPY7B
 PD1n1FcMUv3CmJT5Y+aNIQgmf1/dETESRTSAgSoOo3gNp6jdRCYqSuWIBsppibWT
 26+tyz0/FGhE50QviHzg0Sv+jd/g93fN6snNlV8wNFMviq3bC69Toa+y3qJ5e7UC
 /42R7nCWdkCZJfr6E67rOaahe9TDV/LXLqPErwptOkdK8sMchaIgF+deybgTtTi/
 zGRBfjLvb5ocYBmPbeGX4mtXNRpyZ3o9I0QUyGUO4zMwFXmFwn0=
 =jpVr
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "As part of building up nova-core/nova-drm pieces we've brought in some
  rust abstractions through this tree, aux bus being the main one, with
  devres changes also in the driver-core tree. Along with the drm core
  abstractions and enough nova-core/nova-drm to use them. This is still
  all stub work under construction, to build the nova driver upstream.

  The other big NVIDIA related one is nouveau adds support for
  Hopper/Blackwell GPUs, this required a new GSP firmware update to
  570.144, and a bunch of rework in order to support multiple fw
  interfaces.

  There is also the introduction of an asahi uapi header file as a
  precursor to getting the real driver in later, but to unblock
  userspace mesa packages while the driver is trapped behind rust
  enablement.

  Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe,
  and msm being the main ones, and some changes to vsprintf.

  new drivers:
   - bring in the asahi uapi header standalone
   - nova-drm: stub driver

  rust dependencies (for nova-core):
   - auxiliary
       - bus abstractions
       - driver registration
       - sample driver
   - devres changes from driver-core
   - revocable changes

  core:
   - add Apple fourcc modifiers
   - add virtio capset definitions
   - extend EXPORT_SYNC_FILE for timeline syncobjs
   - convert to devm_platform_ioremap_resource
   - refactor shmem helper page pinning
   - DP powerup/down link helpers
   - extended %p4cc in vsprintf.c to support fourcc prints
   - change vsprintf %p4cn to %p4chR, remove %p4cn
   - Add drm_file_err function
   - IN_FORMATS_ASYNC property
   - move sitronix from tiny to their own subdir

  rust:
   - add drm core infrastructure rust abstractions
     (device/driver, ioctl, file, gem)

  dma-buf:
   - adjust sg handling to not cache map on attach
   - allow setting dma-device for import
   - Add a helper to sort and deduplicate dma_fence arrays

  docs:
   - updated drm scheduler docs
   - fbdev todo update
   - fb rendering
   - actual brightness

  ttm:
   - fix delayed destroy resv object

  bridge:
   - add kunit tests
   - convert tc358775 to atomic
   - convert drivers to devm_drm_bridge_alloc
   - convert rk3066_hdmi to bridge driver

  scheduler:
   - add kunit tests

  panel:
   - refcount panels to improve lifetime handling
   - Powertip PH128800T004-ZZA01
   - NLT NL13676BC25-03F, Tianma TM070JDHG34-00
   - Himax HX8279/HX8279-D DDIC
   - Visionox G2647FB105
   - Sitronix ST7571
   - ZOTAC rotation quirk

  vkms:
   - allow attaching more displays

  i915:
   - xe3lpd display updates
   - vrr refactor
   - intel_display struct conversions
   - xe2hpd memory type identification
   - add link rate/count to i915_display_info
   - cleanup VGA plane handling
   - refactor HDCP GSC
   - fix SLPC wait boosting reference counting
   - add 20ms delay to engine reset
   - fix fence release on early probe errors

  xe:
   - SRIOV updates
   - BMG PCI ID update
   - support separate firmware for each GT
   - SVM fix, prelim SVM multi-device work
   - export fan speed
   - temp disable d3cold on BMG
   - backup VRAM in PM notifier instead of suspend/freeze
   - update xe_ttm_access_memory to use GPU for non-visible access
   - fix guc_info debugfs for VFs
   - use copy_from_user instead of __copy_from_user
   - append PCIe gen5 limitations to xe_firmware document

  amdgpu:
   - DSC cleanup
   - DC Scaling updates
   - Fused I2C-over-AUX updates
   - DMUB updates
   - Use drm_file_err in amdgpu
   - Enforce isolation updates
   - Use new dma_fence helpers
   - USERQ fixes
   - Documentation updates
   - SR-IOV updates
   - RAS updates
   - PSP 12 cleanups
   - GC 9.5 updates
   - SMU 13.x updates
   - VCN / JPEG SR-IOV updates

  amdkfd:
   - Update error messages for SDMA
   - Userptr updates
   - XNACK fixes

  radeon:
   - CIK doorbell cleanup

  nouveau:
   - add support for NVIDIA r570 GSP firmware
   - enable Hopper/Blackwell support

  nova-core:
   - fix task list
   - register definition infrastructure
   - move firmware into own rust module
   - register auxiliary device for nova-drm

  nova-drm:
   - initial driver skeleton

  msm:
   - GPU:
       - ACD (adaptive clock distribution) for X1-85
       - drop fictional address_space_size
       - improve GMU HFI response time out robustness
       - fix crash when throttling during boot
   - DPU:
       - use single CTL path for flushing on DPU 5.x+
       - improve SSPP allocation code for better sharing
       - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
       - Added SAR2130P support
       - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
   - DP:
       - switch to new audio helpers
       - better LTTPR handling
   - DSI:
       - Added support for SA8775P
       - Added SAR2130P support
   - HDMI:
       - Switched to use new helpers for ACR data
       - Fixed old standing issue of HPD not working in some cases

  amdxdna:
   - add dma-buf support
   - allow empty command submits

  renesas:
   - add dma-buf support
   - add zpos, alpha, blend support

  panthor:
   - fail properly for NO_MMAP bos
   - add SET_LABEL ioctl
   - debugfs BO dumping support

  imagination:
   - update DT bindings
   - support TI AM68 GPU

  hibmc:
   - improve interrupt handling and HPD support

  virtio:
   - add panic handler support

  rockchip:
   - add RK3588 support
   - add DP AUX bus panel support

  ivpu:
   - add heartbeat based hangcheck

  mediatek:
   - prepares support for MT8195/99 HDMIv2/DDCv2

  anx7625:
   - improve HPD

  tegra:
   - speed up firmware loading

* tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits)
  drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr()
  drm/xe: Default auto_link_downgrade status to false
  drm/xe/guc: Make creation of SLPC debugfs files conditional
  drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue()
  drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read
  drm/i915/ptl: Use everywhere the correct DDI port clock select mask
  drm/nouveau/kms: add support for GB20x
  drm/dp: add option to disable zero sized address only transactions.
  drm/nouveau: add support for GB20x
  drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle
  drm/nouveau: add support for GB10x
  drm/nouveau/gf100-: track chan progress with non-WFI semaphore release
  drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA
  drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos
  drm/nouveau: add support for GH100
  drm/nouveau: improve handling of 64-bit BARs
  drm/nouveau/gv100-: switch to volta semaphore methods
  drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES
  drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY
  drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM
  ...
2025-05-28 09:46:39 -07:00
Jiri Slaby (SUSE)
493e109267 gpu: Switch to irq_domain_create_linear()
irq_domain_add_linear() is going away as being obsolete now. Switch to
the preferred irq_domain_create_linear(). That differs in the first
parameter: It takes more generic struct fwnode_handle instead of struct
device_node. Therefore, of_fwnode_handle() is added around the
parameter.

Note some of the users can likely use dev->fwnode directly instead of
indirect of_fwnode_handle(dev->of_node). But dev->fwnode is not
guaranteed to be set for all, so this has to be investigated on case to
case basis (by people who can actually test with the HW).

[ tglx: Fix up subject prefix ]

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-19-jirislaby@kernel.org
2025-05-16 21:06:09 +02:00
Ce Sun
f71509fdd0 drm/amdgpu: Fix the kernel panic caused by RAS records exceed threshold
kernel panic caused by RAS records exceeding the threshold when load driver
specifying RMA(bad_page_threshold=128)

1.Fix the warnings caused by disabling the interrupt source
before it was enabled
2.Fix kernel panic when xcp sysfs is not initialized,null pointer
appears during fini
3.Fix the memory leak caused by the device's early exit due to rma

The first reason:
[ 2744.246650] ------------[ cut here ]------------
[ 2744.246651] WARNING: CPU: 0 PID: 289 at /tmp/amd.BkfTLqYV/amd/amdgpu/amdgpu_irq.c:635 amdgpu_irq_put.cold+0x42/0x6e [amdgpu]
[ 2744.247108] Modules linked in: amdgpu(OE+) amddrm_ttm_helper(OE) amdttm(OE) amdxcp(OE) amddrm_buddy(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay binfmt_misc intel_rapl_msr intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel nls_iso8859_1 kvm rapl isst_if_mbox_pci isst_if_mmio pmt_telemetry pmt_crashlog isst_if_common pmt_class mei_me mei acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[ 2744.247167]  linear mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul ghash_clmulni_intel mlx5_core sysfillrect sysimgblt aesni_intel mlxfw fb_sys_fops psample cec crypto_simd cryptd rc_core i2c_i801 nvme xhci_pci tls intel_pmt drm pci_hyperv_intf nvme_core i2c_smbus i2c_ismt xhci_pci_renesas wmi pinctrl_emmitsburg
[ 2744.247194] CPU: 0 PID: 289 Comm: kworker/0:1 Tainted: G           OE     5.15.0-70-generic #77-Ubuntu
[ 2744.247197] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C23.AG.2 11/21/2024
[ 2744.247198] Workqueue: events work_for_cpu_fn
[ 2744.247206] RIP: 0010:amdgpu_irq_put.cold+0x42/0x6e [amdgpu]
[ 2744.247634] Code: 79 7f ff 44 89 ee 48 c7 c7 4d 5a 42 c2 89 55 d4 e8 90 09 bc bf 8b 55 d4 4c 89 e6 4c 89 ff e8 3c 76 7f ff 8b 55 d4 84 c0 75 07 <0f> 0b e9 95 79 7f ff 49 03 5c 24 08 f0 ff 0b 75 13 4c 89 e6 4c 89
[ 2744.247636] RSP: 0018:ffa0000019e27cb0 EFLAGS: 00010246
[ 2744.247639] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ff11000150fa87c0
[ 2744.247641] RDX: 0000000000000000 RSI: ffffffffc2222430 RDI: ff1100019f200000
[ 2744.247642] RBP: ffa0000019e27ce0 R08: 0000000000000003 R09: ffffffffffe41a08
[ 2744.247643] R10: 0000000000ffff0a R11: 0000000000000001 R12: ff1100019f22ce60
[ 2744.247644] R13: 0000000000000000 R14: 00000000ffffffea R15: ff1100019f200000
[ 2744.247645] FS:  0000000000000000(0000) GS:ff11007e7e400000(0000) knlGS:0000000000000000
[ 2744.247647] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2744.247649] CR2: 00007f3d2002819c CR3: 0000000006810003 CR4: 0000000000771ef0
[ 2744.247650] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2744.247651] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 2744.247652] PKRU: 55555554
[ 2744.247653] Call Trace:
[ 2744.247654]  <TASK>
[ 2744.247656]  sdma_v4_4_2_hw_fini+0x7a/0xc0 [amdgpu]
[ 2744.247997]  ? vcn_v4_0_3_hw_fini+0x5f/0xa0 [amdgpu]
[ 2744.248336]  amdgpu_ip_block_hw_fini+0x31/0x61 [amdgpu]
[ 2744.248776]  amdgpu_device_fini_hw+0x3bb/0x47b [amdgpu]
[ 2744.249197]  ? blocking_notifier_chain_unregister+0x56/0xb0
[ 2744.249202]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
[ 2744.249482]  amdgpu_driver_load_kms.cold+0x18/0x2e [amdgpu]
[ 2744.249913]  amdgpu_pci_probe+0x23e/0x590 [amdgpu]
[ 2744.250187]  local_pci_probe+0x48/0x90
[ 2744.250191]  work_for_cpu_fn+0x17/0x30
[ 2744.250196]  process_one_work+0x228/0x3d0
[ 2744.250198]  worker_thread+0x223/0x420
[ 2744.250200]  ? process_one_work+0x3d0/0x3d0
[ 2744.250201]  kthread+0x127/0x150
[ 2744.250204]  ? set_kthread_struct+0x50/0x50
[ 2744.250207]  ret_from_fork+0x1f/0x30
[ 2744.250212]  </TASK>
[ 2744.250213] ---[ end trace 488c997a88508bc3 ]---

The second reason:
[ 5139.303446] Memory manager not clean during takedown.
[ 5139.303509] WARNING: CPU: 145 PID: 117699 at drivers/gpu/drm/drm_mm.c:998 drm_mm_takedown+0x27/0x30 [drm]
[ 5139.303542] Modules linked in: amdgpu(OE+) amddrm_ttm_helper(OE) amdttm(OE) amdxcp(OE) amddrm_buddy(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel binfmt_misc kvm nls_iso8859_1 rapl isst_if_mbox_pci pmt_telemetry pmt_crashlog isst_if_mmio pmt_class isst_if_common mei_me mei acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[ 5139.303572]  linear mlx5_ib ib_uverbs ib_core crct10dif_pclmul ast crc32_pclmul i2c_algo_bit ghash_clmulni_intel aesni_intel crypto_simd drm_vram_helper cryptd drm_ttm_helper mlx5_core ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core mlxfw psample intel_pmt nvme xhci_pci drm tls i2c_i801 pci_hyperv_intf nvme_core i2c_smbus i2c_ismt xhci_pci_renesas wmi pinctrl_emmitsburg [last unloaded: amdkcl]
[ 5139.303588] CPU: 145 PID: 117699 Comm: modprobe Tainted: G     U     OE     5.15.0-70-generic #77-Ubuntu
[ 5139.303590] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C23.AG.2 11/21/2024
[ 5139.303591] RIP: 0010:drm_mm_takedown+0x27/0x30 [drm]
[ 5139.303605] Code: cc 66 90 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 f8 75 05 c3 cc cc cc cc 55 48 c7 c7 18 d0 10 c0 48 89 e5 e8 5a bc c3 c1 <0f> 0b 5d c3 cc cc cc cc 90 0f 1f 44 00 00 55 b9 15 00 00 00 48 89
[ 5139.303607] RSP: 0018:ffa00000325c3940 EFLAGS: 00010286
[ 5139.303608] RAX: 0000000000000000 RBX: ff1100012f5cecb0 RCX: 0000000000000027
[ 5139.303609] RDX: ff11007e7fa60588 RSI: 0000000000000001 RDI: ff11007e7fa60580
[ 5139.303610] RBP: ffa00000325c3940 R08: 0000000000000003 R09: fffffffff00c2b78
[ 5139.303610] R10: 000000000000002b R11: 0000000000000001 R12: ff1100012f5cec00
[ 5139.303611] R13: ff1100012138f068 R14: 0000000000000000 R15: ff1100012f5cec90
[ 5139.303611] FS:  00007f42ffca0000(0000) GS:ff11007e7fa40000(0000) knlGS:0000000000000000
[ 5139.303612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5139.303613] CR2: 00007f23d945ab68 CR3: 00000001212ce005 CR4: 0000000000771ee0
[ 5139.303614] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5139.303615] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 5139.303615] PKRU: 55555554
[ 5139.303616] Call Trace:
[ 5139.303617]  <TASK>
[ 5139.303619]  amdttm_range_man_fini_nocheck+0xfe/0x1c0 [amdttm]
[ 5139.303625]  amdgpu_ttm_fini+0x2ed/0x390 [amdgpu]
[ 5139.303800]  amdgpu_bo_fini+0x27/0xc0 [amdgpu]
[ 5139.303959]  gmc_v9_0_sw_fini+0x63/0x90 [amdgpu]
[ 5139.304144]  amdgpu_device_fini_sw+0x125/0x6a0 [amdgpu]
[ 5139.304302]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[ 5139.304455]  devm_drm_dev_init_release+0x4a/0x80 [drm]
[ 5139.304472]  devm_action_release+0x12/0x20
[ 5139.304476]  release_nodes+0x3d/0xb0
[ 5139.304478]  devres_release_all+0x9b/0xd0
[ 5139.304480]  really_probe+0x11d/0x420
[ 5139.304483]  __driver_probe_device+0x119/0x190
[ 5139.304485]  driver_probe_device+0x23/0xc0
[ 5139.304487]  __driver_attach+0xf7/0x1f0
[ 5139.304489]  ? __device_attach_driver+0x140/0x140
[ 5139.304491]  bus_for_each_dev+0x7c/0xd0
[ 5139.304493]  driver_attach+0x1e/0x30
[ 5139.304494]  bus_add_driver+0x148/0x220
[ 5139.304496]  driver_register+0x95/0x100
[ 5139.304498]  __pci_register_driver+0x68/0x70
[ 5139.304500]  amdgpu_init+0xbc/0x1000 [amdgpu]
[ 5139.304655]  ? 0xffffffffc0b8f000
[ 5139.304657]  do_one_initcall+0x46/0x1e0
[ 5139.304659]  ? kmem_cache_alloc_trace+0x19e/0x2e0
[ 5139.304663]  do_init_module+0x52/0x260
[ 5139.304665]  load_module+0xb2b/0xbc0
[ 5139.304667]  __do_sys_finit_module+0xbf/0x120
[ 5139.304669]  __x64_sys_finit_module+0x18/0x20
[ 5139.304670]  do_syscall_64+0x59/0xc0
[ 5139.304673]  ? exit_to_user_mode_prepare+0x37/0xb0
[ 5139.304676]  ? syscall_exit_to_user_mode+0x27/0x50
[ 5139.304678]  ? __x64_sys_mmap+0x33/0x50
[ 5139.304680]  ? do_syscall_64+0x69/0xc0
[ 5139.304681]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 5139.304684] RIP: 0033:0x7f42ffdbf88d
[ 5139.304686] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
[ 5139.304687] RSP: 002b:00007ffcb7427158 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 5139.304688] RAX: ffffffffffffffda RBX: 000055ce8b8f3150 RCX: 00007f42ffdbf88d
[ 5139.304689] RDX: 0000000000000000 RSI: 000055ce8b8f9a70 RDI: 000000000000000a
[ 5139.304690] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000011
[ 5139.304690] R10: 000000000000000a R11: 0000000000000246 R12: 000055ce8b8f9a70
[ 5139.304691] R13: 000055ce8b8f2ec0 R14: 000055ce8b8f2ab0 R15: 000055ce8b8f9aa0
[ 5139.304692]  </TASK>
[ 5139.304693] ---[ end trace 8536b052f7883003 ]---

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-13 09:32:11 -04:00
Pratap Nirujogi
0253d718a0 drm/amd/amdgpu: Map ISP interrupts as generic IRQs
Map ISP IH interrupts to Linux generic IRQ for ISP driver to
handle the interrupts using MFD IORESOURCE_IRQ resource.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-27 17:34:40 -04:00
Linus Torvalds
f0bae243b2 pci-v6.10-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmZLzNIUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwr/Q//STe2XGKI8bAKqP2wbbkzm+ISnK4A
 Lqf3FEAIXunxDRspszfXKKV2p4vaIkmOFiwIdtp/kWvd0DQn5+ATXJ/iQtp8aFX/
 R+6BQ7EZc2G7fN5fbQuK54+CvmWEpkKEMbXYbd6ivQ14Cijdb3Nbu+w+DYFjS+6C
 k2a9lS1bTW7Xcy0fyiO1w6GQiWqtmOH8U3OlQtIrI0EVkDG9OG1LsLuc92/FgkOo
 REN+sU+hX1K5fHrvm2CtjYDn/9/B6bJ/It22H1dPgUL9nKvKC67fYzosMtUCOX1M
 6XSPjZIuXOmQGeZXHhpSlVwaidxoUjYO98I7nMquxKdCy6yct3geK7ULG/xeQCgD
 ML7MGQB4+sTiSWalXUQaziKqF1FIDEvU3HMGXFWnoBL5l56eRp8KS1EI9Eqk9pU3
 pk9fJaCkcFnkzPtMFzqPOm5q9zUZ6bGbfYb0hs72TUKplmVDhFo2T1YsW2AOyHZ7
 mjuDzUYZX0H7uM1tntA56IgZX+oNOrLvhBt5L5M/BQeCsZFBBUfIcAEaYoL9LwXO
 AYgIG3jdqzHHyAUzutJF+XHKinJLMHm0XVYbFmO6saPhFzrUJSNHqT7NzW1DGGTl
 OnO8e1WNMX1EcnKvnc6fXyGmM3SgVwy45FsbG/zRnhn4uBKqKtjrh6uX/myA22LK
 CSeqSUK9XmXxFNA=
 =xjoS
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Skip E820 checks for MCFG ECAM regions for new (2016+) machines,
     since there's no requirement to describe them in E820 and some
     platforms require ECAM to work (Bjorn Helgaas)

   - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien
     Le Moal)

   - Remove last user and pci_enable_device_io() (Heiner Kallweit)

   - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen)

   - Skip waiting for devices that have been disconnected while
     suspended (Ilpo Järvinen)

   - Clear Secondary Status errors after enumeration since Master Aborts
     and Unsupported Request errors are an expected part of enumeration
     (Vidya Sagar)

  MSI:

   - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas)

  Error handling:

   - Mask Genesys GL975x SD host controller Replay Timer Timeout
     correctable errors caused by a hardware defect; the errors cause
     interrupts that prevent system suspend (Kai-Heng Feng)

   - Fix EDR-related _DSM support, which previously evaluated revision 5
     but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan)

  ASPM:

   - Simplify link state definitions and mask calculation (Ilpo
     Järvinen)

  Power management:

   - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS
     apparently doesn't know how to put them back in D0 (Mario
     Limonciello)

  CXL:

   - Support resetting CXL devices; special handling required because
     CXL Ports mask Secondary Bus Reset by default (Dave Jiang)

  DOE:

   - Support DOE Discovery Version 2 (Alexey Kardashevskiy)

  Endpoint framework:

   - Set endpoint BAR to be 64-bit if the driver says that's all the
     device supports, in addition to doing so if the size is >2GB
     (Niklas Cassel)

   - Simplify endpoint BAR allocation and setting interfaces (Niklas
     Cassel)

  Cadence PCIe controller driver:

   - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof
     Kozlowski)

  Cadence PCIe endpoint driver:

   - Configure endpoint BARs to be 64-bit based on the BAR type, not the
     BAR value (Niklas Cassel)

  Freescale Layerscape PCIe controller driver:

   - Convert DT binding to YAML (Frank Li)

  MediaTek MT7621 PCIe controller driver:

   - Add DT binding missing 'reg' property for child Root Ports
     (Krzysztof Kozlowski)

   - Fix theoretical string truncation in PHY name (Sergio Paracuellos)

  NVIDIA Tegra194 PCIe controller driver:

   - Return success for endpoint probe instead of falling through to the
     failure path (Vidya Sagar)

  Renesas R-Car PCIe controller driver:

   - Add DT binding missing IOMMU properties (Geert Uytterhoeven)

   - Add DT binding R-Car V4H compatible for host and endpoint mode
     (Yoshihiro Shimoda)

  Rockchip PCIe controller driver:

   - Configure endpoint BARs to be 64-bit based on the BAR type, not the
     BAR value (Niklas Cassel)

   - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski)

   - Set the Subsystem Vendor ID, which was previously zero because it
     was masked incorrectly (Rick Wertenbroek)

  Synopsys DesignWare PCIe controller driver:

   - Restructure DBI register access to accommodate devices where this
     requires Refclk to be active (Manivannan Sadhasivam)

   - Remove the deinit() callback, which was only need by the
     pcie-rcar-gen4, and do it directly in that driver (Manivannan
     Sadhasivam)

   - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean
     up things like eDMA (Manivannan Sadhasivam)

   - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel
     to dw_pcie_ep_init() (Manivannan Sadhasivam)

   - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to
     reflect the actual functionality (Manivannan Sadhasivam)

   - Call dw_pcie_ep_init_registers() directly from all the glue
     drivers, not just those that require active Refclk from the host
     (Manivannan Sadhasivam)

   - Remove the "core_init_notifier" flag, which was an obscure way for
     glue drivers to indicate that they depend on Refclk from the host
     (Manivannan Sadhasivam)

  TI J721E PCIe driver:

   - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli)

   - Add DT binding J722S SoC support (Siddharth Vadapalli)

  TI Keystone PCIe controller driver:

   - Add DT binding missing num-viewport, phys and phy-name properties
     (Jan Kiszka)

  Miscellaneous:

   - Constify and annotate with __ro_after_init (Heiner Kallweit)

   - Convert DT bindings to YAML (Krzysztof Kozlowski)

   - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming
     Zhou)"

* tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits)
  PCI: Do not wait for disconnected devices when resuming
  x86/pci: Skip early E820 check for ECAM region
  PCI: Remove unused pci_enable_device_io()
  ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
  PCI: Update pci_find_capability() stub return types
  PCI: Remove PCI_IRQ_LEGACY
  scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
  scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios
  Revert "genirq/msi: Provide constants for PCI/IMS support"
  Revert "x86/apic/msi: Enable PCI/IMS"
  Revert "iommu/vt-d: Enable PCI/IMS"
  Revert "iommu/amd: Enable PCI/IMS"
  Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support"
  ...
2024-05-21 10:09:28 -07:00
Ma Jun
2e55bcf3d7 drm/amdgpu: Initialize timestamp for some legacy SOCs
Initialize the interrupt timestamp for some legacy SOCs
to fix the coverity issue "Uninitialized scalar variable"

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-26 17:22:43 -04:00
Damien Le Moal
6927b01680 drm/amdgpu: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro.

Link: https://lore.kernel.org/r/20240325070944.3600338-11-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-25 12:53:31 -05:00
Ma Jun
a1ce3e1f7c drm/amd: Fix the flag setting code for interrupt request
[1] Remove the irq flags setting code since pci_alloc_irq_vectors()
handles these flags.
[2] Free the msi vectors in case of error.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11 17:10:09 -04:00
Philip Yang
bf80d34b6c drm/amdgpu: Increase soft IH ring size
Retry faults are delegated to soft IH ring and then processed by
deferred worker. Current soft IH ring size PAGE_SIZE can store 128
entries, which may overflow and drop retry faults, causes HW stucks
because the retry fault is not recovered.

Increase soft IH ring size to 8KB, enough to store 256 CAM entries
because we clear the CAM entry after handling the retry fault from soft
ring.

Define macro IH_RING_SIZE and IH_SW_RING_SIZE to remove duplicate
constant.

Show warning message if soft IH ring overflows with CAM enabled because
this should not happen.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 10:57:25 -04:00
Ma Jun
a1c23485b8 drm/amdgpu: Print client id for the unregistered interrupt resource
Modify the debug information and print the clien id for these
interrupts as well.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15 11:06:57 -04:00
Guchun Chen
8ffd6f0442 drm/amdgpu: keep irq count in amdgpu_irq_disable_all
This can clean up all irq warnings because of unbalanced
amdgpu_irq_get/put when unplugging/unbinding device, and leave
irq count decrease in each ip fini function.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:31:12 -04:00
Srinivasan Shanmugam
413521a4c9 drm/amd/amdgpu: Fix warnings in amdgpu_irq.c
checkpatch reports:

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: braces {} are not necessary for any arm of this statement
+               if (nvec <= 0) {
[...]
+               } else {
[...]
WARNING: Block comments use a trailing */ on a separate line

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:55:38 -04:00
Le Ma
0d81101c19 drm/amdgpu: complement the IH node_id table for multiple AIDs
With different node_id, the SDMA interrupt from multiple AIDs can be
distinguished by sw driver.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:43:50 -04:00
Le Ma
15091a6f43 drm/amdgpu: add node_id to physical id conversion in EOP handler
A new field nodeid in interrupt cookie indicates the node ID.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:42:13 -04:00
Alan Liu
1fa8d71057 drm/amdgpu: Fix desktop freezed after gpu-reset
[Why]
After gpu-reset, sometimes the driver fails to enable vblank irq,
causing flip_done timed out and the desktop freezed.

During gpu-reset, we disable and enable vblank irq in dm_suspend() and
dm_resume(). Later on in amdgpu_irq_gpu_reset_resume_helper(), we check
irqs' refcount and decide to enable or disable the irqs again.

However, we have 2 sets of API for controling vblank irq, one is
dm_vblank_get/put() and another is amdgpu_irq_get/put(). Each API has
its own refcount and flag to store the state of vblank irq, and they
are not synchronized.

In drm we use the first API to control vblank irq but in
amdgpu_irq_gpu_reset_resume_helper() we use the second set of API.

The failure happens when vblank irq was enabled by dm_vblank_get()
before gpu-reset, we have vblank->enabled true. However, during
gpu-reset, in amdgpu_irq_gpu_reset_resume_helper() vblank irq's state
checked from amdgpu_irq_update() is DISABLED. So finally it disables
vblank irq again. After gpu-reset, if there is a cursor plane commit,
the driver will try to enable vblank irq by calling drm_vblank_enable(),
but the vblank->enabled is still true, so it fails to turn on vblank
irq and causes flip_done can't be completed in vblank irq handler and
desktop become freezed.

[How]
Combining the 2 vblank control APIs by letting drm's API finally calls
amdgpu_irq's API, so the irq's refcount and state of both APIs can be
synchronized. Also add a check to prevent refcount from being less then
0 in amdgpu_irq_put().

v2:
- Add warning in amdgpu_irq_enable() if the irq is already disabled.
- Call dc_interrupt_set() in dm_set_vblank() to avoid refcount change
  if it is in gpu-reset.

v3:
- Improve commit message and code comments.

Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-18 16:28:54 -04:00
Thomas Zimmermann
973ad6273c drm/amdgpu: Remove unnecessary include statements for drm_crtc_helper.h
Several source files include drm_crtc_helper.h without needing it or
only to get its transitive include statements; leading to unnecessary
compile-time dependencies.

Directly include required headers and drop drm_crtc_helper.h where
possible.

v2:
	* keep includes sorted in amdgpu_device.c (Sam)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230116131235.18917-4-tzimmermann@suse.de
2023-01-18 09:25:30 +01:00
Alex Deucher
a347ca9720 drm/amdgpu: move non-DC vblank handling out of irq code
Move it into the DCE code for each generation. This avoids
confusion with the different display paths.

v2: no need for a hotplug worker for vkms

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 11:51:36 -05:00
Alex Deucher
a775e4e494 Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"
This reverts commit 92020e81dd.

This causes stuttering and timeouts with DMCUB for some users
so revert it until we understand why and safely enable it
to save power.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: stable@vger.kernel.org
2022-06-29 14:50:52 -04:00
Tao Zhou
b3c76814ce drm/amdgpu: add RAS fatal error interrupt handler
The fatal error handler is independent from general ras interrupt
handler since there is no related IH ring.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-22 14:50:18 -04:00
yipechai
2e54fe5d05 drm/amdgpu: Modify nbio block to fit for the unified ras block data and ops
1.Modify nbio block to fit for the unified ras block data and ops.
2.Change amdgpu_nbio_ras_funcs to amdgpu_nbio_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of mmhub ras variable so that nbio ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register nbio ras block into amdgpu device ras block link list.
5.Remove the redundant code about nbio in amdgpu_ras.c after using the unified ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:59 -05:00
Philip Yang
0771c80591 drm/amdgpu: Handle fault with same timestamp
Remove not unique timestamp WARNING as same timestamp interrupt happens
on some chips,

Drain fault need to wait for the processed_timestamp to be truly greater
than the checkpoint or the ring to be empty to be sure no stale faults
are handled.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1818
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:35 -05:00
Isabella Basso
bbe04dec5c drm/amd: fix improper docstring syntax
This fixes various warnings relating to erroneous docstring syntax, of
which some are listed below:

 warning: Function parameter or member 'adev' not described in
 'amdgpu_atomfirmware_ras_rom_addr'
 ...
 warning: expecting prototype for amdgpu_atpx_validate_functions().
 Prototype was for amdgpu_atpx_validate() instead
 ...
 warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new'
 ...
 warning: Cannot understand  * @kfd_get_cu_occupancy - Collect number of
 waves in-flight on this device
 ...
 warning: This comment starts with '/**', but isn't a kernel-doc
 comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Philip Yang
3c2d6ea279 drm/amdgpu: handle IH ring1 overflow
IH ring1 is used to process GPU retry fault, overflow is enabled to
drain retry fault because we want receive other interrupts while
handling retry fault to recover range. There is no overflow flag set
when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow
and drain retry fault.

If fault timestamp goes backward, the fault is filtered and should not
be processed. Drain fault is finished if processed_timestamp is equal to
or larger than checkpoint timestamp.

Add amdgpu_ih_functions interface decode_iv_ts for different chips to
get timestamp from IV entry with different iv size and timestamp offset.
amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:34 -05:00
Alex Deucher
92020e81dd drm/amdgpu/display: set vblank_disable_immediate for DC
Disable vblanks immediately to save power.  I think this was
missed when we merged DC support.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1781
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:03 -05:00
Linus Torvalds
477f70cd2a drm for v5.15-rc1
core:
 - extract i915 eDP backlight into core
 - DP aux bus support
 - drm_device.irq_enabled removed
 - port drivers to native irq interfaces
 - export gem shadow plane handling for vgem
 - print proper driver name in framebuffer registration
 - driver fixes for implicit fencing rules
 - ARM fixed rate compression modifier added
 - updated fb damage handling
 - rmfb ioctl logging/docs
 - drop drm_gem_object_put_locked
 - define DRM_FORMAT_MAX_PLANES
 - add gem fb vmap/vunmap helpers
 - add lockdep_assert(once) helpers
 - mark drm irq midlayer as legacy
 - use offset adjusted bo mapping conversion
 
 vgaarb:
 - cleanups
 
 fbdev:
 - extend efifb handling to all arches
 - div by 0 fixes for multiple drivers
 
 udmabuf:
 - add hugepage mapping support
 
 dma-buf:
 - non-dynamic exporter fixups
 - document implicit fencing rules
 
 amdgpu:
 - Initial Cyan Skillfish support
 - switch virtual DCE over to vkms based atomic
 - VCN/JPEG power down fixes
 - NAVI PCIE link handling fixes
 - AMD HDMI freesync fixes
 - Yellow Carp + Beige Goby fixes
 - Clockgating/S0ix/SMU/EEPROM fixes
 - embed hw fence in job
 - rework dma-resv handling
 - ensure eviction to system ram
 
 amdkfd:
 - uapi: SVM address range query added
 - sysfs leak fix
 - GPUVM TLB optimizations
 - vmfault/migration counters
 
 i915:
 - Enable JSL and EHL by default
 - preliminary XeHP/DG2 support
 - remove all CNL support (never shipped)
 - move to TTM for discrete memory support
 - allow mixed object mmap handling
 - GEM uAPI spring cleaning
   - add I915_MMAP_OBJECT_FIXED
   - reinstate ADL-P mmap ioctls
   - drop a bunch of unused by userspace features
   - disable and remove GPU relocations
 - revert some i915 misfeatures
 - major refactoring of GuC for Gen11+
 - execbuffer object locking separate step
 - reject caching/set-domain on discrete
 - Enable pipe DMC loading on XE-LPD and ADL-P
 - add PSF GV point support
 - Refactor and fix DDI buffer translations
 - Clean up FBC CFB allocation code
 - Finish INTEL_GEN() and friends macro conversions
 
 nouveau:
 - add eDP backlight support
 - implicit fence fix
 
 msm:
 - a680/7c3 support
 - drm/scheduler conversion
 
 panfrost:
 - rework GPU reset
 
 virtio:
 - fix fencing for planes
 
 ast:
 - add detect support
 
 bochs:
 - move to tiny GPU driver
 
 vc4:
 - use hotplug irqs
 - HDMI codec support
 
 vmwgfx:
 - use internal vmware device headers
 
 ingenic:
 - demidlayering irq
 
 rcar-du:
 - shutdown fixes
 - convert to bridge connector helpers
 
 zynqmp-dsub:
 - misc fixes
 
 mgag200:
 - convert PLL handling to atomic
 
 mediatek:
 - MT8133 AAL support
 - gem mmap object support
 - MT8167 support
 
 etnaviv:
 - NXP Layerscape LS1028A SoC support
 - GEM mmap cleanups
 
 tegra:
 - new user API
 
 exynos:
 - missing unlock fix
 - build warning fix
 - use refcount_t
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmEtvn8ACgkQDHTzWXnE
 hr7aqw//WfcIyGdPLjAz59cW8jm+FgihD5colHtOUYRHRO4GeX/bNNufquR8+N3y
 HESsyZdpihFHms/wURMq41ibmHg0EuHA01HZzjZuGBesG4F9I8sP/HnDOxDuYuAx
 N7Lg4PlUNlfFHmw7Y84owQ6s/XWmNp5iZ8e/mTK5hcraJFQKS4QO74n9RbG/F1vC
 Hc3P6AnpqGac2AEGXt0NjIRxVVCTUIBGx+XOhj+1AMyAGzt9VcO1DS9PVCS0zsEy
 zKMj9tZAPNg0wYsXAi4kA1lK7uVY8KoXSVDYLpsI5Or2/e7mfq2b4EWrezbtp6UA
 H+w86axuwJq7NaYHYH6HqyrLTOmvcHgIl2LoZN91KaNt61xfJT3XZkyQoYViGIrJ
 oZy6X/+s+WPoW98bHZrr6vbcxtWKfEeQyUFEAaDMmraKNJwROjtwgFC9DP8MDctq
 PUSM+XkwbGRRxQfv9dNKufeWfV5blVfzEJO8EfTU1YET3WTDaUHe/FoIcLZt2DZG
 JAJgZkIlU8egthPdakUjQz/KoyLMyovcN5zcjgzgjA9PyNEq74uElN9l446kSSxu
 jEVErOdd+aG3Zzk7/ZZL/RmpNQpPfpQ2RaPUkgeUsW01myNzUNuU3KUDaSlVa+Oi
 1n7eKoaQ2to/+LjhYApVriri4hIZckNNn5FnnhkgwGi8mpHQIVQ=
 =vZkA
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights:

   - i915 has seen a lot of refactoring and uAPI cleanups due to a
     change in the upstream direction going forward

     This has all been audited with known userspace, but there may be
     some pitfalls that were missed.

   - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs

   - i915 enables Jasper and Elkhart Lake by default and has preliminary
     XeHP/DG2 support

   - amdgpu adds support for Cyan Skillfish

   - lots of implicit fencing rules documented and fixed up in drivers

   - msm now uses the core scheduler

   - the irq midlayer has been removed for non-legacy drivers

   - the sysfb code now works on more than x86.

  Otherwise the usual smattering of stuff everywhere, panels, bridges,
  refactorings.

  Detailed summary:

  core:
   - extract i915 eDP backlight into core
   - DP aux bus support
   - drm_device.irq_enabled removed
   - port drivers to native irq interfaces
   - export gem shadow plane handling for vgem
   - print proper driver name in framebuffer registration
   - driver fixes for implicit fencing rules
   - ARM fixed rate compression modifier added
   - updated fb damage handling
   - rmfb ioctl logging/docs
   - drop drm_gem_object_put_locked
   - define DRM_FORMAT_MAX_PLANES
   - add gem fb vmap/vunmap helpers
   - add lockdep_assert(once) helpers
   - mark drm irq midlayer as legacy
   - use offset adjusted bo mapping conversion

  vgaarb:
   - cleanups

  fbdev:
   - extend efifb handling to all arches
   - div by 0 fixes for multiple drivers

  udmabuf:
   - add hugepage mapping support

  dma-buf:
   - non-dynamic exporter fixups
   - document implicit fencing rules

  amdgpu:
   - Initial Cyan Skillfish support
   - switch virtual DCE over to vkms based atomic
   - VCN/JPEG power down fixes
   - NAVI PCIE link handling fixes
   - AMD HDMI freesync fixes
   - Yellow Carp + Beige Goby fixes
   - Clockgating/S0ix/SMU/EEPROM fixes
   - embed hw fence in job
   - rework dma-resv handling
   - ensure eviction to system ram

  amdkfd:
   - uapi: SVM address range query added
   - sysfs leak fix
   - GPUVM TLB optimizations
   - vmfault/migration counters

  i915:
   - Enable JSL and EHL by default
   - preliminary XeHP/DG2 support
   - remove all CNL support (never shipped)
   - move to TTM for discrete memory support
   - allow mixed object mmap handling
   - GEM uAPI spring cleaning
       - add I915_MMAP_OBJECT_FIXED
       - reinstate ADL-P mmap ioctls
       - drop a bunch of unused by userspace features
       - disable and remove GPU relocations
   - revert some i915 misfeatures
   - major refactoring of GuC for Gen11+
   - execbuffer object locking separate step
   - reject caching/set-domain on discrete
   - Enable pipe DMC loading on XE-LPD and ADL-P
   - add PSF GV point support
   - Refactor and fix DDI buffer translations
   - Clean up FBC CFB allocation code
   - Finish INTEL_GEN() and friends macro conversions

  nouveau:
   - add eDP backlight support
   - implicit fence fix

  msm:
   - a680/7c3 support
   - drm/scheduler conversion

  panfrost:
   - rework GPU reset

  virtio:
   - fix fencing for planes

  ast:
   - add detect support

  bochs:
   - move to tiny GPU driver

  vc4:
   - use hotplug irqs
   - HDMI codec support

  vmwgfx:
   - use internal vmware device headers

  ingenic:
   - demidlayering irq

  rcar-du:
   - shutdown fixes
   - convert to bridge connector helpers

  zynqmp-dsub:
   - misc fixes

  mgag200:
   - convert PLL handling to atomic

  mediatek:
   - MT8133 AAL support
   - gem mmap object support
   - MT8167 support

  etnaviv:
   - NXP Layerscape LS1028A SoC support
   - GEM mmap cleanups

  tegra:
   - new user API

  exynos:
   - missing unlock fix
   - build warning fix
   - use refcount_t"

* tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits)
  drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box
  drm/amd/display: Remove duplicate dml init
  drm/amd/display: Update bounding box states (v2)
  drm/amd/display: Update number of DCN3 clock states
  drm/amdgpu: disable GFX CGCG in aldebaran
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amdgpu: Add support for RAS XGMI err query
  drm/amdkfd: Account for SH/SE count when setting up cu masks.
  drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
  drm/amdgpu: drop redundant cancel_delayed_work_sync call
  drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
  drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
  drm/amdkfd: map SVM range with correct access permission
  drm/amdkfd: check access permisson to restore retry fault
  drm/amdgpu: Update RAS XGMI Error Query
  drm/amdgpu: Add driver infrastructure for MCA RAS
  drm/amd/display: Add Logging for HDMI color depth information
  drm/amd/amdgpu: consolidate PSP TA init shared buf functions
  drm/amd/amdgpu: add name field back to ras_common_if
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  ...
2021-09-01 11:26:46 -07:00
Marc Zyngier
66c6594b6d gpu: Bulk conversion to generic_handle_domain_irq()
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-12 11:39:40 +01:00
Thomas Zimmermann
450d61794d drm/amdgpu: Convert to Linux IRQ interfaces
Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
don't benefit from using it.

DRM IRQ callbacks are now being called directly or inlined.

The interrupt number returned by pci_msi_vector() is now stored
in struct amdgpu_irq. Calls to pci_msi_vector() can fail and return
a negative errno code. Abort initlaizaton in thi case. The DRM IRQ
midlayer does not handle this correctly.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210803090704.32152-2-tzimmermann@suse.de
2021-08-10 20:00:44 +02:00
Dave Airlie
04d505de7f Merge tag 'amd-drm-next-5.15-2021-07-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-07-29:

amdgpu:
- VCN/JPEG power down sequencing fixes
- Various navi pcie link handling fixes
- Clockgating fixes
- Yellow Carp fixes
- Beige Goby fixes
- Misc code cleanups
- S0ix fixes
- SMU i2c bus rework
- EEPROM handling rework
- PSP ucode handling cleanup
- SMU error handling rework
- AMD HDMI freesync fixes
- USB PD firmware update rework
- MMIO based vram access rework
- Misc display fixes
- Backlight fixes
- Add initial Cyan Skillfish support
- Overclocking fixes suspend/resume

amdkfd:
- Sysfs leak fix
- Add counters for vm faults and migration
- GPUVM TLB optimizations

radeon:
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730033455.3852-1-alexander.deucher@amd.com
2021-07-30 16:48:35 +10:00
Chengzhe Liu
61a6813f3f drm/amdgpu: Add msix restore for pass-through mode
In pass-through mode, after mode 1 reset, msix enablement status would
lost and never receives interrupt again. So, we should restore msix
status after mode 1 reset.

Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:09:40 -04:00
Dave Airlie
588b3eee52 drm-misc-next for v5.15:
UAPI Changes:
 
 Cross-subsystem Changes:
 - udmabuf: Add support for mapping hugepages
 - Add dma-buf stats to sysfs.
 - Assorted fixes to fbdev/omap2.
 - dma-buf: Document DMA_BUF_IOCTL_SYNC
 - Improve dma-buf non-dynamic exporter expectations better.
 - Add module parameters for dma-buf size and list limit.
 - Add HDMI codec support to vc4, to replace vc4's own codec.
 - Document dma-buf implicit fencing rules.
 - dma_resv_test_signaled test_all handling.
 
 Core Changes:
 - Extract i915's eDP backlight code into DRM helpers.
 - Assorted docbook updates.
 - Rework drm_dp_aux documentation.
 - Add support for the DP aux bus.
 - Shrink dma-fence-chain slightly.
 - Add alloc/free helpers for dma-fence-chain.
 - Assorted fixes to TTM., drm/of, bridge
 - drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers.
 - Small fix for scheduler completion.
 - Remove use of drm_device.irq_enabled.
 - Print the driver name to dmesg when registering framebuffer.
 - Export drm/gem's shadow plane handling, and use it in vkms.
 - Assorted small fixes.
 
 Driver Changes:
 - Add eDP backlight to nouveau.
 - Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625,
   amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm.
 - Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU,
   EDT ETM0350G0DH6, Innolux EJ030NA panels.
 - Fix some simple pannels missing bus_format and connector types.
 - Add mks-guest-stats instrumentation support to vmwgfx.
 - Merge i915-ttm topic branch.
 - Make s6e63m0 panel use Mipi-DBI helpers.
 - Add detect() supoprt for AST.
 - Use interrupts for hotplug on vc4.
 - vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now.
 - vmwgfx now uses copies of vmware's internal device headers.
 - Slowly convert ti-sn65dsi83 over to atomic.
 - Rework amdgpu dma-resv handling.
 - Fix virtio fencing for planes.
 - Ensure amdgpu can always evict to SYSTEM.
 - Many drivers fixed for implicit fencing rules.
 - Set default prepare/cleanup fb for tiny, vram and simple helpers too.
 - Rework panfrost gpu reset and related serialization.
 - Update VKMS todo list.
 - Make bochs a tiny gpu driver, and use vram helper.
 - Use linux irq interfaces instead of drm_irq in some drivers.
 - Add support for Raspberry Pi Pico to GUD.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmDxaBwACgkQ/lWMcqZw
 E8PBYRAAsZgmuQU1urEsDTL931jWoJ8zxHpxSLow8ZtplembyhloGeRXRmGT8erd
 ocw1wAzm0UajbFLvv50XW5N4jPnsn9IBRQVhfNNc06g4OH6qy17PPAA+clHaBJrf
 BFiAcK4rzmUet3+6335ko/OvkD5er0s7ipNljxgB7FkIwP3gh3NEFG0yFcpFpxF4
 fzT5Wz5vMW++XUCXZHMX+vBMjFP2AosxLVvsnxpM/48dyFWTiYRg7jhy5bICKYBM
 3GdRj2e1wm3cAsZISbqtDpXSlstIw6u0w+BB6ryQvD/K5nPTqydE/YMOB85DUWLg
 Sp1tijxM/KtOyC5w/IpDLkf9X24KAIcu0eKffUGbkLvIkP5cSyibelOtZBG6Jmln
 AubXpgi4+mGVyYvMEVngHyrY2tW/rtpNGr/g9To9hYVHKkdRZUtolQk7KgtdV7v3
 pFq60AilYTENJthkjCRoTi66BsocpaJfQOyppp6uD8/a0Spxfrq5tM+POWNylqxB
 70L2ObvM4Xx51GI0ziCZQwkMp2Uzwosr+6CdbrzQKaxxpbQEcr3frkv6cap5V0WY
 lnYgFw3dbA/Ga6YsnInQ87KmF4svnaWB2z/KzfnBF5pNrwoR9/4K5k7Vfb3P9YyN
 w+nrfeHto0r768PjC/05uyD9diDuHOw3RHtljf/C4klBNRDDovU=
 =x8Eo
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-07-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:
- udmabuf: Add support for mapping hugepages
- Add dma-buf stats to sysfs.
- Assorted fixes to fbdev/omap2.
- dma-buf: Document DMA_BUF_IOCTL_SYNC
- Improve dma-buf non-dynamic exporter expectations better.
- Add module parameters for dma-buf size and list limit.
- Add HDMI codec support to vc4, to replace vc4's own codec.
- Document dma-buf implicit fencing rules.
- dma_resv_test_signaled test_all handling.

Core Changes:
- Extract i915's eDP backlight code into DRM helpers.
- Assorted docbook updates.
- Rework drm_dp_aux documentation.
- Add support for the DP aux bus.
- Shrink dma-fence-chain slightly.
- Add alloc/free helpers for dma-fence-chain.
- Assorted fixes to TTM., drm/of, bridge
- drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers.
- Small fix for scheduler completion.
- Remove use of drm_device.irq_enabled.
- Print the driver name to dmesg when registering framebuffer.
- Export drm/gem's shadow plane handling, and use it in vkms.
- Assorted small fixes.

Driver Changes:
- Add eDP backlight to nouveau.
- Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625,
  amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm.
- Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU,
  EDT ETM0350G0DH6, Innolux EJ030NA panels.
- Fix some simple pannels missing bus_format and connector types.
- Add mks-guest-stats instrumentation support to vmwgfx.
- Merge i915-ttm topic branch.
- Make s6e63m0 panel use Mipi-DBI helpers.
- Add detect() supoprt for AST.
- Use interrupts for hotplug on vc4.
- vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now.
- vmwgfx now uses copies of vmware's internal device headers.
- Slowly convert ti-sn65dsi83 over to atomic.
- Rework amdgpu dma-resv handling.
- Fix virtio fencing for planes.
- Ensure amdgpu can always evict to SYSTEM.
- Many drivers fixed for implicit fencing rules.
- Set default prepare/cleanup fb for tiny, vram and simple helpers too.
- Rework panfrost gpu reset and related serialization.
- Update VKMS todo list.
- Make bochs a tiny gpu driver, and use vram helper.
- Use linux irq interfaces instead of drm_irq in some drivers.
- Add support for Raspberry Pi Pico to GUD.

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Fri 16 Jul 2021 21:06:04 AEST
# gpg:                using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3
# gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten@debian.org>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired]
# gpg: Note: This key has expired!
# Primary key fingerprint: B97B D6A8 0CAC 4981 091A  E547 FE55 8C72 A670 13C3
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/444811c3-cbec-e9d5-9a6b-9632eda7962a@linux.intel.com
2021-07-21 11:58:28 +10:00
Emily.Deng
9be26ddf88 drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:48 -04:00
Emily.Deng
3c727c1c45 drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:35:04 -04:00
Thomas Zimmermann
0cabcf83b2 drm/amdgpu: Track IRQ state in local device state
Replace usage of struct drm_device.irq_enabled with the driver's
own state field struct amdgpu_device.irq.installed. The field in
the DRM device structure is considered legacy and should not be
used by KMS drivers.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625082222.3845-2-tzimmermann@suse.de
2021-06-29 11:03:17 +02:00
Andrey Grodzovsky
d10d0daa20 drm/amdgpu: Handle IOMMU enabled case.
Problem:
Handle all DMA IOMMU group related dependencies before the
group is removed. Those manifest themself in that when IOMMU
enabled DMA map/unmap is dependent on the presence of IOMMU
group the device belongs to but, this group is released once
the device is removed from PCI topology.

Fix:
Expedite all such unmap operations to pci remove driver callback.

v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
v6: Drop the BO unamp list
v7:
Drop amdgpu_gart_fini
In amdgpu_ih_ring_fini do uncinditional  check (!ih->ring)
to avoid freeing uniniitalized rings.
Call amdgpu_ih_ring_fini unconditionally.
v8: Add deatiled explanation

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210517143851.475058-1-andrey.grodzovsky@amd.com
2021-05-19 23:50:27 -04:00
Andrey Grodzovsky
72c8c97b15 drm/amdgpu: Split amdgpu_device_fini into early and late
Some of the stuff in amdgpu_device_fini such as HW interrupts
disable and pending fences finilization must be done right away on
pci_remove while most of the stuff which relates to finilizing and
releasing driver data structures can be kept until
drm_driver.release hook is called, i.e. when the last device
reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-3-andrey.grodzovsky@amd.com
2021-05-19 23:45:49 -04:00
Hawking Zhang
6e36f23193 drm/amdgpu: split nbio callbacks into ras and non-ras ones
nbio ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. split nbio callbacks
into ras and non-ras ones so gpu driver only
initializes nbio ras callbacks when it manages
nbio ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:51:04 -04:00
Christian König
266b2d25e3 drm/amdgpu: remove irq_src->data handling
That is unused for quite some time now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:43:28 -04:00
shaoyunl
c8941550aa drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac
This recent change introduce SDMA interrupt info printing with irq->process function.
These functions do not require a set function to enable/disable the irq

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:11:03 -04:00
Oak Zeng
56237c6aef drm/amdgpu: Fix IH client ID naming table
Client ID 26 is reserved. Add it to the table.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:53:22 -04:00
Yong Zhao
be14729a33 drm/amdgpu: Print the IH client ID name when vm fault happens
This gives more information and improves productivity.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:51:40 -04:00
Alex Deucher
f3e4a07fb7 drm/amdgpu: fix handling of irq domains on soc15 and newer GPUs
We need to take into account the client id otherwise we'll end
up sending generic events for any src id that is registered.

We only support irq domains on pre-soc15 parts so client is
always legacy.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-23 15:07:30 -05:00
Christian König
26f32a377e drm/amdgpu: add infrastructure for soft IH ring
Add a soft IH ring implementation similar to the hardware IH1/2.

This can be used if the hardware delegation of interrupts to IH1/2
doesn't work for some reason.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-24 12:06:38 -05:00
Luben Tuikov
4a580877bd drm/amdgpu: Get DRM dev from adev by inline-f
Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Luben Tuikov
1348969ab6 drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Aurabindo Pillai
dd4fa6c1b8 drm/amd/amdgpu: remove hardcoded module name in prints
Let format prefixes take care of printing the module name
through pr_fmt and dev_fmt definitions.

Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13 12:02:40 -04:00
Hawking Zhang
858a2bbad6 drm/amdgpu: pull ras controller int status only when ras enabled
ras_controller_irq and athub_err_event_irq are only registered
when PCIE_BIF ras is marked as supported. as the result, the driver
also just need pull the int status in such case.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 12:09:23 -05:00