Commit Graph

11996 Commits

Author SHA1 Message Date
Srinivasan Shanmugam
120ceaf78e drm/amd/amdgpu: Fix error do not initialise globals to 0
Global variables do not need to be initialized to 0 and checkpatch
flags this error in drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:

ERROR: do not initialise globals to 0
+int amdgpu_no_queue_eviction_on_vm_fault = 0;

Fix this checkpatch error.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Luben Tuikov
8782007b5f drm/amdgpu: Return from switch early for EEPROM I2C address
As soon as control->i2c_address is set, return; remove the "break;" from the
switch--it is unnecessary. This mimics what happens when for some cases in the
switch, we call helper functions with "return <helper function>".

Remove final function "return true;" to indicate that the switch is final and
terminal, and that there should be no code after the switch.

Cc: Candice Li <candice.li@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Luben Tuikov
1bb745d759 drm/amdgpu: Remove second moot switch to set EEPROM I2C address
Remove second switch since it already has its own function and case in the
first switch. This also avoids requalifying the EEPROM I2C address for VEGA20,
SIENNA CICHLID, and ALDEBARAN, as those have been set by the first switch and
shouldn't match SMU v13.0.x.

Cc: Candice Li <candice.li@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Fixes: 1582252946 ("drm/amdgpu: Add EEPROM I2C address for smu v13_0_0")
Fixes: c9bdc6c3cf ("drm/amdgpu: Add EEPROM I2C address support for ip discovery")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Stanley.Yang
059478929a drm/amdgpu: print ras drv fw debug info
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Hawking Zhang
9af357bc3e drm/amdgpu: Add fatal error handling in nbio v4_3
GPU will stop working once fatal error is detected.
it will inform driver to do reset to recover from
the fatal error.

v2: squash in logic fix (Srinivasan)
v3: squash in logic fix (Dan)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Tong Liu01
f5a5b08139 drm/amdgpu: skip unload tmr when tmr is not loaded
[why]
Skip TMR unload for Navi12 and CHIP_SIENNA_CICHLID SRIOV as TMR is
not loaded at all

Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-27 18:20:14 -04:00
Jack Xiao
5ee33d905f drm/amd/amdgpu: limit one queue per gang
Limit one queue per gang in mes self test,
due to mes schq fw change.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 01:07:04 -04:00
YiPeng Chai
28606c4e58 drm/amdgpu: resume ras for gfx v11_0_3 during reset on SRIOV
Gfx v11_0_3 supports ras on SRIOV, so need to resume ras
during reset.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:50 -04:00
YiPeng Chai
ec64350d01 drm/amdgpu: reinit mes ip block during reset on SRIOV
Reinit mes ip block during reset on SRIOV.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:44 -04:00
YiPeng Chai
fe120b9f5c drm/amdgpu: enable ras for mp0 v13_0_10 on SRIOV
Enable ras for mp0 v13_0_10 on SRIOV.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:37 -04:00
Kai-Heng Feng
3ad5dcfe00 drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is
caused by commit 0064b0ce85 ("drm/amd/pm: enable ASPM by default").

The root cause is still not clear for now.

So extend and apply the ASPM quirk from commit e02fe3bc7a
("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to
workaround the issue on Navi cards too.

Fixes: 0064b0ce85 ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:08 -04:00
Hawking Zhang
cfa0759827 drm/amdgpu: Initialize umc ras callback
Fix a coding error which results to null interrupt
handler for umc ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:01 -04:00
Lee Jones
a37558e63b drm/amd/amdgpu/amdgpu_vce: Provide description for amdgpu_vce_validate_bo()'s 'p' param
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:599: warning: Function parameter or member 'p' not described in 'amdgpu_vce_validate_bo'

v2: reword (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
0b9ff428de drm/amd/amdgpu/amdgpu_mes: Ensure amdgpu_bo_create_kernel()'s return value is checked
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_ctx_alloc_meta_data’:
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1099:13: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Jack Xiao <Jack.Xiao@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
25a75f56be drm/amd/amdgpu/ih_v6_0: Repair misspelling and provide descriptions for 'ih'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:392: warning: Function parameter or member 'ih' not described in 'ih_v6_0_get_wptr'
 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:432: warning: Function parameter or member 'ih' not described in 'ih_v6_0_irq_rearm'
 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:458: warning: Function parameter or member 'ih' not described in 'ih_v6_0_set_rptr'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
3e4bc662ec drm/amd/amdgpu/gmc_v11_0: Provide a few missing param descriptions relating to hubs and flushes
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'vmhub' not described in 'gmc_v11_0_flush_gpu_tlb'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'all_hub' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
4caee043bd drm/amd/amdgpu/amdgpu_vm_pt: Supply description for amdgpu_vm_pt_free_dfs()'s unlocked param
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:683: warning: Function parameter or member 'unlocked' not described in 'amdgpu_vm_pt_free_dfs'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
f5211c5ded drm/amd/amdgpu/amdgpu_ucode: Remove unused function ‘amdgpu_ucode_print_imu_hdr’
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:129:6: warning: no previous prototype for ‘amdgpu_ucode_print_imu_hdr’ [-Wmissing-prototypes]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Lee Jones
80bd2de1db drm/amd/amdgpu/amdgpu_device: Provide missing kerneldoc entry for 'reset_context'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:5152:
   warning: Function parameter or member 'reset_context' not described in 'amdgpu_device_gpu_recover'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Jane Jian
bf35dbc135 drm/amdgpu/jpeg: enable jpeg v4_0 for sriov
- skip direct jpeg registers read&write since it is not allowed
- reset Doorbell range layout for sriov

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Bill Liu
3cd658deb0 drm/amdgpu: Adding CAP firmware initialization
Added CAP firmware initialization for PSP v13.0.10 under psp_init_sriov_microcode

Signed-off-by: Bill Liu <Bill.Liu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Jane Jian
5957a96759 drm/amdgpu/gfx: set cg flags to enter/exit safe mode
sriov needs to enter/exit safe mode in update umd p state
add the cg flag to let it enter or exit while needed

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
YuBiao Wang
f466b111a0 drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.

[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Tong Liu01
3234fac0f9 drm/amdgpu: add mes resume when do gfx post soft reset
[why]
when gfx do soft reset, mes will also do reset, if mes is not
resumed when do recover from soft reset, mes is unable to respond
in later sequence

[how]
resume mes when do gfx post soft reset

Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Tim Huang
af1f298503 drm/amdgpu: skip ASIC reset for APUs when go to S4
For GC IP v11.0.4/11, PSP TMR need to be reserved
for ASIC mode2 reset. But for S4, when psp suspend,
it will destroy the TMR that fails the ASIC reset.

[  96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset
[  100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002
[  100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed!
[  100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62
[  100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62
[  100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs
[  100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62

We can skip the reset on APUs, assuming we can resume them
properly. Verified on some GFX11, GFX10 and old GFX9 APUs.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:59 -04:00
Tim Huang
d24eae4dd7 drm/amdgpu: reposition the gpu reset checking for reuse
Move the amdgpu_acpi_should_gpu_reset out of
CONFIG_SUSPEND to share it with hibernate case.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:58 -04:00
Alex Deucher
aef98f2e1b drm/amdgpu: drop the extra sign extension
amdgpu_bo_gpu_offset_no_check() already calls
amdgpu_gmc_sign_extend() so no need to call it twice.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:47:58 -04:00
Dave Airlie
c6265f5c2f Merge tag 'drm-misc-next-2023-03-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.4-rc1:

Cross-subsystem Changes:
- Add drm_bridge.h to drm_bridge maintainers.

Core Changes:
- Assorted fixes to TTM, tests, format-helper, accel.
- Assorted Makefile fixes to drivers and accel.
- Implement fbdev emulation for GEM DMA drivers, and convert a lot of
  drivers to use it.
- Use tgid instead of pid for tracking clients.

Driver Changes:
- Assorted fixes in rockchip, vmwgfx, nouveau, cirrus.
- Add imx25 driver.
- Add Elida KD50T048A, Sony TD4353, Novatek NT36523, STARRY 2081101QFH032011-53G panels.
- Add 4K mode support to rockchip.
- Convert cirrus to use regular atomic helpers, and more cirrus
  improvements.
- Add damage clipping to cirrus, virtio.

[airlied: add drm_bridge.h include to imx]
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f7b765c7-d49d-edb5-2a6a-4f7a7be16a59@linux.intel.com
2023-03-22 04:42:36 +10:00
Dave Airlie
90031bc33f Merge tag 'amd-drm-next-6.4-2023-03-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.4-2023-03-17:

amdgpu:
- Misc code cleanups
- Documentation fixes
- Make kobj structures const
- Add thermal throttling adjustments for supported APUs
- UMC RAS fixes
- Display reset fixes
- DCN 3.2 fixes
- Freesync fixes
- DC code reorg
- Generalize dmabuf import to work with KFD
- DC DML fixes
- SRIOV fixes
- UVD code cleanups
- IH 4.4.2 updates
- HDP 4.4.2 updates
- SDMA 4.4.2 updates
- PSP 13.0.6 updates
- Add capped/uncapped workload handling for supported APUs
- DCN 3.1.4 updates
- Re-org DC Kconfig
- USB4 fixes
- Reorg DC plane and stream handling
- Register vga_switcheroo for apple-gmux
- SMU 13.0.6 updates
- Fix error checking in read_mm_registers functions for affected families
- VCN 4.0.4 fix
- Drop redundant pci_enable_pcie_error_reporting() call
- RDNA2 SMU OD suspend/resume fix
- Expose additional memory stats via fdinfo
- RAS fixes
- Misc display fixes
- DP MST fixes
- IOMMU regression fix for KFD

amdkfd:
- Make kobj structures const
- Support for exporting buffers via dmabuf
- Multi-VMA page migration fixes
- NBIO fixes
- Misc code cleanups
- Fix possible double free
- Fix possible UAF

radeon:
- iMac fix

UAPI:
- KFD dmabuf export support.  Required for importing KFD buffers into GEM contexts and for RDMA P2P support.
  Proposed user mode changes: https://github.com/fxkamd/ROCT-Thunk-Interface/commits/fxkamd/dmabuf

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230317164416.138340-1-alexander.deucher@amd.com
2023-03-20 16:44:36 +10:00
Hawking Zhang
4489f0fd9e drm/amdgpu: Retire pcie_gen3_enable function
Not needed since from vi. drop the function so
we don't duplicate code when introduce new asics.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
dabc114e4b drm/amdgpu: Move to common helper to query soc rev_id
Replace soc15, nv, soc21 get_rev_id callback with common
helper so we don't need to duplicate code when introduce
new asics.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
65ba96e91b drm/amdgpu: Move to common indirect reg access helper
Replace soc15, nv, soc21 specific callbacks with common
one. so we don't need to duplicate code when introduce
new asics.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
370808876b drm/amdgpu: drop ras check at asic level for new blocks
amdgpu_ras_register_ras_block should always be invoked
by ras_sw_init, where driver needs to check ras caps
at ip level, instead of asic level.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
fdc94d3a8c drm/amdgpu: Rework pcie_bif ras sw_init
pcie_bif ras blocks needs to be initialized as early
as possible to handle fatal error detected in hw_init
phase. also align the pcie_bif ras sw_init with other
ras blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
da9d669eab drm/amdgpu: Rework xgmi_wafl_pcs ras sw_init
To align with other IP blocks.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Hawking Zhang
7f544c5488 drm/amdgpu: Rework mca ras sw_init
To align with other IP blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:26 -04:00
Yifan Zha
057e335c71 drm/amdgpu: Init MMVM_CONTEXTS_DISABLE in gmc11 golden setting under SRIOV
[Why]
If disable the mmhub vm contexts(set MMVM_CONTEXTS_DISABLE to 0xffff),
driver loading failed on vf due to fence fallback timer expired on all rings.
FLR cannot reset MMVM_CONTEXTS_DISABLE.
So this vf can not be recovered anymore unless trigger a whole gpu reset.

[How]
Under SRIOV, init MMVM_CONTEXTS_DISABLE in gmc11 golden register setting.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <Horace.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:26 -04:00
Tvrtko Ursulin
4230cea89c drm: Track clients by tgid and not tid
Thread group id (aka pid from userspace point of view) is a more
interesting thing to show as an owner of a DRM fd, so track and show that
instead of the thread id.

In the next patch we will make the owner updated post file descriptor
handover, which will also be tgid based to avoid ping-pong when multiple
threads access the fd.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230314141904.1210824-2-tvrtko.ursulin@linux.intel.com
2023-03-15 14:03:00 +01:00
Dave Airlie
faf0d83e10 Merge tag 'drm-misc-next-2023-03-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.4-rc1:

Note: Only changes since pull request from 2023-02-23 are included here.

UAPI Changes:
- Convert rockchip bindings to YAML.
- Constify kobj_type structure in dma-buf.
- FBDEV cmdline parser fixes, and other small fbdev fixes for mode
   parsing.

Cross-subsystem Changes:
- Add Neil Armstrong as linaro maintainer.
- Actually signal the private stub dma-fence.

Core Changes:
- Add function for adding syncobj dep to sched_job and use it in panfrost, v3d.
- Improve DisplayID 2.0 topology parsing and EDID parsing in general.
- Add a gem eviction function and callback for generic GEM shrinker
  purposes.
- Prepare to convert shmem helper to use the GEM reservation lock instead of own
  locking. (Actual commit itself got reverted for now)
- Move the suballocator from radeon and amdgpu drivers to core in preparation
  for Xe.
- Assorted small fixes and documentation.
- Fixes to HPD polling.
- Assorted small fixes in simpledrm, bridge, accel, shmem-helper,
   and the selftest of format-helper.
- Remove dummy resource when ttm bo is created, and during pipelined
   gutting. Fix all drivers to accept a NULL ttm_bo->resource.
- Handle pinned BO moving prevention in ttm core.
- Set drm panel-bridge orientation before connector is registered.
- Remove dumb_destroy callback.
- Add documentation to GEM_CLOSE, PRIME_HANDLE_TO_FD, PRIME_FD_TO_HANDLE, GETFB2 ioctl's.
- Add atomic enable_plane callback, use it in ast, mgag200, tidss.

Driver Changes:
- Use drm_gem_objects_lookup in vc4.
- Assorted small fixes to virtio, ast, bridge/tc358762, meson, nouveau.
- Allow virtio KMS to be disabled and compiled out.
- Add Radxa 8/10HD, Samsung AMS495QA01 panels.
- Fix ivpu compiler errors.
- Assorted fixes to drm/panel, malidp, rockchip, ivpu, amdgpu, vgem,
   nouveau, vc4.
- Assorted cleanups, simplifications and fixes to vmwgfx.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ac1f5186-54bb-02f4-ac56-907f5b76f3de@linux.intel.com
2023-03-14 12:18:54 +10:00
Guilherme G. Piccoli
1aff0a5d71 drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes
The VCN firmware loading path enables the indirect SRAM mode if it's
advertised as supported. We might have some cases of FW issues that
prevents this mode to working properly though, ending-up in a failed
probe. An example below, observed in the Steam Deck:

[...]
[drm] failed to load ucode VCN0_RAM(0x3A)
[drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
[...]

Disabling the VCN block circumvents this, but it's a very invasive
workaround that turns off the entire feature. So, let's add a quirk
on VCN loading that checks for known problematic BIOSes on Vangogh,
so we can proactively disable the indirect SRAM mode and allow the
HW proper probe and VCN IP block to work fine.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2385
Fixes: 82132ecc54 ("drm/amdgpu: enable Vangogh VCN indirect sram mode")
Cc: stable@vger.kernel.org
Cc: James Zhu <James.Zhu@amd.com>
Cc: Leo Liu <leo.liu@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:49 -04:00
Alex Deucher
84b31d484e drm/amdgpu/nv: fix codec array for SR_IOV
Copy paste error.

Fixes: 384334120b ("drm/amdgpu/nv: don't expose AV1 if VCN0 is harvested")
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4454
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:49 -04:00
Hawking Zhang
474e2d491e drm/amdgpu: Move hdp ras block init to ras sw_init
Initialize hdp ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Hawking Zhang
fec70a8601 drm/amdgpu: Move mmhub ras block init to ras sw_init
Initialize mmhub ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Hawking Zhang
af8312a38f drm/amdgpu: Correct gfx ras_late_init callback
Use default gfx ras_late_init callback for gfx
ras block.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Hawking Zhang
a6dcf9a7cc drm/amdgpu: Move umc ras block init to gmc ras sw_init
Initialize umc ras block only when umc ip block
supports ras. Driver queries ras capabilities after
early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Hawking Zhang
f81c31d975 drm/amdgpu: Move vcn ras block init to ras sw_init
Initialize vcn ras block only when vcn ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_int.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Hawking Zhang
5640e06e60 drm/amdgpu: Move jpeg ras block init to ras sw_init
Initialize jpeg ras block only when jpeg ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_int.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Guchun Chen
c69d51395a drm/amdgpu: move poll enabled/disable into non DC path
Some amd asics having reliable hotplug support don't call
drm_kms_helper_poll_init in driver init sequence. However,
due to the unified suspend/resume path for all asics, because
the output_poll_work->func is not set for these asics, a warning
arrives when suspending.

[   90.656049]  <TASK>
[   90.656050]  ? console_unlock+0x4d/0x100
[   90.656053]  ? __irq_work_queue_local+0x27/0x60
[   90.656056]  ? irq_work_queue+0x2b/0x50
[   90.656057]  ? __wake_up_klogd+0x40/0x60
[   90.656059]  __cancel_work_timer+0xed/0x180
[   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
[   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0

drm_kms_helper_poll_enable/disable is valid when poll_init is called in
amdgpu code, which is only used in non DC path. So move such codes into
non-DC path code to get rid of such warnings.

v1: introduce use_kms_poll flag in amdgpu as the poll stuff check
v2: use dc_enabled as the flag to simply code
v3: move code into non DC path instead of relying on any flag

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Guchun Chen
6ab68650a1 drm/amdgpu: use drm_device pointer directly rather than convert again
The convert from adev is redundant.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Guchun Chen
53e9d836ea drm/amdgpu: drop pm_sysfs_en flag from amdgpu_device structure
pm_sysfs_en is overlapped with pm.sysfs_initialized, so drop it
for simplifying code(no functional change).

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00