Commit Graph

74 Commits

Author SHA1 Message Date
Pratap Nirujogi
9bed716f87 drm/amd/pm: Add support to set min ISP clocks
Add support to set ISP clocks for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_soft_freq_range() API to set clocks via
SMU interface than communicating with PMFW directly.

amdgpu_dpm_set_soft_freq_range() is updated to take in any
pp_clock_type than limiting to support only PP_SCLK to allow
ISP and other driver modules to set the min/max clocks. Any
clock specific restrictions are expected to be taken care in
SOC specific SMU implementations instead of generic amdgpu_dpm
and amdgpu_smu interfaces.

Reviewed-by: Xiaojian Du <xiaojian.du@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-24 10:02:44 -04:00
Pratap Nirujogi
fba8d14747 drm/amd/pm: Add support to set ISP Power
Add support to set ISP power for SMU v14.0.0. ISP driver
uses amdgpu_dpm_set_powergating_by_smu() API to
enable / disable power via SMU interface than communicating
with PMFW directly.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-24 10:02:36 -04:00
Alex Deucher
5cccf10f65 drm/amdgpu: disable workload profile switching when OD is enabled
Users have reported that they have to reduce the level of undervolting
to acheive stability when dynamic workload profiles are enabled on
GC 10.3.x. Disable dynamic workload profiles if the user has enabled
OD.

Fixes: b9467983b7 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4262
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.15.x
2025-06-03 15:04:24 -04:00
Lijo Lazar
54a01f7751 drm/amd/pm: Add support to query partition metrics
Add interfaces to query compute partition related metrics data.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-22 12:01:33 -04:00
Alex Deucher
6dafb5d4c7 drm/amdgpu/pm: add workload profile pause helper
To be used for display idle optimizations when
we want to pause non-default profiles.

Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08 16:05:17 -04:00
Ruili Ji
9f7ce6a9ab drm/amd/pm: implement dpm vcn reset function
Implement VCN engine reset by sending MSG_ResetVCN
on smu 13.0.6.

v2: fix format for code and message

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 18:01:08 -04:00
Ce Sun
921c040efe drm/amd/pm: Add link reset for SMU 13.0.6
Add link reset implementation

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Xiang Liu
ce615fe328 drm/amdgpu: Check if CPER enabled when generating CPER
In the case of CPER disabled, generating CPER will cause kernel NULL
pointer dereference without checking.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:45:12 -05:00
Jesse.zhang@amd.com
d190e4d0f7 drm/amd/pm: add support for checking SDMA reset capability
This patch introduces a new function to check if the SMU supports resetting the SDMA engine.
This capability check ensures that the driver does not attempt to reset the SDMA engine
on hardware that does not support it.

The following changes are included:
- New function `amdgpu_dpm_reset_sdma_is_supported` to check SDMA reset
  support at the AMDGPU driver level.
- New function `smu_reset_sdma_is_supported` to check SDMA reset support
  at the SMU level.
- Implementation of `smu_v13_0_6_reset_sdma_is_supported` for the specific
  SMU version v13.0.6.
- Updated `smu_v13_0_6_reset_sdma` to use the new capability check before
  attempting to reset the SDMA engine.

v2: change smu_reset_sdma_is_supported type to bool (Tim)

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25 11:43:59 -05:00
Xiang Liu
f9d35b945c drm/amdgpu: Generate bad page threshold cper records
Generate CPER record when bad page threshold exceed and
commit to CPER ring.

v2: return -ENOMEM instead of false
v2: check return value of fill section function

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-17 14:09:30 -05:00
Alex Deucher
960a628774 drm/amdgpu/pm: fix UVD handing in amdgpu_dpm_set_powergating_by_smu()
UVD and VCN were split into separate dpm helpers in commit
ff69bba05f ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
as such, there is no need to include UVD in the is_vcn variable since
UVD and VCN are handled by separate dpm helpers now. Fix the check.

Fixes: ff69bba05f ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3959
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-February/119827.html
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
2025-02-12 18:13:32 -05:00
Lijo Lazar
5889339298 drm/amd/pm: Revert state if force level fails
Before forcing level, CG/PG is disabled or enabled depending on the new
level. However if the force level operation fails, CG/PG state remains
modified. Revert the state change on failure. Also, move invalid
operation checks to the beginning before any logic that could change SOC
state.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10 10:29:55 -05:00
Boyuan Zhang
393f026b16 drm/amdgpu: add inst to amdgpu_dpm_enable_vcn
Add an instance parameter to amdgpu_dpm_enable_vcn() function, and change
all calls from vcn ip functions to add instance argument. vcn generations
with only one instance (v1.0, v2.0) always use 0 as instance number. vcn
generations with multiple instances (v2.5, v3.0, v4.0, v4.0.3, v4.0.5,
v5.0.0) use the actual instance number.

v2: remove for-loop in amdgpu_dpm_enable_vcn(), and temporarily move it
to vcn ip with multiple instances, in order to keep the exact same logic
as before, until further separation in next patch.

v3: fix missing prefix

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10 10:26:47 -05:00
Boyuan Zhang
ff69bba05f drm/amd/pm: add inst to dpm_set_powergating_by_smu
Add an instance parameter to amdgpu_dpm_set_powergating_by_smu() function,
and use the instance to call set_powergating_by_smu().

v2: remove duplicated functions.

remove for-loop in amdgpu_dpm_set_powergating_by_smu(), and temporarily
move it to amdgpu_dpm_enable_vcn(), in order to keep the exact same logic
as before, until further separation in next patch.

v3: drop SI logic in amdgpu_dpm_enable_vcn().

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10 10:26:47 -05:00
Boyuan Zhang
697cb5cc25 drm/amd/pm: add inst to set_powergating_by_smu
Add an instance parameter to set_powergating_by_smu() function, and
re-write all amd_pm functions accordingly. Then use the instance to
call smu_dpm_set_vcn_enable().

v2: remove duplicated functions.

remove for-loop in smu_dpm_set_power_gate(), and temporarily move it to
to amdgpu_dpm_set_powergating_by_smu(), in order to keep the exact same
logic as before, until further separation in next patch.

v3: add instance number in error message.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10 10:26:47 -05:00
Jiadong Zhu
610696505c drm/amd/pm: implement dpm sdma reset function
Implement sdma soft reset by sending MSG_ResetSDMA on smu 13.0.6.

v2: Add firmware version for the reset message.
v3: Add ip version check. Print inst_mask on failure.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10 10:26:45 -05:00
Victor Skvortsov
f83cec3b3a drm/amdgpu: Disable dpm_enabled flag while VF is in reset
VFs do not perform HW fini/suspend in FLR, so the dpm_enabled
is incorrectly kept enabled. Add interface to disable it in
virt_pre_reset call.

v2: Made implementation generic for all asics
v3: Re-order conditionals so PP_MP1_STATE_FLR is only evaluated on VF

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13 12:12:52 -04:00
Alex Deucher
608d886c97 drm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()
We only need to skip this on modern APUs.  It's required
on older APUs as it's where start_smu gets called from.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3502
Fixes: 064d92436b ("drm/amd/pm: avoid to load smu firmware for APUs")
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Tim Huang <Tim.Huang@amd.com>
2024-07-27 17:34:36 -04:00
Tim Huang
064d92436b drm/amd/pm: avoid to load smu firmware for APUs
Certain call paths still load the SMU firmware for APUs,
which needs to be skipped.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-08 16:46:30 -04:00
Lijo Lazar
546e6309d1 drm/amd/pm: Remove legacy interface for xgmi plpd
Replace the legacy interface with amdgpu_dpm_set_pm_policy to set XGMI
PLPD mode. Also, xgmi_plpd_policy sysfs node is not used by any client.
Remove that as well.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17 17:40:39 -04:00
Lijo Lazar
4d154b1ca5 drm/amd/pm: Add support for DPM policies
Add support to set/get information about different DPM policies. The
support is only available on SOCs which use swsmu architecture.

A DPM policy type may be defined with different levels. For example, a
policy may be defined to select Pstate preference and then later a
pstate preference may be chosen.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17 17:40:38 -04:00
Ma Jun
b2207dc698 drm/amdgpu/pm: Add support for MACO flag checking
Add support for MACO flag checking.
MACO mode only works if BACO is supported.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09 22:07:59 -04:00
Yang Wang
e3bfb8d917 drm/amdgpu: implement smu send rma reason for smu v13.0.6
implement smu send rma reason function for smu v13.0.6

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-12 16:08:17 -05:00
Peyton Lee
a2f2f43f74 drm/amd/pm: support return vpe clock table
pm supports return vpe clock table and soc clock table

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14 15:25:59 -05:00
Perry Yuan
2e9b152325 drm/amdgpu: optimize RLC powerdown notification on Vangogh
The smu needs to get the rlc power down message to sync the rlc state
with smu, the rlc state updating message need to be sent at while smu
begin suspend sequence , otherwise SMU will crash while RLC state is not
notified by driver, and rlc state probally changed after that
notification, so it needs to notify rlc state to smu at the end of the
suspend sequence in amdgpu_device_suspend() that can make sure the rlc
state  is correctly set to SMU.

[  101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
[  110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[  110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
[  110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[  110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[  110.884394] PM: suspend of devices aborted after 21213.620 msecs
[  110.884402] PM: start suspend of devices aborted after 21213.882 msecs
[  110.884405] PM: Some devices failed to suspend, or early wake event detected

Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:48:58 -05:00
Lijo Lazar
12c2d3b5f5 drm/amd/pm: Add support to fetch pm metrics sample
Add API support to fetch a snapshot of power management metrics from PMFW.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:23:46 -05:00
Ma Jun
fbbcb3f2b7 drm/amd/pm: Fix return value and drop redundant param
Fix the return value and drop redundant parameter
of get_asic_baco_capability function.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 09:29:53 -05:00
Lijo Lazar
8cfd6a0575 drm/amd/pm: Hide irrelevant pm device attributes
Change return code to EOPNOTSUPP for unsupported functions. Use the
error code information to hide sysfs nodes not valid for the SOC.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-07 12:03:31 -05:00
Lang Yu
d60fbf2d25 drm/amdgpu: add support to powerup VPE by SMU
Powerup VPE by SMU.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:03 -04:00
Le Ma
3152d01e88 drm/amd/pm: deprecate allow_xgmi_power_down interface
Replace with set_plpd_mode uniformly for places to use.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-28 15:36:23 -04:00
Le Ma
21e43386ae drm/amd/pm: add xgmi_plpd_policy sysfs node for user to change plpd policy
Add xgmi_plpd_policy sysfs node for users to check and select xgmi
per-link power down policy:
  - arg 0: disallow plpd
  - arg 1: default policy
  - arg 2: optimized policy

v2: split from smu v13.0.6 code and miscellaneous updates
v3: add usage comments around set/get functions

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-26 17:00:23 -04:00
Guchun Chen
5247f05ead drm/amd/pm: avoid potential UBSAN issue on legacy asics
Prevent further dpm casting on legacy asics without od_enabled in
amdgpu_dpm_is_overdrive_supported. This can avoid UBSAN complain
in init sequence.

v2: add a macro to check legacy dpm instead of checking asic family/type
v3: refine macro name for naming consistency

Suggested-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-05-11 08:41:36 -04:00
Guchun Chen
58d9b9a14b drm/amd/pm: parse pp_handle under appropriate conditions
amdgpu_dpm_is_overdrive_supported is a common API across all
asics, so we should cast pp_handle into correct structure
under different power frameworks.

v2: using return directly to simplify code
v3: SI asic does not carry od_enabled member in pp_handle, and update Fixes tag

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2541
Fixes: eb4900aa4c ("drm/amdgpu: Fix kernel NULL pointer dereference in dpm functions")
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-05-11 01:06:27 -04:00
Kun Liu
c3ed0e72c8 drm/amdgpu: added a sysfs interface for thermal throttling
added a sysfs interface for thermal throttling, then userspace
can get/update thermal limit

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Kenneth Feng
230dd6bb61 drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10
implement mode2 reset on smu_v13_0_10

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-14 15:47:15 -05:00
André Almeida
0ad7347a64 drm/amd: Add detailed GFXOFF stats to debugfs
Add debugfs interface to log GFXOFF statistics:

- Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
  time of query since system power-up

- Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
  Read it to get average GFXOFF residency % multiplied by 100
  during the last logging interval.

Both features are designed to be keep the values persistent between
suspends.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-08-16 18:17:31 -04:00
Huang Rui
7101ab97e3 drm/amdgpu/pm: implement the SMU_MSG_EnableGfxImu function
GC v11_0_1 asic needs to issue the EnableGfxImu message after start IMU.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:44:15 -04:00
Lijo Lazar
39dbde650f drm/amd/pm: Return auto perf level, if unsupported
When powerplay is not enabled, return AUTO as default level.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Dave Airlie
e954d2c94d Linux 5.18-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmJu9FYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGAyEH/16xtJSpLmLwrQzG
 o+4ToQxSQ+/9UHyu0RTEvHg2THm9/8emtIuYyc/5FgdoWctcSa3AaDcveWmuWmkS
 KYcdhfJsaEqjNHS3OPYXN84fmo9Hel7263shu5+IYmP/sN0DfQp6UWTryX1q4B3Q
 4Pdutkuq63Uwd8nBZ5LXQBumaBrmkkuMgWEdT4+6FOo1mPzwdIGBxCuz1UsNNl5k
 chLWxkQfe2eqgWbYJrgCQfrVdORXVtoU2fGilZUNrHRVGkkldXkkz5clJfapyZD3
 odmZCEbrE4GPKgZwCmDERMfD1hzhZDtYKiHfOQ506szH5ykJjPBcOjHed7dA60eB
 J3+wdek=
 =39Ca
 -----END PGP SIGNATURE-----

Backmerge tag 'v5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next

Linux 5.18-rc5

There was a build fix for arm I wanted in drm-next, so backmerge rather then cherry-pick.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-05-03 16:08:48 +10:00
Evan Quan
a71849cdea drm/amd/pm: fix the deadlock issue observed on SI
The adev->pm.mutx is already held at the beginning of
amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce.
But on their calling path, amdgpu_display_bandwidth_update will be
called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They
will then try to acquire the same adev->pm.mutex and deadlock will
occur.

By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex
protection(considering logically they do not need such protection) and
restructuring the call flow accordingly, we can eliminate the deadlock
issue. This comes with no real logics change.

Fixes: 3712e7a494 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Link: https://lore.kernel.org/all/9e689fea-6c69-f4b0-8dee-32c4cf7d8f9c@molgen.mpg.de/
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-27 17:38:02 -04:00
Huang Rui
eea5c7b339 drm/amdgpu/pm: fix the null pointer while the smu is disabled
It needs to check if the pp_funcs is initialized while release the
context, otherwise it will trigger null pointer panic while the software
smu is not enabled.

[ 1109.404555] BUG: kernel NULL pointer dereference, address: 0000000000000078
[ 1109.404609] #PF: supervisor read access in kernel mode
[ 1109.404638] #PF: error_code(0x0000) - not-present page
[ 1109.404657] PGD 0 P4D 0
[ 1109.404672] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1109.404701] CPU: 7 PID: 9150 Comm: amdgpu_test Tainted: G           OEL    5.16.0-custom #1
[ 1109.404732] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 1109.404765] RIP: 0010:amdgpu_dpm_force_performance_level+0x1d/0x170 [amdgpu]
[ 1109.405109] Code: 5d c3 44 8b a3 f0 80 00 00 eb e5 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 4c 8b b7 f0 7d 00 00 <49> 83 7e 78 00 0f 84 f2 00 00 00 80 bf 87 80 00 00 00 48 89 fb 0f
[ 1109.405176] RSP: 0018:ffffaf3083ad7c20 EFLAGS: 00010282
[ 1109.405203] RAX: 0000000000000000 RBX: ffff9796b1c14600 RCX: 0000000002862007
[ 1109.405229] RDX: ffff97968591c8c0 RSI: 0000000000000001 RDI: ffff9796a3700000
[ 1109.405260] RBP: ffffaf3083ad7c50 R08: ffffffff9897de00 R09: ffff979688d9db60
[ 1109.405286] R10: 0000000000000000 R11: ffff979688d9db90 R12: 0000000000000001
[ 1109.405316] R13: ffff9796a3700000 R14: 0000000000000000 R15: ffff9796a3708fc0
[ 1109.405345] FS:  00007ff055cff180(0000) GS:ffff9796bfdc0000(0000) knlGS:0000000000000000
[ 1109.405378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1109.405400] CR2: 0000000000000078 CR3: 000000000a394000 CR4: 00000000000506e0
[ 1109.405434] Call Trace:
[ 1109.405445]  <TASK>
[ 1109.405456]  ? delete_object_full+0x1d/0x20
[ 1109.405480]  amdgpu_ctx_set_stable_pstate+0x7c/0xa0 [amdgpu]
[ 1109.405698]  amdgpu_ctx_fini.part.0+0xcb/0x100 [amdgpu]
[ 1109.405911]  amdgpu_ctx_do_release+0x71/0x80 [amdgpu]
[ 1109.406121]  amdgpu_ctx_ioctl+0x52d/0x550 [amdgpu]
[ 1109.406327]  ? _raw_spin_unlock+0x1a/0x30
[ 1109.406354]  ? drm_gem_handle_delete+0x81/0xb0 [drm]
[ 1109.406400]  ? amdgpu_ctx_get_entity+0x2c0/0x2c0 [amdgpu]
[ 1109.406609]  drm_ioctl_kernel+0xb6/0x140 [drm]

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-19 13:58:32 -04:00
Alex Deucher
ebc002e3ee drm/amdgpu: don't use BACO for reset in S3
Seems to cause a reboots or hangs on some systems.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953
Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-04-06 12:02:57 -04:00
Alex Deucher
34452ac303 drm/amdgpu: don't use BACO for reset in S3
Seems to cause a reboots or hangs on some systems.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953
Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-05 10:27:34 -04:00
Stanley.Yang
93dde6ccd6 drm/amdgpu/pm: add asic smu support check
It must check asic whether support smu
before call smu powerplay function, otherwise
it may cause null point on no support smu asic.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25 12:40:24 -04:00
Stanley.Yang
a29d44aea1 drm/amd/pm: use pm mutex to protect ecc info table
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15 14:34:34 -04:00
Stanley.Yang
d510eccfa5 drm/amd/pm: add send bad channel info function
support message SMU update bad channel info to update HBM bad channel
info in OOB table

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15 14:25:16 -04:00
Darren Powell
5d64f9bbb6 amdgpu/pm: Implement new API function "emit" that accepts buffer base and write offset
(v3)
     Rewrote patchset to order patches as (API, hw impl, usecase)

     - added API for new power management function emit_clk_levels
       This function should duplicate the functionality of print_clk_levels,
       but this solution passes the buffer base and write offset down the stack.
     - new powerplay function emit_clock_levels, implemented by smu_emit_ppclk_levels()
       This function parallels the implementation of smu_print_ppclk_levels and
       calls emit_clk_levels, and allows the returns of errors
     - new helper function smu_convert_to_smuclk called by smu_print_ppclk_levels and
       smu_emit_ppclk_levels

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-By: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02 18:27:58 -05:00
Evan Quan
75513bf5d7 drm/amd/pm: fix the deadlock observed on performance_level setting
The sub-routine(amdgpu_gfx_off_ctrl) tried to obtain the lock
adev->pm.mutex which was actually hold by amdgpu_dpm_force_performance_level.
A deadlock happened then.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-27 15:48:22 -05:00
Flora Cui
1613f346f8 drm/amd/pm: fix null ptr access
check null ptr first before access its element

v2: check adev->pm.dpm_enabled early in amdgpu_debugfs_pm_init()

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:52:47 -05:00
Evan Quan
685fae24d9 drm/amd/pm: correct the checks for fan attributes support
On functionality unsupported, -EOPNOTSUPP will be returned. And we rely
on that to determine the fan attributes support.

Fixes: 79c65f3fcb ("drm/amd/pm: do not expose power implementation details to amdgpu_pm.c")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:15 -05:00