Commit Graph

22333 Commits

Author SHA1 Message Date
Alex Deucher
b8b64595d6 drm/amdgpu: simplify amdgpu_device_asic_has_dc_support()
Drop extra cases in the default case.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Leung, Martin
a820190204 drm/amdgpu/display: Prepare for new interfaces
why:
lut pipeline will be hooked up differently in some asics
need to add new interfaces

how:
add them

Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Martin <martin.leung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Evan Quan
12d6c18cfa drm/amdgpu: suppress the compile warning about 64 bit type
Suppress the compile warning below:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292
gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type?

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Evan Quan
4513edf74c drm/amd/pm: suppress compile warnings about possible unaligned accesses
Suppress the following compile warnings:
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_pptable.h:163:17:
warning: field smc_pptable within 'struct smu_11_0_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_7_pptable.h:193:17:
warning: field smc_pptable within 'struct smu_11_0_7_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_7_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v13_0_pptable.h:161:12:
warning: field smc_pptable within 'struct smu_13_0_powerplay_table' is less aligned than
'PPTable_t' and is usually due to 'struct smu_13_0_powerplay_table' being packed, which
can lead to unaligned accesses [-Wunaligned-access]

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Philip Yang
88467db6e2 drm/amdkfd: Fix partial migration bugs
Migration range from system memory to VRAM, if system page can not be
locked or unmapped, we do partial migration and leave some pages in
system memory. Several bugs found to copy pages and update GPU mapping
for this situation:

1. copy to vram should use migrate->npage which is total pages of range
as npages, not migrate->cpages which is number of pages can be migrated.

2. After partial copy, set VRAM res cursor as j + 1, j is number of
system pages copied plus 1 page to skip copy.

3. copy to ram, should collect all continuous VRAM pages and copy
together.

4. Call amdgpu_vm_update_range, should pass in offset as bytes, not
as number of pages.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-03 16:43:22 -04:00
Lang Yu
4fac4fcf45 drm/amdkfd: add pinned BOs to kfd_bo_list
The kfd_bo_list is used to restore process BOs after
evictions. As page tables could be destroyed during
evictions, we should also update pinned BOs' page tables
during restoring to make sure they are valid.

So for pinned BOs,
1, Validate them and update their page tables.
2, Don't add eviction fence for them.

v2:
 - Don't handle pinned ones specially in BO validation.(Felix)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:27:17 -04:00
Philip Yang
4d1e5f12b7 drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:27:17 -04:00
Sunil Khatri
371017309a drm/amdgpu: enable tmz by default for GC 10.3.7
Add IP GC 10.3.7 in the list of target to have
tmz enabled by default.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
2022-06-03 16:27:00 -04:00
Mario Limonciello
7c4f4f197e drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions
Loading amdgpu on GC 10.3.7 shows an ERR level message:
`kfd kfd: amdgpu: GC IP 0a0307 not supported in kfd`

Add these targets to match yellow carp structures.

Reported-by: David Chang <david.chang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Jesse(Jie) Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
2022-06-03 16:26:36 -04:00
Philip Yang
fa582c6f36 drm/amdkfd: Use mmget_not_zero in MMU notifier
MMU notifier callback may pass in mm with mm->mm_users==0 when process
is exiting, use mmget_no_zero to avoid accessing invalid mm in deferred
list work after mm is gone.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 16:04:41 -04:00
Candice Li
2a46096335 drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus
Adjust the sequence for ras late init and separate ras reset error status
from query status.

v2: squash in fix from Candice

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:58:45 -04:00
Stanley.Yang
28caf8c467 drm/amdgpu: fix ras supported check
Fix aldebaran ras supported check on SRIOV guest side,
the previous check conditicon block all ras feature
on baremetal

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:58:09 -04:00
Aurabindo Pillai
fd843d0341 drm/amd/display: remove stale config guards
This code should be executed.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-01 15:57:18 -04:00
sunliming
8365ed22d0 drm/amdgpu: make gfx_v11_0_rlc_stop static
This symbol is not used outside of gfx_v11_0.c, so marks it static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1945:6: warning: no previous
prototype for function 'gfx_v11_0_rlc_stop' [-Wmissing-prototypes].

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:57:12 -04:00
sunliming
418214ddcf drm/amdgpu: fix a missing break in gfx_v11_0_handle_priv_fault
Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5873:2: warning: unannotated
fall-through between switch labels [-Wimplicit-fallthrough].

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:57:08 -04:00
Roman Li
ae969b62e7 drm/amdgpu: fix aper_base for APU
[Why]
Wrong fb offset results in dmub f/w errors and white screen.
[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

[How]
Read aper_base from mmhub because GC is off by default

v2: use BAR for passthrough (Alex)

Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:59 -04:00
Alex Deucher
97e5030554 drm/amdgpu: update VCN codec support for Yellow Carp
Supports AV1.  Mesa already has support for this and
doesn't rely on the kernel caps for yellow carp, so
this was already working from an application perspective.

Fixes: 554398174d ("amdgpu/nv.c - Added video codec support for Yellow Carp")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2002
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-01 15:56:49 -04:00
Jiapeng Chong
11594fa114 drm/amdgpu: make program_imu_rlc_ram static
This symbol is not used outside of imu_v11_0.c, so marks it
static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:302:6: warning: no previous
prototype for ‘program_imu_rlc_ram’ [-Wmissing-prototypes].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:49 -04:00
Aric Cyr
0675418477 drm/amd/display: 3.2.187
This version brings along the following fixes:

* Changes to DP LT fallback behavior to more closely match the DP standard
* Added new interfaces for lut pipeline
* Restore ref_dtblck value when clk struct is cleared in init_clocks
* Fixes DMUB outbox trace in S4
* Fixes lingering DIO FIFO errors when DIO no longer enabled
* Reads Golden Settings Table from VBIOS

Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:49 -04:00
Ilya
583ad88871 drm/amd/display: Fix possible infinite loop in DP LT fallback
[Why]
It's possible for some fallback scenarios to result in infinite looping
during link training.

[How]
This change modifies DP LT fallback behavior to more closely match the
DP standard. Keep track of the link rate during the EQ_FAIL fallback,
and use it as the maximum link rate for the CR sequence.

Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Ilya <Ilya.Bakoulin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:49 -04:00
Alvin
f0ad66f42a drm/amd/display: Don't clear ref_dtbclk value
[Description]
ref_dtbclk value is assigned in clk_mgr_construct,
but the clks struct is cleared in init_clocks.
Make sure to restore the value or we will get
0 value for ref_dtbclk in DCN31.

Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
Hung, Cruise
6ecf9773a5 drm/amd/display: Fix DMUB outbox trace in S4 (#4465)
[Why]
DMUB Outbox0 read/write pointer not sync after resumed from S4.
And that caused old traces were sent to outbox.

[How]
Disable DMUB Outbox0 interrupt
and clear DMUB Outbox0 read/write pointer when resumes from S4.
And then enable Outbox0 interrupt before starts DMCUB.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
hengzhou
92909cde32 drm/amd/display: Wait DMCUB to idle state before reset.
[WHY]
Very low rate to cause memory access issue while resetting
DMCUB after the halt command was sent to it.
The process of stopping fw of DMCUB may be timeout, that means
it is not in idle state, such as the window frames may still be
kept in cache, so reset by force will cause MMHUB hang.

[HOW]
After the halt command was sent, keep checking the DMCUB state until
it is idle.

Reviewed-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: hengzhou <Hengyong.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
Nicholas Kazlauskas
8440f57532 drm/amd/display: Pass the new context into disable OTG WA
[Why]
When enabling an HPO stream for the first time after having previously
enabled a DIO stream there may be lingering DIO FIFO errors even though
the DIO is no longer enabled.

These can cause display clock change to hang if we don't apply the
OTG disable workaround since the ramping logic is tied to OTG on.

[How]
The workaround wasn't being applied in the sequence of:

1 DIO stream
0 streams
1 HPO stream

because current_state has no stream or planes in its context - and
it's only swapped after optimize has finished.

We should be using the incoming context instead to determine whether
this logic is needed or not.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
Leung, Martin
0ec7440847 drm/amd/display: revert Blank eDP on disable/enable drv
why and how:
Revert this change. It was causing a black screen with certain blocks

Reviewed-by: George Shen <George.Shen@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Leung, Martin <Martin.Leung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
Sherry Wang
4b81dd2cc6 drm/amd/display: Read Golden Settings Table from VBIOS
[Why]
Dmub read AUX_DPHY_RX_CONTROL0 from Golden Setting Table,
but driver will set it to default value 0x103d1110, which
causes issue in some case

[How]
Remove the driver code, use the value set by dmub in
dp_aux_init

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Sherry Wang <YAO.WANG1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-01 15:56:48 -04:00
Christian König
9bdc1992c9 drm/amdgpu: add drm-client-id to fdinfo v2
This is enough to get gputop working :)

v2: rebase and some addition cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
af0b541670 drm/amdgpu: Convert to common fdinfo format v5
Convert fdinfo format to one documented in drm-usage-stats.rst.

It turned out that the existing implementation was actually completely
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.

So 10% usage for application A could mean something completely different
than 10% usage for application B.

Completely nuke that and just use the now standardized nanosecond
interface.

v2: drop the documentation change for now, nuke percentage calculation
v3: only account for each hw_ip, move the time_spend to the ctx mgr.
v4: move general ctx changes into separate patch, rework the fdinfo to
    ctx_mgr interface so that all usages are calculated at once, drop
    some unecessary and dangerous refcount dance.
v5: add one more comment how we calculate the time spend

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
08cffb3eb7 drm/amdgpu: bump minor version number
Increase the minor version number to indicate that the new flags are
available.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
b6c65a2c92 drm/amdgpu: add AMDGPU_VM_NOALLOC v2
Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.

v2: also add the flag to the allowed flags.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
fab2cc8335 drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE
Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
doesn't needs to be preserved during eviction.

KFD was already using a similar functionality for SVM BOs so replace the
internal flag with the new UAPI.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Alex Deucher
62e9bd2003 drm/amdgpu: add beige goby PCI ID
Add a beige goby PCI ID.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-05-26 14:56:33 -04:00
Lijo Lazar
39dbde650f drm/amd/pm: Return auto perf level, if unsupported
When powerplay is not enabled, return AUTO as default level.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Julia Lawall
6bd8d4b7d5 drm/amdkfd: fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Julia Lawall
ab5a7fb6d2 drm/amdgpu/gfx: fix typos in comments
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Dave Airlie
31ab27b14d drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
Submitting a cs with 0 chunks, causes an oops later, found trying
to execute the wrong userspace driver.

MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo

[172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[172536.665188] #PF: supervisor read access in kernel mode
[172536.665189] #PF: error_code(0x0000) - not-present page
[172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0
[172536.665195] Oops: 0000 [#1] SMP NOPTI
[172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P           O      5.10.81 #1-NixOS
[172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
[172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu]
[172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10
[172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246
[172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68
[172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38
[172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40
[172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28
[172536.665283] FS:  00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000
[172536.665284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0
[172536.665287] Call Trace:
[172536.665322]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665332]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[172536.665338]  drm_ioctl+0x201/0x3b0 [drm]
[172536.665369]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665372]  ? selinux_file_ioctl+0x135/0x230
[172536.665399]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[172536.665403]  __x64_sys_ioctl+0x83/0xb0
[172536.665406]  do_syscall_64+0x33/0x40
[172536.665409]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Alex Deucher
d534ca7128 drm/amdgpu: differentiate between LP and non-LP DDR memory
Some applications want to know whether the memory is LP or
not.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Candice Li
a5457087eb drm/amdgpu: Resolve pcie_bif RAS recovery bug
Check shared buf instead of init flag for xgmi ta shared buf init
during xgmi ta initialization.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Prike Liang
1c75524146 drm/amdgpu: clean up asd on the ta_firmware_header_v2_0
On the psp13 series use ta_firmware_header_v2_0 and the asd firmware
was buildin ta, so needn't request asd firmware separately.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Alex Deucher
a0ccc717c4 drm/amdgpu/discovery: validate VCN and SDMA instances
Validate the VCN and SDMA instances against the driver
structure sizes to make sure we don't get into a
situation where the firmware reports more instances than
the driver supports.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Sung Joon Kim
76818cdd11 drm/amd/display: add Coverage blend mode for overlay plane
According to the KMS man page, there is a
"Coverage" alpha blend mode that assumes the
pixel color values have NOT been pre-multiplied
and will be done when the actual blending to
the background color values happens.

Previously, this mode hasn't been enabled
in our driver and it was assumed that all
normal overlay planes are pre-multiplied
by default.

When a 3rd party app is used to input a image
in a specific format, e.g. PNG, as a source
of a overlay plane to blend with the background
primary plane, the pixel color values are not
pre-multiplied. So by adding "Coverage" blend
mode, our driver will support those cases.

Issue fixed: Overlay plane alpha channel blending is incorrect
Issue tracker: https://gitlab.freedesktop.org/drm/amd/-/issues/1769

Reference:
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#plane-composition-properties

Adding Coverage support also enables IGT
kms_plane_alpha_blend Coverage subtests:
1. coverage-7efc
2. coverage-vs-premult-vs-constant

Changes
1. Add DRM_MODE_BLEND_COVERAGE blend mode capability
2. Add "pre_multiplied_alpha" flag for Coverage case
3. Read the correct flag and set the DCN MPCC
pre_multiplied register bit (only on overlay plane)

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1769
Signed-off-by: Sung Joon Kim <Sungjoon.Kim@amd.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Dan Carpenter
a35faec3db drm/amdgpu: Off by one in dm_dmub_outbox1_low_irq()
The > ARRAY_SIZE() should be >= ARRAY_SIZE() to prevent an out of bounds
access.

Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Evan Quan
caa5eadc14 drm/amdgpu: suppress some compile warnings
Suppress two compile warnings about "no previous prototype".

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Evan Quan
396beb91a9 drm/amd/pm: correct the metrics version for SMU 11.0.11/12/13
Correct the metrics version used for SMU 11.0.11/12/13.
Fixes misreported GPU metrics (e.g., fan speed, etc.) depending
on which version of SMU firmware is loaded.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1925
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Jay Cornwall
6a8170383c drm/amdkfd: Add gfx11 trap handler
Based on gfx10 with following changes:

- GPR_ALLOC.VGPR_SIZE field moved (and size corrected in gfx10)
- s_sendmsg_rtn_b64 replaces some s_sendmsg/s_getreg
- Buffer instructions no longer have direct-to-LDS modifier

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Eric Huang
5e613723f8 drm/amdkfd: port cwsr trap handler from dkms branch
Most of changes are for debugger feature, and it is
to simplify trap handler support for new asics in the
future.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Alan Liu
6880ed280e drm/amd/display: Add HDMI_ACP_SEND register
Define HDMI_ACP_SEND register shift/mask.

Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Lijo Lazar
b0f4d663fc drm/amd/pm: Fix missing thermal throttler status
On aldebaran, when thermal throttling happens due to excessive GPU
temperature, the reason for throttling event is missed in warning
message. This patch fixes it.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Sunil Khatri
49b74d12d1 drm/amdgpu: add support of tmz for GC 10.3.7
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Sunil Khatri
0ef3dc7e97 drm/amdgpu: change code name to ip version for tmz set
Use IP version rather then code name of IPs for
tmz set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Sunil Khatri
4d33e7040d drm/amdgpu: move amdgpu_gmc_tmz_set after ip_version populated
To enable TMZ feature based on IP version needs adev->ip_version
populated but its empty. Move amdgpu_gmc_tmz_set to a place where
ip_version is populated.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Stanley.Yang
950d64250f drm/amdgpu: support ras on SRIOV
support umc/gfx/sdma ras on guest side

Changed from V1:
    move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Dan Carpenter
2c270d3e71 drm/amdgpu/pm: smu_v13_0_4: delete duplicate condition
There is no need to check if "clock_ranges' is non-NULL.  It is checked
already on the line before.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Evan Quan
3670c46f07 drm/amd/pm: enable memory temp reading for SMU 13.0.0
With the latest vbios, the memory temp reading is working.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Evan Quan
0aceb728f4 drm/amd/pm: enable more dpm features for SMU 13.0.0
Enable OOB Monitor and SOC CG which are ready since 78.38.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Evan Quan
6fd693817d drm/amd/pm: correct the softpptable ids used for SMU 13.0.0
To better match with the pptable_id settings from VBIOS.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Evan Quan
1c65e54881 drm/amd/pm: update SMU 13.0.0 driver_if header
To align with 78.37.0 and later PMFWs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Evan Quan
704d6bf605 drm/amd/pm: skip dpm disablement on suspend for SMU 13.0.0
Since PMFW will handle this properly. Driver involvement is
unnecessary.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Evan Quan
72063c71c3 drm/amd/pm: enable more dpm features for SMU 13.0.0
Enable MP0CLK DPM and FW Dstate since they are already supported
by latest 78.36.0 PMFW.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Gong Yuanjun
d2f4460a3d drm/amd/pm: fix a potential gpu_metrics_table memory leak
gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but
not freed in yellow_carp_fini_smc_tables().

Signed-off-by: Gong Yuanjun <ruc_gongyuanjun@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Haohui Mai
10784fec9c drm/amdgpu/gfx10: rework KIQ programming
Make sure the queue is not longer active before programming
the kiq EOP registers.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Haohui Mai
842035543c drm/amdgpu: Set CP_HQD_PQ_CONTROL.RPTR_BLOCK_SIZE correctly
Remove the accidental shifts on the values of RPTR_BLOCK_SIZE
in gfx_v8-v11. The bug essentially always programs the
corresponding fields to zero instead of the correct value.
The hardware clamps the min value to 5 so this resulted in a
value of 5 being programmed.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Christian König
69493c034d drm/amdgpu: cleanup ctx implementation
Let each context have a pointer to the ctx manager and properly
initialize the adev pointer inside the context manager.

Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and
directly return the sequence number instead of writing into a parmeter.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Jonathan Kim
a0af5dbdc9 drm/amdkfd: simplify cpu hive assignment
CPU hive assignment currently assumes when a GPU hive is connected_to_cpu,
there is only one hive in the system.

Only assign CPUs to the hive if they are explicitly directly connected to
the GPU hive to get rid of the need for this assumption.

It's more efficient to do this when querying IO links since other non-CRAT
info has to be filled in anyways.  Also, stop re-assigning the
same CPU to the same GPU hive if it has already been done before.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Haohui Mai
2c2dd0555f drm/amdgpu: Clean up of initializing doorbells for gfx_v9 and gfx_v10
Clean up redundant, copy-paste code blocks during the initialization of
the doorbells in mqd_init().

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Aric Cyr
c51bdd1a9c drm/amd/display: 3.2.186
This version brings along the following:
- Improvements in link training fallback
- Adding individual edp hotplug support
- Fixes in DPIA HPD status, display clock change hang, etc.
- FPU isolation work for DCN30

Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Michael Strauss
4d1d699f47 Revert "drm/amd/display: Refactor LTTPR cap retrieval"
This reverts commit 3b90318d44.

[WHY]
Regressions unintentionally caused by change,
reverting until this can be resolved.

Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Bhawanpreet Lakha
a32cc8177e drm/amd/display: Fic incorrect pipe being used for clk update
[Why]
we save the prev_dppclk value using "dpp_inst" but
when reading this value we use the index "i". In
a case where a pipe is fused off we can end up reading
the incorrect instance because i != dpp_inst in this
case.

[How]
read the prev_dppclk using dpp_inst instead of i

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Jasdeep Dhillon
e4b0eac3e6 drm/amd/display: Move FPU associated DCN30 code to DML folder
[why & how]
As part of the FPU isolation work documented in
https://patchwork.freedesktop.org/series/93042/, isolate
code that uses FPU in DCN30 to DML, where all FPU code
should locate.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Nicholas Kazlauskas
66a1972037 drm/amd/display: Check zero planes for OTG disable W/A on clock change
[Why]
A display clock change hang can occur when switching between DIO and HPO
enabled modes during the optimize_bandwidth in dc_commit_state_no_check
call.

This happens when going from 4k120 8bpc 420 to 4k144 10bpc 444.

Display clock in the DIO case is 1200MHz, but pixel rate is 600MHz
because the pixel format is 420.

Display clock in the HPO case is less (800MHz?) because of ODM combine
which results in a smaller divider.

The DIO is still active in prepare but not active in the optimize which
results in the hang occuring.

During this change there are no planes on the stream so it's safe to
apply the workaround, but dpms_off = false and signal type is not
virtual.

[How]
Check for plane_count == 0, no planes on the stream.

It's easiest to check pipe->plane_state == NULL as an equivalent check
rather than trying to search for the stream status in the context
associated with the stream, so let's do that.

The primary, non MPO pipe should not have a NULL plane state.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Derek Lai
ab144f0b4a drm/amd/display: Allow individual control of eDP hotplug support
[Why]
Second eDP can send display off notification through HPD
but DC isn't hooked up to handle. Some primary eDP panels
will toggle on/off incorrectly if it's enabled generically.

[How]
Extend the debug option to allow individually enabling hotplug
either the first eDP or the second eDP in a dual eDP system.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Derek Lai <Derek.Lai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
David Galiffi
49947b906a drm/amd/display: Check if modulo is 0 before dividing.
[How & Why]
If a value of 0 is read, then this will cause a divide-by-0 panic.

Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
Paul Hsieh
3f69ee66f5 drm/amd/display: clear request when release aux engine
[Why]
when driver and dmub request aux engine at the same time,
dmub grant the aux engine but driver fail. Then driver
release aux engine but doesn't clear the request bit.
Then aux engine will be occupied by driver forever.

[How]
When driver release aux engine, clear request bit as well.

Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
Alvin Lee
903940b0b7 drm/amd/display: Clean up code in dc
[Why & How]
Code clean up in dc.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
Jimmy Kizito
fc0b067df7 drm/amd/display: Query DPIA HPD status.
[Why]
Driver needs up to date DPIA HPD status.

[How]
Use HPD query command to get DPIA HPD status.

Reviewed-by: Meenakshikumar Somasundaram <Meenakshikumar.Somasundaram@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
Jimmy Kizito
d84c4d194e drm/amd/display: Update link training fallback behaviour.
[Why]
Some displays may need several link training attempts before
link training succeeds.

[How]
If training succeeds after falling back to lower link bandwidth,
retry at original link bandwidth instead of abandoning link training
whenever link bandwidth is less than stream bandwidth.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:30 -04:00
Dave Airlie
00df0514ab Merge tag 'amd-drm-next-5.19-2022-05-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.19-2022-05-18:

amdgpu:
- Misc code cleanups
- Additional SMU 13.x enablement
- Smartshift fixes
- GFX11 fixes
- Support for SMU 13.0.4
- SMU mutex fix
- Suspend/resume fix

amdkfd:
- static checker fix
- Doorbell/MMIO resource handling fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com
2022-05-19 14:09:54 +10:00
Mario Limonciello
0223e51647 drm/amd: Don't reset dGPUs if the system is going to s2idle
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de0874 ("drm/amdgpu:
always reset the asic in suspend (v2)").  A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Luben Tuikov
5ad25ace7c drm/amdgpu: Unmap legacy queue when MES is enabled
This fixes a kernel oops when MES is not enabled.

Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Fixes: 18ee4ce63e ("drm/amdgpu: add mes unmap legacy queue routine")
Fixes: 3d879e81f0 ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Sathishkumar S
494c143254 drm/amd/pm: consistent approach for smartshift
create smartshift sysfs attributes from dGPU device even
on smartshift 1.0 platform to be consistent. Do not populate
the attributes on platforms that have APU only but not dGPU
or vice versa.

V2:
 avoid checking for the number of VGA/DISPLAY devices (Lijo)
 move code to read from dGPU or APU into a function and reuse (Lijo)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 14:20:55 -04:00
Jiapeng Chong
f3106c9424 drm/amd/display: clean up some inconsistent indenting
Eliminate the follow smatch warning:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9687
amdgpu_dm_atomic_commit_tail() warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 14:20:46 -04:00
Graham Sider
04fd07397e drm/amdkfd: Fix static checker warning on MES queue type
convert_to_mes_queue_type return can be negative, but
queue_input.queue_type is uint32_t. Put return in integer var and cast
to unsigned after negative check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:58 -04:00
Hans de Goede
4b9caaa028 drm/amdgpu: Move mutex_init(&smu->message_lock) to smu_early_init()
Lockdep complains about the smu->message_lock mutex being used before
it is initialized through the following call path:

amdgpu_device_init()
 amdgpu_dpm_mode2_reset()
  smu_mode2_reset()
   smu_v12_0_mode2_reset()
    smu_cmn_send_smc_msg_with_param()

Move the mutex_init() call to smu_early_init() to fix the mutex being
used before it is initialized.

This fixes the following lockdep splat:

[    3.867331] ------------[ cut here ]------------
[    3.867335] fbcon: Taking over console
[    3.867338] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[    3.867340] WARNING: CPU: 14 PID: 491 at kernel/locking/mutex.c:579 __mutex_lock+0x44c/0x830
[    3.867349] Modules linked in: amdgpu(+) crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel hid_lg_g15 iommu_v2 sp5100_tco nvme gpu_sched drm_dp_helper nvme_core ccp wmi video hid_logitech_dj ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse i2c_dev
[    3.867363] CPU: 14 PID: 491 Comm: systemd-udevd Tainted: G          I       5.18.0-rc5+ #33
[    3.867366] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
[    3.867369] RIP: 0010:__mutex_lock+0x44c/0x830
[    3.867372] Code: ff 85 c0 0f 84 33 fc ff ff 8b 0d b7 50 25 01 85 c9 0f 85 25 fc ff ff 48 c7 c6 fb 41 82 99 48 c7 c7 6b 63 80 99 e8 88 2a f8 ff <0f> 0b e9 0b fc ff ff f6 83 b9 0c 00 00 01 0f 85 64 ff ff ff 4c 89
[    3.867377] RSP: 0018:ffffaef8c0fc79f0 EFLAGS: 00010286
[    3.867380] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000027
[    3.867382] RDX: ffff9ccc0dda0928 RSI: 0000000000000001 RDI: ffff9ccc0dda0920
[    3.867384] RBP: ffffaef8c0fc7a80 R08: 0000000000000000 R09: ffffaef8c0fc7820
[    3.867386] R10: 0000000000000003 R11: ffff9ccc2a2fffe8 R12: 0000000000000002
[    3.867388] R13: ffff9cc990808058 R14: 0000000000000000 R15: ffff9cc98bfc0000
[    3.867390] FS:  00007fc4d830f580(0000) GS:ffff9ccc0dd80000(0000) knlGS:0000000000000000
[    3.867394] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.867396] CR2: 0000560a77031410 CR3: 000000010f522000 CR4: 0000000000750ee0
[    3.867398] PKRU: 55555554
[    3.867399] Call Trace:
[    3.867401]  <TASK>
[    3.867403]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867533]  ? __mutex_lock+0x90/0x830
[    3.867535]  ? amdgpu_dpm_mode2_reset+0x37/0x60 [amdgpu]
[    3.867653]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867758]  smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867857]  smu_mode2_reset+0x2b/0x50 [amdgpu]
[    3.867953]  amdgpu_dpm_mode2_reset+0x46/0x60 [amdgpu]
[    3.868096]  amdgpu_device_init.cold+0x1069/0x1e78 [amdgpu]
[    3.868219]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    3.868222]  ? pci_conf1_read+0x9b/0xf0
[    3.868226]  amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
[    3.868314]  amdgpu_pci_probe+0x1a9/0x3c0 [amdgpu]
[    3.868398]  local_pci_probe+0x41/0x80
[    3.868401]  pci_device_probe+0xab/0x200
[    3.868404]  really_probe+0x1a1/0x370
[    3.868407]  __driver_probe_device+0xfc/0x170
[    3.868410]  driver_probe_device+0x1f/0x90
[    3.868412]  __driver_attach+0xbf/0x1a0
[    3.868414]  ? __device_attach_driver+0xe0/0xe0
[    3.868416]  bus_for_each_dev+0x65/0x90
[    3.868419]  bus_add_driver+0x151/0x1f0
[    3.868421]  driver_register+0x89/0xd0
[    3.868423]  ? 0xffffffffc0bd4000
[    3.868425]  do_one_initcall+0x5d/0x300
[    3.868428]  ? do_init_module+0x22/0x240
[    3.868431]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868434]  ? trace_kmalloc+0x30/0xe0
[    3.868437]  ? kmem_cache_alloc_trace+0x1e6/0x3a0
[    3.868440]  do_init_module+0x4a/0x240
[    3.868442]  __do_sys_finit_module+0x93/0xf0
[    3.868446]  do_syscall_64+0x5b/0x80
[    3.868449]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868451]  ? lockdep_hardirqs_on_prepare+0xd9/0x180
[    3.868454]  ? do_syscall_64+0x67/0x80
[    3.868456]  ? do_syscall_64+0x67/0x80
[    3.868458]  ? do_syscall_64+0x67/0x80
[    3.868460]  ? do_syscall_64+0x67/0x80
[    3.868462]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    3.868465] RIP: 0033:0x7fc4d8ec1ced
[    3.868467] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fb 70 0e 00 f7 d8 64 89 01 48
[    3.868472] RSP: 002b:00007fff687ae6b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.868475] RAX: ffffffffffffffda RBX: 0000560a76fbca60 RCX: 00007fc4d8ec1ced
[    3.868477] RDX: 0000000000000000 RSI: 00007fc4d902343c RDI: 0000000000000011
[    3.868479] RBP: 00007fc4d902343c R08: 0000000000000000 R09: 0000560a76fb59c0
[    3.868481] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000020000
[    3.868484] R13: 0000560a76f8bfd0 R14: 0000000000000000 R15: 0000560a76fc2d10
[    3.868487]  </TASK>
[    3.868489] irq event stamp: 120617
[    3.868490] hardirqs last  enabled at (120617): [<ffffffff9817169e>] __up_console_sem+0x5e/0x70
[    3.868494] hardirqs last disabled at (120616): [<ffffffff98171683>] __up_console_sem+0x43/0x70
[    3.868497] softirqs last  enabled at (119684): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868501] softirqs last disabled at (119679): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868504] ---[ end trace 0000000000000000 ]---

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:58 -04:00
Xiaojian Du
0d6ec07a95 drm/amdgpu/discovery: add SMU v13.0.4 into the IP discovery list
This patch will add SMU v13.0.4 into the IP discovery list.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:58 -04:00
Tim Huang
33ef11cd7c drm/amdgpu/pm: add GFXOFF control IP version check for SMU IP v13.0.4
Enable the SMU IP v13.0.4 GFXOFF control

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:58 -04:00
Tim Huang
17f78bb409 drm/amdgpu/pm: enable swsmu for SMU IP v13.0.4
Add the entry to set the ppt functions for SMU IP v13.0.4.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Tim Huang
55c894945b drm/amdgpu/pm: add swsmu ppt implementation for SMU IP v13.0.4
Add swsmu ppt files for SMU IP v13.0.4.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Tim Huang
a0219175b3 drm/amdgpu/pm: add some common ppt functions for SMU IP v13.0.x
Add some common ppt functions that will be used by SMU IP v13.0.x
and drop the not used function smu_v13_0_mode2_reset.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Tim Huang
563cb2d82f drm/amdgpu/pm: add EnableGfxImu message dummy map for SMU IP v13.0.4
The SMU needs this message to trigger IMU initialization.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Huang Rui
6384d44bc4 drm/amdgpu/pm: add smu v13.0.4 driver SMU if headers
Add smu v13.0.4 driver SMU interface headers.

v2: squash in the header updates (Alex)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Jack Xiao
7bd3114b1c drm/amdgpu/gfx11: fix mes mqd settings
Use the correct Memory Queue Descriptor (MQD)
structure for GC 11.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Jack Xiao
2fc092d4c7 drm/amdgpu/gfx11: fix me field handling in map_queue packet
Select the correct microengine (me) when using the
map_queue packet.  There are different me's for GFX,
compute, and scheduling.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Sathishkumar S
cdf4c8ec39 drm/amd/pm: update smartshift powerboost calc for smu13
smartshift apu and dgpu power boost are reported as percentage
with respect to their power limits. adjust the units of power before
calculating the percentage of boost.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Sathishkumar S
138292f1dc drm/amd/pm: update smartshift powerboost calc for smu12
smartshift apu and dgpu power boost are reported as percentage with
respect to their power limits. This value[0-100] reflects the boost
for the respective device.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Lang Yu
7226f40af6 drm/amdkfd: allocate MMIO/DOORBELL BOs with AMDGPU_GEM_CREATE_PREEMPTIBLE
MMIO/DOORBELL BOs' backing resources(bus address resources that are
used to talk to the GPU) are not managed by GTT manager, but they
are counted by GTT manager. That makes no sense.

With AMDGPU_GEM_CREATE_PREEMPTIBLE flag, such BOs will be managed by
PREEMPT manager(for preemptible contexts, e.g., KFD). Then they won't
be evicted and don't need to be pinned as well.

But we still leave these BOs pinned to indicate that the underlying
resource never moves.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Haohui Mai
b992a19085 drm/amdgpu: Ensure the DMA engine is deactivated during set ups
Setting the HALT bit of SDMA_F32_CNTL in all paths before programming
the ring buffer of the SDMA engine.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Alex Deucher
505c170b62 drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2)
Check if the requested stable pstate matches the current one before
changing it.  This avoids changing the stable pstate on context
destroy if the user never changed it in the first place via the
IOCTL.

v2: compare the current and requested rather than setting a flag (Lijo)

Fixes: 8cda7a4f96 ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Yang Wang
1e46ab91e5 drm/amd/pm: add smu power_limit callback for smu_v13_0_7
- get_power_limit
- set_power_limit

add above callback functions to enable power_cap hwmon node.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Yang Wang
d72a475b48 drm/amd/pm: add smu feature map support for smu_v13_0_0
the pp_features can't display full feauture information
when these mapping is not exiting.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Yang Wang
6b1407795e drm/amd/pm: add smu feature map support for smu_v13_0_7
the pp_features can't display full feauture information
when these mapping is not exiting.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00