Commit Graph

11486 Commits

Author SHA1 Message Date
Evan Quan
4abc1765d2 drm/amd/powerplay: enable SW SMU power profile switch support in KFD
Hook up the SW SMU power profile switch in KFD routine.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Evan Quan
7aa3f675d1 drm/amd/powerplay: support power profile retrieval and setting on arcturus
Enable arcturus power profile retrieval and setting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Evan Quan
839f9117e1 drm/amd/powerplay: guard consistency between CPU copy and local VRAM
This can prevent CPU to use the out-dated copy.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Tao Zhou
bd2280da46 drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS
ce can also trigger interrupt, and even both ce and ue error can be
found in one ras query, distinguishing between ce and ue in interrupt
handler is uncessary.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Suggested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:39 -05:00
Tao Zhou
91ba68f8d5 drm/amdgpu: only uncorrectable error needs gpu reset
we only read error information for correctable error in interrupt
handler, gpu reset is unnecessary since there is no data lost
in correctable error

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
b1a5895352 drm/amdgpu: update the calc algorithm of umc ecc error count
the initial value of ecc error count can be adjusted

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
b7f92097f5 drm/amdgpu: implement umc ras init function
enable umc ce interrupt and initialize ecc error count

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
51437623a0 drm/amdgpu: support ce interrupt in ras module
correctable error can also trigger interrupt in some ras blocks

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
13b7c46c18 drm/amdgpu: add error address query for umc ras
umc error address query can get ce/ue error address and clear error
status

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
2b671b6049 drm/amdgpu: apply umc_for_each_channel macro to umc_6_1
use umc_for_each_channel to make code simpler

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
fee858ba5f drm/amdgpu: add macro of umc for each channel
common function for all umc versions, loop for each umc channel is
a frequent used operation in umc block, define it as a macro to
simplify code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
3aacf4ea11 drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure
add initialization for new members of amdgpu_umc structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
33b97cf896 drm/amdgpu: add more parameters and functions to amdgpu_umc structure
expose more parameters and functions of specific umc version to common
umc layer, so amdgpu_umc layer and other blocks could access them

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Tao Zhou
a55c8d7bda drm/amdgpu: remove the clear of MCA_ADDR
clearing MCA_STATUS is enough to reset the whole MCA, writing zero to
MCA_ADDR is unnecessary

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Kevin Wang
b94afb61cd drm/amd/powerplay: honor hw limit on fetching metrics data for navi10
too frequently to update mertrics table will cause smu internal error.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Nicholas Kazlauskas
bd200d190f drm/amd/display: Don't replace the dc_state for fast updates
[Why]
DRM private objects have no hw_done/flip_done fencing mechanism on their
own and cannot be used to sequence commits accordingly.

When issuing commits that don't touch the same set of hardware resources
like page-flips on different CRTCs we can run into the issue below
because of this:

1. Client requests non-blocking Commit #1, has a new dc_state #1,
state is swapped, commit tail is deferred to work queue

2. Client requests non-blocking Commit #2, has a new dc_state #2,
state is swapped, commit tail is deferred to work queue

3. Commit #2 work starts, commit tail finishes,
atomic state is cleared, dc_state #1 is freed

4. Commit #1 work starts,
commit tail encounters null pointer deref on dc_state #1

In order to change the DC state as in the private object we need to
ensure that we wait for all outstanding commits to finish and that
any other pending commits must wait for the current one to finish as
well.

We do this for MEDIUM and FULL updates. But not for FAST updates, nor
would we want to since it would cause stuttering from the delays.

FAST updates that go through dm_determine_update_type_for_commit always
create a new dc_state and lock the DRM private object if there are
any changed planes.

We need the old state to validate, but we don't actually need the new
state here.

[How]
If the commit isn't a full update then the use after free can be
resolved by simply discarding the new state entirely and retaining
the existing one instead.

With this change the sequence above can be reexamined. Commit #2 will
still free Commit #1's reference, but before this happens we actually
added an additional reference as part of Commit #2.

If an update comes in during this that needs to change the dc_state
it will need to wait on Commit #1 and Commit #2 to finish. Then it'll
swap the state, finish the work in commit tail and drop the last
reference on Commit #2's dc_state.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181
Fixes: 004b3938e6 ("drm/amd/display: Check scaling info when determing update type")

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Nicholas Kazlauskas
43d10d30df drm/amd/display: Skip determining update type for async updates
[Why]
By passing through the dm_determine_update_type_for_commit for atomic
commits that can be done asynchronously we are incurring a
performance penalty by locking access to the global private object
and holding that access until the end of the programming sequence.

This is also allocating a new large dc_state on every access in addition
to retaining all the references on each stream and plane until the end
of the programming sequence.

[How]
Shift the determination for async update before validation. Return early
if it's going to be an async update.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Nicholas Kazlauskas
e16e37efb4 drm/amd/display: Allow cursor async updates for framebuffer swaps
[Why]
We previously allowed framebuffer swaps as async updates for cursor
planes but had to disable them due to a bug in DRM with async update
handling and incorrect ref counting. The check to block framebuffer
swaps has been added to DRM for a while now, so this check is redundant.

The real fix that allows this to properly in DRM has also finally been
merged and is getting backported into stable branches, so dropping
this now seems to be the right time to do so.

[How]
Drop the redundant check for old_fb != new_fb.

With the proper fix in DRM, this should also fix some cursor stuttering
issues with xf86-video-amdgpu since it double buffers the cursor.

IGT tests that swap framebuffers (-varying-size for example) should
also pass again.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Colin Ian King
ac4bf4a1eb drm/amdgpu: fix unsigned variable instance compared to less than zero
Currenly the error check on variable instance is always false because
it is a uint32_t type and this is never less than zero. Fix this by
making it an int type.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 7d0e6329df ("drm/amdgpu: update more sdma instances irq support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Matt Coffin
f0ced3f61b drm/amd/powerplay: Allow changing of fan_control in smu_v11_0
[Why]
Before this change, the fan control state on smu_v11 was not able to be
changed because the capability check for checking if the fan control
capability existed was inverted.

[How]
The capability check for fan control in smu_v11_0_auto_fan_control was
inverted, to correctly check for the absence, instead of presence of fan
control capabilities.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Colin Ian King
ab63131155 drm/amd/powerplay: fix a few spelling mistakes
There are a few spelling mistakes "unknow" -> "unknown" and
"enabeld" -> "enabled". Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:38 -05:00
Colin Ian King
e3bf125bdb drm/amd/powerplay: fix off-by-one upper bounds limit checks
There are two occurrances of off-by-one upper bound checking of indexes
causing potential out-of-bounds array reads. Fix these.

Addresses-Coverity: ("Out-of-bounds read")
Fixes: cb33363d0e ("drm/amd/powerplay: add smu feature name support")
Fixes: 6b294793e3 ("drm/amd/powerplay: add smu message name support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:30:37 -05:00
Jay Cornwall
5145d57ec5 drm/amdkfd: Extend CU mask to 8 SEs (v3)
Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".

v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:19:11 -05:00
Le Ma
857b82d0df drm/amdgpu: support get_cu_info for Arcturus
This change is because SE/SH layout on Arcturus is 8*1, different from
4*2(or 4*1) on Vega ASICs.

Currently the cu bitmap array is 4x4 size, and besides the bitmap is used widely
across SW stack. To mostly reduce the scale of impact, we make the cu bitmap
array compatible with SE/SH layout on Arcturus. Then the store of cu bits of
each shader array for Arcturus will be like below:
    SE0,SH0 --> bitmap[0][0]
    SE1,SH0 --> bitmap[1][0]
    SE2,SH0 --> bitmap[2][0]
    SE3,SH0 --> bitmap[3][0]
    SE4,SH0 --> bitmap[0][1]
    SE5,SH0 --> bitmap[1][1]
    SE6,SH0 --> bitmap[2][1]
    SE7,SH0 --> bitmap[3][1]

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:19:05 -05:00
Kent Russell
612e4ed99b drm/amdgpu: Fix pcie_bw on Vega20
The registers used for VG20 are different in that certain performance
counters were split off to TXCLK3/4. Vega10/12 doesn't have this, so add
a new vg20_get_pcie_usage to reflect this change.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:58 -05:00
Kent Russell
57d352f769 drm/amdgpu: Update NBIO headers to add TXCLK3/4
These are added for VG20, and are needed for PCIe bandwidth.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:50 -05:00
Andrey Grodzovsky
19ed70ff5d drm/amdgpu: Add amdgpu_asic_funcs.reset_method for Vega20
Fixes GPU reset crash.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:44 -05:00
Felix Kuehling
6856e4b65f drm/amdgpu: Mark KFD VRAM allocations for wipe on release
Memory used by KFD applications can contain sensitive information that
should not be leaked to other processes. The current approach to prevent
leaks is to clear VRAM at allocation time. This is not effective because
memory can be reused in other ways without being cleared. Synchronously
clearing memory on the allocation path also carries a significant
performance penalty.

Stop clearing memory at allocation time. Instead mark the memory for
wipe on release.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:38 -05:00
Felix Kuehling
ab2f7a5c18 drm/amdgpu: Implement VRAM wipe on release
Wipe VRAM memory containing sensitive data when moving or releasing
BOs. Clearing the memory is pipelined to minimize any impact on
subsequent memory allocation latency. Use of a poison value should
help debug future use-after-free bugs.

When moving BOs, the existing ttm_bo_pipelined_move ensures that the
memory won't be reused before being wiped.

When releasing BOs, the BO is fenced with the memory fill operation,
which results in queuing the BO for a delayed delete.

v2: Move amdgpu_amdkfd_unreserve_memory_limit into
amdgpu_bo_release_notify so that KFD can use memory that's still
being cleared in the background

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:18:32 -05:00
Leo Li
d9ec5cfd5a drm/amd/display: Use switch table for dc_to_smu_clock_type
Using a static int array will cause errors if the given dm_pp_clk_type
is out-of-bounds. For robustness, use a switch table, with a default
case to handle all invalid values.

v2: 0 is a valid clock type for smu_clk_type. Return SMU_CLK_COUNT
    instead on invalid mapping.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:57 -05:00
Nathan Chancellor
d196bbbc28 drm/amd/display: Use proper enum conversion functions
clang warns:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:336:8:
warning: implicit conversion from enumeration type 'enum smu_clk_type'
to different enumeration type 'enum amd_pp_clock_type'
[-Wenum-conversion]
                                        dc_to_smu_clock_type(clk_type),
                                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:421:14:
warning: implicit conversion from enumeration type 'enum
amd_pp_clock_type' to different enumeration type 'enum smu_clk_type'
[-Wenum-conversion]
                                        dc_to_pp_clock_type(clk_type),
                                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are functions to properly convert between all of these types, use
them so there are no longer any warnings.

Fixes: a43913ea50 ("drm/amd/powerplay: add function get_clock_by_type_with_latency for navi10")
Fixes: e5e4e22391 ("drm/amd/powerplay: add interface to get clock by type with latency for display (v2)")
Link: https://github.com/ClangBuiltLinux/linux/issues/586
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:40 -05:00
Monk Liu
482f0e5385 drm/amdgpu: fix double ucode load by PSP(v3)
previously the ucode loading of PSP was repreated, one executed in
phase_1 init/re-init/resume and the other in fw_loading routine

Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset
prior to the FW loading and any block's hw_init/resume

v2:
still do the smu fw loading since it is needed by bare-metal

v3:
drop the change in reinit_early_sriov, just clear all block's status.hw
in the head place and set the status.hw after hw_init done is enough

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:34 -05:00
Monk Liu
9244d3a6eb drm/amdgpu: fix incorrect judge on sos fw version
for SRIOV the SOS fw of PSP is loaded in hypervisor thus
guest won't tell the version of it, and judging feature by
reading the sos fw version in guest side is completely wrong

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:28 -05:00
Monk Liu
4cd4c5c064 drm/amdgpu: cleanup vega10 SRIOV code path
we can simplify all those unnecessary function under
SRIOV for vega10 since:
1) PSP L1 policy is by force enabled in SRIOV
2) original logic always set all flags which make itself
   a dummy step

besides,
1) the ih_doorbell_range set should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:21 -05:00
Kevin Wang
67194518cb drm/amd/powerplay: sort feature status index by asic feature id for smu
before this change, the pp_feature sysfs show feature enable state by
logic feature id, it is not easy to read.
this change will sort pp_features show index by asic feature id.

before:
features high: 0x00000623 low: 0xb3cdaffb
00. DPM_PREFETCHER       ( 0) : enabeld
01. DPM_GFXCLK           ( 1) : enabeld
02. DPM_UCLK             ( 3) : enabeld
03. DPM_SOCCLK           ( 4) : enabeld
04. DPM_MP0CLK           ( 5) : enabeld
05. DPM_LINK             ( 6) : enabeld
06. DPM_DCEFCLK          ( 7) : enabeld
07. DS_GFXCLK            (10) : enabeld
08. DS_SOCCLK            (11) : enabeld
09. DS_LCLK              (12) : disabled
10. PPT                  (23) : enabeld
11. TDC                  (24) : enabeld
12. THERMAL              (33) : enabeld
13. RM                   (35) : disabled
......

after:
features high: 0x00000623 low: 0xb3cdaffb
00. DPM_PREFETCHER       ( 0) : enabeld
01. DPM_GFXCLK           ( 1) : enabeld
02. DPM_GFX_PACE         ( 2) : disabled
03. DPM_UCLK             ( 3) : enabeld
04. DPM_SOCCLK           ( 4) : enabeld
05. DPM_MP0CLK           ( 5) : enabeld
06. DPM_LINK             ( 6) : enabeld
07. DPM_DCEFCLK          ( 7) : enabeld
08. MEM_VDDCI_SCALING    ( 8) : enabeld
09. MEM_MVDD_SCALING     ( 9) : enabeld
10. DS_GFXCLK            (10) : enabeld
11. DS_SOCCLK            (11) : enabeld
12. DS_LCLK              (12) : disabled
13. DS_DCEFCLK           (13) : enabeld
......

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-02 10:17:05 -05:00
Dave Airlie
4b381ee25d Merge tag 'drm-fixes-5.3-2019-07-31' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.3-2019-07-31:

amdgpu:
- Fix temperature granularity for navi
- Fix stable pstate setting for navi
- Fix VCN DPM enablement on navi
- Fix error handling on CS ioctl when processing dependencies
- Fix possible information leak in debugfs

amdkfd:
- fix memory alignment for VegaM

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731191648.25729-1-alexander.deucher@amd.com
2019-08-02 09:35:40 +10:00
Alex Deucher
9475a77b57 drm/amdkfd: enable KFD support for navi14
Same as navi10.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:26 -05:00
Dennis Li
dc4d716d4c drm/amdgpu: disable inject for failed subblocks of gfx
some subblocks of gfx fail in inject test, disable them

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:20 -05:00
Dennis Li
83b0582c90 drm/amdgpu: support gfx ras error injection and err_cnt query
check gfx error count in both ras querry function and
ras interrupt handler.

gfx ras is still disabled by default due to known stability
issue found in gpu reset.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:14 -05:00
Dennis Li
2c960ea02f drm/amdgpu: add RAS callback for gfx
Add functions for RAS error inject and query error counter

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:08 -05:00
Dennis Li
dc23a08f03 drm/amdgpu: add define for gfx ras subblock
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:51:01 -05:00
Dennis Li
4bb6b8c758 drm/amd/include: add define of TCP_EDC_CNT_NEW
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:54 -05:00
Dennis Li
ca3f422f53 drm/amd/include: add bitfield define for EDC registers
Add EDC registers to support VEGA20 RAS

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:47 -05:00
Tao Zhou
7cdc2ee300 drm/amdgpu: remove ras_reserve_vram in ras injection
error injection address is not in gpu address space

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:41 -05:00
Tao Zhou
e10634938b drm/amdgpu: add check for ras error type
only ue and ce errors are supported

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:35 -05:00
Tao Zhou
81e02619e9 drm/amdgpu: update interrupt callback for all ras clients
add err_data parameter in interrupt cb for ras clients

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:29 -05:00
Tao Zhou
cf04dfd0e9 drm/amdgpu: allow ras interrupt callback to return error data
add error data as parameter for ras interrupt cb and process it

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:23 -05:00
Tao Zhou
8c94810357 drm/amdgpu: query umc ras error address
query umc ras error address, translate it to gpu 4k page view
and save it.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:17 -05:00
Tao Zhou
c2742aef4d drm/amdgpu: add structures for umc error address translation
add related registers, callback function and channel index table

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:11 -05:00
Tao Zhou
6f102dba80 drm/amdgpu: add support for recording ras error address
more than one error address may be recorded in one query

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:50:05 -05:00
Tao Zhou
f1ed4afa13 drm/amdgpu: update algorithm of umc uncorrectable error counting
remove the check of ErrorCodeExt

v2: refine the if condition for ue counting

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:58 -05:00
Tao Zhou
045c021653 drm/amdgpu: switch to amdgpu_umc structure
create new amdgpu_umc structure to for more umc
settings in future and switch to the new structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:52 -05:00
Tao Zhou
5bbfb64a17 drm/amdgpu: use 64bit operation macros for umc
replace some 32bit macros with 64bit operations to simplify code

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:46 -05:00
Tao Zhou
4fa1c6a679 drm/amdgpu: add RREG64/WREG64(_PCIE) operations
add 64 bits register access functions

v2: implement 64 bit functions in low level

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:40 -05:00
Tao Zhou
05a58345db drm/amdgpu: add ras error count after each query (v2)
v1: increase ras ce/ue error count
v2: log the number of correctable and uncorrectable errors

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:33 -05:00
Hawking Zhang
939e2258ce drm/amdgpu: querry umc error count
check umc error count in both ras querry function and
ras interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:28 -05:00
Hawking Zhang
5b6b35aaac drm/amdgpu: init umc v6_1 functions for vega20
init umc callback function for vega20 in sw early init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:22 -05:00
Hawking Zhang
9884c2b1c3 drm/amdgpu: add umc v6_1 query error count support
Implement umc query_ras_error_count function to support querry
both correctable and uncorrectable error

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:16 -05:00
Hawking Zhang
03c9963f47 drm/amdgpu: add umc v6_1_1 IP headers
the change introduces IP headers for unified memory controller (umc)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:10 -05:00
Hawking Zhang
245219a660 drm/amdgpu: add rsmu v_0_0_2 ip headers
remote smu (rsmu) is a sub-block used as ip register interface,
error handling, reset generation.etc

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:03 -05:00
Hawking Zhang
9e585a523b drm/amdgpu: add amdgpu_umc_functions structure
This is common structure as UMC callback function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:57 -05:00
Hawking Zhang
6501a77170 drm/amdgpu: init RSMU and UMC ip base address for vega20
the driver needs to program RSMU and UMC registers to
support vega20 RAS feature

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:51 -05:00
Hawking Zhang
7af25d5b7e drm/amdgpu: move some ras data structure to amdgpu_ras.h
These are common structures that can be included by IP specific
source files

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:32 -05:00
Alex Deucher
fa1884f9d8 drm/amdgpu: drop drmP.h from vcn_v2_5.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:41 -05:00
Alex Deucher
9a2ffeb525 drm/amdgpu: drop drmP.h from vcn_v2_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:36 -05:00
Alex Deucher
75589f496d drm/amdgpu: drop drmP.h from sdma_v5_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:31 -05:00
Alex Deucher
e9eea90247 drm/amdgpu: drop drmP.h from nv.c
And fix up the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:26 -05:00
Alex Deucher
b23b2e9e49 drm/amdgpu: drop drmP.h from navi10_ih.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:21 -05:00
Alex Deucher
0a069bbe13 drm/amdgpu: drop drmP.h in gfx_v10_0.c
And fix the fallout.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:16 -05:00
Alex Deucher
3b90f6ecdf drm/amdgpu: drop drmP.h from amdgpu_amdkfd_gfx_v10.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:33:10 -05:00
Alex Deucher
32978d8cfd drm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:32:56 -05:00
Andrzej Pietrasiewicz
5b50fa2b35 drm/amdgpu: Provide ddc symlink in connector sysfs directory
Use the ddc pointer provided by the generic connector.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7fee0fa0d0f77af6595d283d5f3ae5d551475821.1564161140.git.andrzej.p@collabora.com
2019-07-31 16:35:37 +02:00
Daniel Vetter
b2ad978fd0 drm/amdgpu: Fill out gem_object->resv
That way we can ditch our gem_prime_res_obj implementation. Since ttm
absolutely needs the right reservation object all the boilerplate is
already there and we just have to wire it up correctly.

Note that gem/prime doesn't care when we do this, as long as we do it
before the bo is registered and someone can call the handle2fd ioctl
on it.

Aside: ttm_buffer_object.ttm_resv could probably be ditched in favour
of always passing a non-NULL resv to ttm_bo_init(). At least for gem
drivers that would avoid having two of these, on in ttm_buffer_object
and the other in drm_gem_object, one just there for confusion.

Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Michel Dänzer" <michel.daenzer@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Amber Lin <Amber.Lin@amd.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Junwei Zhang <Jerry.Zhang@amd.com>
Cc: Thomas Zimmermann <contact@tzimmermann.org>
Cc: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725132655.11951-4-daniel.vetter@ffwll.ch
2019-07-31 10:19:23 +02:00
Evan Quan
6dee4829cf drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Evan Quan
a3ebbdb95f drm/amd/powerplay: correct Navi10 VCN powergate control (v2)
No VCN DPM bit check as that's different from VCN PG. Also
no extra check for possible double enablement/disablement
as that's already done by VCN.

v2: check return value of smu_feature_set_enabled

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Evan Quan
e21e3581e2 drm/amd/powerplay: support VCN powergate status retrieval for SW SMU
Commonly used for VCN powergate status retrieval for SW SMU.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Evan Quan
201cd702b7 drm/amd/powerplay: support VCN powergate status retrieval on Raven
Enable VCN powergate status report on Raven.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Evan Quan
a02709818f drm/amd/powerplay: add new sensor type for VCN powergate status
VCN is widely used in new ASICs and different from tranditional
UVD and VCE.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 02:02:22 -05:00
Wang Xiayang
929e571c04 drm/amdgpu: fix a potential information leaking bug
Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 01:26:09 -05:00
Christian König
67d0859e27 drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 01:26:09 -05:00
Evan Quan
f0bc1ee473 drm/amd/powerplay: enable SW SMU reset functionality
Move SMU irq handler register to sw_init as that's totally
software related. Otherwise, it will prevent SMU reset working.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:07:16 -05:00
Evan Quan
479156f2e5 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:06:58 -05:00
Alex Deucher
090efd946d drm/amdgpu/powerplay: use proper revision id for navi
The PCI revision id determines the sku.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:05:37 -05:00
Kevin Wang
45a660143b drm/amd/powerplay: fix temperature granularity error in smu11
in this patch,
drm/amd/powerplay: add callback function of get_thermal_temperature_range
the driver missed temperature granularity change on other temperature.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:01:05 -05:00
Kevin Wang
2c0f07fe3c drm/amd/powerplay: add callback function of get_thermal_temperature_range
1. the thermal temperature is asic related data, move the code logic to
xxx_ppt.c.
2. replace data structure PP_TemperatureRange with
smu_temperature_range.
3. change temperature uint from temp*1000 to temp (temperature uint).

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 00:00:49 -05:00
Kent Russell
d65848657c drm/amdkfd: Fix byte align on VegaM
This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:59:00 -05:00
Evan Quan
59de58f84f drm/amd/powerplay: determine the features to enable by pptable only
Per current logics, the features to enable are determined together
by driver and pptable. This is not efficient in co-debug with
firmware team.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Hawking Zhang
861324983d drm/amdgpu: correct irq type used for sdma ecc
we should pass irq type, instead of irq client id,
to irq_get/put interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
b4af964e75 drm/amd/powerplay: make power limit retrieval as asic specific
The power limit retrieval should be done per asic. Since we may
need to lookup in the pptable and that's really asic specific.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
1f23cadbe0 drm/amd/powerplay: correct arcturus current clock level calculation
There may be 1Mhz delta between target and actual frequency. That
should be taken into consideration for current level check.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
60d435b73d drm/amd/powerplay: support UMD PSTATE settings on arcturus
Enable arcturus UMD PSTATE support.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
4bf76e60b9 drm/amd/powerplay: fix arcturus real-time clock frequency retrieval
Make sure we can still get the accurate gfxclk/uclk/socclk frequency
even on dpm disabled.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Kevin Wang
790ef68afc drm/amd/powerplay: remove redundancy debug log in smu
remove redundacy debug log in smu.
eg:
[ 6897.969447] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024114] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024152] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078296] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078333] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6901.133230] amdgpu: [powerplay] Unsupported SMU message: 38

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
8a856ced35 drm/amd/powerplay: correct the bitmask used in arcturus
Those bitmask prefixed by "SMU_" should be used.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
55bf7e6243 drm/amd/powerplay: add missing arcturus feature maps
Add missing feature maps for arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
d427cf8f7f drm/amd/powerplay: support fan speed retrieval on arcturus
Support arcturus fan speed retrieval.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
631807f091 drm/amd/powerplay: support real-time clock retrieval on arcturus
Enable arcturus real-time clock retrieval.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
ba74c8bf88 drm/amd/powerplay: support sensor reading on arcturus
Support sensor reading for gpu loading, power and
temperatures.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
832a7062a0 drm/amd/powerplay: init arcturus SMU metrics table on bootup
Initialize arcturus SMU metrics table.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
1f96ecef6f drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
5fa790f6c9 drm/amd/powerplay: correct Navi10 VCN powergate control (v2)
No VCN DPM bit check as that's different from VCN PG. Also
no extra check for possible double enablement/disablement
as that's already done by VCN.

v2: check return value of smu_feature_set_enabled

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
bf2bf52383 drm/amd/powerplay: support VCN powergate status retrieval for SW SMU
Commonly used for VCN powergate status retrieval for SW SMU.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
ab9e314886 drm/amd/powerplay: support VCN powergate status retrieval on Raven
Enable VCN powergate status report on Raven.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
9829e3d89b drm/amd/powerplay: add new sensor type for VCN powergate status
VCN is widely used in new ASICs and different from tranditional
UVD and VCE.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Le Ma
7d0e6329df drm/amdgpu: update more sdma instances irq support
Update for sdma ras ecc_irq and other minors.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Le Ma
9d4d7236ef drm/amd/include: adjust base offset of SMUIO and THM for Arcturus
Arcturus has different _BASE_IDX value in some HWIP_offset.h. To make source
files like smu_v11_0.c and soc15.c that include HWIP_offset.h of Vega20
reusable for Arcturus, align this base offset with Vega20.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
3ff101b8ab drm/amd/powerplay: hold on the arcturus gfx dpm support in driver
As for now, only "Prefetcher" is guarded to be working from
SMU firmware.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
7c16d24abe drm/amdgpu: correct VCN powergate routine for acturus
Arcturus VCN should powergate in the way as Navi.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
fe089e1dd7 drm/amd/powerplay: enable arcturus powerplay
Arcturus powerplay is ready to use.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
cca4fafc09 drm/amd/powerplay: initialize arcturus MP1 and THM base address
Initialize base address for those IPs which are used in powerplay.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
22e1831734 drm/amd/powerplay: enable SW SMU routine support for arcturus
Enable arcturus SW SMU routines.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
a94235af11 drm/amd/powerplay: update arcturus_ppt.c/h V3
Arcturus ASIC specific powerplay interfaces.

V2: correct SMU msg naming
    drop unnecessary debugs

V3: rebase (Alex)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
c8893d5ce7 drm/amd/powerplay: update arcturus_ppsmc.h
Correct header and fix typo.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
22808306f2 drm/amd/powerplay: update smu11_driver_if_arcturus.h
It guides how driver should interface with SMU in arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
7c8bcaf408 drm/amd/powerplay: add SW SMU interface for dumping pptable out (v2)
This is especially useful in early bring up phase.

v2: disabled by default (Alex)

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
4c35e77865 drm/amd/powerplay: add smcdpminfo table v4_6 support
New smcdpminfo table used in arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
1faa3b8054 drm/amdkfd: Save/restore vcc on gfx10
VCC moved out of user SGPR allocation in gfx10. It's now stored
in SGPRs 106-107.

Also fixes incorrect SGPR read offsets.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
f9e346aba1 drm/amdkfd: Save/restore flat_scratch_lo/hi on gfx10
These moved from SGPRs in gfx9 to HWREG in gfx10.

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Jay Cornwall
7ce55e0b6f drm/amdkfd: Fix gfx10 wave64 VGPR context restore
Copy/paste error, first 4 VGPRs are separated by 64 dwords (256 bytes).

Cc: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Nicholas Kazlauskas
f16d523f9d drm/amd/display: Support uclk switching for DCN2
[Why]
We were previously forcing the uclk for every state to max and reducing
the switch time to prevent uclk switching from occuring. This workaround
was previously needed in order to avoid hangs + underflow under certain
display configurations.

Now that DC has the proper fix complete we can drop the hacks and
improve power for most display configurations.

[How]
We still need the function pointers hooked up to grab the real uclk
states from pplib. The rest of the prior hack can be reverted.

The key requirements here are really just DC support, updated firmware,
and support for disabling p-state support when needed in pplib/smu.

When these requirements are met uclk switching works without underflow
or hangs.

Fixes: 02316e963a ("drm/amd/display: Force uclk to max for every state")

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Nicholas Kazlauskas
fb6959ae50 drm/amd/display: Embed DCN2 SOC bounding box
[Why]
In order to support uclk switching on NV10 the SOC bounding box
needs to be updated.

[How]
We currently read the constants from the gpu info FW, but supporting
workarounds in DC for different versions of the FW adds additional
complexity to the codebase.

NV10 has been released so it's cleanest to keep the bounding box and
source code in sync by embedding the bounding box like we do for
other ASICs.

Fixes: 02316e963a ("drm/amd/display: Force uclk to max for every state")

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Wang Xiayang
1a2c29bce0 drm/amdgpu: fix a potential information leaking bug
Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Christian König
6494120695 drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Kenneth Feng
6e92e156aa drm/amdgpu/powerplay: provide the interface to disable uclk switch for DAL
provide the interface for DAL to disable uclk switch on navi10.
in this case, the uclk will be fixed to maximum.
this is a workaround when display configuration causes underflow issue.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Thong Thai
c74dbe44ea drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0
VCN 2.0 firmware now requires a packet start command to be sent before
any other decode ring buffer command.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Alex Deucher
3207dcf3af drm/amdgpu/gfx10: update golden settings for navi14
Updated settings for hw team.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
98eb03bbf0 drm/amd/powerplay: implment sysfs feature status function in smu
1. Unified feature enable status format in sysfs
2. Rename ppfeature to pp_features to adapt other pp sysfs node name
3. this function support all asic, not asic related function.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Rui Huang <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
26dd668155 drm/amd/powerplay: move smu_feature_update_enable_state to up level
this function is not ip or asic related function,
so move it to top level as public api in smu.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
cb33363d0e drm/amd/powerplay: add smu feature name support
add smu_get_feature_name support in smu.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
6b294793e3 drm/amd/powerplay: add smu message name support
add smu_get_message_name support in smu.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
0ba5eda81a drm/amd/powerplay: move smu types to smu_types.h
move some enum type (message, feature, clock) to smu_types.h.
these types is too long in amdgpu_smu.h, and not clearly.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Joseph Greathouse
2c89731803 drm/amdgpu: Default disable GDS for compute+gfx
Units in the GDS block default to allowing all VMIDs access to all
entries. Disable shader access to the GDS, GWS, and OA blocks from all
compute and gfx VMIDs by default. For compute, HWS firmware will set
up the access bits for the appropriate VMID when a compute queue
requires access to these blocks.
The driver will handle enabling access on-demand for graphics VMIDs.

Leaving VMID0 with full access because otherwise HWS cannot save or
restore values during task switch.

v2: Fixed code and comment styling.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Thong Thai
333fe325fe drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands
Sets the CMD_SOURCE bit for VCN 2.0 Decoder Ring Buffer commands. This
bit was previously set by the RBC HW on older firmware. Newer firmware
uses a SW RBC and this bit has to be set by the driver.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
d3b9f39d84 drm/amdgpu/display: fix the build without CONFIG_DRM_AMD_DC_DSC_SUPPORT
Some code was missing the CONFIG_DRM_AMD_DC_DSC_SUPPORT guard.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Andrey Grodzovsky
f2bd8a0ed7 drm/amdgpu: Fix amdgpu_display_supported_domains logic.
Add restriction to dissallow GTT domain if the relevant BO
doesn't have USWC flag set to avoid the APU hang scenario.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
354e6e14ef drm/amdgpu/powerplay: use proper revision id for navi
The PCI revision id determines the sku.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
a3a09142f4 drm/amdgpu: put the SMC into the proper state on reset/unload
When doing a GPU reset or unloading the driver, we need to
put the SMU into the apprpriate state for the re-init after
the reset or unload to reliably work.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state to it needs to be active.

For suspend (S3), the asic is put into D3 so the SMU would be
powered down so I don't think we need to put the SMU into
any special state.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:44 -05:00
Alex Deucher
1c074a6383 drm/amdgpu/powerplay: add set_mp1_state for vega12
This sets the SMU into the proper state for various
operations (shutdown, unload, GPU reset, etc.).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:41 -05:00
Alex Deucher
e254102d50 drm/amdgpu/powerplay: add set_mp1_state for vega10
This sets the SMU into the proper state for various
operations (shutdown, unload, GPU reset, etc.).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:39 -05:00
Alex Deucher
a906277d22 drm/amdgpu/powerplay: add set_mp1_state for vega20
This sets the SMU into the proper state for various
operations (shutdown, unload, GPU reset, etc.).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:35 -05:00
Alex Deucher
479baeacd8 drm/amdgpu/powerplay: return success if set_mp1_state is not set
Some asics (APUs) don't have this callback so we want to return
success.  Avoids spurious error messages on APUs.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:25 -05:00
Alex Deucher
a2c28e34f8 drm/amdgpu/powerplay: add a new interface to set the mp1 state
This is required for certain cases such as various GPU resets
(mode1, mode2), BACO, shutdown, unload, etc. to put the SMU into
the appropriate state for when the hw is re-initialized.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:21 -05:00
Alex Deucher
2ddc6c3ef9 drm/amdgpu: add reset_method asic callback for navi
Navi uses either mode1 or baco depending on various
conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:17 -05:00
Alex Deucher
ee360c0b7c drm/amdgpu: add reset_method asic callback for soc15
APUs only support mode2 reset.  dGPUs use either mode1 or
baco depending on various conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:13 -05:00
Alex Deucher
9bc1932f5c drm/amdgpu: add reset_method asic callback for vi
VI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:10 -05:00
Alex Deucher
6d0f50dafe drm/amdgpu: add reset_method asic callback for cik
CIK always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:06 -05:00
Alex Deucher
dd81eede77 drm/amdgpu: add reset_method asic callback for si
SI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:56 -05:00
Alex Deucher
0cf3c64f29 drm/amdgpu: add an asic callback to determine the reset method
Sometimes the driver may have to behave differently depending
on the method we are using to reset the GPU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:44 -05:00
Evan Quan
4d7fd9e20b drm/amd/powerplay: enable SW SMU reset functionality
Move SMU irq handler register to sw_init as that's totally
software related. Otherwise, it will prevent SMU reset working.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:37 -05:00
Evan Quan
f0d2a7dc11 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:29 -05:00