Commit Graph

1184 Commits

Author SHA1 Message Date
Somalapuram Amaranath
3d8785f6c0 drm/amdgpu: adding device coredump support
Added device coredump information:
- Kernel version
- Module
- Time
- VRAM status
- Guilty process name and PID
- GPU register dumps
v1 -> v2: Variable name change
v1 -> v2: NULL check
v1 -> v2: Code alignment
v1 -> v2: Adding dummy amdgpu_devcoredump_free
v1 -> v2: memset reset_task_info to zero
v2 -> v3: add CONFIG_DEV_COREDUMP for variables
v2 -> v3: remove NULL check on amdgpu_devcoredump_read

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-06 14:41:19 -04:00
Somalapuram Amaranath
651d7ee63f drm/amdgpu: save the reset dump register value for devcoredump
Allocate memory for register value and use the same values for devcoredump.
v1 -> v2: Change krealloc_array() to kmalloc_array()
v2 -> v3: Fix alignment

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-06 14:41:12 -04:00
pengfuyuan
faf26f2b12 drm/amd: Fix spelling typo in comments
Fix spelling typo in comments.

Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: pengfuyuan <pengfuyuan@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03 16:43:36 -04:00
Mario Limonciello
0223e51647 drm/amd: Don't reset dGPUs if the system is going to s2idle
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de0874 ("drm/amdgpu:
always reset the asic in suspend (v2)").  A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Likun Gao
1b49133042 drm/amdgpu: add lsdma block
Add Light SDMA (LSDMA) block and related function. LSDMA
is a small instance of SDMA mainly for kernel driver use.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00
Likun Gao
8424f2ccb3 drm/amdgpu/psp: Add vbflash sysfs interface support
Add sysfs interface to copy VBIOS.

v2: squash in fix for proper vmalloc API (Alex)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:10 -04:00
Alex Deucher
5a90c24ad0 Revert "drm/amdgpu: disable runpm if we are the primary adapter"
This reverts commit b95dc06af3.

This workaround is no longer necessary.  We have a better workaround
in commit f95af4a923 ("drm/amdgpu: don't runtime suspend if there are displays attached (v3)").

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-05 16:50:00 -04:00
Jack Xiao
928fe236c0 drm/amdgpu: add mes_kiq module parameter v2
mes_kiq parameter is used to enable mes kiq pipe.
This module parameter is unneccessary or enabled by default
in final version.

v2: reword commit message.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:49 -04:00
Jack Xiao
2bc956ef54 drm/amdgpu: add the per-context meta data v3
The per-context meta data is a per-context data structure associated
with a mes-managed hardware ring, which includes MCBP CSA, ring buffer
and etc.

v2: fix typo
v3: a. use structure instead of typedef
    b. move amdgpu_mes_ctx_get_offs_* to amdgpu_ring.h
    c. use __aligned to make alignement

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:03:17 -04:00
Jack Xiao
5405a52627 drm/amdgpu: define MQD abstract layer for hw ip
Define MQD abstract layer for hw ip, for the passing
mqd configuration not only from ring but more sources,
like user queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:03:11 -04:00
Likun Gao
7f318f4e30 drm/amdgpu: add tracking for the enablement of SCPM
Add parmeter to shows whether SCPM feature is enabled or not, and
whether is valid.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 09:56:51 -04:00
Likun Gao
563fcfbf31 drm/amdgpu: add hdp version 6 functions
Unify hdp related function into hdp structure for hdp version 6.
V2: Remove hdp invalidate function as hdp v6 doesn't have read cache.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 09:53:58 -04:00
Likun Gao
1d5eee7dd6 drm/amdgpu: add function to decode ip version
Add function to decode IP version.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28 17:46:28 -04:00
Likun Gao
3202c7e782 drm/amdgpu: increase HWIP MAX INSTANCE
Extend HWIP MAX INSTANCE to 11.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28 17:46:24 -04:00
Evan Quan
25faeddcf3 drm/amdgpu: expand cg_flags from u32 to u64
With this, we can support more CG flags.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-08 17:24:24 -04:00
Christian König
a190f8dc4a drm/amdgpu: header cleanup
No function change, just move a bunch of definitions from amdgpu.h into
separate header files.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-04 13:03:30 -05:00
Ruijing Dong
11eb648d01 drm/amdgpu/vcn: Add vcn firmware log
vcn fwlog is for debugging purpose only,
by default, it is disabled.

Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-04 13:03:30 -05:00
Dave Airlie
38a15ad948 Merge tag 'amd-drm-next-5.18-2022-02-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-25:

amdgpu:
- Raven2 suspend/resume fix
- SDMA 5.2.6 updates
- VCN 3.1.2 updates
- SMU 13.0.5 updates
- DCN 3.1.5 updates
- Virtual display fixes
- SMU code cleanup
- Harvest fixes
- Expose benchmark tests via debugfs
- Drop no longer relevant gart aperture tests
- More RAS restructuring
- W=1 fixes
- PSR rework
- DP/VGA adapter fixes
- DP MST fixes
- GPUVM eviction fix
- GPU reset debugfs register dumping support
- Misc display fixes
- SR-IOV fix
- Aldebaran mGPU fix
- Add module parameter to disable XGMI for testing

amdkfd:
- IH ring overflow logging fixes
- CRIU fixes
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225183535.5907-1-alexander.deucher@amd.com
2022-03-01 16:19:02 +10:00
Alex Sierra
158a05a0b8 drm/amdgpu: Add use_xgmi_p2p module parameter
This parameter controls xGMI p2p communication, which is enabled by
default. However, it can be disabled by setting it to 0. In case xGMI
p2p is disabled in a dGPU, PCIe p2p interface will be used instead.
This parameter is ignored in GPUs that do not support xGMI
p2p configuration.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-24 17:25:12 -05:00
Dave Airlie
54f43c17d6 drm-misc-next for v5.18:
UAPI Changes:
 
 Cross-subsystem Changes:
 - Split out panel-lvds and lvds dt bindings .
 - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h
   and use it in drivers and tomoyo.
 - Clarify dma_fence_chain and dma_fence_array should never include eachother.
 - Flatten chains in syncobj's.
 - Don't double add in fbdev/defio when page is already enlisted.
 - Don't sort deferred-I/O pages by default in fbdev.
 
 Core Changes:
 - Fix missing pm_runtime_put_sync in bridge.
 - Set modifier support to only linear fb modifier if drivers don't
   advertise support.
 - As a result, we remove allow_fb_modifiers.
 - Add missing clear for EDID Deep Color Modes in drm_reset_display_info.
 - Assorted documentation updates.
 - Warn once in drm_clflush if there is no arch support.
 - Add missing select for dp helper in drm_panel_edp.
 - Assorted small fixes.
 - Improve fb-helper's clipping handling.
 - Don't dump shmem mmaps in a core dump.
 - Add accounting to ttm resource manager, and use it in amdgpu.
 - Allow querying the detected eDP panel through debugfs.
 - Add helpers for xrgb8888 to 8 and 1 bits gray.
 - Improve drm's buddy allocator.
 - Add selftests for the buddy allocator.
 
 Driver Changes:
 - Add support for nomodeset to a lot of drm drivers.
 - Use drm_module_*_driver in a lot of drm drivers.
 - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau,
   bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86.
 - Add bridge/it6505.
 - Create DP and DVI-I connectors in ast.
 - Assorted nouveau backlight fixes.
 - Rework amdgpu reset handling.
 - Add dt bindings for ingenic,jz4780-dw-hdmi.
 - Support reading edid through aux channel in ingenic.
 - Add a drm driver for Solomon SSD130x OLED displays.
 - Add simple support for sharp LQ140M1JW46.
 - Add more panels to nt35560.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmIWLSEACgkQ/lWMcqZw
 E8OP7hAAjix94EX5fhFa7OAdqUbFtsiKhK/4zNtV9FWpFiEsDBz+dlbfDQWIx5an
 FIiiiQtSfWjpDv6pcMhoNf80w+dDbc/Cuauz6nNGO7Pkaerh2D/EPG74FD7f7nE3
 EIScVs1heYtzM9usKrFKupNYgIdgZxmeovClWuE0OTjLOes2PGvvxXK6iJqNqJMX
 VlDO5SR7GRqsDUDV6vmwl63uKL77xJXAahAXIx+BQ/1xrtEhlu6NwsgHIsmPmMSN
 YluX34zc1xD/6/uUqvEdp7u46/5/He1c5Q/ia1WV3wRxsO/eMZ+axXqCZP3XGZdt
 rMdGNtj1MWKkudYiowStWkCVSG/0fXJCFIAhvRmeZy+YqPdVlqZ2W7g4H1l9iJoo
 UVfT9cHrKoxHsukvIEckC5Ov9v1yr39Bd4wUuqaUTUSxY8VID5vjY63TsXl9Zke1
 SluTFe9qybbnRNz/hYRvwIS1eT8HvUauAfAhypGTLI5DYHTD7PawcfMJkNzCtJm4
 Ta4SC3rTpkpN+7oc8SoNgqRHQ8U9KL5oksP0wVa8vwHsMptSd3X4pUljc6TcfjLv
 GEo41D5AuJz3HRVcn9yqPbLoPE2FFB7bfwIMH77yNnoos4Izy/LGhKpN0YdImmI5
 W5XVFB0jltGSIhkzLe1mFpLrdJwdUTSUVeCK4H5PhZZQEHLkVtg=
 =HuwD
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2022-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.18:

UAPI Changes:

Cross-subsystem Changes:
- Split out panel-lvds and lvds dt bindings .
- Put yes/no on/off disabled/enabled strings in linux/string_helpers.h
  and use it in drivers and tomoyo.
- Clarify dma_fence_chain and dma_fence_array should never include eachother.
- Flatten chains in syncobj's.
- Don't double add in fbdev/defio when page is already enlisted.
- Don't sort deferred-I/O pages by default in fbdev.

Core Changes:
- Fix missing pm_runtime_put_sync in bridge.
- Set modifier support to only linear fb modifier if drivers don't
  advertise support.
- As a result, we remove allow_fb_modifiers.
- Add missing clear for EDID Deep Color Modes in drm_reset_display_info.
- Assorted documentation updates.
- Warn once in drm_clflush if there is no arch support.
- Add missing select for dp helper in drm_panel_edp.
- Assorted small fixes.
- Improve fb-helper's clipping handling.
- Don't dump shmem mmaps in a core dump.
- Add accounting to ttm resource manager, and use it in amdgpu.
- Allow querying the detected eDP panel through debugfs.
- Add helpers for xrgb8888 to 8 and 1 bits gray.
- Improve drm's buddy allocator.
- Add selftests for the buddy allocator.

Driver Changes:
- Add support for nomodeset to a lot of drm drivers.
- Use drm_module_*_driver in a lot of drm drivers.
- Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau,
  bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86.
- Add bridge/it6505.
- Create DP and DVI-I connectors in ast.
- Assorted nouveau backlight fixes.
- Rework amdgpu reset handling.
- Add dt bindings for ingenic,jz4780-dw-hdmi.
- Support reading edid through aux channel in ingenic.
- Add a drm driver for Solomon SSD130x OLED displays.
- Add simple support for sharp LQ140M1JW46.
- Add more panels to nt35560.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/686ec871-e77f-c230-22e5-9e3bb80f064a@linux.intel.com
2022-02-25 05:50:18 +10:00
Somalapuram Amaranath
5ce5a584cb drm/amdgpu: add debugfs for reset registers list
List of register populated for dump collection during the GPU reset.

Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:26:36 -05:00
Alex Deucher
b784f42cf7 drm/amdgpu: drop testing module parameter
This test is not particularly useful now that GTT and GART
are decoupled in the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:02:51 -05:00
Alex Deucher
0b1a63487b drm/amdgpu: drop benchmark module parameter
Now that we expose the benchmarks via debugfs, there is no
longer a need for the module parameter.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:02:51 -05:00
Alex Deucher
f113cc32e3 drm/amdgpu: add a benchmark mutex
To avoid multiple runs in parallel to avoid mixing results.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:02:50 -05:00
Alex Deucher
e460f244fb drm/amdgpu: plumb error handling though amdgpu_benchmark()
So we can tell when this function fails.

v2: squash in error handling fix (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23 14:02:50 -05:00
Mario Limonciello
0ab5d711ec drm/amd: Refactor amdgpu_aspm to be evaluated per device
Evaluating `pcie_aspm_enabled` as part of driver probe has the implication
that if one PCIe bridge with an AMD GPU connected doesn't support ASPM
then none of them do.  This is an invalid assumption as the PCIe core will
configure ASPM for individual PCIe bridges.

Create a new helper function that can be called by individual dGPUs to
react to the `amdgpu_aspm` module parameter without having negative results
for other dGPUs on the PCIe bus.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17 15:59:05 -05:00
Luben Tuikov
a6c40b1780 drm/amdgpu: Show IP discovery in sysfs
Add IP discovery data in sysfs. The format is:
/sys/class/drm/cardX/device/ip_discovery/die/D/B/I/<attrs>
where,
X is the card ID, an integer,
D is the die ID, an integer,
B is the IP HW ID, an integer, aka block type,
I is the IP HW ID instance, an integer.
<attrs> are the attributes of the block instance. At the moment these
include HW ID, instance number, major, minor, revision, number of base
addresses, and the base addresses themselves.

A symbolic link of the acronym HW ID is also created, under D/, if you
prefer to browse by something humanly accessible.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Tom StDenis <tom.stdenis@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14 15:08:40 -05:00
Dave Airlie
123db17ddf Merge tag 'amd-drm-next-5.18-2022-02-11-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-11-1:

amdgpu:
- Clean up of power management code
- Enable freesync video mode by default
- Clean up of RAS code
- Improve VRAM access for debug using SDMA
- Coding style cleanups
- SR-IOV fixes
- More display FP reorg
- TLB flush fixes for Arcuturus, Vega20
- Misc display fixes
- Rework special register access methods for SR-IOV
- DP2 fixes
- DP tunneling fixes
- DSC fixes
- More IP discovery cleanups
- Misc RAS fixes
- Enable both SMU i2c buses where applicable
- s2idle improvements
- DPCS header cleanup
- Add new CAP firmware support for SR-IOV

amdkfd:
- Misc cleanups
- SVM fixes
- CRIU support
- Clean up MQD manager

UAPI:
- Add interface to amdgpu CTX ioctl to request a stable power state for profiling
  https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207
- Add amdkfd support for CRIU
  https://github.com/checkpoint-restore/criu/pull/1709
- Remove old unused amdkfd debugger interface
  Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced

radeon:
- Fix error handling in radeon_driver_open_kms
- UVD suspend fix
- Misc fixes

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220211220706.5803-1-alexander.deucher@amd.com
2022-02-14 10:31:51 +10:00
Andrey Grodzovsky
89a7a87093 drm/amdgpu: Move in_gpu_reset into reset_domain
We should have a single instance per entrire reset domain.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74116.html
2022-02-09 12:17:57 -05:00
Andrey Grodzovsky
d0fb18b535 drm/amdgpu: Move reset sem into reset_domain
We want single instance of reset sem across all
reset clients because in case of XGMI we should stop
access cross device MMIO because any of them could be
in a reset in the moment.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html
2022-02-09 12:17:32 -05:00
Andrey Grodzovsky
cfbb6b0047 drm/amdgpu: Rework reset domain to be refcounted.
The reset domain contains register access semaphor
now and so needs to be present as long as each device
in a hive needs it and so it cannot be binded to XGMI
hive life cycle.
Adress this by making reset domain refcounted and pointed
by each member of the hive and the hive itself.

v4:

Fix crash on boot witrh XGMI hive by adding type to reset_domain.
XGMI will only create a new reset_domain if prevoius was of single
device type meaning it's first boot. Otherwsie it will take a
refocunt to exsiting reset_domain from the amdgou device.

Add a wrapper around reset_domain->refcount get/put
and a wrapper around send to reset wq (Lijo)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
2022-02-09 12:17:09 -05:00
Andrey Grodzovsky
54f329cc7a drm/amdgpu: Serialize non TDR gpu recovery with TDRs
Use reset domain wq also for non TDR gpu recovery trigers
such as sysfs and RAS. We must serialize all possible
GPU recoveries to gurantee no concurrency there.
For TDR call the original recovery function directly since
it's already executed from within the wq. For others just
use a wrapper to qeueue work and wait on it to finish.

v2: Rename to amdgpu_recover_work_struct

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74113.html
2022-02-09 12:15:23 -05:00
Andrey Grodzovsky
a4c63cafa5 drm/amdgpu: Introduce reset domain
Defined a reset_domain struct such that
all the entities that go through reset
together will be serialized one against
another. Do it for both single device and
XGMI hive cases.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74111.html
2022-02-09 12:14:32 -05:00
Mario Limonciello
18b66ace6b drm/amd: add support to check whether the system is set to s3
This will be used to help make decisions on what to do in
misconfigured systems.

v2: squash in semicolon fix from Stephen Rothwell

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-02 18:26:30 -05:00
Mario Limonciello
f588a1bbfc drm/amd: Warn users about potential s0ix problems
On some OEM setups users can configure the BIOS for S3 or S2idle.
When configured to S3 users can still choose 's2idle' in the kernel by
using `/sys/power/mem_sleep`.  Before commit 6dc8265f98 ("drm/amdgpu:
always reset the asic in suspend (v2)"), the GPU would crash.  Now when
configured this way, the system should resume but will use more power.

As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about
potential power consumption issues during their first attempt at
suspending.

Reported-by: Bjoren Dasse <bjoern.daase@gmail.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-31 16:12:29 -05:00
Hawking Zhang
04022982fc drm/amdgpu: switch to common helper to read bios from rom
create a common helper function for soc15 and onwards
to read bios image from rom

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25 18:00:33 -05:00
Xiaojian Du
86700a4026 drm/amdgpu: modify a pair of functions for the pcie port wreg/rreg
This patch will modify a pair of functions for pcie port wreg/rreg.
AMD GPU have had an independent NBIO block from SOC15 arch.
If the dirver wants to read/write the address space of the pcie devices,
it has to go through the NBIO block.
This patch will move the pcie port wreg/rreg functions to
"amdgpu_device.c", so that to reuse the functions on the
future GPU ASICs.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:11 -05:00
Jonathan Kim
400ef298f4 drm/amdgpu: cleanup ttm debug sdma vram access function
Some suggested cleanups to declutter ttm when doing debug VRAM access over
SDMA.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:52:00 -05:00
yipechai
7cab212405 drm/amdgpu: Modify the compilation failed problem when other ras blocks' .h include amdgpu_ras.h
Modify the compilation failed problem when other ras blocks' .h include amdgpu_ras.h.

v2: squash in forward declaration warning fix (Alex)

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:59 -05:00
yipechai
6492e1b07c drm/amdgpu: Unify ras block interface for each ras block
1. Define unified ops interface for each block.
2. Add ras_block_match function pointer in ops interface, each ras block can customize specail match function to identify itself.
3. Add amdgpu_ras_block_match_default new function. If a ras block doesn't define .ras_block_match, default execute amdgpu_ras_block_match_default to identify this ras block.
4. Define unified basic ras block data for each ras block.
5. Create dedicated amdgpu device ras block link list to manage all of the ras blocks.
6. Add amdgpu_ras_register_ras_block new function interface for each ras block to register itself to ras controlling block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:59 -05:00
Evan Quan
ebfc253335 drm/amd/pm: do not expose the smu_context structure used internally in power
This can cover the power implementation details. And as what did for
powerplay framework, we hook the smu_context to adev->powerplay.pp_handle.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:14 -05:00
Evan Quan
d698a2c485 drm/amd/pm: move pp_force_state_enabled member to amdgpu_pm structure
As it lables an internal pm state and amdgpu_pm structure is the more
proper place than amdgpu_device structure for it.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:14 -05:00
Solomon Chiu
de05abe6b9 drm/amd/display: Enable Freesync Video Mode by default
[Why&How]
Freesync Video Mode is a experimental feature previously,
and need to be enabled by kernel parameter. We enable it
by default with removing module paramterter in amdgpu_dm.

v2: squash the patches together

Signed-off-by: Solomon Chiu <solomon.chiu@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-14 17:51:13 -05:00
Linus Torvalds
8d0749b4f8 drm for 5.17-rc1
core:
 - add privacy screen support
 - move nomodeset option into drm subsystem
 - clean up nomodeset handling in drivers
 - make drm_irq.c legacy
 - fix stack_depot name conflicts
 - remove DMA_BUF_SET_NAME ioctl restrictions
 - sysfs: send hotplug event
 - replace several DRM_* logging macros with drm_*
 - move hashtable to legacy code
 - add error return from gem_create_object
 - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
 - kernel.h related include cleanups
 - support XRGB2101010 source buffers
 
 ttm:
 - don't include drm hashtable
 - stop pruning fences after wait
 - documentation updates
 
 dma-buf:
 - add dma_resv selftest
 - add debugfs helpers
 - remove dma_resv_get_excl_unlocked
 - documentation
 - make fences mandatory in dma_resv_add_excl_fence
 
 dp:
 - add link training delay helpers
 
 gem:
 - link shmem/cma helpers into separate modules
 - use dma_resv iteratior
 - import dma-buf namespace into gem helper modules
 
 scheduler:
 - fence grab fix
 - lockdep fixes
 
 bridge:
 - switch to managed MIPI DSI helpers
 - register and attach during probe fixes
 - convert to YAML in several places.
 
 panel:
 - add bunch of new panesl
 
 simpledrm:
 - support FB_DAMAGE_CLIPS
 - support virtual screen sizes
 - add Apple M1 support
 
 amdgpu:
 - enable seamless boot for DCN 3.01
 - runtime PM fixes
 - use drm_kms_helper_connector_hotplug_event
 - get all fences at once
 - use generic drm fb helpers
 - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
 - add smart trace buffer (STB) for supported GPUs
 - display debugfs entries
 - new SMU debug option
 - Documentation update
 
 amdkfd:
 - IP discovery enumeration refactor
 - interface between driver fixes
 - SVM fixes
 - kfd uapi header to define some sysfs bitfields.
 
 i915:
 - support VESA panel backlights
 - enable ADL-P by default
 - add eDP privacy screen support
 - add Raptor Lake S (RPL-S) support
 - DG2 page table support
 - lots of GuC/HuC fw refactoring
 - refactored i915->gt interfaces
 - CD clock squashing support
 - enable 10-bit gamma support
 - update ADL-P DMC fw to v2.14
 - enable runtime PM autosuspend by default
 - ADL-P DSI support
 - per-lane DP drive settings for ICL+
 - add support for pipe C/D DMC firmware
 - Atomic gamma LUT updates
 - remove CCS FB stride restrictions on ADL-P
 - VRR platform support for display 11
 - add support for display audio codec keepalive
 - lots of display refactoring
 - fix runtime PM handling during PXP suspend
 - improved eviction performance with async TTM moves
 - async VMA unbinding improvements
 - VMA locking refactoring
 - improved error capture robustness
 - use per device iommu checks
 - drop bits stealing from i915_sw_fence function ptr
 - remove dma_resv_prune
 - add IC cache invalidation on DG2
 
 nouveau:
 - crc fixes
 - validate LUTs in atomic check
 - set HDMI AVI RGB quant to full
 
 tegra:
 - buffer objects reworks for dma-buf compat
 - NVDEC driver uAPI support
 - power management improvements
 
 etnaviv:
 - IOMMU enabled system support
 - fix > 4GB command buffer mapping
 - close a DoS vector
 - fix spurious GPU resets
 
 ast:
 - fix i2c initialization
 
 rcar-du:
 - DSI output support
 
 exynos:
 - replace legacy gpio interface
 - implement generic GEM object mmap
 
 msm:
 - dpu plane state cleanup in prep for multirect
 - dpu debugfs cleanups
 - dp support for sc7280
 - a506 support
 - removal of struct_mutex
 - remove old eDP sub-driver
 
 anx7625:
 - support MIPI DSI input
 - support HDMI audio
 - fix reading EDID
 
 lvds:
 - fix bridge DT bindings
 
 megachips:
 - probe both bridges before registering
 
 dw-hdmi:
 - allow interlace on bridge
 
 ps8640:
 - enable runtime PM
 - support aux-bus
 
 tx358768:
 - enable reference clock
 - add pulse mode support
 
 ti-sn65dsi86:
 - use regmap bulk write
 - add PWM support
 
 etnaviv:
 - get all fences at once
 
 gma500:
 - gem object cleanups
 
 kmb:
 - enable fb console
 
 radeon:
 - use dma_resv_wait_timeout
 
 rockchip:
 - add DSP hold timeout
 - suspend/resume fixes
 - PLL clock fixes
 - implement mmap in GEM object functions
 - use generic fbdev emulation
 
 sun4i:
 - use CMA helpers without vmap support
 
 vc4:
 - fix HDMI-CEC hang with display is off
 - power on HDMI controller while disabling
 - support 4K@60Hz modes
 - support 10-bit YUV 4:2:0 output
 
 vmwgfx:
 - fix leak on probe errors
 - fail probing on broken hosts
 - new placement for MOB page tables
 - hide internal BOs from userspace
 - implement GEM support
 - implement GL 4.3 support
 
 virtio:
 - overflow fixes
 
 xen:
 - implement mmap as GEM object function
 
 omapdrm:
 - fix scatterlist export
 - support virtual planes
 
 mediatek:
 - MT8192 support
 - CMDQ refinement
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHX1vMACgkQDHTzWXnE
 hr6rAw/9ES5RO5N3Ku9foFk1CI9bqy1Kh663KLkkEc+rDdhKpiZbBnAsrKkZ9sGu
 fNuHmWNN5nWXtDSOqHWuslt3F7Gh+qEBQtlkqC9mZsBm3bWB0aJK6E4QaJxfeSaK
 ta6AmyGx8DaV+C69i86dnemQurYSDVjROd7LDPKnCU0Fye/JxiXSXQmXksKMFVxd
 x5vmO9yfeDSg3EF+u1yB6nJNUYZBV0vhrAfjPqxPCRBXuQc7akuaglE/SFwlGnEk
 vn0GjVHEQcRTqYKrHr64xvQxIoKXcJP0pkDUyT7KYCsyj8GJkvxkb7/ls5pp5DvL
 SwyNg3J3vwUVP6w6GEvzf3ffG720qqUZvCbvLmE+A/t2DhGILiAm+HXSo43PTOW8
 uagT7Gxma8dy8EovjSxioS9HPX8Gcu+S+XYavgOsevOZ7oeEt4f4TLW7LXsw9d6y
 75FrMhiUpreab5hAh8Le0swuLYZHjdnJRdjSTqZJ/T6VdTdVftLT6IfwvSDx5CHy
 cWuufgcAjd7xVTXFquHWYXWLTQkiSMGf1M02jx9IWolTd4Cm41LNBhqMEDHZLHJD
 7ngGgoaREVDQ+MqjG90yfIwJFIpJPI3YOaHLi/Kznga+iDzFY6cyOQWW2vX7ZdY5
 7+LJWsgGT8Feb7/bzD5hX1mYqJLxh1pWUIaqIKMl+7LJL7gTVU8=
 =MByd
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights are support for privacy screens found in new laptops, a
  bunch of nomodeset refactoring, and i915 enables ADL-P systems by
  default, while starting to add RPL-S support.

  vmwgfx adds GEM and support for OpenGL 4.3 features in userspace.

  Lots of internal refactorings around dma reservations, and lots of
  driver refactoring as well.

  Summary:

  core:
   - add privacy screen support
   - move nomodeset option into drm subsystem
   - clean up nomodeset handling in drivers
   - make drm_irq.c legacy
   - fix stack_depot name conflicts
   - remove DMA_BUF_SET_NAME ioctl restrictions
   - sysfs: send hotplug event
   - replace several DRM_* logging macros with drm_*
   - move hashtable to legacy code
   - add error return from gem_create_object
   - cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
   - kernel.h related include cleanups
   - support XRGB2101010 source buffers

  ttm:
   - don't include drm hashtable
   - stop pruning fences after wait
   - documentation updates

  dma-buf:
   - add dma_resv selftest
   - add debugfs helpers
   - remove dma_resv_get_excl_unlocked
   - documentation
   - make fences mandatory in dma_resv_add_excl_fence

  dp:
   - add link training delay helpers

  gem:
   - link shmem/cma helpers into separate modules
   - use dma_resv iteratior
   - import dma-buf namespace into gem helper modules

  scheduler:
   - fence grab fix
   - lockdep fixes

  bridge:
   - switch to managed MIPI DSI helpers
   - register and attach during probe fixes
   - convert to YAML in several places.

  panel:
   - add bunch of new panesl

  simpledrm:
   - support FB_DAMAGE_CLIPS
   - support virtual screen sizes
   - add Apple M1 support

  amdgpu:
   - enable seamless boot for DCN 3.01
   - runtime PM fixes
   - use drm_kms_helper_connector_hotplug_event
   - get all fences at once
   - use generic drm fb helpers
   - PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
   - add smart trace buffer (STB) for supported GPUs
   - display debugfs entries
   - new SMU debug option
   - Documentation update

  amdkfd:
   - IP discovery enumeration refactor
   - interface between driver fixes
   - SVM fixes
   - kfd uapi header to define some sysfs bitfields.

  i915:
   - support VESA panel backlights
   - enable ADL-P by default
   - add eDP privacy screen support
   - add Raptor Lake S (RPL-S) support
   - DG2 page table support
   - lots of GuC/HuC fw refactoring
   - refactored i915->gt interfaces
   - CD clock squashing support
   - enable 10-bit gamma support
   - update ADL-P DMC fw to v2.14
   - enable runtime PM autosuspend by default
   - ADL-P DSI support
   - per-lane DP drive settings for ICL+
   - add support for pipe C/D DMC firmware
   - Atomic gamma LUT updates
   - remove CCS FB stride restrictions on ADL-P
   - VRR platform support for display 11
   - add support for display audio codec keepalive
   - lots of display refactoring
   - fix runtime PM handling during PXP suspend
   - improved eviction performance with async TTM moves
   - async VMA unbinding improvements
   - VMA locking refactoring
   - improved error capture robustness
   - use per device iommu checks
   - drop bits stealing from i915_sw_fence function ptr
   - remove dma_resv_prune
   - add IC cache invalidation on DG2

  nouveau:
   - crc fixes
   - validate LUTs in atomic check
   - set HDMI AVI RGB quant to full

  tegra:
   - buffer objects reworks for dma-buf compat
   - NVDEC driver uAPI support
   - power management improvements

  etnaviv:
   - IOMMU enabled system support
   - fix > 4GB command buffer mapping
   - close a DoS vector
   - fix spurious GPU resets

  ast:
   - fix i2c initialization

  rcar-du:
   - DSI output support

  exynos:
   - replace legacy gpio interface
   - implement generic GEM object mmap

  msm:
   - dpu plane state cleanup in prep for multirect
   - dpu debugfs cleanups
   - dp support for sc7280
   - a506 support
   - removal of struct_mutex
   - remove old eDP sub-driver

  anx7625:
   - support MIPI DSI input
   - support HDMI audio
   - fix reading EDID

  lvds:
   - fix bridge DT bindings

  megachips:
   - probe both bridges before registering

  dw-hdmi:
   - allow interlace on bridge

  ps8640:
   - enable runtime PM
   - support aux-bus

  tx358768:
   - enable reference clock
   - add pulse mode support

  ti-sn65dsi86:
   - use regmap bulk write
   - add PWM support

  etnaviv:
   - get all fences at once

  gma500:
   - gem object cleanups

  kmb:
   - enable fb console

  radeon:
   - use dma_resv_wait_timeout

  rockchip:
   - add DSP hold timeout
   - suspend/resume fixes
   - PLL clock fixes
   - implement mmap in GEM object functions
   - use generic fbdev emulation

  sun4i:
   - use CMA helpers without vmap support

  vc4:
   - fix HDMI-CEC hang with display is off
   - power on HDMI controller while disabling
   - support 4K@60Hz modes
   - support 10-bit YUV 4:2:0 output

  vmwgfx:
   - fix leak on probe errors
   - fail probing on broken hosts
   - new placement for MOB page tables
   - hide internal BOs from userspace
   - implement GEM support
   - implement GL 4.3 support

  virtio:
   - overflow fixes

  xen:
   - implement mmap as GEM object function

  omapdrm:
   - fix scatterlist export
   - support virtual planes

  mediatek:
   - MT8192 support
   - CMDQ refinement"

* tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm: (1241 commits)
  drm/amdgpu: no DC support for headless chips
  drm/amd/display: fix dereference before NULL check
  drm/amdgpu: always reset the asic in suspend (v2)
  drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
  drm/amd/display: Fix the uninitialized variable in enable_stream_features()
  drm/amdgpu: fix runpm documentation
  amdgpu/pm: Make sysfs pm attributes as read-only for VFs
  drm/amdgpu: save error count in RAS poison handler
  drm/amdgpu: drop redundant semicolon
  drm/amd/display: get and restore link res map
  drm/amd/display: support dynamic HPO DP link encoder allocation
  drm/amd/display: access hpo dp link encoder only through link resource
  drm/amd/display: populate link res in both detection and validation
  drm/amd/display: define link res and make it accessible to all link interfaces
  drm/amd/display: 3.2.167
  drm/amd/display: [FW Promotion] Release 0.0.98
  drm/amd/display: Undo ODM combine
  drm/amd/display: Add reg defs for DCN303
  drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
  drm/amd/display: Set optimize_pwr_state for DCN31
  ...
2022-01-10 12:58:46 -08:00
Alex Deucher
b95dc06af3 drm/amdgpu: disable runpm if we are the primary adapter
If we are the primary adapter (i.e., the one used by the firwmare
framebuffer), disable runtime pm.  This fixes a regression caused
by commit 55285e21f0 which results in the displays waking up
shortly after they go to sleep due to the device coming out of
runtime suspend and sending a hotplug uevent.

v2: squash in reworked fix from Evan

Fixes: 55285e21f0 ("fbdev/efifb: Release PCI device's runtime PM ref during FB destroy")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215203
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1840
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-31 08:57:45 -05:00
Dave Airlie
cb6846fbb8 Merge tag 'amd-drm-next-5.17-2021-12-30' of ssh://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.17-2021-12-30:

amdgpu:
- Suspend/resume fixes
- Fence fix
- Misc code cleanups
- IP discovery fixes
- SRIOV fixes
- RAS fixes
- GMC 8 VRAM detection fix
- FRU fixes for Aldebaran
- Display fixes

amdkfd:
- SVM fixes
- IP discovery fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211230141032.613596-1-alexander.deucher@amd.com
2021-12-31 10:59:17 +10:00
Kent Russell
6c92fe5fa5 drm/amdgpu: Increase potential product_name to 64 characters
Having seen at least 1 42-character product_name, bump the number up to
64, and put that definition into amdgpu.h to make future adjustments
simpler.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:43 -05:00
Dave Airlie
b06103b532 Merge tag 'amd-drm-next-5.17-2021-12-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amdgpu:
- Add some display debugfs entries
- RAS fixes
- SR-IOV fixes
- W=1 fixes
- Documentation fixes
- IH timestamp fix
- Misc power fixes
- IP discovery fixes
- Large driver documentation updates
- Multi-GPU memory use reductions
- Misc display fixes and cleanups
- Add new SMU debug option

amdkfd:
- SVM fixes

radeon:
- Fix typo in comment

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216202731.5900-1-alexander.deucher@amd.com
2021-12-23 11:55:28 +10:00
Philip Yang
4a74c38cd6 drm/amdgpu: Detect if amdgpu in IOMMU direct map mode
If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode,
set adev->ram_is_direct_mapped flag which will be used to optimize
memory usage for multi GPU mappings.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Lang Yu
34f3a4a98b drm/amdgpu: introduce a kind of halt state for amdgpu device
It is useful to maintain error context when debugging
SW/FW issues. Introduce amdgpu_device_halt() for this
purpose. It will bring hardware to a kind of halt state,
so that no one can touch it any more.

Compare to a simple hang, the system will keep stable
at least for SSH access. Then it should be trivial to
inspect the hardware state and see what's going on.

v2:
 - Set adev->no_hw_access earlier to avoid potential crashes.(Christian)

Suggested-by: Christian Koenig <christian.koenig@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.co>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Isabella Basso
e105b64a36 drm/amdgpu: fix location of prototype for amdgpu_kms_compat_ioctl
This fixes the warning below by changing the prototype to a location
that's actually included by the .c files that call
amdgpu_kms_compat_ioctl:

 warning: no previous prototype for ‘amdgpu_kms_compat_ioctl’
 [-Wmissing-prototypes]
 37 | long amdgpu_kms_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
    |      ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Lijo Lazar
fe9c5c9aff drm/amdgpu: Use MAX_HWIP instead of HW_ID_MAX
HW_ID_MAX considers HWID of all IPs, far more than what amdgpu uses.
amdgpu tracks only the IPs defined by amd_hw_ip_block_type whose max
is MAX_HWIP.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:04:10 -05:00
Thomas Zimmermann
a713ca234e Merge drm/drm-next into drm-misc-next
Backmerging from drm/drm-next for v5.16-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-11-18 09:36:39 +01:00
Christian König
fa78e367a2 drm/amdgpu: stop getting excl fence separately
Just grab all fences for the display flip in one go.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028132630.2330-2-christian.koenig@amd.com
2021-11-17 14:32:27 +01:00
Kent Russell
68daadf3d6 drm/amdgpu: Add kernel parameter support for ignoring bad page threshold
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).

If the bad_page_threshold kernel parameter is set to -2,
continue to initialize the GPU, while printing a warning to dmesg that
this action has been done

v2: squash in Luben's fix to restore RAS info reporting

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Guchun Chen
e17e27f9bd drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:56 -04:00
Christian König
c8365dbda0 drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"
This reverts commit 728e7e0cd6.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:36 -04:00
Alex Deucher
1d789535a0 drm/amdgpu: convert IP version array to include instances
Allow us to query instances versions more cleanly.

Instancing support is not consistent unfortunately. SDMA is a
good example.  Sienna cichlid has 4 total SDMA instances, each
enumerated separately (HWIDs 42, 43, 68, 69).  Arcturus has 8
total SDMA instances, but they are enumerated as multiple
instances of the same HWIDs (4x HWID 42, 4x HWID 43).  UMC
is another example.  On most chips there are multiple
instances with the same HWID.  This allows us to support both
forms.

v2: rebase
v3: clarify instancing support

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5eceb20192 drm/amdgpu: add VCN1 hardware IP
So we can store the VCN IP revision for each instance of VCN.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5f93148955 drm/amdgpu: add DCI HWIP
So we can track grab the appropriate DCE info out of the
IP discovery table.  This is a separare IP from DCN.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
1534db5549 drm/amdgpu: add XGMI HWIP
So we can track grab the appropriate XGMI info out of the
IP discovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
5f52e9a780 drm/amdgpu: store HW IP versions in the driver structure
So we can check the IP versions directly rather than using
asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher
81d1bf01e4 drm/amdgpu: add debugfs access to the IP discovery table
Useful for debugging and new asic validation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Ernst Sjöstrand
67a44e6598 drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
Seems like newer cards can have even more instances now.
Found by UBSAN: array-index-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29
index 8 is out of range for type 'uint32_t *[8]'

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697
Cc: stable@vger.kernel.org
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 16:14:41 -04:00
John Clements
3907c49218 drm/amdgpu: Add driver infrastructure for MCA RAS
Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:18 -04:00
Evan Quan
0d8318e112 drm/amd/pm: drop the unnecessary intermediate percent-based transition
Currently, the readout of fan speed pwm is transited into percent-based
and then pwm-based. However, the transition into percent-based is totally
unnecessary and make the final output less accurate.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Ryan Taylor
84ec374bd5 drm/amdgpu: create amdgpu_vkms (v4)
Modify the VKMS driver into an api that dce_virtual can use to create
virtual displays that obey drm's atomic modesetting api.

v2: Made local functions static.

v3: Switched vkms_output kzalloc for kcalloc.
    Cleanup patches by moving display mode fixes to this patch.

v4: Update atomic_check and atomic_update to comply with new kms api.

Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Pratik Vishwakarma
d0260f62ee drm/amdgpu: Rename amdgpu_acpi_is_s0ix_supported
Rename amdgpu_acpi_is_s0ix_supported to better explain
functionality by renaming to amdgpu_acpi_is_s0ix_active

Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-27 12:04:26 -04:00
Kevin Wang
048af66be7 drm/amdgpu: split amdgpu_device_access_vram() into two small parts
split amdgpu_device_access_vram()
1. amdgpu_device_mm_access(): using MM_INDEX/MM_DATA to access vram
2. amdgpu_device_aper_access(): using vram aperature to access vram (option)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:58:13 -04:00
Veerabadhran Gopalakrishnan
9075096b09 amdgpu/nv.c - Optimize code for video codec support structure
Optimized the code for codec info structure initialization

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:46:09 -04:00
Eric Huang
810085ddb7 drm/amdgpu: Don't flush/invalidate HDP for APUs and A+A
Integrate two generic functions to determine if HDP
flush is needed for all Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:38 -04:00
Sathishkumar S
30d95a37f4 drm/amdgpu: attr to control SS2.0 bias level (v2)
add sysfs attr to read/write smartshift bias level.
document smartshift_bias sysfs attr.

V2: add attr to amdgpu_device_attrs and use attr_update (Lijo)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:00 -04:00
Dave Airlie
5745d647d5 Merge tag 'amd-drm-next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-06-02:

amdgpu:
- GC/MM register access macro clean up for SR-IOV
- Beige Goby updates
- W=1 Fixes
- Aldebaran fixes
- Misc display fixes
- ACPI ATCS/ATIF handling rework
- SR-IOV fixes
- RAS fixes
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes for suspend/resume
- More buffer object subclassing work
- Add new INFO query for additional vbios information
- Add new placement for preemptable SG buffers

amdkfd:
- Misc fixes

radeon:
- W=1 Fixes
- Misc cleanups

UAPI:
- Add new INFO query for additional vbios information
  Useful for debugging vbios related issues.  Proposed umr patch:
  https://patchwork.freedesktop.org/patch/433297/
- 16bpc fixed point format support
  IGT test:
  https://lists.freedesktop.org/archives/igt-dev/2021-May/031507.html
  Proposed Vulkan patch:
  a25d480207
- Add a new GEM flag which is only used internally in the kernel driver.  Userspace
  is not allowed to set it.

drm:
- 16bpc fixed point format fourcc

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602214009.4553-1-alexander.deucher@amd.com
2021-06-04 06:13:57 +10:00
Sathishkumar S
3fa8f89d72 drm/amdgpu: enable smart shift on dGPU (v5)
enable smart shift on dGPU if it is part of HG system and
the platform supports ATCS method to handle power shift.

V2: avoid psc updates in baco enter and exit (Lijo)
    fix alignment (Shashank)
V3: rebased on unified ATCS handling. (Alex)
V4: check for return value and warn on failed update (Shashank)
    return 0 if device does not support smart shift.  (Lizo)
V5: rebased on ATPX/ATCS structures global (Alex)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Sathishkumar S
16eb48c62b drm/amdgpu: support atcs method powershift (v4)
add support to handle ATCS method for power shift control.
used to communicate dGPU device state to SBIOS.

V2: use defined acpi func for checking psc support (Lijo)
    fix alignment (Shashank)
V3: rebased on unified ATCS handling (Alex)
V4: rebased on ATPX/ATCS structures global (Alex)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:36:48 -04:00
Alex Deucher
f9b7f3703f drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)
They are global ACPI methods, so maybe the structures
global in the driver. This simplified a number of things
in the handling of these methods.

v2: reset the handle if verify interface fails (Lijo)
v3: fix compilation when ACPI is not defined.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:52 -04:00
Andrey Grodzovsky
7afefb81b7 drm/amdgpu: Rename flag which prevents HW access
Make it's name not feature but function descriptive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-1-andrey.grodzovsky@amd.com
2021-05-25 11:53:52 -04:00
Thomas Zimmermann
304ba5dca4 Merge drm/drm-next into drm-misc-next
Backmerging from drm/drm-next to the patches for AMD devices
for v5.14.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-05-22 07:17:05 +02:00
Alex Deucher
77bf762f8b drm/amdgpu/acpi: unify ATCS handling (v3)
Treat it like ATIF and check both the dGPU and APU for
the method.  This is required because ATCS may be hung
off of the APU in ACPI on A+A systems.

v2: add back accidently removed ACPI handle check.
v3: Fix incorrect atif check (Colin)
    Fix uninitialized variable (Colin)

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 18:03:01 -04:00
Peng Ju Zhou
a5504e9ad4 drm/amdgpu: Indirect register access for Navi12 sriov
This patch series are used for GC/MMHUB(part)/IH_RB_CNTL
indirect access in the SRIOV environment.

There are 4 bits, controlled by host, to control
if GC/MMHUB(part)/IH_RB_CNTL indirect access enabled.
(one bit is master bit controls other 3 bits)

For GC registers, changing all the register access from MMIO to
RLC and use RLC as the default access method in the full access time.

For partial MMHUB registers, changing their access from MMIO to
RLC in the full access time, the remaining registers
keep the original access method.

For IH_RB_CNTL register, changing it's access from MMIO to PSP.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:06 -04:00
Dave Airlie
c99c4d0ca5 Merge tag 'amd-drm-next-5.14-2021-05-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-05-19:

amdgpu:
- Aldebaran updates
- More LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- RAS fixes
- PCIe ASPM support
- Modifier fixes
- Enable TMZ on Renoir
- Buffer object code cleanup
- Display overlay fixes
- Initial support for multiple eDP panels
- Initial SR-IOV support for Aldebaran
- DP link training refactor
- Misc code cleanups and bug fixes
- SMU regression fixes for variable sized arrays
- MAINTAINERS fixes for amdgpu

amdkfd:
- Initial SR-IOV support for Aldebaran
- Topology fixes
- Initial HMM SVM support
- Misc code cleanups and bug fixes

radeon:
- Misc code cleanups and bug fixes
- SMU regression fixes for variable sized arrays
- Flickering fix for Oland with multiple 4K displays

UAPI:
- amdgpu: Drop AMDGPU_GEM_CREATE_SHADOW flag.
  This was always a kernel internal flag and userspace use of it has always been blocked.
  It's no longer needed so remove it.
- amdkgd: HMM SVM support
  Overview: https://patchwork.freedesktop.org/series/85562/
  Porposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/fxkamd/hmm-wip

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520031258.231896-1-alexander.deucher@amd.com
2021-05-21 15:29:40 +10:00
Andrey Grodzovsky
72c8c97b15 drm/amdgpu: Split amdgpu_device_fini into early and late
Some of the stuff in amdgpu_device_fini such as HW interrupts
disable and pending fences finilization must be done right away on
pci_remove while most of the stuff which relates to finilizing and
releasing driver data structures can be kept until
drm_driver.release hook is called, i.e. when the last device
reference is dropped.

v4: Change functions prefix early->hw and late->sw

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-3-andrey.grodzovsky@amd.com
2021-05-19 23:45:49 -04:00
Likun GAO
7bd939d04d drm/amdgpu: add judgement when add ip blocks (v2)
Judgement whether to add an sw ip according to the harvest info.

v2: fix indentation (Alex)

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-19 22:29:22 -04:00
Dave Airlie
3a3ca72653 drm-misc-next for 5.14:
UAPI Changes:
 
  * drm: Disable connector force-probing for non-master clients
  * drm: Enforce consistency between IN_FORMATS property and cap + related
    driver cleanups
  * drm/amdgpu: Track devices, process info and fence info via
    /proc/<pid>/fdinfo
  * drm/ioctl: Mark AGP-related ioctls as legacy
  * drm/ttm: Provide tt_shrink file to trigger shrinker via debugfs;
 
 Cross-subsystem Changes:
 
  * fbdev/efifb: Special handling of non-PCI devices
  * fbdev/imxfb: Fix error message
 
 Core Changes:
 
  * drm: Add connector helper to attach HDR-metadata property and convert
    drivers
  * drm: Add connector helper to compare HDR-metadata and convert drivers
  * drm: Add conenctor helper to attach colorspace property
  * drm: Signal colorimetry in HDMI infoframe
  * drm: Support pitch for destination buffers; Add blitter function
    with generic format conversion
  * drm: Remove struct drm_device.pdev and update legacy drivers
  * drm: Remove obsolete DRM_KMS_FB_HELPER config option in core and drivers
  * drm: Remove obsolete drm_pci_alloc/drm_pci_free
 
  * drm/aperture: Add helpers for aperture ownership and convert drivers, replaces rsp fbdev helpers
 
  * drm/agp: Mark DRM AGP code as legacy and convert legacy drivers
 
  * drm/atomic-helpers: Cleanups
 
  * drm/dp: Handle downstream port counts of 0 correctly; AUX channel fixes; Use
    drm_err_*/drm_dbg_*(); Cleanups
 
  * drm/dp_dual_mode: Use drm_err_*/drm_dbg_*()
 
  * drm/dp_mst: Use drm_err_*/drm_dbg_*(); Use Extended Base Receiver Capability DPCD space
 
  * drm/gem-ttm-helper: Provide helper for dumb_map_offset and convert drivers
 
  * drm/panel: Use sysfs_emit; panel-simple: Use runtime PM, Power up panel
               when reading EDID, Cache EDID, Cleanups;
               Lms397KF04: DT bindings
 
  * drm/pci: Mark AGP helpers as legacy
 
  * drm/print: Handle NULL for DRM devices gracefully
 
  * drm/scheduler: Change scheduled fence track
 
  * drm/ttm: Don't count SG BOs against pages_limit; Warn about freeing pinned
             BOs; Fix error handling if no BO can be swapped out; Move special
             handling of non-GEM drivers into vmwgfx; Move page_alignment into
             the BO; Set drm-misc as TTM tree in MAINTAINERS; Cleanup
 	    ttm_agp_backend; Add ttm_sys_manager for system domain; Cleanups
 
 Driver Changes:
 
  * drm: Don't set allow_fb_modifiers explictly in drivers
 
  * drm/amdgpu: Pin/unpin fixes wrt to TTM; Use bo->base.size instead of
    mem->num_pages
 
  * drm/ast: Use managed pcim_iomap(); Fix EDID retrieval with DP501
 
  * drm/bridge: MHDP8546: HDCP support + DT bindings, Register DP AUX channel
    with userspace; Sil8620: Fix module dependencies; dw-hdmi: Add option to
    not load CEC driver; Fix stopping in drm_bridge_chain_pre_enable();
    Ti-sn65dsi86: Fix refclk handling, Break GPIO and MIPI-to-eDP into
    subdrivers, Use pm_runtime autosuspend, cleanups; It66121: Add
    driver + DT bindings; Adv7511: Support I2S IEC958 encoding; Anx7625: fix
    power-on delay; Nwi-dsi: Modesetting fixes; Cleanups
 
  * drm/bochs: Support screen blanking
 
  * drm/gma500: Cleanups
 
  * drm/gud: Cleanups
 
  * drm/i915: Use correct max source link rate for MST
 
  * drm/kmb: Cleanups
 
  * drm/meson: Disable dw-hdmi CEC driver
 
  * drm/nouveau: Pin/unpin fixes wrt to TTM; Use bo->base.size instead of
    mem->num_pages; Register AUX adapters after their connectors
 
  * drm/qxl: Fix shadow BO unpin
 
  * drm/radeon: Duplicate some DRM AGP code to uncouple from legacy drivers
 
  * drm/simpledrm: Add a generic DRM driver for simple-framebuffer devices
 
  * drm/tiny: Fix log spam if probe function gets deferred
 
  * drm/vc4: Add support for HDR-metadata property; Cleanups
 
  * drm/virtio: Create dumb BOs as guest blobs;
 
  * drm/vkms: Use managed drmm_universal_plane_alloc(); Add XRGB plane
    composition; Add overlay support
 
  * drm/vmwgfx: Enable console with DRM_FBDEV_EMULATION; Fix CPU updates
    of coherent multisample surfaces; Remove reservation semaphore; Add
    initial SVGA3 support; Support amd64; Use 1-based IDR; Use min_t();
    Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmCb4QsACgkQaA3BHVML
 eiPwUgf/eTodvGQyB0cjv1vyHlttLo2t9k4QBO0pzVH0DJokl/pMpY0CuS8A/afW
 RmKLYod3TQb2QeEqWjocPxcYrh5WCbjdDZlmSb+pF+qau4b4s09SzIogK3lO1Nve
 9N1WVa7C3JC3k3XYexpeZ78RtoNN0UboMKDfbZODnn1PtjVtOp7Nbb92trRuB7y+
 B72A8RQMYB5IywVln9+lzLYcrmpHZbk/sLmC5pIGBPcTyhn0TFinUYlg9iq1PvNM
 fIqvPvXwxDVRO6hgnxZWKrdvQKCOcl5KFnk4E6H+ZkgWJ+yuAWI9r2N9TeelcW+M
 jlCHreWEHhuTPkr/ypnVmO8kuEgSFA==
 =G9ip
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-05-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.14:

UAPI Changes:

 * drm: Disable connector force-probing for non-master clients
 * drm: Enforce consistency between IN_FORMATS property and cap + related
   driver cleanups
 * drm/amdgpu: Track devices, process info and fence info via
   /proc/<pid>/fdinfo
 * drm/ioctl: Mark AGP-related ioctls as legacy
 * drm/ttm: Provide tt_shrink file to trigger shrinker via debugfs;

Cross-subsystem Changes:

 * fbdev/efifb: Special handling of non-PCI devices
 * fbdev/imxfb: Fix error message

Core Changes:

 * drm: Add connector helper to attach HDR-metadata property and convert
   drivers
 * drm: Add connector helper to compare HDR-metadata and convert drivers
 * drm: Add conenctor helper to attach colorspace property
 * drm: Signal colorimetry in HDMI infoframe
 * drm: Support pitch for destination buffers; Add blitter function
   with generic format conversion
 * drm: Remove struct drm_device.pdev and update legacy drivers
 * drm: Remove obsolete DRM_KMS_FB_HELPER config option in core and drivers
 * drm: Remove obsolete drm_pci_alloc/drm_pci_free

 * drm/aperture: Add helpers for aperture ownership and convert drivers, replaces rsp fbdev helpers

 * drm/agp: Mark DRM AGP code as legacy and convert legacy drivers

 * drm/atomic-helpers: Cleanups

 * drm/dp: Handle downstream port counts of 0 correctly; AUX channel fixes; Use
   drm_err_*/drm_dbg_*(); Cleanups

 * drm/dp_dual_mode: Use drm_err_*/drm_dbg_*()

 * drm/dp_mst: Use drm_err_*/drm_dbg_*(); Use Extended Base Receiver Capability DPCD space

 * drm/gem-ttm-helper: Provide helper for dumb_map_offset and convert drivers

 * drm/panel: Use sysfs_emit; panel-simple: Use runtime PM, Power up panel
              when reading EDID, Cache EDID, Cleanups;
              Lms397KF04: DT bindings

 * drm/pci: Mark AGP helpers as legacy

 * drm/print: Handle NULL for DRM devices gracefully

 * drm/scheduler: Change scheduled fence track

 * drm/ttm: Don't count SG BOs against pages_limit; Warn about freeing pinned
            BOs; Fix error handling if no BO can be swapped out; Move special
            handling of non-GEM drivers into vmwgfx; Move page_alignment into
            the BO; Set drm-misc as TTM tree in MAINTAINERS; Cleanup
	    ttm_agp_backend; Add ttm_sys_manager for system domain; Cleanups

Driver Changes:

 * drm: Don't set allow_fb_modifiers explictly in drivers

 * drm/amdgpu: Pin/unpin fixes wrt to TTM; Use bo->base.size instead of
   mem->num_pages

 * drm/ast: Use managed pcim_iomap(); Fix EDID retrieval with DP501

 * drm/bridge: MHDP8546: HDCP support + DT bindings, Register DP AUX channel
   with userspace; Sil8620: Fix module dependencies; dw-hdmi: Add option to
   not load CEC driver; Fix stopping in drm_bridge_chain_pre_enable();
   Ti-sn65dsi86: Fix refclk handling, Break GPIO and MIPI-to-eDP into
   subdrivers, Use pm_runtime autosuspend, cleanups; It66121: Add
   driver + DT bindings; Adv7511: Support I2S IEC958 encoding; Anx7625: fix
   power-on delay; Nwi-dsi: Modesetting fixes; Cleanups

 * drm/bochs: Support screen blanking

 * drm/gma500: Cleanups

 * drm/gud: Cleanups

 * drm/i915: Use correct max source link rate for MST

 * drm/kmb: Cleanups

 * drm/meson: Disable dw-hdmi CEC driver

 * drm/nouveau: Pin/unpin fixes wrt to TTM; Use bo->base.size instead of
   mem->num_pages; Register AUX adapters after their connectors

 * drm/qxl: Fix shadow BO unpin

 * drm/radeon: Duplicate some DRM AGP code to uncouple from legacy drivers

 * drm/simpledrm: Add a generic DRM driver for simple-framebuffer devices

 * drm/tiny: Fix log spam if probe function gets deferred

 * drm/vc4: Add support for HDR-metadata property; Cleanups

 * drm/virtio: Create dumb BOs as guest blobs;

 * drm/vkms: Use managed drmm_universal_plane_alloc(); Add XRGB plane
   composition; Add overlay support

 * drm/vmwgfx: Enable console with DRM_FBDEV_EMULATION; Fix CPU updates
   of coherent multisample surfaces; Remove reservation semaphore; Add
   initial SVGA3 support; Support amd64; Use 1-based IDR; Use min_t();
   Cleanups

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YJvkD523evviED01@linux-uq9g.fritz.box
2021-05-19 09:22:56 +10:00
Likun GAO
83a0b86391 drm/amdgpu: add judgement when add ip blocks (v2)
Judgement whether to add an sw ip according to the harvest info.

v2: fix indentation (Alex)

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-13 10:46:58 -04:00
Luben Tuikov
8ab0d6f030 drm/amdgpu: Rename to ras_*_enabled
Rename,
  ras_hw_supported --> ras_hw_enabled, and
  ras_features     --> ras_enabled,
to show that ras_enabled is a subset of
ras_hw_enabled, which itself is a subset
of the ASIC capability.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:08:12 -04:00
Luben Tuikov
e509965e58 drm/amdgpu: Move up ras_hw_supported
Move ras_hw_supported into struct amdgpu_dev.
The dependency is:
struct amdgpu_ras <== struct amdgpu_dev <== ASIC,
read as "struct amdgpu_ras depends on struct
amdgpu_dev, which depends on the hardware."

This can be loosely understood as, "if RAS is
supported, which is property of the ASIC (struct
amdgpu_dev), then we can access struct
amdgpu_ras."

v2: Fix a typo: must binary AND in ternary cond
    in amdgpu_ras.c

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:08:06 -04:00
Roy Sun
8744425411 drm/amdgpu: Add show_fdinfo() interface
Tracking devices, process info and fence info using
/proc/pid/fdinfo

Signed-off-by: David M Nieto <David.Nieto@amd.com>
Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426062701.39732-2-Roy.Sun@amd.com
2021-05-05 09:26:53 +02:00
Jude Shih
f066af882b drm/amdgpu: add DMUB outbox event IRQ source define/complete/debug flag
[Why & How]
We use outbox interrupt that allows us to do the AUX via DMUB
Therefore, we need to add some irq source related definition
in the header files;

Signed-off-by: Jude Shih <shenshih@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:54:42 -04:00
Lijo Lazar
5d89bb2d2f drm/amdgpu: Make set PG/CG state functions public
Expose PG/CG set states functions for other clients

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:46:22 -04:00
Lijo Lazar
04442bf70d drm/amdgpu: Add reset control handling to reset workflow
This prefers reset control based handling if it's implemented
for a particular ASIC. If not, it takes the legacy path. It uses
the legacy method of preparing environment (job, scheduler tasks)
and restoring environment.

v2: remove unused variable (Alex)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:46:14 -04:00
Lijo Lazar
e071dce38f drm/amdgpu: Add reset control to amdgpu_device
v1: Add generic amdgpu_reset_control to handle different types of resets. It
may be added at device, hive or ip level. Each reset control has a list
of handlers associated with it to handle different types of reset. Reset
control is responsible for choosing the right handler given a particular
reset context.

Handler objects may implement a set of functions on how to handle a
particular type of reset.

prepare_env = Prepare environment/software context (not used currently).
prepare_hwcontext = Prepare hardware context for the reset.
perform_reset = Perform the type of reset.
restore_hwcontext = Restore the hw context after reset.
restore_env = Restore the environment after reset (not used currently).

Reset context carries the context of reset, as of now this is based on
the parameters used for current set of resets.

v2: Fix coding style

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:46:08 -04:00
Wan Jiabing
32c811b097 drivers: gpu: Remove duplicate include of amdgpu_hdp.h
amdgpu_hdp.h has been included at line 91, so remove
the duplicate include.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:42:16 -04:00
Alex Deucher
62498733d4 drm/amdgpu: rework S3/S4/S0ix state handling
Set flags at the top level pmops callbacks to track
state.  This cleans up the current set of flags and
properly handles S4 on S0ix capable systems.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:36:29 -04:00
Alex Deucher
b98c6299ef drm/amdgpu: disentangle HG systems from vgaswitcheroo
There's no need to keep vgaswitcheroo around for HG
systems.  They don't use muxes and their power control
is handled via ACPI.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:36:07 -04:00
Dennis Li
56b53c0b5a drm/amdgpu: add codes to capture invalid hardware access when recovery
When recovery thread has begun GPU reset, there should be not other
threads to access hardware, otherwise system randomly hang.

v2 (chk): rewritten from scratch, use trylock and lockdep instead of
hand wiring the logic.

v3: add in_irq check

v4: change to check in_task

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:34:53 -04:00
Daniel Vetter
2cbcb78c9e Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.13-2021-03-23:

amdgpu:
- Debugfs cleanup
- Various cleanups and spelling fixes
- Flexible array cleanups
- Initial AMD Freesync HDMI
- Display fixes
- 10bpc dithering improvements
- Display ASSR support
- Clean up and unify powerplay and swsmu interfaces
- Vangogh fixes
- Add SMU gfx busy queues for RV/PCO
- PCIE DPM fixes
- S0ix fixes
- GPU metrics data fixes
- DCN secure display support
- Backlight type override
- Add initial support for Aldebaran
- RAS fixes
- Prime fixes for A+A systems
- Reset fixes
- Initial resource cursor support
- Drop legacy IO BAR requirements
- Various power fixes

amdkfd:
- MMU notifier fixes
- APU fixes

radeon:
- Debugfs cleanups
- Flexible array cleanups

UAPI:
- amdgpu: Add a new INFO ioctl interface to query video capabilities
  rather than hardcoding them in userspace.  This allows us to provide
  fine grained asic capabilities (e.g., if a particular part is
  bandwidth limited, we can limit the capabilities).  Proposed userspace:
  https://gitlab.freedesktop.org/leoliu/drm/-/commits/info_video_caps
  https://gitlab.freedesktop.org/leoliu/mesa/-/commits/info_video_caps
- amdkfd: bump the driver version.  There was a problem with reporting
  some RAS features on older versions of the driver. Proposed userspace:
  7cdd63475c

Danvet: A bunch of conflicts all over, but it seems to compile ... I
did put the call to dc_allow_idle_optimizations() on a single line
since it looked a bit too jarring to be left alone.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210324040147.1990338-1-alexander.deucher@amd.com
2021-03-26 15:53:21 +01:00
Nikola Cornij
a85ba00538 drm/amdgpu/display: re-enable freesync video patches
Since this is a "revert of a revert", the end effect is that freesync
video is back to its original state, the way it was before the first
revert.

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:36:10 -04:00
Alex Deucher
e99d2eaafd drm/amdgpu: drop legacy IO bar support
It was leftover from radeon where it was required for some
specific old hardware.  It hasn't been required for ages
and the driver already falls back to MMIO when legacy IO
is not available.  Legacy IO also seems to be problematic on
on some thunderbolt devices.  Drop it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
2021-03-23 23:31:17 -04:00
shaoyunl
e3c1b0712f drm/amdgpu: Reset the devices in the XGMI hive duirng probe
In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices
without sync to each other. This could cause device hang since for XGMI configuration, all the devices
within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue
by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't
do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init
to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:38 -04:00
shaoyunl
655ce9cb13 drm/amdgpu: Add reset_list for device list used for reset
The gmc.xgmi.head list originally is designed for device list in the XGMI hive. Mix use it
for reset purpose will prevent the reset function to adjust XGMI device list which is required
in next change

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:35 -04:00
Aurabindo Pillai
c0ea73a4ad Revert freesync video patches temporarily
This temporarily reverts freesync video patches since it causes regression with
eDP displays. This patch is a squashed revert of the following patches:

6f59f229f8 ("drm/amd/display: Skip modeset for front porch change")
d10cd527f5 ("drm/amd/display: Add freesync video modes based on preferred modes")
0eb1af2e82 ("drm/amd/display: Add module parameter for freesync video mode")

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Anson Jacob <anson.jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:04 -04:00
Dennis Li
88f8575bca drm/amdgpu: enable watchdog feature for SQ of aldebaran
SQ's watchdog timer monitors forward progress, a mask of which waves
caused the watchdog timeout is recorded into ras status registers and
then trigger a system fatal error event.

v2:
1. change *query_timeout_status to *query_sq_timeout_status.
2. move query_sq_timeout_status into amdgpu_ras_do_recovery.
3. add module parameters to enable/disable fatal error event and modify
the watchdog timer.

v3:
1. remove unused parameters of *enable_watchdog_timer

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:52 -04:00
Lijo Lazar
8738a82b37 drm/amd/amdgpu: Add smu_pptable module parameter
Temporarily add smu_pptable module parameter for aldebaran.This is used
to force soft PPTable use overriding any VBIOS PPTable.

Signed-off-by: Lijo Lazar <Lijo.Lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:14 -04:00
Feifei Xu
5c03e5843e drm/amdgpu:add smu mode1/2 support for aldebaran
Use MSG_GfxDriverReset for mode reset and retire MSG_Mode1Reset.
Centralize soc15_asic_mode1_reset() and nv_asic_mode1_reset()functions.
Add mode2_reset_is_support() for smu->ppt_funcs.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:49 -04:00
Dave Airlie
51c3b916a4 drm-misc-next for 5.13:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - %p4cc printk format modifier
   - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
     helpers to take the drm_commit_state structure
   - dma-buf: heaps rework to return a struct dma_buf
   - simple-kms: Add plate state helpers
   - ttm: debugfs support, removal of sysfs
 
 Driver Changes:
   - Convert drivers to shadow plane helpers
   - arc: Move to drm/tiny
   - ast: cursor plane reworks
   - gma500: Remove TTM and medfield support
   - mxsfb: imx8mm support
   - panfrost: MMU IRQ handling rework
   - qxl: rework to better handle resources deallocation, locking
   - sun4i: Add alpha properties for UI and VI layers
   - vc4: RPi4 CEC support
   - vmwgfx: doc cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYD9fUAAKCRDj7w1vZxhR
 xcRLAQDdWKgUNeHnkKCUNh3ewPGabxvc6KQtPqAxcFv0I3ZmWgEAlfTS0pRLdyzQ
 ITRBL0T0S7cIyqnDULZkwfqB6Q8D0ws=
 =hPCS
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-03-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.13:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - %p4cc printk format modifier
  - atomic: introduce drm_crtc_commit_wait, rework atomic plane state
    helpers to take the drm_commit_state structure
  - dma-buf: heaps rework to return a struct dma_buf
  - simple-kms: Add plate state helpers
  - ttm: debugfs support, removal of sysfs

Driver Changes:
  - Convert drivers to shadow plane helpers
  - arc: Move to drm/tiny
  - ast: cursor plane reworks
  - gma500: Remove TTM and medfield support
  - mxsfb: imx8mm support
  - panfrost: MMU IRQ handling rework
  - qxl: rework to better handle resources deallocation, locking
  - sun4i: Add alpha properties for UI and VI layers
  - vc4: RPi4 CEC support
  - vmwgfx: doc cleanup

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-16 17:08:46 +10:00
Takashi Iwai
7a46f05e5e drm/amd/display: Add a backlight module option
There seem devices that don't work with the aux channel backlight
control.  For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly.  As
default, the aux support is detected by the hardware capability.

v2: make the backlight option generic in case we add future
backlight types (Alex)

BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-10 16:12:15 -05:00
Takashi Iwai
7c20984795 drm/amd/display: Add a backlight module option
There seem devices that don't work with the aux channel backlight
control.  For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly.  As
default, the aux support is detected by the hardware capability.

v2: make the backlight option generic in case we add future
backlight types (Alex)

BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:13:55 -05:00
Leo (Hanghong) Ma
c79fe9b436 drm/amdgpu: add DMUB trace event IRQ source define
[Why & How]
We use DMCUB outbox0 interrupt to log DMCUB trace buffer events
as Linux kernel traces, so need to add some irq source related
defination in the header files;

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:10:49 -05:00
Alex Deucher
6f786950b1 drm/amdgpu/codec: drop the internal codec index
And just use the ioctl index.  They are the same.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Alex Deucher
9269bf1868 drm/amdgpu: add asic callback for querying video codec info (v3)
This will be used by a new INFO ioctl query to fetch the decode
and encode capabilities from the kernel driver rather than
hardcoding them in mesa.  This gives us more fine grained control
of capabilities using information that is only availabl in the
kernel (e.g., platform limitations or bandwidth restrictions).

v2: reorder the codecs to better align with mesa
v3: add max_pixels_per_frame to handle the portrait case

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:49 -05:00
Aurabindo Pillai
0eb1af2e82 drm/amd/display: Add module parameter for freesync video mode
[Why]
This option shall be opt-in by default since it is a temporary solution
until long term solution is agreed upon which may require userspace interface
changes. This feature give the user a seamless experience when freesync aware
programs (media players for instance) switches to a compatible freesync mode
when playing videos. Enabling this feature also have the potential side effect
of causing higher power consumption due to running a mode with lower resolution
and base clock frequency with the highest base clock supported on the monitor as
per its advertised modes. There has been precedent of manufacturing modes in the
kernel. In AMDGPU, the existing usage are for common modes and scaling modes.
Other driver have a similar approach as well.

[How]
Adds a module parameter to enable freesync video mode modeset
optimization. Enabling this mode allows the driver to skip a full modeset when a
freesync compatible mode is requested by the userspace. This parameter will also
add some additional modes that are within the connected monitor's VRR range
corresponding to common video modes, which media players can use for a seamless
experience while making use of freesync.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-26 17:23:37 -05:00
Prike Liang
b092b19602 drm/amdgpu: fix shutdown and poweroff process failed with s0ix
In the shutdown and poweroff opt on the s0i3 system we still need
un-gate the gfx clock gating and power gating before destory amdgpu device.

Fixes: 628c36d7b2 ("drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1499
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-24 09:48:46 -05:00
Prike Liang
b00978de90 drm/amdgpu: fix shutdown and poweroff process failed with s0ix
In the shutdown and poweroff opt on the s0i3 system we still need
un-gate the gfx clock gating and power gating before destory amdgpu device.

Fixes: 628c36d7b2 ("drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1499
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-24 09:28:55 -05:00
Nirmoy Das
98d28ac2f5 drm/amdgpu: do not use drm middle layer for debugfs
Use debugfs API directly instead of drm middle layer.

This also includes following debugfs file output changes:
1 amdgpu_evict_vram/amdgpu_evict_gtt output will not contain any braces.
  e.g. (0) --> 0
2 amdgpu_gpu_recover output will print return value of
  amdgpu_device_gpu_recover() instead of not so important "gpu recover"
  message.

v2: * checkpatch.pl: use '0444' instead of S_IRUGO.
    * remove S_IFREG from mode.
    * remove mode variable.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Nirmoy Das
88293c03c8 drm/amdgpu: do not keep debugfs dentry
Cleanup unnecessary debugfs dentries and surrounding functions.

v3: remove return value check for debugfs_create_file()
v2: remove ttm_debugfs_entries array.
    do not init variables.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-18 16:43:09 -05:00
Jiawei Gu
006cc1a213 drm/amdgpu: extend MAX_KIQ_REG_TRY to 1000
Extend retry times of KIQ to avoid starvation situation caused by
long time full access of GPU by other VFs.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:48:49 -05:00
Alex Deucher
af484df800 drm/amdgpu: add generic pci reset as an option
This allows us to use generic PCI reset mechanisms (FLR, SBR) as
a reset mechanism to verify that the generic PCI reset mechanisms
are working properly.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:54 -05:00
Wayne Lin
11f1a5538b drm/amdgpu: Add otg vertical IRQ Source
[Why & How]
In order to get appropriate timing for registers which
read/write is vertical line sensitive, add new IRQ source variable.
This interrupt is triggered by specific vertical line,

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:55 -05:00
Christian König
8af8a109b3 drm/ttm: device naming cleanup
Rename ttm_bo_device to ttm_device.
Rename ttm_bo_driver to ttm_device_funcs.
Rename ttm_bo_global to ttm_global.

Move global and device related functions to ttm_device.[ch].

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/415222/
2021-01-21 14:51:45 +01:00
Dave Airlie
2ce542e517 Merge tag 'amd-drm-next-5.12-2021-01-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.12-2021-01-08:

amdgpu:
- Rework IH ring handling on vega and navi
- Rework HDP handling for vega and navi
- swSMU documenation updates
- Overdrive support for Sienna Cichlid and newer asics
- swSMU updates for vangogh
- swSMU updates for renoir
- Enable FP16 on DCE8-11
- Misc code cleanups and bug fixes

radeon:
- Fixes for platforms that can't access PCI resources correctly
- Misc code cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108221811.3868-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-01-15 09:05:32 +10:00
Daniel Vetter
18589d74f4 drm-misc-next for v5.12:
UAPI Changes:
 - Not necessarily one, but we document that userspace needs to force probe connectors.
 
 Cross-subsystem Changes:
 - Require FB_ATY_CT for aty on sparc64.
 - video: Fix documentation, and a few compiler warnings.
 - Add devicetree bindings for DP connectors.
 - dma-buf: Update kernel-doc, and add might_lock for resv objects in begin/end_cpu_access.
 
 Core Changes:
 - ttm: Warn when releasing a pinned bo.
 - ttm: Cleanup bo size handling.
 - cma-helper: Remove prime infix, and implement mmap as GEM CMA functions.
 - Split drm_prime_sg_to_page_addr_arrays into 2 functions.
 - Add a new api to install irq using devm.
 - Update panel kerneldoc to inline style.
 - Add DP support to drm/bridge.
 - Assorted small fixes to ttm, fb-helper, scheduler.
 - Add atomic_commit_setup function callback.
 - Automatically use the atomic gamma_set, instead of forcing drivers to declare the default atomic version.
 - Allow using degamma for legacy gamma if gamma is not available.
 - Clarify that primary/cursor planes are not tied to 1 crtc (depending on possible_crtcs).
 - ttm: Cleanup the lru handler.
 
 Driver Changes:
 - Add pm support to ingenic.
 - Assorted small fixes in radeon, via, rockchip, omap2fb, kmb, gma500, nouveau, virtio, hisilicon, ingenic, s6e63m0 panel, ast, udlfb.
 - Add BOE NV110WTM-N61, ys57pss36bh5gq, Khadas TS050 panels.
 - Stop using pages with drm_prime_sg_to_page_addr_arrays, and switch all callers to use ttm_sg_tt_init.
 - Cleanup compiler and docbook warnings in a lot of fbdev devices.
 - Use the drmm_vram_helper in hisilicon.
 - Add support for BCM2711 DSI1 in vc4.
 - Add support for 8-bit delta RGB panels to ingenic.
 - Add documentation on how to test vkms.
 - Convert vc4 to atomic helpers.
 - Use degamma instead of gamma table in omap, to add support for CTM and color encoding/range properties.
 - Rework omap DSI code, and merge all omapdrm modules now that the last omap panel is now a drm panel.
 - More refactoring of omap dsi code.
 - Enable 10/12 bpc outputs in vc4.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl/bLtAACgkQ/lWMcqZw
 E8M5mBAAs6iA3giF+LrzQzszZqOFmtExwjHsvJXDtiINrVubdRov2XbbaOZ+8Eax
 Cnnc5QzokJV8v1IReImz+cP7i4uXyWmwQsthY2WvHYLXZmZfZUV+rK4cCtJIKXlR
 9EvkEFtdPZ6OpyJ8p1G0r/UNqXxqOl4pEhp6NZdSj1mscz2yux/BuqX+1tX1j5+G
 SIfwcKIY+nIwnVpKwhJplJNfFwthRzENxq2KMt+PGLcpOWLHEgwqzKoR+ddTcKf2
 Nk1w2OERbnBVRt8NzCIeyczlSp4QuB+NpE5mFEBedmLatGYqwAIwPMt22AVRP9rc
 SdXqjVxg3UJ5UNMYXQwjrnJq1nIm1ViYt7za/e2aMC8g9oI7tNQ5uI7XViO6+tk5
 3vQV40pHVfMLGzkzqGhv9qHYAkkc+gPMAXwObuzmYppVVZn8dnrkAco7Ib8WPrRc
 g8JD0bpIsOrgqCTj2W54fC/ZP7hQZpffOrxWY7alhlriEl4E7To6OvTwEavIzxCm
 ajkO129oItOcATDTGiUV3JRWCTdCj4cNz7cQi4pC2QgaUIy8twjm57V8nnFDUCNg
 ifi1dIY24DOrHxdmZ8sAWZRoPzcl5a84c8EqEkGZieEsdr3Za4XMZD8tqI2uSJrU
 x+OfqgfvqNZenkuc+SUDZJQ855iqdkzJcqZnkAn61VpPh7XCtxk=
 =orEC
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-12-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.12:

UAPI Changes:
- Not necessarily one, but we document that userspace needs to force probe connectors.

Cross-subsystem Changes:
- Require FB_ATY_CT for aty on sparc64.
- video: Fix documentation, and a few compiler warnings.
- Add devicetree bindings for DP connectors.
- dma-buf: Update kernel-doc, and add might_lock for resv objects in begin/end_cpu_access.

Core Changes:
- ttm: Warn when releasing a pinned bo.
- ttm: Cleanup bo size handling.
- cma-helper: Remove prime infix, and implement mmap as GEM CMA functions.
- Split drm_prime_sg_to_page_addr_arrays into 2 functions.
- Add a new api to install irq using devm.
- Update panel kerneldoc to inline style.
- Add DP support to drm/bridge.
- Assorted small fixes to ttm, fb-helper, scheduler.
- Add atomic_commit_setup function callback.
- Automatically use the atomic gamma_set, instead of forcing drivers to declare the default atomic version.
- Allow using degamma for legacy gamma if gamma is not available.
- Clarify that primary/cursor planes are not tied to 1 crtc (depending on possible_crtcs).
- ttm: Cleanup the lru handler.

Driver Changes:
- Add pm support to ingenic.
- Assorted small fixes in radeon, via, rockchip, omap2fb, kmb, gma500, nouveau, virtio, hisilicon, ingenic, s6e63m0 panel, ast, udlfb.
- Add BOE NV110WTM-N61, ys57pss36bh5gq, Khadas TS050 panels.
- Stop using pages with drm_prime_sg_to_page_addr_arrays, and switch all callers to use ttm_sg_tt_init.
- Cleanup compiler and docbook warnings in a lot of fbdev devices.
- Use the drmm_vram_helper in hisilicon.
- Add support for BCM2711 DSI1 in vc4.
- Add support for 8-bit delta RGB panels to ingenic.
- Add documentation on how to test vkms.
- Convert vc4 to atomic helpers.
- Use degamma instead of gamma table in omap, to add support for CTM and color encoding/range properties.
- Rework omap DSI code, and merge all omapdrm modules now that the last omap panel is now a drm panel.
- More refactoring of omap dsi code.
- Enable 10/12 bpc outputs in vc4.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/78381a4f-45fd-aed4-174a-94ba051edd37@linux.intel.com
2021-01-07 10:46:32 +01:00
Likun Gao
455d40c927 drm/amdgpu: switch hdp callback functions for hdp v4
Switch to use the HDP functions which unified on hdp structure instead of
the scattered hdp callback functions.
V2: clean up hdp reset ras error count function.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:33:01 -05:00
Hawking Zhang
b291a3872b drm/amdgpu: add amdgpu_hdp structure
amdgpu_hdp hold all the callbacks for hdp

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:32:41 -05:00
Daniel Vetter
efd3043790 Merge tag 'amd-drm-fixes-5.11-2020-12-16' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-fixes-5.11-2020-12-16:

amdgpu:
- Fix a eDP regression for DCE asics
- SMU fixes for sienna cichlid
- Misc W=1 fixes
- SDMA 5.2 reset fix
- Suspend/resume fix
- Misc display fixes
- Misc runtime PM fixes and cleanups
- Dimgrey Cavefish fixes
- printk cleanup
- Documentation warning fixes

amdkfd:
- Error logging fix
- Fix pipe offset calculation

radeon:
- printk cleanup

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216192421.18627-1-alexander.deucher@amd.com
2020-12-16 23:25:51 +01:00
Alex Deucher
b10c1c5b3a drm/amdgpu: add check for ACPI power resources
Check if the device has ACPI power resources so we can
enable runtime pm if so.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-15 11:35:12 -05:00
Alex Deucher
fd496ca892 drm/amdgpu: split BOCO and ATPX handling
In preparation for systems that support d3cold on dGPUs
independent of PX/HG.  No functional change intended.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-15 11:35:08 -05:00
Likun Gao
9ca5b8a170 drm/amdgpu: add judgement for suspend/resume sequence
S0ix only makes sense on APUs since they are part of the platform, so
only when the ASIC is APU should set amdgpu_acpi_is_s0ix_supported flag
to deal with the related situation.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-15 11:32:46 -05:00
Maarten Lankhorst
ae75a0431f Merge drm/drm-next into drm-misc-next
Required backmerge since we will be based on top of v5.11, and there
has been a request to backmerge already to upstream some features.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2020-12-15 11:05:43 +01:00
Daniel Vetter
5fbd41d3bf drm-misc-next for 5.11:
UAPI Changes:
 
 Cross-subsystem Changes:
 
  * char/agp: Disable frontend without CONFIG_DRM_LEGACY
  * mm: Fix fput in mmap error path; Introduce vma_set_file() to change
    vma->vm_file
 
 Core Changes:
 
  * dma-buf: Use sgtables in system heap; Move heap helpers to CMA-heap code;
    Skip sync for unmapped buffers; Alloc higher order pages is available;
    Respect num_fences when initializing shared fence list
  * doc: Improvements around DRM modes and SCALING_FILTER
  * Pass full state to connector atomic functions + callee updates
  * Cleanups
  * shmem: Map pages with caching by default; Cleanups
  * ttm: Fix DMA32 for global page pool
  * fbdev: Cleanups
  * fb-helper: Update framebuffer after userspace writes; Unmap console buffer
    during shutdown; Rework damage handling of shadow framebuffer
 
 Driver Changes:
 
  * amdgpu: Multi-hop fixes, Clenaups
  * imx: Fix rotation for Vivante tiled formats; Support nearest-neighour
    skaling; Cleanups
  * mcde: Fix RGB formats; Support DPI output; Cleanups
  * meson: HDMI clock fixes
  * panel: Add driver and bindings for Innolux N125HCE-GN1
  * panel/s6e63m0: More backlight levels; Fix init; Cleanups
  * via: Clenunps
  * virtio: Use fence ID for handling fences; Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl/AuLYACgkQaA3BHVML
 eiMzvQgAmE1FGqUSNPGFjWpWN+hYDqEnpa3MXBVZ4sOqXxmzaI+SdnmexsaQ6v+F
 37VSlk3OFa69gYV6kQp9NbfYc0E7TZ9CusU6WNcxo36GpInsWmuIHJI3BhlCgpKm
 hDDRylv5c4+e6RlGaz+yS73zkddmrMPi4HqarMP+8c2+UJxaDe1Huv22MKE7PJK3
 noBck4Afelk51kn9bpeGj+WXvpEY+CMzXA7cvx1U26INZZyIdontZXJEeN6hAxzP
 yyhzOWPMH1X9Wfe8Qj1Ua9r27btG/0SKfhseovLU9Cjm1FGBjRiAnkxZua/ScwIL
 H7JcYplEkiLXuxnDAJOAZkvF6EcaJQ==
 =TatG
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.11:

UAPI Changes:

Cross-subsystem Changes:

 * char/agp: Disable frontend without CONFIG_DRM_LEGACY
 * mm: Fix fput in mmap error path; Introduce vma_set_file() to change
   vma->vm_file

Core Changes:

 * dma-buf: Use sgtables in system heap; Move heap helpers to CMA-heap code;
   Skip sync for unmapped buffers; Alloc higher order pages is available;
   Respect num_fences when initializing shared fence list
 * doc: Improvements around DRM modes and SCALING_FILTER
 * Pass full state to connector atomic functions + callee updates
 * Cleanups
 * shmem: Map pages with caching by default; Cleanups
 * ttm: Fix DMA32 for global page pool
 * fbdev: Cleanups
 * fb-helper: Update framebuffer after userspace writes; Unmap console buffer
   during shutdown; Rework damage handling of shadow framebuffer

Driver Changes:

 * amdgpu: Multi-hop fixes, Clenaups
 * imx: Fix rotation for Vivante tiled formats; Support nearest-neighour
   skaling; Cleanups
 * mcde: Fix RGB formats; Support DPI output; Cleanups
 * meson: HDMI clock fixes
 * panel: Add driver and bindings for Innolux N125HCE-GN1
 * panel/s6e63m0: More backlight levels; Fix init; Cleanups
 * via: Clenunps
 * virtio: Use fence ID for handling fences; Cleanups

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127083055.GA29139@linux-uq9g
2020-12-15 10:21:48 +01:00
Christian König
5cf8290426 drm/ttm/drivers: remove unecessary ttm_module.h include v2
ttm_module.h deals with internals of TTM and should never
be include outside of it.

v2: also move the file around

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/404885/
2020-12-01 17:43:46 +01:00
Luben Tuikov
b1246bd4a1 drm/amdgpu: Fix missing prototype warning
Fix a missing prototype warning for function
amdgpu_info_ioctl(),

drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:482:5: warning: no previous prototype for 'amdgpu_info_ioctl' [-Wmissing-prototypes]

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110051548.685725-1-luben.tuikov@amd.com
2020-11-18 11:32:26 -05:00
Prike Liang
4cd078dc65 drm/amdgpu: add s0i3 capacity check for s0i3 routine (v2)
add amdgpu_acpi_is_s0ix_supported() to check the platform
whether support s0i3.

v2: fix empty function parameters warning (void)

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 17:29:45 -05:00
Hawking Zhang
293f256396 drm/amdgpu: add amdgpu_smuio structure
Add amdgpu_smuio structure in amdgpu_device to
provide various callback functions to support
smuio ip funcitonality

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 00:13:08 -05:00
Lee Jones
02f40f82c4 gpu: drm: amd: amdgpu: amdgpu: Mark global variables as __maybe_unused
These 3 variables are used in *some* sourcefiles which include
amdgpu.h, but not *all*.  This leads to a flurry of build warnings.

Fixes the following W=1 kernel build warning(s):

 from drivers/gpu/drm/amd/amdgpu/amdgpu.h:67,
 drivers/gpu/drm/amd/amdgpu/amdgpu.h:198:19: warning: ‘no_system_mem_limit’ defined but not used [-Wunused-const-variable=]
 drivers/gpu/drm/amd/amdgpu/amdgpu.h:197:19: warning: ‘debug_evictions’ defined but not used [-Wunused-const-variable=]
 drivers/gpu/drm/amd/amdgpu/amdgpu.h:196:18: warning: ‘sched_policy’ defined but not used [-Wunused-const-variable=]

 NB: Repeats ~650 times - snipped for brevity.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 00:03:37 -05:00
Evan Quan
73275181f6 drm/amd/pm: correct the checks for polaris kickers
By defining new Macros.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-27 11:56:42 -04:00
Evan Quan
f2b75bc24d drm/amd/pm: correct gfx and pcie settings on umd pstate switching(V2)
For entering UMD stable Pstate, the operations to enter rlc_safe
mode, disable mgcg_perfmon and disable PCIE aspm are needed. And
the opposite operations should be performed on UMD stable Pstate
exiting.

V2: take those ASICs(CI/SI/VI) which may not support this into
    consideration

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-15 12:20:46 -04:00
Jonathan Kim
b4a7db71ea drm/amdgpu: add per device user friendly xgmi events for vega20
Non-outbound data metrics are non useful so mark them as legacy.
Bucket new perf counters into device and not device ip.
Bind events to chip instead of IP.
Report available event counters and not number of hw counter banks.
Move DF public macros to private since not needed outside of IP version.

v5: cleanup by moving per chip configs into structs

v4: After more discussion, replace *_LEGACY references with IP references
to indicate concept of pmu-typed versus event-config-typed event
registration.

v3: attr groups const array is global but attr groups are allocated per
device which doesn't work and causes problems on memory allocation and
de-allocation for pmu unregister. Switch to building const attr groups
per pmu instead to simplify solution.

v2: add comments on sysfs structure and formatting.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07 14:44:33 -04:00
Hawking Zhang
f7ee1874b0 drm/amdgpu: support indirect access reg outside of mmio bar (v2)
support both direct and indirect accessor in unified
helper functions.

v2: Retire indirect mmio access via mm_index/data

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:55 -04:00
Hawking Zhang
1bba36834c drm/amdgpu: add helper function for indirect reg access (v3)
Add helper function in order to remove RREG32/WREG32
in current pcie_rreg/wreg function for soc15 and
onwards adapters.
PCIE_INDEX/DATA pairs are used to access regsiters
outside of mmio bar in the helper functions.
The new helper functions help remove the recursion
of amdgpu_mm_rreg/wreg from pcie_rreg/wreg and
provide the oppotunity to centralize direct and
indirect access in a single function.

v2: Fixed typo and refine the comments

v3: Remove unnecessary volatile local variable

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-01 10:42:13 -04:00
Oak Zeng
8ffff9b449 drm/amdgpu: use function pointer for gfxhub functions
gfxhub functions are now called from function pointers,
instead of from asic-specific functions.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:13 -04:00
Andrey Grodzovsky
c1dd4aa624 drm/amdgpu: Fix consecutive DPC recovery failures.
Cache the PCI state on boot and before each case where we might
loose it.

v2: Add pci_restore_state while caching the PCI state to avoid
breaking PCI core logic for stuff like suspend/resume.

v3: Extract pci_restore_state from amdgpu_device_cache_pci_state
to avoid superflous restores during GPU resets and suspend/resumes.

v4: Style fixes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:04 -04:00
Andrey Grodzovsky
bf36b52e78 drm/amdgpu: Avoid accessing HW when suspending SW state
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:39 -04:00
Andrey Grodzovsky
c9a6b82f45 drm/amdgpu: Implement DPC recovery
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality

v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:32 -04:00
Stanley.Yang
5436ab94cd drm/amdkfd: fix set kfd node ras properties value
The ctx->features are new RAS implementation which
is only available for Vega20 and onwards, it is not
available for vega10, vega10 should follow legacy
ECC implementation.

Changed from V1:
    wrap function to initialize kfd node properties

Changed from V2:
    remove wrap function and SDMA SRAM ECC check

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Alex Deucher
9737a923c9 drm/amdgpu: add an asic callback for pre asic init
This callback can be used by asics that need to
do something special prior to calling atom asic init.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Luben Tuikov
8aba21b751 drm/amdgpu: Embed drm_device into amdgpu_device (v3)
a) Embed struct drm_device into struct amdgpu_device.
b) Modify the inline-f drm_to_adev() accordingly.
c) Modify the inline-f adev_to_drm() accordingly.
d) Eliminate the use of drm_device.dev_private,
   in amdgpu.
e) Switch from using drm_dev_alloc() to
   drm_dev_init().
f) Add a DRM driver release function, which frees
   the container amdgpu_device after all krefs on
   the contained drm_device have been released.

v2: Split out adding adev_to_drm() into its own
    patch (previous commit), making this patch
    more succinct and clear. More detailed commit
    description.
v3: squash in fix to call drmm_add_final_kfree()
    to avoid a warning.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:17 -04:00
Luben Tuikov
4a580877bd drm/amdgpu: Get DRM dev from adev by inline-f
Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Luben Tuikov
1348969ab6 drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Dennis Li
d95e8e97e2 drm/amdgpu: refine create and release logic of hive info
Change to dynamically create and release hive info object,
which help driver support more hives in the future.

v2:
Change to save hive object pointer in adev, to avoid locking
xgmi_mutex every time when calling amdgpu_get_xgmi_hive.

v3:
1. Change type of hive object pointer in adev from void* to
amdgpu_hive_info*.
2. remove unnecessary variable initialization.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:24:14 -04:00
Dennis Li
6049db43d6 drm/amdgpu: change reset lock from mutex to rw_semaphore
clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:23:48 -04:00
Dennis Li
53b3f8f40e drm/amdgpu: refine codes to avoid reentering GPU recovery
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:22:56 -04:00
jqdeng
9a1cddd637 drm/amdgpu: Fix repeatly flr issue
Only for no job running test case need to do recover in
flr notification.
For having job in mirror list, then let guest driver to
hit job timeout, and then do recover.

Signed-off-by: jqdeng <Emily.Deng@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:22:02 -04:00
Christian König
f1403342eb drm/amdgpu: revert "fix system hang issue during GPU reset"
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa2.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Philip Yang
b80f050ff2 drm/amdkfd: option to disable system mem limit
If multiple process share system memory through /dev/shm, KFD allocate
memory should not fail if it reaches the system memory limit because
one copy of physical system memory are shared by multiple process.

Add module parameter no_system_mem_limit to provide user option to
disable system memory limit check at runtime using sysfs or during
driver module init using kernel boot argument. By default the system
memory limit is on.

Print out debug message to warn user if KFD allocate memory failed
because system memory reaches limit.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-06 15:43:08 -04:00
Alex Deucher
87ded5caee drm/amdgpu: move vram usage by vbios to mman (v2)
It's related to the memory manager so move it there.

v2: inline the structure

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:29:29 -04:00
Alex Deucher
72de33f8f7 drm/amdgpu: move IP discovery data to mman
It's related to the memory manager so move it there.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:29:29 -04:00
Alex Deucher
fcbc92e2e1 drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc
Since that is where we store the other data related to
the stolen vga memory.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:29:28 -04:00
Alex Deucher
81b54fb7a2 drm/amdgpu: use a define for the memory size of the vga emulator
Rather than open coding it everywhere.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:29:28 -04:00
Monk Liu
a300de40f6 drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5)
what:
the MQD's save and restore of KCQ (kernel compute queue)
cost lots of clocks during world switch which impacts a lot
to multi-VF performance

how:
introduce a paramter to control the number of KCQ to avoid
performance drop if there is no kernel compute queue needed

notes:
this paramter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring to that try best to cross pipes evenly.

v4:
fix indentation
some cleanupsin the gfx_compute_queue_acquire()

v5:
further fix on indentations
more cleanupsin gfx_compute_queue_acquire()

TODO:
in the future we will let hypervisor driver to set this paramter
automatically thus no need for user to configure it through
modprobe in virtual machine

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:27:29 -04:00
Guchun Chen
acc0204cdb drm/amdgpu: add bad page count threshold in module parameter(v3)
bad_page_threshold could be configured to enable/disable the
associated bad page retirement feature in RAS.

When it's -1, ras will use typical bad page failure value to
handle bad page retirement.

When it's 0, disable bad page retirement, and no bad page
will be recorded and saved.

For other valid value, driver will use this manual value
as the threshold value of totoal bad pages.

v2: correct documentation of this parameter.

v3: remove confused statement in documentation.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04 17:25:41 -04:00
Dennis Li
df9c8d1aa2 drm/amdgpu: fix system hang issue during GPU reset
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tukov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27 16:21:37 -04:00
Wenhui Sheng
273da6ff7c drm/amdgpu: add module parameter choose reset mode
Default value is auto, doesn't change
original reset method logic.

v2: change to use parameter reset_method
v3: add warn msg if specified mode isn't supported

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15 12:42:01 -04:00
Hawking Zhang
e78b579d2d Revert "drm/amdgpu: support access regs outside of mmio bar"
This reverts commit 2eee0229f6.
Fallback to a stable base until we have a correct new one

Signed-off-by:Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-02 12:02:56 -04:00
Alex Jivin
fb40bceb6c drm/amdgpu: SI support for VCE clock control
Port functionality from the Radeon driver to support
VCE clock control.

Signed-off-by: Alex Jivin <alex.jivin@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-02 12:02:49 -04:00
Felix Kuehling
b205795677 drm/amdkfd: Add eviction debug messages
Use WARN to print messages with backtrace when evictions are triggered.
This can help determine the root cause of evictions and help spot driver
bugs triggering evictions unintentionally, or help with performance tuning
by avoiding conditions that cause evictions in a specific workload.

The messages are controlled by a new module parameter that can be changed
at runtime:

  echo Y > /sys/module/amdgpu/parameters/debug_evictions
  echo N > /sys/module/amdgpu/parameters/debug_evictions

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 01:59:21 -04:00
Dan Carpenter
8df1a28f41 drm/amdgpu: Fix a buffer overflow handling the serial number
The comments say that the serial number is a 16-digit HEX string so the
buffer needs to be at least 17 characters to hold the NUL terminator.

The other issue is that "size" returned from sprintf() is the number of
characters before the NUL terminator so the memcpy() wasn't copying the
terminator.  The serial number needs to be NUL terminated so that it
doesn't lead to a read overflow in amdgpu_device_get_serial_number().
Also it's just cleaner and faster to sprintf() directly to adev->serial[]
instead of using a temporary buffer.

Fixes: 81a1624111 ("drm/amdgpu: Add unique_id and serial_number for Arcturus v3")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
2020-07-01 01:59:19 -04:00
Likun Gao
72d208c23c drm/amdgpu: remove unnecessary check for mem train
a.Check whether mem train support when try to reserve related memory.
b.Remove ASIC check and atom firmware table version check as the check
of firmware capability is enough to achieve that purpose.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 01:59:14 -04:00
Evan Quan
b265bdbd9f drm/amdgpu: added a sysfs interface for thermal throttling related V4
User can check and set the enablement of throttling logging and
the interval between each logging.

V2: simplify the sysfs interface(no string parsing)
V3: add proper lock protection on updating throttling_logging_rs.interval
V4: documentation cosmetic per Luben's suggestion

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-29 13:55:07 -04:00
Alex Deucher
54f78a7655 drm/amdgpu: add apu flags (v2)
Add some APU flags to simplify handling of different APU
variants.  It's easier to understand the special cases
if we use names flags rather than checking device ids and
silicon revisions.

v2: rebase on latest code

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-22 13:41:53 -04:00
Harry Wentland
8a791dabea drm/amd/display: Add DC Debug mask to disable features for bringup
[Why]
At bringup we want to be able to disable various power features.

[How]
These features are already exposed as dc_debug_options and exercised
on other OSes. Create a new dc_debug_mask module parameter and expose
relevant bits, in particular
 * DC_DISABLE_PIPE_SPLIT
 * DC_DISABLE_STUTTER
 * DC_DISABLE_DSC
 * DC_DISABLE_CLOCK_GATING

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21 12:37:19 -04:00
Jiange Zhao
728e7e0cd6 drm/amdgpu: Add autodump debugfs node for gpu reset v8
When GPU got timeout, it would notify an interested part
of an opportunity to dump info before actual GPU reset.

A usermode app would open 'autodump' node under debugfs system
and poll() for readable/writable. When a GPU reset is due,
amdgpu would notify usermode app through wait_queue_head and give
it 10 minutes to dump info.

After usermode app has done its work, this 'autodump' node is closed.
On node closure, amdgpu gets to know the dump is done through
the completion that is triggered in release().

There is no write or read callback because necessary info can be
obtained through dmesg and umr. Messages back and forth between
usermode app and amdgpu are unnecessary.

v2: (1) changed 'registered' to 'app_listening'
    (2) add a mutex in open() to prevent race condition

v3 (chk): grab the reset lock to avoid race in autodump_open,
          rename debugfs file to amdgpu_autodump,
          provide autodump_read as well,
          style and code cleanups

v4: add 'bool app_listening' to differentiate situations, so that
    the node can be reopened; also, there is no need to wait for
    completion when no app is waiting for a dump.

v5: change 'bool app_listening' to 'enum amdgpu_autodump_state'
    add 'app_state_mutex' for race conditions:
	(1)Only 1 user can open this file node
	(2)wait_dump() can only take effect after poll() executed.
	(3)eliminated the race condition between release() and
	   wait_dump()

v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex'
    removed state checking in amdgpu_debugfs_wait_dump
    Improve on top of version 3 so that the node can be reopened.

v7: move reinit_completion into open() so that only one user
    can open it.

v8: remove complete_all() from amdgpu_debugfs_wait_dump().

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18 11:23:37 -04:00
Evan Quan
85625e6429 drm/amdgpu: enable hibernate support on Navi1X
BACO is needed to support hibernate on Navi1X.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08 14:32:16 -04:00
Hawking Zhang
e0c116c190 drm/amdgpu: re-structue members for ip discovery
This is to prepare for initializing discovery tmr size per
ASIC type

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-01 15:19:07 -04:00
Christian König
9ecefb19c3 drm/amdgpu: cleanup IB pool handling a bit
Fix the coding style, move and rename the definitions to
better match what they are supposed to be doing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:30 -04:00
Luben Tuikov
c6252390fc drm/amdgpu: implement TMZ accessor (v3)
Implement an accessor of adev->tmz.enabled. Let not
code around access it as "if (adev->tmz.enabled)"
as the organization may change. Instead...

Recruit "bool amdgpu_is_tmz(adev)" to return
exactly this Boolean value. That is, this function
is now an accessor of an already initialized and
set adev and adev->tmz.

Add "void amdgpu_gmc_tmz_set(adev)" to check and
set adev->gmc.tmz_enabled at initialization
time. After which one uses "bool
amdgpu_is_tmz(adev)" to query whether adev
supports TMZ.

Also, remove circular header file include.

v2: Remove amdgpu_tmz.[ch] as requested.
v3: Move TMZ into GMC.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:29 -04:00
Huang Rui
ae60305ac0 drm/amdgpu: add amdgpu_tmz data structure
This patch to add amdgpu_tmz structure which stores all tmz related fields.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:28 -04:00
Huang Rui
d7ccb38df5 drm/amdgpu: add tmz feature parameter (v2)
This patch adds tmz parameter to enable/disable
the feature in the amdgpu kernel module. Nomally,
by default, it should be auto (rely on the
hardware capability).

But right now, it need to set "off" to avoid
breaking other developers' work because it's not
totally completed.

Will set "auto" till the feature is stable and
completely verified.

v2: add "auto" option for future use.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:28 -04:00
Yintian Tao
5420819401 drm/amdgpu: request reg_val_offs each kiq read reg
According to the current kiq read register method,
there will be race condition when using KIQ to read
register if multiple clients want to read at same time
just like the expample below:
1. client-A start to read REG-0 throguh KIQ
2. client-A poll the seqno-0
3. client-B start to read REG-1 through KIQ
4. client-B poll the seqno-1
5. the kiq complete these two read operation
6. client-A to read the register at the wb buffer and
   get REG-1 value

Therefore, use amdgpu_device_wb_get() to request reg_val_offs
for each kiq read register.

v2: fix the error remove
v3: fix the print typo
v4: remove unused variables

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23 15:06:41 -04:00
Kevin Wang
e05185b341 drm/amdgpu: clean up unused variable about ring lru
clean up unused variable:
1. ring_lru_list
2. ring_lru_list_lock

related-commit:
drm/amdgpu: remove ring lru handling

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:49 -04:00
Jonathan Kim
d84a430d9f drm/amdgpu: fix race between pstate and remote buffer map
Vega20 arbitrates pstate at hive level and not device level. Last peer to
remote buffer unmap could drop P-State while another process is still
remote buffer mapped.

With this fix, P-States still needs to be disabled for now as SMU bug
was discovered on synchronous P2P transfers.  This should be fixed in the
next FW update.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:46 -04:00
Aurabindo Pillai
539489fc91 drm/amd/amdgpu: add print prefix for dev_* variants
Define dev_fmt macro for informative print messages

Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13 12:02:37 -04:00
Aurabindo Pillai
d57229b1da drm/amd/amdgpu: add prefix for pr_* prints
amdgpu uses lots of pr_* calls for printing error messages.
With this prefix, errors shall be more obvious to the end
use regarding its origin, and may help debugging.

Prefix format:

[xxx.xxxxx] amdgpu: ...

Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-13 12:02:35 -04:00
Hawking Zhang
2eee0229f6 drm/amdgpu: support access regs outside of mmio bar
add indirect access support to registers outside of
mmio bar.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Hawking Zhang
f384ff95f6 drm/amdgpu: retire AMDGPU_REGS_KIQ flag
all the register access through kiq is redirected
to amdgpu_kiq_rreg/amdgpu_kiq_wreg

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Hawking Zhang
ec59847e74 drm/amdgpu: retire RREG32_IDX/WREG32_IDX
those are not needed anymore

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Hawking Zhang
dec0520aff drm/amdgpu: remove inproper workaround for vega10
the workaround is not needed for soc15 ASICs except
for vega10. it is even not needed with latest vega10
vbios.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:18 -04:00
Nirmoy Das
1c6d567bdf drm/amdgpu: rework sched_list generation
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.

v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum

v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list

v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:14 -04:00
xinhui pan
c8e42d5785 drm/amdgpu: implement more ib pools (v2)
We have three ib pools, they are normal, VM, direct pools.

Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.

Any jobs schedule direct VM update IBs should use VM pool.

Any other jobs use NORMAL pool.

v2: squash in coding style fix

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 14:44:44 -04:00
Kent Russell
bd607166af drm/amdgpu: Enable reading FRU chip via I2C v3
Allow for reading of information like manufacturer, product number
and serial number from the FRU chip. Report the serial number as
the new sysfs file serial_number. Note that this only works on
server cards, as consumer cards do not feature the FRU chip, which
contains this information.

v2: Add documentation to amdgpu.rst, add helper functions,
    rename functions for consistency, fix bad starting offset
v3: Remove testing definitions

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 14:44:41 -04:00
Monk Liu
2e0cc4d48b drm/amdgpu: revise RLCG access path
what changed:
1)provide new implementation interface for the rlcg access path
2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op
function can access reg that need RLCG path help

now even debugfs's reg_op can used to dump wave.

tested-by: Monk Liu <monk.liu@amd.com>
tested-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-16 16:17:55 -04:00
Hawking Zhang
4a89ad9b39 drm/amdgpu: add reset_ras_error_count function for HDP
HDP ras error counters are dirty ones after cold reboot
Read operation is needed to reset them to 0

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-05 00:32:54 -05:00
Dave Airlie
a2ae604da7 Merge tag 'amd-drm-next-5.7-2020-02-26' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.7-2020-02-26:

amdgpu:
- Rework VM update handling in preparation for HMM support
- HDCP srm support
- PSR fixes
- DC watermark fixes
- OLED panel support
- SR-IOV fixes
- BACO fixes
- Optimize debugging vram access
- RAS fixes
- Use BACO for runtime pm
- HDCP fixes
- XGMI fixes
- DDC fixes
- DC clock programming optimizations and fixes
- PSP fw loading sequence updates
- Drop DRIVER_USE_AGP
- Remove legacy drm load and unload callbacks

amdkfd:
- Add runtime pm support

radeon:
- Drop DRIVER_USE_AGP

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227043142.4075-1-alexander.deucher@amd.com
2020-02-28 15:40:26 +10:00
Maxime Ripard
28f2aff1ca Linux 5.6-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5JsVQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGHZEH+wddtJO4dZk5TZdF
 KZB2w2ldsCvrBZAmas1TVvm8ncvUf+ATUZcSIzvZ3YbLmxsuLF2Cz3kD+n+36Mvy
 ejwq8Scl7jwnouYps/Gfd6rRj/uCafqST4qp15GMGeiy2ST4A8dJrv5IAgZhD8/N
 SN1bSr1AXpZ2JlEzzLDQ/NdVoNMS6IzCOsaINZcc60/XQoQZFRBWamMJFqu+CmXD
 SBJOybQNFJhziy45cGZSAl+67sSCcoPftwTs0Stu4CJsvFWRb3MsbNTDS51Hjcc4
 3tdgGOhoNXzyzZr96MEAHmiaW4VLQv0PGgUfOajE35viMz48OrwCTru8Kbuae3XM
 YHV4qJk=
 =yiem
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkpecgAKCRDj7w1vZxhR
 xU4CAQD/8ZzfYroiiEI7FHEgvnuldGYZqA9kbNAO2Vueeac0OgD+Jojewdxy+pJ9
 dxfA/POdtzx3hOdN+U1YgaDC0hbXwww=
 =XgKq
 -----END PGP SIGNATURE-----

Merge v5.6-rc2 into drm-misc-next

Lyude needs some patches in 5.6-rc2 and we didn't bring drm-misc-next
forward yet, so it looks like a good occasion.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-02-17 10:34:34 +01:00
Thomas Zimmermann
e3eff4b5d9 drm/amdgpu: Convert to CRTC VBLANK callbacks
VBLANK callbacks in struct drm_driver are deprecated in favor of
their equivalents in struct drm_crtc_funcs. Convert amdgpu over.

v2:
	* don't wrap existing functions; change signature instead

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de
2020-02-13 13:08:13 +01:00
Alex Deucher
f0f7ddfc34 drm/amdgpu: add flag for runtime suspend
So we know whether we in are in runtime suspend or
system suspend.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-11 11:51:52 -05:00
chen gong
c68dbcd8f9 drm/amdgpu: add kiq version interface for RREG32/WREG32
Reading some registers by mmio will result in hang when GPU is in
"gfxoff" state.This problem can be solved by GPU in "ring command
packages" way.

Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:34:22 -05:00
Joseph Greathouse
bdf84a80e0 drm/amdgpu: Create generic DF struct in adev
The only data fabric information the adev struct currently
contains is a function pointer table. In the near future,
we will be adding some cached DF information into adev. As
such, this patch creates a new amdgpu_df struct for adev.
Right now, it only containst the old function pointer table,
but new stuff will be added soon.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:41 -05:00
Tianci.Yin
8d40002fee drm/amdgpu: update the method to get fb_loc of memory training(V4)
The method of getting fb_loc changed from parsing VBIOS to
taking certain offset from top of VRAM

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:59:20 -05:00
Andrey Grodzovsky
041a62bc06 drm/amdgpu: reverts commit ce316fa55e.
In preparation for doing XGMI reset synchronization using task barrier.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Daniel Vetter
be452c4e8d Merge tag 'drm-next-5.6-2019-12-11' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.6-2019-12-11:

amdgpu:
- Add MST atomic routines
- Add support for DMCUB (new helper microengine for displays)
- Add OEM i2c support in DC
- Use vstartup for vblank events on DCN
- Simplify Kconfig for DC
- Renoir fixes for DC
- Clean up function pointers in DC
- Initial support for HDCP 2.x
- Misc code cleanups
- GFX10 fixes
- Rework JPEG engine handling for VCN
- Add clock and power gating support for JPEG
- BACO support for Arcturus
- Cleanup PSP ring handling
- Add framework for using BACO with runtime pm to save power
- Move core pci state handling out of the driver for pm ops
- Allow guest power control in 1 VF case with SR-IOV
- SR-IOV fixes
- RAS fixes
- Support for power metrics on renoir
- Golden settings updates for gfx10
- Enable gfxoff on supported navi10 skus
- Update MAINTAINERS

amdkfd:
- Clean up generational gfx code
- Fixes for gfx10
- DIQ fixes
- Share more code with amdgpu

radeon:
- PPC DMA fix
- Register checker fixes for r1xx/r2xx
- Misc cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com
2019-12-17 18:47:46 +01:00
Le Ma
ce316fa55e drm/amdgpu: add concurrent baco reset support for XGMI
Currently each XGMI node reset wq does not run in parrallel if bound to same
cpu. Make change to bound the xgmi_reset_work item to different cpus.

XGMI requires all nodes enter into baco within very close proximity before
any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit
baco respectively.

To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved.

The case that PSP reset and baco reset coexist within an XGMI hive never exist
and is not in the consideration.

v2: define use_baco flag to simplify the code for xgmi baco sequence

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05 16:25:53 -05:00
Yintian Tao
7c868b592d drm/amdgpu: not remove sysfs if not create sysfs
When load amdgpu failed before create pm_sysfs and ucode_sysfs,
the pm_sysfs and ucode_sysfs should not be removed.
Otherwise, there will be warning call trace just like below.
[   24.836386] [drm] VCE initialized successfully.
[   24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed
[   25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init
[   25.889575] [drm] amdgpu: finishing device.
[   26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[   26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[   26.200309] [TTM] Finalizing pool allocator
[   26.200314] [TTM] Finalizing DMA pool allocator
[   26.200349] [TTM] Zone  kernel: Used memory at exit: 0 KiB
[   26.200351] [TTM] Zone   dma32: Used memory at exit: 0 KiB
[   26.200353] [drm] amdgpu: ttm finalized
[   26.205329] ------------[ cut here ]------------
[   26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0'
[   26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90
[   26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy
[   26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G           OE     5.2.0-rc1 #1
[   26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90
[   26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55
[   26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282
[   26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[   26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380
[   26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3
[   26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240
[   26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0
[   26.205380] FS:  00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000
[   26.205381] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0
[   26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   26.205385] Call Trace:
[   26.205461]  amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu]
[   26.205518]  amdgpu_device_fini+0x3b4/0x560 [amdgpu]
[   26.205573]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
[   26.205623]  amdgpu_driver_load_kms+0xcd/0x250 [amdgpu]
[   26.205637]  drm_dev_register+0x12b/0x1c0 [drm]
[   26.205695]  amdgpu_pci_probe+0x12a/0x1e0 [amdgpu]
[   26.205699]  local_pci_probe+0x47/0xa0
[   26.205701]  pci_device_probe+0x106/0x1b0
[   26.205704]  really_probe+0x21a/0x3f0
[   26.205706]  driver_probe_device+0x11c/0x140
[   26.205707]  device_driver_attach+0x58/0x60
[   26.205709]  __driver_attach+0xc3/0x140

Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02 17:39:05 -05:00
Linus Torvalds
aa32f11691 hmm related patches for 5.5
This is another round of bug fixing and cleanup. This time the focus is on
 the driver pattern to use mmu notifiers to monitor a VA range. This code
 is lifted out of many drivers and hmm_mirror directly into the
 mmu_notifier core and written using the best ideas from all the driver
 implementations.
 
 This removes many bugs from the drivers and has a very pleasing
 diffstat. More drivers can still be converted, but that is for another
 cycle.
 
 - A shared branch with RDMA reworking the RDMA ODP implementation
 
 - New mmu_interval_notifier API. This is focused on the use case of
   monitoring a VA and simplifies the process for drivers
 
 - A common seq-count locking scheme built into the mmu_interval_notifier
   API usable by drivers that call get_user_pages() or hmm_range_fault()
   with the VA range
 
 - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev
   drivers to the new API. This deletes a lot of wonky driver code.
 
 - Two improvements for hmm_range_fault(), from testing done by Ralph
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl3cCjQACgkQOG33FX4g
 mxpp8xAAiR9iOdT28m/tx1GF31XludrMhRZVIiz0vmCIxIiAkWekWEfAEVm9PDnh
 wdrxTJohSs+B65AK3sfToOM3AIuNCuFVWmbbHI5qmOO76vaSvcZa905Z++pNsawO
 Bn8mgRCprYoFHcxWLvTvnA5U0g1S2BSSOwBSZI43CbEnVvHjYAR6MnvRqfGMk+NF
 bf8fTk/x+fl0DCemhynlBLuJkogzoE2Hgl0yPY5bFna4PktOxdpa1yPaQsiqZ7e6
 2s2NtM3pbMBJk0W42q5BU+aPhiqfxFFszasPSLBduXrD2xDsG76HJdHj5VydKmfL
 nelG4BvqJozXTEZWvTEePYhCqaZ41eJZ7Asw8BXtmacVqE5mDlTXo/Zdgbz7yEOR
 mI5MVyjD5rauZJldUOWXbwrPoWVFRvboauehiSgqvxvT9HvlFp9GKObSuu4gubBQ
 mzxs4t48tPhA7bswLmw0/pETSogFuVDfaB7hsyY0gi8EwxMFMpw2qFypm1PEEF+C
 BuUxCSShzvNKrraNe5PWaNNFd3AzIwAOWJHE+poH4bCoXQVr5nA+rq2gnHkdY5vq
 /xrBCyxkf0U05YoFGYembPVCInMehzp9Xjy8V+SueSvCg2/TYwGDCgGfsbe9dNOP
 Bc40JpS7BDn5w9nyLUJmOx7jfruNV6kx1QslA7NDDrB/rzOlsEc=
 =Hj8a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "This is another round of bug fixing and cleanup. This time the focus
  is on the driver pattern to use mmu notifiers to monitor a VA range.
  This code is lifted out of many drivers and hmm_mirror directly into
  the mmu_notifier core and written using the best ideas from all the
  driver implementations.

  This removes many bugs from the drivers and has a very pleasing
  diffstat. More drivers can still be converted, but that is for another
  cycle.

   - A shared branch with RDMA reworking the RDMA ODP implementation

   - New mmu_interval_notifier API. This is focused on the use case of
     monitoring a VA and simplifies the process for drivers

   - A common seq-count locking scheme built into the
     mmu_interval_notifier API usable by drivers that call
     get_user_pages() or hmm_range_fault() with the VA range

   - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
     GntDev drivers to the new API. This deletes a lot of wonky driver
     code.

   - Two improvements for hmm_range_fault(), from testing done by Ralph"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
  mm/hmm: make full use of walk_page_range()
  xen/gntdev: use mmu_interval_notifier_insert
  mm/hmm: remove hmm_mirror and related
  drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
  drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
  drm/amdgpu: Call find_vma under mmap_sem
  nouveau: use mmu_interval_notifier instead of hmm_mirror
  nouveau: use mmu_notifier directly for invalidate_range_start
  drm/radeon: use mmu_interval_notifier_insert
  RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
  RDMA/odp: Use mmu_interval_notifier_insert()
  mm/hmm: define the pre-processor related parts of hmm.h even if disabled
  mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
  mm/mmu_notifier: add an interval tree notifier
  mm/mmu_notifier: define the header pre-processor parts even if disabled
  mm/hmm: allow snapshot of the special zero page
2019-11-30 10:33:14 -08:00
Alex Deucher
de185019bc drm/amdgpu: move pci handling out of pm ops
The documentation says the that PCI core handles this
for you unless you choose to implement it.  Just rely
on the PCI core to handle the pci specific bits.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26 14:51:03 -05:00
Jason Gunthorpe
62914a99de drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.

For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.

Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.

Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-23 19:56:45 -04:00
Alex Deucher
6ae6c7d404 drm/amdgpu: start to disentangle boco from runtime pm
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

We originally only supported runtime pm on PX/HG
laptops so most of the runtime pm code looks for this.
Add a new flag to check for runtime pm enablement and
use this rather than checking for PX/HG.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:53 -05:00
Alex Deucher
361dbd01a1 drm/amdgpu: add helpers for baco entry and exit
BACO - Bus Active, Chip Off

Will be used for runtime pm.  Entry will enter the BACO
state (chip off).  Exit will exit the BACO state (chip on).

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:52 -05:00
Alex Deucher
31af062acf drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2)
BACO - Bus Active, Chip Off
BOCO - Bus Off, Chip Off

To better match what we are checking for and to align with
amdgpu_device_supports_baco.

BOCO is used on PowerXpress/Hybrid Graphics systems and BACO
is used on desktop dGPU boards.

v2: fix typo in documentation

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:50 -05:00
Alex Deucher
a69cba42b1 drm/amdgpu: add a amdgpu_device_supports_baco helper
BACO - Bus Active, Chip Off

To check if a device supports BACO or not.  This will be
used in determining when to enable runtime pm.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:49 -05:00
Alex Deucher
69d5436d4d drm/amdgpu: add asic callback for BACO support
BACO - Bus Active, Chip Off

Used to check whether the device supports BACO.  This will
be used to enable runtime pm on devices which support BACO.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 16:42:46 -05:00
Leo Liu
88a1c40a04 drm/amdgpu: add JPEG HW IP and SW structures
It will be used for JPEG IP 1.0, 2.0, 2.5 and later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19 10:12:49 -05:00
Evan Quan
5c5b2ba006 drm/amdgpu: fix possible pstate switch race condition
Added lock protection so that the p-state switch will
be guarded to be sequential. Also update the hive
pstate only all device from the hive are in the same
state.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00
Wambui Karuga
f440ff44b1 drm/amd: correct "_LENTH" mispelling in constant
Correct the "_LENTH" mispelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
constant.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Wambui Karuga
7e0ff20c7a drm/amd: declare amdgpu_exp_hw_support in amdgpu.h
Declare `amdgpu_exp_hw_support` as extern in amdgpu.h to address the
following sparse warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:118:5: warning: symbol 'amdgpu_exp_hw_support' was not declared. Should it be static?

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Suggested-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-28 11:19:17 -04:00
Tianci.Yin
367039bfb6 drm/amdgpu/psp: add psp memory training implementation(v3)
add memory training implementation code to save resume time.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:54 -04:00
Tianci.Yin
efe4f00077 drm/amdgpu/atomfirmware: add memory training related helper functions(v3)
parse firmware to get memory training capability and fb location.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:31 -04:00
Tianci.Yin
e35e2b117f drm/amdgpu: add a generic fb accessing helper function(v3)
add a generic helper function for accessing framebuffer via MMIO

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17 16:31:13 -04:00
Alex Deucher
31fa2991f4 drm/amdgpu: remove in_baco_reset hack
It was a vega20 specific hack.  Check if we are in reset
and what reset method we are using.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:51:39 -04:00
Xiaojie Yuan
5f6a556f98 drm/amdgpu/discovery: reserve discovery data at the top of VRAM
IP Discovery data is TMR fenced by the latest PSP BL,
so we need to reserve this region.

Tested on navi10/12/14 with VBIOS integrated with latest PSP BL.

v2: use DISCOVERY_TMR_SIZE macro as bo size
    use amdgpu_bo_create_kernel_at() to allocate bo

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15 15:48:46 -04:00
Alex Deucher
71f98027f2 drm/amdgpu: move amdgpu_device_get_job_timeout_settings
It's only used in amdgpu_device.c and the naming also
reflects that.  Move it there.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07 15:10:20 -05:00
Tao Zhou
d65bf1f8a7 drm/amdgpu: replace mmhub_funcs with mmhub.funcs
remove mmhub_funcs in adev

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Tao Zhou
d3a5a121b8 drm/amdgpu: add common mmhub member for adev
put mmhub_funcs and ras_if pointer into mmhub struct

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:02 -05:00
Marek Olšák
6de088a08d drm/amdgpu: remove gfx9 NGG
Never used.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Shirish S
a35ad98bf9 drm/amdgpu: remove needless usage of #ifdef
define sched_policy in case CONFIG_HSA_AMD is not
enabled, with this there is no need to check for CONFIG_HSA_AMD
else where in driver code.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:20 -05:00
Shirish S
8c9f69bc5c drm/amdgpu: fix build error without CONFIG_HSA_AMD
If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy")

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 10:05:07 -05:00
Huang Rui
aa978594cf drm/amdgpu: disable gfxoff while use no H/W scheduling policy
While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX
is in "off" state. KFD queue creattion doesn't use ring based method, so it will
trigger a VM fault.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:55:41 -05:00
Yong Zhao
4e66d7d215 drm/amdgpu: Add a kernel parameter for specifying the asic type
As more and more new asics start to reuse the old device IDs before
launch, there is a need to quickly override the existing asic type
corresponding to the reused device ID through a kernel parameter. With
this, engineers no longer need to rely on local hack patches,
facilitating cooperation across teams.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 09:54:25 -05:00
Jack Zhang
f1d59e00ff drm/amd/amdgpu: add sw_fini interface for df_funcs
add sw_fini interface of df_funcs.
This interface will remove sysfs file of df_cntr_avail
function.

The old behavior only create sysfs of df_cntr_avail
in sw_init, but never remove it for lack of sw_fini
interface. With this,driver will report create
sysfs fail when it's loaded for the second time.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:42:15 -05:00
Christian König
43ce6bab7b drm/amdgpu: remove amdgpu_cs_try_evict
Trying to evict things from the current working set doesn't work that
well anymore because of per VM BOs.

Rely on reserving VRAM for page tables to avoid contention.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:38:56 -05:00
Hawking Zhang
bebc076285 drm/amdgpu: switch to new amdgpu_nbio structure
no functional change, just switch to new structures

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13 17:11:03 -05:00
Monk Liu
e352625796 drm/amdgpu: introduce vram lost for reset (v2)
for SOC15/vega10 the BACO reset & mode1 would introduce vram lost
in high end address range, current kmd's vram lost checking cannot
catch it since it only check very ahead visible frame buffer

v2:
cover NV as well

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-29 15:52:32 -05:00
Christoph Hellwig
244511f386 drm/amdgpu: simplify and cleanup setting the dma mask
Use dma_set_mask_and_coherent to set both masks in one go, and remove
the no longer required fallback, as the kernel now always accepts
larger than required DMA masks.  Fail the driver probe if we can't
set the DMA mask, as that means the system can only support a larger
mask.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:51:01 -05:00
Tao Zhou
3d093da098 drm/amdgpu: add amdgpu_mmhub_funcs definition
add amdgpu_mmhub_funcs definition and initialize it,
prepare for mmhub ras enablement

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-12 12:47:48 -05:00
Tao Zhou
6ca523d7eb drm/amdgpu: remove RREG64/WREG64
atomic 64 bits REG operations are useless currently

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-09 11:17:30 -05:00
Tao Zhou
045c021653 drm/amdgpu: switch to amdgpu_umc structure
create new amdgpu_umc structure to for more umc
settings in future and switch to the new structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:52 -05:00
Tao Zhou
4fa1c6a679 drm/amdgpu: add RREG64/WREG64(_PCIE) operations
add 64 bits register access functions

v2: implement 64 bit functions in low level

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:49:40 -05:00
Hawking Zhang
9e585a523b drm/amdgpu: add amdgpu_umc_functions structure
This is common structure as UMC callback function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:57 -05:00
Hawking Zhang
6501a77170 drm/amdgpu: init RSMU and UMC ip base address for vega20
the driver needs to program RSMU and UMC registers to
support vega20 RAS feature

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:48:51 -05:00
Alex Deucher
a3a09142f4 drm/amdgpu: put the SMC into the proper state on reset/unload
When doing a GPU reset or unloading the driver, we need to
put the SMU into the apprpriate state for the re-init after
the reset or unload to reliably work.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state to it needs to be active.

For suspend (S3), the asic is put into D3 so the SMU would be
powered down so I don't think we need to put the SMU into
any special state.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:44 -05:00
Alex Deucher
0cf3c64f29 drm/amdgpu: add an asic callback to determine the reset method
Sometimes the driver may have to behave differently depending
on the method we are using to reset the GPU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:44 -05:00
Jonathan Kim
64671c0fdc drm/amdgpu: add perfmon and fica atomics for df
adding perfmon and fica atomic operations to adhere to data fabrics finite
state machine requirements for indirect register access.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:26 -05:00
James Zhu
989b6a0549 drm/amdgpu: add vcn nbio doorbell range setting for 2nd vcn instance
add vcn nbio doorbell range setting for 2nd vcn instance

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:05 -05:00
Le Ma
0fe6a7b49f drm/amdgpu: support hdp flush for more sdma instances
The bit RSVD_ENG0 to RSVD_ENG5 in GPU_HDP_FLUSH_REQ/GPU_HDP_FLUSH_DONE
can be leveraged for sdma instance 2~7 to poll register/memory.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:02 -05:00
Le Ma
113b47e780 drm/amdgpu: increase max number of ip base instances to 8
For Arcturus, the number of IP base instances is 8.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:02 -05:00
Le Ma
fa5d2e6f0a drm/amdgpu: add SDMA 2~7 ip block type
Add IP block type.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:02 -05:00
Le Ma
1daa2bfa17 drm/amdgpu: add new member in amdgpu_device for vmhub counts per asic chip
It aims to replace AMDGPU_MAX_VMHUBS in for loop to initialize registers.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:01 -05:00
Xiaojie Yuan
c8ff09bf41 drm/amdgpu: increase max instance number for hw ip
max instance number is 6 for navi10 and 7 for navi14, and we increase the
reg_offset array size to avoid out-of-bound access

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:17:58 -05:00
Felix Kuehling
75ee64875e drm/amdkfd: Consistently apply noretry setting
Apply the same setting to SH_MEM_CONFIG and VM_CONTEXT1_CNTL. This
makes the noretry param no longer KFD-specific. On GFX10 I'm not
changing SH_MEM_CONFIG in this commit because GFX10 has different
retry behaviour in the SQ and I don't have a way to test it at the
moment.

Suggested-by: Christian König <Christian.Koenig@amd.com>
CC: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by : Shaoyun.liu < Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-16 13:02:55 -05:00
Jack Xiao
bae17d2a1b drm/amdgpu: add field indicating if has PCIE atomics support
The new field in amdgpu device is used to record whether the
system has PCIE atomics support. The field can be exposed to
UMD or kfd whether PCIE atomics have supported.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-01 14:54:31 -05:00