mirror_ubuntu-kernels/drivers/gpu/drm
Johannes Weiner c25d09bcb7 drm/amdgpu: fix deadlock while reading mqd from debugfs
An errant disk backup on my desktop got into debugfs and triggered the
following deadlock scenario in the amdgpu debugfs files. The machine
also hard-resets immediately after those lines are printed (although I
wasn't able to reproduce that part when reading by hand):

[ 1318.016074][ T1082] ======================================================
[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected
[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted
[ 1318.017598][ T1082] ------------------------------------------------------
[ 1318.018096][ T1082] tar/1082 is trying to acquire lock:
[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80
[ 1318.019084][ T1082]
[ 1318.019084][ T1082] but task is already holding lock:
[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.020607][ T1082]
[ 1318.020607][ T1082] which lock already depends on the new lock.
[ 1318.020607][ T1082]
[ 1318.022081][ T1082]
[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is:
[ 1318.023083][ T1082]
[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 1318.024114][ T1082]        __ww_mutex_lock.constprop.0+0xe0/0x12f0
[ 1318.024639][ T1082]        ww_mutex_lock+0x32/0x90
[ 1318.025161][ T1082]        dma_resv_lockdep+0x18a/0x330
[ 1318.025683][ T1082]        do_one_initcall+0x6a/0x350
[ 1318.026210][ T1082]        kernel_init_freeable+0x1a3/0x310
[ 1318.026728][ T1082]        kernel_init+0x15/0x1a0
[ 1318.027242][ T1082]        ret_from_fork+0x2c/0x40
[ 1318.027759][ T1082]        ret_from_fork_asm+0x11/0x20
[ 1318.028281][ T1082]
[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}:
[ 1318.029297][ T1082]        dma_resv_lockdep+0x16c/0x330
[ 1318.029790][ T1082]        do_one_initcall+0x6a/0x350
[ 1318.030263][ T1082]        kernel_init_freeable+0x1a3/0x310
[ 1318.030722][ T1082]        kernel_init+0x15/0x1a0
[ 1318.031168][ T1082]        ret_from_fork+0x2c/0x40
[ 1318.031598][ T1082]        ret_from_fork_asm+0x11/0x20
[ 1318.032011][ T1082]
[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[ 1318.032778][ T1082]        __lock_acquire+0x14bf/0x2680
[ 1318.033141][ T1082]        lock_acquire+0xcd/0x2c0
[ 1318.033487][ T1082]        __might_fault+0x58/0x80
[ 1318.033814][ T1082]        amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu]
[ 1318.034181][ T1082]        full_proxy_read+0x55/0x80
[ 1318.034487][ T1082]        vfs_read+0xa7/0x360
[ 1318.034788][ T1082]        ksys_read+0x70/0xf0
[ 1318.035085][ T1082]        do_syscall_64+0x94/0x180
[ 1318.035375][ T1082]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.035664][ T1082]
[ 1318.035664][ T1082] other info that might help us debug this:
[ 1318.035664][ T1082]
[ 1318.036487][ T1082] Chain exists of:
[ 1318.036487][ T1082]   &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex
[ 1318.036487][ T1082]
[ 1318.037310][ T1082]  Possible unsafe locking scenario:
[ 1318.037310][ T1082]
[ 1318.037838][ T1082]        CPU0                    CPU1
[ 1318.038101][ T1082]        ----                    ----
[ 1318.038350][ T1082]   lock(reservation_ww_class_mutex);
[ 1318.038590][ T1082]                                lock(reservation_ww_class_acquire);
[ 1318.038839][ T1082]                                lock(reservation_ww_class_mutex);
[ 1318.039083][ T1082]   rlock(&mm->mmap_lock);
[ 1318.039328][ T1082]
[ 1318.039328][ T1082]  *** DEADLOCK ***
[ 1318.039328][ T1082]
[ 1318.040029][ T1082] 1 lock held by tar/1082:
[ 1318.040259][ T1082]  #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu]
[ 1318.040560][ T1082]
[ 1318.040560][ T1082] stack backtrace:
[ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 #17 3316c85d50e282c5643b075d1f01a4f6365e39c2
[ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023
[ 1318.041614][ T1082] Call Trace:
[ 1318.041895][ T1082]  <TASK>
[ 1318.042175][ T1082]  dump_stack_lvl+0x4a/0x80
[ 1318.042460][ T1082]  check_noncircular+0x145/0x160
[ 1318.042743][ T1082]  __lock_acquire+0x14bf/0x2680
[ 1318.043022][ T1082]  lock_acquire+0xcd/0x2c0
[ 1318.043301][ T1082]  ? __might_fault+0x40/0x80
[ 1318.043580][ T1082]  ? __might_fault+0x40/0x80
[ 1318.043856][ T1082]  __might_fault+0x58/0x80
[ 1318.044131][ T1082]  ? __might_fault+0x40/0x80
[ 1318.044408][ T1082]  amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab]
[ 1318.044749][ T1082]  full_proxy_read+0x55/0x80
[ 1318.045042][ T1082]  vfs_read+0xa7/0x360
[ 1318.045333][ T1082]  ksys_read+0x70/0xf0
[ 1318.045623][ T1082]  do_syscall_64+0x94/0x180
[ 1318.045913][ T1082]  ? do_syscall_64+0xa0/0x180
[ 1318.046201][ T1082]  ? lockdep_hardirqs_on+0x7d/0x100
[ 1318.046487][ T1082]  ? do_syscall_64+0xa0/0x180
[ 1318.046773][ T1082]  ? do_syscall_64+0xa0/0x180
[ 1318.047057][ T1082]  ? do_syscall_64+0xa0/0x180
[ 1318.047337][ T1082]  ? do_syscall_64+0xa0/0x180
[ 1318.047611][ T1082]  entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d
[ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83
[ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d
[ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008
[ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007
[ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00
[ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800
[ 1318.050638][ T1082]  </TASK>

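amdgpu_debugfs_mqd_read() holds a reservation when it calls
put_user(), which may fault and acquire the mmap_lock (formerly
mmap_sem). This violates the established locking order, in which
mmap_lock is taken before reservation_ww_class_mutex, as the
dependency chain in the report above shows.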

Bounce the mqd data through a kernel buffer to get put_user() out of
the illegal section.
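
The bounce-buffer idea can be illustrated with a minimal userspace sketch.
This is not the kernel patch itself: toy_lock, toy_obj, toy_put_user() and
toy_mqd_read() are hypothetical stand-ins for the reservation, the mqd
object, put_user() and the debugfs read handler, and the "lock" only records
whether it is held so the rule can be asserted. The point it shows is the
ordering: snapshot the data into a private buffer while the lock is held,
drop the lock, and only then run the copy-out step that might fault.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy lock (assumption): only records whether it is held. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

/* Hypothetical object whose data may only be read under the lock. */
struct toy_obj {
	struct toy_lock resv;
	const uint32_t *data;
	size_t size;		/* in bytes, a multiple of 4 */
};

/* Stand-in for put_user(): must never run with the lock held. */
static void toy_put_user(const struct toy_obj *obj, uint32_t value, uint32_t *dst)
{
	assert(!obj->resv.held);
	*dst = value;
}

/* Returns the number of bytes copied, or -1 on bad alignment or allocation failure. */
long toy_mqd_read(struct toy_obj *obj, uint32_t *dst, size_t size, size_t *pos)
{
	uint32_t *kbuf;
	long done = 0;

	if ((*pos & 3) || (size & 3))
		return -1;
	if (obj->size == 0)
		return 0;

	kbuf = malloc(obj->size);
	if (!kbuf)
		return -1;

	/* Locked section: a plain memory copy that cannot fault on dst. */
	toy_lock_acquire(&obj->resv);
	memcpy(kbuf, obj->data, obj->size);
	toy_lock_release(&obj->resv);

	/* Copy-out happens from the private snapshot, with the lock dropped. */
	while (size && *pos < obj->size) {
		toy_put_user(obj, kbuf[*pos / 4], dst++);
		*pos += 4;
		size -= 4;
		done += 4;
	}

	free(kbuf);
	return done;
}
```

The cost of this pattern is one extra allocation and copy per read, which is
usually acceptable for a debugfs file that is read rarely.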

Fixes: 445d85e3c1 ("drm/amdgpu: add debugfs interface for reading MQDs")
Cc: stable@vger.kernel.org # v6.5+
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27 01:44:24 -04:00
amd drm/amdgpu: fix deadlock while reading mqd from debugfs 2024-03-27 01:44:24 -04:00
arm drm/drm_property: make replace_property_blob_from_id a DRM helper 2023-12-13 15:09:53 -05:00
armada drm: Use device_get_match_data() 2023-11-27 13:56:32 -06:00
aspeed drm: Use device_get_match_data() 2023-11-27 13:56:32 -06:00
ast This cycle, I2C removes the currently unused CLASS_DDC support 2024-01-18 17:29:01 -08:00
atmel-hlcdc
bridge drm/bridge: adv7511: fix crash on irq during probe 2024-02-21 16:29:58 +01:00
ci drm/ci: mark universal-plane-sanity as failing on SC7180 2024-02-19 17:31:18 -03:00
display drm-misc-next for v6.9: 2024-03-01 15:27:50 +10:00
etnaviv - various code cleanups 2024-03-08 12:36:55 +10:00
exynos Several fixups 2024-01-25 14:22:15 +10:00
fsl-dcu
gma500 drm: remove I2C_CLASS_DDC support 2024-01-18 21:10:41 +01:00
gud Merge drm/drm-next into drm-misc-next 2023-11-15 10:56:44 +01:00
hisilicon drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
hyperv drm/hyperv: Remove firmware framebuffers with aperture helper 2024-01-12 12:38:37 +01:00
i2c
i915 Merge drm/drm-next into drm-misc-next-fixes 2024-03-07 13:30:43 +01:00
imagination One fix for drm/plane to avoid a use-after-free and some additional 2024-01-05 10:31:54 +10:00
imx drm/imx/dcss: fix resource size calculation 2024-02-28 09:16:59 +00:00
ingenic drm: Clean-up superfluously selecting VT_HW_CONSOLE_BINDING 2024-01-12 13:58:20 +01:00
kmb drm/kmb: Convert to platform remove callback returning void 2023-11-21 09:18:52 +01:00
lib
lima drm/lima: standardize debug messages by ip name 2024-02-12 16:27:48 +08:00
logicvc Merge tag 'drm-misc-fixes-2023-11-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-next 2023-11-10 16:57:49 +01:00
loongson Merge drm/drm-next into drm-misc-next 2024-01-29 14:20:23 +01:00
mcde drm: Clean-up superfluously selecting VT_HW_CONSOLE_BINDING 2024-01-12 13:58:20 +01:00
mediatek Mediatek DRM Next for Linux 6.9 2024-03-01 19:14:33 +10:00
meson Linux 6.8-rc6 2024-02-26 11:41:07 +01:00
mgag200 drm/mgag200: Add a workaround for low-latency 2024-02-26 16:37:51 +01:00
msm drm/msm/dpu: capture snapshot on the first commit_done timeout 2024-03-04 11:44:03 +02:00
mxsfb drm: lcdif: Switch to drmm_mode_config_init 2024-02-26 08:33:45 +01:00
nouveau drm/nouveau: Include <linux/backlight.h> 2024-02-28 09:59:26 +01:00
omapdrm drm/omap/hdmi5: switch to ->edid_read callback 2024-02-09 10:16:01 +02:00
panel drm/dp: Don't attempt AUX transfers when eDP panels are not powered 2024-02-28 12:43:36 -08:00
panfrost Linux 6.7-rc5 2023-12-12 11:32:33 +10:00
pl111 drm: Clean-up superfluously selecting VT_HW_CONSOLE_BINDING 2024-01-12 13:58:20 +01:00
qxl drm/ttm: replace busy placement with flags v6 2024-01-25 09:59:44 +01:00
radeon amd-drm-next-6.9-2024-03-01: 2024-03-08 11:21:13 +10:00
renesas drm: renesas: rz-du: Fix redefinition errors related to rzg2l_du_vsp_*() 2024-02-22 14:46:41 +01:00
rockchip Linux 6.8-rc6 2024-02-26 11:41:07 +01:00
scheduler drm/scheduler: Simplify the allocation of slab caches in drm_sched_fence_slab_init 2024-02-28 15:55:13 +01:00
solomon drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
sprd drm/sprd: Convert to platform remove callback returning void 2023-11-21 09:18:53 +01:00
sti
stm
sun4i drm/sun4i: hdmi: Add missing drm_atomic header 2024-03-01 19:10:29 +10:00
tegra drm/tegra: put drm_gem_object ref on error in tegra_fb_create 2024-02-22 18:29:22 +01:00
tests drm/tests: connector: Add tests for drmm_connector_init 2024-02-28 16:38:33 +01:00
tidss drm/tidss: Fix sync-lost issue with two displays 2024-02-26 10:09:43 +02:00
tilcdc drm/tilcdc: request and mapp iomem with devres 2023-12-28 19:29:04 +02:00
tiny drm/simpledrm: Do not include <drm/drm_plane_helper.h> 2023-12-06 10:36:18 +01:00
ttm Linux 6.8-rc6 2024-02-26 11:41:07 +01:00
tve200 drm: Clean-up superfluously selecting VT_HW_CONSOLE_BINDING 2024-01-12 13:58:20 +01:00
udl drm/plane-helper: Move drm_plane_helper_atomic_check() into udl 2023-12-06 10:35:49 +01:00
v3d drm/v3d: Enable V3D to use different PAGE_SIZE 2024-02-23 16:37:20 -03:00
vboxvideo drm/vboxvideo: Use the hotspot properties from cursor planes 2023-11-24 11:58:00 +01:00
vc4 drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
vgem
virtio drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
vkms drm/vkms: Avoid reading beyond LUT array 2024-01-02 12:06:53 -01:00
vmwgfx drm/vmwgfx: Fix the lifetime of the bo cursor memory 2024-01-30 14:18:21 -05:00
xe drm/xe: Replace 'grouped target' in Makefile with pattern rule 2024-03-04 08:41:28 -06:00
xen
xlnx drm: xlnx: zynqmp_dpsub: switch to ->edid_read callback 2024-02-09 10:16:03 +02:00
drm_aperture.c
drm_atomic_helper.c drm-misc-next for $kernel-version: 2023-12-19 17:07:32 +10:00
drm_atomic_state_helper.c drm/drm_plane: track color mgmt changes per plane 2023-12-13 15:09:53 -05:00
drm_atomic_uapi.c drm/drm_property: make replace_property_blob_from_id a DRM helper 2023-12-13 15:09:53 -05:00
drm_atomic.c drm/drm_plane: track color mgmt changes per plane 2023-12-13 15:09:53 -05:00
drm_auth.c drm-next for 6.8: 2024-01-12 11:32:19 -08:00
drm_blend.c Revert "drm: Introduce pixel_source DRM plane property" 2023-12-04 21:33:10 +02:00
drm_bridge_connector.c drm/bridge: switch to drm_bridge_edid_read() 2024-02-08 17:10:44 +02:00
drm_bridge.c drm/bridge: remove ->get_edid callback 2024-02-09 10:16:20 +02:00
drm_buddy.c drm/buddy: Modify duplicate list_splice_tail call 2024-02-16 13:03:14 +01:00
drm_cache.c
drm_client_modeset.c
drm_client.c drm/client: Do not acquire module reference 2023-11-15 13:51:38 +01:00
drm_color_mgmt.c
drm_connector.c drm/doc: describe PATH format for DP MST 2023-10-27 16:01:10 +02:00
drm_crtc_helper_internal.h
drm_crtc_helper.c drm/plane-helper: Move drm_plane_helper_atomic_check() into udl 2023-12-06 10:35:49 +01:00
drm_crtc_internal.h Revert "drm/atomic: Add pixel source to plane state dump" 2023-12-04 21:33:07 +02:00
drm_crtc.c drm: Remove drm_num_crtcs() helper 2024-02-28 12:18:07 +01:00
drm_damage_helper.c drm: Allow drivers to indicate the damage helpers to ignore damage clips 2023-11-24 15:15:25 +01:00
drm_debugfs_crc.c
drm_debugfs.c drm/debugfs: drop unneeded DEBUG_FS guard 2024-01-02 15:50:13 +02:00
drm_displayid.c
drm_drv.c drm: Remove support for legacy drivers 2023-12-06 10:08:28 +01:00
drm_dumb_buffers.c
drm_edid_load.c drm/edid/firmware: Remove built-in EDIDs 2024-02-26 14:05:18 +01:00
drm_edid.c drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
drm_eld.c drm/eld: add helpers to modify the SADs of an ELD 2023-11-09 16:48:27 +02:00
drm_encoder_slave.c
drm_encoder.c drm/encoder: register per-encoder debugfs dir 2023-12-04 16:07:29 +02:00
drm_exec.c Merge drm/drm-next into drm-misc-next 2024-01-29 14:20:23 +01:00
drm_fb_dma_helper.c
drm_fb_helper.c
drm_fbdev_dma.c
drm_fbdev_generic.c
drm_file.c drm: update drm_show_memory_stats() for dma-bufs 2024-02-16 12:52:50 +01:00
drm_flip_work.c drm: Remove struct drm_flip_task from DRM interfaces 2023-11-14 10:23:11 +01:00
drm_format_helper.c drm/format-helper: Pass format-conversion state to helpers 2023-11-14 10:16:53 +01:00
drm_fourcc.c drm/fourcc: Add NV20 and NV30 YUV formats 2023-10-24 21:34:35 +02:00
drm_framebuffer.c drm: Warn when freeing a framebuffer that's still on a list 2023-12-23 07:31:29 +02:00
drm_gem_atomic_helper.c drm/atomic-helper: Add format-conversion state to shadow-plane state 2023-11-14 10:01:14 +01:00
drm_gem_dma_helper.c
drm_gem_framebuffer_helper.c
drm_gem_shmem_helper.c
drm_gem_ttm_helper.c
drm_gem_vram_helper.c drm/ttm: replace busy placement with flags v6 2024-01-25 09:59:44 +01:00
drm_gem.c drm: Do not overrun array in drm_gem_get_pages() 2023-10-12 10:44:06 +02:00
drm_gpuvm.c Merge tag 'drm-msm-next-2023-12-15' of https://gitlab.freedesktop.org/drm/msm into drm-next 2023-12-20 07:54:03 +10:00
drm_internal.h drm: Remove source code for non-KMS drivers 2023-12-06 10:08:37 +01:00
drm_ioc32.c drm/ioc32: replace __attribute__((packed)) with __packed 2023-12-14 12:16:58 +02:00
drm_ioctl.c drm: Remove locking for legacy ioctls and DRM_UNLOCKED 2023-12-06 10:08:32 +01:00
drm_kms_helper_common.c drm/edid/firmware: drop drm_kms_helper.edid_firmware backward compat 2023-11-21 12:22:48 +02:00
drm_lease.c drm_lease.c: copy user-array safely 2023-10-09 16:59:49 +10:00
drm_managed.c drm/managed: Add drmm_release_action 2024-01-17 10:38:39 +01:00
drm_mipi_dbi.c drm/format-helper: Pass format-conversion state to helpers 2023-11-14 10:16:53 +01:00
drm_mipi_dsi.c drm: mipi-dsi: make mipi_dsi_bus_type const 2024-02-07 12:35:10 +02:00
drm_mm.c
drm_mode_config.c drm/mode: switch from drm_debug_printer() to device specific drm_dbg_printer() 2024-02-09 11:51:59 +02:00
drm_mode_object.c drm: Refuse to async flip with atomic prop changes 2023-11-23 17:12:38 +01:00
drm_modes.c drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
drm_modeset_helper.c drm: Check output polling initialized before disabling 2024-02-28 15:07:15 +01:00
drm_modeset_lock.c drm: remove drm_debug_printer in favor of drm_dbg_printer 2024-02-09 11:52:43 +02:00
drm_of.c
drm_panel_orientation_quirks.c drm: panel-orientation-quirks: Add quirk for GPD Win Mini 2024-01-19 09:25:22 +01:00
drm_panel.c
drm_pci.c drm: Remove source code for non-KMS drivers 2023-12-06 10:08:37 +01:00
drm_plane_helper.c drm/plane-helper: Move drm_plane_helper_atomic_check() into udl 2023-12-06 10:35:49 +01:00
drm_plane.c drm: Don't unref the same fb many times by mistake due to deadlock handling 2023-12-23 07:31:05 +02:00
drm_prime.c drm/prime: Support page array >= 4GB 2024-02-13 16:36:04 +01:00
drm_print.c drm: remove drm_debug_printer in favor of drm_dbg_printer 2024-02-09 11:52:43 +02:00
drm_privacy_screen_x86.c
drm_privacy_screen.c
drm_probe_helper.c drm: Check polling initialized before enabling in drm_helper_probe_single_connector_modes 2024-02-28 15:07:22 +01:00
drm_property.c drm/drm_property: make replace_property_blob_from_id a DRM helper 2023-12-13 15:09:53 -05:00
drm_rect.c
drm_self_refresh_helper.c
drm_simple_kms_helper.c
drm_suballoc.c
drm_syncobj.c Linux 6.8-rc6 2024-02-26 11:41:07 +01:00
drm_sysfs.c
drm_trace_points.c
drm_trace.h
drm_vblank_work.c
drm_vblank.c drm: Remove support for legacy drivers 2023-12-06 10:08:28 +01:00
drm_vma_manager.c
drm_writeback.c
Kconfig drm-misc-next for v6.9: 2024-02-05 13:50:15 +10:00
Makefile drm/xe: Introduce a new DRM driver for Intel GPUs 2023-12-12 14:05:48 -05:00