mirror_ubuntu-kernels/drivers/gpu/drm/scheduler
Andrey Grodzovsky 45ecaea738 drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'
Problem:
This patch caused a negative refcount, as described in [1]. In that
case the parent fence had not signaled by the time of drm_sched_stop and
hence was kept in the pending list. The assumption was that such fences
will not signal, so the fence was put to account for the s_fence->parent
refcount. But for amdgpu, which has an embedded HW fence (always the same
parent fence), drm_sched_fence_release_scheduled was always called and
would still drop the count for the parent fence once more. For jobs that
never signaled, this imbalance was masked by a refcount bug in
amdgpu_fence_driver_clear_job_fences, which would not drop the
refcount on the fences that were removed from the fence driver's
fences array (to balance the get taken on insertion into the array
in amdgpu_fence_emit).

Fix:
Revert this patch and, by setting s_job->s_fence->parent to NULL
as before, prevent the extra refcount drop in amdgpu when
drm_sched_fence_release_scheduled is called on job release.
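
The imbalance and the effect of clearing the parent pointer can be sketched with a toy model. This is only an illustration under stated assumptions: every toy_* name is hypothetical, the structs model nothing but a counter and a pointer, and none of it is the actual scheduler or amdgpu code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a HW (parent) fence: only the reference count is modelled. */
struct toy_fence {
	int refcount;
};

/* Toy stand-in for the scheduler fence and its parent pointer. */
struct toy_sched_fence {
	struct toy_fence *parent;
};

static void toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
}

/* Like a fence put, a NULL pointer is tolerated and ignored. */
static void toy_fence_put(struct toy_fence *f)
{
	if (f)
		f->refcount--;
}

/*
 * Models the stop path for a job whose parent fence did not signal:
 * the reference held through the parent pointer is dropped.
 * clear_parent == true models the behaviour described under "Fix",
 * clear_parent == false models the behaviour described under "Problem".
 */
static void toy_sched_stop(struct toy_sched_fence *s, bool clear_parent)
{
	toy_fence_put(s->parent);
	if (clear_parent)
		s->parent = NULL;
}

/* Models the release path: puts the parent if the pointer is still set. */
static void toy_release_scheduled(struct toy_sched_fence *s)
{
	toy_fence_put(s->parent);
}

/* Runs stop, release and the owner's final put; returns the final count. */
static int toy_run(bool clear_parent)
{
	struct toy_fence hw = { .refcount = 1 };	/* creation reference */
	struct toy_sched_fence s = { .parent = &hw };

	toy_fence_get(&hw);			/* reference for the parent pointer */
	toy_sched_stop(&s, clear_parent);
	toy_release_scheduled(&s);
	toy_fence_put(&hw);			/* owner drops the creation reference */
	return hw.refcount;			/* a real fence would be freed at 0 */
}
```

In this model toy_run(true) ends balanced at 0, while toy_run(false) ends at -1, which corresponds to the extra drop described above.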

Also - align behaviour in drm_sched_resubmit_jobs_ext with that of
drm_sched_main when submitting jobs - take a refcount for the
new parent fence pointer and drop the refcount from the original kref_init
done on new HW fence creation (or fake new HW fence in amdgpu - see next patch).
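
A minimal sketch of that get-then-put ordering, again as a toy model: the demo_* names are hypothetical, and demo_run_job merely stands in for a driver callback that returns a fence carrying its creation reference.

```c
#include <stddef.h>

/* Toy fence: only the reference count is modelled. */
struct demo_fence {
	int refcount;
};

/* Toy job holding the parent fence pointer. */
struct demo_job {
	struct demo_fence *parent;
};

/* Hypothetical run-job stand-in: the new fence starts with one reference. */
static struct demo_fence *demo_run_job(struct demo_fence *storage)
{
	storage->refcount = 1;
	return storage;
}

static struct demo_fence *demo_fence_get(struct demo_fence *f)
{
	f->refcount++;
	return f;
}

static void demo_fence_put(struct demo_fence *f)
{
	f->refcount--;
}

/*
 * Take a reference for the parent pointer, then drop the creation
 * reference, so the pointer ends up owning exactly one reference.
 */
static int demo_submit(struct demo_job *job, struct demo_fence *storage)
{
	struct demo_fence *fence = demo_run_job(storage);

	job->parent = demo_fence_get(fence);	/* count: 2 */
	demo_fence_put(fence);			/* count: 1 */
	return job->parent->refcount;
}
```

After demo_submit the only remaining reference is the one owned by job->parent, so a single later put on that pointer balances the count.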

[1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yiqing Yao <yiqing.yao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-28 11:24:31 -04:00
gpu_scheduler_trace.h drm/sched: use __string in tracepoints 2022-04-26 15:11:00 -04:00
Makefile drm/scheduler: rename gpu_scheduler.c to sched_main.c 2018-08-27 11:10:44 -05:00
sched_entity.c drm/sched: Avoid lockdep spalt on killing a processes 2021-11-01 11:08:21 -04:00
sched_fence.c drm/sched: Fix drm_sched_fence_free() so it can be passed an uninitialized fence 2021-09-07 09:58:26 +02:00
sched_main.c drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer' 2022-06-28 11:24:31 -04:00