Commit Graph

309 Commits

Author SHA1 Message Date
Dave Airlie
b837d3db9a drm-misc-next for 6.2:
UAPI Changes:
   - Documentation for page-flip flags
 
 Cross-subsystem Changes:
   - dma-buf: Add unlocked variant of vmapping and attachment-mapping
     functions
 
 Core Changes:
   - atomic-helpers: CRTC primary plane test fixes
   - connector: TV API consistency improvements, cmdline parsing
     improvements
   - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
   - edid: Fixes for HFVSDB parsing,
   - fourcc: Addition of the Vivante tiled modifier
   - makefile: Sort and reorganize the objects files
   - mode_config: Remove fb_base from drm_mode_config_funcs
   - sched: Add a module parameter to change the scheduling policy,
     refcounting fix for fences
   - tests: Sort the Kunit tests in the Makefile, improvements to the
     DP-MST tests
   - ttm: Remove unnecessary drm_mm_clean() call
 
 Driver Changes:
   - New driver: ofdrm
   - Move all drivers to a common dma-buf locking convention
   - bridge:
     - adv7533: Remove dynamic lane switching
     - it6505: Runtime PM support
     - ps8640: Handle AUX defer messages
     - tc358775: Drop soft-reset over I2C
   - ast: Atomic Gamma LUT Support, Convert to SHMEM, various
     improvements
   - lcdif: Support for YUV planes
   - mgag200: Fix PLL Setup on some revisions
   - udl: Modesetting improvements, hot-unplug support
   - vc4: Fix support for PAL-M
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCY1D3fQAKCRDj7w1vZxhR
 xZZgAPoCSfyU3+M3ALT0vQ/OpktE5tuwUBMac2Lxkqgx4dnQ0gEAnYeez0Dedod8
 HNdgBH7FTklYa/zT8n17SwHUOzJJ5gc=
 =YvuW
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2022-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 6.2:

UAPI Changes:
  - Documentation for page-flip flags

Cross-subsystem Changes:
  - dma-buf: Add unlocked variant of vmapping and attachment-mapping
    functions

Core Changes:
  - atomic-helpers: CRTC primary plane test fixes
  - connector: TV API consistency improvements, cmdline parsing
    improvements
  - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
  - edid: Fixes for HFVSDB parsing,
  - fourcc: Addition of the Vivante tiled modifier
  - makefile: Sort and reorganize the objects files
  - mode_config: Remove fb_base from drm_mode_config_funcs
  - sched: Add a module parameter to change the scheduling policy,
    refcounting fix for fences
  - tests: Sort the Kunit tests in the Makefile, improvements to the
    DP-MST tests
  - ttm: Remove unnecessary drm_mm_clean() call

Driver Changes:
  - New driver: ofdrm
  - Move all drivers to a common dma-buf locking convention
  - bridge:
    - adv7533: Remove dynamic lane switching
    - it6505: Runtime PM support
    - ps8640: Handle AUX defer messages
    - tc358775: Drop soft-reset over I2C
  - ast: Atomic Gamma LUT Support, Convert to SHMEM, various
    improvements
  - lcdif: Support for YUV planes
  - mgag200: Fix PLL Setup on some revisions
  - udl: Modesetting improvements, hot-unplug support
  - vc4: Fix support for PAL-M

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020072405.g3o4hxuk75gmeumw@houat
2022-10-25 11:42:18 +10:00
Thomas Zimmermann
1aca5ce036 Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get v6.1-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-10-20 09:09:00 +02:00
Christian König
7b476affcc drm/sched: add DRM_SCHED_FENCE_DONT_PIPELINE flag
Setting this flag on a scheduler fence prevents pipelining of jobs
depending on this fence. In other words we always insert a full CPU
round trip before dependent jobs are pushed to the pipeline.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2113#note_1579296
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014081553.114899-1-christian.koenig@amd.com
2022-10-19 12:42:51 +02:00
Maxime Ripard
a140a6a2d5
Merge drm/drm-next into drm-misc-next
Let's kick-off this release cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-10-18 15:00:03 +02:00
Dave Airlie
bafaf67c42 Revert "drm/sched: Use parent fence instead of finished"
This reverts commit e4dc45b184.

This is causing instability on Linus' desktop, and I'm seeing
oops with VK CTS runs.

netconsole got me the following oops:
[ 1234.778760] BUG: kernel NULL pointer dereference, address: 0000000000000088
[ 1234.778782] #PF: supervisor read access in kernel mode
[ 1234.778787] #PF: error_code(0x0000) - not-present page
[ 1234.778791] PGD 0 P4D 0
[ 1234.778798] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1234.778803] CPU: 7 PID: 805 Comm: systemd-journal Not tainted 6.0.0+ #2
[ 1234.778809] Hardware name: System manufacturer System Product
Name/PRIME X370-PRO, BIOS 5603 07/28/2020
[ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched]
[ 1234.778828] Code: aa 0f 1d ce e9 57 ff ff ff 48 89 d7 e8 9d 8f 3f
ce e9 4a ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53
48 89 fb <48> 8b af 88 00 00 00 f0 ff 8d f0 00 00 00 48 8b 85 80 01 00
00 f0
[ 1234.778834] RSP: 0000:ffffabe680380de0 EFLAGS: 00010087
[ 1234.778839] RAX: ffffffffc04e9230 RBX: 0000000000000000 RCX: 0000000000000018
[ 1234.778897] RDX: 00000ba278e8977a RSI: ffff953fb288b460 RDI: 0000000000000000
[ 1234.778901] RBP: ffff953fb288b598 R08: 00000000000000e0 R09: ffff953fbd98b808
[ 1234.778905] R10: 0000000000000000 R11: ffffabe680380ff8 R12: ffffabe680380e00
[ 1234.778908] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff953fbd9ec458
[ 1234.778912] FS:  00007f35e7008580(0000) GS:ffff95428ebc0000(0000)
knlGS:0000000000000000
[ 1234.778916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1234.778919] CR2: 0000000000000088 CR3: 000000010147c000 CR4: 00000000003506e0
[ 1234.778924] Call Trace:
[ 1234.778981]  <IRQ>
[ 1234.778989]  dma_fence_signal_timestamp_locked+0x6a/0xe0
[ 1234.778999]  dma_fence_signal+0x2c/0x50
[ 1234.779005]  amdgpu_fence_process+0xc8/0x140 [amdgpu]
[ 1234.779234]  sdma_v3_0_process_trap_irq+0x70/0x80 [amdgpu]
[ 1234.779395]  amdgpu_irq_dispatch+0xa9/0x1d0 [amdgpu]
[ 1234.779609]  amdgpu_ih_process+0x80/0x100 [amdgpu]
[ 1234.779783]  amdgpu_irq_handler+0x1f/0x60 [amdgpu]
[ 1234.779940]  __handle_irq_event_percpu+0x46/0x190
[ 1234.779946]  handle_irq_event+0x34/0x70
[ 1234.779949]  handle_edge_irq+0x9f/0x240
[ 1234.779954]  __common_interrupt+0x66/0x100
[ 1234.779960]  common_interrupt+0xa0/0xc0
[ 1234.779965]  </IRQ>
[ 1234.779968]  <TASK>
[ 1234.779971]  asm_common_interrupt+0x22/0x40
[ 1234.779976] RIP: 0010:finish_mkwrite_fault+0x22/0x110
[ 1234.779981] Code: 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 41
54 55 48 89 fd 53 48 8b 07 f6 40 50 08 0f 84 eb 00 00 00 48 8b 45 30
48 8b 18 <48> 89 df e8 66 bd ff ff 48 85 c0 74 0d 48 89 c2 83 e2 01 48
83 ea
[ 1234.779985] RSP: 0000:ffffabe680bcfd78 EFLAGS: 00000202

Revert it for now and figure it out later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-10-07 12:58:39 +10:00
Christian König
65b698bf40 drm/sched: add missing NULL check in drm_sched_get_cleanup_job v2
Otherwise we would crash if the job is not resubmitted.

v2: fix second usage of s_fence->parent as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221004132831.134986-1-christian.koenig@amd.com
2022-10-05 12:38:00 +02:00
Christian König
7c022f516f drm/scheduler: fix fence ref counting
We leaked dependency fences when processes were beeing killed.

Additional to that grab a reference to the last scheduled fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220929180151.139751-1-christian.koenig@amd.com
2022-10-05 12:36:20 +02:00
Andrey Grodzovsky
08fb97de03 drm/sched: Add FIFO sched policy to run queue
When many entities are competing for the same run queue
on the same scheduler, we observe an unusually long wait
times and some jobs get starved. This has been observed on GPUVis.

The issue is due to the Round Robin policy used by schedulers
to pick up the next entity's job queue for execution. Under stress
of many entities and long job queues within entity some
jobs could be stuck for very long time in it's entity's
queue before being popped from the queue and executed
while for other entities with smaller job queues a job
might execute earlier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entities in run queue, chose next entity
on run queue in such order that if job on one entity arrived
earlier then job on another entity the first job will start
executing earlier regardless of the length of the entity's job
queue.

v2:
Switch to rb tree structure for entities based on TS of
oldest job waiting in the job queue of an entity. Improves next
entity extraction to O(1). Entity TS update
O(log N) where N is the number of entities in the run-queue

Drop default option in module control parameter.

v3:
Various cosmetical fixes and minor refactoring of fifo update function. (Luben)

v4:
Switch drm_sched_rq_select_entity_fifo to in order search (Luben)

v5: Fix up drm_sched_rq_select_entity_fifo loop (Luben)

v6: Add missing drm_sched_rq_remove_fifo_locked

v7: Fix ts sampling bug and more cosmetic stuff (Luben)

v8: Fix module parameter string (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yunxiang Li (Teddy) <Yunxiang.Li@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220930041258.1050247-1-luben.tuikov@amd.com
2022-09-30 09:12:08 -04:00
Arvind Yadav
e4dc45b184 drm/sched: Use parent fence instead of finished
Using the parent fence instead of the finished fence
to get the job status. This change is to avoid GPU
scheduler timeout error which can cause GPU reset.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220914164321.2156-6-Arvind.Yadav@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2022-09-16 15:53:25 +02:00
Randy Dunlap
f8ad757e40 drm/scheduler: quieten kernel-doc warnings
Fix kernel-doc warnings in gpu_scheduler.h and sched_main.c.

Quashes these warnings:

include/drm/gpu_scheduler.h:332: warning: missing initial short description on line:
 * struct drm_sched_backend_ops
include/drm/gpu_scheduler.h:412: warning: missing initial short description on line:
 * struct drm_gpu_scheduler
include/drm/gpu_scheduler.h:461: warning: Function parameter or member 'dev' not described in 'drm_gpu_scheduler'

drivers/gpu/drm/scheduler/sched_main.c:201: warning: missing initial short description on line:
 * drm_sched_dependency_optimized
drivers/gpu/drm/scheduler/sched_main.c:995: warning: Function parameter or member 'dev' not described in 'drm_sched_init'

Fixes: 2d33948e4e ("drm/scheduler: add documentation")
Fixes: 8ab62eda17 ("drm/sched: Add device pointer to drm_gpu_scheduler")
Fixes: 542cff7893 ("drm/sched: Avoid lockdep spalt on killing a processes")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Jiawei Gu <Jiawei.Gu@amd.com>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220404213040.12912-1-rdunlap@infradead.org
2022-09-06 16:14:28 -04:00
Christian König
6d602e0311 drm/sched: move calling drm_sched_entity_select_rq
We already discussed that the call to drm_sched_entity_select_rq() needs
to move to drm_sched_job_arm() to be able to set a new scheduler list
between _init() and _arm(). This was just not applied for some reason.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714103902.7084-2-christian.koenig@amd.com
2022-07-19 17:22:25 +02:00
Dave Airlie
344feb7ccf Merge tag 'amd-drm-next-5.20-2022-07-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-05:

amdgpu:
- Various spelling and grammer fixes
- Various eDP fixes
- Various DMCUB fixes
- VCN fixes
- GMC 11 fixes
- RAS fixes
- TMZ support for GC 10.3.7
- GPUVM TLB flush fixes
- SMU 13.0.x updates
- DCN 3.2 Support
- DCN 3.2.1 Support
- MES updates
- GFX11 modifiers support
- USB-C fixes
- MMHUB 3.0.1 support
- SDMA 6.0 doorbell fixes
- Initial devcoredump support
- Enable high priority gfx queue on asics which support it
- Enable GPU reset for SMU 13.0.4
- OLED display fixes
- MPO fixes
- DC frame size fixes
- ASPM support for PCIE 7.4/7.6
- GPU reset support for SMU 13.0.0
- GFX11 updates
- VCN JPEG fix
- BACO support for SMU 13.0.7
- VCN instance handling fix
- GFX8 GPUVM TLB flush fix
- GPU reset rework
- VCN 4.0.2 support
- GTT size fixes
- DP link training fixes
- LSDMA 6.0.1 support
- Various backlight fixes
- Color encoding fixes
- Backlight config cleanup
- VCN 4.x unified queue cleanup

amdkfd:
- MMU notifier fixes
- Updates for GC 10.3.6 and 10.3.7
- P2P DMA support using dma-buf
- Add available memory IOCTL
- SDMA 6.0.1 fix
- MES fixes
- HMM profiler support

radeon:
- License fix
- Backlight config cleanup

UAPI:
- Add available memory IOCTL to amdkfd
  Proposed userspace: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
- HMM profiler support for amdkfd
  Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080805.html

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220705212633.6037-1-alexander.deucher@amd.com
2022-07-12 11:07:32 +10:00
Andrey Grodzovsky
45ecaea738 drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'
Problem:
This patch caused negative refcount as described in [1] because
for that case parent fence did not signal by the time of drm_sched_stop and hence
kept in pending list the assumption was they will not signal and
so fence was put to account for the s_fence->parent refcount but for
amdgpu which has embedded HW fence (always same parent fence)
drm_sched_fence_release_scheduled was always called and would
still drop the count for parent fence once more. For jobs that
never signaled this imbalance was masked by refcount bug in
amdgpu_fence_driver_clear_job_fences that would not drop
refcount on the fences that were removed from fence drive
fences array (against prevois insertion into the array in
get in amdgpu_fence_emit).

Fix:
Revert this patch and by setting s_job->s_fence->parent to NULL
as before prevent the extra refcount drop in amdgpu when
drm_sched_fence_release_scheduled is called on job release.

Also - align behaviour in drm_sched_resubmit_jobs_ext with that of
drm_sched_main when submitting jobs - take a refcount for the
new parent fence pointer and drop refcount for original kref_init
for new HW fence creation (or fake new HW fence in amdgpu - see next patch).

[1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yiqing Yao <yiqing.yao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-28 11:24:31 -04:00
Dmitry Osipenko
7d64c40a7d drm/scheduler: Don't kill jobs in interrupt context
Interrupt context can't sleep. Drivers like Panfrost and MSM are taking
mutex when job is released, and thus, that code can sleep. This results
into "BUG: scheduling while atomic" if locks are contented while job is
freed. There is no good reason for releasing scheduler's jobs in IRQ
context, hence use normal context to fix the trouble.

Cc: stable@vger.kernel.org
Fixes: 542cff7893 ("drm/sched: Avoid lockdep spalt on killing a processes")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220411221536.283312-1-dmitry.osipenko@collabora.com
2022-05-17 10:06:41 -04:00
Chia-I Wu
e87826efa9 drm/sched: use __string in tracepoints
Otherwise, ring names are marked [UNSAFE-MEMORY].

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220412204809.824491-2-olvaffe@gmail.com
2022-04-26 15:11:00 -04:00
Chia-I Wu
4a35c23f91 drm/sched: use DECLARE_EVENT_CLASS
drm_sched_job and drm_run_job have the same prototype.

v2: rename the class from drm_sched_job_entity to drm_sched_job (Andrey)

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220412204809.824491-1-olvaffe@gmail.com
2022-04-26 15:11:00 -04:00
Christian König
7bc80a5462 dma-buf: add enum dma_resv_usage v4
This change adds the dma_resv_usage enum and allows us to specify why a
dma_resv object is queried for its containing fences.

Additional to that a dma_resv_usage_rw() helper function is added to aid
retrieving the fences for a read or write userspace submission.

This is then deployed to the different query functions of the dma_resv
object and all of their users. When the write paratermer was previously
true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise.

v2: add KERNEL/OTHER in separate patch
v3: some kerneldoc suggestions by Daniel
v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in
    the rebase pointed out by Bas.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com
2022-04-07 12:53:53 +02:00
Daniel Vetter
b892d39199 drm/sched: Check locking in drm_sched_job_add_implicit_dependencies
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.

v2: Fix commit summary (Christian)

Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220331204651.2699107-4-daniel.vetter@ffwll.ch
2022-04-04 16:46:34 +02:00
Jiawei Gu
8ab62eda17 drm/sched: Add device pointer to drm_gpu_scheduler
Add device pointer so scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier under multiple GPU
scenario.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com
2022-02-23 10:04:14 +01:00
Dave Airlie
c18c889111 drm-misc-next for 5.17:
UAPI Changes:
 
  * Remove restrictions on DMA_BUF_SET_NAME ioctl
  * connector: State of privacy screen
  * sysfs: Send hotplug uevent
 
 Cross-subsystem Changes:
 
  * clk/bmc-2835: Fixes
 
  * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
    helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes
 
  * pwm: Introduce of_pwm_single_xlate()
 
 Core Changes:
 
  * Support for privacy screens
  * Make drm_irq.c legacy
  * Fix __stack_depot_* name conflict
  * Documentation fixes
  * Fixes and cleanups
 
  * dp-helper: Reuse 8b/10b link-training delay helpers
 
  * format-helper: Update interfaces
 
  * fb-helper: Allocate shadow buffer of correct size
 
  * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
 	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules
 
  * gem/shmem-helper: Interface cleanups
 
  * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
    Lockdep fixes
 
  * kms-helpers: Link several files from core into the KMS-helper module
 
 Driver Changes:
 
  * Use dma_resv_iter in several places
  * Fixes and cleanups
 
  * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
    at once
 
  * bridge: Switch to managed MIPI DSI helpers in several places; Register
    and attach during probe in several places; Convert to YAML in several
    places
 
  * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes
 
  * bridge/dw-hdmi: Allow interlace on bridge
 
  * bridge/ps8640: Enable PM; Support aux-bus
 
  * bridge/tc358768: Enabled reference clock; Support pulse mode;
    Modesetting fixes
 
  * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM
 
  * etnaviv: Get all fences at once
 
  * gma500: GEM object cleanups; Remove generic drivers in probe function
 
  * i915: Support VESA panel backlights
 
  * ingenic: Fixes and cleanups
 
  * kirin: Adjust probe order
 
  * kmb: Enable framebuffer console
 
  * lima: Kconfig fixes
 
  * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER
 
  * msm: Fixes and cleanups
 
  * msm/dsi: Adjust probe order
 
  * omap: Fixes and cleanups
 
  * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
    quantization to FULL; Fixes and cleanups
 
  * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
    Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
    BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
    drivers; Fixes and cleanups
 
  * panel/ili9881c: Orientation fixes
 
  * radeon: Use dma_resv_wait_timeout()
 
  * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
    fixes; Implement mmap in GEM object functions
 
  * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes
 
  * sun4i: Use CMA helpers without vmap support
 
  * tidss: Fixes and cleanups
 
  * v3d: Cleanups
 
  * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
    while disabling; Support 4k@60 Hz modes; Fixes and cleanups
 
  * video: Convert to sysfs_emit() in several places
 
  * video/omapfb: Fix fall-through
 
  * virtio: Overflow fixes
 
  * xen: Implement mmap as GEM object functions
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGWF0wACgkQaA3BHVML
 eiN55ggAr6QN7S7Uxu98XnqfAHC9RErY7r3PoTXXS6ODvxY41bWOpHk8TQzuw626
 JCNnpQCk6Gi8L3yl8r/l1fqoirGXrfDR1YvrnmG4I9xhPxOqBmgxDWw7HQrROm2B
 FctOvgFukvzn5jzQk2FqYgs5JBV20WqLrfEhttPFFMvjLGti/U31/+d+aGMJdRIQ
 kE2eVpPZtAlbBAP+S4mglKp6w+WrrzNHHULSGOPcGS5jwLrNFaQg7w75JiLInzyR
 RWum28USXSkKE0d6XBqHw+PWAwo3B/Vq9XSnbCX4+3GQX4tJh+hosyU7ju+rA0BY
 FJCyN/YgZvc+474o/LVtMh7PbKMEHQ==
 =TV6Q
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-11-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.17:

UAPI Changes:

 * Remove restrictions on DMA_BUF_SET_NAME ioctl
 * connector: State of privacy screen
 * sysfs: Send hotplug uevent

Cross-subsystem Changes:

 * clk/bmc-2835: Fixes

 * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
   helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes

 * pwm: Introduce of_pwm_single_xlate()

Core Changes:

 * Support for privacy screens
 * Make drm_irq.c legacy
 * Fix __stack_depot_* name conflict
 * Documentation fixes
 * Fixes and cleanups

 * dp-helper: Reuse 8b/10b link-training delay helpers

 * format-helper: Update interfaces

 * fb-helper: Allocate shadow buffer of correct size

 * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules

 * gem/shmem-helper: Interface cleanups

 * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
   Lockdep fixes

 * kms-helpers: Link several files from core into the KMS-helper module

Driver Changes:

 * Use dma_resv_iter in several places
 * Fixes and cleanups

 * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
   at once

 * bridge: Switch to managed MIPI DSI helpers in several places; Register
   and attach during probe in several places; Convert to YAML in several
   places

 * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes

 * bridge/dw-hdmi: Allow interlace on bridge

 * bridge/ps8640: Enable PM; Support aux-bus

 * bridge/tc358768: Enabled reference clock; Support pulse mode;
   Modesetting fixes

 * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM

 * etnaviv: Get all fences at once

 * gma500: GEM object cleanups; Remove generic drivers in probe function

 * i915: Support VESA panel backlights

 * ingenic: Fixes and cleanups

 * kirin: Adjust probe order

 * kmb: Enable framebuffer console

 * lima: Kconfig fixes

 * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER

 * msm: Fixes and cleanups

 * msm/dsi: Adjust probe order

 * omap: Fixes and cleanups

 * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
   quantization to FULL; Fixes and cleanups

 * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
   Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
   BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
   drivers; Fixes and cleanups

 * panel/ili9881c: Orientation fixes

 * radeon: Use dma_resv_wait_timeout()

 * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
   fixes; Implement mmap in GEM object functions

 * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes

 * sun4i: Use CMA helpers without vmap support

 * tidss: Fixes and cleanups

 * v3d: Cleanups

 * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
   while disabling; Support 4k@60 Hz modes; Fixes and cleanups

 * video: Convert to sysfs_emit() in several places

 * video/omapfb: Fix fall-through

 * virtio: Overflow fixes

 * xen: Implement mmap as GEM object functions

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YZYZSypIrr+qcih3@linux-uq9g.fritz.box
2021-11-23 09:38:55 +10:00
Rob Clark
963d0b3569 drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder
drm_sched_job_add_dependency() could drop the last ref, so we need to do
the dma_fence_get() first.

Cc: Christian König <christian.koenig@amd.com>
Fixes: 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211116155545.473311-1-robdclark@gmail.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-11-17 08:21:03 +01:00
Christian König
4eaf02d607 drm/scheduler: fix drm_sched_job_add_implicit_dependencies
Trivial fix since we now need to grab a reference to the fence we have
added. Previously the dma_resv function where doing that for us.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
Link: https://patchwork.freedesktop.org/patch/msgid/20211019112706.27769-1-christian.koenig@amd.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
References: https://lore.kernel.org/dri-devel/2023306.UmlnhvANQh@archbook/
Tested-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
2021-11-16 10:00:58 +01:00
Andrey Grodzovsky
542cff7893 drm/sched: Avoid lockdep spalt on killing a processes
Probelm:
Singlaning one sched fence from within another's sched
fence singal callback generates lockdep splat because
the both have same lockdep class of their fence->lock

Fix:
Fix bellow stack by rescheduling to irq work of
signaling and killing of jobs that left when entity is killed.

[11176.741181]  dump_stack+0x10/0x12
[11176.741186] __lock_acquire.cold+0x208/0x2df
[11176.741197]  lock_acquire+0xc6/0x2d0
[11176.741204]  ? dma_fence_signal+0x28/0x80
[11176.741212] _raw_spin_lock_irqsave+0x4d/0x70
[11176.741219]  ? dma_fence_signal+0x28/0x80
[11176.741225]  dma_fence_signal+0x28/0x80
[11176.741230] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741240] drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
[11176.741248] dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741254]  dma_fence_signal+0x3b/0x80
[11176.741260] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741268] drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
[11176.741277] drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
[11176.741284] dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741290]  dma_fence_signal+0x3b/0x80
[11176.741296] amdgpu_fence_process+0xd1/0x140 [amdgpu]
[11176.741504] sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
[11176.741731]  amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
[11176.741954]  amdgpu_ih_process+0x81/0x100 [amdgpu]
[11176.742174]  amdgpu_irq_handler+0x26/0xa0 [amdgpu]
[11176.742393] __handle_irq_event_percpu+0x4f/0x2c0
[11176.742402] handle_irq_event_percpu+0x33/0x80
[11176.742408]  handle_irq_event+0x39/0x60
[11176.742414]  handle_edge_irq+0x93/0x1d0
[11176.742419]  __common_interrupt+0x50/0xe0
[11176.742426]  common_interrupt+0x80/0x90

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Daniel Vetter  <daniel.vetter@ffwll.ch>
Suggested-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/dri-devel/msg321250.html
2021-11-01 11:08:21 -04:00
Christian König
13e9e30caf drm/scheduler: fix drm_sched_job_add_implicit_dependencies
Trivial fix since we now need to grab a reference to the fence we have
added. Previously the dma_resv function where doing that for us.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
Link: https://patchwork.freedesktop.org/patch/msgid/20211019112706.27769-1-christian.koenig@amd.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
References: https://lore.kernel.org/dri-devel/2023306.UmlnhvANQh@archbook/
Tested-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
2021-10-19 15:11:26 +02:00
Christian König
9c2ba26535 drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2
Simplifying the code a bit.

v2: use dma_resv_for_each_fence

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-17-christian.koenig@amd.com
2021-10-07 14:49:11 +02:00
Monk Liu
bcf26654a3 drm/sched: fix the bug of time out calculation(v4)
issue:
in cleanup_job the cancle_delayed_work will cancel a TO timer
even the its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the heading job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the begining of the cleanup_job
recover the implement of drm_sched_job_begin.

v4:
remove the kthread_should_park() checking in cleanup_job routine,
we should cleanup the signaled job asap

TODO:
1)introduce pause/resume scheduler in job_timeout to serial the handling
of scheduler and job_timeout.
2)drop the bad job's del and insert in scheduler due to above serialization
(no race issue anymore with the serialization)

Tested-by: jingwen <jingwen.chen@@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1630457207-13107-1-git-send-email-Monk.Liu@amd.com
2021-09-15 10:21:30 -04:00
Boris Brezillon
d4c16733e7 drm/sched: Fix drm_sched_fence_free() so it can be passed an uninitialized fence
drm_sched_job_cleanup() will pass an uninitialized fence to
drm_sched_fence_free(), which will cause to_drm_sched_fence() to return
a NULL fence object, causing a NULL pointer deref when this NULL object
is passed to kmem_cache_free().

Let's create a new drm_sched_fence_free() function that takes a
drm_sched_fence pointer and suffix the old function with _rcu. While at
it, complain if drm_sched_fence_free() is passed an initialized fence
or if drm_sched_fence_free_rcu() is passed an uninitialized fence.

Fixes: dbe48d030b ("drm/sched: Split drm_sched_job_init")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210903120554.444101-1-boris.brezillon@collabora.com
2021-09-07 09:58:26 +02:00
Christian König
d72277b6c3 dma-buf: nuke DMA_FENCE_TRACE macros v2
Only the DRM GPU scheduler, radeon and amdgpu where using them and they depend
on a non existing config option to actually emit some code.

v2: keep the signal path as is for now

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818105443.1578-1-christian.koenig@amd.com
2021-09-02 12:40:52 +02:00
Daniel Vetter
981b04d968 drm/sched: improve docs around drm_sched_entity
I found a few too many things that are tricky and not documented, so I
started typing.

I found a few more things that looked broken while typing, see the
varios FIXME in drm_sched_entity.

Also some of the usual logics:
- actually include sched_entity.c declarations, that was lost in the
  move here: 620e762f9a ("drm/scheduler: move entity handling into
  separate file")

- Ditch the kerneldoc for internal functions, keep the comments where
  they're describing more than what the function name already implies.

- Switch drm_sched_entity to inline docs.

Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-7-daniel.vetter@ffwll.ch
2021-08-30 10:57:18 +02:00
Daniel Vetter
0e10e9a1db drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30 10:54:45 +02:00
Daniel Vetter
ebd5f74255 drm/sched: Add dependency tracking
Instead of just a callback we can just glue in the gem helpers that
panfrost, v3d and lima currently use. There's really not that many
ways to skin this cat.

v2/3: Rebased.

v4: Repaint this shed. The functions are now called _add_dependency()
and _add_implicit_dependency()

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v3)
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-5-daniel.vetter@ffwll.ch
2021-08-30 10:54:12 +02:00
Daniel Vetter
b0a5303d4e drm/sched: Barriers are needed for entity->last_scheduled
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this proplery, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually requiring ordering between that and the queue state.

v2: Put smp_rmp() in the right place and fix up comment (Andrey)

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-4-daniel.vetter@ffwll.ch
2021-08-30 10:53:48 +02:00
Daniel Vetter
dbe48d030b drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-30 10:50:44 +02:00
Boris Brezillon
78efe21b6f drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-07-01 08:53:25 +02:00
Boris Brezillon
3b5ac97ad4 drm/sched: Declare entity idle only after HW submission
The panfrost driver tries to kill in-flight jobs on FD close after
destroying the FD scheduler entities. For this to work properly, we
need to make sure the jobs popped from the scheduler entities have
been queued at the HW level before declaring the entity idle, otherwise
we might iterate over a list that doesn't contain those jobs.

Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624140850.2229697-1-boris.brezillon@collabora.com
2021-06-28 13:16:49 +02:00
Andrey Grodzovsky
0b10ab8069 drm/sched: Avoid data corruptions
Wait for all dependencies of a job  to complete before
killing it to avoid data corruptions.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210519141407.88444-1-andrey.grodzovsky@amd.com
2021-05-19 23:50:28 -04:00
Andrey Grodzovsky
c61cdbdbff drm/scheduler: Fix hang when sched_entity released
Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity->stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-14-andrey.grodzovsky@amd.com
2021-05-19 23:50:28 -04:00
Andrey Grodzovsky
75973e5802 drm/sched: Make timeout timer rearm conditional.
We don't want to rearm the timer if driver hook reports
that the device is gone.

v5: Update drm_gpu_sched_stat values in code.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210512142648.666476-11-andrey.grodzovsky@amd.com
2021-05-19 23:50:28 -04:00
Roy Sun
1774baa64f drm/scheduler: Change scheduled fence track v2
Update the timestamp of scheduled fence on HW
completion of the previous fences

This allow more accurate tracking of the fence
execution in HW

v2 (chk): drop the flag check and improve the comment

Signed-off-by: David M Nieto <david.nieto@amd.com>
Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426062701.39732-1-Roy.Sun@amd.com
2021-05-05 09:26:36 +02:00
Maxime Ripard
355b602961
Merge drm/drm-next into drm-misc-next
Christian needs some patches from drm/next

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-04-26 14:03:09 +02:00
Lee Jones
04be0c5b40 drm/scheduler/sched_entity: Fix some function name disparity
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/scheduler/sched_entity.c:204: warning: expecting prototype for drm_sched_entity_kill_jobs(). Prototype was for drm_sched_entity_kill_jobs_cb() instead
 drivers/gpu/drm/scheduler/sched_entity.c:262: warning: expecting prototype for drm_sched_entity_cleanup(). Prototype was for drm_sched_entity_fini() instead
 drivers/gpu/drm/scheduler/sched_entity.c:305: warning: expecting prototype for drm_sched_entity_fini(). Prototype was for drm_sched_entity_destroy() instead

Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210416143725.2769053-25-lee.jones@linaro.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-04-22 14:05:17 +02:00
Jack Zhang
e6c6338f39 drm/amd/amdgpu implement tdr advanced mode
[Why]
Previous tdr design treats the first job in job_timeout as the bad job.
But sometimes a later bad compute job can block a good gfx job and
cause an unexpected gfx job timeout because gfx and compute ring share
internal GC HW mutually.

[How]
This patch implements an advanced tdr mode.It involves an additinal
synchronous pre-resubmit step(Step0 Resubmit) before normal resubmit
step in order to find the real bad job.

1. At Step0 Resubmit stage, it synchronously submits and pends for the
first job being signaled. If it gets timeout, we identify it as guilty
and do hw reset. After that, we would do the normal resubmit step to
resubmit left jobs.

2. For whole gpu reset(vram lost), do resubmit as the old way.

v2: squash in build fix (Alex)

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:45:45 -04:00
Christian König
ac4eb83ab2 drm/sched: select new rq even if there is only one v3
This is necessary when changing priorities of an entity.

v2: test the sched_list instead of num_sched.
v3: set the sched_list to NULL when there is only one entry

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210305125155.2312-1-christian.koenig@amd.com
2021-03-08 14:06:17 +01:00
Christian König
f2f12eb9c3 drm/scheduler: provide scheduler score externally
Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
2021-02-05 10:47:11 +01:00
Luben Tuikov
a6a1f036c7 drm/scheduler: Job timeout handler returns status (v3)
This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
2021-01-29 11:30:22 +01:00
Andrey Grodzovsky
e582951baa drm/sched: Cancel and flush all outstanding jobs before finish.
To avoid any possible use after free.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/414814/
CC: stable@vger.kernel.org
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-01-19 10:22:35 +01:00
Maarten Lankhorst
ae75a0431f Merge drm/drm-next into drm-misc-next
Required backmerge since we will be based on top of v5.11, and there
has been a request to backmerge already to upstream some features.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2020-12-15 11:05:43 +01:00
Dave Airlie
b10733527b Merge tag 'amd-drm-next-5.11-2020-12-09' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.11-2020-12-09:

amdgpu:
- SR-IOV fixes
- Navy Flounder updates
- Sienna Cichlid updates
- Dimgrey Cavefish updates
- Vangogh updates
- Misc SMU fixes
- Misc display fixes
- Last big hunk of W=1 warning fixes
- Cursor validation fixes
- CI BACO updates

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210045344.21566-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-12-10 16:55:53 +10:00
Luben Tuikov
71173e787c drm/scheduler: Essentialize the job done callback
The job done callback is called from various
places, in two ways: in job done role, and
as a fence callback role.

Essentialize the callback to an atom
function to just complete the job,
and into a second function as a prototype
of fence callback which calls to complete
the job.

This is used in latter patches by the completion
code.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/405574/

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-08 14:38:09 +01:00
Luben Tuikov
6efa4b465c gpu/drm: ring_mirror_list --> pending_list
Rename "ring_mirror_list" to "pending_list",
to describe what something is, not what it does,
how it's used, or how the hardware implements it.

This also abstracts the actual hardware
implementation, i.e. how the low-level driver
communicates with the device it drives, ring, CAM,
etc., shouldn't be exposed to DRM.

The pending_list keeps jobs submitted, which are
out of our control. Usually this means they are
pending execution status in hardware, but the
latter definition is a more general (inclusive)
definition.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/405573/

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-08 14:38:03 +01:00
Luben Tuikov
8935ff00e3 drm/scheduler: "node" --> "list"
Rename "node" to "list" in struct drm_sched_job,
in order to make it consistent with what we see
being used throughout gpu_scheduler.h, for
instance in struct drm_sched_entity, as well as
the rest of DRM and the kernel.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/403515/

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-12-08 14:37:55 +01:00
Mauro Carvalho Chehab
e9d2871f69 drm: fix some kernel-doc markups
Some identifiers have different names between their prototypes
and the kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
        identifier - description

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org
2020-11-16 20:48:20 +01:00
Lee Jones
00d44b966d gpu: drm: scheduler: sched_entity: Demote non-conformant kernel-doc headers
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/scheduler/sched_entity.c:316: warning: Function parameter or member 'f' not described in 'drm_sched_entity_clear_dep'
 drivers/gpu/drm/scheduler/sched_entity.c:316: warning: Function parameter or member 'cb' not described in 'drm_sched_entity_clear_dep'
 drivers/gpu/drm/scheduler/sched_entity.c:330: warning: Function parameter or member 'f' not described in 'drm_sched_entity_wakeup'
 drivers/gpu/drm/scheduler/sched_entity.c:330: warning: Function parameter or member 'cb' not described in 'drm_sched_entity_wakeup'

Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 00:03:31 -05:00
Lee Jones
26b5cf49cd gpu: drm: scheduler: sched_main: Provide missing description for 'sched' paramter
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/scheduler/sched_main.c:74: warning: Function parameter or member 'sched' not described in 'drm_sched_rq_init'

Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 00:03:24 -05:00
Dave Airlie
1cd260a790 drm-misc-next for 5.11:
UAPI Changes:
 
   - doc: rules for EBUSY on non-blocking commits; requirements for fourcc
     modifiers; on parsing EDID
   - fbdev/sbuslib: Remove unused FBIOSCURSOR32
   - fourcc: deprecate DRM_FORMAT_MOD_NONE
   - virtio: Support blob resources for memory allocations; Expose host-visible
     and cross-device features
 
 Cross-subsystem Changes:
 
   - devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
     Optoelectronics
   - dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
     dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro
 
 Core Changes:
 
   - atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
     non-blocking commits
   - dp: Prepare for DP 2.0 DPCD
   - dp_mst: Receive extended DPCD caps
   - dma-buf: Documentation
   - doc: Format modifiers; dma-buf-map; Cleanups
   - fbdev: Don't use compat_alloc_user_space(); mark as orphaned
   - fb-helper: Take lock in drm_fb_helper_restore_work_fb()
   - gem: Convert implementation and drivers to GEM object functions, remove
     GEM callbacks from struct drm_driver (expect gem_prime_mmap)
   - panel: Cleanups
   - pci: Add legacy infix to drm_irq_by_busid()
   - sched: Avoid infinite waits in drm_sched_entity_destroy()
   - switcheroo: Cleanups
   - ttm: Remove AGP support; Don't modify caching during swapout; Major
     refactoring of the implementation and API that affects all depending
     drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
     TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
     callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
     Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
     drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
     no more ttm_set_populated()
   - vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
     object functions with defaults; Default placement in system memory; Cleanups
 
 Driver Changes:
 
   - amdgpu: Use GEM object functions
   - armada: Use GEM object functions
   - aspeed: Configure output via sysfs; Init struct drm_driver with
   - ast: Reload LUT after FB format changes
   - bridge: Add driver and DT bindings for anx7625; Cleanups
   - bridge/dw-hdmi: Constify ops
   - bridge/ti-sn65dsi86: Add retries for link training
   - bridge/lvds-codec: Add support for regulator
   - bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
   - display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
     dma-coherent
   - display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
     dma-coherent
   - etnaviv: Use GEM object functions
   - exynos: Use GEM object functions
   - fbdev: Cleanups and compiler fixes throughout framebuffer drivers
   - fbdev/cirrusfb: Avoid division by 0
   - gma500: Use GEM object functions; Fix double-free of connector; Cleanups
   - hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
   - i915: Use GEM object functions
   - imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
   - ingenic: Reset pixel clock when parent clock changes; support reserved
     memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
     Revert support for cached mmap buffers
     on F0/F1; support 30-bit/24-bit/8-bit-palette modes
   - komeda: Use DEFINE_SHOW_ATTRIBUTE
   - mcde: Detect platform_get_irq() errors
   - mediatek: Use GEM object functions
   - msm: Use GEM object functions
   - nouveau: Cleanups; TTM-related changes; Use GEM object functions
   - omapdrm: Use GEM object functions
   - panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
     bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
     Cleanups
   - panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
   - panel/otm8009a: Allow non-continuous dsi clock; Cleanups
   - panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
   - panfrost: Fix job timeout handling; Cleanups
   - pl111: Use GEM object functions
   - qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
   - radeon: Cleanups; TTM-related changes; Use GEM object functions
   - rockchip: Use GEM object functions
   - shmobile: Cleanups
   - tegra: Use GEM object functions
   - tidss: Set drm_plane_helper_funcs.prepare_fb
   - tilcdc: Don't keep vblank interrupt enabled all the time
   - tve200: Detect platform_get_irq() errors
   - vc4: Use GEM object functions; Only register components once DSI is attached;
     Add Maxime as maintainer
   - vgem: Use GEM object functions
   - via: Simplify critical section in via_mem_alloc()
   - virtgpu: Use GEM object functions
   - virtio: Implement blob resources, host-visible and cross-device features;
     Support mapping of host-allocated resources; Use UUID APi; Cleanups
   - vkms: Use GEM object functions; Switch to SHMEM
   - vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
   - xen: Use GEM object functions
   - xlnx: Use GEM object functions
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl+X8N4ACgkQaA3BHVML
 eiPulggAu/RudQn0GmWPcDg9DpM30zE/ppBrmKk+WGWqj6M2DgK9gy+KhJps+Uht
 Fb2jMnUirNS5ZNa5kVhkdazOKqHq5jHHV+SbRPziySzV56TW8lbPU/HOUhKSQbkF
 FUB/YCWbb2kJA23So9VwNkjSJUXKpy896WoVxH7b/gLYL7c+sHUK9TOWAlsbFEmD
 t3kjxQgsHdVhqaZIKE7zg72Vi1AkkhjCVraPQeZY1GgmmLxdQeEKhNO8xdfG3OzY
 US4MYwJ51RfaCDTFr5t1UA224ODxoJtV3dTDDtrx4R5sf4MYJUC4SJYZHIyHyUkm
 9KXjFFzB9+Hd0JjpUHFUyl+4k8JjHQ==
 =GWwb
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.11:

UAPI Changes:

  - doc: rules for EBUSY on non-blocking commits; requirements for fourcc
    modifiers; on parsing EDID
  - fbdev/sbuslib: Remove unused FBIOSCURSOR32
  - fourcc: deprecate DRM_FORMAT_MOD_NONE
  - virtio: Support blob resources for memory allocations; Expose host-visible
    and cross-device features

Cross-subsystem Changes:

  - devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
    Optoelectronics
  - dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
    dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro

Core Changes:

  - atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
    non-blocking commits
  - dp: Prepare for DP 2.0 DPCD
  - dp_mst: Receive extended DPCD caps
  - dma-buf: Documentation
  - doc: Format modifiers; dma-buf-map; Cleanups
  - fbdev: Don't use compat_alloc_user_space(); mark as orphaned
  - fb-helper: Take lock in drm_fb_helper_restore_work_fb()
  - gem: Convert implementation and drivers to GEM object functions, remove
    GEM callbacks from struct drm_driver (expect gem_prime_mmap)
  - panel: Cleanups
  - pci: Add legacy infix to drm_irq_by_busid()
  - sched: Avoid infinite waits in drm_sched_entity_destroy()
  - switcheroo: Cleanups
  - ttm: Remove AGP support; Don't modify caching during swapout; Major
    refactoring of the implementation and API that affects all depending
    drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
    TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
    callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
    Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
    drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
    no more ttm_set_populated()
  - vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
    object functions with defaults; Default placement in system memory; Cleanups

Driver Changes:

  - amdgpu: Use GEM object functions
  - armada: Use GEM object functions
  - aspeed: Configure output via sysfs; Init struct drm_driver with
  - ast: Reload LUT after FB format changes
  - bridge: Add driver and DT bindings for anx7625; Cleanups
  - bridge/dw-hdmi: Constify ops
  - bridge/ti-sn65dsi86: Add retries for link training
  - bridge/lvds-codec: Add support for regulator
  - bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
  - display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
    dma-coherent
  - display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
    dma-coherent
  - etnaviv: Use GEM object functions
  - exynos: Use GEM object functions
  - fbdev: Cleanups and compiler fixes throughout framebuffer drivers
  - fbdev/cirrusfb: Avoid division by 0
  - gma500: Use GEM object functions; Fix double-free of connector; Cleanups
  - hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
  - i915: Use GEM object functions
  - imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
  - ingenic: Reset pixel clock when parent clock changes; support reserved
    memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
    Revert support for cached mmap buffers
    on F0/F1; support 30-bit/24-bit/8-bit-palette modes
  - komeda: Use DEFINE_SHOW_ATTRIBUTE
  - mcde: Detect platform_get_irq() errors
  - mediatek: Use GEM object functions
  - msm: Use GEM object functions
  - nouveau: Cleanups; TTM-related changes; Use GEM object functions
  - omapdrm: Use GEM object functions
  - panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
    bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
    Cleanups
  - panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
  - panel/otm8009a: Allow non-continuous dsi clock; Cleanups
  - panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
  - panfrost: Fix job timeout handling; Cleanups
  - pl111: Use GEM object functions
  - qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
  - radeon: Cleanups; TTM-related changes; Use GEM object functions
  - rockchip: Use GEM object functions
  - shmobile: Cleanups
  - tegra: Use GEM object functions
  - tidss: Set drm_plane_helper_funcs.prepare_fb
  - tilcdc: Don't keep vblank interrupt enabled all the time
  - tve200: Detect platform_get_irq() errors
  - vc4: Use GEM object functions; Only register components once DSI is attached;
    Add Maxime as maintainer
  - vgem: Use GEM object functions
  - via: Simplify critical section in via_mem_alloc()
  - virtgpu: Use GEM object functions
  - virtio: Implement blob resources, host-visible and cross-device features;
    Support mapping of host-allocated resources; Use UUID APi; Cleanups
  - vkms: Use GEM object functions; Switch to SHMEM
  - vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
  - xen: Use GEM object functions
  - xlnx: Use GEM object functions

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027100936.GA4858@linux-uq9g
2020-11-04 11:49:10 +10:00
Maxime Ripard
c489573b5b
Merge drm/drm-next into drm-misc-next
Daniel needs -rc2 in drm-misc-next to merge some patches

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-11-02 11:17:54 +01:00
Boris Brezillon
170fb58ee3 drm/sched: Avoid infinite waits in the drm_sched_entity_destroy() path
If we don't initialize the entity to idle and the entity is never
scheduled before being destroyed we end up with an infinite wait in the
destroy path.

v2:
- Add Steven's R-b

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/393486/
2020-10-05 15:06:33 +02:00
Tian Tao
05f59762bc drm/scheduler: fix sched_fence.c kernel-doc warnings
Fix kernel-doc warnings.
drivers/gpu/drm/scheduler/sched_fence.c:110: warning: Function parameter or
member 'f' not described in 'drm_sched_fence_release_scheduled'
drivers/gpu/drm/scheduler/sched_fence.c:110: warning: Excess function
parameter 'fence' description in 'drm_sched_fence_release_scheduled'

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Dave Airlie
0c8d22fcae Merge tag 'amd-drm-next-5.10-2020-09-03' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:

amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups

amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets

radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix

Scheduler:
- Clean up priority levels

UAPI:
- amdgpu INFO IOCTL query update for TMZ state
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
  https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com
2020-09-08 16:40:13 +10:00
Luben Tuikov
e2d732fdb7 drm/scheduler: Scheduler priority fixes (v2)
Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-18 18:20:17 -04:00
Linus Torvalds
6d2b84a4e5 This tree adds the sched_set_fifo*() encapsulation APIs to remove
static priority level knowledge from non-scheduler code.
 
 The three APIs for non-scheduler code to set SCHED_FIFO are:
 
  - sched_set_fifo()
  - sched_set_fifo_low()
  - sched_set_normal()
 
 These are two FIFO priority levels: default (high), and a 'low' priority level,
 plus sched_set_normal() to set the policy back to non-SCHED_FIFO.
 
 Since the changes affect a lot of non-scheduler code, we kept this in a separate
 tree.
 
 When merging to the latest upstream tree there's a conflict in drivers/spi/spi.c,
 which can be resolved via:
 
 	sched_set_fifo(ctlr->kworker_task);
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8pPQIRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j0Jw/+LlSyX6gD2ATy3cizGL7DFPZogD5MVKTb
 IXbhXH/ACpuPQlBe1+haRLbJj6XfXqbOlAleVKt7eh+jZ1jYjC972RCSTO4566mJ
 0v8Iy9kkEeb2TDbYx1H3bnk78lf85t0CB+sCzyKUYFuTrXU04eRj7MtN3vAQyRQU
 xJg83x/sT5DGdDTP50sL7lpbwk3INWkD0aDCJEaO/a9yHElMsTZiZBKoXxN/s30o
 FsfzW56jqtng771H2bo8ERN7+abwJg10crQU5mIaLhacNMETuz0NZ/f8fY/fydCL
 Ju8HAdNKNXyphWkAOmixQuyYtWKe2/GfbHg8hld0jmpwxkOSTgZjY+pFcv7/w306
 g2l1TPOt8e1n5jbfnY3eig+9Kr8y0qHkXPfLfgRqKwMMaOqTTYixEzj+NdxEIRX9
 Kr7oFAv6VEFfXGSpb5L1qyjIGVgQ5/JE/p3OC3GHEsw5VKiy5yjhNLoSmSGzdS61
 1YurVvypSEUAn3DqTXgeGX76f0HH365fIKqmbFrUWxliF+YyflMhtrj2JFtejGzH
 Md3RgAzxusE9S6k3gw1ev4byh167bPBbY8jz0w3Gd7IBRKy9vo92h6ZRYIl6xeoC
 BU2To1IhCAydIr6hNsIiCSDTgiLbsYQzPuVVovUxNh+l1ZvKV2X+csEHhs8oW4pr
 4BRU7dKL2NE=
 =/7JH
 -----END PGP SIGNATURE-----

Merge tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull sched/fifo updates from Ingo Molnar:
 "This adds the sched_set_fifo*() encapsulation APIs to remove static
  priority level knowledge from non-scheduler code.

  The three APIs for non-scheduler code to set SCHED_FIFO are:

   - sched_set_fifo()
   - sched_set_fifo_low()
   - sched_set_normal()

  These are two FIFO priority levels: default (high), and a 'low'
  priority level, plus sched_set_normal() to set the policy back to
  non-SCHED_FIFO.

  Since the changes affect a lot of non-scheduler code, we kept this in
  a separate tree"

* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  sched,tracing: Convert to sched_set_fifo()
  sched: Remove sched_set_*() return value
  sched: Remove sched_setscheduler*() EXPORTs
  sched,psi: Convert to sched_set_fifo_low()
  sched,rcutorture: Convert to sched_set_fifo_low()
  sched,rcuperf: Convert to sched_set_fifo_low()
  sched,locktorture: Convert to sched_set_fifo()
  sched,irq: Convert to sched_set_fifo()
  sched,watchdog: Convert to sched_set_fifo()
  sched,serial: Convert to sched_set_fifo()
  sched,powerclamp: Convert to sched_set_fifo()
  sched,ion: Convert to sched_set_normal()
  sched,powercap: Convert to sched_set_fifo*()
  sched,spi: Convert to sched_set_fifo*()
  sched,mmc: Convert to sched_set_fifo*()
  sched,ivtv: Convert to sched_set_fifo*()
  sched,drm/scheduler: Convert to sched_set_fifo*()
  sched,msm: Convert to sched_set_fifo*()
  sched,psci: Convert to sched_set_fifo*()
  sched,drbd: Convert to sched_set_fifo*()
  ...
2020-08-06 11:55:43 -07:00
Maarten Lankhorst
60e9eabf41 Backmerge remote-tracking branch 'drm/drm-next' into drm-misc-next
Some conflicts with ttm_bo->offset removal, but drm-misc-next needs updating to v5.8.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2020-06-29 12:16:26 +02:00
Nirmoy Das
d41a39dda1 drm/scheduler: improve job distribution with multiple queues
This patch uses score to select a new drm scheduler for better
loadbalance between multiple drm schedulers instead of num_jobs.

Below are test results after running amdgpu_test for ~10 times.

Before this patch:

sched_name     num of many times it got schedule
=========      ==================================
sdma0          1463
sdma1          198
comp_1.0.1     280

After this patch:

sched_name     num of many times it got schedule
=========      ==================================
sdma0          925
sdma1          928
comp_1.0.1     177
comp_1.1.1     44
comp_1.2.1     43
comp_1.3.1     44

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/373000/
Signed-off-by: Christian König <christian.koenig@amd.com>
2020-06-26 14:16:29 +02:00
Peter Zijlstra
7b31e940b1 sched,drm/scheduler: Convert to sched_set_fifo*()
Because SCHED_FIFO is a broken scheduler model (see previous patches)
take away the priority field, the kernel can't possibly make an
informed decision.

In this case, use fifo_low, because it only cares about being above
SCHED_NORMAL. Effectively no change in behaviour.

Cc: alexander.deucher@amd.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
2020-06-15 14:10:21 +02:00
Christian König
8623b5255a drm/scheduler: fix drm_sched_get_cleanup_job
We are racing to initialize sched->thread here, just always check the
current thread.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Link: https://patchwork.freedesktop.org/patch/361303/
2020-04-15 11:09:13 +02:00
Yintian Tao
77bb2f204f drm/scheduler: fix rare NULL ptr race
There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
    ->dma_fence_signal_locked
	->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.

->dma_fence_put
    ->dma_fence_release
	->drm_sched_fence_release_scheduled
	    ->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list

Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal

[  732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  732.914815] #PF: supervisor write access in kernel mode
[  732.915731] #PF: error_code(0x0002) - not-present page
[  732.916621] PGD 0 P4D 0
[  732.917072] Oops: 0002 [#1] SMP PTI
[  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE     5.4.0-rc7 #1
[  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.938569] Call Trace:
[  732.939003]  <IRQ>
[  732.939364]  dma_fence_signal+0x29/0x50
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]

v2: hold the finished fence at drm_sched_process_job instead of
    amdgpu_fence_process
v3: resume the blank line

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-25 17:00:11 -04:00
Nirmoy Das
ec2edcc279 drm/sched: implement and export drm_sched_pick_best
Remove drm_sched_entity_get_free_sched() and use the logic of picking
the least loaded drm scheduler from a drm scheduler list to implement
drm_sched_pick_best(). This patch also exports drm_sched_pick_best() so
that it can be utilized by other drm drivers.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-16 16:21:32 -04:00
changzhu
d164bebb95 Revert "drm/scheduler: improve job distribution with multiple queues"
It needs to revert this patch to avoid amdgpu_test compute hang problem
on picasso.

This reverts commit 56822db194.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-16 16:21:32 -04:00
Lucas Stach
a7fbb630c5 drm/scheduler: fix inconsistent locking of job_list_lock
1db8c142b6 (drm/scheduler: Add drm_sched_suspend/resume_timeout()) made
the job_list_lock IRQ safe in as the suspend/resume calls were expected to
be called from IRQ context. This usage never materialized in upstream.
Instead amdgpu started locking the job_list_lock in an IRQ unsafe way in
amdgpu_ib_preempt_mark_partial_job() and amdgpu_ib_preempt_job_recovery(),
which leads to potential deadlock if one would actually start to call the
drm_sched_suspend/resume_timeout functions from IRQ context.

As no current user needs the locking to be IRQ safe, the local IRQ
disable/enable is pure overhead. Fix the inconsistent locking by changing
all uses of job_list_lock to use the IRQ unsafe locking primitives.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-13 11:52:36 -04:00
Robert Beckett
c2c91828fb drm/sched: add run job trace
Add a new trace event to show when jobs are run on the HW.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-13 11:52:36 -04:00
Nirmoy Das
b37aced31e drm/scheduler: implement a function to modify sched list
Implement drm_sched_entity_modify_sched() which modifies existing
sched_list with a different one. This is going to be helpful when
userspace changes priority of a ctx/entity then the driver can switch
to the corresponding HW scheduler list for that priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-09 13:51:35 -04:00
Nirmoy Das
2639f453f2 drm/amdgpu: fix doc by clarifying sched_list definition
expand sched_list definition for better understanding.
Also fix a typo atleast -> at least

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27 16:46:44 -05:00
Nirmoy Das
9e3e90c50d drm/scheduler: fix documentation by replacing rq_list with sched_list
This also replaces old artifacts with a correct one in drm_sched_entity_init()
declaration

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:41:00 -05:00
Nirmoy Das
56822db194 drm/scheduler: improve job distribution with multiple queues
This patch uses score based logic to select a new rq for better
loadbalance between multiple rq/scheds instead of num_jobs.

Below are test results after running amdgpu_test from mesa drm

Before this patch:

sched_name     num of many times it got scheduled
=========      ==================================
sdma0          314
sdma1          32
comp_1.0.0     56
comp_1.0.1     0
comp_1.1.0     0
comp_1.1.1     0
comp_1.2.0     0
comp_1.2.1     0
comp_1.3.0     0
comp_1.3.1     0
After this patch:

sched_name     num of many times it got scheduled
=========      ==================================
sdma0          216
sdma1          185
comp_1.0.0     39
comp_1.0.1     9
comp_1.1.0     12
comp_1.1.1     0
comp_1.2.0     12
comp_1.2.1     0
comp_1.3.0     12
comp_1.3.1     0

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:37:54 -05:00
Nirmoy Das
8c23056bdc drm/scheduler: do not keep a copy of sched list
entity should not keep copy and maintain sched list for
itself.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Nirmoy Das
b3ac17667f drm/scheduler: rework entity creation
Entity currently keeps a copy of run_queue list and modify it in
drm_sched_entity_set_priority(). Entities shouldn't modify run_queue
list. Use drm_gpu_scheduler list instead of drm_sched_rq list
in drm_sched_entity struct. In this way we can select a runqueue based
on entity/ctx's priority for a  drm scheduler.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Daniel Vetter
6c56e8adc0 drm-misc-next for v5.6:
UAPI Changes:
 - Add support for DMA-BUF HEAPS.
 
 Cross-subsystem Changes:
 - mipi dsi definition updates, pulled into drm-intel as well.
 - Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
 - Remove support for dma-buf kmap/kunmap.
 - Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.
 
 Core Changes:
 - Small cleanups to ttm.
 - Fix SCDC definition.
 - Assorted cleanups to core.
 - Add todo to remove load/unload hooks, and use generic fbdev emulation.
 - Assorted documentation updates.
 - Use blocking ww lock in ttm fault handler.
 - Remove drm_fb_helper_fbdev_setup/teardown.
 - Warning fixes with W=1 for atomic.
 - Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
 - Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
 - Various kconfig indentation fixes in core and drivers.
 - Fix freeing transactions in dp-mst correctly.
 - Sean Paul is steping down as core maintainer. :-(
 - Add lockdep annotations for atomic locks vs dma-resv.
 - Prevent use-after-free for a bad job in drm_scheduler.
 - Fill out all block sizes in the P01x and P210 definitions.
 - Avoid division by zero in drm/rect, and fix bounds.
 - Add drm/rect selftests.
 - Add aspect ratio and alternate clocks for HDMI 4k modes.
 - Add todo for drm_framebuffer_funcs and fb_create cleanup.
 - Drop DRM_AUTH for prime import/export ioctls.
 - Clear DP-MST payload id tables downstream when initializating.
 - Fix for DSC throughput definition.
 - Add extra FEC definitions.
 - Fix fake offset in drm_gem_object_funs.mmap.
 - Stop using encoder->bridge in core directly
 - Handle bridge chaining slightly better.
 - Add backlight support to drm/panel, and use it in many panel drivers.
 - Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.
 
 Driver Changes:
 - Small fixes all over.
 - Fix documentation in vkms.
 - Fix mmap_sem vs dma_resv in nouveau.
 - Small cleanup in komeda.
 - Add page flip support in gma500 for psb/cdv.
 - Add ddc symlink in the connector sysfs directory for many drivers.
 - Add support for analogic an6345, and fix small bugs in it.
 - Add atomic modesetting support to ast.
 - Fix radeon fault handler VMA race.
 - Switch udl to use generic shmem helpers.
 - Unconditional vblank handling for mcde.
 - Miscellaneous fixes to mcde.
 - Tweak debug output from komeda using debugfs.
 - Add gamma and color transform support to komeda for DOU-IPS.
 - Add support for sony acx424AKP panel.
 - Various small cleanups to gma500.
 - Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
 - Add support for Logic PD Type 28 panel.
 - Use drm_panel_* wrapper functions in exynos/tegra/msm.
 - Add devicetree bindings for generic DSI panels.
 - Don't include drm_pci.h directly in many drivers.
 - Add support for begin/end_cpu_access in udmabuf.
 - Stop using drm_get_pci_dev in gma500 and mga200.
 - Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
 - Add devfreq thermal support to panfrost.
 - Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
 - meson: Add support for OSD1 plane AFBC commit.
 - Stop displaying garbage when toggling ast primary plane on/off.
 - More cleanups and fixes to UDL.
 - Add D32 suport to komeda.
 - Remove globle copy of drm_dev in gma500.
 - Add support for Boe Himax8279d MIPI-DSI LCD panel.
 - Add support for ingenic JZ4770 panel.
 - Small null pointer deference fix in ingenic.
 - Remove support for the special tfp420 driver, as there is a generic way to do it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl34lkkACgkQ/lWMcqZw
 E8M76g//WRYl9fWnV063s44FBVJYjGxaus0vQJSGidaPCIE6Ep6TNjXp8DVzV82M
 HR79P9glL02DC9B8pflioNNXdIRGSVk/FJcKVB2seFAqEFCAknvWDM/X/y+mOUpp
 fUeFl+Znlwx3YlM8f4Qujdbm+CbTewfbya4VAWeWd8XG2V8jfq5cmODPPlUMNenZ
 J6Ja+W3ph741uSIfAKaP69LVJgOcuUjXINE4SWhRk/i5QF3GIRej/A7ZjWGLQ/t2
 2zUUF7EiCzhPomM40H3ddKtXb4ZjNJuc5pOD4GpxR8ciNbe2gUOHEZ5aenwYBdsU
 5MwbxNKyMbKXATtn3yv3fSc4jH3DtmEKpmovONeO8ZDBrQBnxeYa3tQvfkNghA2f
 acoZMzYUImV+ft6DMIgpXppASvo7mQYDAbLPOGEJ9E44AL4UP00jesEjnK5FOHSR
 3BEzGUnK/6QL5zFNPni8YZQ8dan4jDIno1mqIV+cQ4WCGlaKckzIWO6243Bf13b/
 kROSJpgWkiK6Ngq0ofhD0MHyT/m1QnqUzWRKTJhRtPflSWRBsDZqWCQ5Vx1QlNIE
 /HfTNbTpXWwa+5wXbbB8TkDw5t9cQGnR+QcrEd9HgoIec7B5Re8rx9i0TJAT4N05
 03RCQCecSfD8gwKd2wgaFIpFGRl9lTdLYSpffSmyL2X5a20lZhM=
 =b15X
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2019-12-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.6:

UAPI Changes:
- Add support for DMA-BUF HEAPS.

Cross-subsystem Changes:
- mipi dsi definition updates, pulled into drm-intel as well.
- Add lockdep annotations for dma_resv vs mmap_sem and fs_reclaim.
- Remove support for dma-buf kmap/kunmap.
- Constify fb_ops in all fbdev drivers, including drm drivers and drm-core, and media as well.

Core Changes:
- Small cleanups to ttm.
- Fix SCDC definition.
- Assorted cleanups to core.
- Add todo to remove load/unload hooks, and use generic fbdev emulation.
- Assorted documentation updates.
- Use blocking ww lock in ttm fault handler.
- Remove drm_fb_helper_fbdev_setup/teardown.
- Warning fixes with W=1 for atomic.
- Use drm_debug_enabled() instead of drm_debug flag testing in various drivers.
- Fallback to nontiled mode in fbdev emulation when not all tiles are present. (Later on reverted)
- Various kconfig indentation fixes in core and drivers.
- Fix freeing transactions in dp-mst correctly.
- Sean Paul is steping down as core maintainer. :-(
- Add lockdep annotations for atomic locks vs dma-resv.
- Prevent use-after-free for a bad job in drm_scheduler.
- Fill out all block sizes in the P01x and P210 definitions.
- Avoid division by zero in drm/rect, and fix bounds.
- Add drm/rect selftests.
- Add aspect ratio and alternate clocks for HDMI 4k modes.
- Add todo for drm_framebuffer_funcs and fb_create cleanup.
- Drop DRM_AUTH for prime import/export ioctls.
- Clear DP-MST payload id tables downstream when initializating.
- Fix for DSC throughput definition.
- Add extra FEC definitions.
- Fix fake offset in drm_gem_object_funs.mmap.
- Stop using encoder->bridge in core directly
- Handle bridge chaining slightly better.
- Add backlight support to drm/panel, and use it in many panel drivers.
- Increase max number of y420 modes from 128 to 256, as preparation to add the new modes.

Driver Changes:
- Small fixes all over.
- Fix documentation in vkms.
- Fix mmap_sem vs dma_resv in nouveau.
- Small cleanup in komeda.
- Add page flip support in gma500 for psb/cdv.
- Add ddc symlink in the connector sysfs directory for many drivers.
- Add support for analogic an6345, and fix small bugs in it.
- Add atomic modesetting support to ast.
- Fix radeon fault handler VMA race.
- Switch udl to use generic shmem helpers.
- Unconditional vblank handling for mcde.
- Miscellaneous fixes to mcde.
- Tweak debug output from komeda using debugfs.
- Add gamma and color transform support to komeda for DOU-IPS.
- Add support for sony acx424AKP panel.
- Various small cleanups to gma500.
- Use generic fbdev emulation in udl, and replace udl_framebuffer with generic implementation.
- Add support for Logic PD Type 28 panel.
- Use drm_panel_* wrapper functions in exynos/tegra/msm.
- Add devicetree bindings for generic DSI panels.
- Don't include drm_pci.h directly in many drivers.
- Add support for begin/end_cpu_access in udmabuf.
- Stop using drm_get_pci_dev in gma500 and mga200.
- Fixes to UDL damage handling, and use dma_buf_begin/end_cpu_access.
- Add devfreq thermal support to panfrost.
- Fix hotplug with daisy chained monitors by removing VCPI when disabling topology manager.
- meson: Add support for OSD1 plane AFBC commit.
- Stop displaying garbage when toggling ast primary plane on/off.
- More cleanups and fixes to UDL.
- Add D32 suport to komeda.
- Remove globle copy of drm_dev in gma500.
- Add support for Boe Himax8279d MIPI-DSI LCD panel.
- Add support for ingenic JZ4770 panel.
- Small null pointer deference fix in ingenic.
- Remove support for the special tfp420 driver, as there is a generic way to do it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ba73535a-9334-5302-2e1f-5208bd7390bd@linux.intel.com
2019-12-17 13:57:54 +01:00
Andrey Grodzovsky
135517d356 drm/scheduler: Avoid accessing freed bad job.
Problem:
Due to a race between drm_sched_cleanup_jobs in sched thread and
drm_sched_job_timedout in timeout work there is a possiblity that
bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list
remove it from the list under lock and then put it back later when
we are garanteed no race with main sched thread is possible which
is after the thread is parked.

v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_get_cleanup_job already has a lock there.

v4: Fix comments to relfect latest code in drm-misc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/342356
2019-11-27 16:05:49 +01:00
Andrey Grodzovsky
2b6f717c33 drm/sched: Avoid job cleanup if sched thread is parked.
When the sched thread is parked we assume ring_mirror_list is
not accessed from here.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
83a7772ba2 drm/sched: Use completion to wait for sched->thread idle v2.
Removes thread park/unpark hack from drm_sched_entity_fini and
by this fixes reactivation of scheduler thread while the thread
is supposed to be stopped.

v2: Per sched entity completion.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
d7c5782acd drm/sched: Fix passing zero to 'PTR_ERR' warning v2
Fix a static code checker warning.

v2: Drop PTR_ERR_OR_ZERO.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:47 -05:00
Dave Airlie
8a86b00a43 Merge tag 'drm-next-5.5-2019-11-01' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-01:

amdgpu:
- Add EEPROM support for Arcturus
- Enable VCN encode support for Arcturus
- Misc PSP fixes
- Misc DC fixes
- swSMU cleanup

amdkfd:
- Misc cleanups
- Fix typo in cu bitmap parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101190607.3763-1-alexander.deucher@amd.com
2019-11-04 10:22:53 +10:00
Andrey Grodzovsky
e91e5f080e drm/sched: Set error to s_fence if HW job submission failed.
Problem:
When run_job fails and HW fence returned is NULL we still signal
the s_fence to avoid hangs but the user has no way of knowing if
the actual HW job was ran and finished.

Fix:
Allow .run_job implementations to return ERR_PTR in the fence pointer
returned and then set this error for s_fence->finished fence so whoever
wait on this fence can inspect the signaled fence for an error.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25 16:50:11 -04:00
Steven Price
588b9828f0 drm: Don't free jobs in wait_event_interruptible()
drm_sched_cleanup_jobs() attempts to free finished jobs, however because
it is called as the condition of wait_event_interruptible() it must not
sleep. Unfortunately some free callbacks (notably for Panfrost) do sleep.

Instead let's rename drm_sched_cleanup_jobs() to
drm_sched_get_cleanup_job() and simply return a job for processing if
there is one. The caller can then call the free_job() callback outside
the wait_event_interruptible() where sleeping is possible before
re-checking and returning to sleep if necessary.

Tested-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Fixes: 5918045c4e ("drm/scheduler: rework job destruction")
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/337652/
2019-10-25 13:47:58 +02:00
Ben Dooks
2636a5172d drm/scheduler: make unexported items static
The drm_sched_fence_ops_{scheduled,finished} are not exported
from the file so make them static to avoid the following
warnings from sparse:

drivers/gpu/drm/scheduler/sched_fence.c:131:28: warning: symbol 'drm_sched_fence_ops_scheduled' was not declared. Should it be static?
drivers/gpu/drm/scheduler/sched_fence.c:137:28: warning: symbol 'drm_sched_fence_ops_finished' was not declared. Should it be static?

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009121447.31017-1-ben.dooks@codethink.co.uk
2019-10-10 09:11:46 -05:00
Christian König
85cb9d5067 drm/scheduler: use job count instead of peek
The spsc_queue_peek function is accessing queue->head which belongs to
the consumer thread and shouldn't be accessed by the producer

This is fixing a rare race condition when destroying entities.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Monk.liu@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-15 10:52:10 -05:00
Sam Ravnborg
7c1be93c8e drm/scheduler: drop use of drmP.h
Drop use of the deprecated drmP.h header file.
Fix fallout.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Sharat Masetty <smasetty@codeaurora.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190630061922.7254-26-sam@ravnborg.org
2019-07-15 18:11:31 +02:00
Andrey Grodzovsky
d0f29d4980 drm/sched: Fix make htmldocs warnings.
Document the missing parameters.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1559140180-6762-1-git-send-email-andrey.grodzovsky@amd.com
2019-05-29 11:49:51 -05:00
Andrey Grodzovsky
b576ff902f drm/sched: Fix static checker warning for potential NULL ptr
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1558533443-7795-1-git-send-email-andrey.grodzovsky@amd.com
2019-05-24 13:21:06 -05:00
Sean Paul
374ed54293 Merge drm/drm-next into drm-misc-next
Backmerging 5.2-rc1 to -misc-next for robher

Signed-off-by: Sean Paul <seanpaul@chromium.org>
2019-05-22 16:08:21 -04:00
Erico Nunes
794c686eb7 drm/scheduler: Fix job cleanup without timeout handler
After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are
only deleted when the timeout handler is able to be cancelled
successfully.

In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT),
job cleanup would be skipped which may result in memory leaks.

Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in
drm_sched_cleanup_jobs.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/306025/?series=60878&rev=2
2019-05-22 08:42:50 +02:00
Andrey Grodzovsky
a5343b8a2c drm/scheduler: Add flag to hint the release of guilty job.
Problem:
Sched thread's cleanup function races against TO handler
and removes the guilty job from mirror list and we
have no way of differentiating if the job was removed from within the
TO handler or from the sched thread's clean-up function.

Fix:
Add a flag to scheduler to hint the TO handler that the guilty job needs
to be explicitly released.

v2: whitespace fix

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-5-git-send-email-andrey.grodzovsky@amd.com
2019-05-02 15:50:55 -05:00
Andrey Grodzovsky
290764af7e drm/sched: Keep s_fence->parent pointer
For later driver's reference to see if the fence is signaled.

v2: Move parent fence put to resubmit jobs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-4-git-send-email-andrey.grodzovsky@amd.com
2019-05-02 15:46:52 -05:00
Christian König
5918045c4e drm/scheduler: rework job destruction
We now destroy finished jobs from the worker thread to make sure that
we never destroy a job currently in timeout processing.
By this we avoid holding lock around ring mirror list in drm_sched_stop
which should solve a deadlock reported by a user.

v2: Remove unused variable.
v4: Move guilty job free into sched code.
v5:
Move sched->hw_rq_count to drm_sched_start to account for counter
decrement in drm_sched_stop even when we don't call resubmit jobs
if guily job did signal.
v6: remove unused variable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692

Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com
2019-05-02 15:45:48 -05:00
Jonathan Neuschäfer
f5d356328d drm/sched: Fix description of drm_sched_stop
Since commit 222b5f0441 ("drm/sched: Refactor ring mirror list
handling."), drm_sched_hw_job_reset is no longer there, so let's adjust
the doc comment accordingly.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-23 11:15:38 -05:00
Dave Airlie
fbac3c48fa Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Fixes for 5.1:
amdgpu:
- Fix missing fw declaration after dropping old CI DPM code
- Fix debugfs access to registers beyond the MMIO bar size
- Fix context priority handling
- Add missing license on some new files
- Various cleanups and bug fixes

radeon:
- Fix missing break in CS parser for evergreen
- Various cleanups and bug fixes

sched:
- Fix entities with 0 run queues

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com
2019-02-22 15:56:42 +10:00
Dave Airlie
c06de56121 Linux 5.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxqHJYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWl8H/jPI4EipzD2GbnjZ
 GaFpMBBjcXBaVmoA+Y69so+7BHx1Ql+5GQtqbK0RHJRb9qEPLw3FBhHNjM/N8Sgf
 nSrK+GnBZp9s+k/NR/Yf2RacUR3jhz+Q9JEoQd3u9bFUeQyvE8Rf3vgtoBBwFOfz
 +t7N1memYVF3asLGWB4e4sP1YVMGfseTQpSPojvM30YWM86Bv+QtSx1AGgHczQIM
 kMKealR8ZPelN6JAXgLhQ5opDojBrE4YKB98pwsMDI6abz0Tz2JLFEUTTxsv5XNN
 o/Iz+XDoylskEyxN2unNWfHx7Swkvoklog8J/hDg5XlTvipL/WkT66PHBgcGMNvj
 BW9GgU8=
 =ZizU
 -----END PGP SIGNATURE-----

Merge v5.0-rc7 into drm-next

Backmerging for nouveau and imx that needed some fixes for next pulls.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-18 13:27:15 +10:00
Bas Nieuwenhuizen
1decbf6bb0 drm/sched: Fix entities with 0 rqs.
Some blocks in amdgpu can have 0 rqs.

Job creation already fails with -ENOENT when entity->rq is NULL,
so jobs cannot be pushed. Without a rq there is no scheduler to
pop jobs, and rq selection already does the right thing with a
list of length 0.

So the operations we need to fix are:
  - Creation, do not set rq to rq_list[0] if the list can have length 0.
  - Do not flush any jobs when there is no rq.
  - On entity destruction handle the rq = NULL case.
  - on set_priority, do not try to change the rq if it is NULL.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-15 11:15:08 -05:00
Eric Anholt
82abf33766 drm/sched: Always trace the dependencies we wait on, to fix a race.
The entity->dependency can go away completely once we've called
drm_sched_entity_add_dependency_cb() (if the cb is called before we
get around to tracing).  The tracepoint is more useful if we trace
every dependency instead of just ones that get callbacks installed,
anyway, so just do that.

Fixes any easy-to-produce OOPS when tracing the scheduler on V3D with
"perf record -a -e gpu_scheduler:.\* glxgears" and DEBUG_SLAB enabled.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-08 14:02:33 -05:00
Andrey Grodzovsky
3741540e04 drm/sched: Rework HW fence processing.
Expedite job deletion from ring mirror list to the HW fence signal
callback instead from finish_work, together with waiting for all
such fences to signal in drm_sched_stop we garantee that
already signaled job will not be processed twice.
Remove the sched finish fence callback and just submit finish_work
directly from the HW fence callback.

v2: Fix comments.
v3: Attach  hw fence cb to sched_job
v5: Rebase

Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:36 -05:00
Andrey Grodzovsky
222b5f0441 drm/sched: Refactor ring mirror list handling.
Decauple sched threads stop and start and ring mirror
list handling from the policy of what to do about the
guilty jobs.
When stoppping the sched thread and detaching sched fences
from non signaled HW fenes wait for all signaled HW fences
to complete before rerunning the jobs.

v2: Fix resubmission of guilty job into HW after refactoring.

v4:
Full restart for all the jobs, not only from guilty ring.
Extract karma increase into standalone function.

v5:
Rework waiting for signaled jobs without relying on the job
struct itself as those might already be freed for non 'guilty'
job's schedulers.
Expose karma increase to drivers.

v6:
Use list_for_each_entry_safe_continue and drm_sched_process_job
in case fence already signaled.
Call drm_sched_increase_karma only once for amdgpu and add documentation.

v7:
Wait only for the latest job's fence.

Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-25 16:15:36 -05:00
Sharat Masetty
1db8c142b6 drm/scheduler: Add drm_sched_suspend/resume_timeout()
This patch adds two new functions to help client drivers suspend and
resume the scheduler job timeout. This can be useful in cases where the
hardware has preemption support enabled. Using this, it is possible to have
the timeout active only for the ring which is active on the ringbuffer.
This patch also makes the job_list_lock IRQ safe.

Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-05 17:56:16 -05:00
Sharat Masetty
9afd07566b drm/scheduler: Set sched->thread to NULL on failure
In cases where the scheduler instance is used as a base object of another
driver object, it's not clear if the driver can call scheduler cleanup on the
fail path. So, Set the sched->thread to NULL, so that the driver can safely
call drm_sched_fini() during cleanup.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-05 17:56:16 -05:00
Christian König
68c12d24ce drm/sched: revert "fix timeout handling v2" v2
This reverts commit 0efd2d2f68.

It's still causing problems for V3D.

v2: keep rearming the timeout.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-28 15:49:58 -05:00
Trigger Huang
85744e9c10 drm/scheduler: Fix bad job be re-processed in TDR
A bad job is the one triggered TDR(In the current amdgpu's
implementation, actually all the jobs in the current joq-queue will
be treated as bad jobs). In the recovery process, its fence
will be fake signaled and as a result, the work behind will be scheduled
to delete it from the mirror list, but if the TDR process is invoked
before the work's execution, then this bad job might be processed again
and the call dma_fence_set_error to its fence in TDR process will lead to
kernel warning trace:

[  143.033605] WARNING: CPU: 2 PID: 53 at ./include/linux/dma-fence.h:437 amddrm_sched_job_recovery+0x1af/0x1c0 [amd_sched]
kernel: [  143.033606] Modules linked in: amdgpu(OE) amdchash(OE) amdttm(OE) amd_sched(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 snd_hda_codec_generic crypto_simd glue_helper cryptd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev snd_seq_device snd_timer snd soundcore binfmt_misc input_leds mac_hid serio_raw nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 8139too floppy psmouse 8139cp mii i2c_piix4 pata_acpi
[  143.033649] CPU: 2 PID: 53 Comm: kworker/2:1 Tainted: G           OE    4.15.0-20-generic #21-Ubuntu
[  143.033650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  143.033653] Workqueue: events drm_sched_job_timedout [amd_sched]
[  143.033656] RIP: 0010:amddrm_sched_job_recovery+0x1af/0x1c0 [amd_sched]
[  143.033657] RSP: 0018:ffffa9f880fe7d48 EFLAGS: 00010202
[  143.033659] RAX: 0000000000000007 RBX: ffff9b98f2b24c00 RCX: ffff9b98efef4f08
[  143.033660] RDX: ffff9b98f2b27400 RSI: ffff9b98f2b24c50 RDI: ffff9b98efef4f18
[  143.033660] RBP: ffffa9f880fe7d98 R08: 0000000000000001 R09: 00000000000002b6
[  143.033661] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9b98efef3430
[  143.033662] R13: ffff9b98efef4d80 R14: ffff9b98efef4e98 R15: ffff9b98eaf91c00
[  143.033663] FS:  0000000000000000(0000) GS:ffff9b98ffd00000(0000) knlGS:0000000000000000
[  143.033664] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  143.033665] CR2: 00007fc49c96d470 CR3: 000000001400a005 CR4: 00000000003606e0
[  143.033669] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  143.033669] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  143.033670] Call Trace:
[  143.033744]  amdgpu_device_gpu_recover+0x144/0x820 [amdgpu]
[  143.033788]  amdgpu_job_timedout+0x9b/0xa0 [amdgpu]
[  143.033791]  drm_sched_job_timedout+0xcc/0x150 [amd_sched]
[  143.033795]  process_one_work+0x1de/0x410
[  143.033797]  worker_thread+0x32/0x410
[  143.033799]  kthread+0x121/0x140
[  143.033801]  ? process_one_work+0x410/0x410
[  143.033803]  ? kthread_create_worker_on_cpu+0x70/0x70
[  143.033806]  ret_from_fork+0x35/0x40

So just delete the bad job from mirror list directly

Changes in v3:
	- Add a helper function to delete the bad jobs from mirror list and call
		it directly *before* the job's fence is signaled

Changes in v2:
	- delete the useless list node check
	- also delete bad jobs in drm_sched_main because:
		kthread_unpark(ring->sched.thread) will be invoked very early before
		amdgpu_device_gpu_recover's return, then drm_sched_main will have
		chance to pick up a new job from the job queue. This new job will be
		added into the mirror list and processed by amdgpu_job_run, but may
		not be deleted from the mirror list on time due to the same reason.
		And finally re-processed by drm_sched_job_recovery

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <chrstian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:15 -05:00
Sharat Masetty
26efecf955 drm/scheduler: Add drm_sched_job_cleanup
This patch adds a new API to clean up the scheduler job resources. This
is primarliy needed in cases the job was created but was not queued to
the scheduler queue. Additionally with this change, the layer which
creates the scheduler job also gets to free up the job's resources and
this entails moving the dma_fence_put(finished_fence) to the drivers
ops free handler routines.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:27 -05:00
Andrey Grodzovsky
faf6e1a87e drm/sched: Add boolean to mark if sched is ready to work v5
Problem:
A particular scheduler may become unsuable (underlying HW) after
some event (e.g. GPU reset). If it's later chosen by
the get free sched. policy a command will fail to be
submitted.

Fix:
Add a driver specific callback to report the sched status so
rq with bad sched can be avoided in favor of working one or
none in which case job init will fail.

v2: Switch from driver callback to flag in scheduler.

v3: rebase

v4: Remove ready paramter from drm_sched_init, set
uncoditionally to true once init done.

v5: fix missed change in v3d in v4 (Alex)

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:22 -05:00
Christian König
8fe159b014 drm/sched: add drm_sched_fault
Add a helper to immediately start timeout handling in case of a hardware
fault.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:02 -05:00
Christian König
19067e522d drm/sched: make sure timer is restarted
Make sure we always restart the timer after a timeout and remove the
device specific workarounds.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:01 -05:00
Christian König
0efd2d2f68 drm/sched: fix timeout handling v2
We need to make sure that we don't race between job completion and
timeout.

v2: put revert label after calling the handling manually

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-12 12:52:32 -05:00
Christian König
b981c86f03 drm/sched: add drm_sched_start_timeout helper
Cleanup starting the timeout a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-12 12:52:21 -05:00
Nathan Chancellor
c1f0320e03 drm/scheduler: Simplify spsc_queue_count check in drm_sched_entity_select_rq
Clang generates a warning when it sees a logical not followed by a
conditional operator like ==, >, or <.

drivers/gpu/drm/scheduler/sched_entity.c:470:6: warning: logical not is
only applied to the left hand side of this comparison
[-Wlogical-not-parentheses]
        if (!spsc_queue_count(&entity->job_queue) == 0 ||
            ^                                     ~~
drivers/gpu/drm/scheduler/sched_entity.c:470:6: note: add parentheses
after the '!' to evaluate the comparison first
        if (!spsc_queue_count(&entity->job_queue) == 0 ||
            ^
             (                                        )
drivers/gpu/drm/scheduler/sched_entity.c:470:6: note: add parentheses
around left hand side expression to silence this warning
        if (!spsc_queue_count(&entity->job_queue) == 0 ||
            ^
            (                                    )
1 warning generated.

It assumes the author might have made a mistake in their logic:

if (!a == b) -> if (!(a == b))

Sometimes that is the case; other times, it's just a super convoluted
way of saying 'if (a)' when b = 0:

if (!1 == 0) -> if (0 == 0) -> if (true)

Alternatively:

if (!1 == 0) -> if (!!1) -> if (1)

Simplify this comparison so that Clang doesn't complain.

Fixes: 35e160e781 ("drm/scheduler: change entities rq even earlier")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-09 17:06:00 -05:00
Nayan Deshmukh
6a96243056 drm/scheduler: remove timeout work_struct from drm_sched_job (v3)
having a delayed work item per job is redundant as we only need one
per scheduler to track the time out the currently executing job.

v2: the first element of the ring mirror list is the currently
executing job so we don't need a additional variable for it

v3: squash in fixes for v3d and etnaviv

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27 09:55:45 -05:00
Nayan Deshmukh
c89677afb3 drm/scheduler: avoid redundant shifting of the entity v2
do not remove entity from the rq if the current rq is from
the least loaded scheduler.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:14 -05:00
Andrey Grodzovsky
62347a3300 drm/scheduler: Add stopped flag to drm_sched_entity
The flag will prevent another thread from same process to
reinsert the entity queue into scheduler's rq after it was already
removewd from there by another thread during drm_sched_entity_flush.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:11:10 -05:00
Christian König
23f67981fd drm/scheduler: rename gpu_scheduler.c to sched_main.c
Better match the naming of the other components.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:44 -05:00
Christian König
7b10574eac drm/scheduler: cleanup entity coding style
Cleanup coding style in sched_entity.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:43 -05:00
Christian König
620e762f9a drm/scheduler: move entity handling into separate file
This is complex enough on it's own. Move it into a separate C file.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:43 -05:00
Christian König
8acc725457 drm/scheduler: trivial error handling fix
Return -ENOMEM when allocating the rq_list fails.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:42 -05:00
Christian König
35e160e781 drm/scheduler: change entities rq even earlier
Looks like for correct debugging we need to know the scheduler even
earlier. So move picking a rq for an entity into job creation.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:07 -05:00
Christian König
573edb241b drm/scheduler: fix last_scheduled handling
Make sure we access last_scheduled only after checking that there are no
more jobs on the entity.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:06 -05:00
Christian König
93f15e1c07 drm/scheduler: Remove entity->rq NULL check
That is superflous now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:06 -05:00
Christian König
e854b61acf drm/scheduler: bind job earlier to scheduler
Update job earlier with the scheduler it is supposed to be scheduled on.

Otherwise we could incorrectly optimize dependencies when moving an
entity from one scheduler to another.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:01 -05:00
Christian König
7febe4bfd5 drm/scheduler: fix setting the priorty for entities (v2)
Since we now deal with multiple rq we need to update all of them, not
just the current one.

v2: Trivial: Removed unused variable (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:10:00 -05:00
Andrey Grodzovsky
07507c01aa drm/scheduler: Add job dependency trace.
During debug sessions I encountered a need to trace
back a job dependecy a few steps back to the first failing
job. This trace helpped me a lot.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:46 -05:00
Nayan Deshmukh
df0ca30838 drm/scheduler: move idle entities to scheduler with less load v2
This is the first attempt to move entities between schedulers to
have dynamic load balancing. We just move entities with no jobs for
now as moving the ones with jobs will lead to other compilcations
like ensuring that the other scheduler does not remove a job from
the current entity while we are moving.

v2: remove unused variable and an unecessary check

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:46 -05:00
Nayan Deshmukh
97ffa35b5d drm/scheduler: add new function to get least loaded sched v2
The function selects the run queue from the rq_list with the
least load. The load is decided by the number of jobs in a
scheduler.

v2: avoid using atomic read twice consecutively, instead store
    it locally

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:45 -05:00
Nayan Deshmukh
249a07c05a drm/scheduler: add counter for total jobs in scheduler
To keep track of the scheduler load.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:45 -05:00
Nayan Deshmukh
ac0a6cf1c6 drm/scheduler: add a list of run queues to the entity
These are the potential run queues on which the jobs from this
entity can be scheduled. We will use this to do load balancing.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-27 11:09:44 -05:00
Nayan Deshmukh
275e6fa8ec drm/scheduler: fix param documentation
We no longer have sched parameter so remove its description
as well

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-09 11:57:39 -05:00
Lucas Stach
4823e5da2e drm/scheduler: fix timeout worker setup for out of order job completions
drm_sched_job_finish() is a work item scheduled for each finished job on
a unbound system workqueue. This means the workers can execute out of order
with regard to the real hardware job completions.

If this happens queueing a timeout worker for the first job on the ring
mirror list is wrong, as this may be a job which has already finished
executing. Fix this by reorganizing the code to always queue the worker
for the next job on the list, if this job hasn't finished yet. This is
robust against a potential reordering of the finish workers.

Also move out the timeout worker cancelling, so that we don't need to
take the job list lock twice. As a small optimization list_del is used
to remove the job from the ring mirror list, as there is no need to
reinit the list head in the job we are about to free.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-06 15:58:00 -05:00
Christian König
a875f58e23 drm/scheduler: stop setting rq to NULL
We removed the redundancy of having an extra scheduler field, so we
can't set the rq to NULL any more or otherwise won't know which
scheduler to use for the cleanup.

Just remove the entity from the scheduling list instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=107367
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-31 16:58:20 -05:00
Christian König
43bce41cf4 drm/scheduler: only kill entity if last user is killed v2
Note which task is using the entity and only kill it if the last user of
the entity is killed. This should prevent problems when entities are leaked to
child processes.

v2: add missing kernel doc

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-31 16:58:20 -05:00
Masahiro Yamada
ba7f47831e drm/sched: remove unneeded -Iinclude/drm compiler flag
I refactored the include directives under include/drm/ some time ago.
This flag is unneeded.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-31 16:58:16 -05:00
Nayan Deshmukh
068c330419 drm/scheduler: remove sched field from the entity
The scheduler of the entity is decided by the run queue on which
it is queued. This patch avoids us the effort required to maintain
a sync between rq and sched field when we start shifting entites
among different rqs.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-25 15:06:26 -05:00
Nayan Deshmukh
cdc5017659 drm/scheduler: modify API to avoid redundancy
entity has a scheduler field and we don't need the sched argument
in any of the functions where entity is provided.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-25 15:06:19 -05:00
Junwei Zhang
a6da48caf9 drm/scheduler: add NULL pointer check for run queue (v2)
To check rq pointer before adding entity into it.
That avoids NULL pointer access in some case.

v2: move the check to caller

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-16 16:11:48 -05:00
Nayan Deshmukh
aa16b6c6b4 drm/scheduler: modify args of drm_sched_entity_init
replace run queue by a list of run queues and remove the
sched arg as that is part of run queue itself

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13 14:46:05 -05:00
Nayan Deshmukh
8dc9fbbf27 drm/scheduler: add a pointer to scheduler in the rq
This patch is in preparation for a better load balancing in
scheduler. It allows us to associate entities with the
run queues instead of binding them to a scheduler.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-13 14:45:58 -05:00
Dave Airlie
ba7ca97d73 Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-next
More features for 4.19:
- Use core pcie functionality rather than duplicating our own for pcie
  gens and lanes
- Scheduler function naming cleanups
- More documentation
- Reworked DC/Powerplay interfaces to improve power savings
- Initial stutter mode support for RV (power feature)
- Vega12 powerplay updates
- GFXOFF fixes
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705221447.2807-1-alexander.deucher@amd.com
2018-07-10 10:57:08 +10:00
Andrey Grodzovsky
180fc134d7 drm/scheduler: Rename cleanup functions v2.
Everything in the flush code path (i.e. waiting for SW queue
to become empty) names with *_flush()
and everything in the release code path names *_fini()

This patch also effect the amdgpu and etnaviv drivers which
use those functions.

v2:
Also pplay the change to vd3.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05 16:38:45 -05:00
Daniel Vetter
99e227cb03 drm: Remove unecessary dma_fence_ops
dma_fence_default_wait is the default now, same for the trivial
enable_signaling implementation.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503142603.28513-7-daniel.vetter@ffwll.ch
2018-07-03 13:13:22 +02:00
Andrey Grodzovsky
741f01e636 drm/scheduler: Avoid using wait_event_killable for dying process (V4)
Dying process might be blocked from receiving any more signals
so avoid using it.

Also retire enity->fini_status and just check the SW queue,
if it's not empty do the fallback cleanup.

Also handle entity->last_scheduled == NULL use case which
happens when HW ring is already hangged whem a  new entity
tried to enqeue jobs.

v2:
Return the remaining timeout and use that as parameter for the next call.
This way when we need to cleanup multiple queues we don't wait for the
entire TO period for each queue but rather in total.
Styling comments.
Rebase.

v3:
Update types from unsigned to long.
Work with jiffies instead of ms.
Return 0 when TO expires.
Rebase.

v4:
Remove unnecessary timeout calculation.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-15 12:20:33 -05:00
Nayan Deshmukh
2d33948e4e drm/scheduler: add documentation
convert existing raw comments into kernel-doc format as well
as add new documentation

v2: reword the overview

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
2018-06-15 12:20:21 -05:00
Nayan Deshmukh
6201e033d7 drm/scheduler: fix a corner case in dependency optimization
When checking for a dependency fence for belonging to the same entity
compare it with scheduled as well finished fence. Earlier we were only
comparing it with the scheduled fence.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-25 11:02:14 -05:00
Emily Deng
52bf20f414 drm/sched: add rcu_barrier after entity fini
To free the fence from the amdgpu_fence_slab, need twice call_rcu, to avoid
the amdgpu_fence_slab_fini call kmem_cache_destroy(amdgpu_fence_slab) before
kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after drm_sched_entity_fini.

The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) ->
drm_sched_fence_release_finished ->
drm_sched_fence_release_scheduled ->
call_rcu(&fence->finished.rcu, drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

v2:put the barrier before the kmem_cache_destroy
v3:put the dma_fence_put(fence->parent) before call_rcu in
drm_sched_fence_release_scheduled

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-24 10:07:55 -05:00
Nayan Deshmukh
652470ac55 drm/scheduler: fix function name prefix in comments
That got missed while moving the files outside of amdgpu.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-18 16:08:20 -05:00
Andrey Grodzovsky
563e1e664d drm/scheduler: Remove obsolete spinlock.
This spinlock is superfluous, any call to drm_sched_entity_push_job
should already be under a lock together with matching drm_sched_job_init
to match the order of insertion into queue with job's fence seqence
number.

v2:
Improve patch description.
Add functions documentation describing the locking considerations

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-18 16:08:17 -05:00
Nayan Deshmukh
8344c53f57 drm/scheduler: remove unused parameter
this patch also effect the amdgpu and etnaviv drivers which
use the function drm_sched_entity_init

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:27 -05:00
Pixel Ding
5dd3f9efd4 drm/scheduler: don't update last scheduled fence in TDR
The current sequence in scheduler thread is:
1. update last sched fence
2. job begin (adding to mirror list)
3. job finish (remove from mirror list)
4. back to 1

Since we update last sched prior to joining mirror list, the jobs
in mirror list already pass the last sched fence. TDR just run
the jobs in mirror list, so we should not update the last sched
fences in TDR.

Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:05 -05:00
Pixel Ding
b5b4ea4d98 drm/scheduler: move last_sched fence updating prior to job popping (v2)
Make sure main thread won't update last_sched fence when entity
is cleanup.

Fix a racing issue which is caused by putting last_sched fence
twice. Running vulkaninfo in tight loop can produce this issue
as seeing wild fence pointer.

v2: squash in build fix (Christian)

Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:30 -05:00
Pixel Ding
a4b3996aee drm/scheduler: always put last_sched fence in entity_fini
Fix the potential memleak since scheduler main thread always
hold one last_sched fence.

Signed-off-by: Pixel Ding <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:30 -05:00
Emily Deng
8ee3a52e3f drm/gpu-sched: fix force APP kill hang(v4)
issue:
there are VMC page fault occurred if force APP kill during
3dmark test, the cause is in entity_fini we manually signal
all those jobs in entity's queue which confuse the sync/dep
mechanism:

1)page fault occurred in sdma's clear job which operate on
shadow buffer, and shadow buffer's Gart table is cleaned by
ttm_bo_release since the fence in its reservation was fake signaled
by entity_fini() under the case of SIGKILL received.

2)page fault occurred in gfx' job because during the lifetime
of gfx job we manually fake signal all jobs from its entity
in entity_fini(), thus the unmapping/clear PTE job depend on those
result fence is satisfied and sdma start clearing the PTE and lead
to GFX page fault.

fix:
1)should at least wait all jobs already scheduled complete in entity_fini()
if SIGKILL is the case.

2)if a fence signaled and try to clear some entity's dependency, should
set this entity guilty to prevent its job really run since the dependency
is fake signaled.

v2:
splitting drm_sched_entity_fini() into two functions:
1)The first one is does the waiting, removes the entity from the
runqueue and returns an error when the process was killed.
2)The second one then goes over the entity, install it as
completion signal for the remaining jobs and signals all jobs
with an error code.

v3:
1)Replace the fini1 and fini2 with better name
2)Call the first part before the VM teardown in
amdgpu_driver_postclose_kms() and the second part
after the VM teardown
3)Keep the original function drm_sched_entity_fini to
refine the code.

v4:
1)Rename entity->finished to entity->last_scheduled;
2)Rename drm_sched_entity_fini_job_cb() to
drm_sched_entity_kill_jobs_cb();
3)Pass NULL to drm_sched_entity_fini_job_cb() if -ENOENT;
4)Replace the type of entity->fini_status with "int";
5)Remove the check about entity->finished.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:17 -05:00
Nayan Deshmukh
a70cdb9edd drm/scheduler: move the tracepoints file from the include directory
Move it with the scheduler code. This is mostly a straight forward
rename with no code change except for updating the TRACE_INCLUDE_PATH

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:08:00 -05:00
Nayan Deshmukh
ced5443502 drm/scheduler: fix param documentation
There is no @kernel parameter anymore and document the
@guilty parameter

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-04-11 13:08:00 -05:00
Luis de Bethencourt
e2751493fc drm: Fix trailing semicolon
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.

Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:04 -05:00
Alex Deucher
8205f8840f drm/scheduler: add license to the Makefile
Was missing before.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07 11:52:29 -05:00
Lucas Stach
4983e48c85 drm/sched: move fence slab handling to module init/exit
This is the only part of the scheduler which must not be called from
different drivers. Move it to module init/exit so it is done a single
time when loading the scheduler.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07 11:52:14 -05:00
Lucas Stach
1b1f42d8fd drm: move amd_gpu_scheduler into common location
This moves and renames the AMDGPU scheduler to a common location in DRM
in order to facilitate re-use by other drivers. This is mostly a straight
forward rename with no code changes.

One notable exception is the function to_drm_sched_fence(), which is no
longer a inline header function to avoid the need to export the
drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures.

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07 11:51:56 -05:00