Commit Graph

25 Commits

Author SHA1 Message Date
Philipp Stanner
bead880022 drm/nouveau: Remove waitque for sched teardown
struct nouveau_sched contains a waitque needed to prevent
drm_sched_fini() from being called while there are still jobs pending.
Doing so so far would have caused memory leaks.

With the new memleak-free mode of operation switched on in
drm_sched_fini() by providing the callback nouveau_sched_cancel_job()
the waitque is not necessary anymore.

Remove the waitque.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250710125412.128476-10-phasta@kernel.org
2025-07-10 17:07:09 +02:00
Dave Airlie
9c685f6172 nouveau: set placement to original placement on uvmm validate.
When a buffer is evicted for memory pressure or TTM evict all,
the placement is set to the eviction domain, this means the
buffer never gets revalidated on the next exec to the correct domain.

I think this should be fine to use the initial domain from the
object creation, as least with VM_BIND this won't change after
init so this should be the correct answer.

Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: <stable@vger.kernel.org> # v6.6
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240515025542.2156774-1-airlied@gmail.com
2024-08-01 01:22:12 +02:00
Christophe JAILLET
118b4eed8b drm/nouveau: Constify struct nouveau_job_ops
"struct nouveau_job_ops" is not modified in these drivers.

Constifying this structure moves some data to a read-only section, so
increase overall security.

In order to do it, "struct nouveau_job" and "struct nouveau_job_args" also
need to be adjusted to this new const qualifier.

On a x86_64, with allmodconfig:
Before:
======
   text	   data	    bss	    dec	    hex	filename
   5570	    152	      0	   5722	   165a	drivers/gpu/drm/nouveau/nouveau_exec.o

After:
=====
   text	   data	    bss	    dec	    hex	filename
   5630	    112	      0	   5742	   166e	drivers/gpu/drm/nouveau/nouveau_exec.o

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/860e9753d7867aa46b003bb3d0497f1b04065b24.1718381285.git.christophe.jaillet@wanadoo.fr
2024-06-17 17:22:06 +02:00
Dave Airlie
be141849ec nouveau/uvmm: fix addr/range calcs for remap operations
dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8
was causing a remap operation like the below.

op_remap: prev: 0000003fffed0000 00000000000f0000 00000000a5abd18a 0000000000000000
op_remap: next:
op_remap: unmap: 0000003fffed0000 0000000000100000 0
op_map: map: 0000003ffffc0000 0000000000010000 000000005b1ba33c 00000000000e0000

This was resulting in an unmap operation from 0x3fffed0000+0xf0000, 0x100000
which was corrupting the pagetables and oopsing the kernel.

Fixes the prev + unmap range calcs to use start/end and map back to addr/range.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328024317.2041851-1-airlied@gmail.com
2024-03-28 17:58:31 +01:00
Danilo Krummrich
9a0c32d698 drm/nouveau: don't fini scheduler if not initialized
nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.

Instead of embedding struct nouveau_sched into struct nouveau_cli and
struct nouveau_chan_abi16, allocate struct nouveau_sched separately,
such that we can check for the corresponding pointer to be NULL in the
particular *_fini() functions.

It makes sense to allocate struct nouveau_sched separately anyway, since
in a subsequent commit we can also avoid to allocate a struct
nouveau_sched in nouveau_abi16_ioctl_channel_alloc() at all, if the
VM_BIND uAPI has been disabled due to the legacy uAPI being used.

Fixes: 5f03a507b2 ("drm/nouveau: implement 1:1 scheduler - entity relationship")
Reported-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Closes: https://lore.kernel.org/nouveau/20240131213917.1545604-1-ttabi@nvidia.com/
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202000606.3526-1-dakr@redhat.com
2024-02-12 11:40:46 +01:00
Rob Clark
05d249352f drm/exec: Pass in initial # of objects
In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Patchwork: https://patchwork.freedesktop.org/patch/568338/
2023-12-10 10:38:47 -08:00
Yuran Pereira
b101d08451 drm/nouveau: Removes unnecessary args check in nouveau_uvmm_sm_prepare
Checking `args` after calling `op_map_prepare` is unnecessary since
if `op_map_prepare` was to be called with  NULL args, it would lead
to a NULL pointer dereference, thus never hitting that check.

Hence remove the check and add a note to remind users of this function
to ensure that args != NULL when calling this function for a map
operation as it was suggested by Danilo [1].

[1] https://lore.kernel.org/lkml/6a1ebcef-bade-45a0-9bd9-c05f0226eb88@redhat.com

Suggested-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/GV1PR10MB65637F4BAABFE2D8E261E1DCE8B0A@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
2023-11-30 01:04:12 +01:00
Danilo Krummrich
46990918f3 drm/nouveau: enable dynamic job-flow control
Make use of the scheduler's credit limit and scheduler job's credit
count to account for the actual size of a job, such that we fill up the
ring efficiently.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231114002728.3491-2-dakr@redhat.com
2023-11-24 21:24:51 +01:00
Danilo Krummrich
5f03a507b2 drm/nouveau: implement 1:1 scheduler - entity relationship
Recent patches to the DRM scheduler [1][2] allow for a variable number
of run-queues and add support for (shared) workqueues rather than
dedicated kthreads per scheduler. This allows us to create a 1:1
relationship between a GPU scheduler and a scheduler entity, in order to
properly support firmware schedulers being able to handle an arbitrary
amount of dynamically allocated command ring buffers. This perfectly
matches Nouveau's needs, hence make use of it.

Topology wise we create one scheduler instance per client (handling
VM_BIND jobs) and one scheduler instance per channel (handling EXEC
jobs).

All channel scheduler instances share a workqueue, but every client
scheduler instance has a dedicated workqueue. The latter is required to
ensure that for VM_BIND job's free_job() work and run_job() work can
always run concurrently and hence, free_job() work can never stall
run_job() work. For EXEC jobs we don't have this requirement, since EXEC
job's free_job() does not require to take any locks which indirectly or
directly are held for allocations elsewhere.

[1] https://lore.kernel.org/all/8f53f7ef-7621-4f0b-bdef-d8d20bc497ff@redhat.com/T/
[2] https://lore.kernel.org/all/20231031032439.1558703-1-matthew.brost@intel.com/T/

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231114002728.3491-1-dakr@redhat.com
2023-11-24 21:24:46 +01:00
Danilo Krummrich
014f831abc drm/nouveau: use GPUVM common infrastructure
GPUVM provides common infrastructure to track external and evicted GEM
objects as well as locking and validation helpers.

Especially external and evicted object tracking is a huge improvement
compared to the current brute force approach of iterating all mappings
in order to lock and validate the GPUVM's GEM objects. Hence, make us of
it.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231113221202.7203-1-dakr@redhat.com
2023-11-24 20:58:12 +01:00
Danilo Krummrich
94bc2249f0 drm/gpuvm: add an abstraction for a VM / BO combination
Add an abstraction layer between the drm_gpuva mappings of a particular
drm_gem_object and this GEM object itself. The abstraction represents a
combination of a drm_gem_object and drm_gpuvm. The drm_gem_object holds
a list of drm_gpuvm_bo structures (the structure representing this
abstraction), while each drm_gpuvm_bo contains list of mappings of this
GEM object.

This has multiple advantages:

1) We can use the drm_gpuvm_bo structure to attach it to various lists
   of the drm_gpuvm. This is useful for tracking external and evicted
   objects per VM, which is introduced in subsequent patches.

2) Finding mappings of a certain drm_gem_object mapped in a certain
   drm_gpuvm becomes much cheaper.

3) Drivers can derive and extend the structure to easily represent
   driver specific states of a BO for a certain GPUVM.

The idea of this abstraction was taken from amdgpu, hence the credit for
this idea goes to the developers of amdgpu.

Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-11-dakr@redhat.com
2023-11-13 18:19:20 +01:00
Danilo Krummrich
8af72338dd drm/gpuvm: reference count drm_gpuvm structures
Implement reference counting for struct drm_gpuvm.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-10-dakr@redhat.com
2023-11-13 18:19:06 +01:00
Danilo Krummrich
266f7618e7 drm/nouveau: separately allocate struct nouveau_uvmm
Allocate struct nouveau_uvmm separately in preparation for subsequent
commits introducing reference counting for struct drm_gpuvm.

While at it, get rid of nouveau_uvmm_init() as indirection of
nouveau_uvmm_ioctl_vm_init() and perform some minor cleanups.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-9-dakr@redhat.com
2023-11-13 18:19:06 +01:00
Danilo Krummrich
809ef191ee drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
Introduce flags for struct drm_gpuvm, this required by subsequent
commits.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-8-dakr@redhat.com
2023-11-13 18:18:50 +01:00
Danilo Krummrich
6118411428 drm/nouveau: make use of the GPUVM's shared dma-resv
DRM GEM objects private to a single GPUVM can use a shared dma-resv.
Make use of the shared dma-resv of GPUVM rather than a driver specific
one.

The shared dma-resv originates from a "root" GEM object serving as
container for the dma-resv to make it compatible with drm_exec.

In order to make sure the object proving the shared dma-resv can't be
freed up before the objects making use of it, let every such GEM object
take a reference on it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-7-dakr@redhat.com
2023-11-13 18:18:12 +01:00
Danilo Krummrich
bbe8458037 drm/gpuvm: add common dma-resv per struct drm_gpuvm
Provide a common dma-resv for GEM objects not being used outside of this
GPU-VM. This is used in a subsequent patch to generalize dma-resv,
external and evicted object handling and GEM validation.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-6-dakr@redhat.com
2023-11-13 18:18:00 +01:00
Danilo Krummrich
b41e297abd drm/nouveau: make use of drm_gpuvm_range_valid()
Use drm_gpuvm_range_valid() in order to validate userspace requests.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-5-dakr@redhat.com
2023-11-13 18:18:00 +01:00
Danilo Krummrich
546ca4d35d drm/gpuvm: convert WARN() to drm_WARN() variants
Use drm_WARN() and drm_WARN_ON() variants to indicate drivers the
context the failing VM resides in.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-2-dakr@redhat.com
2023-11-13 18:17:26 +01:00
Danilo Krummrich
78f54469b8 drm/nouveau: uvmm: rename 'umgr' to 'base'
Rename struct drm_gpuvm within struct nouveau_uvmm from 'umgr' to base.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230920144343.64830-4-dakr@redhat.com
2023-09-26 01:58:29 +02:00
Danilo Krummrich
f72c2db470 drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
Rename struct drm_gpuva_manager to struct drm_gpuvm including
corresponding functions. This way the GPUVA manager's structures align
very well with the documentation of VM_BIND [1] and VM_BIND locking [2].

It also provides a better foundation for the naming of data structures
and functions introduced for implementing a common dma-resv per GPU-VM
including tracking of external and evicted objects in subsequent
patches.

[1] Documentation/gpu/drm-vm-bind-async.rst
[2] Documentation/gpu/drm-vm-bind-locking.rst

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230920144343.64830-2-dakr@redhat.com
2023-09-26 01:51:28 +02:00
Danilo Krummrich
b4e9fa9335 drm/nouveau: uvmm: fix unset region pointer on remap
Transfer the region pointer of a uvma to the new uvma(s) on re-map to
prevent potential shader faults when the re-mapped uvma(s) are unmapped.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230820222920.2344-1-dakr@redhat.com
2023-08-22 19:32:22 +02:00
Danilo Krummrich
a3540b46e9 drm/nouveau: uvmm: remove dedicated VM pointer from VMAs
VMAs can find their corresponding VM through their embedded struct
drm_gpuva which already carries a pointer to a struct drm_gpuva_manager
which the VM is based on. Hence, remove the struct nouveau_uvmm pointer
from struct nouveau_uvma to save a couple of bytes per mapping.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230807163238.2091-6-dakr@redhat.com
2023-08-08 04:47:19 +02:00
Danilo Krummrich
3cbc772107 drm/nouveau: uvmm: remove incorrect calls to mas_unlock()
Remove incorrect calls to mas_unlock() in the unwind path of
__nouveau_uvma_region_insert(). The region maple tree uses an external
lock instead, namely the global uvmm lock.

Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230807163238.2091-5-dakr@redhat.com
2023-08-08 04:47:19 +02:00
Danilo Krummrich
e39701e33a drm/nouveau: remove incorrect __user annotations
Fix copy-paste error causing EXEC and VM_BIND syscalls data pointers
to carry incorrect __user annotations.

Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230807163238.2091-4-dakr@redhat.com
2023-08-08 04:47:19 +02:00
Danilo Krummrich
b88baab828 drm/nouveau: implement new VM_BIND uAPI
This commit provides the implementation for the new uapi motivated by the
Vulkan API. It allows user mode drivers (UMDs) to:

1) Initialize a GPU virtual address (VA) space via the new
   DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion of VA
   space managed by the kernel and userspace, respectively.

2) Allocate and free a VA space region as well as bind and unbind memory
   to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
   UMDs can request the named operations to be processed either
   synchronously or asynchronously. It supports DRM syncobjs
   (incl. timelines) as synchronization mechanism. The management of the
   GPU VA mappings is implemented with the DRM GPU VA manager.

3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl. The
   execution happens asynchronously. It supports DRM syncobj (incl.
   timelines) as synchronization mechanism. DRM GEM object locking is
   handled with drm_exec.

Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, use the DRM
GPU scheduler for the asynchronous paths.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230804182406.5222-12-dakr@redhat.com
2023-08-04 20:34:41 +02:00