Commit Graph

182 Commits

Author SHA1 Message Date
Brian Welty
8049e3954a drm/xe: Fix bounds checking in __xe_bo_placement_for_flags()
Requesting all memory regions on PVC will fill bo->placements up to
XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip
over the bounds checking even though XE_PL_STOLEN is not expected to
be used in this case.

This is hit with igt@xe_exec_fault_mode@once-basic-prefetch:
    xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed!
    WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe]

Is fixed here by moving the bounds checks closer to where we actually
write into the bo->placement array.

Fixes: 8c54ee8a86 ("drm/xe: Ensure that we don't access the placements array out-of-bounds")
Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
(cherry picked from commit 52e3fa3e3e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:37:03 +01:00
Thomas Hellström
77232e6a28 drm/xe: Annotate xe_mem_region::mapping with __iomem
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: 0887a2e7ab ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit 20855b62a3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15 15:36:42 +01:00
Brian Welty
52e3fa3e3e drm/xe: Fix bounds checking in __xe_bo_placement_for_flags()
Requesting all memory regions on PVC will fill bo->placements up to
XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip
over the bounds checking even though XE_PL_STOLEN is not expected to
be used in this case.

This is hit with igt@xe_exec_fault_mode@once-basic-prefetch:
    xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed!
    WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe]

Is fixed here by moving the bounds checks closer to where we actually
write into the bo->placement array.

Fixes: 8c54ee8a86 ("drm/xe: Ensure that we don't access the placements array out-of-bounds")
Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2024-01-12 09:36:37 -08:00
Thomas Hellström
20855b62a3 drm/xe: Annotate xe_mem_region::mapping with __iomem
The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: 0887a2e7ab ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com
2024-01-09 18:00:55 +01:00
Badal Nilawar
fa78e188d8 drm/xe/dgfx: Release mmap mappings on rpm suspend
Release all mmap mappings for all vram objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.

Upon userfault, in order to access memory mappings, if graphics
function is in D3 then runtime resume of dgpu will be triggered to
transition to D0.

v2:
  - Avoid iomem check before bo migration check as bo can migrate
    to system memory (Matthew Auld)
v3:
  - Delete bo userfault link during bo destroy
  - Upon bo move (vram-smem), do bo userfault link deletion in
    xe_bo_move_notify instead of xe_bo_move (Thomas Hellström)
  - Grab lock in rpm hook while deleting bo userfault link (Matthew Auld)
v4:
  - Add kernel doc and wrap vram_userfault related
    stuff in the structure (Matthew Auld)
  - Get rpm wakeref before taking dma reserve lock (Matthew Auld)
  - In suspend path apply lock for entire list op
    including list iteration (Matthew Auld)
v5:
  - Use mutex lock instead of spin lock
v6:
  - Fix review comments (Matthew Auld)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #For the xe_bo_move_notify() changes
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20240104130702.950078-1-badal.nilawar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-01-08 16:55:44 -05:00
Lucas De Marchi
80166e9567 drm/xe/bo: Remove unusued variable
bo is not used since all the checks are against tbo. Fix warning:

	../drivers/gpu/drm/xe/xe_bo.c: In function ‘xe_evict_flags’:
	../drivers/gpu/drm/xe/xe_bo.c:250:23: error: variable ‘bo’ set but not used [-Werror=unused-but-set-variable]
	  250 |         struct xe_bo *bo;

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:17 -05:00
Himal Prasad Ghimiray
266c858852 drm/xe/xe2: Handle flat ccs move for igfx.
- Clear flat ccs during user bo creation.
- copy ccs meta data between flat ccs and bo during eviction and
restore.
- Add a bool field ccs_cleared in bo, true means ccs region of bo is
already cleared.

v2:
 - Rebase.

v3:
 - Maintain order of xe_bo_move_notify for ttm_bo_type_sg.

v4:
 - xe_migrate_copy can be used to copy src to dst bo on igfx too.
Add a bool which handles only ccs metadata copy.

v5:
- on dgfx ccs should be cleared even if the bo is not compression enabled.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:15 -05:00
Himal Prasad Ghimiray
20561efb0f drm/xe/xe2: Allocate extra pages for ccs during bo create
Incase of bo move from PL_TT to PL_SYSTEM these pages will be used to
store ccs metadata from flat ccs. And during bo move to PL_TT from
PL_SYSTEM the metadata will be copied from extra pages to flat ccs. This
copy of ccs metadata ensures ccs remains unaltered between swapping out
of bo to disk and its restore to PL_TT.

Bspec:58796

v2:
 - For dgfx ensure system bit is not set.
 - Modify comments.(Thomas)

v3:
 - Separate out patch to modify main memory to ccs memory ratio.(Matt)

v4:
 - Update description for commit message.
 - Make bo allocation routine more readable.(Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:09 -05:00
Thomas Hellström
24f947d58f drm/xe: Use DRM GPUVM helpers for external- and evicted objects
Adapt to the DRM_GPUVM helpers moving removing a lot of complicated
driver-specific code.

For now this uses fine-grained locking for the evict list and external
object list, which may incur a slight performance penalty in some
situations.

v2:
- Don't lock all bos and validate on LR exec submissions (Matthew Brost)
- Add some kerneldoc

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231212100144.6833-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:46:08 -05:00
Rodrigo Vivi
7a56bd0cfb drm/xe/uapi: Fix various struct padding for 64b alignment
Let's respect Documentation/process/botching-up-ioctls.rst
and add the proper padding for a 64b alignment with all as
well as all the required checks and settings for the pads
and the reserved entries.

v2: Fix remaining holes and double check with pahole (Jose)
    Ensure with pahole that both 32b and 64b have exact same
    layout (Thomas)
    Do not set query's pad and reserved bits to zero since it
    is redundant and already done by kzalloc (Matt)

v3: Fix alignment after rebase (José Roberto de Souza)

v4: Fix pad check (Francois Dugast)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:45:19 -05:00
Mauro Carvalho Chehab
4e03b58414 drm/xe/uapi: Reject bo creation of unaligned size
For xe bo creation we request passing size which matches system or
vram minimum page alignment. This way we want to ensure userspace
is aware of region constraints and not aligned allocations will be
rejected returning EINVAL.

v2:
- Rebase, Update uAPI documentation. (Thomas)
v3:
- Adjust the dma-buf kunit test accordingly. (Thomas)
v4:
- Fixed rebase conflicts and updated commit message. (Francois)

Signed-off-by: Mauro Carvalho Chehab <mauro.chehab@linux.intel.com>
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:13 -05:00
Rodrigo Vivi
6b8c1edc4f drm/xe/uapi: Separate bo_create placement from flags
Although the flags are about the creation, the memory placement
of the BO deserves a proper dedicated field in the uapi.

Besides getting more clear, it also allows to remove the
'magic' shifts from the flags that was a concern during the
uapi reviews.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:45:12 -05:00
Michał Winiarski
0e1a47fcab drm/xe: Add a helper for DRM device-lifetime BO create
A helper for managed BO allocations makes it possible to remove specific
"fini" actions and will simplify the following patches adding ability to
execute a release action for specific BO directly.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:11 -05:00
Pallavi Mishra
622f709ca6 drm/xe/uapi: Add support for CPU caching mode
Allow userspace to specify the CPU caching mode at object creation.
Modify gem create handler and introduce xe_bo_create_user to replace
xe_bo_create. In a later patch we will support setting the pat_index as
part of vm_bind, where expectation is that the coherency mode extracted
from the pat_index must be least 1way coherent if using cpu_caching=wb.

v2
  - s/smem_caching/smem_cpu_caching/ and
    s/XE_GEM_CACHING/XE_GEM_CPU_CACHING/. (Matt Roper)
  - Drop COH_2WAY and just use COH_NONE + COH_AT_LEAST_1WAY; KMD mostly
    just cares that zeroing/swap-in can't be bypassed with the given
    smem_caching mode. (Matt Roper)
  - Fix broken range check for coh_mode and smem_cpu_caching and also
    don't use constant value, but the already defined macros. (José)
  - Prefer switch statement for smem_cpu_caching -> ttm_caching. (José)
  - Add note in kernel-doc for dgpu and coherency modes for system
    memory. (José)
v3 (José):
  - Make sure to reject coh_mode == 0 for VRAM-only.
  - Also make sure to actually pass along the (start, end) for
    __xe_bo_create_locked.
v4
  - Drop UC caching mode. Can be added back if we need it. (Matt Roper)
  - s/smem_cpu_caching/cpu_caching. Idea is that VRAM is always WC, but
    that is currently implicit and KMD controlled. Make it explicit in
    the uapi with the limitation that it currently must be WC. For VRAM
    + SYS objects userspace must now select WC. (José)
  - Make sure to initialize bo_flags. (José)
v5
  - Make to align with the other uapi and prefix uapi constants with
    DRM_ (José)
v6:
  - Make it clear that zero cpu_caching is only allowed for kernel
    objects. (José)
v7: (Oak)
  - With all the changes from the original design, it looks we can
    further simplify here and drop the explicit coh_mode. We can just
    infer the coh_mode from the cpu_caching. i.e reject cpu_caching=wb +
    coh_none. It's one less thing for userspace to maintain so seems
    worth it.
v8:
  - Make sure to also update the kselftests.

Testcase: igt@xe_mmap@cpu-caching
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Zhengguo Xu <zhengguo.xu@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Zhengguo Xu <zhengguo.xu@intel.com>
Acked-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:07 -05:00
Thomas Hellström
e7c9e049e0 drm/xe/bo: Remove leftover trace_printk()
trace_printk() is not intended for production code. Remove it.

Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-4-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:57 -05:00
Thomas Hellström
a21fe5ee59 drm/xe/bo: Rename xe_bo_get_sg() to xe_bo_sg()
Using "get" typically refers to obtaining a refcount, which we don't do
here so rename to xe_bo_sg().

Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Ohad Sharabi<osharabi@habana.ai>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:57 -05:00
Thomas Hellström
8c54ee8a86 drm/xe: Ensure that we don't access the placements array out-of-bounds
Ensure, using xe_assert that the various try_add_<placement> functions
don't access the bo placements array out-of-bounds.

v2:
- Remove the places argument to make sure the xe_assert operates on
  the array we're actually populating. (Matthew Auld)

Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231123153158.12779-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:57 -05:00
Matthew Auld
e7b4ebd7c6 drm/xe/bo: don't hold dma-resv lock over drm_gem_handle_create
This seems to create a locking inversion with object_name_lock. The lock
is held by drm_prime_fd_to_handle when calling our xe_gem_prime_import
hook, which might eventually go on to grab the dma-resv lock during the
attach. However we also have the opposite locking order in
xe_gem_create_ioctl which is holding the dma-resv lock when calling
drm_gem_handle_create, which wants to eventually grab object_name_lock:

-> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
<4> [635.739288]        lock_acquire+0x169/0x3d0
<4> [635.739294]        __ww_mutex_lock.constprop.0+0x164/0x1e60
<4> [635.739300]        ww_mutex_lock_interruptible+0x42/0x1a0
<4> [635.739305]        drm_gem_shmem_pin+0x4b/0x140 [drm_shmem_helper]
<4> [635.739317]        dma_buf_dynamic_attach+0x101/0x430
<4> [635.739323]        xe_gem_prime_import+0xcc/0x2e0 [xe]
<4> [635.739499]        drm_prime_fd_to_handle_ioctl+0x184/0x2e0 [drm]
<4> [635.739594]        drm_ioctl_kernel+0x16f/0x250 [drm]
<4> [635.739693]        drm_ioctl+0x35e/0x620 [drm]
<4> [635.739789]        __x64_sys_ioctl+0xb7/0xf0
<4> [635.739794]        do_syscall_64+0x3c/0x90
<4> [635.739799]        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<4> [635.739805]
-> #0 (&dev->object_name_lock){+.+.}-{3:3}:
<4> [635.739813]        check_prev_add+0x1ba/0x14a0
<4> [635.739818]        __lock_acquire+0x203e/0x2ff0
<4> [635.739823]        lock_acquire+0x169/0x3d0
<4> [635.739827]        __mutex_lock+0x124/0x1310
<4> [635.739832]        drm_gem_handle_create+0x32/0x50 [drm]
<4> [635.739927]        xe_gem_create_ioctl+0x1d3/0x550 [xe]
<4> [635.740102]        drm_ioctl_kernel+0x16f/0x250 [drm]
<4> [635.740197]        drm_ioctl+0x35e/0x620 [drm]
<4> [635.740293]        __x64_sys_ioctl+0xb7/0xf0
<4> [635.740297]        do_syscall_64+0x3c/0x90
<4> [635.740302]        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<4> [635.740307]

It looks like it should be safe to simply drop the dma-resv lock prior
to publishing the object when calling drm_gem_handle_create.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/743
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:39 -05:00
Francois Dugast
d5dc73dbd1 drm/xe/uapi: Add missing DRM_ prefix in uAPI constants
Most constants defined in xe_drm.h use DRM_XE_ as prefix which is
helpful to identify the name space. Make this systematic and add
this prefix where it was missing.

v2:
- fix vertical alignment of define values
- remove double DRM_ in some variables (José Roberto de Souza)

v3: Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:33 -05:00
Rodrigo Vivi
ddfa2d6a84 drm/xe/uapi: Kill VM_MADVISE IOCTL
Remove unused IOCTL.
Without any userspace using it we need to remove before we
can be accepted upstream.

At this point we are breaking the compatibility for good,
so we don't need to break when we are in-tree. So, let's
also use this breakage to sort out the IOCTL entries and
fix all the small indentation and line issues.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:44:32 -05:00
Maarten Lankhorst
44e694958b drm/xe/display: Implement display support
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there.

We do this by recompiling i915/display code twice.
Now that i915 has been adapted to support the Xe build, we can add
the xe/display support.

This initial work is a collaboration of many people and unfortunately
this squashed patch won't fully honor the proper credits.
But let's try to add a few from the squashed patches:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2023-12-21 11:43:39 -05:00
Brian Welty
fd0975b7cf drm/xe: Replace usage of mem_type_to_tile
Currently mem_type_to_tile() is being used to access the tile's underlying
tile.mem.vram. However, this function makes the assumption that a mem_type
will only ever map to a single tile. Now that the TTM vram manager contains
a pointer to the memory_region, make use of this in xe_bo.c.

As such, introduce a helper function res_to_mem_region() to get the
ttm_vram_mgr->vram from the BO's resource, and use this to replace usage
of mem_type_to_tile().

xe_tile is still needed to choose the migration context, so this part is
unchanged. But as this is only renaming usage, function is renamed now to
mem_type_to_migrate().

Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:32 -05:00
Brian Welty
4e11a1411a drm/xe: Remove unused xe_bo_to_tile
Unused and would like to remove the memtype_to_tile() which it calls.

Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:32 -05:00
Matthew Auld
503a6f4e4f drm/xe/bo: sync kernel fences for KMD buffers
With things like pipelined evictions, VRAM pages can be marked as free
and yet still have some active kernel fences, with the idea that the
next caller to allocate the memory will respect them. However it looks
like we are missing synchronisation for KMD internal buffers, like
page-tables, lrc etc. For userspace objects we should already have the
required synchronisation for CPU access via the fault handler, and
likewise for GPU access when vm_binding them.

To fix this synchronise against any kernel fences for all KMD objects at
creation. This should resolve some severe corruption seen during
evictions.

v2 (Matt B):
  - Revamp the comment explaining this. Also mention why USAGE_KERNEL is
    correct here.
v3 (Thomas):
  - Make sure to use ctx.interruptible for the wait.

Testcase: igt@xe-evict-ccs
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/853
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/855
Reported-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:32 -05:00
Pallavi Mishra
02cadbb5d1 drm/xe: Align size to PAGE_SIZE
Ensure alignment with PAGE_SIZE for the size parameter
passed to __xe_bo_create_locked()

v2: move size alignment under else condition (Lucas)

Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230920213259.3458968-1-pallavi.mishra@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:16 -05:00
Tejas Upadhyay
b27970f3e1 drm/xe: Add tracking support for bos per client
In order to show per client memory consumption, we
need tracking support APIs to add at every bo consumption
and removal. Adding APIs here to add tracking calls at
places wherever it is applicable.

V5:
  - Rebase
V4:
  - remove client bo before vm_put
  - spin_lock_irqsave not required - Auld
V3:
  - update .h to return xe_drm_client_remove_bo void
  - protect xe_drm_client_remove_bo under CONFIG_PROC_FS check - Himal
  - Fixed Checkpatch error - CI
V2:
  - make xe_drm_client_remove_bo return void - Himal

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:15 -05:00
Thomas Hellström
fc678ec7c2 drm/xe: Reinstate pipelined fence enable_signaling
With the GPUVA conversion, the xe_bo::vmas member became replaced with
drm_gem_object::gpuva.list, however there was a couple of usage instances
left using the old member. Most notably the pipelined fence
enable_signaling.

Remove the xe_bo::vmas member completely, fix usage instances and
also enable this pipelined fence enable_signaling even for faulting
VM:s since we actually wait for bind fences to complete.

v2:
- Rebase.
v3:
- Fix display code build error.

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230915172606.14436-1-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:14 -05:00
Francois Dugast
c73acc1eeb drm/xe: Use Xe assert macros instead of XE_WARN_ON macro
The XE_WARN_ON macro maps to WARN_ON which is not justified
in many cases where only a simple debug check is needed.
Replace the use of the XE_WARN_ON macro with the new xe_assert
macros which relies on drm_*. This takes a struct drm_device
argument, which is one of the main changes in this commit. The
other main change is that the condition is reversed, as with
XE_WARN_ON a message is displayed if the condition is true,
whereas with xe_assert it is if the condition is false.

v2:
- Rebase
- Keep WARN splats in xe_wopcm.c (Matt Roper)

v3:
- Rebase

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:08 -05:00
Thomas Hellström
d00e9cc28e drm/xe/vm: Simplify and document xe_vm_lock()
The xe_vm_lock() function was unnecessarily using ttm_eu_reserve_buffers().
Simplify and document the interface.

v4:
- Improve on xe_vm_lock() documentation (Matthew Brost)
v5:
- Rebase conflict.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:07 -05:00
Thomas Hellström
08a4f00e62 drm/xe/bo: Simplify xe_bo_lock()
xe_bo_lock() was, although it only grabbed a single lock, unnecessarily
using ttm_eu_reserve_buffers(). Simplify and document the interface.

v2:
- Update also the xe_display subsystem.
v4:
- Reinstate a lost dma_resv_reserve_fences().
- Improve on xe_bo_lock() documentation (Matthew Brost)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908091716.36984-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:06 -05:00
Pallavi Mishra
9c0d779fc6 drm/xe: Prevent return with locked vm
Reorder vm_id check after the one for VISIBLE_VRAM. This should
prevent returning with locked vm in error scenario.

Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:28 -05:00
Oak Zeng
0887a2e7ab drm/xe: Make xe_mem_region struct
Make a xe_mem_region structure which will be used in the
coming patches. The new structure is used in both xe device
level (xe->mem.vram) and xe_tile level (tile->vram).

Make the definition of xe_mem_region.dpa_base to be the DPA
base of this memory region and change codes according to
this new definition.

v1:
  - rename xe_mem_region.base to dpa_base per conversation with Mike
    Ruhl

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:19 -05:00
Matthew Auld
ca8656a2eb drm/xe: skip rebind_list if vma destroyed
If we are closing a vm, mark each vma as XE_VMA_DESTROYED and skip
touching the rebind_list if this is seen on the eviction path. That way
we can safely drop the vm dma-resv lock on the close path without
needing to worry about racing with the eviction path trying to add stuff
to the rebind_list which can corrupt our contended list, since the
destroy and rebind links are the same list entry underneath.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/514
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:19 -05:00
Maarten Lankhorst
2a368a09ae drm/xe: Fix error paths of __xe_bo_create_locked
ttm_bo_init_reserved() calls the destroy() callback if it fails.

Because of this, __xe_bo_create_locked is required to be responsible
for freeing the bo even when it's passed in as argument.

Additionally, if the placement check fails, the bo was kept alive.
Fix it too.

Reported-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00
Francois Dugast
99fea68288 drm/xe: Prefer WARN() over BUG() to avoid crashing the kernel
Replace calls to XE_BUG_ON() with calls XE_WARN_ON() which in turn calls
WARN() instead of BUG(). BUG() crashes the kernel and should only be
used when it is absolutely unavoidable in case of catastrophic and
unrecoverable failures, which is not the case here.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00
Lucas De Marchi
b23ebae7ab drm/xe: Set PTE_DM bit for stolen on MTL
Integrated graphics 1270 and beyond should set the PTE_LM bit in the PTE
when it's stolen memory. Add a new function, xe_bo_is_stolen_devmem(),
and use it when encoding the PTE.

In some places in the spec the PTE bit is called "Local Memory",
abbreviated as LM, and in others it's called "Device Memory" (DM). Since
we moved away from "Local Memory" and preferred the "vram" terminology,
also rename the macros as DM to follow the name of the new function.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:05 -05:00
Lucas De Marchi
937b4be72b drm/xe: Decouple vram check from xe_bo_addr()
The output arg is_vram in xe_bo_addr() is unused by several callers.
It's also not what the function is mainly doing. Remove the argument and
let the interested callers to call xe_bo_is_vram().

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:05 -05:00
Matthew Auld
cd928fced9 drm/xe/uapi: add the userspace bits for small-bar
Mostly the same as i915. We add a new hint for userspace to force an
object into the mappable part of vram.

We also need to tell userspace how large the mappable part is. In Vulkan
for example, there will be two vram heaps for small-bar systems. And
here the size of each heap needs to be known. Likewise the used/avail
tracking needs to account for the mappable part.

We also limit the available tracking going forward, such that we limit
to privileged users only, since these values are system wide and are
technically considered an info leak.

v2 (Maarten):
  - s/NEEDS_CPU_ACCESS/NEEDS_VISIBLE_VRAM/ in the uapi. We also no
    longer require smem as an extra placement. This is more flexible,
    and lets us use this for clear-color surfaces, since we need CPU access
    there but we don't want to attach smem, since that effectively disables
    CCS from kernel pov.
  - Reject clear-color CCS buffers where NEEDS_VISIBLE_VRAM is not set,
    instead of migrating it behind the scenes.
v3 (José):
  - Split the changes that limit the accounting for perfmon_capable()
    into a separate patch.
  - Use XE_BO_CREATE_VRAM_MASK.
v4 (Gwan-gyeong Mun):
  - Add some kernel-doc for the query bits.
v5:
  - One small kernel-doc correction. The cpu_visible_size and
    corresponding used tracking are always zero for non
    XE_MEM_REGION_CLASS_VRAM.
v6:
  - Without perfmon_capable() it likely makes more sense to report as
    zero, instead of reporting as used == total size. This should give
    similar behaviour as i915 which rather tracks free instead of used.
  - Only enforce NEEDS_VISIBLE_VRAM on rc_ccs_cc_plane surfaces when the
    device is actually small-bar.

Testcase: igt/tests/xe_query
Testcase: igt/tests/xe_mmap@small-bar
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
6a024f1bfd drm/xe/bo: support tiered vram allocation for small-bar
Add the new flag XE_BO_NEEDS_CPU_ACCESS, to force allocating in the
mappable part of vram. If no flag is specified we do a topdown
allocation, to limit the chances of stealing the precious mappable part,
if we don't need it. If this is a full-bar system, then this all gets
nooped.

For kernel users, it looks like xe_bo_create_pin_map() is the central
place which users should call if they want CPU access to the object, so
add the flag there.

We still need to plumb this through for userspace allocations. Also it
looks like page-tables are using pin_map(), which is less than ideal. If
we can already use the GPU to do page-table management, then maybe we
should just force that for small-bar.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matt Roper
7a060d786c drm/xe/mtl: Map PPGTT as CPU:WC
On MTL and beyond, the GPU performs non-coherent accesses to the PPGTT
page tables.  These page tables should be mapped as CPU:WC.

Removes CAT errors triggered by xe_exec_basic@once-basic on MTL:

   xe 0000:00:02.0: [drm:__xe_pt_bind_vma [xe]] Preparing bind, with range [1a0000...1a0fff) engine 0000000000000000.
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]] 1 entries to update
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]]  0: Update level 3 at (0 + 1) [0...8000000000) f:0
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Timedout job: seqno=4294967169, guc_id=2, flags=0x4

v2:
 - Rename to XE_BO_PAGETABLE to make it more clear that this BO is the
   pagetable itself, rather than just being bound in the PPGTT.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230725003433.1992137-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Brost
1655c893af drm/xe: Reduce the number list links in xe_vma
Combine the userptr, rebind, and destroy links into a union as
the lists these links belong to are mutually exclusive.

v2: Adjust which lists are combined (Thomas H)
v3: Add kernel doc why this is safe (Thomas H), remove related change
of list_del_init -> list_del (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Francois Dugast
72e8d73b71 drm/xe: Cleanup style warnings and errors
Fix 6 errors and 20 warnings reported by checkpatch.pl.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Matthew Auld
ee82d2da9c drm/xe: add missing bulk_move reset
It looks like bulk_move is set during object construction, but is only
removed on object close, however in various places we might not yet have
an actual fd to close, like on the error paths for the gem_create ioctl,
and also one internal user for the evict_test_run_gt() selftest. Try to
handle those cases by manually resetting the bulk_move. This should
prevent triggering:

WARNING: CPU: 7 PID: 8252 at drivers/gpu/drm/ttm/ttm_bo.c:327
ttm_bo_release+0x25e/0x2a0 [ttm]

v2 (Nirmoy):
  - It should be safe to just unconditionally call
    __xe_bo_unset_bulk_move() in most places.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:37 -05:00
Francois Dugast
3e8e7ee6a3 drm/xe: Cleanup style warnings
Reduce the number of warnings reported by checkpatch.pl from 118 to 48 by
addressing those warnings types:

  LEADING_SPACE
  LINE_SPACING
  BRACES
  TRAILING_SEMICOLON
  CONSTANT_COMPARISON
  BLOCK_COMMENT_STYLE
  RETURN_VOID
  ONE_SEMICOLON
  SUSPECT_CODE_INDENT
  LINE_CONTINUATIONS
  UNNECESSARY_ELSE
  UNSPECIFIED_INT
  UNNECESSARY_INT
  MISORDERED_TYPE

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:31 -05:00
Francois Dugast
b8c1ba831e drm/xe: Prevent flooding the kernel log with XE_IOCTL_ERR
Lower log level of XE_IOCTL_ERR macro to debug in order to prevent flooding
kernel log.

v2: Rename XE_IOCTL_ERR to XE_IOCTL_DBG (Rodrigo Vivi)
v3: Rebase
v4: Fix style, remove unrelated change about __FILE__ and __LINE__

Link: https://lists.freedesktop.org/archives/intel-xe/2023-May/004704.html
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:30 -05:00
Matthew Brost
b06d47be7c drm/xe: Port Xe to GPUVA
Rather than open coding VM binds and VMA tracking, use the GPUVA
library. GPUVA provides a common infrastructure for VM binds to use mmap
/ munmap semantics and support for VK sparse bindings.

The concepts are:

1) xe_vm inherits from drm_gpuva_manager
2) xe_vma inherits from drm_gpuva
3) xe_vma_op inherits from drm_gpuva_op
4) VM bind operations (MAP, UNMAP, PREFETCH, UNMAP_ALL) call into the
GPUVA code to generate an VMA operations list which is parsed, committed,
and executed.

v2 (CI): Add break after default in case statement.
v3: Rebase
v4: Fix some error handling
v5: Use unlocked version VMA in error paths
v6: Rebase, address some review feedback mainly Thomas H
v7: Fix compile error in xe_vma_op_unwind, address checkpatch

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:18 -05:00
Matthew Brost
21ed3327e3 drm/xe: Add helpers to hide struct xe_vma internals
This will help with the GPUVA port as the internals of struct xe_vma
will change.

v2: Update comment around helpers

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:18 -05:00
Tejas Upadhyay
e4b2893c17 drm/xe: Make usable size of VRAM readable
Current size member of vram struct does not give
complete information as what "size" contains. Does
it contain reserved portions or not. Name it usable
size and accordingly describe other size members as
well.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:06 -05:00
Thomas Hellström
a201c6ee37 drm/xe/bo: Evict VRAM to TT rather than to system
The main difference is that we don't bounce and sync on eviction, allowing
for pipelined eviction. Moving forward we also need to be careful with
dma mappings which can be released in SYSTEM but may remain in TT.

v2:
- Remove a stale comment (Matthew Brost)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:04 -05:00
Thomas Hellström
70ff6a999d drm/xe/bo: Gracefully handle errors from ttm_bo_move_accel_cleanup().
The function ttm_bo_move_accel_cleanup() attempts to help pipeline a
move, and in doing so, needs memory allocations which may fail.

Rather than failing in a state where the new resource may freed while
accessed by the copy engine, sync uninterruptible and do a failsafe
cleanup.

v2:
- Don't try to attach the signaled fence on ttm_bo_move_accel_cleanup()
  error.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-4-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:04 -05:00
Thomas Hellström
3439cc4661 drm/xe/bo: Avoid creating a system resource when allocating a fresh VRAM bo
When creating a new bo, on the first move the bo->resource is typically
NULL. Our move callback rejected that instructing TTM to create a system
resource. In addition a struct ttm_tt with a page-vector was created,
although not populated with pages. Similarly when the clearing of VRAM
was complete, the system resource was put on a ghost object and freed
using the TTM delayed destroy mechanism.

This is a lot of pointless work. So avoid creating the system resource and
instead change the code to cope with a NULL bo->resource.

v2:
- Add some code comments (Matthew Brost)
v3:
- Fix a dereference of old_mem which might be NULL.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:04 -05:00
Thomas Hellström
bc2e0215de drm/xe/bo: Fix swapin when moving to VRAM
When a source system resource had been swapped out, we incorrectly
assumed that we were lacking source data for a move and therefore
cleared the destination instead of swapping in and copying the
swapped-out data. Fix this.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:04 -05:00
Matthew Auld
513e826279 drm/xe/bo: consider bo->flags in xe_bo_migrate()
For VRAM allocations the bo->flags can control some characteristics of
the underlying memory, like whether it needs to be contiguous, and in
the future whether it needs to be in the CPU visible portion. Rather use
add_vram() in xe_bo_migrate() which should take care of such things for
us.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:00 -05:00
Matthew Auld
8489f30e0c drm/xe/bo: handle PL_TT -> PL_TT
When moving between PL_VRAM <-> PL_SYSTEM we have to have use PL_TT in
the middle as a temporary resource for the actual copy. In some GL
workloads it can be seen that once the resource has been moved to the
PL_TT we might have to bail out of the ttm_bo_validate(), before
finishing the final hop. If this happens the resource is left as
TTM_PL_FLAG_TEMPORARY, and when the ttm_bo_validate() is restarted the
current placement is always seen as incompatible, requiring us to
complete the move.  However if the BO allows PL_TT as a possible
placement we can end up attempting a PL_TT -> PL_TT move (like when
running out of VRAM) which leads to explosions in xe_bo_move(), like
triggering the XE_BUG_ON(!tile).

Going from TTM_PL_FLAG_TEMPORARY with PL_TT -> PL_VRAM should already
work as-is, so it looks like we only need to worry about PL_TT -> PL_TT
and it looks like we can just treat it as a dummy move, since no real
move is needed.

Reported-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:34:53 -05:00
Matthew Brost
7ba4c5f027 drm/xe: VM LRU bulk move
Use the TTM LRU bulk move for BOs tied to a VM. Update the bulk moves
LRU position on every exec.

v2: Bulk move for compute VMs, use WARN rather than BUG

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:34:53 -05:00
Matt Roper
08dea76745 drm/xe: Move migration from GT to tile
Migration primarily focuses on the memory associated with a tile, so it
makes more sense to track this at the tile level (especially since the
driver was already skipping migration operations on media GTs).

Note that the blitter engine used to perform the migration always lives
in the tile's primary GT today.  In theory that could change if media
GTs ever start including blitter engines in the future, but we can
extend the design if/when that happens in the future.

v2:
 - Fix kunit test build
 - Kerneldoc parameter name update
v3:
 - Removed leftover prototype for removed function.  (Gustavo)
 - Remove unrelated / unwanted error handling change.  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-15-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:15 -05:00
Matt Roper
876611c2b7 drm/xe: Memory allocations are tile-based, not GT-based
Since memory and address spaces are a tile concept rather than a GT
concept, we need to plumb tile-based handling through lots of
memory-related code.

Note that one remaining shortcoming here that will need to be addressed
before media GT support can be re-enabled is that although the address
space is shared between a tile's GTs, each GT caches the PTEs
independently in their own TLB and thus TLB invalidation should be
handled at the GT level.

v2:
 - Fix kunit test build.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-13-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:14 -05:00
Matt Roper
ebd288cba7 drm/xe: Move VRAM from GT to tile
On platforms with VRAM, the VRAM is associated with the tile, not the
GT.

v2:
 - Unsquash the GGTT handling back into its own patch.
 - Fix kunit test build
v3:
 - Tweak the "FIXME" comment to clarify that this function will be
   completely gone by the end of the series.  (Lucas)
v4:
 - Move a few changes that were supposed to be part of the GGTT patch
   back to that commit.  (Gustavo)
v5:
 - Kerneldoc parameter name fix.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-11-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Matt Roper
ad703e0637 drm/xe: Move GGTT from GT to tile
The GGTT exists at the tile level.  When a tile contains multiple GTs,
they share the same GGTT.

v2:
 - Include some changes that were mis-squashed into the VRAM patch.
   (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-9-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Michael J. Ruhl
fb31517cd7 drm/xe: Rename GPU offset helper to reflect true usage
The _io_offset helper function is returning an offset into the GPU
address space.  Using the CPU address offset (io_) is not correct.

Rename to reflect usage.
Update to use GPU offset information.
Update PT dma_offset to use the helper

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:10 -05:00
Maarten Lankhorst
094d739f4d drm/xe: Prevent evicting for page tables
When creating page tables from xe_exec_ioctl, we may end up freeing
memory we just validated. To be certain this does not happen, do not
allow the current reservation to be evicted from the ioctl.

Callchain:
[  109.008522]  xe_bo_move_notify+0x5c/0xf0 [xe]
[  109.008548]  xe_bo_move+0x90/0x510 [xe]
[  109.008573]  ttm_bo_handle_move_mem+0xb7/0x170 [ttm]
[  109.008581]  ttm_bo_swapout+0x15e/0x360 [ttm]
[  109.008586]  ttm_device_swapout+0xc2/0x110 [ttm]
[  109.008592]  ttm_global_swapout+0x47/0xc0 [ttm]
[  109.008598]  ttm_tt_populate+0x7a/0x130 [ttm]
[  109.008603]  ttm_bo_handle_move_mem+0x160/0x170 [ttm]
[  109.008609]  ttm_bo_validate+0xe5/0x1d0 [ttm]
[  109.008614]  ttm_bo_init_reserved+0xac/0x190 [ttm]
[  109.008620]  __xe_bo_create_locked+0x153/0x260 [xe]
[  109.008645]  xe_bo_create_locked_range+0x77/0x360 [xe]
[  109.008671]  xe_bo_create_pin_map_at+0x33/0x1f0 [xe]
[  109.008695]  xe_bo_create_pin_map+0x11/0x20 [xe]
[  109.008721]  xe_pt_create+0x69/0xf0 [xe]
[  109.008749]  xe_pt_stage_bind_entry+0x208/0x430 [xe]
[  109.008776]  xe_pt_walk_range+0xe9/0x2a0 [xe]
[  109.008802]  xe_pt_walk_range+0x223/0x2a0 [xe]
[  109.008828]  xe_pt_walk_range+0x223/0x2a0 [xe]
[  109.008853]  __xe_pt_bind_vma+0x28d/0xbd0 [xe]
[  109.008878]  xe_vm_bind_vma+0xc7/0x2f0 [xe]
[  109.008904]  xe_vm_rebind+0x72/0x160 [xe]
[  109.008930]  xe_exec_ioctl+0x22b/0xa70 [xe]
[  109.008955]  drm_ioctl_kernel+0xb9/0x150 [drm]
[  109.008972]  drm_ioctl+0x210/0x430 [drm]
[  109.008988]  __x64_sys_ioctl+0x85/0xb0
[  109.008990]  do_syscall_64+0x38/0x90
[  109.008991]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Original warning:
[ 5613.149126] WARNING: CPU: 3 PID: 45883 at drivers/gpu/drm/xe/xe_vm.c:504 xe_vm_unlock_dma_resv+0x43/0x50 [xe]
...
[ 5613.226398] RIP: 0010:xe_vm_unlock_dma_resv+0x43/0x50 [xe]
[ 5613.316098] Call Trace:
[ 5613.318595]  <TASK>
[ 5613.320743]  xe_exec_ioctl+0x383/0x8a0 [xe]
[ 5613.325278]  ? __is_insn_slot_addr+0x8e/0x110
[ 5613.329719]  ? __is_insn_slot_addr+0x8e/0x110
[ 5613.334116]  ? kernel_text_address+0x75/0xf0
[ 5613.338429]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 5613.343778]  ? __kernel_text_address+0x9/0x40
[ 5613.348181]  ? unwind_get_return_address+0x1a/0x30
[ 5613.353013]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 5613.358362]  ? arch_stack_walk+0x99/0xf0
[ 5613.362329]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.366996]  ? lock_acquire+0x287/0x2f0
[ 5613.370873]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.375530]  ? rcu_read_lock_sched_held+0xb/0x70
[ 5613.380181]  ? lock_release+0x225/0x2e0
[ 5613.384059]  ? __pfx_xe_exec_ioctl+0x10/0x10 [xe]
[ 5613.389092]  drm_ioctl_kernel+0xc0/0x170
[ 5613.393068]  drm_ioctl+0x1b7/0x490
[ 5613.396519]  ? __pfx_xe_exec_ioctl+0x10/0x10 [xe]
[ 5613.401547]  ? lock_release+0x225/0x2e0
[ 5613.405432]  __x64_sys_ioctl+0x8a/0xb0
[ 5613.409232]  do_syscall_64+0x37/0x90

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/239
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:10 -05:00
Matthew Auld
38453f826d drm/xe/bo: further limit where CCS pages are needed
No need to allocate extra pages for this if we know flat-ccs AUX state
is not even possible, like for normal system memory objects.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:04 -05:00
Thomas Hellström
3690a01ba9 drm/xe: Support copying of data between system memory bos
Modify the xe_migrate_copy() function somewhat to explicitly allow
copying of data between two buffer objects including system memory
buffer objects. Update the migrate test accordingly.

v2:
- Check that buffer object sizes match when copying (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:04 -05:00
Christopher Snowhill
1799c761c4 drm/xe: Validate uAPI padding and reserved fields
Padding and reserved fields are declared such that they must be
zeroed, so verify that they're all zero in the respective ioctl
functions.

Derived from original patch by mlankhorst.

v2:
	Removed extensions checks where there were none originally. (José)
	Moved extraneous parentheses to the correct places. (Lucas)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Christopher Snowhill <kode54@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:01 -05:00
Niranjana Vishwanathapura
1b1d371038 drm/xe: Apply upper limit to sg element size
The iommu_dma_map_sg() function ensures iova allocation doesn't
cross dma segment boundary. It does so by padding some sg elements.
This can cause overflow, ending up with sg->length being set to 0.
Avoid this by halving the maximum segment size (rounded down to
PAGE_SIZE).

Specify maximum segment size for sg elements by using
sg_alloc_table_from_pages_segment() to allocate sg_table.

v2: Use correct max segment size in dma_set_max_seg_size() call

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:01 -05:00
Francois Dugast
116d325152 drm/xe: Fix splat during error dump
Allow xe_bo_addr without lock to print debug information, such
as from xe_analyze_vm.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:33:13 -05:00
Matthew Auld
36919ebeaa drm/xe: fix suspend-resume for dgfx
This stopped working now that TTM treats moving a pinned object through
ttm_bo_validate() as an error, for the general case. Add some new
routines to handle the new special casing needed for suspend-resume.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/244
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Niranjana Vishwanathapura
c33a721943 drm/xe: Use proper vram offset
In xe_migrate functions, use proper vram io offset of the
tiles while calculating addresses.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:30 -05:00
Matthew Auld
e7dc1341f0 drm/xe/bo: refactor try_add_vram
Get rid of some of the duplication here. In a future patch we need to
also consider [fpfn, lpfn], so better adjust in only one place.

Suggested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:26 -05:00
Matthew Auld
8deba79f5d drm/xe: add XE_BO_CREATE_VRAM_MASK
So we don't have to keep repeating VRAM0 | VRAM1. Also if there are ever
more instances, then we have less places to update.

Suggested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:26 -05:00
Matthew Brost
59ea53eecb drm/xe: Use BO's GT to determine dma_offset when programming PTEs
Rather than using the passed in GT, use the BO's GT determine dma_offset
when programming PTEs as these two GT's could differ (i.e. mapping a BO
from a remote GT). The BO's GT is correct GT to use as this where BO
resides, while the passed in GT is where the mapping is created.

v2:
  (Thomas)      - Kernel doc, extra new line
  (CI)          - Rebase to tip

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:21 -05:00
Balasubramani Vivekanandan
11a2407ed5 drm/xe: Stop accepting value in xe_migrate_clear
Although xe_migrate_clear() has a value argument, currently the driver
is only passing 0 at all the places this function is invoked with the
exception the kunit tests are using the parameter to validate this
function with different values.
xe_migrate_clear() is failing on platforms with link copy engines
because xe_migrate_clear() via emit_clear() is using the blitter
instruction XY_FAST_COLOR_BLT to clear the memory. But this instruction
is not supported by link copy engine.
So the solution is to use the alternate instruction MEM_SET when
platform contains link copy engine. But MEM_SET instruction accepts only
8-bit value for setting whereas the value agrument of xe_migrate_clear()
is 32-bit.
So instead of spreading this limitation around all invocations of
xe_migrate_clear() and causing more confusion, it was decided to not
accept any value itself as driver does not really need this currently.

All the kunit tests are adapted as per the new function prototype.

This will be followed by a patch to add support for link copy engines.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
1a653b879d drm/xe/buddy: remove the virtualized start
Hopefully not needed anymore. We can add a .compatible() hook once we
need to differentiate between mappable and non-mappable vram. If the
allocation is not contiguous then the start value is kind of
meaningless, so rather just mark as invalid.

In upstream, TTM wants to eventually remove the ttm_resource.start
usage.

References: 544432703b ("drm/ttm: Add new callbacks to ttm res mgr")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
69db25e447 drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt()
xe_ttm_stolen_cpu_inaccessible() was originally meant to just cover the
case where stolen is not directly CPU accessible on some older
integrated platforms, and as such a GGTT mapping was also required for
CPU access (as per the check in xe_bo_create_pin_map_at()).

However with small-bar systems on dgfx we have one more case where
stolen is also inaccessible, however here we don't have any fallback
GGTT mode for CPU access. Fix the check in xe_bo_create_pin_map_at() to
make this distinction clear. In such a case the later vmap() will fail
anyway.

v2: fix kernel-doc warning
v3: Simplify further and remove cpu_inaccessible()

Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Matthew Auld
2a8477f761 drm/xe: s/lmem/vram/
This seems to be the preferred nomenclature in xe. Currently we are
intermixing vram and lmem, which is confusing.

v2 (Gwan-gyeong Mun & Lucas):
  - Rather apply to the entire driver

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Matthew Auld
d79bdcdf06 drm/xe/bo: explicitly reject zero sized BO
In the depths of ttm, when allocating the vma node this should result in
-ENOSPC it seems. However we should probably rather reject as part of
our own ioctl sanity checking, and then treat as programmer error in the
lower levels.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:43 -05:00
Lucas De Marchi
ea9f879d03 drm/xe: Sort includes
Sort includes and split them in blocks:

1) .h corresponding to the .c. Example: xe_bb.c should have a "#include
   "xe_bb.h" first.
2) #include <linux/...>
3) #include <drm/...>
4) local includes
5) i915 includes

This is accomplished by running
`clang-format --style=file -i --sort-includes drivers/gpu/drm/xe/*.[ch]`
and ignoring all the changes after the includes. There are also some
manual tweaks to split the blocks.

v2: Also sort includes in headers

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:20 -05:00
Matthew Auld
6062acc1b8 drm/xe/stolen: don't map stolen on small-bar
The driver should still be functional with small-bar, just that the vram
size is clamped to the BAR size (until we add proper support for tiered
vram). For stolen vram we shouldn't iomap anything if the BAR size
doesn't also contain the stolen portion, since on discrete the stolen
portion is always at the end of normal vram. Stolen should still be
functional, just that allocating CPU visible io memory will always
return an error.

v2 (Lucas)
  - Mention in the commit message that stolen vram is always as the end
    of normal vram, which is why stolen in not mappable on small-bar
    systems.
  - Just make xe_ttm_stolen_inaccessible() return true for such cases.
    Also rename to xe_ttm_stolen_cpu_inaccessible to better describe
    that we are talking about direct CPU access. Plus add some
    kernel-doc.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/209
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:28:54 -05:00
Matthew Auld
f3edf6917c drm/xe/bo: reduce xe_bo_create_pin_map() restrictions
On DGFX this blows up if can call this with a system memory object:

XE_BUG_ON(!mem_type_is_vram(place->mem_type) && place->mem_type != XE_PL_STOLEN);

If we consider dpt it looks like we can already in theory hit this, if
we run out of vram and stolen vram. It at least seems reasonable to
allow calling this on any object which supports CPU access.

Note this also changes the behaviour with stolen VRAM and suspend, such
that we no longer attempt to migrate stolen objects into system memory.
However nothing in stolen should ever need to be restored (same on
integrated), so should be fine. Also on small-bar systems the stolen
portion is pretty much always non-CPU accessible, and currently pinned
objects use plain memcpy when being moved, which doesn't play nicely.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:44 -05:00
Maarten Lankhorst
9b6483af37 drm/xe: Map initial FB at the same place in GGTT too
I saw a flicker when booting xe, and it's very likely that the original
FB was not mapped at the same place when inheriting, fix it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:44 -05:00
Maarten Lankhorst
d8b52a02cb drm/xe: Implement stolen memory.
This adds support for stolen memory, with the same allocator as
vram_mgr. This allows us to skip a whole lot of copy-paste,
by re-using parts of xe_ttm_vram_mgr.

The stolen memory may be bound using VM_BIND, so it performs like any
other memory region.

We should be able to map a stolen BO directly using the physical memory
location instead of through GGTT even on old platforms, but I don't know
what the effects are on coherency.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-12 14:06:00 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00