Commit Graph

23 Commits

Author SHA1 Message Date
Maxime Ripard
93c7dd1b39
Merge drm/drm-next into drm-misc-next
Bring rc1 to start the new release dev.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-06 13:47:32 +01:00
Karol Wachowski
0240fa18d2 accel/ivpu: Dump only first MMU fault from single context
Stop dumping consecutive faults from an already faulty context immediately,
instead of waiting for the context abort thread handler (IRQ handler bottom
half) to abort currently executing jobs.

Remove 'R' (record events) bit from context descriptor of a faulty
context to prevent future faults generation.

This change speeds up the IRQ handler by eliminating the need to print the
fault content repeatedly. Additionally, it prevents flooding dmesg with
errors, which was occurring due to the delay in the bottom half of the
handler stopping fault-generating jobs.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-7-maciej.falkowski@linux.intel.com
2025-01-09 09:35:44 +01:00
Jacek Lawrynowicz
6c9ba75f14 accel/ivpu: Fix memory leak in ivpu_mmu_reserved_context_init()
Add appropriate error handling to ensure all allocated resources are
released upon encountering an error.

Fixes: a74f4d9913 ("accel/ivpu: Defer MMU root page table allocation")
Cc: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-3-jacek.lawrynowicz@linux.intel.com
2024-12-19 13:16:21 +01:00
Karol Wachowski
83b6fa5844 accel/ivpu: Increase DMA address range
Increase DMA address range to:
 * 128 GB on 37xx (due to MMU limitations)
 * 256 GB on other generations
Merge User and DMA ranges on 40xx and above as it is possible
to access whole 256 GBs from both FW and DMA.

Increase User range on 37xx from 255MB to 511MB
to allow loading very large models.

Do not set global_alias_pio_base/size on other generations than 37xx
as it's only used on 37xx anyway.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017145817.121590-11-jacek.lawrynowicz@linux.intel.com
2024-10-30 10:23:57 +01:00
Karol Wachowski
1fc65fa96f accel/ivpu: Unmap partially mapped BOs in case of errors
Ensure that all buffers that were created only partially through
allocated scatter-gather table are unmapped from MMU600 in case of errors.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017145817.121590-6-jacek.lawrynowicz@linux.intel.com
2024-10-30 10:22:07 +01:00
Karol Wachowski
a74f4d9913 accel/ivpu: Defer MMU root page table allocation
Defer root page table allocation and unify context init/fini functions.
Move allocation of the root page table from the file_priv_open function to
perform a lazy allocation approach during ivpu_bo_pin().

By doing so, we avoid the overhead of allocating page tables for simple
operations like GET_PARAM that do not require them.
Additionally, the MMU context descriptor table initialization has been
moved to the ivpu_mmu_context_map_page function.

This change streamlines the process and ensures that the descriptor table
is only initialized when it is actually needed.
Refactor init/fini functions to remove redundant code and make the context
management more straightforward.

Overall, these changes lead to a reduction in the time taken by the file
descriptor open operation, as the costly root page table allocation is now
avoided for operations that do not require it.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017145817.121590-3-jacek.lawrynowicz@linux.intel.com
2024-10-30 10:22:04 +01:00
Wachowski, Karol
72b96ec655 accel/ivpu: Make parts of FW image read-only
Implement setting specified buffer ranges as read-only.
In case if specified range is not 64K aligned and 64K contiguous
MMU600 pages are turned on, split 64K mapping to allow 4K granularity
for read-only configuration.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-10-jacek.lawrynowicz@linux.intel.com
2024-06-14 09:14:29 +02:00
Kent Overstreet
0069455bcb fix missing vmalloc.h includes
Patch series "Memory allocation profiling", v6.

Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.

Example output:
  root@moria-kvm:~# sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation

sysctl:
  /proc/sys/vm/mem_profiling

Runtime info:
  /proc/allocinfo

Notes:

[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)  && CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y

Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:

                        kmalloc                 pgalloc
(1 baseline)            6.764s                  16.902s
(2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)

Memory overhead:
Kernel size:

   text           data        bss         dec         diff
(1) 26515311	      18890222    17018880    62424413
(2) 26524728	      19423818    16740352    62688898    264485
(3) 26524724	      19423818    16740352    62688894    264481
(4) 26524728	      19423818    16740352    62688898    264485
(5) 26541782	      18964374    16957440    62463596    39183

Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:           192 kB
PageExts:         262144 kB (256MB)
SlabExts:           9876 kB (9.6MB)
PcpuExts:            512 kB (0.5MB)

Total overhead is 0.2% of total memory.

Benchmarks:

Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
      baseline       disabled profiling           enabled profiling
avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
stdev 0.0137         0.0188                       0.0077


hackbench -l 10000
      baseline       disabled profiling           enabled profiling
avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
stdev 0.0933         0.0286                       0.0489

stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/

[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/


This patch (of 37):

The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.

[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
  Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
  Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
  Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
  Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:49 -07:00
Wachowski, Karol
8047d36fe5 accel/ivpu: Add debug prints for MMU map/unmap operations
It is common need to be able to see IOVA/physical to VPU addresses
mappings. Especially when debugging different kind of memory related
issues. Lack of such logs forces user to modify and recompile KMD manually.

This commit adds those logs under MMU debug mask which can be turned on
dynamically with module param during KMD load.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-4-jacek.lawrynowicz@linux.intel.com
2024-01-22 10:27:47 +01:00
Maxime Ripard
3bf3e21c15
Merge drm/drm-next into drm-misc-next
Let's kickstart the v6.8 release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-11-15 10:56:44 +01:00
Jacek Lawrynowicz
48aea7f2a2 accel/ivpu: Fix locking in ivpu_bo_remove_all_bos_from_context()
ivpu_bo_remove_all_bos_from_context() could race with ivpu_bo_free()
when prime buffer was closed after vpu device was closed.

Move the bo_list from context to vdev and use a dedicated lock to
sync it. This list is not modified when BO is added/removed from
a context.

Also rename ivpu_bo_free_vpu_addr() to ivpu_bo_unbind() because this
function does more then just free vpu_addr.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-3-stanislaw.gruszka@linux.intel.com
2023-11-08 16:27:35 +01:00
Jacek Lawrynowicz
b035224134 accel/ivpu: Allocate vpu_addr in gem->open() callback
Use gem->open() callback to simplify the code and prepare for gem_shmem
conversion. It is called during handle creation for a gem object,
during prime import and in BO_CREATE ioctl. Hence can be used for vpu_addr
allocation. On the way remove unused bo->user_ptr field.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-2-stanislaw.gruszka@linux.intel.com
2023-11-08 16:26:34 +01:00
Karol Wachowski
3bcc5209ba accel/ivpu: Make DMA allocations for MMU600 write combined
Previously using dma_alloc_wc() API we created cache coherent
(mapped as write-back) mappings.

Because we disable MMU600 snooping it was required to do costly
page walk and cache flushes after each page table modification.

With write-combined buffers it's possible to do a single write memory
barrier to flush write-combined buffer to memory which simplifies the
driver and significantly reduce time of map/unmap operations.

Mapping time of 255 MB is reduced from 2.5 ms to 500 us.

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-7-stanislaw.gruszka@linux.intel.com
2023-10-31 16:45:45 +01:00
Dave Airlie
7cd62eab9b Linux 6.6-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmU1ngkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGrsIH/0k/+gdBBYFFdEym
 foRhKir9WV3ZX4oIozJjA1f7T+qVYclKs6kaYm3gNepRBb6AoG8pdgv4MMAqhYsf
 QMe2XHi0MrO/qKBgfNfivxEa9jq+0QK5uvTbqCRqCAB8LfwVyDqapCmg3EuiZcPW
 UbMITmnwLIfXgPxvp9rabmCsTqO6FLbf0GDOVIkNSAIDBXMpcO1iffjrWUbhRa7n
 oIoiJmWJLcXLxPWDsRKbpJwzw2cIG08YhfQYAiQnC3YaeRm1FKLDIICRBsmfYzja
 rWv9r4dn4TDfV4/AnjggQnsZvz2yPCxNaFSQIT88nIeiLvyuUTJ9j8aidsSfMZQf
 xZAbzbA=
 =NoQv
 -----END PGP SIGNATURE-----

BackMerge tag 'v6.6-rc7' into drm-next

This is needed to add the msm pr which is based on a higher base.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-10-23 18:20:06 +10:00
Wludzik, Jozef
8f5ad367e8 accel/ivpu: Extend address range for MMU mmap
Allow to use whole address range in MMU context mmap which is up to 48
bits. Return invalid argument from MMU context mmap in case address is
not aligned to MMU page size, address is below MMU page size or address
is greater then 47 bits.

This fixes problem disallowing to run large models on VPU4

Signed-off-by: Wludzik, Jozef <jozef.wludzik@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231018110113.547208-1-stanislaw.gruszka@linux.intel.com
2023-10-19 08:01:20 +02:00
Karol Wachowski
34d03f2a17 accel/ivpu: Initialize context with SSID = 1
Context with SSID = 1 is reserved and accesses on that context happen
only when context is uninitialized on the VPU side. Such access triggers
MMU fault (0xa) "Invalid CD Fetch", which doesn't contain any useful
information besides context ID.

This commit will change that state, now (0x10) "Translation fault" will
be triggered and accessed address will shown in the log.

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-7-stanislaw.gruszka@linux.intel.com
2023-09-04 11:01:26 +02:00
Stanislaw Gruszka
edee62c085 accel/ivpu: Add information about context on failure
Identify the mmu context that failed to initialize in the error messages.
This allows the error to be correlated with a specific user during debug.

Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-5-stanislaw.gruszka@linux.intel.com
2023-09-04 11:01:26 +02:00
Jacek Lawrynowicz
0a9cd7924e accel/ivpu: Remove duplicated error messages
Reduce the number of error messages per single failure in
ivpu_dev_init() and ivpu_probe().

Most error messages are already printed by functions called
from ivpu_dev_init(). Add missed error prints in ivpu_ipc_init()
and ivpu_mmu_context_init().

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-3-stanislaw.gruszka@linux.intel.com
2023-09-04 11:01:26 +02:00
Karol Wachowski
162f17b2d9 accel/ivpu: Refactor memory ranges logic
Add new dma range and change naming convention for virtual address
memory ranges managed by KMD.

New available ranges are named as follows:
 * global range - global context accessible by FW
 * aliased range - user context accessible by FW
 * dma range - user context accessible by DMA
 * shave range - user context accessible by shaves
 * global shave range - global context accessible by shave nn

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230731161258.2987564-6-stanislaw.gruszka@linux.intel.com
2023-08-09 13:52:15 +02:00
Karol Wachowski
95d440188d accel/ivpu: Mark 64 kB contiguous areas as contiguous in PTEs
Whenever KMD maps region larger than 64kB that is both aligned and
contiguous, set contiguous bit (52) in MMU PTE descriptor for each page
in that region.

This allows to treat 16 contiguous pages as one and reduce
number of MMU page walks required which results in lower latency.

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230518131605.650622-6-stanislaw.gruszka@linux.intel.com
2023-06-08 07:54:00 +02:00
Karol Wachowski
103d2ea139 accel/ivpu: Rename and cleanup MMU600 page tables
Simplify and unify naming convention in MMU600 page tables
configuration.

All DMA addresses in page tables directly accessed by VPU are called
with _dma sufix and all CPU pointers to those page tables have _ptr
sufix.

Base pointers used to do a page walk on the CPU have corresponding
names:

 pud_ptrs (pointers used to get access to PUD DMA)
 pmd_ptrs (pointers used to get access to PMD DMA)
 pte_ptrs (pointers used to get access to PTE DMA)

with the following convention:

 u64 *pud_dma_ptr = pud_ptrs[pgd_idx];
 *pud_dma_ptr = pud_dma;

 u64 *pmd_dma_ptr = pmd_ptrs[pgd_idx][pud_idx];
 *pmd_dma_ptr = pmd_dma;

 u64 *pte_dma_ptr = pte_ptrs[pgd_idx][pud_idx][pmd_idx];
 *pte_dma_ptr = pte_dma;

On the way change to coherent dma allocation, _wc is only valid on ARM
and was used by mistake.

Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230518131605.650622-5-stanislaw.gruszka@linux.intel.com
2023-06-08 07:53:51 +02:00
Karol Wachowski
a2fd4a6fae accel/ivpu: Add MMU support for 4 level page mappings
Program additional fourth level required for mappings with VA above 38bits.

Co-developed-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230518131605.650622-3-stanislaw.gruszka@linux.intel.com
2023-06-08 07:53:33 +02:00
Jacek Lawrynowicz
263b2ba5fc accel/ivpu: Add Intel VPU MMU support
VPU Memory Management Unit is based on ARM MMU-600.
It allows the creation of multiple virtual address spaces for
the device and map noncontinuous host memory (there is no dedicated
memory on the VPU).

Address space is implemented as a struct ivpu_mmu_context, it has an ID,
drm_mm allocator for VPU addresses and struct ivpu_mmu_pgtable that
holds actual 3-level, 4KB page table.
Context with ID 0 (global context) is created upon driver initialization
and it's mainly used for mapping memory required to execute
the firmware.
Contexts with non-zero IDs are user contexts allocated each time
the devices is open()-ed and they map command buffers and other
workload-related memory.
Workloads executing in a given contexts have access only
to the memory mapped in this context.

This patch is has two main files:
  - ivpu_mmu_context.c handles MMU page tables and memory mapping
  - ivpu_mmu.c implements a driver that programs the MMU device

Co-developed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Co-developed-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117092723.60441-3-jacek.lawrynowicz@linux.intel.com
2023-01-19 11:07:22 +01:00