Commit Graph

53 Commits

Author SHA1 Message Date
Sui Jingfeng
a807cb22ad drm/etnaviv: Improve VA, PA, SIZE alignment checking
Alignment checking is only needed to be done in the upper caller function.
If those address and sizes are able to pass the check, it will certainly
pass the same test in the etnaviv_context_unmap() function. We don't need
examine it more than once.

Remove redundant alignment tests, move the those useless to upper caller
function.

Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-12-03 18:30:32 +01:00
Sui Jingfeng
0078a6f484 drm/etnaviv: Fix the debug log of the etnaviv_iommu_map()
The value of the 'iova' variable is the start GPUVA that is going to be
mapped, its value doesn't changed when the mapping is on going.

Replace it with the 'da' variable, which is incremental and it reflects
the actual address being mapped exactly.

Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-12-03 18:30:32 +01:00
Sui Jingfeng
9aad03e7f5 drm/etnaviv: Drop the offset in page manipulation
The etnaviv driver, both kernel space and user space, assumes that GPU page
size is 4KiB. Its IOMMU map/unmap 4KiB physical address range once a time.
If 'sg->offset != 0' is true, then the current implementation will map the
IOVA to a wrong area, which may lead to coherency problem. Picture 0 and 1
give the illustration, see below.

  PA start drifted
  |
  |<--- 'sg_dma_address(sg) - sg->offset'
  |               .------ sg_dma_address(sg)
  |              |  .---- sg_dma_len(sg)
  |<-sg->offset->|  |
  V              |<-->|    Another one cpu page
  +----+----+----+----+   +----+----+----+----+
  |xxxx|         ||||||   |||||||||||||||||||||
  +----+----+----+----+   +----+----+----+----+
  ^                   ^   ^                   ^
  |<---   da_len  --->|   |                   |
  |                   |   |                   |
  |    .--------------'   |                   |
  |    | .----------------'                   |
  |    | |                   .----------------'
  |    | |                   |
  |    | +----+----+----+----+
  |    | |||||||||||||||||||||
  |    | +----+----+----+----+
  |    |
  |    '--------------.  da_len = sg_dma_len(sg) + sg->offset, using
  |                   |  'sg_dma_len(sg) + sg->offset' will lead to GPUVA
  +----+ ~~~~~~~~~~~~~+  collision, but min_t(unsigned int, da_len, va_len)
  |xxxx|              |  will clamp it to correct size. But the IOVA will
  +----+ ~~~~~~~~~~~~~+  be redirect to wrong area.
  ^
  |             Picture 0: Possibly wrong implementation.
GPUVA (IOVA)

--------------------------------------------------------------------------

                 .------- sg_dma_address(sg)
                 |  .---- sg_dma_len(sg)
  |<-sg->offset->|  |
  |              |<-->|    another one cpu page
  +----+----+----+----+   +----+----+----+----+
  |              ||||||   |||||||||||||||||||||
  +----+----+----+----+   +----+----+----+----+
                 ^    ^   ^                   ^
                 |    |   |                   |
  .--------------'    |   |                   |
  |                   |   |                   |
  |    .--------------'   |                   |
  |    | .----------------'                   |
  |    | |                   .----------------'
  |    | |                   |
  +----+ +----+----+----+----+
  |||||| ||||||||||||||||||||| The first one is SZ_4K, the second is SZ_16K
  +----+ +----+----+----+----+
  ^
  |           Picture 1: Perfectly correct implementation.
GPUVA (IOVA)

If sg->offset != 0 is true, IOVA will be mapped to wrong physical address.
Either because there doesn't contain the data or there contains wrong data.
Strictly speaking, the memory area that before sg_dma_address(sg) doesn't
belong to us, and it's likely that the area is being used by other process.

Because we don't want to introduce confusions about which part is visible
to the GPU, we assumes that the size of GPUVA is always 4KiB aligned. This
is very relaxed requirement, since we already made the decision that GPU
page size is 4KiB (as a canonical decision). And softpin feature is landed,
Mesa's util_vma_heap_alloc() will certainly report correct length of GPUVA
to kernel with desired alignment ensured.

With above statements agreed, drop the "offset in page" manipulation will
return us a correct implementation at any case.

Fixes: a8c21a5451 ("drm/etnaviv: add initial etnaviv DRM driver")
Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-12-03 18:30:32 +01:00
Sui Jingfeng
68786b7f49 drm/etnaviv: Map and unmap GPUVA range with respect to the GPUVA size
Etnaviv assumes that GPU page size is 4KiB, however, GPUVA ranges collision
when using softpin capable GPUs on a non 4KiB CPU page size configuration.
The root cause is that kernel side BO takes up bigger address space than
userspace expect, the size of backing memory of GEM buffer objects are
required to align to the CPU PAGE_SIZE. Therefore, results in userspace
allocated GPUVA range fails to be inserted to the specified hole exactly.

To solve this problem, record the GPU visiable size of a BO firstly, then
map and unmap the SG entry strictly with respect to the total GPUVA size.

Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-10-28 16:41:54 +01:00
Sui Jingfeng
deadf1ef4a drm/etnaviv: Fix missing mutex_destroy()
Currently, the calling of mutex_destroy() is ignored on error handling
code path. It is safe for now, since mutex_destroy() actually does
nothing in non-debug builds. But the mutex_destroy() is used to mark
the mutex uninitialized on debug builds, and any subsequent use of the
mutex is forbidden.

It also could lead to problems if mutex_destroy() gets extended, add
missing mutex_destroy() to eliminate potential concerns.

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-10-28 16:30:15 +01:00
Sui Jingfeng
9e2e8a5113 drm/etnaviv: Drop the 'len' parameter of etnaviv_iommu_map() function
The 'len' parameter is the 4th argument, because it is not get used, so
drop it. No functional change.

Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2024-01-23 10:20:20 +01:00
Lucas Stach
e116be254a drm/etnaviv: drop GPU initialized property
Now that it is only used to track the driver internal state of
the MMU global and cmdbuf objects, we can get rid of this property
by making the free/finit functions of those objects safe to call
on an uninitialized object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
2023-07-17 11:32:31 +02:00
Lucas Stach
d37c120b73 drm/etnaviv: don't truncate physical page address
While the interface for the MMU mapping takes phys_addr_t to hold a
full 64bit address when necessary and MMUv2 is able to map physical
addresses with up to 40bit, etnaviv_iommu_map() truncates the address
to 32bits. Fix this by using the correct type.

Fixes: 931e97f3af ("drm/etnaviv: mmuv2: support 40 bit phys address")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2022-09-23 21:24:22 +02:00
Lucas Stach
5a40837deb drm/etnaviv: move idle mapping reaping into separate function
The same logic is already used in two different places and now
it will also be needed outside of the compilation unit, so split
it into a separate function.

Cc: stable@vger.kernel.org # 5.19
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2022-08-26 21:36:47 +02:00
Lucas Stach
2829a9fcb7 drm/etnaviv: reap idle softpin mappings when necessary
Right now the only point where softpin mappings get removed from the
MMU context is when the mapped GEM object is destroyed. However,
userspace might want to reuse that address space before the object
is destroyed, which is a valid usage, as long as all mapping in that
region of the address space are no longer used by any GPU jobs.

Implement reaping of idle MMU mappings that would otherwise
prevent the insertion of a softpin mapping.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2022-04-06 10:01:51 +02:00
Lucas Stach
9247fcca39 drm/etnaviv: move flush_seq increment into etnaviv_iommu_map/unmap
The flush sequence is a marker that the page tables have been changed
and any affected TLBs need to be flushed. Move the flush_seq increment
a little further down the call stack to place it next to the actual
page table manipulation. Not functional change.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
2022-04-06 10:01:43 +02:00
Lucas Stach
11ad6a1f18 drm/etnaviv: move MMU context ref/unref into map/unmap_gem
This makes it a little more clear that the mapping holds a reference
to the context once the buffer has been successfully mapped into that
context and simplifies the error handling a bit.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
2022-04-06 10:01:36 +02:00
Lucas Stach
e168c25526 drm/etnaviv: check for reaped mapping in etnaviv_iommu_unmap_gem
When the mapping is already reaped the unmap must be a no-op, as we
would otherwise try to remove the mapping twice, corrupting the involved
data structures.

Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Acked-by: Guido Günther <agx@sigxcpu.org>
2022-04-06 10:00:51 +02:00
Lucas Stach
f2faea8b64 drm/etnaviv: add missing MMU context put when reaping MMU mapping
When we forcefully evict a mapping from the the address space and thus the
MMU context, the MMU context is leaked, as the mapping no longer points to
it, so it doesn't get freed when the GEM object is destroyed. Add the
mssing context put to fix the leak.

Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2021-09-16 10:35:37 +02:00
Dave Airlie
5eb3c85e34 Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next
Again, nothing big this time. Mostly a new performance counter from
Christian, some more lockdep annotations from Guido and removal of
functionality that duplicates driver core from Robin.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/81367a99b8949584e5becd334ac001b9ad3dc37a.camel@pengutronix.de
2020-12-03 13:25:23 +10:00
Guido Günther
4612bad570 drm/etnaviv: Add lockdep annotations for context lock
etnaviv_iommu_find_iova has it so etnaviv_iommu_insert_exact and
lockdep_assert_held should have it as well.

Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2020-10-30 15:33:59 +01:00
Marek Szyprowski
182354a526 drm: etnaviv: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
2020-09-10 08:18:35 +02:00
Lucas Stach
18fa692d80 drm/etnaviv: reinstate MMUv1 command buffer window check
The switch to per-process address spaces erroneously dropped the check
which validated that the command buffer is mapped through the linear
apperture as required by the hardware. This turned a system
misconfiguration with a helpful error message into a very hard to
debug issue. Reinstate the check at the appropriate location.

Fixes: 17e4660ae3 (drm/etnaviv: implement per-process address spaces on MMUv2)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-10-29 18:11:50 +01:00
Lucas Stach
17eae23b08 drm/etnaviv: allow to request specific virtual address for gem mapping
Allow the mapping code to request a specific virtual address for the gem
mapping. If the virtual address is zero we fall back to the old mode of
allocating a virtual address for the mapping.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-08-15 11:58:59 +02:00
Lucas Stach
17e4660ae3 drm/etnaviv: implement per-process address spaces on MMUv2
This builds on top of the MMU contexts introduced earlier. Instead of having
one context per GPU core, each GPU client receives its own context.

On MMUv1 this still means a single shared pagetable set is used by all
clients, but on MMUv2 there is now a distinct set of pagetables for each
client. As the command fetch is also translated via the MMU on MMUv2 the
kernel command ringbuffer is mapped into each of the client pagetables.

As the MMU context switch is a bit of a heavy operation, due to the needed
cache and TLB flushing, this patch implements a lazy way of switching the
MMU context. The kernel does not have its own MMU context, but reuses the
last client context for all of its operations. This has some visible impact,
as the GPU can now only be started once a client has submitted some work and
we got the client MMU context assigned. Also the MMU context has a different
lifetime than the general client context, as the GPU might still execute the
kernel command buffer in the context of a client even after the client has
completed all GPU work and has been terminated. Only when the GPU is runtime
suspended or switches to another clients MMU context is the old context
freed up.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-08-15 11:44:27 +02:00
Lucas Stach
27b67278e0 drm/etnaviv: rework MMU handling
This reworks the MMU handling to make it possible to have multiple MMU contexts.
A context is basically one instance of GPU page tables. Currently we have one
set of page tables per GPU, which isn't all that clever, as it has the
following two consequences:

1. All GPU clients (aka processes) are sharing the same pagetables, which means
there is no isolation between clients, but only between GPU assigned memory
spaces and the rest of the system. Better than nothing, but also not great.

2. Clients operating on the same set of buffers with different etnaviv GPU
cores, e.g. a workload using both the 2D and 3D GPU, need to map the used
buffers into the pagetable sets of each used GPU.

This patch reworks all the MMU handling to introduce the abstraction of the
MMU context. A context can be shared across different GPU cores, as long as
they have compatible MMU implementations, which is the case for all systems
with Vivante GPUs seen in the wild.

As MMUv1 is not able to change pagetables on the fly, without a
"stop the world" operation, which stops GPU, changes pagetables via CPU
interaction, restarts GPU, the implementation introduces a shared context on
MMUv1, which is returned whenever there is a request for a new context.

This patch assigns a MMU context to each GPU, so on MMUv2 systems there is
still one set of pagetables per GPU, but due to the shared context MMUv1
systems see a change in behavior as now a single pagetable set is used
across all GPU cores.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-08-15 10:56:45 +02:00
Lucas Stach
4900dda90a drm/etnaviv: replace MMU flush marker with flush sequence
If a MMU is shared between multiple GPUs, all of them need to flush their
TLBs, so a single marker that gets reset on the first flush won't do.
Replace the flush marker with a sequence number, so that it's possible to
check if the TLB is in sync with the current page table state for each GPU.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-08-15 10:56:03 +02:00
Lucas Stach
db82a0435b drm/etnaviv: split out cmdbuf mapping into address space
This allows to decouple the cmdbuf suballocator create and mapping
the region into the GPU address space. Allowing multiple AS to share
a single cmdbuf suballoc.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-08-15 10:55:03 +02:00
Sam Ravnborg
6eae41fea7 drm/etnaviv: drop use of drmP.h
Drop use of the deprecated drmP.h header file.
Fix fallout in all .c files.

The etnaviv_drv.h header file was made self-contained,
and missing includes was then added to the .c files that needed them.
In a few cases the list of include files was sorted.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: etnaviv@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2019-08-02 19:14:51 +02:00
Lucas Stach
f6ffbd4fc1 drm/etnaviv: replace license text with SPDX tags
This replaces the repetitive GPL-2.0 license text in code and header files
with the SPDX tags. Generated hardware headers aren't changed, as any changes
there need to be done in the upstream rnndb repository.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-05-18 15:27:56 +02:00
Lucas Stach
ccae45928f drm/etnaviv: remove cycling through MMU address space
This was useful on MMUv1 GPUs, which don't generate proper faults,
when the GPU write caches weren't fully understood and not properly
handled by the kernel driver. As this has been fixed for quite some
time, the cycling though the MMU address space needlessly spreads
out the MMU mappings.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-05-18 15:27:56 +02:00
Lucas Stach
ba5a42196b drm/etnaviv: use correct format specifier for size_t
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2018-02-12 16:31:02 +01:00
Markus Elfring
ff98159577 drm/etnaviv: Improve unlocking of a mutex in etnaviv_iommu_map_gem()
Add a jump target so that a call of the function "mutex_unlock" is stored
only once at the end of this function implementation.
Replace three calls by goto statements.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-12-01 17:37:31 +01:00
Lucas Stach
b670908384 drm/etnaviv: remove IOMMU dependency
Using the IOMMU API to manage the internal GPU MMU has been an
historical accident and it keeps getting in the way, as well as
entangling the driver with the inner workings of the IOMMU
subsystem.

Clean this up by removing the usage of iommu_domain, which is the
last piece linking etnaviv to the IOMMU subsystem.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10 11:36:37 +02:00
Lucas Stach
27d38062a2 drm/etnaviv: mmu: mark local functions static
And clean up the header file a bit.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 11:36:36 +02:00
Lucas Stach
50073cf98d drm/etnaviv: mmu: stop using iommu map/unmap functions
This is a preparation to remove the etnaviv dependency on the IOMMU
subsystem by importing the relevant parts of the iommu map/unamp
functions into the driver.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 11:36:30 +02:00
Lucas Stach
3bc3e0ecef drm/etnaviv: remove iommu fault handler
The handler has never been used, so it's really just dead code.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 11:34:57 +02:00
Chris Wilson
4e64e5539d drm: Improve drm_mm search (and fix topdown allocation) with rbtrees
The drm_mm range manager claimed to support top-down insertion, but it
was neither searching for the top-most hole that could fit the
allocation request nor fitting the request to the hole correctly.

In order to search the range efficiently, we create a secondary index
for the holes using either their size or their address. This index
allows us to find the smallest hole or the hole at the bottom or top of
the range efficiently, whilst keeping the hole stack to rapidly service
evictions.

v2: Search for holes both high and low. Rename flags to mode.
v3: Discover rb_entry_safe() and use it!
v4: Kerneldoc for enum drm_mm_insert_mode.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com> # vmwgfx
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> #etnaviv
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170202210438.28702-1-chris@chris-wilson.co.uk
2017-02-03 11:10:32 +01:00
Dave Airlie
99743ae4c5 Merge branch 'drm-etnaviv-next' of https://git.pengutronix.de/git/lst/linux into drm-next
It includes code cleanups from Bhumika and Liviu, a significant shader
performance fix and additions to the cmdstream validator from Wladimir
and the addition of a cmdbuf suballocator by myself.
The suballocator improves performance on all chips by reducing the CPU
overhead of the kernel driver and side steps the GC3000 FE MMU flush
erratum, now making the workarounds in IOVA allocation we had before
unnecessary, which results in a nice cleanup of the code in that area.

* 'drm-etnaviv-next' of https://git.pengutronix.de/git/lst/linux:
  drm/etnaviv: Remove duplicate header file include
  Revert "drm/etnaviv: trick drm_mm into giving out a low IOVA"
  drm/etnaviv: add cmdbuf suballocator
  drm/etnaviv: get cmdbuf physical address through the cmdbuf abstraction
  drm/etnaviv: wire up iova handling in new cmdbuf abstraction
  drm/etnaviv: move cmdbuf de-/allocation into own file
  drm/etnaviv: always flush MMU TLBs on map/unmap
  drm/etnaviv: constify etnaviv_iommu_ops structures
  drm/etnaviv: set up initial PULSE_EATER register
  drm/etnaviv: add new GC3000 sensitive states
2017-02-03 05:41:58 +10:00
Lucas Stach
e17d0bf23f Revert "drm/etnaviv: trick drm_mm into giving out a low IOVA"
Now that commandstreams are handled through the cmdbuf suballocator
the workaround to make the IOVA games work is not needed anymore.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-02 10:30:43 +01:00
Lucas Stach
e66774dd6f drm/etnaviv: add cmdbuf suballocator
There are 3 big benefits to suballocating a single big DMA buffer
for command submission:

1. Avoid hammering CMA. The old way of allocating and freeing a DMA
   buffer for each submission was hitting some of the real slow
   pathes in CMA, as this allocator was not designed for a concurrent
   small buffers load.

2. Less TLB flushes on IOMMUv2. If a new command buffer is mapped into
   the GPU address space the MMU TLBs need to be flushed. By having
   one big buffer statically mapped to the GPU, a lot of those flushes
   can be avoided.

3. No funky workarounds for GC3000. The FE TLB flush on GC3000 isn't
   reliable. To work around that we tried to lay out the cmdbufs in
   the GPU address space in a way to avoid this issue. This hasn't
   always worked if the address space is crowded. A single statically
   mapped buffer avoids the erratum completely.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-02 10:30:37 +01:00
Lucas Stach
ea1f5729aa drm/etnaviv: move cmdbuf de-/allocation into own file
This will get more complex with the following changes, so move it
into its own place.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-02 10:30:15 +01:00
Lucas Stach
d46450737c drm/etnaviv: always flush MMU TLBs on map/unmap
This ensures that the GPU isn't able to write into already freed
objects, as doing this in the IOVA reaper isn't enough, as the
gem_free_object path will also cause unmaps to happen.

On MMUv2 this also ensures that stale entries, which may have
been prefetched into the TLB will be purged.

The flush is low overhead, as it gets batched up with the next
user command buffer, so this isn't incuring an overhead for
each buffer map/unmap.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-02 10:30:08 +01:00
Dave Airlie
b0df0b251b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
Backmerge Linus master to get the connector locking revert.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: (645 commits)
  sysctl: fix proc_doulongvec_ms_jiffies_minmax()
  Revert "drm/probe-helpers: Drop locking from poll_enable"
  MAINTAINERS: add Dan Streetman to zbud maintainers
  MAINTAINERS: add Dan Streetman to zswap maintainers
  mm: do not export ioremap_page_range symbol for external module
  mn10300: fix build error of missing fpu_save()
  romfs: use different way to generate fsid for BLOCK or MTD
  frv: add missing atomic64 operations
  mm, page_alloc: fix premature OOM when racing with cpuset mems update
  mm, page_alloc: move cpuset seqcount checking to slowpath
  mm, page_alloc: fix fast-path race with cpuset update or removal
  mm, page_alloc: fix check for NULL preferred_zone
  kernel/panic.c: add missing \n
  fbdev: color map copying bounds checking
  frv: add atomic64_add_unless()
  mm/mempolicy.c: do not put mempolicy before using its nodemask
  radix-tree: fix private list warnings
  Documentation/filesystems/proc.txt: add VmPin
  mm, memcg: do not retry precharge charges
  proc: add a schedule point in proc_pid_readdir()
  ...
2017-01-27 11:00:42 +10:00
Lucas Stach
3546fb0cda drm/etnaviv: trick drm_mm into giving out a low IOVA
After rollover of the IOVA space, we want to get a low IOVA address,
otherwise the the games we play by remembering the last IOVA are
pointless. When we search for a free hole with DRM_MM_SEARCH_DEFAULT,
drm_mm will pop the next entry from the free holes stack, which will
likely be a high IOVA. By using DRM_MM_SEARCH_BELOW we can trick
drm_mm into reversing the search and provide us with a low IOVA.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir van der Laan <laanwj@gmail.com>
2017-01-11 10:38:45 +01:00
Chris Wilson
0b04d474a6 drm: Compute tight evictions for drm_mm_scan
Compute the minimal required hole during scan and only evict those nodes
that overlap. This enables us to reduce the number of nodes we need to
evict to the bare minimum.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161222083641.2691-31-chris@chris-wilson.co.uk
2016-12-28 11:50:55 +01:00
Chris Wilson
9a71e27788 drm: Extract struct drm_mm_scan from struct drm_mm
The scan state occupies a large proportion of the struct drm_mm and is
rarely used and only contains temporary state. That makes it suitable to
moving to its struct and onto the stack of the callers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: Fix up etnaviv to compile, was missing a BUG_ON.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-12-27 16:44:13 +01:00
Lucas Stach
8814d2dce0 drm/etnaviv: block 64K of address space behind each cmdstream
To make sure we don't place anything there which might confuse
the FE prefetcher. This gets rid of another case of FE MMU faults
when the address space gets crowded before triggering the reaper.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-10-10 15:26:39 +02:00
Lucas Stach
7c971c62dd drm/etnaviv: space out IOVA layout for cmdbufs on MMUv2
At least on the GC3000 the FE MMU is not properly flushing stale TLB
entries. Make sure to map the cmdbufs with a big enough spacing in
the IOVAs to not hit old/prefetched TLB entries when jumping to a
newly mapped cmdbuf.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:44 +02:00
Lucas Stach
afb7b3b1de drm/etnaviv: implement IOMMUv2 translation
All other parts are now in place, so implement the actual translation
step and hook it up, so the driver claims support for cores with
the new MMU.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:43 +02:00
Lucas Stach
e68f270f21 drm/etnaviv: map cmdbuf through MMU on version 2
With MMUv2 all buffers need to be mapped through the MMU once it
is enabled. Align the buffer size to 4K, as the MMU is only able to
map page aligned buffers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:40 +02:00
Lucas Stach
90969c9aa9 drm/etnaviv: split out iova search and MMU reaping logic
With MMUv2 the command buffers need to be mapped through the MMU.
Split out the iova search and MMU reaping logic so it can be reused
for the cmdbuf mapping, where no GEM object is involved.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:39 +02:00
Lucas Stach
e07c0db5e8 drm/etnaviv: move gpu_va() to etnaviv mmu
The GPU virtual address for the command buffers differs depending on
the IOMMU version. Move the calculation of the iova into etnaviv
mmu, to enable proper dispatch.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:37 +02:00
Lucas Stach
dd34bb9655 drm/etnaviv: move IOMMU domain allocation into etnaviv MMU
The GPU code doesn't need to deal with the IOMMU directly, instead
it can all be hidden behind the etnaviv mmu interface. Move the
last remaining part into etnaviv mmu.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:36 +02:00
Lucas Stach
e095c8feb8 drm/etnaviv: indirect IOMMU restore through etnaviv MMU
So we can call the v2 restore code once it is there.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2016-09-15 15:29:35 +02:00