Commit Graph

1000 Commits

Author SHA1 Message Date
George Spelvin
21118e8e56 drm/i915/selftests: Avoid passing a random 0 into ilog2
igt_mm_config() calls ilog2() on the (pseudo)random 21-bit number
s>>12.  Once in 2 million seeds, this is zero and ilog2 summons
the nasal demons.

There was an attempt to handle this case with a max(), but that's
too late; ms could already be something bizarre.

Given that the low 12 bits of s and ms are always zero, it's a lot
simpler just to divide them by 4096, then everything fits into 32
bits, and we can easily generate a random number 1 <= s <= 0x1fffff.

Fixes: 14d1b9a624 ("drm/i915: buddy allocator")
Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325192429.GA8865@SDF.ORG
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-08-17 16:16:54 -04:00
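The zero-avoidance idea in the message above can be sketched in plain C. This is a minimal userspace illustration, not the code from the patch: floor_log2() and nonzero_21bit() are hypothetical names, and the modulo mapping is only one assumed way to land in the range 1 <= s <= 0x1fffff.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of log2(v). Meaningless for v == 0, so the precondition is asserted. */
static unsigned int floor_log2(uint32_t v)
{
	unsigned int n = 0;

	assert(v != 0);
	while (v >>= 1)
		n++;
	return n;
}

/* Map any 32-bit random value onto 1..0x1fffff, so zero can never occur. */
static uint32_t nonzero_21bit(uint32_t r)
{
	return 1 + r % 0x1fffff;
}
```

Because the result of nonzero_21bit() is never zero, it is always safe to pass to floor_log2(); clamping after the logarithm has been taken would be too late.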
Tianjia Zhang
e714977eef drm/i915: Fix wrong return value
In function i915_active_acquire_preallocate_barrier(), not all
paths have the return value set correctly, and in case of memory
allocation failure, a negative error code should be returned.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200802115655.25568-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-08-17 16:16:40 -04:00
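The pattern this fix restores, every exit path returning a meaningful value and allocation failure returning a negative error code, looks roughly like the following. This is a generic sketch, not the i915 function: struct node and prepare_nodes() are hypothetical, and calloc() stands in for the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

struct node {
	int value;
};

/* Returns 0 on success or a negative errno; no path leaves the result unset. */
static int prepare_nodes(struct node **out, size_t count)
{
	struct node *nodes;

	if (count == 0)
		return -EINVAL;

	nodes = calloc(count, sizeof(*nodes));
	if (!nodes)
		return -ENOMEM;	/* allocation failure must not look like success */

	*out = nodes;
	return 0;
}
```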
Chris Wilson
98ef067453 drm/i915: Copy default modparams to mock i915_device
Since we use the module parameters stored inside the drm_i915_device
itself, we need to ensure the mock i915_device also sets up the right
defaults.

Fixes: 8a25c4be58 ("drm/i915/params: switch to device specific parameters")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728150600.4509-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-08-17 16:16:22 -04:00
Linus Torvalds
952ace797c IOMMU Updates for Linux v5.9

Merge tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Removal of the dev->archdata.iommu (or similar) pointers from most
   architectures. Only Sparc is left, but this is private to Sparc as
   their drivers don't use the IOMMU-API.

 - ARM-SMMU updates from Will Deacon:

     - Support for SMMU-500 implementation in Marvell Armada-AP806 SoC

     - Support for SMMU-500 implementation in NVIDIA Tegra194 SoC

     - DT compatible string updates

     - Remove unused IOMMU_SYS_CACHE_ONLY flag

     - Move ARM-SMMU drivers into their own subdirectory

 - Intel VT-d updates from Lu Baolu:

     - Misc tweaks and fixes for vSVA

     - Report/response page request events

     - Cleanups

 - Move the Kconfig and Makefile bits for the AMD and Intel drivers into
   their respective subdirectory.

 - MT6779 IOMMU Support

 - Support for new chipsets in the Renesas IOMMU driver

 - Other misc cleanups and fixes (e.g. to improve compile test coverage)

* tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (77 commits)
  iommu/amd: Move Kconfig and Makefile bits down into amd directory
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory
  iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
  iommu: Add gfp parameter to io_pgtable_ops->map()
  iommu: Mark __iommu_map_sg() as static
  iommu/vt-d: Rename intel-pasid.h to pasid.h
  iommu/vt-d: Add page response ops support
  iommu/vt-d: Report page request faults for guest SVA
  iommu/vt-d: Add a helper to get svm and sdev for pasid
  iommu/vt-d: Refactor device_to_iommu() helper
  iommu/vt-d: Disable multiple GPASID-dev bind
  iommu/vt-d: Warn on out-of-range invalidation address
  iommu/vt-d: Fix devTLB flush for vSVA
  iommu/vt-d: Handle non-page aligned address
  iommu/vt-d: Fix PASID devTLB invalidation
  iommu/vt-d: Remove global page support in devTLB flush
  iommu/vt-d: Enforce PASID devTLB field mask
  iommu: Make some functions static
  iommu/amd: Remove double zero check
  ...
2020-08-11 14:13:24 -07:00
Dan Carpenter
23ec9f4224 drm/i915/selftest: Fix an error code in live_noa_gpr()
The error code needs to be set on this path.  It currently returns
success.

Fixes: ed2690a9ca ("drm/i915/selftest: Check that GPR are restored across noa_wait")
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200714143652.GA337376@mwanda
2020-07-14 18:16:15 +01:00
Colin Ian King
d2921096e7 drm/i915/selftest: fix an error return path where err is not being set
There is an error condition where err is not being set and an uninitialized
garbage value in err is being returned.  Fix this by assigning err to an
appropriate error return value before taking the error exit path.

Addresses-Coverity: ("Uninitialized scalar value")
Fixes: ed2690a9ca ("drm/i915/selftest: Check that GPR are restored across noa_wait")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713142551.423649-1-colin.king@canonical.com
2020-07-13 15:45:32 +01:00
Chris Wilson
ed2690a9ca drm/i915/selftest: Check that GPR are restored across noa_wait
Perf implements a GPU delay (noa_wait) by looping until the CS timestamp
has passed a certain point. This uses MI_MATH and the general purpose
registers of the user's context, and since it is clobbering the user
state it must carefully save and restore the user's data around the
noa_wait. We can verify this by loading some values in the GPR that we
know will be clobbered by the noa_wait, and then inspecting the GPR after
the noa_wait completes and confirming that they have been restored.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200709224504.11345-2-chris@chris-wilson.co.uk
2020-07-10 10:20:38 +01:00
Daniele Ceraolo Spurio
792592e72a drm/i915: Move the engine mask to intel_gt_info
Since the engines belong to the GT, move the runtime-updated list of
available engines to the intel_gt struct. The original mask has been
renamed to indicate it contains the maximum engine list that can be
found on a matching device.

In preparation for other info being moved to the gt in follow up patches
(sseu), introduce an intel_gt_info structure to group all gt-related
runtime info.

v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
2020-07-08 21:07:11 +01:00
Chris Wilson
12b07256c2 drm/i915: Export ppgtt_bind_vma
Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can
reduce some code near-duplication. The catch is that we need to then
pass along the i915_address_space and not rely on vma->vm, as they
differ with the aliasing-ppgtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk
2020-07-03 15:14:35 +01:00
Joerg Roedel
01b9d4e211 iommu/vt-d: Use dev_iommu_priv_get/set()
Remove the use of dev->archdata.iommu and use the private per-device
pointer provided by IOMMU core code instead.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200625130836.1916-3-joro@8bytes.org
2020-06-30 11:59:48 +02:00
Jani Nikula
0f69403d25 Merge drm/drm-next into drm-intel-next-queued
Catch up with upstream, in particular to get c1e8d7c6a7 ("mmap locking
API: convert mmap_sem comments").

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-06-25 18:05:03 +03:00
Chris Wilson
810b7ee300 drm/i915/gt: Always report the sample time for busy-stats
Return the monotonic timestamp (ktime_get()) at the time of sampling the
busy-time. This is used in preference to taking ktime_get() separately
before or after the read seqlock as there can be some large variance in
reported timestamps. For selftests trying to ascertain that we are
reporting accurate to within a few microseconds, even a small delay
leads to the test failing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
2020-06-18 09:26:54 +01:00
Chris Wilson
1b90e4a43b drm/i915/selftests: Enable selftesting of busy-stats
A couple of very simple tests to ensure that the basic properties of
per-engine busyness accounting [0% and 100% busy] are faithful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-1-chris@chris-wilson.co.uk
2020-06-18 09:26:53 +01:00
Chris Wilson
8ab3a3812a drm/i915/gt: Incrementally check for rewinding
In commit 5ba32c7be8 ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.

However, intel_ring_direction() is limited in precision to movements of
up to half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.

Fixes: 5ba32c7be8 ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
(cherry picked from commit e36ba817fa)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-06-16 11:34:23 +03:00
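The half-ring limit described above can be illustrated with a small sketch. ring_direction() here is a hypothetical stand-in, not the driver's intel_ring_direction(); it assumes a power-of-two ring size and simply treats any wrapped distance of half the ring or more as a backwards move.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Classify the move from prev to next inside a ring of `size` bytes.
 * Returns +1 for forwards, -1 for backwards, 0 if the offsets are equal.
 * Offsets wrap, so a move of half the ring or more reads as the opposite
 * direction.
 */
static int ring_direction(uint32_t size, uint32_t next, uint32_t prev)
{
	uint32_t delta;

	assert(size != 0 && (size & (size - 1)) == 0);	/* power of two */

	delta = (next - prev) & (size - 1);
	if (delta == 0)
		return 0;
	return delta < size / 2 ? 1 : -1;
}
```

With a 4096-byte ring, a single rewind from offset 3000 to offset 0 exceeds half the ring and is misread as a forward move, whereas checking each step (3000 to 1500, then 1500 to 0) reports a rewind both times. That is the motivation for checking incrementally.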
Chris Wilson
d4b02a4c61 drm/i915/selftests: Trim execlists runtime
Reduce the smoke depth by trimming the number of contexts, repetitions
and wait times. This is in preparation for a less greedy scheduler that
tries to be fair across contexts, resulting in a great many more context
switches. A thousand context switches may be 50-100ms, causing us to
timeout as the HW is not fast enough to complete the deep smoketests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-5-chris@chris-wilson.co.uk
2020-06-13 10:24:26 +01:00
Chris Wilson
e36ba817fa drm/i915/gt: Incrementally check for rewinding
In commit 5ba32c7be8 ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.

However, intel_ring_direction() is limited in precision to movements of
up to half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.

Fixes: 5ba32c7be8 ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
2020-06-10 15:42:47 +01:00
Kees Cook
684f1a1bf9 drm/i915: Fix comments mentioning typo in IS_ENABLED()
This has no code changes, but the typo is clearly getting copy/pasted,
so better to avoid this now and fix the typo. IS_ENABLED() takes full
names, and must have the "CONFIG_" prefix.

Reported-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/lkml/b08611018fdb6d88757c6008a5c02fa0e07b32fb.camel@perches.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/202006050718.9D4FCFC2E@keescook
2020-06-05 16:28:42 +01:00
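Why a missing "CONFIG_" prefix goes unnoticed can be shown with a simplified userspace stand-in for the macro. Everything named DEMO_ or demo_ below is hypothetical; the sketch only handles built-in (=1) options and is not the kernel's definition.

```c
/* Simplified stand-in: evaluates to 1 if `option` is defined as 1, else 0. */
#define DEMO_PLACEHOLDER_1 0,
#define demo_second_arg(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_(x)
#define demo_is_defined_(val) demo_is_defined__(DEMO_PLACEHOLDER_##val)
#define demo_is_defined__(arg1_or_junk) demo_second_arg(arg1_or_junk 1, 0, 0)
#define DEMO_IS_ENABLED(option) demo_is_defined(option)

/* Hypothetical option, as a build system would define it. */
#define CONFIG_DEMO_FEATURE 1

/* Full name: the option is found. */
static int with_prefix(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_FEATURE);
}

/* Prefix dropped: no such macro exists, so the result is silently 0. */
static int without_prefix(void)
{
	return DEMO_IS_ENABLED(DEMO_FEATURE);
}
```

Neither spelling produces a compiler diagnostic, which is why the typo spreads so easily by copy and paste.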
Chris Wilson
5a83399536 drm/i915: Drop i915_request.i915 backpointer
We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs 3 bytes and pointer
chasing in using rq->engine->i915?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
2020-06-03 13:53:39 +01:00
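The trade-off can be pictured with two toy layouts. The demo_ structs below are hypothetical and far smaller than the real driver structures; they only show a per-request backpointer being replaced by one extra dereference through the engine.

```c
struct demo_device {
	int id;
};

struct demo_engine {
	struct demo_device *dev;
};

/* Before: every request carries its own device backpointer. */
struct demo_request_fat {
	struct demo_engine *engine;
	struct demo_device *dev;
	unsigned int seqno;
};

/* After: the device is reached through the engine instead. */
struct demo_request {
	struct demo_engine *engine;
	unsigned int seqno;
};

/* One extra pointer chase replaces the stored backpointer. */
static struct demo_device *request_to_device(const struct demo_request *rq)
{
	return rq->engine->dev;
}
```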
Chris Wilson
7d192daa73 drm/i915/gem: Give each object class a friendly name
Name the object classes and their offspring for easier lockdep
debugging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-2-chris@chris-wilson.co.uk
2020-05-29 23:38:29 +01:00
Dave Airlie
6cf991611b Merge tag 'drm-intel-next-2020-05-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- drm/i915: Show per-engine default property values in sysfs

    By providing the default values configured into the kernel via sysfs, it
    is much more convenient for userspace to restore those sane defaults, or
    at least know what is considered a good baseline. This is useful, for
    example, to cleanup after any failed userspace prior to commencing new
    jobs.

Cross-subsystem Changes:

- video/hdmi: Add Unpack only function for DRM infoframe
- Includes pull request gvt-next-2020-05-12

Driver Changes:

- Restore Cherryview back to full-ppgtt (Chris, Mika)
- Document locking guidelines for i915 (Chris, Daniel, Joonas)
- Fix GitLab #1746: Handle idling during i915_gem_evict_something busy loops (Chris)
- Display WA #1105: Require linear fb stride to be multiple of 512 bytes on
  gen9/glk (Ville)
- Add Wa_14010685332 for ICP/ICL (Matt R)
- Restrict w/a 1607087056 for EHL/JSL (Swathi)
- Fix interrupt handling for DP AUX transactions on Tigerlake (Imre)
- Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" (Mika)
- Fix HDC pipeline flush hardware bit on Gen12 (Mika)
- Flush L3 when flushing render on Gen12 (Mika)
- Invalidate aux table entries forcibly between BB on Gen12 (Mika)
- Add aux table invalidate for all engines on Gen12 (Mika)
- Force pte cacheline to main memory Gen8+ (Mika)
- Add and enable TGL+ SAGV support (Stanislav)
- Implement vm_ops->access on i915 mmaps for GDB (Chris, Kristian)
- Replace zero-length array with flexible-array (Gustavo)
- Improve batch buffer pool effectiveness to mitigate soft-rc6 hit (Chris)
- Remove wait priority boosting (Chris)
- Keep driver module referenced when PMU is active (Chris)
- Sanitize RPS interrupts upon resume (Chris)
- Extend pcode read timeout to 20 ms (Chris)
- Wait for ACT sent before enabling MST pipe (Ville)
- Extend support to async relocations to SNB (Chris)
- Remove CNL pre-prod workarounds (Ville)
- Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled (Sultan)
- Record the active CCID from before reset (Chris)
- Mark concurrent submissions with a weak-dependency (Chris)
- Peel dma-fence-chains for await to allow engine-to-engine sync (Lionel)
- Prevent using semaphores to chain up to external fences (Chris)
- Fix GLK watermark calculations (Ville)
- Emit await(batch) before MI_BB_START (Chris)
- Reset execlists registers before HWSP (Chris)
- Drop no-semaphore boosting in favor of fast timeslicing (Chris)
- Fix enabled infoframe states of lspcon (Gwan-gyeong)
- Program DP SDPs on pipe updates (Gwan-gyeong)
- Stop sending DP SDPs on ddi disable (Gwan-gyeong)
- Store CS timestamp frequency in Hz (Ville)

- Remove unused HAS_FWTABLE macro (Pascal)
- Use batchbuffer chaining for relocations to save ring space (Chris)
- Try different engines for relocs if MI ops not supported (Chris, Tvrtko)
- Lazily acquire the device wakeref for freeing objects (Chris)
- Streamline display code arithmetics around rounding etc. (Ville)
- Use bw state for per crtc SAGV evaluation (Stanislav)
- Track active_pipes in bw_state (Stanislav)
- Nuke mode.vrefresh usage (Ville)
- Warn if the FBC is still writing to stolen on removal (Chris)
- Added new PCode commands prepping for QGV restricting (Stanislav)
- Stop holding onto the pinned_default_state (Chris)
- Propagate error from completed fences (Chris)
- Ignore submit-fences on the same timeline (Chris)
- Pull waiting on an external dma-fence into its routine (Chris)
- Replace the hardcoded I915_FENCE_TIMEOUT with Kconfig (Chris)
- Mark up the racy read of execlists->context_tag (Chris)
- Tidy up the return handling for completed dma-fences (Chris)
- Introduce skl_plane_wm_level accessor (Stanislav)
- Extract SKL SAGV checking (Stanislav)
- Make active_pipes check skl specific (Stanislav)
- Suspend tasklets before resume sanitization (Chris)
- Remove redundant exec_fence (Chris)
- Mark the addition of the initial-breadcrumb in the request (Chris)
- Transfer old virtual breadcrumbs to irq_worker (Chris)
- Read the DP SDPs from the video DIP (Gwan-gyeong)
- Program DP SDPs with computed configs (Gwan-gyeong)
- Add state readout for DP VSC and DP HDR Metadata Infoframe SDP
  (Gwan-gyeong)
- Add compute routine for DP PSR VSC SDP (Gwan-gyeong)
- Use new DP VSC SDP compute routine on PSR (Gwan-gyeong)
- Restrict qgv points which don't have enough bandwidth. (Stanislav)
- Nuke pointless div by 64bit (Ville)

- Static checker code fixes (Nathan, Mika, Chris)
- Add logging function for DP VSC SDP (Gwan-gyeong)
- Include HDMI DRM infoframe, DP HDR metadata and DP VSC SDP in the
  crtc state dump (Gwan-gyeong)
- Make timeslicing explicit engine property (Chris, Tvrtko)
- Selftest and debugging improvements (Chris)
- Align variable names with BSpec (Ville)
- Tidy up gen8+ breadcrumb emission code (Chris)
- Turn intel_digital_port_connected() into a vfunc (Ville)
- Use stashed away hpd isr bits in intel_digital_port_connected() (Ville)
- Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}() (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515160703.GA19043@jlahtine-desk.ger.corp.intel.com
2020-05-20 13:36:45 +10:00
Chris Wilson
25c26f18ea drm/i915/selftests: Measure dispatch latency
A useful metric of the system's health is how fast we can tell the GPU
to do various actions, so measure our latency.

v2: Refactor all the instruction building into emitters.
v3: Mark the error handling if not perfect, at least consistent.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519130802.4067-1-chris@chris-wilson.co.uk
2020-05-19 15:28:26 +01:00
Ville Syrjälä
802a5820fc drm/i915: Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}()
Pull the code to do the CS timestamp ns<->ticks conversion into
helpers and use them all over.

The check in i915_perf_noa_delay_set() seems a bit dubious,
so we switch it to do what I assume it wanted to do all along
(ie. make sure the resulting delay in CS timestamp ticks
doesn't exceed 32bits)?

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200302143943.32676-5-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-05-14 20:04:02 +03:00
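A rough userspace sketch of the ns<->ticks conversion and the 32-bit range check mentioned above follows. The helper names and the round-up behaviour are assumptions for illustration, not the driver's implementation, and the frequency is taken in Hz.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Nanoseconds to timestamp ticks, rounding up. Assumes freq_hz * ns fits in 64 bits. */
static uint64_t cs_ns_to_ticks(uint32_t freq_hz, uint64_t ns)
{
	return div_round_up_u64((uint64_t)freq_hz * ns, NSEC_PER_SEC);
}

/* Timestamp ticks to nanoseconds, rounding up. Assumes ticks * 1e9 fits in 64 bits. */
static uint64_t cs_ticks_to_ns(uint32_t freq_hz, uint64_t ticks)
{
	return div_round_up_u64(ticks * NSEC_PER_SEC, freq_hz);
}

/* Non-zero if a delay of `ns` can be expressed as a 32-bit tick count. */
static int delay_fits_32bit(uint32_t freq_hz, uint64_t ns)
{
	return cs_ns_to_ticks(freq_hz, ns) <= UINT32_MAX;
}
```

For example, at an illustrative 12 MHz clock a 400 second delay needs 4.8 billion ticks, which no longer fits in 32 bits, so delay_fits_32bit() rejects it.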
Ville Syrjälä
56f1b31f1d drm/i915: Store CS timestamp frequency in Hz
kHz isn't accurate enough for storing the CS timestamp
frequency on some of the platforms. Store the value
in Hz instead.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200302143943.32676-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2020-05-14 19:59:53 +03:00
Chris Wilson
ed610f4360 drm/i915/selftests: Always call the provided engine->emit_init_breadcrumb
While this does not appear to fix any issues, the backend itself knows
when it wants to emit a breadcrumb, so let it make the final call.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513074809.18194-16-chris@chris-wilson.co.uk
2020-05-14 09:01:54 +01:00
Dave Airlie
a1fb548962 Merge tag 'drm-intel-next-2020-04-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:

- Fix GitLab #1698: Performance regression with Linux 5.7-rc1 on
  Iris Plus 655 and 4K screen (Chris)
- Add Wa_14011059788 for Tigerlake (Matt A)
- Add per ctx batchbuffer wa for timestamp for Gen12 (Mika)
- Use indirect ctx bb to load cmd buffer control value
  from context image to avoid corruption (Mika)
- Enable DP Display Audio WA (Uma, Jani)
- Update forcewake firmware ranges for Icelake (Radhakrishna)
- Add missing deinitialization cases of load failure for display (Jose)
- Implement TC cold sequences for Icelake and Tigerlake (Jose)
- Unbreak enable_dpcd_backlight modparam (Lyude)
- Move the late flush_submission in retire to the end (Chris)
- Demote "Reducing compressed framebuffer size" message to info (Peter)
- Push MST link retraining to the hotplug work (Ville)
- Hold obj->vma.lock over for_each_ggtt_vma() (Chris)
- Fix timeout handling during TypeC AUX power well enabling for ICL (Imre)
- Fix skl+ non-scaled pfit modes (Ville)
- Prefer soft-rc6 over RPS DOWN_TIMEOUT (Chris)
- Sanitize GT first before poisoning HWSP (Chris)
- Fix up clock RPS frequency readout (Chris)
- Avoid reusing the same logical CCID (Chris)
- Avoid dereferencing a dead context (Chris)
- Always enable busy-stats for execlists (Chris)
- Apply the aggressive downclocking to parking (Chris)
- Restore aggressive post-boost downclocking (Chris)

- Scrub execlists state on resume (Chris)
- Add debugfs attributes for LPSP (Anshuman)
- Improvements to kernel selftests (Chris, Mika)
- Add tiled blits selftest (Zbigniew)
- Fix error handling in __live_lrc_indirect_ctx_bb() (Dan)
- Add pre/post plane updates for SAGV (Stanislav)
- Add ICL PG3 PW ID for EHL (Anshuman)
- Fix Sphinx build duplicate label warning (Jani)
- Error log non-zero audio power refcount after unbind (Jani)
- Remove object_is_locked assertion from unpin_from_display_plane (Chris)
- Use single set of AUX powerwell ops for gen11+ (Matt R)
- Prefer drm_WARN_ON over WARN_ON (Pankaj)
- Poison residual state [HWSP] across resume (Chris, Tvrtko)
- Convert request-before-CS assertion to debug (Chris)
- Carefully order virtual_submission_tasklet (Chris)
- Check carefully for an idle engine in wait-for-idle (Chris)
- Only close vma we open (Chris)
- Trace RPS events (Chris)
- Use the RPM config register to determine clk frequencies (Chris)
- Drop rq->ring->vma peeking from error capture (Chris)
- Check preempt-timeout target before submit_ports (Chris)
- Check HWSP cacheline is valid before acquiring (Chris)
- Use proper fault mask in interrupt postinstall too (Matt R)
- Keep a no-frills swappable copy of the default context state (Chris)

- Add atomic helpers for bandwidth (Stanislav)
- Refactor setting dma info to a common helper from device info (Michael)
- Refactor DDI transcoder code for clarity (Ville)
- Extend PG3 power well ID to ICL (Anshuman)
- Refactor PFIT code for readability and future extensibility (Ville)
- Clarify code split between intel_ddi.c and intel_dp.c (Ville)
- Move out code to return the digital_port of the aux ch (Jose)
- Move rps.enabled/active and use of RPS interrupts to flags (Chris)
- Remove superfluous inlines and dead code (Jani)
- Re-disable -Wframe-address from top-level Makefile (Nick)
- Static checker and spelling fixes (Colin, Nathan)
- Split long lines (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430124904.GA100924@jlahtine-desk.ger.corp.intel.com
2020-05-14 11:33:10 +10:00
Chris Wilson
9bad40a27d drm/i915/selftests: Always flush before unpinning after writing
Be consistent, and even when we know we had used a WC, flush the mapped
object after writing into it. The flush understands the mapping type and
will only clflush if !I915_MAP_WC, but will always insert a wmb [sfence]
so that we can be sure that all writes are visible.

v2: Add the unconditional wmb so we know that we always flush the
writes to memory/HW at that point.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200511141304.599-1-chris@chris-wilson.co.uk
2020-05-11 16:50:04 +01:00
Chris Wilson
b0a997ae52 drm/i915: Emit await(batch) before MI_BB_START
Be consistent and ensure that we always emit the asynchronous waits
prior to issuing instructions that use the address. This ensures that if
we do emit GPU commands to do the await, they are before our use!

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200510102431.21959-1-chris@chris-wilson.co.uk
2020-05-11 16:50:04 +01:00
Chris Wilson
e3d291301f drm/i915/gem: Implement legacy MI_STORE_DATA_IMM
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but
left them writing to a physical address. The notes suggest that the
primary reason would be so that the writes were cache coherent, as the
CPU cache uses physical tagging. As such we did not implement the
legacy variant of MI_STORE_DATA_IMM and so left all the relocations
synchronous -- but with a small function to convert from the vma address
into the physical address, we can implement asynchronous relocs on these
older arches, fixing up a few tests that require them.

In order to be able to test the legacy paths, refactor the gpu
relocations so that we can hook them up to a selftest.

v2: Use an array of offsets not enum labels for the selftest
v3: Refactor the common igt_hexdump()

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
2020-05-04 15:15:04 +01:00
Chris Wilson
426d0073fb drm/i915/gt: Always enable busy-stats for execlists
In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
2020-04-30 00:57:34 +01:00
Chris Wilson
be1cb55a07 drm/i915/gt: Keep a no-frills swappable copy of the default context state
We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.

This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area every time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.

Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
2020-04-29 19:02:37 +01:00
Chris Wilson
5c4a53e3b1 drm/i915/execlists: Track inflight CCID
The presumption is that by using a circular counter that is twice as
large as the maximum ELSP submission, we would never reuse the same CCID
for two inflight contexts.

However, if we continually preempt an active context such that it always
remains inflight, it can be resubmitted with an arbitrary number of
paired contexts. As each of its paired contexts will use a new CCID,
eventually it will wrap and submit two ELSP with the same CCID.

Rather than use a simple circular counter, switch over to a small bitmap
of inflight ids so we can avoid reusing one that is still potentially
active.
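
As a rough illustration of the bitmap idea described above, here is a minimal userspace sketch. The pool size NUM_IDS, struct id_pool, and both helper names are hypothetical; none of them are the driver's actual identifiers, and the real CCID width is not stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool size; chosen only to fit one 32-bit word. */
#define NUM_IDS 32u

struct id_pool {
	uint32_t inflight; /* bit n set => id n may still be active */
};

/* Return a free id and mark it inflight, or -1 if every id is in use. */
static int id_pool_get(struct id_pool *pool)
{
	for (unsigned int id = 0; id < NUM_IDS; id++) {
		uint32_t bit = UINT32_C(1) << id;

		if (!(pool->inflight & bit)) {
			pool->inflight |= bit;
			return (int)id;
		}
	}
	return -1;
}

/* Release an id once its context is no longer inflight. */
static void id_pool_put(struct id_pool *pool, int id)
{
	assert(id >= 0 && (unsigned int)id < NUM_IDS);
	pool->inflight &= ~(UINT32_C(1) << id);
}
```

Unlike a circular counter, an id held by a long-lived context is simply skipped until it is released, so it cannot be handed out a second time by wrap-around.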

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
2020-04-28 22:17:36 +01:00
Chris Wilson
50689771c8 drm/i915: Only close vma we open
The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identifying when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
the fact that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.

Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.

It's a poor man's replacement for reference counting.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
2020-04-24 11:24:45 +01:00
Chris Wilson
cbfd3a0c5a drm/i915/selftests: Add request throughput measurement to perf
Under ideal circumstances, the driver should be able to keep the GPU
fully saturated with work. Measure how close to ideal we get under the
harshest of conditions with no user payload.

v2: Also measure throughput using only one thread.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422074203.9799-1-chris@chris-wilson.co.uk
2020-04-23 16:40:30 +01:00
Dave Airlie
1aa63ddf72 drm-misc-next for 5.8:

Merge tag 'drm-misc-next-2020-04-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.8:

UAPI Changes:

  - drm: error out with EBUSY when device has existing master
  - drm: rework SET_MASTER and DROP_MASTER perm handling

Cross-subsystem Changes:

  - mm: export two symbols from slub/slob
  - fbdev: savage: fix -Wextra build warning
  - video: omap2: Use scnprintf() for avoiding potential buffer overflow

Core Changes:

  - Remove drm_pci.h
  - drm_pci_{alloc/free}() are now legacy
  - Introduce managed DRM resources
  - Allow drivers to subclass struct drm_framebuffer
  - Introduce struct drm_afbc_framebuffer and helpers
  - fbdev: remove return value from generic fbdev setup
  - Introduce simple-encoder helper
  - vram-helpers: set fence on plane
  - dp_mst: ACT timeout improvements
  - dp_mst: Remove drm_dp_mst_has_audio()
  - TTM: ttm_trace_dma_{map/unmap}() cleanups
  - dma-buf: add flag for PCIP2P support
  - EDID: Various improvements
  - Encoder: cleanup semantics of possible_clones and possible_crtcs
  - VBLANK documentation updates
  - Writeback documentation updates

Driver Changes:

  - Convert several drivers to i2c_new_client_device()
  - Drop explicit drm_mode_config_cleanup() calls from drivers
  - Auto-release device structures with drmm_add_final_kfree()
  - Init fbdev console after registering DRM device
  - Make various .debugfs functions return 0 unconditionally; ignore errors
  - video: Use scnprintf() to avoid buffer overflows
  - Convert drivers to simple encoders

  - drm/amdgpu: note that we can handle peer2peer DMA-buf
  - drm/amdgpu: add support for exporting VRAM using DMA-buf v3
  - drm/kirin: Revert change to register connectors
  - drm/lima: Add optional devfreq and cooling device support
  - drm/lima: Various improvements wrt. task handling
  - drm/panel: nt39016: Support multiple modes and 50Hz
  - drm/panel: Support Leadtek LTK050H3146W
  - drm/rockchip: Add support for afbc
  - drm/virtio: Various cleanups
  - drm/hisilicon/hibmc: Enforce 128-byte stride alignment
  - drm/qxl: Fix notify port address of cursor ring buffer
  - drm/sun4i: Improvements to format handling
  - drm/bridge: dw-hdmi: Various improvements

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200414090738.GA16827@linux-uq9g
2020-04-22 10:41:35 +10:00
Chris Wilson
d4e3d455a1 drm/i915/selftests: Move gpu energy measurement into its own little lib
Move the handy utility to measure the GPU energy consumption using RAPL
msr into a common lib so that it can be reused easily.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417152018.13079-1-chris@chris-wilson.co.uk
2020-04-17 18:48:51 +01:00
Chris Wilson
9da0ea0963 drm/i915/gem: Drop cached obj->bind_count
We cached the number of vma bound to the object in order to speed up
shrinker decisions. This has been superseded by being more proactive in
removing objects we cannot shrink from the shrinker lists, and so we can
drop the clumsy attempt at atomically counting the bind count and
comparing it to the number of pinned mappings of the object. This will
only get clumsier with asynchronous binding and unbinding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401223924.16667-1-chris@chris-wilson.co.uk
2020-04-02 01:17:39 +01:00
Chris Wilson
d75a92a814 drm/i915: Allow for different modes of interruptible i915_active_wait
Allow some users the discretion to not immediately return on a normal
signal. Hopefully, they will opt to use TASK_KILLABLE instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200327112212.16046-1-chris@chris-wilson.co.uk
2020-03-30 18:20:33 +01:00
Daniel Vetter
d33b58d011 drm: Garbage collect drm_dev_fini
It has become empty. Given the few users I figured not much point
splitting this up.

v2: Rebase over i915 changes.

v3: Rebase over patch split fix.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323144950.3018436-26-daniel.vetter@ffwll.ch
2020-03-26 15:45:36 +01:00
Daniel Vetter
7fb81e9d80 drm/i915: Use drmm_add_final_kfree
With this we can drop the final kfree from the release function.

The mock device in the selftests needed its pci_device split
up from the drm_device. In the future we could simplify this again
by allocating the pci_device as a managed allocation too.

v2: I overlooked that i915_driver_destroy is also called in the
unwind code of the error path. There we need a drm_dev_put.
Similar for the mock object.

Now the problem with that is that the drm_driver->release callbacks
for both the real driver and the mock one assume everything has been
set up. Hence going through that path for a partially set up driver
will result in issues. Quickest fix is to disable the ->release() hook
until the driver is fully initialized, and keep the onion unwinding.
Long term would be cleanest to move everything over to drmm_ release
actions, but that's a lot of work for a big driver like i915. Plus
more core work needed first anyway.

v3: Fix i915_drm pointer wrangling in mock_gem_device. Also switch
over to start using drm_dev_put() to clean up even on the error path.
Aside I think the current error path is leaking the allocation.

v4: more fixes for intel-gfx-ci, some of it damage from v3 :-/

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200323144950.3018436-9-daniel.vetter@ffwll.ch
2020-03-26 15:17:43 +01:00
Chris Wilson
73c8bfb7fe drm/i915: Drop final few uses of drm_i915_private.engine
We've migrated all the heavy users over to the intel_gt, and can finally
drop the last few users and with that the mirror in dev_priv->engine[].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325234803.6175-1-chris@chris-wilson.co.uk
2020-03-26 10:50:17 +00:00
Matthew Auld
45d4173994 drm/i915/selftests/perf: watch out for stolen objects
Stolen memory is allocated at creation, returning -ENOSPC if we run out
of space.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1424
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323110301.38806-1-matthew.auld@intel.com
2020-03-23 11:52:34 +00:00
Chris Wilson
2386b492de drm/i915: Prefer '%ps' for printing function symbol names
%pS includes the offset, which is useful for return addresses but noise
when we are pretty printing a known (and expected) function entry point.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319091943.7815-1-chris@chris-wilson.co.uk
2020-03-19 16:18:14 +00:00
Lionel Landwerlin
9aba9c188d drm/i915/perf: remove generated code
A little bit of history :

   Back when i915-perf was introduced (4.13), there was no way to
   dynamically add new OA configurations to i915. Only the generated
   configs baked in at build time were allowed.

   It quickly became obvious that we would need to allow applications
   to upload their own configurations, for instance to be able to test
   new ones, and so by the next stable version (4.14) we added uAPIs
   to allow uploading new configurations.

   When adding that capability, we took the opportunity to remove most
   HW configurations except the TestOa one which is a configuration
   IGT would rely on to verify that the HW is outputting correct
   values. At the time it made sense to add that configuration at
   the same time a given HW platform was added to the i915-perf driver.

Now that IGT has become the reference point for HW configurations (see
commit 53f8f541ca ("lib: Add i915_perf library"), previously this was
located in the GPUTop repository), the need for having those
configurations in i915-perf is gone.

On the Mesa side, we haven't relied on this test configuration for a
while. The MDAPI library always required 4.14 feature level and always
loaded its configuration into i915.

I'm sure nobody will miss this generated stuff in i915 :)

v2: Fix selftests by creating an empty config

v3: Fix unlocking on allocation error (Dan Carpenter)

v4: Fixup checkpatch warnings

v5: Fix incorrect unlock in error path (Umesh)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200317132222.2638719-1-lionel.g.landwerlin@intel.com
2020-03-17 15:27:50 +02:00
Chris Wilson
dec9cf9ee8 drm/i915/gt: Pull restoration of GGTT fences underneath the GT
Make the GT responsible for restoring its fence when it wakes up from
suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-2-chris@chris-wilson.co.uk
2020-03-16 20:28:28 +00:00
Chris Wilson
f899f786d1 drm/i915: Move GGTT fence registers under gt/
Since the fence registers control HW detiling through the GGTT
aperture, make them a part of the intel_ggtt under gt/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-1-chris@chris-wilson.co.uk
2020-03-16 20:28:26 +00:00
Chris Wilson
e3e7aeec32 drm/i915/selftests: Apply a heavy handed flush to i915_active
Due to the ordering of cmpxchg()/dma_fence_signal() inside node_retire(),
we must also use the xchg() as our primary memory barrier to flush the
outstanding callbacks after expected completion of the i915_active.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306133852.3420322-1-chris@chris-wilson.co.uk
2020-03-07 00:05:54 +00:00
Matthew Auld
1fe3818d17 drm/i915/selftests: try to rein in alloc_smoke
Depending on RNG we might try to fill an 8G region for every possible
order, using the smallest possible chunk size of 4K, which seems to be
very slow. Try to remedy the situation by adding an overall timeout for
the test, while also selecting each order level in a random fashion.
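
The two remedies above (an overall budget plus visiting the order levels in random sequence) could be sketched roughly as follows. This is a userspace approximation under stated assumptions: xorshift32(), random_orders() and run_smoke() are hypothetical names, the PRNG is a stand-in for the selftest's own random state, and a visit count stands in for the wall-clock timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG for the sketch; the state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return *state = x;
}

/* Fill order[] with 0..n-1, then shuffle it (Fisher-Yates). */
static void random_orders(unsigned int *order, size_t n, uint32_t *rng)
{
	for (size_t i = 0; i < n; i++)
		order[i] = (unsigned int)i;

	for (size_t i = n; i > 1; i--) {
		size_t j = xorshift32(rng) % i;
		unsigned int tmp = order[i - 1];

		order[i - 1] = order[j];
		order[j] = tmp;
	}
}

/*
 * Walk the shuffled orders until either all are done or the budget is
 * spent. Each visited order is recorded in visited[]; a real test would
 * fill the region at that order here. Returns the number of visits.
 */
static size_t run_smoke(const unsigned int *order, size_t n, size_t budget,
			unsigned int *visited)
{
	size_t done = 0;

	while (done < n && done < budget) {
		visited[done] = order[done];
		done++;
	}
	return done;
}
```

Because the sequence is shuffled, a run that is cut short by the budget still samples a different subset of orders each time rather than always the same leading ones.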

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1310
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305204711.217783-2-matthew.auld@intel.com
2020-03-06 14:33:15 +00:00
Mika Kuoppala
ee2413eeed drm/i915: Add mechanism to submit a context WA on ring submission
This patch adds a framework to submit an arbitrary batchbuffer on each
context switch to clear residual state for render engine on Gen7/7.5
devices.

The idea of always emitting the context and vm setup around each request
is primarily to make reset recovery easy, and not require rewriting the
ringbuffer. As each request would set up its own context, leaving it to
the HW to notice and elide no-op context switches, we could restart the
ring at any point, and reorder the requests freely.

However, to avoid emitting clear_residuals() between consecutive requests
in the ringbuffer of the same context, we do want to track the current
context in the ring. In doing so, we need to be careful to only record a
context switch when we are sure the next request will be emitted.
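
The tracking rule in the paragraph above can be illustrated with a small model. This is only a sketch: struct context, struct ring, emit_request() and the emit_ok flag are all hypothetical, and the counter merely stands in for emitting clear_residuals().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct context {
	int id;
};

struct ring {
	const struct context *current; /* last context known to be emitted */
	unsigned int clears;           /* stand-in for emitted clear_residuals() */
};

/*
 * Emit one request for ctx. emit_ok models whether the request is
 * certain to be emitted. Returns true if the request went out.
 */
static bool emit_request(struct ring *ring, const struct context *ctx,
			 bool emit_ok)
{
	assert(ctx != NULL);

	if (!emit_ok)
		return false; /* nothing emitted: leave ring->current alone */

	if (ring->current != ctx) {
		ring->clears++;      /* context changed: clear residual state */
		ring->current = ctx; /* record only now that emission is sure */
	}
	return true;
}
```

Consecutive requests from the same context skip the clear, while a failed emission leaves the recorded context untouched so the next request is still compared against what is really in the ring.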

This security mitigation change does not trigger any performance
regression. Performance is on par with current mainline/drm-tip.

v2: Update vm_alias params to point to correct address space "vm" due to
changes made in the patch "f21613797bae98773"

v3-v4: none

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Balestrieri Francesco <francesco.balestrieri@intel.com>
Cc: Bloomfield Jon <jon.bloomfield@intel.com>
Cc: Dutt Sudeep <sudeep.dutt@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306000957.2836150-1-chris@chris-wilson.co.uk
2020-03-06 08:59:06 +00:00
Chris Wilson
36e191f064 drm/i915: Apply i915_request_skip() on submission
Trying to use i915_request_skip() prior to i915_request_add() causes us
to try and fill the ring up to request->postfix, which has not yet been
set, and so may cause us to memset() past the end of the ring.

Instead of skipping the request immediately, just flag the error on the
request (only accepting the first fatal error we see) and then clear the
request upon submission.
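
A simplified, single-threaded model of "flag now, clear on submission" might look like the sketch below. The struct and function names are hypothetical and much smaller than the driver's request; a concurrent implementation would need an atomic compare-and-exchange instead of the plain check used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily simplified request. */
struct request {
	int error;                 /* 0, or the first negative errno recorded */
	unsigned char payload[16]; /* stands in for the commands in the ring */
	bool submitted;
};

/* Record an error, keeping only the first one. Returns true if stored. */
static bool request_set_error_once(struct request *rq, int error)
{
	assert(error < 0);

	if (rq->error)
		return false;

	rq->error = error;
	return true;
}

/* On submission, the payload of a failed request is cleared, not run. */
static void request_submit(struct request *rq)
{
	if (rq->error)
		memset(rq->payload, 0, sizeof(rq->payload));
	rq->submitted = true;
}
```

Deferring the clear to submission means the full extent of the request is known by the time anything is overwritten.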

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200304121849.2448028-1-chris@chris-wilson.co.uk
2020-03-04 14:29:50 +00:00
Chris Wilson
950da30162 drm/i915/selftests: Disable heartbeat around manual pulse tests
Still chasing the mystery of the stray idle flush, let's ensure that the
heartbeat does not run at the same time as our test and confuse us.

References: https://gitlab.freedesktop.org/drm/intel/issues/541
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-8-chris@chris-wilson.co.uk
2020-02-28 09:25:41 +00:00
Jani Nikula
cf9bfa3c5c drm/i915: stop assigning drm->dev_private pointer
We no longer need or use it as we subclass struct drm_device in our
struct drm_i915_private, and can always use to_i915() to get at
i915. Stop assigning the pointer to catch anyone trying to add new users
for ->dev_private.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dale B Stimson <dale.b.stimson@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200224113312.13674-1-jani.nikula@intel.com
2020-02-26 10:36:35 +02:00
Chris Wilson
d13a317700 drm/i915: Flush idle barriers when waiting
If we do find ourselves with an idle barrier inside our active while
waiting, attempt to flush it by emitting a pulse using the kernel
context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Steve Carbonari <steven.carbonari@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200225192206.1107336-1-chris@chris-wilson.co.uk
2020-02-25 19:23:17 +00:00
Chris Wilson
e986209c67 drm/i915/gt: Rename i915_gem_restore_ggtt_mappings() for its new placement
The i915_ggtt now sits beneath gt/ outside of the auspices of gem/ and
should be given a fresh name to reflect that. We also want to give it a
name that reflects its role in the system suspend/resume, with the
intention of pulling together all the GGTT operations (e.g. restoring
the fence registers once they are pulled under gt/intel_ggtt_detiler.c)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-2-chris@chris-wilson.co.uk
2020-01-30 21:35:37 +00:00
Matthew Auld
ba12993c52 drm/i915/selftests/perf: measure memcpy bw between regions
Measure the memcpy bw between our CPU accessible regions, trying all
supported mapping combinations (WC, WB) across various sizes.

v2:
    use smaller sizes
    throw in memcpy32/memcpy64/memcpy_from_wc

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129093343.194570-1-matthew.auld@intel.com
2020-01-29 13:13:50 +00:00
Matthew Auld
2c86e55d2a drm/i915/gtt: split up i915_gem_gtt
Attempt to split i915_gem_gtt.[ch] into more manageable chunks.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk
2020-01-07 19:27:36 +00:00
YueHaibing
62bf5465b2 drm/i915: Add missing include file <linux/math64.h>
Fix build error:
./drivers/gpu/drm/i915/selftests/i915_random.h: In function i915_prandom_u32_max_state:
./drivers/gpu/drm/i915/selftests/i915_random.h:48:23: error:
 implicit declaration of function mul_u32_u32; did you mean mul_u64_u32_div? [-Werror=implicit-function-declaration]
  return upper_32_bits(mul_u32_u32(prandom_u32_state(state), ep_ro));
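
For readers unfamiliar with the expression in the error above, the following userspace sketch shows what it computes: a 32-bit random value is scaled into [0, ep_ro) by taking the high half of a 64-bit product. The two helpers are local stand-ins that mirror the kernel names, and scale_u32_max() is a hypothetical wrapper that takes the random value as an argument instead of drawing it from prandom_u32_state().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's mul_u32_u32(): 32b x 32b -> 64b. */
static uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* Userspace stand-in for the kernel's upper_32_bits(). */
static uint32_t upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Scale a uniformly distributed 32-bit value into [0, ep_ro). */
static uint32_t scale_u32_max(uint32_t random, uint32_t ep_ro)
{
	uint32_t r = upper_32_bits(mul_u32_u32(random, ep_ro));

	/* The result is always below ep_ro (and 0 when ep_ro is 0). */
	assert(ep_ro == 0 || r < ep_ro);
	return r;
}
```

The cast to a 64-bit type before multiplying is the important part; a plain 32-bit multiply would discard exactly the bits this technique relies on.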

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 7ce5b6850b ("drm/i915/selftests: Use mul_u32_u32() for 32b x 32b -> 64b result")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107135014.36472-1-yuehaibing@huawei.com
2020-01-07 14:04:31 +00:00
Chris Wilson
b2fcaac98b drm/i915/selftests: Make headers self-contained
Include the types used by the headers so they can be compiled
standalone.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200103104516.1757103-2-chris@chris-wilson.co.uk
2020-01-03 13:33:36 +00:00
Chris Wilson
f3bc632acb drm/i915/selftests: Move igt_atomic_section[] out of the header
Move the definition of the igt_atomic_section[] into a C file, leaving
the declaration in the header so as not to upset headertest!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200103104516.1757103-1-chris@chris-wilson.co.uk
2020-01-03 13:31:39 +00:00
Chris Wilson
6056e50033 drm/i915/gem: Support discontiguous lmem object maps
Create a vmap for discontiguous lmem objects to support
i915_gem_object_pin_map().

v2: Offset io address by region.start for fake-lmem

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200102204215.1519103-1-chris@chris-wilson.co.uk
2020-01-03 11:26:01 +00:00
Chris Wilson
4b0dd4a29a drm/i915/selftests: Flush the context worker
When cleaning up the mock device, remember to flush the context worker
to free the residual GEM contexts before shutting down the device.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/802
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191230165821.3840449-1-chris@chris-wilson.co.uk
2019-12-30 20:32:06 +00:00
Chris Wilson
45b152f752 drm/i915/gt: Avoid using the GPU before initialisation
Mark the GT as wedged so that we are not tempted to use it prior to
initialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191229183153.3719869-3-chris@chris-wilson.co.uk
2019-12-30 14:04:57 +00:00
Chris Wilson
d03b224f42 drm/i915/gt: Apply sanitization just before resume
Bring sanitization completely underneath the umbrella of intel_gt, and
perform it exclusively after suspend and before the next resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191226111834.2545953-1-chris@chris-wilson.co.uk
2019-12-26 12:37:30 +00:00
Jani Nikula
3531c4023c drm/i915/selftests: make mock_drm.h self-contained
Needs i915_drv.h because i915 gets dereferenced.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191219155652.2666-2-jani.nikula@intel.com
2019-12-23 12:38:43 +02:00
Chris Wilson
e26b6d4341 drm/i915/gt: Pull GT initialisation under intel_gt_init()
Begin pulling the GT setup underneath a single GT umbrella; let intel_gt
take ownership of its engines! As hinted, the complication is the
lifetime of the probed engine versus the active lifetime of the GT
backends. We need to detect the engine layout early and keep it until
the end so that we can sanitize state on takeover and release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk
2019-12-22 12:51:32 +00:00
Chris Wilson
e6ba764802 drm/i915: Remove i915->kernel_context
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
2019-12-21 16:37:10 +00:00
Dan Carpenter
86ca2bf2f9 drm/i915/selftests: remove a condition
We know that "err" is non-zero so there is no need to check.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213105050.y2v5nylsuxvc44jj@kili.mountain
2019-12-13 11:11:57 +00:00
Chris Wilson
5de34ed137 drm/i915/selftests: Show the i915_active on failure
Print the i915_active state on selftest failure, with a hope it helps
illuminate the cause of the failure.

References: https://gitlab.freedesktop.org/drm/intel/issues/765
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191210115502.3767070-1-chris@chris-wilson.co.uk
2019-12-11 11:33:18 +00:00
Daniele Ceraolo Spurio
e9362e1336 drm/i915/guc: kill doorbell code and selftests
Instead of relying on the workqueue, the upcoming reworked GuC
submission flow will offer the host driver independent control over
the execution status of each context submitted to GuC. As part of this,
the doorbell usage model has been reworked, with each doorbell being
paired to a single lrc and a doorbell ring representing new work
available for that specific context. This mechanism, however, limits
the number of contexts that can be registered with GuC to the number of
doorbells, which is an undesired limitation. To avoid this limitation,
we requested the GuC team to also provide a H2G that will allow the host
to notify the GuC of work available for a specified lrc, so we can use
that mechanism instead of relying on the doorbells. We can therefore drop
the doorbell code we currently have, also given the fact that in the
unlikely case we'd want to switch back to using doorbells we'd have to
heavily rework it.
The workqueue will still have a use in the new interface to pass special
commands, so that code has been retained for now.

With the doorbells gone and the GuC client becoming even simpler, the
existing GuC selftests don't give us any meaningful coverage so we can
remove them as well. Some selftests might come with the new code, but
they will look different from what we have now so it doesn't seem worth
it to keep the file around in the meantime.

v2: fix comments and commit message (John)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205220243.27403-3-daniele.ceraolospurio@intel.com
2019-12-09 13:55:44 -08:00
Chris Wilson
b006869c6e drm/i915/selftests: Always lock the drm_mm around insert/remove
Be paranoid and make sure the drm_mm is locked whenever we insert/remove
our own nodes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191129095659.665381-1-chris@chris-wilson.co.uk
2019-11-29 14:23:53 +00:00
Chris Wilson
34f5fe1243 drm/i915/selftests: Move mock_vma to the heap to reduce stack_frame
An i915_vma struct on the stack may push the frame over the limit, if
set conservatively, so move it to the heap.
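
A minimal user-space sketch of the idea, not the selftest code itself:
struct sketch_vma and sketch_use_mock_vma() are hypothetical stand-ins,
and calloc()/free() take the place of the kernel allocator. Only a
pointer lives in the caller's stack frame; the large structure lives on
the heap.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large structure such as i915_vma. */
struct sketch_vma {
	unsigned long node_start;
	unsigned long node_size;
	char padding[1024]; /* makes the struct costly to keep on the stack */
};

/*
 * Allocate the structure on the heap so that this function's stack
 * frame only holds a pointer. Returns 0 on success or -ENOMEM.
 */
static int sketch_use_mock_vma(unsigned long start, unsigned long size,
			       unsigned long *end)
{
	struct sketch_vma *vma;

	vma = calloc(1, sizeof(*vma));
	if (!vma)
		return -ENOMEM;

	vma->node_start = start;
	vma->node_size = size;
	*end = vma->node_start + vma->node_size;

	free(vma);
	return 0;
}
```

The trade-off is an extra allocation (and an extra failure path) in
exchange for a smaller, predictable stack frame.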

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125124856.1761176-1-chris@chris-wilson.co.uk
2019-11-25 15:09:14 +00:00
Chris Wilson
de5825beae drm/i915: Serialise with engine-pm around requests on the kernel_context
As the engine->kernel_context is used within the engine-pm barrier, we
have to be careful when emitting requests outside of the barrier, as the
strict timeline locking rules do not apply. Instead, we must ensure the
engine_park() cannot be entered as we build the request, which is most
simply done by taking an explicit engine-pm wakeref around the request
construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-1-chris@chris-wilson.co.uk
2019-11-25 13:17:18 +00:00
Chris Wilson
3b054a1c03 drm/i915/selftests: Include the subsubtest name for live_parallel_engines
Include the name of the failing subsubtest, should it fail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191123191547.925360-1-chris@chris-wilson.co.uk
2019-11-23 21:49:27 +00:00
Chris Wilson
e668950149 drm/i915/selftests: Be explicit in ERR_PTR handling
When setting up a full GGTT, we expect the next insert to fail with
-ENOSPC. Simplify the use of ERR_PTR to not confuse either the reader or
smatch.
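
As an illustration only, the explicit style can be sketched in user
space. The sketch_* helpers below are hypothetical stand-ins that mimic
the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention; the allocator is
invented for the example and is not the GGTT code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the top 4095 addresses are never valid pointers. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical address space that hands out slots until it is full. */
struct sketch_space {
	int used, size;
	int slots[4];
};

static int *sketch_insert(struct sketch_space *s)
{
	if (s->used == s->size)
		return sketch_err_ptr(-ENOSPC);
	return &s->slots[s->used++];
}

/* Returns 0 only if the next insert fails with -ENOSPC, as expected. */
static int sketch_check_full(struct sketch_space *s)
{
	int *node = sketch_insert(s);

	if (!sketch_is_err(node))
		return -EINVAL; /* the insert unexpectedly succeeded */
	if (sketch_ptr_err(node) != -ENOSPC)
		return (int)sketch_ptr_err(node); /* failed for another reason */
	return 0;
}
```

Testing for the error pointer first and decoding the error code second
keeps the success case and the unexpected-error case visibly separate.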

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
References: f40a7b7558 ("drm/i915: Initial selftests for exercising eviction")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120093302.3723715-8-chris@chris-wilson.co.uk
2019-11-20 10:37:43 +00:00
Chris Wilson
ba446f7460 drm/i915/selftests: Exercise rc6 w/a handling
Reading from CTX_INFO upsets rc6, requiring us to detect and prevent
possible rc6 context corruption. Poke at the bear!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Tested-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191119154723.3311814-1-chris@chris-wilson.co.uk
2019-11-19 20:05:01 +00:00
Chris Wilson
8eed671415 drm/i915/selftests: Add intel_gt_driver_late_release for mock device
Having called intel_gt_init_early() to setup the mock intel_gt, we need
to call the corresponding intel_gt_driver_late_release() to clean up.

References: dea397e818 ("drm/i915/gt: Flush retire.work timer object on unload")
References: 24635c5152 ("drm/i915: Move intel_gt initialization to a separate file")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118094342.2193485-1-chris@chris-wilson.co.uk
2019-11-18 15:41:41 +00:00
Chris Wilson
c9ad602fea drm/i915: Split i915_active.mutex into an irq-safe spinlock for the rbtree
As we want to be able to run inside atomic context for retiring the
i915_active, and we are no longer allowed to abuse mutex_trylock, split
the tree management portion of i915_active.mutex into an irq-safe
spinlock.

References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Fixes: 274cbf20fd ("drm/i915: Push the i915_active.retire into a worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114172535.1116-1-chris@chris-wilson.co.uk
2019-11-14 17:43:41 +00:00
Chris Wilson
3fb33cd32f drm/i915/selftests: Add coverage of mocs registers
Probe the mocs registers for new contexts and across GPU resets. Similar
to intel_workarounds, we have tables of what register values we expect
to see, so verify that user contexts are affected by them. In the
future, we should add tests similar to intel_sseu to cover dynamic
reconfigurations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112223600.30993-4-chris@chris-wilson.co.uk
2019-11-14 17:38:54 +00:00
Chris Wilson
3c7a44bbbf drm/i915/selftests: Perform some basic cycle counting of MI ops
Some basic information that is useful to know, such as how many cycles
an MI_NOOP takes.

v2: Keep volatile pages pinned at all times! (Matthew)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anna Karas <anna.karas@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111172716.23733-1-chris@chris-wilson.co.uk
2019-11-11 18:30:13 +00:00
Chris Wilson
a8c9a7f52e drm/i915/selftests: Complete transition to a real struct file mock
Since drm provided us with a real struct file we can use for our
anonymous internal clients (mock_file), complete our transition to using
that as the primary interface (and not the mocked up struct drm_file we
previously were using).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107213929.23286-1-chris@chris-wilson.co.uk
2019-11-08 10:17:41 +00:00
Masahiro Yamada
ab11a9270a drm/i915: make more headers self-contained
The headers in the gem/selftests/, gt/selftests, gvt/, selftests/
directories have never been compile-tested, but it would be possible
to make them self-contained.

This commit only addresses missing <linux/types.h> and forward
struct declarations.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108094142.25942-1-yamada.masahiro@socionext.com
2019-11-08 10:16:13 +00:00
Chris Wilson
6fedafacae drm/i915/selftests: Wrap vm_mmap() around GEM objects
Provide a utility function to create a vma corresponding to an mmap() of
our device. And use it to exercise the equivalent of userspace
performing a GTT mmap of our objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-4-chris@chris-wilson.co.uk
2019-11-07 21:22:58 +00:00
Chris Wilson
85ca528ed7 drm/i915/selftests: Replace mock_file hackery with drm's true fake
As drm now exports a method to create an anonymous struct file around a
drm_device for internal use, make use of it to avoid our horrible hacks.

Daniel suggested that the mock_file_put() wrapper was suitable for
drm-core, along with the mock_drm_getfile() [and that the vestigial
mock_drm_file() in this patch should perhaps be the drm interface
itself]. However, the eventual goal is to remove the mock_drm_file() and
use the struct file and fput() directly; in this patch we take a simple
step in that direction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107180601.30815-3-chris@chris-wilson.co.uk
2019-11-07 21:22:16 +00:00
Daniel Vetter
f86dbacb30 drm/i915: Switch obj->mm.lock lockdep annotations on its head
The trouble with having a plain nesting flag for locks which do not
naturally nest (unlike block devices and their partitions, which is
the original motivation for nesting levels) is that lockdep will
never spot a true deadlock if you screw up.

This patch is an attempt at doing better, by highlighting a bit more
of the actual nature of the nesting that's going on. Essentially we
have two kinds of objects:

- objects without pages allocated, which cannot be on any lru and are
  hence inaccessible to the shrinker.

- objects which have pages allocated, which are on an lru, and which
  the shrinker can decide to throw out.

For the former type of object, memory allocations while holding
obj->mm.lock are permissible. For the latter they are not. And
get/put_pages transitions between the two types of objects.

This is still not entirely fool-proof since the rules might change.
But as long as we ever run such code at runtime, lockdep should be
able to observe the inconsistency and complain (like with any other
lockdep class that we've split up in multiple classes). But there are
a few clear benefits:

- We can drop the nesting flag parameter from
  __i915_gem_object_put_pages, because that function by definition is
  never going to allocate memory, and calling it on an object which
  doesn't have its pages allocated would be a bug.

- We strictly catch more bugs, since there's now only one place in the
  entire tree which is annotated with the special class. All the
  other places that had explicit lockdep nesting annotations we're now
  going to leave up to lockdep again.

- Specifically this catches stuff like calling get_pages from
  put_pages (which isn't really a good idea, if we can call get_pages
  so could the shrinker). I've seen patches do exactly that.

Of course I fully expect CI will show me for the fool I am with this
one here :-)

v2: There can only be one (lockdep only has a cache for the first
subclass, not for deeper ones, and we don't want to make these locks
even slower). Still separate enums for better documentation.

Real fix: don't forget about phys objs and pin_map(), and fix the
shrinker to have the right annotations ... silly me.

v3: Forgot userptr too ...

v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
and instead prime lockdep (Chris).

v5: Appease checkpatch, no double empty lines (Chris)

v6: More rebasing over selftest changes. Also somehow I forgot to
push this patch :-/

Also format comments consistently while at it.

v7: Fix typo in commit message (Joonas)

Also drop the priming, with the lmem merge we now have allocations
while holding the lmem lock, which wrecks the generic priming I've
done in earlier patches. Should probably be resurrected when lmem is
fixed. See

commit 232a6ebae4
Author: Matthew Auld <matthew.auld@intel.com>
Date:   Tue Oct 8 17:01:14 2019 +0100

    drm/i915: introduce intel_memory_region

I'm keeping the priming patch locally so it won't get lost.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Tang, CQ" <cq.tang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191105090148.30269-1-daniel.vetter@ffwll.ch
[mlankhorst: Fix commit typos pointed out by Michael Ruhl]
2019-11-07 09:58:11 +01:00
Chris Wilson
38813767c7 drm/i915/selftests: Flush all active callbacks
Flushing the outer i915_active is not enough, as we need the barrier to
be applied across all the active dma_fence callbacks. So we must
serialise with each outstanding fence.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112096
References: f79520bb33 ("drm/i915/selftests: Synchronize checking active status with retirement")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101181022.25633-1-chris@chris-wilson.co.uk
2019-11-02 08:34:53 +00:00
Chris Wilson
797a615357 drm/i915/gt: Call intel_gt_sanitize() directly
Assume all responsibility for operating on the HW to sanitize the GT
state upon load/resume in intel_gt_sanitize() itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101141009.15581-1-chris@chris-wilson.co.uk
2019-11-01 14:47:36 +00:00
Chris Wilson
4605bb73a8 drm/i915/gt: Pull timeline initialise to intel_gt_init_early
Our timelines are currently contained within an intel_gt, and we only
need to perform list/spinlock initialisation, so we can pull the
intel_timelines_init() into our intel_gt_init_early().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101130406.4142-1-chris@chris-wilson.co.uk
2019-11-01 14:47:36 +00:00
Chris Wilson
f05816cbbc drm/i915/selftests: Spin on all engines simultaneously
Vanshidhar Konda asked for the simplest test "to verify that the kernel
can submit and hardware can execute batch buffers on all the command
streamers in parallel." We have a number of tests in userspace that
submit load to each engine and verify that it is present, but strictly
we have no selftest to prove that the kernel can _simultaneously_
execute on all known engines. (We have tests to demonstrate that we can
submit to HW in parallel, but we don't insist that they execute in
parallel.)

v2: Improve the igt_spinner support for older gen.

Suggested-by: Vanshidhar Konda <vanshidhar.r.konda@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Vanshidhar Konda <vanshidhar.r.konda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Vanshidhar Konda <vanshidhar.r.konda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101101528.10553-1-chris@chris-wilson.co.uk
2019-11-01 13:06:35 +00:00
Chris Wilson
e5661c6ab0 drm/i915/selftests: Start kthreads before stopping
An interesting observation made with our parallel selftests was that on
our small/single cpu systems we would call kthread_stop() before the
kthreads were spawned. If this happens, the kthread is never run at all;
completely bypassing the test.

A simple yield() from the parent will ensure that all children have the
opportunity to start before we reap them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101084940.31838-1-chris@chris-wilson.co.uk
2019-11-01 10:12:29 +00:00
Chris Wilson
164a412886 drm/i915/selftests: Pretty print the i915_active
If the idle_pulse fails to flush the i915_active, dump the tree to see
if that has any clues.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191031101116.19894-1-chris@chris-wilson.co.uk
2019-10-31 14:43:14 +00:00
Lionel Landwerlin
bf96b51508 drm/i915/perf: ensure selftests select valid format
Gen12 only supports a single report format:
I915_OA_FORMAT_A32u40_A4u32_B8_C8

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 00a7f0d715 ("drm/i915/tgl: Add perf support on TGL")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029142826.20014-1-lionel.g.landwerlin@intel.com
2019-10-29 18:58:07 +00:00
Matthew Auld
e60f7bb7ea drm/i915/selftests: check for missing aperture
We may be missing support for the mappable aperture on some platforms.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-7-matthew.auld@intel.com
2019-10-29 10:35:47 +00:00
Chris Wilson
6804da20bb drm/i915/selftests: Select a random engine for testing memory regions
Use any blitter engine at random for prefilling the memory region.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191027225808.19437-5-chris@chris-wilson.co.uk
2019-10-28 11:57:17 +00:00
Chris Wilson
3fc794f27f drm/i915: Split memory_region initialisation into its own file
Pull the memory region bookkeeping into its own file. Let's start clean and
see how long it lasts!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191026202032.4371-1-chris@chris-wilson.co.uk
2019-10-26 22:25:34 +01:00
Matthew Auld
340be48f2c drm/i915/selftests: add write-dword test for LMEM
Simple test writing to dwords across an object, using various engines in
a randomized order, checking that our writes land from the cpu.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025153728.23689-4-chris@chris-wilson.co.uk
2019-10-25 22:55:49 +01:00
Abdiel Janulgue
01377a0d7e drm/i915/lmem: support kernel mapping
We can create LMEM objects, but we also need to support mapping them
into kernel space for internal use.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Steve Hampson <steven.t.hampson@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025153728.23689-3-chris@chris-wilson.co.uk
2019-10-25 22:55:43 +01:00
Matthew Auld
b908be543e drm/i915: support creating LMEM objects
We currently define LMEM, or local memory, as just another memory
region, like system memory or stolen, which we can expose to userspace
and can be mapped to the CPU via some BAR.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025153728.23689-1-chris@chris-wilson.co.uk
2019-10-25 22:55:31 +01:00
Chris Wilson
c35eb477c0 drm/i915/selftests: Tweak the default subtest runtime
BAT is growing a little fat and CI is under pressure and needs to trim
off some redundant runtime. An easy option is to reduce the selftest
runtimes, so try halving our default subtest timeout. While this reduces
the number of iterations used, for the majority of tests that are
passing, repeat runs (with different CI_DRM) will make up the
difference -- a negative consequence though is that we may reduce the
frequency of sporadic failures. Hopefully, we have no tests that were
crucially dependent on the previous 1s timeout...

Suggested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025092749.13468-1-chris@chris-wilson.co.uk
2019-10-25 11:54:37 +01:00
Chris Wilson
e16302cb67 drm/i915/selftests: Release ctx->engine_mutex after iteration
A lock once taken must be released again.

Fixes: c31c9e82ee ("drm/i915/selftests: Teach switch_to_context() to use the context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022223316.12662-1-chris@chris-wilson.co.uk
2019-10-23 10:07:25 +01:00
Chris Wilson
905da43c6a drm/i915/selftests: Move uncore fw selftests to operate on intel_gt
Forcewake is the speciality of the GT, so it is natural to run the
intel_uncore_forcewake tests over the GT. So pass intel_gt as the
parameter to our selftests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022131016.9065-1-chris@chris-wilson.co.uk
2019-10-22 20:44:52 +01:00
Chris Wilson
c31c9e82ee drm/i915/selftests: Teach switch_to_context() to use the context
The context details which engines to use, so use the ctx->engines[] to
generate the requests to cause the context switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022130221.20644-2-chris@chris-wilson.co.uk
2019-10-22 20:43:08 +01:00
Chris Wilson
ae2e28b026 drm/i915: Teach record_defaults to operate on the intel_gt
Again we wish to operate on the engines, which are owned by the
intel_gt. As such it is easier, and much more consistent, to pass the
intel_gt parameter.

v2: Unexport i915_gem_load_power_context()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022141935.15733-1-chris@chris-wilson.co.uk
2019-10-22 20:43:07 +01:00
Chris Wilson
7867d70995 drm/i915/gem: Distinguish each object type
Separate each object class into a separate lock type to avoid lockdep
cross-contamination between paths (i.e. userptr!).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022144501.26486-1-chris@chris-wilson.co.uk
2019-10-22 16:23:32 +01:00
Tvrtko Ursulin
d1a03ee7e9 drm/i915/selftests: Use GT engines in igt_live_test
Frees up two call sites from passing i915 to for_each_engine.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-11-tvrtko.ursulin@linux.intel.com
2019-10-22 12:16:42 +01:00
Tvrtko Ursulin
6457099ac5 drm/i915/selftests: Use GT engines in mock_gem_device
Just freeing up two more call sites from passing in i915 to
for_each_engine.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-10-tvrtko.ursulin@linux.intel.com
2019-10-22 12:16:42 +01:00
Tvrtko Ursulin
2271a223e0 drm/i915/selftests: Convert eviction selftests to gt/ggtt
Convert the test code to work directly on what it needs rather than
going through the top-level i915.

This enables another natural usage for for_each_engine(.., gt, ..).

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-9-tvrtko.ursulin@linux.intel.com
2019-10-22 12:16:42 +01:00
Chris Wilson
aa9eb0caaa drm/i915/selftests: Set vm->gt backpointer for mock_ppgtt
Add the backpointer to ppgtt and i915->gt so that we can traverse across
the device hierarchy.

Reported-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022095851.23442-1-chris@chris-wilson.co.uk
2019-10-22 12:16:42 +01:00
Chris Wilson
b5e8e954eb drm/i915/gt: Introduce barrier pulses along engines
To flush idle barriers, and even inflight requests, we want to send a
preemptive 'pulse' along an engine. We use a no-op request along the
pinned kernel_context at high priority so that it should run or else
kick off the stuck requests. We can use this to ensure idle barriers are
immediately flushed, as part of a context cancellation mechanism, or as
part of a heartbeat mechanism to detect and reset a stuck GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021174339.5389-1-chris@chris-wilson.co.uk
2019-10-21 21:01:52 +01:00
Chris Wilson
928da10c0c drm/i915/selftests: Use all physical engines for i915_active
i915_active must track over any engine, so expand the selftest to
iterate over all uabi engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021162146.1686-1-chris@chris-wilson.co.uk
2019-10-21 21:01:52 +01:00
Matthew Auld
da1184cd41 drm/i915: treat shmem as a region
Convert shmem to an intel_memory_region.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090751.28295-2-matthew.auld@intel.com
2019-10-18 12:41:03 +01:00
Abdiel Janulgue
3aae9d0853 drm/i915: enumerate and init each supported region
Nothing to enumerate yet...

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090751.28295-1-matthew.auld@intel.com
2019-10-18 12:41:02 +01:00
Tvrtko Ursulin
5d904e3c5d drm/i915: Pass in intel_gt at some for_each_engine sites
Where the function, or code segment, operates on intel_gt, we need to
start passing it instead of i915 to for_each_engine(_masked).

This is another partial step in migration of i915->engines[] to
gt->engines[].
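
The shape of such an iterator can be sketched as follows. This is a
simplified, hypothetical stand-in (sketch_gt, sketch_engine and
sketch_for_each_engine are invented names), not the driver's macro: it
walks a fixed-size engine array owned by the gt and skips empty slots.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_ENGINES 4

/* Hypothetical, much reduced versions of the engine and gt structures. */
struct sketch_engine {
	const char *name;
};

struct sketch_gt {
	struct sketch_engine *engine[SKETCH_NUM_ENGINES];
};

/* Visit every populated slot of gt->engine[], skipping NULL entries. */
#define sketch_for_each_engine(engine__, gt__, id__) \
	for ((id__) = 0; (id__) < SKETCH_NUM_ENGINES; (id__)++) \
		if (((engine__) = (gt__)->engine[(id__)]) == NULL) {} else

/* Count the engines present on a gt, taking the gt as the parameter. */
static unsigned int sketch_count_engines(struct sketch_gt *gt)
{
	struct sketch_engine *engine;
	unsigned int id, count = 0;

	sketch_for_each_engine(engine, gt, id)
		count++;

	return count;
}
```

Because the array lives in the gt, a caller that already holds a gt
pointer never needs to reach back up to the top-level device.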

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com
2019-10-18 00:06:27 +01:00
Chris Wilson
e9768bfe87 drm/i915/selftests: Teach requests to use all available engines
The request selftests straddle the boundary between checking the driver
and the hardware. They are subject to the quirks of the underlying HW,
but operate on top of the backend abstractions. The tests focus on the
scheduler elements and so should check for interactions of the scheduler
across all exposed engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016125236.17960-1-chris@chris-wilson.co.uk
2019-10-17 21:14:25 +01:00
Chris Wilson
e9d4c9245f drm/i915: Store i915_ggtt as the backpointer on fence registers
Now that i915_ggtt knows everything about its own paths to perform mmio,
we can use that as our primary backpointer for individual fence
registers. This reduces the amount of pointer dancing we have to perform
on the common paths, but more importantly finishes our fence register
encapsulation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-1-chris@chris-wilson.co.uk
2019-10-16 19:41:36 +01:00
Chris Wilson
280bc0cecb drm/i915/selftests: Fixup naked 64b divide
drivers/gpu/drm/i915/intel_memory_region.o: in function `igt_mock_contiguous':
drivers/gpu/drm/i915/selftests/intel_memory_region.c:166: undefined reference to `__umoddi3'

v2: promote target to u64 for consistency across all builds

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 2f0b97ca02 ("drm/i915/region: support contiguous allocations")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191013114509.3405-1-chris@chris-wilson.co.uk
2019-10-14 09:26:07 +01:00
Lionel Landwerlin
daed3e4439 drm/i915/perf: implement active wait for noa configurations
NOA configurations take some amount of time to apply. That amount of
time depends on the size of the GT. There is no documented time for
this. For example, past experimentations with powergating
configuration changes seem to indicate a 60~70us delay. We go with
500us as default for now which should be over the required amount of
time (according to HW architects).

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
    Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012072308.30312-2-chris@chris-wilson.co.uk
2019-10-12 09:08:33 +01:00
Matthew Auld
7c98501acb drm/i915/region: support volatile objects
Volatile objects are marked as DONTNEED while pinned, therefore once
unpinned the backing store can be discarded. This is limited to kernel
internal objects.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008160116.18379-4-matthew.auld@intel.com
2019-10-08 20:50:01 +01:00
Matthew Auld
2f0b97ca02 drm/i915/region: support contiguous allocations
Some kernel internal objects may need to be allocated as a contiguous
block. Also, thinking ahead, the various kernel io_mapping interfaces
seem to expect it, although this is purely a limitation in the kernel
API, so perhaps something to be improved.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Michael J Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008160116.18379-3-matthew.auld@intel.com
2019-10-08 20:50:01 +01:00
Matthew Auld
232a6ebae4 drm/i915: introduce intel_memory_region
Support memory regions, as defined by a given (start, end), and allow
creating GEM objects which are backed by said region. The immediate goal
here is to have something to represent our device memory, but later on
we also want to represent every memory domain with a region, so stolen,
shmem, and of course device. At some point we are probably going to want
to use a common struct here, such that we are better aligned with say TTM.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008160116.18379-2-matthew.auld@intel.com
2019-10-08 20:49:55 +01:00
Chris Wilson
d14a701b00 drm/i915/selftests: Assign the intel_runtime_pm pointer for mock_uncore
Couple up our mock_uncore to know about the fake global device and its
runtime power management.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008145045.23157-1-chris@chris-wilson.co.uk
2019-10-08 16:21:50 +01:00
Chris Wilson
7842793330 drm/i915: Drop struct_mutex from around GEM initialisation
We no longer need to placate lockdep by holding struct_mutex for our
initialisation, so don't.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-21-chris@chris-wilson.co.uk
2019-10-04 15:39:43 +01:00
Chris Wilson
2af402982a drm/i915/selftests: Drop vestigal struct_mutex guards
We no longer need struct_mutex to serialise request emission, so remove
it from the gt selftests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-20-chris@chris-wilson.co.uk
2019-10-04 15:39:41 +01:00
Chris Wilson
a4e7ccdac3 drm/i915: Move context management under GEM
Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.

v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk
2019-10-04 15:39:34 +01:00
Chris Wilson
2935ed5339 drm/i915: Remove logical HW ID
With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, each logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments does suggest that it should be generating
lite-restores.]

We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifies. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.

v2: Calculate required number of tags to fill ELSP.

Fixes: 976b55f0e1 ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
2019-10-04 15:39:30 +01:00
Chris Wilson
6610197542 drm/i915: Move request runtime management onto gt
Requests are run from the gt and are tied into the gt runtime power
management, so pull the runtime request management under gt/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-12-chris@chris-wilson.co.uk
2019-10-04 15:39:26 +01:00
Chris Wilson
f33a8a5160 drm/i915: Merge wait_for_timelines with retire_request
wait_for_timelines is essentially the same loop as retiring requests
(with an extra timeout), so merge the two into one routine.

v2: i915_retire_requests_timeout and keep VT'd w/a as !interruptible

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-10-chris@chris-wilson.co.uk
2019-10-04 15:39:23 +01:00
Chris Wilson
33d856445b drm/i915: Remove the GEM idle worker
Nothing inside the idle worker now requires struct_mutex, so we can
remove the indirection of using our own worker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-9-chris@chris-wilson.co.uk
2019-10-04 15:39:22 +01:00
Chris Wilson
7e80576266 drm/i915: Drop struct_mutex from around i915_retire_requests()
We don't need to hold struct_mutex now for retiring requests, so drop it
from i915_retire_requests() and i915_gem_wait_for_idle(), finally
removing I915_WAIT_LOCKED for good.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-8-chris@chris-wilson.co.uk
2019-10-04 15:39:17 +01:00
Chris Wilson
b1e3177bd1 drm/i915: Coordinate i915_active with its own mutex
Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.

This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).

The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.

v2: Add some comments that barely elucidate anything :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
2019-10-04 15:39:12 +01:00
Chris Wilson
274cbf20fd drm/i915: Push the i915_active.retire into a worker
As we need to use a mutex to serialise i915_active activation
(because we want to allow the callback to sleep), we need to push the
i915_active.retire into a worker callback in case we need to retire
from an atomic context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk
2019-10-04 15:39:10 +01:00
Chris Wilson
2850748ef8 drm/i915: Pull i915_vma_pin under the vm->mutex
Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to allocate and apply the PTE updates after we have
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).

In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple, we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.

Another consequence in reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is owned by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!

v2: Add some commentary, and some helpers to reduce patch churn.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
2019-10-04 15:39:02 +01:00
Chris Wilson
5e053450c1 drm/i915: Only track bound elements of the GTT
The premise here is simply to avoid having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk
2019-10-04 15:39:01 +01:00
Chris Wilson
dfe324f34c drm/i915/selftests: Extract random_offset() for use with a prng
For selftests, we desire repeatability and so prefer using a prng with
known seed over true randomness. Extract random_offset() as a selftest
utility that can take the prng state.

Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191002122430.23205-1-chris@chris-wilson.co.uk
2019-10-02 15:30:44 +01:00
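The repeatability argument in dfe324f34c can be illustrated with a small model. This is only a sketch in Python, not the kernel helper: the function signature, the rounding helpers, and the `offsets()` wrapper are assumptions made for illustration. It shows that a prng with a known seed yields the same sequence of aligned offsets on every run.

```python
import random


def round_up(x, align):
    """Round x up to the next multiple of align."""
    return (x + align - 1) // align * align


def round_down(x, align):
    """Round x down to the previous multiple of align."""
    return x // align * align


def random_offset(prng, start, end, length, align):
    """Pick an aligned offset so that [offset, offset + length) fits in [start, end).

    Hypothetical signature; the prng state is passed in so the caller
    controls the seed.
    """
    lo = round_up(start, align)
    hi = round_down(end - length, align)
    if lo > hi:
        raise ValueError("no aligned slot of that length fits in the range")
    slots = (hi - lo) // align + 1
    return lo + prng.randrange(slots) * align


def offsets(seed, count):
    """Generate `count` offsets from a freshly seeded prng."""
    prng = random.Random(seed)
    return [random_offset(prng, 0, 1 << 32, 1 << 16, 1 << 12) for _ in range(count)]
```

Calling `offsets()` twice with the same seed returns identical lists, so a failing test run could be replayed from its seed alone.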
Chris Wilson
4e18ca703f drm/i915/selftests: Distinguish mock device from no wakeref
On systems that have no runtime-pm, we mark the wakeref as being -1. We
therefore cannot use that value for the mock-gt indicator, so opt for
-ENODEV instead. The wakeref should never be an error value -- one
hopes!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-2-chris@chris-wilson.co.uk
2019-09-27 23:25:30 +01:00
Andi Shyti
c113236718 drm/i915: Extract GT render sleep (rc6) management
Continuing the theme of breaking intel_pm.c up into reasonable chunks of
power management utilities, pull out the rc6 setup into its GT handler.

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919143840.20384-1-andi.shyti@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20190927110849.28734-1-chris@chris-wilson.co.uk
2019-09-27 13:01:57 +01:00
Chris Wilson
a3f56e7da5 drm/i915/selftests: Exercise concurrent submission to all engines
The simplest and most maximal submission we can do, a thread to submit
requests unto each engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190925193446.26007-1-chris@chris-wilson.co.uk
2019-09-27 11:41:45 +01:00
Chris Wilson
7dc56af526 drm/i915/selftests: Verify the LRC register layout between init and HW
Before we submit the first context to HW, we need to construct a valid
image of the register state. This layout is defined by the HW and should
match the layout generated by HW when it saves the context image.
Asserting that this should be equivalent should help avoid any undefined
behaviour and verify that we haven't missed anything important!

Of course, having insisted that the initial register state within the
LRC should match that returned by HW, we need to ensure that it does.

v2: Drop the RELATIVE_MMIO flag from gen11, we ignore it for
constructing the lrc image.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190924145950.3011-1-chris@chris-wilson.co.uk
2019-09-24 17:27:19 +01:00
Chris Wilson
d19d71fc2b drm/i915: Mark i915_request.timeline as a volatile, rcu pointer
The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.

One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, its context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.

v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
2019-09-20 10:24:09 +01:00
Chris Wilson
a47e788c23 drm/i915/selftests: Exercise CS TLB invalidation
Check that we are correctly invalidating the TLB at the start of a
batch after updating the GTT.

v2: Comments and hold the request reference while spinning

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919131414.7495-1-chris@chris-wilson.co.uk
2019-09-19 15:49:31 +01:00
Michel Thierry
cf82d9ddd3 drm/i915/tgl: Introduce gen12 forcewake ranges
The media ranges extend beyond what gen11 gives so we can't piggyback
on gen11 ranges, even on read side.

Introduce a table for gen12 and accessors for it.

v2: correctly implement gen12_fwtable_write/read (Daniele)
v3: update with ranges from bspec.
v4: avoid GEN11_NEEDS_FORCEWAKE (Mika)
v5: bspec ref (Daniele)

BSpec: 52078
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913141652.27958-2-mika.kuoppala@linux.intel.com
2019-09-13 20:07:36 +01:00
Chris Wilson
4dd2fbbfb5 drm/i915: Make i915_vma.flags atomic_t for mutex reduction
In preparation for reducing struct_mutex stranglehold around the vm,
make the vma.flags atomic so that we can acquire a pin on the vma
atomically before deciding if we need to take the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
2019-09-11 13:39:42 +01:00
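The idea in 4dd2fbbfb5 of taking a pin atomically before deciding whether the mutex is needed can be modelled roughly as follows. This is a Python sketch under stated assumptions, not the driver's code: the `BOUND` bit, the flag layout, and the `AtomicInt` class (which uses a lock to emulate a hardware compare-and-swap) are all hypothetical, and pin-count overflow is ignored.

```python
import threading

BOUND = 1 << 10        # hypothetical "already bound" flag bit
PIN_MASK = BOUND - 1   # low bits hold the pin count in this model


class AtomicInt:
    """Stand-in for atomic_t; a lock emulates compare-and-swap."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        return self._value

    def cmpxchg(self, old, new):
        """Store `new` only if the current value is `old`; report success."""
        with self._lock:
            if self._value != old:
                return False
            self._value = new
            return True


class Vma:
    def __init__(self):
        self.flags = AtomicInt()
        self.mutex = threading.Lock()
        self.slow_paths = 0  # how often the mutex had to be taken

    def try_pin(self):
        """Fast path: bump the pin count without the mutex if already bound."""
        while True:
            old = self.flags.read()
            if not old & BOUND:
                return False  # not bound yet, caller must take the mutex
            if self.flags.cmpxchg(old, old + 1):
                return True

    def pin(self):
        if self.try_pin():
            return
        with self.mutex:  # slow path: bind, then take the first pin
            self.slow_paths += 1
            while True:
                old = self.flags.read()
                if self.flags.cmpxchg(old, (old | BOUND) + 1):
                    return

    def pin_count(self):
        return self.flags.read() & PIN_MASK
```

In this model only the first `pin()` on an unbound vma takes the mutex; later pins succeed on the compare-and-swap alone.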
Tvrtko Ursulin
61fa60ff6e drm/i915: Move GT init to intel_gt.c
Code in i915_gem_init_hw is all about GT init so move it to intel_gt.c
renaming to intel_gt_init_hw.

Existing intel_gt_init_hw is renamed to intel_gt_init_hw_early since it
is currently called from driver probe.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910143823.10686-2-tvrtko.ursulin@linux.intel.com
2019-09-11 08:11:51 +01:00
Chris Wilson
7c465310fe drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel
Being a "low-level" test, we opt to bypass the normal bind/unbind hooks
for the lower level insert_entries/clear_range. For ggtt, the
bind/unbind hooks provide the runtime wakeref and so we must also handle
this in exercising the low level hooks.

<4> [538.151672] RPM raw-wakeref not held
<4> [538.151825] WARNING: CPU: 0 PID: 11 at ./drivers/gpu/drm/i915/intel_runtime_pm.h:107 fwtable_read32+0x1be/0x300 [i915]
<4> [538.151830] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp btusb btrtl btbcm x86_pkg_temp_thermal coretemp btintel crct10dif_pclmul bluetooth crc32_pclmul snd_intel_nhlt snd_hda_codec ecdh_generic ghash_clmulni_intel ecc snd_hwdep snd_hda_core lpc_ich r8169 realtek snd_pcm mei_me mei prime_numbers pinctrl_broxton pinctrl_intel [last unloaded: i915]
<4> [538.151861] CPU: 0 PID: 11 Comm: migration/0 Tainted: G     U            5.3.0-rc7-CI-Trybot_4938+ #1
<4> [538.151864] Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018
<4> [538.151960] RIP: 0010:fwtable_read32+0x1be/0x300 [i915]
<4> [538.151965] Code: e8 e7 f9 5f e0 e9 0b ff ff ff 80 3d d5 8d 26 00 00 0f 85 81 fe ff ff 48 c7 c7 ef 01 bd a0 c6 05 c1 8d 26 00 01 e8 b2 e4 6a e0 <0f> 0b e9 67 fe ff ff 80 3d ad 8d 26 00 00 0f 85 65 fe ff ff 48 c7
<4> [538.151969] RSP: 0018:ffffc9000007be10 EFLAGS: 00010086
<4> [538.151972] RAX: 0000000000000000 RBX: ffff88826be10d50 RCX: 0000000000000002
<4> [538.151975] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [538.151978] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
<4> [538.151981] R10: 0000000000000000 R11: ffffc9000007bcb0 R12: 0000000000101008
<4> [538.151984] R13: 0000000000000000 R14: ffffc9000036f638 R15: 0000000000000002
<4> [538.151987] FS:  0000000000000000(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
<4> [538.151990] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [538.151993] CR2: 00007fd48e7052f8 CR3: 0000000005210000 CR4: 00000000003406f0
<4> [538.151995] Call Trace:
<4> [538.152106]  bxt_vtd_ggtt_clear_range__cb+0x38/0x40 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909110011.8958-2-chris@chris-wilson.co.uk
2019-09-10 12:06:25 +01:00
Chris Wilson
cec5ca08e3 drm/i915: Perform GGTT restore much earlier during resume
As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTE before then, they may read and even *write* random locations
in memory.

Detected by DMAR faults during resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909110011.8958-4-chris@chris-wilson.co.uk
2019-09-10 11:49:11 +01:00
Matthew Auld
33dd889923 drm/i915: cleanup cache-coloring
Try to tidy up the cache-coloring such that we rid the code of any
mm.color_adjust assumptions, this should hopefully make it more obvious
in the code when we need to actually use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
2019-09-09 21:00:20 +01:00
Matthew Auld
e9ceb751ad drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
Make it clear that the color adjust callback applies to the ggtt.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-2-matthew.auld@intel.com
2019-09-09 21:00:11 +01:00
Andi Shyti
42014f69bb drm/i915: Hook up GT power management
Refactor the GT power management interface to work through the GT now
that it is under the control of gt/

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905111403.10071-1-andi.shyti@intel.com
2019-09-06 20:29:58 +01:00
Matthew Auld
31444afb46 drm/i915: s/for_each_sgt_dma/for_each_sgt_daddr/
The sg_table for our backing store might contain addresses from
stolen-memory or in the future local-memory, at which point this is no
longer a dma-iterator. As a consequence we should now break on NULL
iter.sgp, instead of dmap == 0 which is considered an invalid dma
address.

As a bonus, gcc much prefers this construct,

  Function                                     old     new   delta
  gen8_ggtt_insert_entries                     211     192     -19
  gen6_ggtt_insert_entries                     292     262     -30
  i915_error_object_create                     996     954     -42

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829201919.21493-1-matthew.auld@intel.com
2019-08-29 21:59:16 +01:00
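The change of loop terminator in 31444afb46 can be illustrated with a toy model. The Python sketch below is not the kernel iterator; the segment representation and both function names are made up for illustration, and it assumes that an address of 0 may be a legitimate offset in stolen or local memory. It contrasts a loop that stops on a zero address with one that stops when the segment list is exhausted.

```python
PAGE = 4096


def pages_until_zero_addr(segments):
    """Old-style walk: an address of 0 is treated as the end marker."""
    out = []
    for addr, length in segments + [(0, 0)]:  # (0, 0) models the terminator
        if addr == 0:
            break
        out.extend(range(addr, addr + length, PAGE))
    return out


def pages_until_end(segments):
    """New-style walk: stop only when there is no next segment (NULL iter.sgp)."""
    it = iter(segments)
    seg = next(it, None)
    out = []
    while seg is not None:
        addr, length = seg
        out.extend(range(addr, addr + length, PAGE))
        seg = next(it, None)
    return out
```

With `segments = [(0, 2 * PAGE), (0x10000, PAGE)]`, the first walk stops immediately at the segment starting at address 0, while the second visits all three pages.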
Chris Wilson
e7b6affd0b drm/i915/selftests: cond_resched() within the longer buddy tests
Let the scheduler have a breather in between passes of the longer buddy
tests. Important if we are running under kasan etc and this takes far
longer than usual!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829170848.969-1-chris@chris-wilson.co.uk
2019-08-29 19:19:50 +01:00
Chris Wilson
8e40983dec drm/i915/selftests: Fixup a couple of missing serialisation with vma
In commit 70d6894d14 ("drm/i915: Serialize against vma moves")
I managed to miss a couple of i915_vma_move_to_active() that had not
serialised against an async vma pinning. Add the missing
i915_request_await.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821193851.18232-1-chris@chris-wilson.co.uk
2019-08-21 22:21:57 +01:00
Chris Wilson
70d6894d14 drm/i915: Serialize against vma moves
Make sure that when submitting requests, we always serialize against
potential vma moves and clflushes.

Time for a i915_request_await_vma() interface!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819112033.30638-1-chris@chris-wilson.co.uk
2019-08-19 15:25:56 +01:00
Chris Wilson
ef46884975 drm/i915: Propagate fence errors
Errors spread like wildfire, and must eventually be returned to the
user. They need to be captured and passed along the flow of fences,
infecting each in turn with the existing error, until finally they fall
out of a user visible result.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190817232511.11391-1-chris@chris-wilson.co.uk
2019-08-18 12:38:09 +01:00
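The propagation described in ef46884975 can be pictured with a minimal model. This Python sketch is an assumption-laden illustration rather than the dma_fence or i915 code: the `Fence` class and its `signal()` method are hypothetical. Each fence inherits the first error found among the fences it waited on, so an early failure surfaces at the end of the chain.

```python
import errno


class Fence:
    """Toy fence: completes with an error code (0 means success)."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)  # fences this one waits on
        self.error = 0

    def signal(self, error=0):
        """Complete the fence, inheriting the first error seen among its deps."""
        for dep in self.deps:
            if dep.error and not self.error:
                self.error = dep.error
        if error and not self.error:
            self.error = error
        return self.error


# A three-stage chain: the failure in `copy` reaches `flip` via `render`.
copy = Fence("copy")
render = Fence("render", [copy])
flip = Fence("flip", [render])

copy.signal(-errno.EIO)
render.signal()
flip.signal()
```

After the three `signal()` calls, `flip.error` carries the `-EIO` raised by `copy`, which is the value a user-visible wait would report in this model.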
Chris Wilson
25ffd4b11d drm/i915: Markup expected timeline locks for i915_active
As every i915_active_request should be serialised by a dedicated lock,
i915_active consists of a tree of locks; one for each node. Mark up
the i915_active_request with what lock is supposed to be guarding it so
that we can verify that the serialised updates are indeed serialised.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-2-chris@chris-wilson.co.uk
2019-08-16 18:02:07 +01:00
Matthew Auld
6f6333ba50 drm/i915/selftest/buddy: fixup igt_buddy_alloc_range
Dan reported the following static checker warning:

drivers/gpu/drm/i915/selftests/i915_buddy.c:670 igt_buddy_alloc_range()
error: we previously assumed 'block' could be null (see line 665)

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815103210.11802-1-matthew.auld@intel.com
2019-08-15 13:13:23 +01:00
Matthew Auld
14d1b9a624 drm/i915: buddy allocator
Simple buddy allocator. We want to allocate properly aligned
power-of-two blocks to promote usage of huge-pages for the GTT, so 64K,
2M and possibly even 1G. While we do support allocating stuff at a
specific offset, it is more intended for preallocating portions of the
address space, say for an initial framebuffer, for other uses drm_mm is
probably a much better fit. Anyway, hopefully this can all be thrown
away if we eventually move to having the core MM manage device memory.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809202926.14545-2-matthew.auld@intel.com
2019-08-10 19:47:40 +01:00
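For readers unfamiliar with the technique named in 14d1b9a624, here is a compact buddy allocator model. It is a Python sketch of the general algorithm, not the i915_buddy implementation: the class, its method names, and the per-order free sets are assumptions made for illustration. Blocks are power-of-two multiples of a minimum size, allocation splits larger blocks, and freeing merges a block with its buddy whenever the buddy is also free.

```python
class Buddy:
    def __init__(self, size, min_size):
        # Both sizes must be powers of two in this model.
        assert size & (size - 1) == 0 and min_size & (min_size - 1) == 0
        assert min_size <= size
        self.min_size = min_size
        self.max_order = (size // min_size).bit_length() - 1
        # One set of free block offsets per order; start with one big block.
        self.free = {order: set() for order in range(self.max_order + 1)}
        self.free[self.max_order].add(0)
        self.allocated = {}  # offset -> order

    def alloc(self, order):
        """Return the offset of a block of min_size << order bytes."""
        for o in range(order, self.max_order + 1):
            if self.free[o]:
                break
        else:
            raise MemoryError("no free block large enough")
        offset = min(self.free[o])
        self.free[o].remove(offset)
        while o > order:
            # Split: keep the low half, put the high half on the free list.
            o -= 1
            self.free[o].add(offset + (self.min_size << o))
        self.allocated[offset] = order
        return offset

    def free_block(self, offset):
        """Release a block and merge it with free buddies as far as possible."""
        order = self.allocated.pop(offset)
        while order < self.max_order:
            buddy = offset ^ (self.min_size << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            offset = min(offset, buddy)
            order += 1
        self.free[order].add(offset)
```

Because every block of a given order starts at a multiple of its own size, a 64K request (order 4 with a 4K minimum) always comes back 64K-aligned, which is the property the commit relies on for huge-page use.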
Chris Wilson
75d0a7f31e drm/i915: Lift timeline into intel_context
Move the timeline from being inside the intel_ring to intel_context
itself. This saves much pointer dancing and makes the relations of the
context to its timeline much clearer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
2019-08-09 20:18:30 +01:00
Chris Wilson
c7302f2044 drm/i915: Defer final intel_wakeref_put to process context
As we need to acquire a mutex to serialise the final
intel_wakeref_put, we need to ensure that we are in process context at
that time. However, we want to allow operation on the intel_wakeref from
inside timer and other hardirq context, which means that we need to defer
that final put to a workqueue.

Inside the final wakeref puts, we are safe to operate in any context, as
we are simply marking up the HW and state tracking for the potential
sleep. It's only the serialisation with the potential sleeping getting
that requires careful wait avoidance. This allows us to retain the
immediate processing as before (we only need to sleep over the same
races as the current mutex_lock).

v2: Add a selftest to ensure we exercise the code while lockdep watches.
v3: That test was extremely loud and complained about many things!
v4: Not a whale!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295
References: https://bugs.freedesktop.org/show_bug.cgi?id=111245
References: https://bugs.freedesktop.org/show_bug.cgi?id=111256
Fixes: 18398904ca ("drm/i915: Only recover active engines")
Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk
2019-08-08 21:28:51 +01:00
Chris Wilson
cbb153c50e drm/i915/selftests: Fixup a missing legacy_idx
Grr, missed one*. For using the legacy engine map, we should use
engine->legacy_idx. Ideally, we should know the intel_context in the
selftest and avoid all the fiddling around with unwanted GEM contexts.

* In my defence, the conflict was added in another patch after it was
tested by CI.

v2: mock engines needs legacy love as well

Fixes: f1c4d157ab ("drm/i915: Fix up the inverse mapping for default ctx->engines[]")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808194525.9410-2-chris@chris-wilson.co.uk
2019-08-08 20:53:31 +01:00
Chris Wilson
ca883c304f drm/i915/selftests: Pass intel_context to mock_request
Modernise the mock_request factory to take intel_context not a (GEM
context, intel_engine_cs) tuple.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808115640.20552-1-chris@chris-wilson.co.uk
2019-08-08 13:44:31 +01:00
Chris Wilson
361f9dc243 drm/i915: Use drm_i915_private directly from drv_get_drvdata()
As we store a pointer to i915 in the drvdata field (as the pointer is both
an alias to the drm_device and drm_i915_private), we can use the stored
pointer directly as the i915 device.

v2: Store and use i915 inside drv_get_drvdata()
v3: Only expect i915 inside drv_get_drvdata() so drop the assumed
i915/drm equivalence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806074219.11043-1-chris@chris-wilson.co.uk
2019-08-06 09:36:22 +01:00
Chris Wilson
d8af05ff38 drm/i915: Allow sharing the idle-barrier from other kernel requests
By placing our idle-barriers in the i915_active fence tree, we expose
those for reuse by other components that are issuing requests along the
kernel_context. Reusing the proto-barrier active_node is perfectly fine
as the new request implies a context-switch, and so an opportune point
to run the idle-barrier. However, the proto-barrier is not equivalent
to a normal active_node and care must be taken to avoid dereferencing the
ERR_PTR used as its request marker.

v2: Comment the more egregious cheek
v3: A glossary!

Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ce476c80b8 ("drm/i915: Keep contexts pinned until after the next kernel context switch")
Fixes: a9877da2d6 ("drm/i915/oa: Reconfigure contexts on the fly")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802100015.1281-1-chris@chris-wilson.co.uk
2019-08-02 11:53:04 +01:00
Chris Wilson
f277bc0c98 drm/i915/selftests: Pass intel_context to igt_spinner
Teach igt_spinner to only use our internal structs, decoupling the
interface from the GEM contexts. This makes it easier to avoid
requiring ce->gem_context back references for kernel_context that may
have them in future.

v2: Lift engine lock to verify_wa() caller.
v3: Less than v2, but more so

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731081126.9139-1-chris@chris-wilson.co.uk
2019-07-31 09:45:27 +01:00
Chris Wilson
d8bf0e7627 drm/i915/selftests: Let igt_vma_partial et al breathe
Give the scheduler a chance to breathe by calling cond_resched() as some
of the loops may take some time on slower machines, and so catch the
attention of the watchdogs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111196
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190723095800.2820-1-chris@chris-wilson.co.uk
2019-07-23 12:23:43 +01:00
Daniele Ceraolo Spurio
0f261b241d drm/i915/uc: move GuC and HuC files under gt/uc/
Both microcontrollers are part of the GT HW and are closely related to
GT operations. To keep all the files cleanly together, they've been
placed in their own subdir inside the gt/ folder.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-6-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-07-13 19:58:23 +01:00
Chris Wilson
cb823ed991 drm/i915/gt: Use intel_gt as the primary object for handling resets
Having taken the first step in encapsulating the functionality by moving
the related files under gt/, the next step is to start encapsulating by
passing around the relevant structs rather than the global
drm_i915_private. In this step, we pass intel_gt to intel_reset.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
2019-07-12 21:06:56 +01:00
Daniele Ceraolo Spurio
aebf052bb6 drm/i915/guc: Simplify guc client
We originally added support, in some cases partial, for different modes
of operations via guc clients:

- proxy vs direct submission;
- variable engine mask per-client.

We only ever used one flow (all submissions via a single proxy), so the
other code paths haven't been exercised and are most likely
non-functional. The guc firmware interface is also in the process of
being updated to better fit the i915 flow and our client abstraction
will need to change accordingly (or possibly go away entirely), so these
old unused paths can be considered dead and removed.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-3-daniele.ceraolospurio@intel.com
2019-07-11 11:15:49 +01:00
Chris Wilson
71b0846c17 drm/i915/guc: Remove preemption support for current fw
Preemption via GuC submission is not being supported with its current
legacy incarnation. The current FW does support a similar pre-emption
flow via H2G, but it is class-based instead of being instance-based,
which doesn't fit well with the i915 tracking. To fix this, the
firmware is being updated to better support our needs with a new flow,
so we can safely remove the old code.

v2 (Daniele): resurrect & rebase, reword commit message, remove
preempt_context as well

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-2-daniele.ceraolospurio@intel.com
2019-07-11 11:09:33 +01:00
Rodrigo Vivi
88c90e8006 Merge drm/drm-next into drm-intel-next-queued
Catch up with 5.2, especially to remove a drm-tip merge
fixup around intel_workarounds.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-07-10 06:51:35 -07:00
Chris Wilson
baf08ed50a drm/i915/selftests: Set igt_spinner.gt for early exit
Set up a default gt pointer for an early cleanup of igt_spinner, before
a request is created and igt_spinner.gt set to the active engine's.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190708215524.31639-1-chris@chris-wilson.co.uk
2019-07-09 08:07:09 +01:00
Chris Wilson
63251685c1 drm/i915/selftests: Common live setup/teardown
We frequently, but not frequently enough!, remember to flush residual
operations and objects at the end of a live subtest. The purpose is to
cleanup after every subtest, leaving a clean slate for the next subtest,
and perform early detection of leaky state. As this should ideally be
common for all live subtests, pull the task into a common teardown
routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-1-chris@chris-wilson.co.uk
2019-07-03 11:07:57 +01:00
Chris Wilson
c8d84778e5 drm/i915/selftests: Hold ref on request across waits
As we wait upon the request, we should be sure to hold our own reference
for our checks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-14-chris@chris-wilson.co.uk
2019-06-26 00:00:29 +01:00
Chris Wilson
0c91621cad drm/i915/gt: Pass intel_gt to pm routines
Switch from passing the i915 container to newly named struct intel_gt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-2-chris@chris-wilson.co.uk
2019-06-25 20:17:22 +01:00
Chris Wilson
12c255b5da drm/i915: Provide an i915_active.acquire callback
If we introduce a callback for i915_active that is only called the first
time we use the i915_active and is symmetrically paired with the
i915_active.retire callback, we can replace the open-coded and
non-atomic implementations -- which will be very fragile (i.e. broken)
upon removing the struct_mutex serialisation.
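
A minimal userspace sketch of the pairing described above, assuming C11 atomics and a spin flag standing in for whatever lock the driver actually uses; the struct and function names are hypothetical and do not reflect the real i915_active API. The acquire callback runs only on the 0 -> 1 transition and the retire callback only on the 1 -> 0 transition.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an i915_active-like tracker. */
struct active_sketch {
	atomic_int count;                          /* number of current users */
	atomic_flag lock;                          /* serialises first use / last release */
	int (*acquire)(struct active_sketch *ref); /* called on first use only */
	void (*retire)(struct active_sketch *ref); /* called on last release only */
};

static inline void active_init(struct active_sketch *ref,
			       int (*acquire)(struct active_sketch *),
			       void (*retire)(struct active_sketch *))
{
	atomic_init(&ref->count, 0);
	atomic_flag_clear(&ref->lock);
	ref->acquire = acquire;
	ref->retire = retire;
}

static inline int active_acquire(struct active_sketch *ref)
{
	int old = atomic_load(&ref->count);
	int err = 0;

	/* Fast path: already active, so just take another reference. */
	while (old > 0) {
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return 0;
	}

	/* Slow path: serialise the 0 -> 1 transition and run the callback once. */
	while (atomic_flag_test_and_set(&ref->lock))
		; /* spin */
	if (atomic_load(&ref->count) == 0 && ref->acquire)
		err = ref->acquire(ref);
	if (!err)
		atomic_fetch_add(&ref->count, 1);
	atomic_flag_clear(&ref->lock);
	return err;
}

static inline void active_release(struct active_sketch *ref)
{
	while (atomic_flag_test_and_set(&ref->lock))
		; /* spin */
	if (atomic_fetch_sub(&ref->count, 1) == 1 && ref->retire)
		ref->retire(ref); /* count dropped to zero: symmetric with acquire */
	atomic_flag_clear(&ref->lock);
}

/* Example callbacks that only count how often they run. */
static int acquire_calls, retire_calls;

static int count_acquire(struct active_sketch *ref)
{
	(void)ref;
	acquire_calls++;
	return 0;
}

static void count_retire(struct active_sketch *ref)
{
	(void)ref;
	retire_calls++;
}
```

Taking two references and dropping both should invoke each callback exactly once; an open-coded "if not yet active, set up" check has no such guarantee once the outer mutex is gone.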

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk
2019-06-21 19:47:55 +01:00
Chris Wilson
5361db1a33 drm/i915: Track i915_active using debugobjects
Provide runtime asserts and tracking of i915_active via debugobjects.
For example, this should allow us to check that the i915_active is only
active when we expect it to be and is never freed too early.

One consequence is that, for simplicity, we no longer allow i915_active
to be on-stack which only affected the selftests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-2-chris@chris-wilson.co.uk
2019-06-21 19:47:50 +01:00
Tvrtko Ursulin
f0c02c1b91 drm/i915: Rename i915_timeline to intel_timeline and move under gt
Move all timeline code under gt and rename to intel_gt prefix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:53 +01:00
Tvrtko Ursulin
4c6d51ea2a drm/i915: Make timelines gt centric
Our timelines are stored inside intel_gt so we can convert the interface
to take exactly that and not i915.

At the same time re-order the params to our more typical layout and
replace the backpointer to the new containing structure.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-31-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:51 +01:00
Tvrtko Ursulin
d8a4424839 drm/i915: Store ggtt pointer in intel_gt
This will become useful in the following patch.

v2:
 * Assign the pointer through a helper on the top level to work around
   the layering violation. (Chris)

v3:
 * Handle selftests.

v4:
 * Move call to intel_gt_init_hw into mock_init_ggtt. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-28-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:46 +01:00
Tvrtko Ursulin
baea429dc5 drm/i915: Move i915_gem_chipset_flush to intel_gt
This aligns better with the rest of restructuring.

v2:
 * Move call out of line. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-24-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:40 +01:00
Tvrtko Ursulin
a1c8a09e0c drm/i915: Convert i915_gem_flush_ggtt_writes to intel_gt
Having introduced struct intel_gt (named the anonymous structure in i915)
we can start using it to compartmentalize our code better. It makes more
sense logically to have the code internally like this and it will also
help with future split between gt and display in i915.

v2:
 * Keep ggtt flush before fb obj flush. (Chris)

v3:
 * Fix refactoring fail.
 * Always flush ggtt writes. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-23-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:38 +01:00
Tvrtko Ursulin
763c1e6312 drm/i915: Store intel_gt backpointer in vm
This will become useful in the following patch.

v2:
 * Handle mock ggtt.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-21-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:36 +01:00
Tvrtko Ursulin
99f2eb9667 drm/i915: Move intel_gt_pm_init under intel_gt_init_early
And also rename to intel_gt_pm_init_early and make it operate on gt.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-5-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:18 +01:00
Tvrtko Ursulin
724e9564c5 drm/i915: Store some backpointers in struct intel_gt
We need an easy way to get back to i915 and uncore.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-4-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:17 +01:00
Tvrtko Ursulin
24635c5152 drm/i915: Move intel_gt initialization to a separate file
As it will grow in a following patch make a new home for it.

v2:
 * Convert mock_gem_device as well. (Chris)

v3:
 * Rename to intel_gt_init_early and move call site to i915_drv.c. (Chris)

v4:
 * Adjust SPDX tags.
 * No need to gt/ path when including intel_gt_types.h. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-3-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:15 +01:00
Dave Airlie
417f2544f4 Features:
- HDR support (Uma, Ville)
 - Add I2C symlink under HDMI connector similar to DP (Oleg)
 - Add ICL multi-segmented gamma support (Shashank, Uma)
 - Update register whitelist support for new hardware (Robert, John)
 - GuC firmware update with updated ABI interface (Michal, Oscar)
 - Add support for new DMC header versions (Lucas)
 - In-kernel blitter client for selftest use (Matthew)
 - Add Mule Creek Canyon (MCC) PCH support to go with EHL (Matt)
 - EHL platform feature updates (Matt)
 - Use Command Transport Buffers with GuC on all gens (Daniele)
 - New i915.force_probe module parameter to replace i915.alpha_support (Jani)
 
 Refactoring:
 - Better runtime PM code abstraction/encapsulation (Daniele)
 - VBT parsing cleanup and improvements (Jani)
 - Move display code to its own subdirectory (Jani)
 - Header cleanup (Jani, Daniele)
 - Prep work for subslice mask expansion (Stuart)
 - Use uncore mmio register accessors more, remove unused macro wrappers (Tvrtko)
 - Remove unused atomic property get/set stubs (Maarten)
 - GTT cleanups and improvements (Mika)
 - Pass intel_ types instead of drm_ types in plenty of display code (Ville)
 - Engine reset, hangcheck, fault code cleanups and improvements (Tvrtko)
 - Consider AML variants simply as either KBL or CFL ULX (Ville)
 - State checker cleanups and improvements (Ville)
 - GEM code reorganization to more files under gem subdirectory (Chris)
 - Reducing dependency on a coarse struct_mutex (Chris)
 
 Fixes:
 - Fix use of uninitialized/incorrect error pointers (Colin, Dan)
 - Fix DSI fastboot on some VLV/CHV platforms (Hans)
 - Fix DSI error path (Hans)
 - Add ICL port A combo PHY HW state check (Imre)
 - Fix ICL AUX-B HW not done issue (Imre)
 - Fix perf whitelist on gen10+ (Lionel)
 - Fix PSR exit by forcing manual exit on older gens (José)
 - Match voltage ranges instead of exact values (Lucas)
 - Fix SDVO HDMI audio, with cleanups (Ville)
 - Fix plane state dumps (Ville)
 - Fix driver cleanup code to support driver hot unbind (Janusz)
 - Add checks for ICL memory bandwidth requirements (Ville)
 - Fix toggling between no C8 planes vs. at least one C8 plane (Ville)
 - Improved checks on PLL usage conditions, refactoring (Ville)
 - Avoid clobbering M/N values in fastset fuzzy checks (Ville)
 - Take a runtime pm wakeref for atomic commits (Chris)
 - Do not allow runtime pm autosuspend to remove userspace GGTT mmaps too quickly (Chris)
 - Avoid refcount_inc on known zero count to avoid debug flagging (Chris)

Merge tag 'drm-intel-next-2019-06-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87v9x1lpdh.fsf@intel.com
2019-06-21 14:00:10 +10:00
Chris Wilson
b32fa81115 drm/i915/gtt: Defer address space cleanup to an RCU worker
Enable RCU protection of i915_address_space and its ppgtt superclasses,
and defer its cleanup into a worker executed after an RCU grace period.

In the future we will be able to use the RCU protection to reduce the
locking around VM lookups, but the immediate benefit is being able to
defer the release into a kworker (process context). This is required as
we may need to sleep to reap the WC pages stashed away inside the ppgtt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110934
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620183705.31006-1-chris@chris-wilson.co.uk
2019-06-20 21:52:30 +01:00
Chris Wilson
22b7a426bb drm/i915/execlists: Preempt-to-busy
When using a global seqno, we required a precise stop-the-world event to
handle preemption and unwind the global seqno counter. To accomplish
this, we would preempt to a special out-of-band context and wait for the
machine to report that it was idle. Given an idle machine, we could very
precisely see which requests had completed and which we needed to feed
back into the run queue.

However, now that we have scrapped the global seqno, we no longer need
to precisely unwind the global counter and only track requests by their
per-context seqno. This allows us to loosely unwind inflight requests
while scheduling a preemption, with the enormous caveat that the
requests we put back on the run queue are still _inflight_ (until the
preemption request is complete). This makes request tracking much more
messy, as at any point then we can see a completed request that we
believe is not currently scheduled for execution. We also have to be
careful not to rewind RING_TAIL past RING_HEAD on preempting to the
running context, and for this we use a semaphore to prevent completion
of the request before continuing.

To accomplish this feat, we change how we track requests scheduled to
the HW. Instead of appending our requests onto a single list as we
submit, we track each submission to ELSP as its own block. Then upon
receiving the CS preemption event, we promote the pending block to the
inflight block (discarding what was previously being tracked). As normal
CS completion events arrive, we then remove stale entries from the
inflight tracker.
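
The block tracking described above might be pictured roughly as follows. This is a simplified sketch with integer request ids and hypothetical names, assuming two ELSP ports and zero-terminated arrays; it is not the execlists implementation.

```c
#include <assert.h>
#include <string.h>

#define PORTS 2 /* assumed number of ELSP ports */

/* Hypothetical tracker: each ELSP submission is kept as its own block. */
struct elsp_sketch {
	int inflight[PORTS + 1]; /* block the HW is executing, 0-terminated */
	int pending[PORTS + 1];  /* block written to ELSP, not yet acknowledged */
};

/* Write a new block of up to two request ids to the (simulated) ELSP. */
static inline void submit(struct elsp_sketch *el, int rq0, int rq1)
{
	el->pending[0] = rq0;
	el->pending[1] = rq1;
	el->pending[2] = 0;
}

/* CS preemption/promotion event: pending replaces whatever was inflight. */
static inline void promote(struct elsp_sketch *el)
{
	memcpy(el->inflight, el->pending, sizeof(el->inflight));
	memset(el->pending, 0, sizeof(el->pending));
}

/* CS completion event: drop the stale head entry from the inflight block. */
static inline void complete_head(struct elsp_sketch *el)
{
	memmove(&el->inflight[0], &el->inflight[1], PORTS * sizeof(int));
	el->inflight[PORTS] = 0;
}
```

Note that promote() simply discards the old inflight block; in this model a preempted request reappears only if it is included in a later submission.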

v2: Be a tinge paranoid and ensure we flush the write into the HWS page
for the GPU semaphore to pick in a timely fashion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
2019-06-20 16:52:36 +01:00
Daniele Ceraolo Spurio
ccb2aceaaa drm/i915: use vfuncs for reg_read/write_fw_domains
Instead of going through the if-else chain every time, let's save the
function in the uncore structure. Note that the new functions are
purposely not used from the reg read/write functions to keep the
inlining there.
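
A small sketch of the idea, with made-up generations, register ranges and domain masks (all names and values are hypothetical, chosen only to show the shape): the lookup function is selected once at init and stored in the structure, so callers no longer walk an if-else chain on every query.

```c
#include <assert.h>

/* Hypothetical signature: map a register offset to a forcewake domain mask. */
typedef unsigned int (*fw_domains_fn)(unsigned int reg);

/* Illustrative lookups only; the real tables differ per platform. */
static unsigned int fw_domains_old(unsigned int reg)
{
	(void)reg;
	return 0x1; /* single domain for every register */
}

static unsigned int fw_domains_new(unsigned int reg)
{
	return reg >= 0x8000 ? 0x2 : 0x1; /* made-up split into two domains */
}

struct uncore_sketch {
	fw_domains_fn read_fw_domains; /* chosen once, reused on every call */
};

/* Pick the implementation at init time instead of on each lookup. */
static inline void uncore_sketch_init(struct uncore_sketch *uncore, int gen)
{
	uncore->read_fw_domains = gen >= 9 ? fw_domains_new : fw_domains_old;
}
```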

While at it, use the new macro to call the old ones to clean the code a
bit.

v2: Rename macros for no-forcewake function assignment (Tvrtko)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620010021.20637-2-daniele.ceraolospurio@intel.com
2019-06-20 16:34:52 +01:00
Chris Wilson
2f5309452d drm/i915: Stop passing I915_WAIT_LOCKED to i915_request_wait()
Since commit eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on
struct_mutex"), the I915_WAIT_LOCKED flag passed to i915_request_wait()
has been defunct. Now go ahead and remove it from all callers.

References: eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-3-chris@chris-wilson.co.uk
2019-06-19 12:58:38 +01:00
Daniel Vetter
52d2d44eee Linux 5.2-rc5

Merge v5.2-rc5 into drm-next

Maarten needs -rc4 backmerged so he can pull in the fbcon notifier
removal topic branch into drm-misc-next.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-06-19 12:07:29 +02:00
Chris Wilson
1422768fa2 drm/i915/selftests: Flush live_evict
Be sure to cleanup after live_evict by flushing any residual state off
the GPU using igt_flush_test.

Tvrtko mentioned that it is probably wise to stop repeating this ad hoc
around the tests and implement a live test runner.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618161951.28820-1-chris@chris-wilson.co.uk
2019-06-18 18:12:13 +01:00
Chris Wilson
422d7df4f0 drm/i915: Replace engine->timeline with a plain list
To continue the onslaught of removing the assumption of a global
execution ordering, another casualty is the engine->timeline. Without an
actual timeline to track, it is overkill and we can replace it with a
much less grand plain list. We still need a list of requests inflight,
for the simple purpose of finding inflight requests (for retiring,
resetting, preemption etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
2019-06-14 19:03:40 +01:00
Chris Wilson
ce476c80b8 drm/i915: Keep contexts pinned until after the next kernel context switch
We need to keep the context image pinned in memory until after the GPU
has finished writing into it. Since it continues to write as we signal
the final breadcrumb, we need to keep it pinned until the request after
it is complete. Currently we know the order in which requests execute on
each engine, and so to remove that presumption we need to identify a
request/context-switch we know must occur after our completion. Any
request queued after the signal must imply a context switch, for
simplicity we use a fresh request from the kernel context.

The sequence of operations for keeping the context pinned until saved is:

 - On context activation, we preallocate a node for each physical engine
   the context may operate on. This is to avoid allocations during
   unpinning, which may be from inside FS_RECLAIM context (aka the
   shrinker)

 - On context deactivation on retirement of the last active request (which
   is before we know the context has been saved), we add the
   preallocated node onto a barrier list on each engine

 - On engine idling, we emit a switch to kernel context. When this
   switch completes, we know that all previous contexts must have been
   saved, and so on retiring this request we can finally unpin all the
   contexts that were marked as deactivated prior to the switch.
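
The three steps above can be sketched in simplified form as follows. This assumes a single engine and embeds the preallocated node in the context; all names are hypothetical and the real code works with i915_active nodes per physical engine.

```c
#include <assert.h>
#include <stddef.h>

struct ctx_sketch;

/* Node preallocated at activation so that deactivation never allocates. */
struct barrier_node {
	struct ctx_sketch *ctx;
	struct barrier_node *next;
};

struct ctx_sketch {
	int pinned;
	struct barrier_node node;
};

struct engine_sketch {
	struct barrier_node *barriers; /* contexts waiting for the next kernel switch */
};

/* Step 1: on activation, pin the context and prepare its node up front. */
static inline void ctx_activate(struct ctx_sketch *ctx)
{
	ctx->pinned = 1;
	ctx->node.ctx = ctx;
	ctx->node.next = NULL;
}

/* Step 2: on deactivation, queue the node; the context stays pinned. */
static inline void ctx_deactivate(struct engine_sketch *engine,
				  struct ctx_sketch *ctx)
{
	assert(ctx->pinned);
	ctx->node.next = engine->barriers;
	engine->barriers = &ctx->node;
}

/* Step 3: the kernel-context switch has retired, so unpin everything queued. */
static inline int engine_retire_kernel_switch(struct engine_sketch *engine)
{
	int unpinned = 0;

	while (engine->barriers) {
		struct barrier_node *node = engine->barriers;

		engine->barriers = node->next;
		node->next = NULL;
		node->ctx->pinned = 0;
		unpinned++;
	}
	return unpinned;
}
```

The point of the preallocation is visible in ctx_deactivate(): it only links an existing node, so it stays safe to call from a reclaim path where allocating memory is not allowed.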

We can enhance this in future by flushing all the idle contexts on a
regular heartbeat pulse of a switch to kernel context, which will also
be used to check for hung engines.

v2: intel_context_active_acquire/_release

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
2019-06-14 19:03:32 +01:00
Daniele Ceraolo Spurio
c447ff7db3 drm/i915: update with_intel_runtime_pm to use the rpm structure
Matching the underlying get/put functions.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio
d858d5695f drm/i915: update rpm_get/put to use the rpm structure
The functions were internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio
69c6635544 drm/i915: move a few more functions to accept the rpm structure
Focusing on the functions called in few places.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-6-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio
d5b6c275d0 drm/i915: prefer i915_runtime_pm in intel_runtime function
As a first step towards updating the code to work on the runtime_pm
structure instead of i915, rework all the internals to use and pass
around that.

v2: add comment for kdev (Jani), move rpm init after pdev init for
mock_device

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-2-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson
84383d2e8d drm/i915: Refine i915_reset.lock_map
We already use a mutex to serialise i915_reset() and wedging, so all we
need is to link that into i915_request_wait() and we have our lock cycle
detection.

v2.5: Take error mutex for selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614071023.17929-3-chris@chris-wilson.co.uk
2019-06-14 15:17:54 +01:00
Chris Wilson
ecab9be174 drm/i915: Combine unbound/bound list tracking for objects
With async binding, we don't want to manage a bound/unbound list as we
may end up running before we even acquire the pages. All that is
required is keeping track of shrinkable objects, so reduce it to the
minimum list.

Fixes: 6951e5893b ("drm/i915: Move GEM object domain management from struct_mutex to local")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
2019-06-12 13:36:43 +01:00
Chris Wilson
33df8a7697 drm/i915: Prevent lock-cycles between GPU waits and GPU resets
We cannot allow ourselves to wait on the GPU while holding any lock as we
may need to reset the GPU. As there is no explicit lock between
the two operations, lockdep cannot detect the dependency. So let's tell
lockdep about the wait/reset dependency with an explicit lockmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612085246.16374-1-chris@chris-wilson.co.uk
2019-06-12 12:06:11 +01:00
Chris Wilson
ab53497b57 drm/i915: Rename i915_hw_ppgtt to i915_ppgtt
Keeping the _hw_ in there does not help to distinguish it from its
only brethren i915_ggtt, so drop it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-2-chris@chris-wilson.co.uk
2019-06-11 11:44:32 +01:00
Chris Wilson
e568ac3874 drm/i915: Pull kref into i915_address_space
Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.
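
A minimal sketch of the idea, assuming a plain integer count in place of
the kernel's struct kref (which is atomic). The names struct
address_space, struct ppgtt, vm_init(), vm_get(), vm_put() and
ppgtt_release() are hypothetical and only illustrate a reference count
living in the common base struct:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel uses struct kref, which is atomic. */
struct address_space {
	int refcount;
	void (*release)(struct address_space *vm);
};

struct ppgtt {
	struct address_space vm; /* base first, so the pointers are interchangeable */
	int released;
};

void vm_init(struct address_space *vm, void (*release)(struct address_space *))
{
	vm->refcount = 1;
	vm->release = release;
}

/* Callers only need the abstract base to take a reference. */
struct address_space *vm_get(struct address_space *vm)
{
	vm->refcount++;
	return vm;
}

/* Returns 1 if this put dropped the last reference. */
int vm_put(struct address_space *vm)
{
	assert(vm->refcount > 0);
	if (--vm->refcount)
		return 0;
	vm->release(vm);
	return 1;
}

void ppgtt_release(struct address_space *vm)
{
	struct ppgtt *ppgtt = (struct ppgtt *)vm; /* valid: vm is the first member */

	ppgtt->released = 1;
}
```

With the count in the base, a holder of an abstract vm pointer does not
need to know which derived type it refers to.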

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk
2019-06-11 11:44:24 +01:00
Chris Wilson
155ab8836c drm/i915: Move object close under its own lock
Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-06-06 12:51:13 +01:00
Thomas Gleixner
55716d2643 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428
Based on 1 normalized pattern(s):

  this file is released under the gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 68 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Matthew Auld
6501aa4e3a drm/i915: add in-kernel blitter client
The plan is to use the blitter engine for async object clearing when
using local memory, but before we can move the worker to get_pages() we
have to first tame some more of our struct_mutex usage. With this in
mind we should be able to upstream the object clearing as some
selftests, which should serve as a guinea pig for the ongoing locking
rework and upcoming async get_pages() framework.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529123108.24422-2-matthew.auld@intel.com
2019-05-30 12:01:44 +01:00
Chris Wilson
c017cf6b1a drm/i915: Drop the deferred active reference
An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always taking and releasing an
active reference on first use / last use greatly simplifies the locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
6951e5893b drm/i915: Move GEM object domain management from struct_mutex to local
Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
37d63f8fdb drm/i915: Pull scatterlist utils out of i915_gem.h
Our scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-9-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
10be98a77c drm/i915: Move more GEM objects under gem/
Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
f0e4a06397 drm/i915: Move GEM domain management to its own file
Continuing the decluttering of i915_gem.c, that of the read/write
domains, perhaps the biggest of GEM's follies?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
b414fcd5be drm/i915: Move mmap and friends to its own file
Continuing the decluttering of i915_gem.c, now it is the turn of do_mmap
and the fault handlers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-6-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
f033428db2 drm/i915: Move phys objects to its own file
Continuing the decluttering of i915_gem.c, this time the legacy physical
object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-5-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Chris Wilson
8475355f7a drm/i915: Move shmem object setup to its own file
Split the plain old shmem object into its own file to start decluttering
i915_gem.c

v2: Lose the confusing, hysterical raisins, suffix of _gtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-4-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00
Michal Wajdeczko
f6470c9bcc drm/i915/selftests: Split igt_atomic_reset testcase
Split igt_atomic_reset selftests into separate full & engines parts,
so we can move the former to the dedicated reset selftests file.

While here, change the engines test to loop first over atomic phases and
then loop over available engines.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-3-michal.wajdeczko@intel.com
2019-05-23 21:53:26 +01:00
Michal Wajdeczko
932309fb03 drm/i915/selftests: Move some reset testcases to separate file
igt_global_reset and igt_wedged_reset testcases are first candidates.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-2-michal.wajdeczko@intel.com
2019-05-23 21:52:26 +01:00
Chris Wilson
ee1136908e drm/i915/execlists: Virtual engine bonding
Some users require that when a master batch is executed on one particular
engine, a companion batch is run simultaneously on a specific slave
engine. For this purpose, we introduce virtual engine bonding, allowing
maps of master:slaves to be constructed to constrain which physical
engines a virtual engine may select given a fence on a master engine.
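
As a rough illustration of such a master:slaves map (a sketch only;
struct bond, engine_mask_t as defined here and bonded_mask() are
hypothetical names, and the real lookup is not shown in this log):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int engine_mask_t; /* one bit per physical engine */

struct bond {
	int master;           /* index of the master engine */
	engine_mask_t slaves; /* siblings allowed once that master is chosen */
};

/*
 * Restrict the set of physical engines a virtual engine may pick,
 * given which engine the master request ran on.
 */
engine_mask_t bonded_mask(const struct bond *bonds, size_t num_bonds,
			  int master, engine_mask_t siblings)
{
	size_t i;

	assert(bonds || !num_bonds);
	for (i = 0; i < num_bonds; i++) {
		if (bonds[i].master == master)
			return siblings & bonds[i].slaves;
	}
	return siblings; /* no bond for this master: unconstrained */
}
```

In this model a master without an entry leaves the sibling set untouched,
which is an assumption of the sketch rather than a statement about the
uapi.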

For the moment, we continue to ignore the issue of preemption deferring
the master request for later. Ideally, we would like to then also remove
the slave and run something else rather than have it stall the pipeline.
With load balancing, we should be able to move workload around it, but
there is a similar stall on the master pipeline while it may wait for
the slave to be executed. At the cost of more latency for the bonded
request, it may be interesting to launch both on their engines in
lockstep. (Bubbles abound.)

Opens: Also what about bonding an engine as its own master? It doesn't
break anything internally, so allow the silliness.

v2: Emancipate the bonds
v3: Couple in delayed scheduling for the selftests
v4: Handle invalid mutually exclusive bonding
v5: Mention what the uapi does
v6: s/nbond/num_bonds/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
2019-05-22 08:40:46 +01:00
Chris Wilson
9981927cc9 drm/i915: Bump signaler priority on adding a waiter
The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engines due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission, however this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)

So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So instead of at unsubmit, if we apply that
implicit promotion on adding the dependency, we avoid the semaphore
deadlock and we also reduce the gains made by the promotion for user
space waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).

v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.

Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190515130052.4475-3-chris@chris-wilson.co.uk
(cherry picked from commit 6e7eb7a807)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-05-20 18:28:04 +03:00
Ville Syrjälä
bb211c3d0c drm/i915/selftests: Add live vma selftest
Add a live selftest to exercise rotated/remapped vmas. We simply
write through the rotated/remapped vma, and confirm that the data
appears in the right page when read through the normal vma.

Not sure what the fallout of making all rotated/remapped vmas
mappable/fenceable would be, hence I just hacked it in the test.

v2: Grab rpm reference (Chris)
    GEM_BUG_ON(view.type not as expected) (Chris)
    Allow CAN_FENCE for rotated/remapped vmas (Chris)
    Update intel_plane_uses_fence() to ask for a fence
    only for normal vmas on gen4+
v3: Deal with intel_wakeref_t
v4: Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Ville Syrjälä
e2e394bffa drm/i915/selftests: Add mock selftest for remapped vmas
Extend the rotated vma mock selftest to cover remapped vmas as
well.

TODO: reindent the loops I guess? Left like this for now to
ease review

v2: Include the vma type in the error message (Chris)
v3: Deal with trimmed sg
v4: Drop leftover debugs

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Ville Syrjälä
1a74fc0b3f drm/i915: Add a new "remapped" gtt_view
To overcome display engine stride limits we'll want to remap the
pages in the GTT. To that end we need a new gtt_view type which
is just like the "rotated" type except not rotated.

v2: Use intel_remapped_plane_info base type
    s/unused/unused_mbz/ (Chris)
    Separate BUILD_BUG_ON()s (Chris)
    Use I915_GTT_PAGE_SIZE (Chris)
v3: Use i915_gem_object_get_dma_address() (Chris)
    Trim the sg (Tvrtko)
v4: Actually trim this time. Limit the max length
    to one row of pages to keep things simple

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20 18:04:47 +03:00
Chris Wilson
1830374e13 drm/i915: Cancel retire_worker on parking
Replace the racy continuation check within retire_work with a definite
kill-switch on idling. The race was being exposed by gem_concurrent_blit
where the retire_worker would be terminated too early leaving us
spinning in debugfs/i915_drop_caches with nothing flushing the
retirement queue.

Although the fact that the igt is trying to idle from one child while
submitting from another may be a contributing factor as to why it runs so
slowly...

v2: Use the non-sync version of cancel_delayed_work(), we only need to
stop it from being scheduled as we independently check whether now is
the right time to be parking.

Testcase: igt/gem_concurrent_blit
Fixes: 79ffac8599 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507121108.18377-3-chris@chris-wilson.co.uk
2019-05-07 17:40:19 +01:00
Chris Wilson
ae2306315f drm/i915: Remove delay for idle_work
The original intent for the delay before running the idle_work was to
provide a hysteresis to avoid ping-ponging the device runtime-pm. Since
then we have also pulled in some memory management and general device
management for parking. But with the inversion of the wakeref handling,
GEM is no longer responsible for the wakeref and by the time we call the
idle_work, the device is asleep. It seems appropriate now to drop the
delay and just run the worker immediately to flush the cached GEM state
before sleeping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190507121108.18377-2-chris@chris-wilson.co.uk
2019-05-07 17:40:19 +01:00
Chris Wilson
62c8e42345 drm/i915: Skip unused contexts for context_barrier_task()
If the context has not been used yet, it needs no barrier, and in the
process fix up the selftest in mock_contexts.

Testcase: igt/gem_ctx_clone/vm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190429090735.326-1-chris@chris-wilson.co.uk
2019-04-29 11:12:37 +01:00
Chris Wilson
46472b3efb drm/i915: Move i915_request_alloc into selftests/
Having transitioned GEM over to using intel_context as its primary means
of tracking the GEM context and engine combined and using
i915_request_create(), we can move the older i915_request_alloc()
helper function into selftests/ where the remaining users are confined.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-9-chris@chris-wilson.co.uk
2019-04-26 18:32:20 +01:00
Chris Wilson
0268444607 drm/i915: Remove intel_context.active_link
We no longer need to track the active intel_contexts within each engine,
allowing us to drop a tricky mutex_lock from inside unpin (which may
occur inside fs_reclaim).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-8-chris@chris-wilson.co.uk
2019-04-26 18:32:17 +01:00
Chris Wilson
5e2a0419ef drm/i915: Switch back to an array of logical per-engine HW contexts
We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.
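
The difference between keying on the engine and indexing a per-context
array can be sketched as below. The types and the lookup_engine() helper
are hypothetical simplifications, and MAX_ENGINES is an arbitrary bound
chosen for the example:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENGINES 8 /* arbitrary for this sketch */

struct hw_context {
	int engine_class; /* which kind of engine this slot targets */
	int instance;     /* which physical instance of that class */
};

struct gem_context {
	unsigned int num_engines;
	struct hw_context engines[MAX_ENGINES];
};

/*
 * Lookup by slot index rather than by engine: two slots may name the
 * same physical engine and still be distinct logical contexts.
 */
struct hw_context *lookup_engine(struct gem_context *ctx, unsigned int idx)
{
	assert(ctx->num_engines <= MAX_ENGINES);
	if (idx >= ctx->num_engines)
		return NULL;
	return &ctx->engines[idx];
}
```

Because the key is the slot index, the same (class, instance) pair can
appear more than once, which a lookup keyed on the engine cannot express.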

v2: Add for_each_gem_engine() helper to iterate within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
2019-04-26 18:32:11 +01:00
Chris Wilson
11334c6aad drm/i915: Split engine setup/init into two phases
In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk
2019-04-26 18:32:07 +01:00
Chris Wilson
6b736de574 drm/i915: Pass intel_context to intel_context_pin_lock()
Move the intel_context_instance() to the caller so that we can decouple
ourselves from one context instance per engine.

v2: Rename pin_lock() to lock_pinned(), hopefully that is clearer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-5-chris@chris-wilson.co.uk
2019-04-26 18:32:05 +01:00
Chris Wilson
1b1ae40721 drm/i915/selftests: Pass around intel_context for sseu
Combine the (i915_gem_context, intel_engine) into a single parameter,
the intel_context for convenience and later simplification.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-4-chris@chris-wilson.co.uk
2019-04-26 18:32:04 +01:00
Chris Wilson
f7f28de7e5 drm/i915/selftests: Use the real kernel context for sseu isolation tests
Simplify the setup slightly for the sseu selftests to use the actual
kernel_context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-3-chris@chris-wilson.co.uk
2019-04-26 18:32:03 +01:00
Chris Wilson
79ffac8599 drm/i915: Invert the GEM wakeref hierarchy
In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
power management control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.
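
The nesting described here (request on engine, engine on chipset) can be
modelled with a small chained reference count. This is a hedged sketch:
struct wakeref and its two helpers are hypothetical, non-atomic and
unlocked, and the wakeups/parks counters exist only to make the
transitions observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a wakeref that wakes its parent on first use. */
struct wakeref {
	int count;
	struct wakeref *parent; /* e.g. engine -> GT; NULL at the top */
	int wakeups;            /* number of 0 -> 1 transitions */
	int parks;              /* number of 1 -> 0 transitions */
};

void wakeref_get(struct wakeref *wf)
{
	if (wf->count++ == 0) {
		/* First user: the parent must be awake before we are. */
		if (wf->parent)
			wakeref_get(wf->parent);
		wf->wakeups++;
	}
}

void wakeref_put(struct wakeref *wf)
{
	assert(wf->count > 0);
	if (--wf->count == 0) {
		/* Last user: park ourselves, then let the parent go idle. */
		wf->parks++;
		if (wf->parent)
			wakeref_put(wf->parent);
	}
}
```

Each request would take one reference on its engine; the parent only sees
one reference per awake engine, so per-engine idle transitions become
visible without any global bookkeeping.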

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
2019-04-24 22:26:49 +01:00
Chris Wilson
2ccdf6a1c3 drm/i915: Pass intel_context to i915_request_create()
Start acquiring the logical intel_context and using that as our primary
means for request allocation. This is the initial step to allow us to
avoid requiring struct_mutex for request allocation along the
perma-pinned kernel context, but it also provides a foundation for
breaking up the complex request allocation to handle different scenarios
inside execbuf.

For the purpose of emitting a request from inside retirement (see the
next patch for engine power management), we also need to lift control
over the timeline mutex to the caller.

v2: Note that the request carries the active reference upon construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
2019-04-24 22:25:35 +01:00
Chris Wilson
23c3c3d04f drm/i915: Pull the GEM powermangement coupling into its own file
Split out the powermanagement portion (GT wakeref, suspend/resume) of
GEM from i915_gem.c into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-2-chris@chris-wilson.co.uk
2019-04-24 22:25:28 +01:00
Chris Wilson
112ed2d31a drm/i915: Move GraphicsTechnology files under gt/
Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-04-24 21:01:46 +01:00
Chris Wilson
86554f48e5 drm/i915/selftests: Verify whitelist of context registers
The RING_NONPRIV allows us to add registers to a whitelist that allows
userspace to modify them. Ideally such registers should be safe and
saved within the context such that they do not impact system behaviour
for other users. This selftest verifies that those registers we do add
are (a) then writable by userspace and (b) only affect a single client.

Opens:
- Is GEN9_SLICE_COMMON_ECO_CHICKEN1 really write-only?

v2: Remove the blatant copy-paste.
v3: Emulate userspace register writes via the batch again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424110941.9869-1-chris@chris-wilson.co.uk
2019-04-24 18:37:36 +01:00
Chris Wilson
09407579ab drm/i915: Store the default sseu setup on the engine
As we push for better compartmentalisation, it is more convenient to
copy the default sseu configuration from the engine into the derived
logical context, than it is to dig it out from i915->runtime_info.

v2: Use intel_sseu_from_device_info() to describe the converter

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424095134.30249-1-chris@chris-wilson.co.uk
2019-04-24 16:37:20 +01:00
Fernando Pacheco
f3c2b76ef2 drm/i915/selftests: Check that gpu reset is usable from atomic context
GPU reset is now available with GuC enabled, so re-enable our check that
this reset is usable from atomic context.

Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419230015.18121-6-fernando.pacheco@intel.com
2019-04-20 08:20:08 +01:00
Chris Wilson
254e11864a drm/i915: Verify the engine workarounds stick on application
Read the engine workarounds back using the GPU after loading the initial
context state to verify that we are setting them correctly, and bail if
it fails.
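
The read-back check can be pictured along these lines. It is a sketch
under stated assumptions: the register file is a plain array standing in
for values read back by the GPU, and struct wa and wa_verify() are
hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical workaround entry: which bits of which register must hold. */
struct wa {
	uint32_t reg_index; /* index into the mock register array */
	uint32_t mask;      /* bits the workaround owns */
	uint32_t value;     /* expected state of those bits */
};

/* Returns the number of workarounds whose masked bits did not stick. */
int wa_verify(const uint32_t *regs, const struct wa *list, size_t count)
{
	int failed = 0;
	size_t i;

	assert(regs && list);
	for (i = 0; i < count; i++) {
		uint32_t cur = regs[list[i].reg_index];

		if ((cur & list[i].mask) != (list[i].value & list[i].mask))
			failed++;
	}
	return failed;
}
```

A non-zero return would correspond to the "bail if it fails" case in the
message; bits outside the mask are deliberately ignored.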

v2: Break out the verification into its own loop

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417075657.19456-3-chris@chris-wilson.co.uk
2019-04-17 10:58:20 +01:00
Chris Wilson
1ab494cc40 drm/i915/selftests: Skip live timeline/suspend tests if wedged
If the driver is wedged, we can not issue the requests to exercise the
timelines or the system across suspend, so skip the tests. live_hangcheck
is there to fail if we cannot recover.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190413125820.14112-4-chris@chris-wilson.co.uk
2019-04-15 11:58:19 +01:00
Chris Wilson
5d75dc2b08 drm/i915: Teach intel_workarounds to use uncore mmio access
Start weaning ourselves off the implicit I915_WRITE macro madness and
start using the explicit intel_uncore mmio access.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190412202458.10653-1-chris@chris-wilson.co.uk
2019-04-13 07:46:43 +01:00
Chris Wilson
6484775766 drm/i915/selftests: Mark live_forcewake_ops as unreliable
A couple of machines in the farm show quite frequent errors in the
powerwells not being released. Either there is an external agent
interfering with the powerwells, or the powerwell doesn't quite behave
as we anticipate -- either way, the test is not reliable enough to be
enabled by default in CI. It has served its immediate purpose in
providing coverage as we made tweaks to forcewake, so keep it available
for future testing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110210
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190407192649.14750-1-chris@chris-wilson.co.uk
2019-04-08 19:15:05 +01:00
Chris Wilson
de220cc219 drm/i915: Consolidate the timeline->barrier
The timeline is strictly ordered, so by inserting the timeline->barrier
request into the timeline->last_request it naturally provides the same
barrier. Consolidate the pair of barriers into one as they serve the
same purpose.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190408091728.20207-4-chris@chris-wilson.co.uk
2019-04-08 17:04:12 +01:00
Chris Wilson
e57ce4b193 drm/i915/selftests: Fix plain use of integer 0 as NULL
Squelch a sparse warning:
drivers/gpu/drm/i915/gt/selftest_lrc.c:119:54: warning: Using plain integer as NULL pointer

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190405111430.18495-1-chris@chris-wilson.co.uk
2019-04-05 13:28:43 +01:00
Chris Wilson
bac24f59f4 drm/i915/execlists: Enable coarse preemption boundaries for gen8
When we introduced preemption, we chose to keep it disabled for gen8 as
supporting preemption inside GPGPU user batches required various w/a in
userspace. Since then, the desire to preempt long queues of requests
between batches (e.g. within busywaiting semaphores) has grown. So allow
arbitration within the busywaits and between requests, but disable
arbitration within user batches so that we can preempt between requests
and not risk breaking GPGPU.

However, since this preemption is much coarser and doesn't interfere
with userspace, we decline to include it amongst the scheduler
capabilities. (This is also required for us to skip over the preemption
selftests that expect to be able to preempt user batches.)

Michal suggested that we could perhaps allow preemption inside gen8
userspace batches if we can satisfy ourselves that the default
preemption settings are viable with existing userspace (principally
OpenCL which already should carry any known workaround). We could then
merge the two code paths back into one, even dropping the artificial
has-preemption device feature flag.

Testcase: igt/gem_exec_scheduler/semaphore-user
References: beecec9017 ("drm/i915/execlists: Preemption!")
Fixes: e886196469 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michal Winiarski <michal.winiarski@intel.com> #irc
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190329134024.5254-1-chris@chris-wilson.co.uk
2019-04-05 11:00:28 +01:00
Chris Wilson
3a891a6267 drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
We want to use intel_engine_mask_t inside i915_request.h, which means
extracting it from the general header file mess and placing it inside a
types.h. A knock on effect is that the compiler wants to warn about
type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
for the worst.

v2: Use intel_engine_mask_t consistently
v3: Move I915_NUM_ENGINES to its natural home at the end of the enum

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-04-02 15:09:08 +01:00
Daniele Ceraolo Spurio
4319382e9b drm/i915: switch intel_uncore_forcewake_for_reg to intel_uncore
The intel_uncore structure is the owner of FW, so subclass the
function to it.

While at it, use a local uncore var and switch to the new read/write
functions where it makes sense.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-7-daniele.ceraolospurio@intel.com
2019-03-26 20:16:31 +00:00
Daniele Ceraolo Spurio
a2b4abfc62 drm/i915: switch uncore mmio funcs to use intel_uncore
The full read/write ops can now work on the intel_uncore struct.
Introduce intel_uncore_read/write functions working on intel_uncore
and switch the I915_READ/WRITE macro to internally call those.

v2: no change
v3: add intel_uncore_read/write functions (Chris), update commit msg

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-6-daniele.ceraolospurio@intel.com
2019-03-26 20:16:13 +00:00
Daniele Ceraolo Spurio
2cf7bf6f2f drm/i915: add uncore flags for unclaimed mmio
Save the HW capabilities to avoid having to jump back to dev_priv
every time.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-4-daniele.ceraolospurio@intel.com
2019-03-26 19:30:59 +00:00
Dan Carpenter
602cbe8efc drm/i915/selftests: Fix an IS_ERR() vs NULL check
The live_context() function returns error pointers.  It never returns
NULL.

Fixes: 9c1477e83e ("drm/i915/selftests: Exercise adding requests to a full GGTT")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190326050843.GA20038@kadam
2019-03-26 14:53:01 +00:00
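The difference between the two checks can be sketched in userspace C. The ERR_PTR(), IS_ERR() and PTR_ERR() helpers below are simplified stand-ins for the kernel's error-pointer helpers, and make_context() is a hypothetical constructor used only for illustration, not live_context() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct context { int id; };

/* Hypothetical constructor: reports failure as an error pointer, never NULL. */
static struct context *make_context(int fail)
{
	static struct context ctx = { .id = 1 };

	return fail ? ERR_PTR(-ENOMEM) : &ctx;
}

/* Returns 0 on success or a negative errno; a NULL check would miss the failure. */
static int use_context(int fail)
{
	struct context *ctx = make_context(fail);

	if (IS_ERR(ctx))
		return (int)PTR_ERR(ctx);

	assert(ctx != NULL);
	return 0;
}
```

With this convention an "if (!ctx)" test never fires on failure, because the error pointer is non-NULL; that is the class of bug this commit addresses.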
Chris Wilson
ea593dbba4 drm/i915: Allow contexts to share a single timeline across all engines
Previously, our view has been always to run the engines independently
within a context. (Multiple engines happened before we had contexts and
timelines, so they always operated independently and that behaviour
persisted into contexts.) However, at the user level the context often
represents a single timeline (e.g. GL contexts) and userspace must
ensure that the individual engines are serialised to present that
ordering to the client (or forget about this detail entirely and hope no
one notices - a fair ploy if the client can only directly control one
engine themselves ;)

In the next patch, we will want to construct a set of engines that
operate as one, that have a single timeline interwoven between them, to
present a single virtual engine to the user. (They submit to the virtual
engine, then we decide which engine to execute on.)

To that end, we want to be able to create contexts which have a single
timeline (fence context) shared between all engines, rather than multiple
timelines.

v2: Move the specialised timeline ordering to its own function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
2019-03-22 13:12:38 +00:00
Chris Wilson
e0695db729 drm/i915: Create/destroy VM (ppGTT) for use with contexts
In preparation for making the ppGTT binding for a context explicit (to
facilitate reusing the same ppGTT between different contexts), allow the
user to create and destroy named ppGTT.

v2: Replace global barrier for swapping over the ppgtt and tlbs with a
local context barrier (Tvrtko)
v3: serialise with struct_mutex; it's lazy but required dammit
v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
similar, turned out different!)

v5: Fix up test unwind for aliasing-ppgtt (snb)
v6: Tighten language for uapi struct drm_i915_gem_vm_control.
v7: Patch the context image for runtime ppgtt switching!

Testcase: igt/gem_vm_create
Testcase: igt/gem_ctx_param/vm
Testcase: igt/gem_ctx_clone/vm
Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-2-chris@chris-wilson.co.uk
2019-03-22 13:12:32 +00:00
Chris Wilson
e70d3d8040 drm/i915/selftests: Mark up preemption tests for hang detection
Use the igt_live_test framework for detecting whether an unwanted hang
occurred during test execution, and report failure if it does.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321194031.20240-2-chris@chris-wilson.co.uk
2019-03-22 07:12:12 +00:00
Chris Wilson
d067994cc4 drm/i915/selftests: Calculate maximum ring size for preemption chain
32 is too many for the likes of kbl, and in order to insert that many
requests into the ring requires us to declare the first few hung --
understandably a slow and unexpected process. Instead, measure the size
of a single request and use that to estimate the upper bound on the
chain length we can use for our test, remembering to flush the previous
chain between tests for safety.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: "Yokoyama, Caz" <caz.yokoyama@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321194031.20240-1-chris@chris-wilson.co.uk
2019-03-22 07:12:11 +00:00
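The estimate described above reduces to a division of the usable ring space by the measured size of one request. A minimal sketch, assuming sizes in bytes; max_chain_length() and the reserved headroom parameter are illustrative and not taken from the selftest.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate how many requests fit in a ring of ring_size bytes when each
 * request occupies request_size bytes. The reserved headroom is a
 * hypothetical parameter standing in for space that must stay free.
 */
static size_t max_chain_length(size_t ring_size, size_t request_size, size_t reserved)
{
	assert(request_size > 0);

	if (ring_size <= reserved)
		return 0;

	return (ring_size - reserved) / request_size;
}
```

For example, a 4096 byte ring with 256 bytes held back and 128 byte requests gives an upper bound of 30 requests in the chain.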
Chris Wilson
a679f58d05 drm/i915: Flush pages on acquisition
When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we need to always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
have used set-domain. Set-domain is overkill for the purpose as it is a
general synchronisation barrier, but our intent is to only flush the
pages being swapped in. If we move that flush into the pages acquisition
phase, we know then that when we have obj->mm.pages, they are coherent
with the GPU and need only maintain that status without resorting to
heavy handed use of set-domain.

The principal knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredictable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaks into the uABI
that the first pagefault was synchronous. This is unintentional and
barring a few igt should be unnoticed; nevertheless we bump the uABI
version for mmap-gtt to reflect the change in behaviour.

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() would be a minor
operation that merely checked whether we could allocate all pages for
the object. On applying this change, a set-domain(CPU) causes a clflush
as we acquire the pages. This will have a small impact on mesa as we move
the clflush here on !llc from execbuf time to create, but that should
have minimal performance impact as the same clflush exists but is now
done early and because of the clflush issue, userspace recycles bo and
so should resist allocating fresh objects.

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected; but easy enough to catch and apply a manual
flush.

For the future, we should push the page flush from the central
set_pages() into the callers so that we can more finely control when it
is applied, but for now doing it in one location is easier to validate, at
the cost of sometimes flushing when there is no need.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
2019-03-21 17:28:12 +00:00
Chris Wilson
4daffb664a drm/i915: Stop storing the context name as the timeline name
The timeline->name is only used for convenience in pretty printing the
i915_request.fence->ops->get_timeline_name() and it is just as
convenient to pull it from the gem_context directly. The few instances
of its use inside GEM_TRACE() have proven more of a nuisance than
helpful, so not worth saving imo.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-4-chris@chris-wilson.co.uk
2019-03-21 15:59:31 +00:00
Chris Wilson
3e05531243 drm/i915: Stop storing ctx->user_handle
The user_handle need only be known by userspace for it to lookup the
context via the idr; internally we have no use for it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-3-chris@chris-wilson.co.uk
2019-03-21 15:59:29 +00:00
Chris Wilson
3aa9945a52 drm/i915: Separate GEM context construction and registration to userspace
In later patches, it became apparent that userspace can see a partially
constructed GEM context and begin using it before it was ready, to much
hilarity. Close this window of opportunity by lifting the registration of
the context with userspace (the insertion of the context into the filp's
idr) to the very end of the CONTEXT_CREATE ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321140711.11190-1-chris@chris-wilson.co.uk
2019-03-21 15:59:25 +00:00
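The ordering this commit enforces (finish construction first, publish the handle last) can be sketched as follows. The fixed-size handle_table stands in for the filp's idr, and all names here are hypothetical; the sketch ignores locking.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_HANDLES 8

struct gem_context_sketch {
	int ready;   /* set once construction has finished */
};

/* Stands in for the per-file idr that userspace handles are looked up in. */
static struct gem_context_sketch *handle_table[MAX_HANDLES];

/* Build the context completely; nothing is visible to lookups yet. */
static struct gem_context_sketch *context_construct(void)
{
	struct gem_context_sketch *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		ctx->ready = 1;
	return ctx;
}

/* Publish as the very last step; returns the handle or -1 if the table is full. */
static int context_register(struct gem_context_sketch *ctx)
{
	assert(ctx && ctx->ready);

	for (int i = 0; i < MAX_HANDLES; i++) {
		if (!handle_table[i]) {
			handle_table[i] = ctx;
			return i;
		}
	}
	return -1;
}

/* Lookups can only ever observe fully constructed contexts. */
static struct gem_context_sketch *context_lookup(int handle)
{
	if (handle < 0 || handle >= MAX_HANDLES)
		return NULL;
	return handle_table[handle];
}
```

Because context_register() is the only path into the table and it asserts ctx->ready, a lookup can never return a half-built context.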
Dan Carpenter
401f147b16 drm/i915/selftests: fix NULL vs IS_ERR() check in mock_context_barrier()
The mock_context() function returns NULL on error, it doesn't return
error pointers.

Fixes: 85fddf0b00 ("drm/i915: Introduce a context barrier callback")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190321092451.GK2202@kadam
2019-03-21 13:30:16 +00:00
Daniele Ceraolo Spurio
25286aaca9 drm/i915: move regs pointer inside the uncore structure
This will allow further simplifications in the uncore handling.

v2: move register access setup under uncore (Chris)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-8-daniele.ceraolospurio@intel.com
2019-03-20 21:12:50 +00:00
Daniele Ceraolo Spurio
f7de50278e drm/i915: make more uncore function work on intel_uncore
Move the init, fini, prune, suspend, resume functions to work on
intel_uncore instead of dev_priv.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-5-daniele.ceraolospurio@intel.com
2019-03-20 21:12:42 +00:00
Daniele Ceraolo Spurio
3ceea6a1b4 drm/i915: use intel_uncore for all forcewake get/put
Now that the internal code all works on intel_uncore, flip the
external-facing interface.

v2: fix GVT.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com
2019-03-20 21:12:31 +00:00
Daniele Ceraolo Spurio
f568eeee53 drm/i915: use intel_uncore in fw get/put internal paths
Get/put functions used outside of uncore.c are updated in the next
patch for a nicer split.

v2: use dev_priv where we still have it (Paulo)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-3-daniele.ceraolospurio@intel.com
2019-03-20 21:12:26 +00:00
Andy Shevchenko
6e514e3717 drm/i915: Switch to bitmap_zalloc()
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that, it returns a pointer of bitmap type instead of an opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304092908.57382-2-andriy.shevchenko@linux.intel.com
2019-03-20 17:50:35 +00:00
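What a zeroed bitmap allocation amounts to can be sketched in userspace C. bitmap_zalloc_sketch() is a hypothetical stand-in built on calloc(); it takes no allocation flags and is not the kernel helper itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Allocate a zeroed bitmap able to hold nbits bits; release it with free(). */
static unsigned long *bitmap_zalloc_sketch(size_t nbits)
{
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

static void bitmap_set_bit(unsigned long *map, size_t bit)
{
	assert(map);
	map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

static int bitmap_test_bit(const unsigned long *map, size_t bit)
{
	assert(map);
	return (int)((map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}
```

The typed unsigned long * return value, rather than void *, is the readability gain the commit message refers to.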
Daniele Ceraolo Spurio
fd79d93985 drm/i915/selftests: add test to verify get/put fw domains
Exercise acquiring and releasing forcewake around register reads. In
order to read a register behind a GT powerwell, we need to instruct that
powerwell to wake up using a forcewake. When we no longer require the GT
powerwell, we tell the GT to release our forcewake. Inside the
forcewake, the register read should work but outside it should just
return garbage, 0 being the most common garbage. Thus we can detect when
we are inside and outside of the forcewake with just a simple register
read, and so can verify that the GT powerwell is released when we say
so.

v2: Picking the right forcewaked register to return 0 outside of
forcewake is an art.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190320080052.27273-1-chris@chris-wilson.co.uk
2019-03-20 11:32:13 +00:00
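The detection trick described above (a read returns the real value inside forcewake and 0 outside) can be modelled with a fake device. Everything here is a simulation under stated assumptions: struct fake_gt and its helpers are hypothetical, and the register is assumed to hold a non-zero value so that 0 is distinguishable.

```c
#include <assert.h>
#include <stdint.h>

struct fake_gt {
	int forcewake_count;   /* outstanding forcewake references */
	uint32_t reg_value;    /* non-zero value held behind the powerwell */
};

static void fake_forcewake_get(struct fake_gt *gt)
{
	gt->forcewake_count++;
}

static void fake_forcewake_put(struct fake_gt *gt)
{
	assert(gt->forcewake_count > 0);
	gt->forcewake_count--;
}

/* Reads return the real value only while the powerwell is held awake. */
static uint32_t fake_read(const struct fake_gt *gt)
{
	return gt->forcewake_count ? gt->reg_value : 0;
}

/* Returns 1 if the register reads correctly inside forcewake and 0 outside. */
static int check_forcewake(struct fake_gt *gt)
{
	uint32_t inside, outside;

	fake_forcewake_get(gt);
	inside = fake_read(gt);
	fake_forcewake_put(gt);
	outside = fake_read(gt);

	return inside == gt->reg_value && outside == 0;
}
```

On real hardware the "outside" read is not guaranteed to be 0, which is why the v2 note calls picking the right register an art.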
Chris Wilson
d315d4faf8 drm/i915/selftests: Provide stub reset functions
If a test fails, we quite often mark the device as wedged. Provide the
stub functions so that we can wedge the mock device, and avoid exploding
on test failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109981
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190319214233.25498-3-chris@chris-wilson.co.uk
2019-03-20 09:01:12 +00:00
Chris Wilson
4c5896dc4c drm/i915: Hold a reference to the active HW context
For virtual engines, we need to keep the HW context alive while it
remains in use. For regular HW contexts, they are created and kept alive
until the end of the GEM context. For simplicity, generalise the
requirements and keep an active reference to each HW context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190318212347.30146-2-chris@chris-wilson.co.uk
2019-03-19 08:21:13 +00:00
Chris Wilson
206c2f812f drm/i915: Lock the gem_context->active_list while dropping the link
On unpinning the intel_context, we remove it from the active list
inside the GEM context. This list is supposed to be guarded by the GEM
context mutex, so remember to take it!

Fixes: 7e3d9a5941 ("drm/i915: Track active engines within a context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190318212347.30146-1-chris@chris-wilson.co.uk
2019-03-19 08:21:11 +00:00
Chris Wilson
65baf0ef04 drm/i915: Hold a ref to the ring while retiring
As the final request on a ring may hold the reference to this ring (via
retiring the last pinned context), we may find ourselves chasing a
dangling pointer on completion of the list.

A quick solution is to hold a reference to the ring itself as we retire
along it so that we only free it after we stop dereferencing it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190318095204.9913-4-chris@chris-wilson.co.uk
2019-03-18 21:00:28 +00:00
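The quick solution described above can be sketched with a plain reference count. This is a single-threaded illustration; struct ring, ring_get(), ring_put() and the rings_freed counter are hypothetical names and do not reflect the driver's real structures.

```c
#include <assert.h>
#include <stdlib.h>

struct ring {
	int refcount;
	int nrequests;   /* outstanding requests on this ring */
};

static int rings_freed;  /* counts frees so that the sketch can be checked */

static struct ring *ring_get(struct ring *ring)
{
	ring->refcount++;
	return ring;
}

static void ring_put(struct ring *ring)
{
	assert(ring->refcount > 0);
	if (--ring->refcount == 0) {
		rings_freed++;
		free(ring);
	}
}

/* Retire every request; the last one drops the reference held via the context. */
static void ring_retire_requests(struct ring *ring)
{
	ring_get(ring);                  /* keep the ring alive while we walk it */

	while (ring->nrequests > 0) {
		ring->nrequests--;
		if (ring->nrequests == 0)
			ring_put(ring);  /* context unpinned, its reference goes away */
	}

	ring_put(ring);                  /* safe: ring is not dereferenced after this */
}
```

Without the initial ring_get(), the final ring_put() inside the loop would free the ring and the loop condition would then read freed memory.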
Chris Wilson
a9fe9ca44c drm/i915/gtt: Rename i915_vm_is_48b to i915_vm_is_4lvl
Large ppGTT are differentiated by the requirement to go to four levels
to address more than 32b. Given the introduction of more 4 level ppGTT
with different sizes of addressable bits, rename i915_vm_is_48b() to
better reflect the commonality of using 4 levels.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-4-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
51d623b675 drm/i915: Drop address size from ppgtt_type
With the introduction of the separate addressable bits into the device
info, we can remove the conflation of the ppgtt size from the ppgtt
type.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-3-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
cbecbccaa1 drm/i915: Record platform specific ppGTT size in intel_device_info
As the maximum addressable bits is determined by platform, record that
information in our static chipset tables. This has the advantage of
being clearly recorded in our capability dumps for dmesg, debugfs and
error states.

Based on a patch by Bob Paauwe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Bob Paauwe <bob.j.paauwe@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190314223839.28258-2-chris@chris-wilson.co.uk
2019-03-15 09:04:54 +00:00
Chris Wilson
d2eeaf2bc0 drm/i915/selftests: Disable preemption while setting up fence-timers
The impossible happens and a future fence expired while we were still
initialising. The probable cause is that the test was preempted and we
lost our scheduler cpu slice. Disable preemption during this test to
rule out preemption as a source of timer disruption.

References: https://bugs.freedesktop.org/show_bug.cgi?id=110039
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190313205944.5768-1-chris@chris-wilson.co.uk
2019-03-14 11:47:06 +00:00
Chris Wilson
22acf9fc18 drm/i915/selftests: Improve error detection of reset failure
Use a timedwait to promptly detect if the recovery after reset fails and
provide a meaningful debug dump.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190312111146.10662-2-chris@chris-wilson.co.uk
2019-03-12 12:49:30 +00:00
Chris Wilson
85fddf0b00 drm/i915: Introduce a context barrier callback
In the next patch, we will want to update live state within a context.
As this state may be in use by the GPU and we haven't been explicitly
tracking its activity, we instead attach it to a request we send down
the context setup with its new state and on retiring that request
cleanup the old state as we then know that it is no longer live.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190309160250.29324-1-chris@chris-wilson.co.uk
2019-03-09 17:19:54 +00:00
Chris Wilson
0881954965 drm/i915: Introduce intel_context.pin_mutex for pin management
Introduce a mutex to start locking the HW contexts independently of
struct_mutex, with a view to reducing the coarse struct_mutex. The
intel_context.pin_mutex is used to guard the transition to and from being
pinned on the gpu, and so is required before starting to build any
request. The intel_context will then remain pinned until the request
completes, but the mutex can be released immediately upon completion of
pinning the context.

A slight variant of the above is used by per-context sseu that wants to
inspect the pinned status of the context, and requires that it remains
stable (either !pinned or pinned) across its operation. By using the
pin_mutex to serialise operations while pin_count==0, we can take that
pin_mutex to stabilise the boolean pin status.

v2: for Tvrtko!
* Improved commit message.
* Dropped _gpu suffix from gen8_modify_rpcs_gpu.
v3: Repair the locking for sseu selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-7-chris@chris-wilson.co.uk
2019-03-08 14:04:19 +00:00
Chris Wilson
9dbfea98d7 drm/i915: Track the pinned kernel contexts on each engine
Each engine acquires a pin on the kernel contexts (normal and preempt)
so that the logical state is always available on demand. Keep track of
each engine's pin by storing the returned pointer on the engine for quick
access.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-6-chris@chris-wilson.co.uk
2019-03-08 14:00:02 +00:00
Chris Wilson
95f697eb02 drm/i915: Make context pinning part of intel_context_ops
Push the intel_context pin callback down from intel_engine_cs onto the
context itself by virtue of having a central caller for
intel_context_pin() being able to lookup the intel_context itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-5-chris@chris-wilson.co.uk
2019-03-08 13:59:59 +00:00
Chris Wilson
c4d52feb2c drm/i915: Move over to intel_context_lookup()
In preparation for an ever growing number of engines and so ever
increasing static array of HW contexts within the GEM context, move the
array over to an rbtree, allocated upon first use.

Unfortunately, this imposes an rbtree lookup at a few frequent callsites,
but we should be able to mitigate those by moving over to using the HW
context as our primary type and so only incur the lookup on the boundary
with the user GEM context and engines.

v2: Check for no HW context in guc_stage_desc_init

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk
2019-03-08 13:59:52 +00:00
Chris Wilson
4dc84b77b0 drm/i915: Store the intel_context_ops in the intel_engine_cs
If we place a pointer to the engine specific intel_context_ops in the
engine itself, we can assign the ops pointer on initialising the
context, and then rely on it being set. This simplifies the code in
later patches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-3-chris@chris-wilson.co.uk
2019-03-08 13:59:50 +00:00
Chris Wilson
7e3d9a5941 drm/i915: Track active engines within a context
For use in the next patch, if we track which engines have been used by
the HW, we can reduce the work required to flush our state off the HW to
those engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-1-chris@chris-wilson.co.uk
2019-03-08 13:59:41 +00:00
Chris Wilson
c6eeb4797e drm/i915: Reduce presumption of request ordering for barriers
Currently we assume that we know the order in which requests run and so
can determine if we need to reissue a switch-to-kernel-context prior to
idling. That assumption does not hold for the future, so instead of
tracking which barriers have been used, simply determine if we have ever
switched away from the kernel context by using the engine and before
idling ensure that all engines that have been used since the last idle
are synchronously switched back to the kernel context for safety (and
ease of shrinking memory while idle).

v2: Use intel_engine_mask_t and ALL_ENGINES

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-3-chris@chris-wilson.co.uk
2019-03-08 10:57:08 +00:00
Chris Wilson
5861b013e2 drm/i915: Do a synchronous switch-to-kernel-context on idling
When the system idles, we switch to the kernel context as a defensive
measure (no users are harmed if the kernel context is lost). Currently,
we issue a switch to kernel context and then come back later to see if
the kernel context is still current and the system is idle. However,
if we are no longer privy to the runqueue ordering, then we have to
relax our assumptions about the logical state of the GPU and the only
way to ensure that the kernel context is currently loaded is by issuing
a request to run after all others, and wait for it to complete all while
preventing anyone else from issuing their own requests.

v2: Pull wedging into switch_to_kernel_context_sync() but only after
waiting (though only for the same short delay) for the active context to
finish.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-1-chris@chris-wilson.co.uk
2019-03-08 10:57:05 +00:00
Chris Wilson
3123ada8eb drm/i915/selftests: Check preemption support on each engine
Check that we have set up preemption for the engine before testing;
otherwise, warn if it is not enabled on supported HW.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190306142517.22558-28-chris@chris-wilson.co.uk
2019-03-08 09:34:49 +00:00
Chris Wilson
2835f4f36b drm/i915/selftests: Improve switch-to-kernel-context checking
We can reduce the switch-to-kernel-context selftest to operate as a loop
and so trivially test another state transition (that of idle->busy).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190307211947.6954-1-chris@chris-wilson.co.uk
2019-03-07 23:33:35 +00:00
Michał Winiarski
b218a80b17 drm/i915/selftests: Upgrade printing test/subtest name to pr_info
We're using pr_debug for things that we don't really want to see in the
CI log, but we may find useful during test development.
Let's upgrade the test name printer - we do want to see those in CI log.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305144717.10000-1-michal.winiarski@intel.com
2019-03-06 11:36:36 +00:00
Chris Wilson
161996a800 drm/i915/selftests: Fix MI_STORE_DWORD_IMM alignment
MI_STORE_DWORD_IMM wants to write into a dword-aligned (4B) address, we
mistakenly cleared bit2 and not bits 0 and 1.
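
As a rough illustration of the difference (the helper names below are
hypothetical and are not taken from the selftest), clearing bit 2 leaves
the two low bits set, whereas masking with ~3 yields a 4-byte aligned
address:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only, not the selftest's code. */

/* Buggy variant: clears only bit 2, so bits 0 and 1 survive. */
static inline uint64_t align_dword_buggy(uint64_t addr)
{
	return addr & ~(UINT64_C(1) << 2);
}

/* Intended variant: clears bits 0 and 1, giving a dword-aligned address. */
static inline uint64_t align_dword(uint64_t addr)
{
	return addr & ~UINT64_C(3);
}
```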

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190306082447.21563-1-chris@chris-wilson.co.uk
2019-03-06 11:08:32 +00:00
Chris Wilson
8a68d46436 drm/i915: Store the BIT(engine->id) as the engine's mask
In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, let's
store the full bitmask.
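
A minimal sketch of the idea, assuming a simplified engine struct (the
type and function names are stand-ins, not the driver's definitions): a
physical engine stores BIT(id) as its mask, while a virtual engine would
store the union of its siblings' masks.

```c
#include <stdint.h>

/* Stand-in for intel_engine_mask_t; the width here is an assumption. */
typedef uint32_t engine_mask_t;

#define BIT(n) ((engine_mask_t)1 << (n))

/* Simplified, hypothetical engine descriptor. */
struct engine {
	unsigned int id;
	engine_mask_t mask;
};

/* A physical engine keeps the 1:1 mapping between id and mask. */
static inline void physical_engine_init(struct engine *e, unsigned int id)
{
	e->id = id;
	e->mask = BIT(id);
}

/* A virtual engine covers every physical engine it may run on. */
static inline engine_mask_t virtual_engine_mask(const struct engine *siblings,
						unsigned int count)
{
	engine_mask_t mask = 0;
	unsigned int i;

	for (i = 0; i < count; i++)
		mask |= siblings[i].mask;

	return mask;
}
```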

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
2019-03-05 18:19:50 +00:00
Chris Wilson
ebece75392 drm/i915: Keep timeline HWSP allocated until idle across the system
In preparation for enabling HW semaphores, we need to keep in flight
timeline HWSP alive until its use across the entire system has completed,
as any other timeline active on the GPU may still refer back to the
already retired timeline. We both have to delay recycling available
cachelines and unpinning old HWSP until the next idle point.

An easy option would be to simply keep all used HWSP until the system as
a whole was idle, i.e. we could release them all at once on parking.
However, on a busy system, we may never see a global idle point,
essentially meaning the resource will be leaked until we are forced to
do a GC pass. We already employ a fine-grained idle detection mechanism
for vma, which we can reuse here so that each cacheline can be freed
immediately after the last request using it is retired.

v3: Keep track of the activity of each cacheline.
v4: cacheline_free() on canceling the seqno tracking
v5: Finally with a testcase to exercise wraparound
v6: Pack cacheline into empty bits of page-aligned vaddr
v7: Use i915_utils to hide the pointer casting around bit manipulation
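
The packing mentioned in v6 can be sketched roughly as follows. This is
an illustration under assumptions (4 KiB pages, hypothetical helper
names), not the i915_utils helpers themselves: a page-aligned address has
its low 12 bits clear, so a small index such as a cacheline number can
ride along in them.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumes 4 KiB pages */
#define LOW_MASK (((uintptr_t)1 << PAGE_SHIFT) - 1)

/* Store a small value in the unused low bits of a page-aligned pointer. */
static inline uintptr_t pack_bits(void *vaddr, unsigned int bits)
{
	uintptr_t p = (uintptr_t)vaddr;

	assert((p & LOW_MASK) == 0); /* pointer must be page aligned */
	assert(bits <= LOW_MASK);    /* value must fit in the free bits */

	return p | bits;
}

/* Recover the original page-aligned pointer. */
static inline void *unpack_ptr(uintptr_t packed)
{
	return (void *)(packed & ~LOW_MASK);
}

/* Recover the value stored in the low bits. */
static inline unsigned int unpack_bits(uintptr_t packed)
{
	return (unsigned int)(packed & LOW_MASK);
}
```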

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-2-chris@chris-wilson.co.uk
2019-03-01 17:40:33 +00:00
Chris Wilson
34ae8455f4 drm/i915/selftests: Check that whitelisted registers are accessible
There is no point in whitelisting a register that the user then cannot
write to, so check the register exists before merging such patches.

v2: Mark SLICE_COMMON_ECO_CHICKEN1 [731c] as write-only
v3: Use different variables for different meanings!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-6-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20190301160108.19039-1-chris@chris-wilson.co.uk
2019-03-01 16:53:39 +00:00
Chris Wilson
3ef7114982 drm/i915: Introduce i915_timeline.mutex
A simple mutex used for guarding the flow of requests in and out of the
timeline. In the short-term, it will be used only to guard the addition
of requests into the timeline, taken on alloc and released on commit so
that only one caller can construct a request into the timeline
(important as the seqno and ring pointers must be serialised). This will
be used by observers to ensure that the seqno/hwsp is stable. Later,
when we have reduced retiring to only operate on a single timeline at a
time, we can then use the mutex as the sole guard required for retiring.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301110547.14758-2-chris@chris-wilson.co.uk
2019-03-01 14:54:46 +00:00
Chris Wilson
b5773a3616 drm/i915/execlists: Suppress mere WAIT preemption
WAIT is occasionally suppressed by virtue of preempted requests being
promoted to NEWCLIENT if they have not already received that boost.
Make this consistent for all WAIT boosts that they are not allowed to
preempt executing contexts and are merely granted the right to be at the
front of the queue for the next execution slot. This is in keeping with
the desire that the WAIT boost be a minor tweak that does not give
excessive promotion to its user and open ourselves to trivial abuse.

The problem with the inconsistent WAIT preemption becomes more apparent
as the preemption is propagated across the engines, where one engine may
preempt and the other not, and we may be relying on the exact execution
order being consistent across engines (e.g. using HW semaphores to
coordinate parallel execution).

v2: Also protect GuC submission from false preemption loops.
v3: Build bug safeguards and better debug messages for st.
v4: Do the priority bumping in unsubmit (i.e. on preemption/reset
unwind), applying it earlier during submit causes out-of-order execution
combined with execute fences.
v5: Call sw_fence_fini for our dummy request (Matthew)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228220639.3173-1-chris@chris-wilson.co.uk
2019-02-28 23:10:43 +00:00
Chris Wilson
13f1bfd3b3 drm/i915: Make object/vma allocation caches global
As our allocations are not device specific, we can move our slab caches
to a global scope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-28 11:08:02 +00:00
Chris Wilson
32eb6bcfdd drm/i915: Make request allocation caches global
As kmem_caches share the same properties (size, allocation/free behaviour)
for all potential devices, we can use global caches. While this
potentially has worse fragmentation behaviour (one can argue that
different devices would have different activity lifetimes, but you can
also argue that activity is temporal across the system) it is the
default behaviour of the system at large to amalgamate matching caches.

The benefit for us is much reduced pointer dancing along the frequent
allocation paths.

v2: Defer shrinking until after a global grace period for futureproofing
multiple consumers of the slab caches, similar to the current strategy
for avoiding shrinking too early.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-1-chris@chris-wilson.co.uk
2019-02-28 11:07:56 +00:00
Chris Wilson
368375107b drm/i915/selftests: Exercise resetting during non-user payloads
In selftests/live_hangcheck, we have a lot of tests for resetting simple
spinners, but nothing quite prepared us for how the GPU reacted to
triggering a reset outside of the safe spinner. These two subtests fill
the ring with plain old empty, non-spinning requests, and then trigger
a reset. Without a user-payload to blame, these requests will exercise
the 'non-started' paths and mostly be replayed verbatim.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-4-chris@chris-wilson.co.uk
2019-02-26 09:55:41 +00:00
Chris Wilson
b300fde896 drm/i915: Remove i915_request.global_seqno
Having weaned the interrupt handling off using a single global execution
queue, we no longer need to emit a global_seqno. Note that we still have
a few assumptions about execution order along engine timelines, but this
removes the most obvious artefact!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk
2019-02-26 09:55:37 +00:00
Chris Wilson
8892f47742 drm/i915: Remove access to global seqno in the HWSP
Stop accessing the HWSP to read the global seqno, and stop tracking the
mirror in the engine's execution timeline -- it is unused.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk
2019-02-26 09:55:33 +00:00
Chris Wilson
c41166f9a1 drm/i915: Beware temporary wedging when determining -EIO
At a few points in our uABI, we check to see if the driver is wedged and
report -EIO back to the user in that case. However, as we perform the
check and reset asynchronously (where once before they were both
serialised by the struct_mutex), we may instead see the temporary wedging
used to cancel inflight rendering to avoid a deadlock during reset
(caused by either us timing out in our reset handler,
i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
for a stuck modeset). If we suspect this is the case, that is we see a
wedged driver *and* reset in progress, then wait until the reset is
resolved before reporting upon the wedged status.

v2: might_sleep() (Mika)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
2019-02-20 16:31:08 +00:00
Chris Wilson
e4106dae0f drm/i915/selftests: Make unbannable contexts for reset handling
igt_ctx_sseu was caught using bannable contexts, and in the course of
resetting rapidly to run its test, was banned. Don't let ourselves ban
the test!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190218145051.18981-1-chris@chris-wilson.co.uk
2019-02-18 16:00:34 +00:00
Chris Wilson
83e3a21530 drm/i915/selftests: Move local mock_ggtt allocations to the heap
This struct appears quite large and pushes our stack frame over
1024 bytes -- too high for conservative setups. So move the mock_ggtt
struct to the heap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190217202518.24730-1-chris@chris-wilson.co.uk
2019-02-17 21:07:46 +00:00
Chris Wilson
2a4a275403 drm/i915/selftests: Always free spinner on __sseu_prepare error
Prepare a nice little onion unwind to ensure that we always free the
spinner if __sseu_prepare fails.
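
For readers unfamiliar with the idiom, an onion unwind releases resources
in the reverse order of acquisition through a ladder of labels. The
sketch below uses hypothetical stand-ins (res_alloc, sseu_prepare_stub)
rather than the selftest's real functions:

```c
#include <errno.h>
#include <stdlib.h>

/* Counts outstanding allocations so that leaks are easy to spot. */
static int live_allocs;

/* Hypothetical resource helpers standing in for context and spinner. */
static void *res_alloc(void)
{
	void *p = malloc(16);

	if (p)
		live_allocs++;
	return p;
}

static void res_free(void *p)
{
	free(p);
	live_allocs--;
}

/* Stand-in for the prepare step, which may fail. */
static int sseu_prepare_stub(int fail)
{
	return fail ? -EIO : 0;
}

static int run_subtest(int fail_prepare)
{
	void *ctx, *spin;
	int err;

	ctx = res_alloc();
	if (!ctx)
		return -ENOMEM;

	spin = res_alloc();
	if (!spin) {
		err = -ENOMEM;
		goto out_ctx;
	}

	err = sseu_prepare_stub(fail_prepare);
	if (err)
		goto out_spin; /* the spinner is still freed on this path */

	/* ... the body of the test would run here ... */

out_spin:
	res_free(spin);
out_ctx:
	res_free(ctx);
	return err;
}
```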

Fixes: c06ee6ff2c ("drm/i915/selftests: Context SSEU reconfiguration tests")
Reported-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190215195010.16637-1-chris@chris-wilson.co.uk
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
2019-02-16 11:06:22 +00:00
Chris Wilson
9095c86374 drm/i915/selftests: Drop unnecessary struct_mutex around i915_reset()
Since we no longer need to hold struct_mutex to perform a global device
reset, don't do so for igt_reset_wedge().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190215102732.15520-2-chris@chris-wilson.co.uk
2019-02-15 18:17:30 +00:00
Chris Wilson
c836eb79c0 drm/i915/selftests: Always use an active engine while resetting
Currently, we only try to reset a live engine for checking the whitelist
retention across a per-engine reset. For safety, it appears we need to
prime the system with a hanging spinner before performing a full-device
reset. (Figuring out the root cause behind the instability with handling
a reset during a no-op request is a challenge for another test, the
whitelist test has its own purpose.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190213224805.32021-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2019-02-15 09:37:31 +00:00
Chris Wilson
21182b3c4c drm/i915: Don't claim an unstarted request was guilty
If we haven't even begun executing the payload of the stalled request,
then we should not claim that its userspace context was guilty of
submitting a hanging batch.

v2: Check for context corruption before trying to restart.
v3: Preserve semaphores on skipping requests (need to keep the timelines
intact).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-7-chris@chris-wilson.co.uk
2019-02-08 16:47:40 +00:00
Chris Wilson
2caffbf117 drm/i915: Revoke mmaps and prevent access to fence registers across reset
Previously, we were able to rely on the recursive properties of
struct_mutex to allow us to serialise revoking mmaps and reacquiring the
FENCE registers with them being clobbered over a global device reset.
I then proceeded to throw out the baby with the bath water in order to
pursue a struct_mutex-less reset.

Perusing LWN for alternative strategies, the dilemma on how to serialise
access to a global resource on one side was answered by
https://lwn.net/Articles/202847/ -- Sleepable RCU:

    int readside(void) {
        int idx;
        rcu_read_lock();
        if (nomoresrcu) {
            rcu_read_unlock();
            return -EINVAL;
        }
        idx = srcu_read_lock(&ss);
        rcu_read_unlock();
        /* SRCU read-side critical section. */
        srcu_read_unlock(&ss, idx);
        return 0;
    }

    void cleanup(void)
    {
        nomoresrcu = 1;
        synchronize_rcu();
        synchronize_srcu(&ss);
        cleanup_srcu_struct(&ss);
    }

No more worrying about stop_machine, just an uber-complex mutex,
optimised for reads, with the overhead pushed to the rare reset path.

However, we do run the risk of a deadlock as we allocate underneath the
SRCU read lock, and the allocation may require a GPU reset, causing a
dependency cycle via the in-flight requests. We resolve that by declaring
the driver wedged and cancelling all in-flight rendering.

v2: Use expedited rcu barriers to match our earlier timing
characteristics.
v3: Try to annotate locking contexts for sparse
v4: Reduce selftest lock duration to avoid a reset deadlock with fences
v5: s/srcu/reset_backoff_srcu/
v6: Remove more stale comments

Testcase: igt/gem_mmap_gtt/hang
Fixes: eb8d0f5af4 ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-2-chris@chris-wilson.co.uk
2019-02-08 16:47:32 +00:00
Chris Wilson
21950ee7cc drm/i915: Pull i915_gem_active into the i915_active family
Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05 17:20:11 +00:00
Chris Wilson
64d6c500a3 drm/i915: Generalise GPU activity tracking
We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk
2019-02-05 17:12:00 +00:00
Chris Wilson
a21f453c73 drm/i915/selftests: Exercise some AB...BA preemption chains
Build a chain using 2 contexts (A, B) then request a preemption such
that a later A request runs before the spinner in B.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205123835.25331-1-chris@chris-wilson.co.uk
2019-02-05 16:16:02 +00:00
Tvrtko Ursulin
c06ee6ff2c drm/i915/selftests: Context SSEU reconfiguration tests
Exercise the context image reconfiguration logic for idle and busy
contexts, with the resets thrown into the mix as well.

Free from the uAPI restrictions this test runs on all Gen9+ platforms
with slice power gating.

v2:
 * Rename some helpers for clarity.
 * Include subtest names in error logs.
 * Remove unnecessary function export.

v3:
 * Rebase for RUNTIME_INFO.

v4:
 * Fix incomplete unexport from v2. (Chris Wilson)

v5:
 * Rebased for runtime pm api changes.

v6:
 * Rebased for i915_reset.c.

v7:
 * Tidy checkpatch warnings.
 * Consolidate error checking and logging a bit.
 * Skip idle test phase if something failed before it.

v8:
 (Chris Wilson)
 * Fix i915_request_wait error handling.
 * No need to PIN_HIGH the VMA.
 * Remove pointless GEM_BUG_ON before pointer dereference.

v9:
 * Avoid rq leak if rpcs query fails. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # v6
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-5-tvrtko.ursulin@linux.intel.com
2019-02-05 11:32:03 +00:00
Tvrtko Ursulin
7810858412 drm/i915: Add timeline barrier support
Timeline barrier allows serialization between different timelines.

After calling i915_timeline_set_barrier with a request, all following
submissions on this timeline will be set up as depending on this request,
or barrier. Once the barrier has been completed it automatically gets
cleared and things continue as normal.

This facility will be used by the upcoming context SSEU code.

v2:
 * Assert barrier has been retired on timeline_fini. (Chris Wilson)
 * Fix mock_timeline.

v3:
 * Improved comment language. (Chris Wilson)

v4:
 * Maintain ordering with previous barriers set on the timeline.

v5:
 * Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-3-tvrtko.ursulin@linux.intel.com
2019-02-05 11:32:03 +00:00
Chris Wilson
789659f430 drm/i915: Drop fake breadcrumb irq
Missed breadcrumb detection is defunct due to the tight coupling with
dma_fence signaling and the myriad ways we may signal fences from
everywhere but from an interrupt, i.e. we frequently signal a fence
before we even see its interrupt. This means that even if we miss an
interrupt for a fence, it still is signaled before our breadcrumb
hangcheck fires, so simplify the breadcrumb hangchecking by moving it
into the GPU hangcheck and forgo fake interrupts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk
2019-01-29 21:45:30 +00:00
Chris Wilson
52c0fdb25c drm/i915: Replace global breadcrumbs with per-context interrupt tracking
A few years ago, see commit 688e6c7258 ("drm/i915: Slaughter the
thundering i915_wait_request herd"), the issue of handling multiple
clients waiting in parallel was brought to our attention. The
requirement was that every client should be woken immediately upon its
request being signaled, without incurring any cpu overhead.

To handle certain fragility of our hw meant that we could not do a
simple check inside the irq handler (some generations required almost
unbounded delays before we could be sure of seqno coherency) and so
request completion checking required delegation.

Before commit 688e6c7258, the solution was simple. Every client
waiting on a request would be woken on every interrupt and each would do
a heavyweight check to see if their request was complete. Commit
688e6c7258 introduced an rbtree so that only the earliest waiter on
the global timeline would be woken, and would wake the next and so on.
(Along with various complications to handle requests being reordered
along the global timeline, and also a requirement for kthread to provide
a delegate for fence signaling that had no process context.)

The global rbtree depends on knowing the execution timeline (and global
seqno). Without knowing that order, we must instead check all contexts
queued to the HW to see which may have advanced. We trim that list by
only checking queued contexts that are being waited on, but still we
keep a list of all active contexts and their active signalers that we
inspect from inside the irq handler. By moving the waiters onto the fence
signal list, we can combine the client wakeup with the dma_fence
signaling (a dramatic reduction in complexity, but does require the HW
being coherent, the seqno must be visible from the cpu before the
interrupt is raised - we keep a timer backup just in case).

Having previously fixed all the issues with irq-seqno serialisation (by
inserting delays onto the GPU after each request instead of random delays
on the CPU after each interrupt), we can rely on the seqno state to
perform direct wakeups from the interrupt handler. This allows us to
preserve our single context switch behaviour of the current routine,
with the only downside that we lose the RT priority sorting of wakeups.
In general, direct wakeup latency of multiple clients is about the same
(about 10% better in most cases) with a reduction in total CPU time spent
in the waiter (about 20-50% depending on gen). Average herd behaviour is
improved, but at the cost of not delegating wakeups on task_prio.

v2: Capture fence signaling state for error state and add comments to
warm even the most cold of hearts.
v3: Check if the request is still active before busywaiting
v4: Reduce the amount of pointer misdirection with list_for_each_safe
and using a local i915_request variable inside the loops
v5: Add a missing pluralisation to a purely informative selftest message.

References: 688e6c7258 ("drm/i915: Slaughter the thundering i915_wait_request herd")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
2019-01-29 21:45:22 +00:00
Chris Wilson
c9a6462288 drm/i915/execlists: Suppress preempting self
In order to avoid preempting ourselves, we currently refuse to schedule
the tasklet if we reschedule an inflight context. However, this glosses
over a few issues such as what happens after a CS completion event and
we then preempt the newly executing context with itself, or if something
else causes a tasklet_schedule triggering the same evaluation to
preempt the active context with itself.

However, when we avoid preempting ELSP[0], we still retain the preemption
value as it may match a second preemption request within the same time period
that we need to resolve after the next CS event. However, since we only
store the maximum preemption priority seen, it may not match the
subsequent event and so we should double check whether or not we
actually do need to trigger a preempt-to-idle by comparing the top
priorities from each queue. Later, this gives us a hook for finer
control over deciding whether the preempt-to-idle is justified.

The sequence of events where we end up preempting to no avail is:

1. Queue requests/contexts A, B
2. Priority boost A; no preemption as it is executing, but keep hint
3. After CS switch, B is less than hint, force preempt-to-idle
4. Resubmit B after idling

v2: We can simplify a bunch of tests based on the knowledge that PI will
ensure that earlier requests along the same context will have the highest
priority.
v3: Demonstrate the stale preemption hint with a selftest

References: a2bf92e8cc ("drm/i915/execlists: Avoid kicking priority on the current context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-4-chris@chris-wilson.co.uk
2019-01-29 20:00:05 +00:00
Chris Wilson
8547444137 drm/i915: Identify active requests
To allow requests to forgo a common execution timeline, one question we
need to be able to answer is "is this request running?". To track
whether a request has started on HW, we can emit a breadcrumb at the
beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request. (This is in
contrast to the global timeline where we need only ask if we are on the
global timeline and if the timeline has advanced past the end of the
previous request.)
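
A rough sketch of the check, under the assumption that the breadcrumb
emitted at the start of a request writes seqno - 1 into the timeline's
HWSP (the struct and helper names are simplified stand-ins). The
comparison is done on the signed difference so that it survives seqno
wraparound:

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "has seq1 reached seq2?" comparison. */
static inline bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Simplified, hypothetical request: only what the check needs. */
struct request {
	const uint32_t *hwsp; /* timeline breadcrumb written by the GPU */
	uint32_t seqno;       /* value written when this request completes */
};

/* Started once the breadcrumb has advanced past the previous request. */
static inline bool request_started(const struct request *rq)
{
	return seqno_passed(*rq->hwsp, rq->seqno - 1);
}

/* Completed once the breadcrumb has reached this request's seqno. */
static inline bool request_completed(const struct request *rq)
{
	return seqno_passed(*rq->hwsp, rq->seqno);
}
```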

There is still confusion from a preempted request, which has already
started but relinquished the HW to a high priority request. For the
common case, this discrepancy should be negligible. However, for
identification of hung requests, knowing which one was running at the
time of the hang will be much more important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk
2019-01-29 19:59:59 +00:00
Chris Wilson
06039d9820 drm/i915/selftests: Apply a subtest filter
In bringup on simulated HW even rudimentary tests are slow, and so many
may fail that we want to be able to filter out the noise to focus on the
specific problem. Even just the test groups provided for igt are not
specific enough, and we would like to isolate one particular subtest
(and probably subsubtests!). For simplicity, allow the user to provide a
command line parameter such as

	i915.st_filter=i915_timeline_mock_selftests/igt_sync

to restrict ourselves to only running one subtest. The exact name to use
is given during a normal run, highlighted as an error if it failed,
debug otherwise. The test group is optional, and then all subtests are
compared for an exact match with the filter (most subtests have unique
names). The filter can be negated, e.g. i915.st_filter=!igt_sync and
then all tests but those that match will be run. More than one match can
be supplied separated by a comma, e.g.

	i915.st_filter=igt_vma_create,igt_vma_pin1

to only run those specified, or

	i915.st_filter=!igt_vma_create,!igt_vma_pin1

to run all but those named. Mixing a blacklist and whitelist will only
execute those subtests matching the whitelist so long as they were not
previously excluded by the blacklist.
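
The matching rules above can be sketched as a small userspace function. This
is an illustration of the described behaviour under stated assumptions, not
the selftest code itself: subtest_allowed() and token_equals() are
hypothetical names, and "the first matching entry decides" is one way to
realise the blacklist-before-whitelist ordering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the n-byte token at tok is exactly the string s. */
static bool token_equals(const char *tok, size_t n, const char *s)
{
	return strlen(s) == n && memcmp(tok, s, n) == 0;
}

/* Should subtest `name` of test group `group` run under `filter`? */
static bool subtest_allowed(const char *filter, const char *group,
			    const char *name)
{
	bool result = true;	/* no filter: run everything */

	while (*filter) {
		size_t len = strcspn(filter, ",");
		const char *tok = filter;
		bool allow = true;
		bool match = true;

		/* Step past this entry and its separator, if any. */
		filter += len;
		if (*filter == ',')
			filter++;

		if (len && *tok == '!') {	/* negated entry */
			allow = false;
			tok++;
			len--;
		}
		if (!len)
			continue;		/* ignore empty entries */

		/* An optional "group/" prefix must match the test group. */
		const char *slash = memchr(tok, '/', len);
		if (slash) {
			match = token_equals(tok, (size_t)(slash - tok), group);
			len -= (size_t)(slash + 1 - tok);
			tok = slash + 1;
		}
		if (match)
			match = token_equals(tok, len, name);

		if (match)
			return allow;	/* the first matching entry decides */
		if (allow)
			result = false;	/* a whitelist entry we did not match */
	}

	return result;
}
```

With this ordering, "!igt_vma_pin1,igt_vma_pin1" skips igt_vma_pin1 because
the negated entry is seen first.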

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-1-chris@chris-wilson.co.uk
2019-01-29 19:59:57 +00:00
Chris Wilson
5013eb8cd6 drm/i915: Track the context's seqno in its own timeline HWSP
Now that we have allocated ourselves a cacheline to store a breadcrumb,
we can emit a write from the GPU into the timeline's HWSP of the
per-context seqno as we complete each request. This drops the mirroring
of the per-engine HWSP and allows each context to operate independently.
We do not need to unwind the per-context timeline, and so requests are
always consistent with the timeline breadcrumb, greatly simplifying the
completion checks as we no longer need to be concerned about the
global_seqno changing mid check.

One complication though is that we have to be wary that the request may
outlive the HWSP and so avoid touching the potentially dangling pointer
after we have retired the fence. We also have to guard our access of the
HWSP with RCU; the release of the obj->mm.pages should already be RCU-safe.

At this point, we are emitting both per-context and global seqno and
still using the single per-engine execution timeline for resolving
interrupts.

v2: s/fake_complete/mark_complete/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk
2019-01-28 19:07:09 +00:00
Chris Wilson
8ba306a6a3 drm/i915: Share per-timeline HWSP using a slab suballocator
If we restrict ourselves to only using a cacheline for each timeline's
HWSP (we could go smaller, but want to avoid needlessly polluting
cachelines on different engines between different contexts), then we can
suballocate a single 4k page into 64 different timeline HWSP. By
treating each fresh allocation as a slab of 64 entries, we can keep it
around for the next 64 allocation attempts until we need to refresh the
slab cache.

John Harrison noted the issue of fragmentation leading to the same worst
case performance of one page per timeline as before, which can be
mitigated by adopting a freelist.

v2: Keep all partially allocated HWSP on a freelist

This is still without migration, so it is possible for the system to end
up with each timeline in its own page, but we ensure that no new
allocation would needlessly allocate a fresh page!

v3: Throw a selftest at the allocator to try and catch invalid cacheline
reuse.
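
The arithmetic behind the suballocation (4096 / 64 = 64 cachelines per page)
can be sketched as below. This is a hedged userspace illustration only: the
names are hypothetical, and tracking free cachelines with a 64-bit bitmap is
an assumption of the sketch, chosen because one bit per slot fits exactly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES	4096u
#define CACHELINE_BYTES	64u
#define SLOTS_PER_PAGE	(PAGE_BYTES / CACHELINE_BYTES)	/* 64 slots */

/* Hypothetical bookkeeping for one 4k page shared between timelines. */
struct hwsp_page_sketch {
	uint64_t free_bitmap;	/* bit n set: cacheline n is free */
};

static void hwsp_page_init(struct hwsp_page_sketch *page)
{
	page->free_bitmap = UINT64_MAX;	/* a fresh page has all 64 slots free */
}

/* A page with any free slot would stay on the freelist. */
static bool hwsp_page_has_free(const struct hwsp_page_sketch *page)
{
	return page->free_bitmap != 0;
}

/* Claim the lowest free cacheline; returns its index, or -1 if full. */
static int hwsp_page_alloc(struct hwsp_page_sketch *page)
{
	for (unsigned int slot = 0; slot < SLOTS_PER_PAGE; slot++) {
		uint64_t bit = UINT64_C(1) << slot;

		if (page->free_bitmap & bit) {
			page->free_bitmap &= ~bit;
			return (int)slot;
		}
	}
	return -1;
}

/* Return a cacheline to the page so a later allocation can reuse it. */
static void hwsp_page_free(struct hwsp_page_sketch *page, int slot)
{
	page->free_bitmap |= UINT64_C(1) << slot;
}

/* Byte offset of a slot within its page. */
static size_t hwsp_slot_offset(int slot)
{
	return (size_t)slot * CACHELINE_BYTES;
}
```

Reusing freed slots before taking a fresh page is what avoids the
one-page-per-timeline worst case for new allocations.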

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-4-chris@chris-wilson.co.uk
2019-01-28 19:07:06 +00:00
Chris Wilson
52954edd1f drm/i915: Allocate a status page for each timeline
Allocate a page for use as a status page by a group of timelines, as we
only need a dword of storage for each (rounded up to the cacheline for
safety) we can pack multiple timelines into the same page. Each timeline
will then be able to track its own HW seqno.

v2: Reuse the common per-engine HWSP for the solitary ringbuffer
timeline, so that we do not have to emit (using per-gen specialised
vfuncs) the breadcrumb into the distinct timeline HWSP and instead can
keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the
sleight-of-hand for the global/per-context seqno switchover, we will
store both temporarily (and so use a custom offset for the shared timeline
HWSP until the switch over).

v3: Keep things simple and allocate a page for each timeline, page
sharing comes next.

v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over
again in selftests.

v5: And caught red handed copying create timeline + check.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk
2019-01-28 19:07:02 +00:00
Chris Wilson
1e345568e3 drm/i915: Move list of timelines under its own lock
Currently, the list of timelines is serialised by the struct_mutex, but
to alleviate difficulties with using that mutex in future, move the
list management under its own dedicated mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk
2019-01-28 16:24:22 +00:00
Chris Wilson
0ca88ba0d6 drm/i915: Always allocate an object/vma for the HWSP
Currently we only allocate an object and vma if we are using a GGTT
virtual HWSP, and a plain struct page for a physical HWSP. For
convenience later on with global timelines, it will be useful to always
have the status page being tracked by a struct i915_vma. Make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk
2019-01-28 16:24:19 +00:00
Chris Wilson
528cbd17ce drm/i915: Move vma lookup to its own lock
Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk
2019-01-28 16:24:16 +00:00
Chris Wilson
09d7e46b97 drm/i915: Pull VM lists under the VM mutex.
A starting point to counter the pervasive struct_mutex. For the goal of
avoiding global locks (or at least avoiding blocking under them!) during
user request submission, a simple but important step is being able to
manage each client's GTT separately. To that end, we want to stop using
the struct_mutex as the guard for all things GTT/VM and switch instead to
a specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
2019-01-28 16:24:13 +00:00
Chris Wilson
499197dc16 drm/i915: Stop tracking MRU activity on VMA
Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmapping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think an LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
2019-01-28 16:24:09 +00:00
Chris Wilson
9b974bde4d drm/i915: Issue engine resets onto idle engines
Always perform the requested reset, even if we believe the engine is
idle. Presumably there was a reason the caller wanted the reset, and in
the near future we lose the easy tracking for whether the engine is
idle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-5-chris@chris-wilson.co.uk
2019-01-25 14:27:30 +00:00
Chris Wilson
f3dccbdbdd drm/i915/selftests: Trim struct_mutex duration for set-wedged selftest
Trim the struct_mutex hold and exclude the call to i915_gem_set_wedged()
as a reminder that it must be callable without struct_mutex held.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-4-chris@chris-wilson.co.uk
2019-01-25 14:27:29 +00:00
Chris Wilson
eb8d0f5af4 drm/i915: Remove GPU reset dependence on struct_mutex
Now that the submission backends are controlled via their own spinlocks,
with a wave of a magic wand we can lift the struct_mutex requirement
around GPU reset. That is we allow the submission frontend (userspace)
to keep on submitting while we process the GPU reset as we can suspend
the backend independently.

The major change is around the backoff/handoff strategy for performing
the reset. With no mutex deadlock, we no longer have to coordinate with
any waiter, and just perform the reset immediately.

Testcase: igt/gem_mmap_gtt/hang # regresses
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
2019-01-25 14:27:22 +00:00
Chris Wilson
e1a73a54a9 drm/i915: Measure the required reserved size for request emission
Instead of tediously and fragilely counting up the number of dwords
required to emit the breadcrumb to seal a request, fake a request and
measure it automatically once during engine setup.

The downside is that this requires a fair amount of mocking to create a
proper breadcrumb. Still, should be less error prone in future as the
breadcrumb size fluctuates!
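
The idea of measuring rather than counting can be sketched as follows. This
is a minimal userspace illustration with hypothetical names and placeholder
command values; the only assumption is that an emitter writes dwords through
a cursor and returns the advanced cursor.

```c
#include <stddef.h>
#include <stdint.h>

/* An emitter writes its commands at cs and returns the new cursor. */
typedef uint32_t *(*emit_fn)(uint32_t *cs, uint32_t seqno);

/* Placeholder breadcrumb: the values are illustrative, not real opcodes. */
static uint32_t *emit_breadcrumb_example(uint32_t *cs, uint32_t seqno)
{
	*cs++ = 0x1;	/* stand-in for a "store dword" command */
	*cs++ = 0x100;	/* stand-in for the destination offset */
	*cs++ = seqno;	/* the breadcrumb value itself */
	*cs++ = 0x2;	/* stand-in for a "raise interrupt" command */
	return cs;
}

/* Emit once into scratch space and count how far the cursor moved. */
static size_t measure_breadcrumb_dwords(emit_fn emit)
{
	uint32_t scratch[64];	/* assumed large enough for any emitter */

	return (size_t)(emit(scratch, 0) - scratch);
}

/* Space to reserve in the ring, in bytes rather than dwords. */
static size_t measure_breadcrumb_bytes(emit_fn emit)
{
	return measure_breadcrumb_dwords(emit) * sizeof(uint32_t);
}
```

If the emitter later grows by a dword, the measured size follows without any
hand-maintained constant to update.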

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125100520.20163-1-chris@chris-wilson.co.uk
2019-01-25 11:19:39 +00:00
Rodrigo Vivi
f42fb2317f Merge drm/drm-next into drm-intel-next-queued
We need the AVI infoframe changes that were merged via drm-misc.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-01-22 14:51:36 -08:00
Chris Wilson
924090f423 drm/i915: Refactor out intel_context_init()
Prior to adding a third instance of intel_context_init() and extending
the information stored therein, refactor out the common assignments.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-8-chris@chris-wilson.co.uk
2019-01-22 13:13:53 +00:00
Chris Wilson
1579ab2de9 drm/i915/selftests: Use common mock_engine::advance
Replace the open-coding of advance with a call instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-19-chris@chris-wilson.co.uk
2019-01-22 13:13:53 +00:00
Chris Wilson
e4a8c8130b drm/i915/selftests: Refactor common live_test framework
Before adding yet another copy of struct live_test and its handler,
refactor the existing code into a common framework for live selftests.
For many live selftests, we want to know if the GPU hung or otherwise
misbehaved during the execution of the test (beyond any infraction in
the behaviour under test), live_test provides this by comparing the
GPU state before and after, alerting if it unexpectedly changed (e.g.
the reset counter changed). It also ensures that the GPU is idle before
and after the test, so that residual code running on the GPU is flushed
before testing.
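
The before/after comparison can be sketched like this. It is a simplified
userspace illustration with hypothetical types and names: it only checks an
idle flag where a real framework would wait for the GPU to idle, and it
tracks a single reset counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GPU state a live test cares about. */
struct gpu_sketch {
	unsigned int reset_count;	/* bumped on every GPU reset */
	bool idle;			/* no residual work on the GPU */
};

/* Snapshot taken when the test begins. */
struct live_test_sketch {
	const struct gpu_sketch *gpu;
	unsigned int reset_count;
};

/* Record the starting state; returns 0, or -1 if the GPU is still busy. */
static int live_test_begin(struct live_test_sketch *t,
			   const struct gpu_sketch *gpu)
{
	if (!gpu->idle)
		return -1;

	t->gpu = gpu;
	t->reset_count = gpu->reset_count;
	return 0;
}

/* Returns 0 if the GPU is idle and was not reset behind the test's back. */
static int live_test_end(const struct live_test_sketch *t)
{
	if (!t->gpu->idle)
		return -1;
	if (t->gpu->reset_count != t->reset_count)
		return -1;	/* an unexpected reset happened during the test */
	return 0;
}
```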

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-5-chris@chris-wilson.co.uk
2019-01-22 13:01:20 +00:00
Chris Wilson
c95e7ce387 drm/i915/selftests: Create a clean GGTT for vma/gtt selftesting
Some tests (e.g. igt_vma_pin1) presume that we have a completely clean
GGTT so that it can probe boundaries without fear that something is
already allocated there. However, the mock device is starting to get
complicated and following similar rules to the live device, i.e. we
can't guarantee that i915->ggtt remains clean, so create a temporary
address_space equivalent to the mock ggtt for the purpose.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-7-chris@chris-wilson.co.uk
2019-01-22 12:50:39 +00:00
Chris Wilson
480cd6dd92 drm/i915/selftests: Track evict objects explicitly
During review of commit 71fc448c1a ("drm/i915/selftests: Make evict
tolerant of foreign objects"), Matthew mentioned it would be better if
we explicitly tracked the objects we created. We have an obj->st_link
hook for this purpose, so add the corresponding list of objects and
reduce our loops to only consider our own list.

References: 71fc448c1a ("drm/i915/selftests: Make evict tolerant of foreign objects")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-6-chris@chris-wilson.co.uk
2019-01-22 11:58:35 +00:00
Chris Wilson
209760b7f6 drm/i915/selftests: Allocate mock ring/timeline per context
To correctly simulate preemption between contexts, we need independent
timelines along each context. Make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118190805.11792-1-chris@chris-wilson.co.uk
2019-01-18 20:39:27 +00:00
Chris Wilson
71fc448c1a drm/i915/selftests: Make evict tolerant of foreign objects
The evict selftests presumed that all objects in use had been allocated
by itself. This is a dubious claim and so instead of asserting complete
control over the object lists, take (temporary) ownership of them
instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118113632.7056-1-chris@chris-wilson.co.uk
2019-01-18 12:37:56 +00:00
Chris Wilson
293f8c0f2b drm/i915: Use b->irq_enable() as predicate for mock engine
Since commit d4ccceb055 ("drm/i915/icl: Ringbuffer interrupt handling")
we have required a mechanism to avoid touching the interrupt hardware
for breadcrumbs, superseding our mock interface for selftests.

The residual problem (ideas welcome) is in probing the mock ring
registers for ring_is_idle. Hmm, maybe we should just install
mock handlers for i915->uncore.mmio__write and friends? The only problem
is that we would have to truly mock some expected reads. :(

References: d4ccceb055 ("drm/i915/icl: Ringbuffer interrupt handling")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118112225.13780-1-chris@chris-wilson.co.uk
2019-01-18 12:05:29 +00:00
Chris Wilson
8d71418595 drm/i915/selftests: Query the vm under test for hugepage support
Since we have the ppgtt we want to test, we can ask it directly if it is
suitable for the hugepage test we intend to undertake.

v2: Not everyone has full-ppgtt

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190117230512.4789-1-chris@chris-wilson.co.uk
2019-01-18 09:07:06 +00:00
Chris Wilson
9f58892ea9 drm/i915: Pull all the reset functionality together into i915_reset.c
Currently the code to reset the GPU and our state is spread widely
across a few files. Pull the logic together into a common file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk
2019-01-16 22:45:31 +00:00
Chris Wilson
18bb2bccb5 drm/i915: Serialise concurrent calls to i915_gem_set_wedged()
Make i915_gem_set_wedged() and i915_gem_unset_wedged() behaviour more
consistent if called concurrently, and only do the wedging and reporting
once, curtailing any possible race where we start unwedging in the middle
of a wedge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114210408.4561-2-chris@chris-wilson.co.uk
2019-01-16 15:24:16 +00:00
Chris Wilson
305dc3f983 drm/i915: Differentiate between ggtt->mutex and ppgtt->mutex
We have two classes of VM, global GTT and per-process GTT. In order to
allow ourselves the freedom to mix both along call chains, distinguish
the two classes with regards to their mutex and lockdep maps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114215956.32266-1-chris@chris-wilson.co.uk
2019-01-14 22:57:28 +00:00
Chris Wilson
d4225a535b drm/i915: Syntactic sugar for using intel_runtime_pm
Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.
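
A scoped-block macro of this kind can be sketched with a one-shot for loop.
The names below (pm_get, pm_put, with_pm) are hypothetical userspace
stand-ins, not the driver's API; the sketch assumes the get returns a
non-zero cookie that the matching put consumes.

```c
/* Hypothetical device with a count of held wakerefs. */
struct device_pm_sketch {
	int wakerefs;		/* wakerefs currently held */
	int next_cookie;	/* source of non-zero cookies */
};

typedef int wakeref_t;

/* Take a wakeref and hand back a non-zero cookie identifying it. */
static wakeref_t pm_get(struct device_pm_sketch *pm)
{
	pm->wakerefs++;
	return ++pm->next_cookie;
}

/* Release the wakeref named by the cookie. */
static void pm_put(struct device_pm_sketch *pm, wakeref_t wf)
{
	(void)wf;	/* a debug build could check the cookie here */
	pm->wakerefs--;
}

/* Run the following statement or block exactly once with a wakeref held. */
#define with_pm(pm, wf) \
	for ((wf) = pm_get(pm); (wf); pm_put((pm), (wf)), (wf) = 0)

/* Example user: report how many wakerefs are held inside the block. */
static int wakerefs_held_inside(struct device_pm_sketch *pm)
{
	wakeref_t wf;
	int held = 0;

	with_pm(pm, wf)
		held = pm->wakerefs;

	return held;
}
```

One caveat of the for-loop form is that a break or return inside the block
skips the put, so the body has to fall through to the end.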

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
2019-01-14 16:18:25 +00:00
Chris Wilson
c9d08cc3e3 drm/i915/selftests: Mark up rpm wakerefs
Track the temporary wakerefs used within the selftests so that leaks are
clear.

v2: Add a couple of coarse annotations for mock selftests as we now
loudly warn about the errors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-14-chris@chris-wilson.co.uk
2019-01-14 16:18:20 +00:00
Chris Wilson
16e4dd0342 drm/i915: Markup paired operations on wakerefs
The majority of runtime-pm operations are bounded and scoped within a
function; for these it is easy to verify that the wakerefs are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).

For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together.

v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
2019-01-14 16:17:53 +00:00
Chris Wilson
bd780f37a3 drm/i915: Track all held rpm wakerefs
Every time we take a wakeref, record the stack trace of where it was
taken; clearing the set if we ever drop back to no owners. For debugging
a rpm leak, we can look at all the current wakerefs and check if they
have a matching rpm_put.

v2: Use skip=0 for unwinding the stack as it appears our noinline
function doesn't appear on the stack (nor does save_stack_trace itself!)
v3: Allow rpm->debug_count to disappear between inspections and so
avoid calling krealloc(0) as that may return a ZERO_PTR not NULL! (Mika)
v4: Show who last acquired/released the runtime pm

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-1-chris@chris-wilson.co.uk
2019-01-14 16:17:50 +00:00
Daniele Ceraolo Spurio
f663b0ca9b drm/i915/selftests: recreate WA lists inside the selftest
By using the wa lists inside the live driver structures, we won't
catch issues where those are incorrectly setup or corrupted.
To cover this gap, update the workaround framework to allow saving the
wa lists to independent structures and use them in the selftests.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190110013232.8972-1-daniele.ceraolospurio@intel.com
[tursulin: Fixup checkpatch whitespace complaint in memset.]
2019-01-10 09:15:18 +00:00
Dave Airlie
8c1a765bc6 drm-misc-next for 5.1:
UAPI Changes:
 
 Cross-subsystem Changes:
   - Turn dma-buf fence sequence numbers into 64 bit numbers
 
 Core Changes:
   - Move to a common helper for the DP MST hotplug for radeon, i915 and
     amdgpu
   - i2c improvements for drm_dp_mst
   - Removal of drm_syncobj_cb
   - Introduction of a helper to create and attach the TV margin properties
 
 Driver Changes:
   - Improve cache flushes for v3d
   - Reflection support for vc4
   - HDMI overscan support for vc4
   - Add implicit fencing support for rockchip and sun4i
   - Switch to generic fbdev emulation for virtio

Merge tag 'drm-misc-next-2019-01-07-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Signed-off-by: Dave Airlie <airlied@redhat.com>

[airlied: applied amdgpu merge fixup]
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107180333.amklwycudbsub3s5@flea
2019-01-10 05:58:52 +10:00
Jani Nikula
3eb0930a42 Merge drm/drm-next into drm-intel-next-queued
Generally catch up with 5.0-rc1, and specifically get the changes:

96d4f267e4 ("Remove 'type' argument from access_ok() function")
0b2c8f8b6b ("i915: fix missing user_access_end() in page fault exception case")
594cc251fd ("make 'user_access_begin()' do 'access_ok()'")

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-01-08 10:50:22 +02:00
Chris Wilson
d58f0083d3 drm/i915/selftests: Mark the whole mock device as DMA capable
Being a mock device, we suffer no DMA restrictions, so set the coherent
mask to 64b.

v2: Fix up mock_huge_selftests

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109243
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107181856.23789-1-chris@chris-wilson.co.uk
2019-01-07 22:00:28 +00:00
Chris Wilson
55277e1f31 drm/i915: Always try to reset the GPU on takeover
When we first introduced the reset to sanitize the GPU on taking over
from the BIOS and before returning control to third parties (the BIOS!),
we restricted it to only systems utilizing HW contexts as we were
uncertain of how stable our reset mechanism truly was. We now have
reasonable coverage across all machines that expose a GPU reset method,
and so we should be safe to sanitize the GPU state everywhere.

v2: We _have_ to skip the reset if it would clobber the display.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190103112104.19561-1-chris@chris-wilson.co.uk
2019-01-03 12:40:42 +00:00
Chris Wilson
1225036831 drm/i915/selftests: Take a breath during check_partial_mappings()
With kasan on a slow machine, it can take an age to check all the
partial mappings in a single iteration, so break it up with a
cond_resched() to avoid RCU stall reports.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190102114431.23022-1-chris@chris-wilson.co.uk
2019-01-02 12:17:09 +00:00
Jani Nikula
0258404f9d drm/i915: start moving runtime device info to a separate struct
First move the low hanging fruit, the fields that are only initialized
at runtime. Use RUNTIME_INFO() exclusively to access the fields.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com
2019-01-02 12:46:29 +02:00
Chris Wilson
ed2922c025 drm/i915: Remove redundant trailing request flush
Now that we perform the request flushing inline with emitting the
breadcrumb, we can remove the now redundant manual flush. And we can
also remove the infrastructure that remained only for its purpose.

v2: emit_breadcrumb_sz is in dwords, but rq->reserved_space is in bytes

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228171641.16531-1-chris@chris-wilson.co.uk
2018-12-31 15:35:45 +00:00
Arun KS
ca79b0c211 mm: convert totalram_pages and totalhigh_pages variables to atomic
totalram_pages and totalhigh_pages are made static inline functions.

The main motivation was that managed_page_count_lock handling was
complicating things.  It was discussed at length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seems
better to remove the lock and convert the variables to atomic, with
preventing potential store-to-read tearing as a bonus.
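
The shape of the conversion (a lock-protected variable becoming an atomic
hidden behind inline accessors) can be sketched in userspace with C11
atomics. The names here are hypothetical and <stdatomic.h> stands in for the
kernel's own atomic types; this is an analogue, not the patch itself.

```c
#include <stdatomic.h>

/* The counter is no longer read or written directly by callers. */
static atomic_long total_pages_sketch;

/* Readers go through an accessor, so every load is a single atomic read. */
static inline unsigned long total_pages(void)
{
	return (unsigned long)atomic_load(&total_pages_sketch);
}

/* Adjust the count without taking any lock. */
static inline void total_pages_add(long count)
{
	atomic_fetch_add(&total_pages_sketch, count);
}

/* Overwrite the count, e.g. during initialisation. */
static inline void total_pages_set(long val)
{
	atomic_store(&total_pages_sketch, val);
}
```

Because callers can no longer touch the variable itself, every user has to
switch from reading a global to calling the accessor.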

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00