Commit Graph

3440 Commits

Author SHA1 Message Date
Nitin Gote
c156ef573e
drm/i915/gt: fix typos in i915/gt files.
Fix all typos in files under drm/i915/gt reported by codespell tool.

v2: Fix grammar mistake in comment. <Andi>

v3: Correct typo in commit log. <Krzysztof Niemiec>

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250120081517.3237326-2-nitin.r.gote@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-01-23 05:48:22 -05:00
Linus Torvalds
1d6d399223 Kthreads affinity follow either of 4 existing different patterns:
1) Per-CPU kthreads must stay affine to a single CPU and never execute
    relevant code on any other CPU. This is currently handled by smpboot
    code which takes care of CPU-hotplug operations. Affinity here is
    a correctness constraint.
 
 2) Some kthreads _have_ to be affine to a specific set of CPUs and can't
    run anywhere else. The affinity is set through kthread_bind_mask()
    and the subsystem takes care by itself to handle CPU-hotplug
    operations. Affinity here is assumed to be a correctness constraint.
 
 3) Per-node kthreads _prefer_ to be affine to a specific NUMA node. This
    is not a correctness constraint but merely a preference in terms of
    memory locality. kswapd and kcompactd both fall into this category.
    The affinity is set manually like for any other task and CPU-hotplug
    is supposed to be handled by the relevant subsystem so that the task
    is properly reaffined whenever a given CPU from the node comes up.
    Also care should be taken so that the node affinity doesn't cross
    isolated (nohz_full) cpumask boundaries.
 
 4) Similar to the previous point except kthreads have a _preferred_
    affinity different than a node. Both RCU boost kthreads and RCU
    exp kworkers fall into this category as they refer to "RCU nodes"
    from a distinctly distributed tree.
 
 Currently the preferred affinity patterns (3 and 4) have at least 4
 identified users, with more or less success when it comes to handle
 CPU-hotplug operations and CPU isolation. Each of which do it in its own
 ad-hoc way.
 
 This is an infrastructure proposal to handle this with the following API
 changes:
 
 _ kthread_create_on_node() automatically affines the created kthread to
   its target node unless it has been set as per-cpu or bound with
   kthread_bind[_mask]() before the first wake-up.
 
 - kthread_affine_preferred() is a new function that can be called right
   after kthread_create_on_node() to specify a preferred affinity
   different than the specified node.
 
 When the preferred affinity can't be applied because the possible
 targets are offline or isolated (nohz_full), the kthread is affine
 to the housekeeping CPUs (which means to all online CPUs most of the
 time or only the non-nohz_full CPUs when nohz_full= is set).
 
 kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
 converted, along with a few old drivers.
 
 Summary of the changes:
 
 * Consolidate a bunch of ad-hoc implementations of kthread_run_on_cpu()
 
 * Introduce task_cpu_fallback_mask() that defines the default last
   resort affinity of a task to become nohz_full aware
 
 * Add some correctness check to ensure kthread_bind() is always called
   before the first kthread wake up.
 
 * Default affine kthread to its preferred node.
 
 * Convert kswapd / kcompactd and remove their halfway working ad-hoc
   affinity implementation
 
 * Implement kthreads preferred affinity
 
 * Unify kthread worker and kthread API's style
 
 * Convert RCU kthreads to the new API and remove the ad-hoc affinity
   implementation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmeNf8gACgkQhSRUR1CO
 jHedQQ/+IxTjjqQiItzrq41TES2S0desHDq8lNJFb7rsR/DtKFyLx3s67cOYV+cM
 Yx54QHg2m/Fz4nXMQ7Po5ygOtJGCKBc5C5QQy7y0lVKeTQK+daDfEtBSa3oG7j3C
 u+E3tTY6qxkbCzymUyaKkHN4/ay2vLvjFS50luV7KMyI3x47Aji+t7VdCX4LCPP2
 eAwOALWD0+7qLJ/VF6gsmQLKA4Qx7PQAzBa3KSBmUN9UcN8Gk1bQHCTIQKDHP9LQ
 v8BXrNZtYX1o2+snNYpX2z6/ECjxkdwriOgqqZY5306hd9RAQ1u46Dx3byrIqjGn
 ULG/XQ2istPyhTqb/h+RbrobdOcwEUIeqk8hRRbBXE8bPpqUz9EMuaCMxWDbQjgH
 NTuKG4ifKJ/IqstkkuDkdOiByE/ysMmwqrTXgSnu2ITNL9yY3BEgFbvA95hgo42s
 f7QCxEfZb1MHcNEMENSMwM3xw5lLMGMpxVZcMQ3gLwyotMBRrhFZm1qZJG7TITYW
 IDIeCbH4JOMdQwLs3CcWTXio0N5/85NhRNFV+IDn96OrgxObgnMtV8QwNgjXBAJ5
 wGeJWt8s34W1Zo3qS9gEuVzEhW4XaxISQQMkHe8faKkK6iHmIB/VjSQikDwwUNQ/
 AspYj82RyWBCDZsqhiYh71kpxjvS6Xp0bj39Ce1sNsOnuksxKkQ=
 =g8In
 -----END PGP SIGNATURE-----

Merge tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks

Pull kthread updates from Frederic Weisbecker:
 "Kthreads affinity follow either of 4 existing different patterns:

   1) Per-CPU kthreads must stay affine to a single CPU and never
      execute relevant code on any other CPU. This is currently handled
      by smpboot code which takes care of CPU-hotplug operations.
      Affinity here is a correctness constraint.

   2) Some kthreads _have_ to be affine to a specific set of CPUs and
      can't run anywhere else. The affinity is set through
      kthread_bind_mask() and the subsystem takes care by itself to
      handle CPU-hotplug operations. Affinity here is assumed to be a
      correctness constraint.

   3) Per-node kthreads _prefer_ to be affine to a specific NUMA node.
      This is not a correctness constraint but merely a preference in
      terms of memory locality. kswapd and kcompactd both fall into this
      category. The affinity is set manually like for any other task and
      CPU-hotplug is supposed to be handled by the relevant subsystem so
      that the task is properly reaffined whenever a given CPU from the
      node comes up. Also care should be taken so that the node affinity
      doesn't cross isolated (nohz_full) cpumask boundaries.

   4) Similar to the previous point except kthreads have a _preferred_
      affinity different than a node. Both RCU boost kthreads and RCU
      exp kworkers fall into this category as they refer to "RCU nodes"
      from a distinctly distributed tree.

  Currently the preferred affinity patterns (3 and 4) have at least 4
  identified users, with more or less success when it comes to handle
  CPU-hotplug operations and CPU isolation. Each of which do it in its
  own ad-hoc way.

  This is an infrastructure proposal to handle this with the following
  API changes:

   - kthread_create_on_node() automatically affines the created kthread
     to its target node unless it has been set as per-cpu or bound with
     kthread_bind[_mask]() before the first wake-up.

   - kthread_affine_preferred() is a new function that can be called
     right after kthread_create_on_node() to specify a preferred
     affinity different than the specified node.

  When the preferred affinity can't be applied because the possible
  targets are offline or isolated (nohz_full), the kthread is affine to
  the housekeeping CPUs (which means to all online CPUs most of the time
  or only the non-nohz_full CPUs when nohz_full= is set).

  kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
  converted, along with a few old drivers.

  Summary of the changes:

   - Consolidate a bunch of ad-hoc implementations of
     kthread_run_on_cpu()

   - Introduce task_cpu_fallback_mask() that defines the default last
     resort affinity of a task to become nohz_full aware

   - Add some correctness check to ensure kthread_bind() is always
     called before the first kthread wake up.

   - Default affine kthread to its preferred node.

   - Convert kswapd / kcompactd and remove their halfway working ad-hoc
     affinity implementation

   - Implement kthreads preferred affinity

   - Unify kthread worker and kthread API's style

   - Convert RCU kthreads to the new API and remove the ad-hoc affinity
     implementation"

* tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
  kthread: modify kernel-doc function name to match code
  rcu: Use kthread preferred affinity for RCU exp kworkers
  treewide: Introduce kthread_run_worker[_on_cpu]()
  kthread: Unify kthread_create_on_cpu() and kthread_create_worker_on_cpu() automatic format
  rcu: Use kthread preferred affinity for RCU boost
  kthread: Implement preferred affinity
  mm: Create/affine kswapd to its preferred node
  mm: Create/affine kcompactd to its preferred node
  kthread: Default affine kthread to its preferred NUMA node
  kthread: Make sure kthread hasn't started while binding it
  sched,arm64: Handle CPU isolation on last resort fallback rq selection
  arm64: Exclude nohz_full CPUs from 32bits el0 support
  lib: test_objpool: Use kthread_run_on_cpu()
  kallsyms: Use kthread_run_on_cpu()
  soc/qman: test: Use kthread_run_on_cpu()
  arm/bL_switcher: Use kthread_run_on_cpu()
2025-01-21 17:10:05 -08:00
John Harrison
709631924e drm/i915/uc: Include requested frequency in slow firmware load messages
To aid debug of sporadic issues, include the requested frequency in
the debug message as well as the actual frequency. That way we know
for certain that the clamping is not because the driver forgot to ask.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241221014329.4048408-1-John.C.Harrison@Intel.com
2025-01-17 11:29:08 -08:00
John Harrison
1113fc0e82 drm/i915: Add debug print about hw config table size
Add debug info to help investigate a very rare bug:
  https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13385

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241221011925.3944625-1-John.C.Harrison@Intel.com
2025-01-15 10:56:33 -08:00
Rodrigo Vivi
e0b0c6d207
drm/i915/guc/slpc: Print more SLPC debug status information
Let's peek on the Balancer and DCC status, now that we
are using the default strategies.

v2: fix identation
v3: fix typo (Vinay)

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250110144640.1032250-2-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-01-13 11:32:33 -05:00
Rodrigo Vivi
4f7fad42aa
drm/i915/guc/slpc: Allow GuC SLPC default strategies on MTL+
The Balancer and DCC strategies were left off on a fear that
these strategies would conflict with the i915's waitboost.

However, on MTL and Beyond these strategies are only active in
certain conditions where the system is TDP limited.
So, they don't conflict, but help the
waitboost by guaranteeing a bit more of GT frequency.

Without these strategies we were likely leaving some performance
behind on some scenarios.

With this change in place, the enabling/disabling of DCC and Balancer
will now be chosen by GuC, on a platform/GT basis.

v2: - Fix typos and be clear on GuC decision on platform basis (Vinay)
    - Limit change to MTL and beyond, where GuC started to take
      TDP limit into consideration.
v3: Fix compilation. Actually amend the changes...

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250110144640.1032250-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-01-13 11:32:25 -05:00
Dave Airlie
255e094a30 Merge tag 'drm-intel-gt-next-2025-01-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:

- More robust engine resets on Haswell and older (Nitin)

- Dead code removal (David)
- Selftest, logging and tracing improvements (Sk, Nitin, Sebastian,
  Apoorva)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z4DidoEACFu7D6iG@jlahtine-mobl.ger.corp.intel.com
2025-01-13 08:05:00 +10:00
Raag Jadav
9ef80eec5f drm/i915/selftest: Change throttle criteria for rps
Current live_rps_control() implementation errors out on throttling.
This was done with the assumption that throttling to minimum frequency
is a catastrophic failure, which is incorrect. Throttling can happen
due to variety of reasons and often times out of our control. Also,
the resulting frequency can be at any given point below the maximum
allowed. Change throttle criteria to reflect this logic and drop the
error, as it doesn't necessarily mean selftest failure.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250102110618.174415-1-raag.jadav@intel.com
2025-01-11 13:44:40 +01:00
Nitin Gote
6f0572fa8f drm/i915/gt: Prefer IS_ENABLED() instead of defined() on config option
Use IS_ENABLED() instead of defined() for checking whether a kconfig
option is enabled.

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250103062651.798249-2-nitin.r.gote@intel.com
2025-01-09 13:43:16 +01:00
Frederic Weisbecker
b04e317b52 treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-01-08 18:15:03 +01:00
Apoorva Singh
d6b24cc3e2
drm/i915/gt: Prevent uninitialized pointer reads
Initialize rq to NULL to prevent uninitialized pointer reads.

Signed-off-by: Apoorva Singh <apoorva.singh@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227112920.1547592-1-apoorva.singh@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-01-08 08:54:57 -05:00
Jani Nikula
6f0f335b73 Merge drm/drm-next into drm-intel-next
Backmerge to get the DRM DP payload and ACT helpers to drm-intel-next.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-01-07 18:07:54 +02:00
Dr. David Alan Gilbert
0a1584ec3d drm/i915: Remove unused intel_ring_cacheline_align
The last use of intel_ring_cacheline_align() was removed in 2017 by
commit afa8ce5b30 ("drm/i915: Nuke legacy flip queueing code")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-3-tursulin@igalia.com
2024-12-30 01:31:56 +01:00
Dr. David Alan Gilbert
64420d2f3e drm/i915: Remove unused intel_huc_suspend
intel_huc_suspend() was added in 2022 by
commit 27536e0327 ("drm/i915/huc: track delayed HuC load with a
fence")
but hasn't been used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-2-tursulin@igalia.com
2024-12-30 01:31:45 +01:00
Sebastian Brzezinka
835443da6f drm/i915/gt: Log reason for setting TAINT_WARN at reset
TAINT_WARN is used to notify CI about non-recoverable failures, which
require device to be restarted. In some cases, there is no sufficient
information about the reason for the restart. The test runner is just
killed, and DUT is rebooted, logging only 'probe with driver i915 failed
with error -4' to dmesg.

Printing error to dmesg before TAINT_WARN, would explain why the device
has been restarted, and what caused the malfunction in the first place.

Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241220131714.1309483-1-andi.shyti@linux.intel.com
2024-12-23 20:12:20 +01:00
Rodrigo Vivi
de7061947b
drm/i915/dg1: Fix power gate sequence.
sub-pipe PG is not present on DG1. Setting these bits can disable
other power gates and cause GPU hangs on video playbacks.

VLK: 16314, 4304

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381
Fixes: 85a12d7eb8 ("drm/i915/tgl: Fix Media power gate sequence.")
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219210019.70532-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-12-20 13:49:40 -05:00
Nitin Gote
5ed539e327 drm/i915/gt: Use ENGINE_TRACE for tracing.
Instead of drm_err(), prefer gt_err() and ENGINE_TRACE()
for GEM tracing in i915. So, it will be good to use ENGINE_TRACE()
over drm_err() drm_device based logging for engine debug log.

v2: Bit more specific in commit description (Andi)

v3: Use gt_err() along with ENGINE_TRACE() in place of drm_err() (Andi)

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241217100058.2819053-1-nitin.r.gote@intel.com
2024-12-20 11:42:26 +01:00
Sk Anirban
63b81a3a77 drm/i915/selftests: Implement frequency logging for energy reading validation
Add RC6 & RC0 frequency printing to ensure accurate energy
readings aimed at addressing GPU energy leaks and power
measurement failures.
Also update sleep time for RC6 mode to match RC0.

v2:
  - Improved commit message.
v3:
  - Used pr_err log to display frequency (Anshuman)
  - Sorted headers alphabetically (Sai Teja)
v4:
  - Improved commit message.
  - Fix pr_err log (Sai Teja)
v5:
  - Add error & debug logging for RC0 power and frequency checks (Anshuman)
v6:
  - Modify debug logging for RC0 power and frequency checks (Sai Teja)
v7:
  - Use pr_debug if RC0 power isn't measured but frequency is (Anshuman)
  - Improved commit message (Badal)
  - Change API to read actual frequency without applying forcewake (Badal)
  - Update sleep time for RC6 mode (Anshuman)

Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241129154716.2764974-1-sk.anirban@intel.com
2024-12-19 11:06:03 +01:00
Dave Airlie
301e277229 Merge tag 'drm-intel-gt-next-2024-12-18' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:

- More accurate engine busyness metrics with GuC submission (Umesh)
- Ensure partial BO segment offset never exceeds allowed max (Krzysztof)
- Flush GuC CT receive tasklet during reset preparation (Zhanjun)

- Code cleanups and refactoring (David, Lucas)
- Debugging improvements (Jesus)
- Selftest improvements (Sk)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z2KadNXgumx1aQMP@jlahtine-mobl.ger.corp.intel.com
2024-12-19 07:59:21 +10:00
Nitin Gote
512eadb334 drm/i915/gt: Increase a time to retry RING_HEAD reset
Issue seen again where engine resets fails because the engine resumes from
an incorrect RING_HEAD. HEAD is still not 0 even after writing into it.
This seems to be timing issue and we experimented different values from 5ms
to 50ms and found out that 50ms works best based on testing.
So, if write doesn't succeed at first then retry again.

v2: add a comment (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12806
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241217063532.2729031-1-nitin.r.gote@intel.com
2024-12-18 15:26:09 +01:00
Jesus Narvaez
f373ebec18 drm/i915/guc: Update guc_err message to show outstanding g2h responses
Updating the guc_error message to show how many g2h responses
are still outstanding, in order to help with future debugging.

Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213204720.3918056-1-jesus.narvaez@intel.com
2024-12-17 11:38:50 -08:00
Umesh Nerlige Ramappa
7ed047da59 i915/guc: Accumulate active runtime on gt reset
On gt reset, if a context is running, then accumulate it's active time
into the busyness counter since there will be no chance for the context
to switch out and update it's run time.

v2: Move comment right above the if (John)

Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-4-umesh.nerlige.ramappa@intel.com
2024-12-13 15:13:51 -08:00
Umesh Nerlige Ramappa
cf907f6d29 i915/guc: Ensure busyness counter increases motonically
Active busyness of an engine is calculated using gt timestamp and the
context switch in time. While capturing the gt timestamp, it's possible
that the context switches out. This race could result in an active
busyness value that is greater than the actual context runtime value by a
small amount. This leads to a negative delta and throws off busyness
calculations for the user.

If a subsequent count is smaller than the previous one, just return the
previous one, since we expect the busyness to catch up.

Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-3-umesh.nerlige.ramappa@intel.com
2024-12-13 15:13:50 -08:00
Umesh Nerlige Ramappa
abd318237f i915/guc: Reset engine utilization buffer before registration
On GT reset, we store total busyness counts for all engines and
re-register the utilization buffer with GuC. At that time we should
reset the buffer, so that we don't get spurious busyness counts on
subsequent queries.

To repro this issue, run igt@perf_pmu@busy-hang followed by
igt@perf_pmu@most-busy-idle-check-all for a couple iterations.

Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-2-umesh.nerlige.ramappa@intel.com
2024-12-13 15:13:49 -08:00
Rodrigo Vivi
e7f0a3a6f7
Merge drm/drm-next into drm-intel-next
Catching up with 6.13-rc2.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-12-11 15:06:05 -05:00
Sk Anirban
630e03808a drm/i915/selftests: Add delay to stabilize frequency in live_rps_power
Add delays to allow frequency stabilization before power measurement
to fix sporadic power conservation issues in live_rps_power test.

v2:
  - Move delay to respective function (Badal)

Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241203061114.2790448-1-sk.anirban@intel.com
2024-12-06 20:15:07 +05:30
Ville Syrjälä
106216c220 drm/i915/dpt: Evict all DPT VMAs on suspend
Currently intel_dpt_resume() tries to blindly rewrite all the
PTEs for currently bound DPT VMAs. That is problematic because
the CPU mapping for the DPT is only really guaranteed to exist
while the DPT object has been pinned. In the past we worked
around this issue by making DPT objects unshrinkable, but that
is undesirable as it'll waste physical RAM.

Let's instead forcefully evict all the DPT VMAs on suspend,
thus guaranteeing that intel_dpt_resume() has nothing to do.
To guarantee that all the DPT VMAs are evictable by
intel_dpt_suspend() we need to flush the cleanup workqueue
after the display output has been shut down.

And for good measure throw in a few extra WARNs to catch
any mistakes.

Cc: Brian Geffon <bgeffon@google.com>
Cc: Vidya Srinivas <vidya.srinivas@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241127061117.25622-4-ville.syrjala@linux.intel.com
Reviewed-by: Vidya Srinivas <vidya.srinivas@intel.com>
Tested-by: Vidya Srinivas <vidya.srinivas@intel.com>
2024-11-28 17:33:44 +02:00
Jani Nikula
40c9ad5f2d drm/i915/overlay: convert to struct intel_display
struct intel_display replaces struct drm_i915_private as the main
display device pointer. Convert overlay to it, as much as possible.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3680586c05e82fd01d173cfb4f8df015d6db663c.1732102179.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-11-22 13:56:35 +02:00
Linus Torvalds
28eb75e178 drm for 6.13-rc1
core:
 - split DSC helpers from DP helpers
 - clang build fixes for drm/mm test
 - drop simple pipeline support for gem vram
 - document submission error signaling
 - move drm_rect to drm core module from kms helper
 - add default client setup to most drivers
 - move to video aperture helpers instead of drm ones
 
 tests:
 - new framebuffer tests
 
 ttm:
 - remove swapped and pinned BOs from TTM lru
 
 panic:
 - fix uninit spinlock
 - add ABGR2101010 support
 
 bridge:
 - add TI TDP158 support
 - use standard PM OPS
 
 dma-fence:
 - use read_trylock instead of read_lock to help lockdep
 
 scheduler:
 - add errno to sched start to report different errors
 - add locking to drm_sched_entity_modify_sched
 - improve documentation
 
 xe:
 - add drm_line_printer
 - lots of refactoring
 - Enable Xe2 + PES disaggregation
 - add new ARL PCI ID
 - SRIOV development work
 - fix exec unnecessary implicit fence
 - define and parse OA sync props
 - forcewake refactoring
 
 i915:
 - Enable BMG/LNL ultra joiner
 - Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+
 - use DSB for plane/color mgmt
 - Arrow lake PCI IDs
 - lots of i915/xe display refactoring
 - enable PXP GuC autoteardown
 - Pantherlake (PTL) Xe3 LPD display enablement
 - Allow fastset HDR infoframe changes
 - write DP source OUI for non-eDP sinks
 - share PCI IDs between i915 and xe
 
 amdgpu:
 - SDMA queue reset support
 - SMU 13.0.6, JPEG 4.0.3 updates
 - Initial runtime repartitioning support
 - rework IP structs for multiple IP instances
 - Fetch EDID from _DDC if available
 - SMU13 zero rpm user control
 - lots of fixes/cleanups
 
 amdkfd:
 - Increase event FIFO size
 - add topology cap flag for per queue reset
 
 msm:
 - DPU:
 - SA8775P support
 - (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support
 - Enable large framebuffer support
 - Drop MSM8998 and SDM845
 - DP:
 - SA8775P support
 - GPU:
 - a7xx preemption support
 - Adreno A663 support
 
 ast:
 - warn about unsupported TX chips
 
 ivpu:
 - add coredump
 - add pantherlake support
 
 rockchip:
 - 4K@60Hz display enablement
 - generate pll programming tables
 
 panthor:
 - add timestamp query API
 - add realtime group priority
 - add fdinfo support
 
 etnaviv:
 - improve handling of DMA address limits
 - improve GPU hangcheck
 
 exynos:
 - Decon Exynos7870 support
 
 mediatek:
 - add OF graph support
 
 omap:
 - locking fixes
 
 bochs:
 - convert to gem/shmem from simpledrm
 
 v3d:
 - support big/super pages
 - add gemfs
 
 vc4:
 - BCM2712 support refactoring
 - add YUV444 format support
 
 udmabuf:
 - folio related fixes
 
 nouveau:
 - add panic support on nv50+
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmc+efwACgkQDHTzWXnE
 hr6Dyg/9HVVI3lxuWAz9MEt3w+BON5KTJAxg5Zhvc5DwiUbDXghu8sfkUfanDWS5
 /MqyPqLt5srXrtKTRDnzEI0Vf8YHeiDEcaydjpshEpCfteHZ7SADpvem8fp6/otV
 iYt8U6tMcGe9I+M2kwDkOTrKJIiyCKPi5hfBIAkxEAh6806ifPRtLkeMGbaSwBxH
 x6kZTE9ygGWAY7bAgbmVmm3JwrXG9mYDl9dW3cbi9gZ6PGAXHPZRUPvZoHhvfC2A
 UVgROH76Spm4rdWYGI3azj+gW3HsdGgUHcysb+lu37i261E+sT7kuV2UYtnOMzr5
 igO1RlQ+rcfPYLG4n+oNXDMu5d1OQXELrlQzXptym4Konpd7b/GSeVctWV0wHWuv
 nG8g7DWAFFnLAdeWqLZpf1Brze33h5+572D3BioWB4LYSEATjwoTwcBKsdRuc4Wk
 RHxjumCidybTdo/8EB1ElGlH39m/mDQA0scMlVhS/BuiIssfgcBRfltI8S3HzHcW
 YQYq6xH7F9E3shs3/TYbWR4clm66ZTnZV6ClDfGJolzyF/hbV0rsbeSpDelpooE8
 1Js7KuwVa+HvA4jtupY9vqxMTdXWwoGPfuUgKpOAreYibnd1T9Q1zVme/B1bUH05
 518IjiMGCxDnBvFWaPT9DcX4zg7pS3yzjw3hGkdz3reUqat0Gy8=
 =8cUI
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "There's a lot of rework, the panic helper support is being added to
  more drivers, v3d gets support for HW superpages, scheduler
  documentation, drm client and video aperture reworks, some new
  MAINTAINERS added, amdgpu has the usual lots of IP refactors, Intel
  has some Pantherlake enablement and xe is getting some SRIOV bits, but
  just lots of stuff everywhere.

  core:
   - split DSC helpers from DP helpers
   - clang build fixes for drm/mm test
   - drop simple pipeline support for gem vram
   - document submission error signaling
   - move drm_rect to drm core module from kms helper
   - add default client setup to most drivers
   - move to video aperture helpers instead of drm ones

  tests:
   - new framebuffer tests

  ttm:
   - remove swapped and pinned BOs from TTM lru

  panic:
   - fix uninit spinlock
   - add ABGR2101010 support

  bridge:
   - add TI TDP158 support
   - use standard PM OPS

  dma-fence:
   - use read_trylock instead of read_lock to help lockdep

  scheduler:
   - add errno to sched start to report different errors
   - add locking to drm_sched_entity_modify_sched
   - improve documentation

  xe:
   - add drm_line_printer
   - lots of refactoring
   - Enable Xe2 + PES disaggregation
   - add new ARL PCI ID
   - SRIOV development work
   - fix exec unnecessary implicit fence
   - define and parse OA sync props
   - forcewake refactoring

  i915:
   - Enable BMG/LNL ultra joiner
   - Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+
   - use DSB for plane/color mgmt
   - Arrow lake PCI IDs
   - lots of i915/xe display refactoring
   - enable PXP GuC autoteardown
   - Pantherlake (PTL) Xe3 LPD display enablement
   - Allow fastset HDR infoframe changes
   - write DP source OUI for non-eDP sinks
   - share PCI IDs between i915 and xe

  amdgpu:
   - SDMA queue reset support
   - SMU 13.0.6, JPEG 4.0.3 updates
   - Initial runtime repartitioning support
   - rework IP structs for multiple IP instances
   - Fetch EDID from _DDC if available
   - SMU13 zero rpm user control
   - lots of fixes/cleanups

  amdkfd:
   - Increase event FIFO size
   - add topology cap flag for per queue reset

  msm:
   - DPU:
      - SA8775P support
      - (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support
      - Enable large framebuffer support
      - Drop MSM8998 and SDM845
   - DP:
      - SA8775P support
   - GPU:
      - a7xx preemption support
      - Adreno A663 support

  ast:
   - warn about unsupported TX chips

  ivpu:
   - add coredump
   - add pantherlake support

  rockchip:
   - 4K@60Hz display enablement
   - generate pll programming tables

  panthor:
   - add timestamp query API
   - add realtime group priority
   - add fdinfo support

  etnaviv:
   - improve handling of DMA address limits
   - improve GPU hangcheck

  exynos:
   - Decon Exynos7870 support

  mediatek:
   - add OF graph support

  omap:
   - locking fixes

  bochs:
   - convert to gem/shmem from simpledrm

  v3d:
   - support big/super pages
   - add gemfs

  vc4:
   - BCM2712 support refactoring
   - add YUV444 format support

  udmabuf:
   - folio related fixes

  nouveau:
   - add panic support on nv50+"

* tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel: (1583 commits)
  drm/xe/guc: Fix dereference before NULL check
  drm/amd: Fix initialization mistake for NBIO 7.7.0
  Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"
  drm/amd/display: Fix failure to read vram info due to static BP_RESULT
  drm/amdgpu: enable GTT fallback handling for dGPUs only
  drm/amd/amdgpu: limit single process inside MES
  drm/fourcc: add AMD_FMT_MOD_TILE_GFX9_4K_D_X
  drm/amdgpu/mes12: correct kiq unmap latency
  drm/amdgpu: Support vcn and jpeg error info parsing
  drm/amd : Update MES API header file for v11 & v12
  drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling
  drm/amdkfd: change kfd process kref count at creation
  drm/amdgpu: Cleanup shift coding style
  drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data
  drm/amdgpu: Implement virt req_ras_err_count
  drm/amdgpu: VF Query RAS Caps from Host if supported
  drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry
  drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
  drm/amd/display: 3.2.309
  drm/amd/display: Adjust VSDB parser for replay feature
  ...
2024-11-21 14:56:17 -08:00
Linus Torvalds
0352387523 First step of consolidating the VDSO data page handling:
The VDSO data page handling is architecture specific for historical
   reasons, but there is no real technical reason to do so.
 
   Aside of that VDSO data has become a dump ground for various mechanisms
   and fail to provide a clear separation of the functionalities.
 
   Clean this up by:
 
     * consolidating the VDSO page data by getting rid of architecture
       specific warts especially in x86 and PowerPC.
 
     * removing the last includes of header files which are pulling in other
       headers outside of the VDSO namespace.
 
     * seperating timekeeping and other VDSO data accordingly.
 
   Further consolidation of the VDSO page handling is done in subsequent
   changes scheduled for the next merge window.
 
   This also lays the ground for expanding the VDSO time getters for
   independent PTP clocks in a generic way without making every architecture
   add support seperately.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmc7kyoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVBjD/9awdN2YeCGIM9rlHIktUdNRmRSL2SL
 6av1CPffN5DenONYTXWrDYPkC4yfjUwIs8H57uzFo10yA7RQ/Qfq+O68k5GnuFew
 jvpmmYSZ6TT21AmAaCIhn+kdl9YbEJFvN2AWH85Bl29k9FGB04VzJlQMMjfEZ1a5
 Mhwv+cfYNuPSZmU570jcxW2XgbyTWlLZBByXX/Tuz9bwpmtszba507bvo45x6gIP
 twaWNzrsyJpdXfMrfUnRiChN8jHlDN7I6fgQvpsoRH5FOiVwIFo0Ip2rKbk+ONfD
 W/rcU5oeqRIxRVDHzf2Sv8WPHMCLRv01ZHBcbJOtgvZC3YiKgKYoeEKabu9ZL1BH
 6VmrxjYOBBFQHOYAKPqBuS7BgH5PmtMbDdSZXDfRaAKaCzhCRysdlWW7z48r2R//
 zPufb7J6Tle23AkuZWhFjvlGgSBl4zxnTFn31HYOyQps3TMI4y50Z2DhE/EeU8a6
 DRl8/k1KQVDUZ6udJogS5kOr1J8pFtUPrA2uhR8UyLdx7YKiCzcdO1qWAjtXlVe8
 oNpzinU+H9bQqGe9IyS7kCG9xNaCRZNkln5Q1WfnkTzg5f6ihfaCvIku3l4bgVpw
 3HmcxYiC6RxQB+ozwN7hzCCKT4L9aMhr/457TNOqRkj2Elw3nvJ02L4aI86XAKLE
 jwO9Fkp9qcCxCw==
 =q5eD
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vdso data page handling updates from Thomas Gleixner:
 "First steps of consolidating the VDSO data page handling.

  The VDSO data page handling is architecture specific for historical
  reasons, but there is no real technical reason to do so.

  Aside of that VDSO data has become a dump ground for various
  mechanisms and fail to provide a clear separation of the
  functionalities.

  Clean this up by:

   - consolidating the VDSO page data by getting rid of architecture
     specific warts especially in x86 and PowerPC.

   - removing the last includes of header files which are pulling in
     other headers outside of the VDSO namespace.

   - seperating timekeeping and other VDSO data accordingly.

  Further consolidation of the VDSO page handling is done in subsequent
  changes scheduled for the next merge window.

  This also lays the ground for expanding the VDSO time getters for
  independent PTP clocks in a generic way without making every
  architecture add support seperately"

* tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/vdso: Add missing brackets in switch case
  vdso: Rename struct arch_vdso_data to arch_vdso_time_data
  powerpc: Split systemcfg struct definitions out from vdso
  powerpc: Split systemcfg data out of vdso data page
  powerpc: Add kconfig option for the systemcfg page
  powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors
  powerpc/pseries/lparcfg: Fix printing of system_active_processors
  powerpc/procfs: Propagate error of remap_pfn_range()
  powerpc/vdso: Remove offset comment from 32bit vdso_arch_data
  x86/vdso: Split virtual clock pages into dedicated mapping
  x86/vdso: Delete vvar.h
  x86/vdso: Access vdso data without vvar.h
  x86/vdso: Move the rng offset to vsyscall.h
  x86/vdso: Access rng vdso data without vvar.h
  x86/vdso: Access timens vdso data without vvar.h
  x86/vdso: Allocate vvar page from C code
  x86/vdso: Access rng data from kernel without vvar
  x86/vdso: Place vdso_data at beginning of vvar page
  x86/vdso: Use __arch_get_vdso_data() to access vdso data
  x86/mm/mmap: Remove arch_vma_name()
  ...
2024-11-19 16:09:13 -08:00
Linus Torvalds
4c797b11a8 vfs-6.13.file
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZzcW4gAKCRCRxhvAZXjc
 okF+AP9xTMb2SlnRPBOBd9yFcmVXmQi86TSCUPAEVb+wIldGYwD/RIOdvXYJlp9v
 RgJkU1DC3ddkXtONNDY6gFaP+siIWA0=
 =gMc7
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs file updates from Christian Brauner:
 "This contains changes the changes for files for this cycle:

   - Introduce a new reference counting mechanism for files.

     As atomic_inc_not_zero() is implemented with a try_cmpxchg() loop
     it has O(N^2) behaviour under contention with N concurrent
     operations and it is in a hot path in __fget_files_rcu().

     The rcuref infrastructures remedies this problem by using an
     unconditional increment relying on safe- and dead zones to make
     this work and requiring rcu protection for the data structure in
     question. This not just scales better it also introduces overflow
     protection.

     However, in contrast to generic rcuref, files require a memory
     barrier and thus cannot rely on *_relaxed() atomic operations and
     also require to be built on atomic_long_t as having massive amounts
     of reference isn't unheard of even if it is just an attack.

     This adds a file specific variant instead of making this a generic
     library.

     This has been tested by various people and it gives consistent
     improvement up to 3-5% on workloads with loads of threads.

   - Add a fastpath for find_next_zero_bit(). Skip 2-levels searching
     via find_next_zero_bit() when there is a free slot in the word that
     contains the next fd. This improves pts/blogbench-1.1.0 read by 8%
     and write by 4% on Intel ICX 160.

   - Conditionally clear full_fds_bits since it's very likely that a bit
     in full_fds_bits has been cleared during __clear_open_fds(). This
     improves pts/blogbench-1.1.0 read up to 13%, and write up to 5% on
     Intel ICX 160.

   - Get rid of all lookup_*_fdget_rcu() variants. They were used to
     lookup files without taking a reference count. That became invalid
     once files were switched to SLAB_TYPESAFE_BY_RCU and now we're
     always taking a reference count. Switch to an already existing
     helper and remove the legacy variants.

   - Remove pointless includes of <linux/fdtable.h>.

   - Avoid cmpxchg() in close_files() as nobody else has a reference to
     the files_struct at that point.

   - Move close_range() into fs/file.c and fold __close_range() into it.

   - Cleanup calling conventions of alloc_fdtable() and expand_files().

   - Merge __{set,clear}_close_on_exec() into one.

   - Make __set_open_fd() set cloexec as well instead of doing it in two
     separate steps"

* tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests: add file SLAB_TYPESAFE_BY_RCU recycling stressor
  fs: port files to file_ref
  fs: add file_ref
  expand_files(): simplify calling conventions
  make __set_open_fd() set cloexec state as well
  fs: protect backing files with rcu
  file.c: merge __{set,clear}_close_on_exec()
  alloc_fdtable(): change calling conventions.
  fs/file.c: add fast path in find_next_fd()
  fs/file.c: conditionally clear full_fds
  fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()
  move close_range(2) into fs/file.c, fold __close_range() into it
  close_files(): don't bother with xchg()
  remove pointless includes of <linux/fdtable.h>
  get rid of ...lookup...fdget_rcu() family
2024-11-18 10:30:29 -08:00
Daniele Ceraolo Spurio
db0fc586ed drm/i915/gsc: ARL-H and ARL-U need a newer GSC FW.
All MTL and ARL SKUs share the same GSC FW, but the newer platforms are
only supported in newer blobs. In particular, ARL-S is supported
starting from 102.0.10.1878 (which is already the minimum required
version for ARL in the code), while ARL-H and ARL-U are supported from
102.1.15.1926. Therefore, the driver needs to check which specific ARL
subplatform its running on when verifying that the GSC FW is new enough
for it.

Fixes: 2955ae8186 ("drm/i915: ARL requires a newer GSC firmware")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241028233132.149745-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 3c1d5ced18)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-11-12 09:44:55 +02:00
Zhanjun Dong
b939a08bc3 drm/i915/guc: Flush ct receive tasklet during reset preparation
GuC to host communication is interrupt driven, the handling has 3
parts: interrupt context, tasklet and request queue worker.
During GuC reset prepare, interrupt is disabled before destroy
contexts steps start. The IRQ and worker are flushed to finish
any outstanding in-progress message handling. But, the tasklet
flush is missing, it might causes 2 race conditions:
1. Tasklet runs after IRQ flushed, add request to queue after worker
flush started, causes unexpected G2H message request processing,
meanwhile, reset prepare code already get the context destroyed.
This will causes error reported about bad context state.
(https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11349 and
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12303)
2. Tasklet runs after intel_guc_submission_reset_prepare,
ct_try_receive_message start to run, while intel_uc_reset_prepare
already finished guc sanitize and set ct->enable to false. This will
causes warning on incorrect ct->enable state.
(https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12439)

Add the missing tasklet flush to flush all 3 parts.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104214103.214702-1-zhanjun.dong@intel.com
2024-11-06 12:29:30 -08:00
Dave Airlie
bf99ceb6e0 Merge tag 'drm-intel-next-2024-11-04' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 feature pull #2 for v6.13:

Features and functionality:

- Pantherlake (PTL) Xe3 LPD display enabling for xe driver (Clint, Suraj,
  Dnyaneshwar, Matt, Gustavo, Radhakrishna, Chaitanya, Haridhar, Juha-Pekka, Ravi)
- Enable dbuf overlap detection on Lunarlake and later (Stanislav, Vinod)
- Allow fastset for HDR infoframe changes (Chaitanya)
- Write DP source OUI also for non-eDP sinks (Imre)

Refactoring and cleanups:
- Independent platform identification for display (Jani)
- Display tracepoint fixes and cleanups (Gustavo)
- Share PCI ID headers between i915 and xe drivers (Jani)
- Use x100 version for full version and release checks (Jani)
- Conversions to struct intel_display (Jani, Ville)
- Reuse DP DPCD and AUX macros in gvt instead of duplication (Jani)
- Use string choice helpers (R Sundar, Sai Teja)
- Remove unused underrun detection irq code (Sai Teja)
- Color management debug improvements and other cleanups (Ville)
- Refactor panel fitter code to a separate file (Ville)
- Use try_cmpxchg() instead of open-coding (Uros Bizjak)

Fixes:
- PSR and Panel Replay fixes and workarounds (Jouni)
- Fix panel power during connector detection (Imre)
- Fix connector detection and modeset races (Imre)
- Fix C20 PHY TX MISC configuration (Gustavo)
- Improve panel fitter validity checks (Ville)
- Fix eDP short HPD interrupt handling while runtime suspended (Imre)
- Propagate DP MST DSC BW overhead/slice calculation errors (Imre)
- Stop hotplug polling for eDP connectors (Imre)
- Workaround panels reporting bad link status after PSR enable (Jouni)
- Panel Replay VRR VSC SDP related workaround and refactor (Animesh, Mitul)
- Fix memory leak on eDP init error path (Shuicheng)
- Fix GVT KVMGT Kconfig dependencies (Arnd Bergmann)
- Fix irq function documentation build warning (Rodrigo)
- Add platform check to power management fuse bit read (Clint)
- Revert kstrdup_const() and kfree_const() usage for clarity (Christophe JAILLET)
- Workaround horizontal odd panning issues in display versions 20 and 30 (Nemesa)
- Fix xe drive HDCP GSC firmware check (Suraj)

Merges:
- Backmerge drm-next to get some KVM changes (Rodrigo)
- Fix a build failure originating from previous backmerge (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/display/intel_dp_mst.c
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87h68ni0wd.fsf@intel.com
2024-11-06 09:08:53 +10:00
Daniele Ceraolo Spurio
3c1d5ced18 drm/i915/gsc: ARL-H and ARL-U need a newer GSC FW.
All MTL and ARL SKUs share the same GSC FW, but the newer platforms are
only supported in newer blobs. In particular, ARL-S is supported
starting from 102.0.10.1878 (which is already the minimum required
version for ARL in the code), while ARL-H and ARL-U are supported from
102.1.15.1926. Therefore, the driver needs to check which specific ARL
subplatform its running on when verifying that the GSC FW is new enough
for it.

Fixes: 2955ae8186 ("drm/i915: ARL requires a newer GSC firmware")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241028233132.149745-1-daniele.ceraolospurio@intel.com
2024-11-05 14:26:21 -08:00
Dr. David Alan Gilbert
b7cfe79f06
drm/i915/gt: Remove unused execlists_unwind_incomplete_requests
execlists_unwind_incomplete_requests() is unused since 2021's
commit eb5e7da736 ("drm/i915/guc: Reset implementation for new GuC
interface")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241103144936.238116-1-linux@treblig.org
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-11-04 16:31:44 -05:00
Christian Brauner
90ee6ed776
fs: port files to file_ref
Port files to rely on file_ref reference to improve scaling and gain
overflow protection.

- We continue to WARN during get_file() in case a file that is already
  marked dead is revived as get_file() is only valid if the caller
  already holds a reference to the file. This hasn't changed just the
  check changes.

- The semantics for epoll and ttm's dmabuf usage have changed. Both
  epoll and ttm synchronize with __fput() to prevent the underlying file
  from beeing freed.

  (1) epoll

      Explaining epoll is straightforward using a simple diagram.
      Essentially, the mutex of the epoll instance needs to be taken in both
      __fput() and around epi_fget() preventing the file from being freed
      while it is polled or preventing the file from being resurrected.

          CPU1                                   CPU2
          fput(file)
          -> __fput(file)
             -> eventpoll_release(file)
                -> eventpoll_release_file(file)
                                                 mutex_lock(&ep->mtx)
                                                 epi_item_poll()
                                                 -> epi_fget()
                                                    -> file_ref_get(file)
                                                 mutex_unlock(&ep->mtx)
                   mutex_lock(&ep->mtx);
                   __ep_remove()
                   mutex_unlock(&ep->mtx);
             -> kmem_cache_free(file)

  (2) ttm dmabuf

      This explanation is a bit more involved. A regular dmabuf file stashed
      the dmabuf in file->private_data and the file in dmabuf->file:

          file->private_data = dmabuf;
          dmabuf->file = file;

      The generic release method of a dmabuf file handles file specific
      things:

          f_op->release::dma_buf_file_release()

      while the generic dentry release method of a dmabuf handles dmabuf
      freeing including driver specific things:

          dentry->d_release::dma_buf_release()

      During ttm dmabuf initialization in ttm_object_device_init() the ttm
      driver copies the provided struct dma_buf_ops into a private location:

          struct ttm_object_device {
                  spinlock_t object_lock;
                  struct dma_buf_ops ops;
                  void (*dmabuf_release)(struct dma_buf *dma_buf);
                  struct idr idr;
          };

          ttm_object_device_init(const struct dma_buf_ops *ops)
          {
                  // copy original dma_buf_ops in private location
                  tdev->ops = *ops;

                  // stash the release method of the original struct dma_buf_ops
                  tdev->dmabuf_release = tdev->ops.release;

                  // override the release method in the copy of the struct dma_buf_ops
                  // with ttm's own dmabuf release method
                  tdev->ops.release = ttm_prime_dmabuf_release;
          }

      When a new dmabuf is created the struct dma_buf_ops with the overriden
      release method set to ttm_prime_dmabuf_release is passed in exp_info.ops:

          DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
          exp_info.ops = &tdev->ops;
          exp_info.size = prime->size;
          exp_info.flags = flags;
          exp_info.priv = prime;

      The call to dma_buf_export() then sets

          mutex_lock_interruptible(&prime->mutex);
          dma_buf = dma_buf_export(&exp_info)
          {
                  dmabuf->ops = exp_info->ops;
          }
          mutex_unlock(&prime->mutex);

      which creates a new dmabuf file and then install a file descriptor to
      it in the callers file descriptor table:

          ret = dma_buf_fd(dma_buf, flags);

      When that dmabuf file is closed we now get:

          fput(file)
          -> __fput(file)
             -> f_op->release::dma_buf_file_release()
             -> dput()
                -> d_op->d_release::dma_buf_release()
                   -> dmabuf->ops->release::ttm_prime_dmabuf_release()
                      mutex_lock(&prime->mutex);
                      if (prime->dma_buf == dma_buf)
                            prime->dma_buf = NULL;
                      mutex_unlock(&prime->mutex);

      Where we can see that prime->dma_buf is set to NULL. So when we have
      the following diagram:

          CPU1                                                          CPU2
          fput(file)
          -> __fput(file)
             -> f_op->release::dma_buf_file_release()
             -> dput()
                -> d_op->d_release::dma_buf_release()
                   -> dmabuf->ops->release::ttm_prime_dmabuf_release()
                                                                        ttm_prime_handle_to_fd()
                                                                        mutex_lock_interruptible(&prime->mutex)
                                                                        dma_buf = prime->dma_buf
                                                                        dma_buf && get_dma_buf_unless_doomed(dma_buf)
                                                                        -> file_ref_get(dma_buf->file)
                                                                        mutex_unlock(&prime->mutex);

                      mutex_lock(&prime->mutex);
                      if (prime->dma_buf == dma_buf)
                            prime->dma_buf = NULL;
                      mutex_unlock(&prime->mutex);
             -> kmem_cache_free(file)

      The logic of the mechanism is the same as for epoll: sync with
      __fput() preventing the file from being freed. Here the
      synchronization happens through the ttm instance's prime->mutex.
      Basically, the lifetime of the dma_buf and the file are tighly
      coupled.

  Both (1) and (2) used to call atomic_inc_not_zero() to check whether
  the file has already been marked dead and then refuse to revive it.

  This is only safe because both (1) and (2) sync with __fput() and thus
  prevent kmem_cache_free() on the file being called and thus prevent
  the file from being immediately recycled due to SLAB_TYPESAFE_BY_RCU.

  Both (1) and (2) have been ported from atomic_inc_not_zero() to
  file_ref_get(). That means a file that is already in the process of
  being marked as FILE_REF_DEAD:

  file_ref_put()
  cnt = atomic_long_dec_return()
  -> __file_ref_put(cnt)
     if (cnt == FIlE_REF_NOREF)
             atomic_long_try_cmpxchg_release(cnt, FILE_REF_DEAD)

  can be revived again:

  CPU1                                                             CPU2
  file_ref_put()
  cnt = atomic_long_dec_return()
  -> __file_ref_put(cnt)
     if (cnt == FIlE_REF_NOREF)
                                                                   file_ref_get()
                                                                   // Brings reference back to FILE_REF_ONEREF
                                                                   atomic_long_add_negative()
             atomic_long_try_cmpxchg_release(cnt, FILE_REF_DEAD)

  This is fine and inherent to the file_ref_get()/file_ref_put()
  semantics. For both (1) and (2) this is safe because __fput() is
  prevented from making progress if file_ref_get() fails due to the
  aforementioned synchronization mechanisms.

  Two cases need to be considered that affect both (1) epoll and (2) ttm
  dmabuf:

   (i) fput()'s file_ref_put() and marks the file as FILE_REF_NOREF but
       before that fput() can mark the file as FILE_REF_DEAD someone
       manages to sneak in a file_ref_get() and brings the refcount back
       from FILE_REF_NOREF to FILE_REF_ONEREF. In that case the original
       fput() doesn't call __fput(). For epoll the poll will finish and
       for ttm dmabuf the file can be used again. For ttm dambuf this is
       actually an advantage because it avoids immediately allocating
       a new dmabuf object.

       CPU1                                                             CPU2
       file_ref_put()
       cnt = atomic_long_dec_return()
       -> __file_ref_put(cnt)
          if (cnt == FIlE_REF_NOREF)
                                                                        file_ref_get()
                                                                        // Brings reference back to FILE_REF_ONEREF
                                                                        atomic_long_add_negative()
                  atomic_long_try_cmpxchg_release(cnt, FILE_REF_DEAD)

  (ii) fput()'s file_ref_put() marks the file FILE_REF_NOREF and
       also suceeds in actually marking it FILE_REF_DEAD and then calls
       into __fput() to free the file.

       When either (1) or (2) call file_ref_get() they fail as
       atomic_long_add_negative() will return true.

       At the same time, both (1) and (2) all file_ref_get() under
       mutexes that __fput() must also acquire preventing
       kmem_cache_free() from freeing the file.

  So while this might be treated as a change in semantics for (1) and
  (2) it really isn't. It if should end up causing issues this can be
  fixed by adding a helper that does something like:

  long cnt = atomic_long_read(&ref->refcnt);
  do {
          if (cnt < 0)
                  return false;
  } while (!atomic_long_try_cmpxchg(&ref->refcnt, &cnt, cnt + 1));
  return true;

  which would block FILE_REF_NOREF to FILE_REF_ONEREF transitions.

- Jann correctly pointed out that kmem_cache_zalloc() cannot be used
  anymore once files have been ported to file_ref_t.

  The kmem_cache_zalloc() call will memset() the whole struct file to
  zero when it is reallocated. This will also set file->f_ref to zero
  which mens that a concurrent file_ref_get() can return true:

  CPU1                            CPU2
                                  __get_file_rcu()
                                    rcu_dereference_raw()
  close()
    [frees file]
  alloc_empty_file()
    kmem_cache_zalloc()
      [reallocates same file]
      memset(..., 0, ...)
                                    file_ref_get()
                                      [increments 0->1, returns true]
    init_file()
      file_ref_init(..., 1)
        [sets to 0]
                                    rcu_dereference_raw()
                                    fput()
                                      file_ref_put()
                                        [decrements 0->FILE_REF_NOREF, frees file]
    [UAF]

   causing a concurrent __get_file_rcu() call to acquire a reference to
   the file that is about to be reallocated and immediately freeing it
   on realizing that it has been recycled. This causes a UAF for the
   task that reallocated/recycled the file.

   This is prevented by switching from kmem_cache_zalloc() to
   kmem_cache_alloc() and initializing the fields manually. With
   file->f_ref initialized last.

   Note that a memset() also isn't guaranteed to atomically update an
   unsigned long so it's theoretically possible to see torn and
   therefore bogus counter values.

Link: https://lore.kernel.org/r/20241007-brauner-file-rcuref-v2-3-387e24dc9163@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-30 09:57:43 +01:00
Dave Airlie
c9ff14d033 Merge tag 'drm-intel-gt-next-2024-10-23' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:

Fixes/improvements/new stuff:

- Enable PXP GuC autoteardown flow [guc] (Juston Li)
- Retry RING_HEAD reset until it get sticks [gt] (Nitin Gote)
- Add basic PMU support for gen2 [pmu] (Ville Syrjälä)

Miscellaneous:

- Prevent a possible int overflow in wq offsets [guc] (Nikita Zhandarovich)
- PMU code cleanups (Lucas De Marchi)
- Fixed "CPU" -> "GPU" typo [gt] (Zhang He)
- Gen2/3 interrupt handling cleanup (Ville Syrjälä)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zxi-3wkIwI-Y1Qvj@linux
2024-10-25 05:57:38 +10:00
Jani Nikula
fcc2e8db7b drm/i915: remove all IS_<PLATFORM>_GT<N>() macros
There aren't many users for the IS_<PLATFORM>_GT<N>() macros, and many
of them are in fact unused. Even among the users, the platform check is
often redundant. Just remove the macros.

Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930124948.3551980-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-24 13:14:37 +03:00
Nitin Gote
6ef0e3ef26 drm/i915/gt: Retry RING_HEAD reset until it get sticks
we see an issue where resets fails because the engine resumes
from an incorrect RING_HEAD. Since the RING_HEAD doesn't point
to the remaining requests to re-run, but may instead point into
the uninitialised portion of the ring, the GPU may be then fed
invalid instructions from a privileged context, oft pushing the
GPU into an unrecoverable hang.

If at first the write doesn't succeed, try, try again.

v2: Avoid unnecessary timeout macro (Andi)

v3: Correct comment format (Andi)

v4: Make it generic for all platform as it won't impact (Chris)

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/5432
Testcase: igt/i915_selftest/hangcheck
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241015145710.2478599-1-nitin.r.gote@intel.com
2024-10-22 11:35:07 +02:00
Alan Previn
682c9d3d7a
drm/i915/pxp: Add missing tag for Wa_14019159160
Add missing tag for "Wa_14019159160 - Case 2" (for existing
PXP code that ensures run alone mode bit is set to allow
PxP-decryption.

 v5: - remove the max IP_VER check since new platforms that
       i915 supports needs this fix and tag the caller too
       (John Harrison).
 v4: - Include IP_VER 12.71. (Matt Roper)
 v3: - Check targeted platforms using IP_VAL. (John Harrison)
 v2: - Fix WA id number (John Harrison).
     - Improve comments and code to be specific
       for the targeted platforms (John Harrison)

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241016001658.2671225-1-alan.previn.teres.alexis@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-18 11:00:33 -04:00
Vincenzo Frascino
8fd236b00f drm: i915: Change fault type to unsigned long
Fault is currently of type u32 and with the introduction of the
generalized vdso/page.h we trigger the error below:

drivers/gpu/drm/i915/gt/intel_gt_print.h:29:36: error: format ‘%lx’ expects
argument of type ‘long unsigned int’, but argument 6 has type ‘u32’ {aka
‘unsigned int’} [-Werror=format=]
   29 |         drm_dbg(&(_gt)->i915->drm, "GT%u: " _fmt, (_gt)->info.id,
      |                                    ^~~~~~~~
include/drm/drm_print.h:424:39: note: in definition of macro ‘drm_dev_dbg’
  424 |         __drm_dev_dbg(NULL, dev, cat, fmt, ##__VA_ARGS__)
      |                                       ^~~
include/drm/drm_print.h:524:33: note: in expansion of macro ‘drm_dbg_driver’
  524 | #define drm_dbg(drm, fmt, ...)  drm_dbg_driver(drm, fmt, ##__VA_ARGS__)
      |                                 ^~~~~~~~~~~~~~
linux/drivers/gpu/drm/i915/gt/intel_gt_print.h:29:9: note: in expansion of macro
‘drm_dbg’
   29 |         drm_dbg(&(_gt)->i915->drm, "GT%u: " _fmt, (_gt)->info.id,
      |         ^~~~~~~
drivers/gpu/drm/i915/gt/intel_gt.c:310:25: note: in expansion of macro ‘gt_dbg’
  310 |                         gt_dbg(gt, "Unexpected fault\n"
      |                         ^~~~~~

This happens because the type of PAGE_MASK depends on the architecture.

Prevent the compilation error changing the 'fault' type to unsigned
long.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/all/20241014151340.1639555-2-vincenzo.frascino@arm.com
2024-10-16 00:13:04 +02:00
Ville Syrjälä
e217f22041 drm/i915/pmu: Add support for gen2
Implement pmu support for gen2 so that one can use intel_gpu_top
on it once again.

Gen2 lacks MI_MODE/MODE_IDLE so we'll have to do a bit more work
to determine the state of the engine:
- to determine if the ring contains unconsumed data we can simply
  compare RING_TAIL vs. RING_HEAD
- also check RING_HEAD vs. ACTHD to catch cases where the hardware
  is still executing a batch buffer but the ring head has already
  caught up with the tail. Not entirely sure if that's actually
  possible or not, but maybe it can happen if the batch buffer is
  initiated from the very end of the ring? But even if not strictly
  necessary there's no real harm in checking anyway.
- MI_WAIT_FOR_EVENT can be detected via a dedicated bit in RING_HEAD

v2: Use genX_ prefix rarther than suffix (Jani)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-5-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
2024-10-15 17:51:00 +03:00
Ville Syrjälä
bdc2917fbd drm/i915/gt: s/gen3/gen2/
Now that we use the gen3 codepaths also for gen2
rename everything to gen2_ to match.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2024-10-15 17:49:24 +03:00
Ville Syrjälä
259f5a9d1c drm/i915/gt: Nuke gen2_irq_{enable,disable}()
We've determined that accessing the (supposedly) 16bit
interrupt registers on gen2 as 32bit works just fine.
We already dropped the special case from the main interrupt
code, do so also for the gt interrupt stuff.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2024-10-15 17:33:15 +03:00
Ville Syrjälä
750a95407b drm/i915/irq: s/gen3/gen2/
Now that we use the gen3 codepaths also for gen2
rename everything to gen2_ to match.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2024-10-15 17:29:30 +03:00
Juston Li
b05f9847ff drm/i915/guc: Enable PXP GuC autoteardown flow
This feature flag enables GuC autoteardown which allows for a grace
period before session teardown.

Also add a HAS_PXP() helper to share with the other place that wants
to check.

Signed-off-by: Juston Li <juston.li@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240906174038.1468026-1-John.C.Harrison@Intel.com
2024-10-14 13:06:45 -07:00
Ville Syrjälä
0ddae025ab drm/i915: Disable compression tricks on JSL
Bspec asks us to disable some compression trick on JSL. While the
bspec description is pretty vague it looks like this is some extra
trick for 10bpc+ CCS which presumably the ICL derived display engine
doesn't support.

Note that we aren't currently exposing 10bpc CCS scanout support,
but once that gets added this presumably becomes an issue.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240918144445.5716-3-ville.syrjala@linux.intel.com
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2024-10-09 19:04:24 +03:00
Jani Nikula
8231ac7e72 drm/i915: use NULL for zero wakeref_t instead of plain integer 0
As of commit 2edc6a75f2 ("drm/i915: switch intel_wakeref_t underlying
type to struct ref_tracker *") we gained quite a few sparse warnings
about "Using plain integer as NULL pointer" for using 0 to initialize
wakeref_t. Switch to NULL everywhere.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241002181655.582597-1-jani.nikula@intel.com
2024-10-04 10:29:24 +03:00
Jani Nikula
de0cbc7418 drm/i915/irq: remove GEN8_IRQ_RESET_NDX() and GEN8_IRQ_INIT_NDX() macros
Define register offset triplets for all registers used with
GEN8_IRQ_RESET_NDX() and GEN8_IRQ_INIT_NDX() macros, and call the
underlying gen3_irq_reset() and gen3_irq_init() functions
directly. Remove the macros, along with the macro name concatenation
hackery.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241002102645.136155-3-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-03 15:42:41 +03:00
Jani Nikula
7a26b3f1f6 drm/i915/irq: remove GEN3_IRQ_RESET() and GEN3_IRQ_INIT() macros
Define register offset triplets for all registers used with
GEN3_IRQ_RESET() and GEN3_IRQ_INIT() macros, and call the underlying
gen3_irq_reset() and gen3_irq_init() functions directly. Remove the
macros, along with the macro name concatenation hackery.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241002102645.136155-2-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-03 15:42:41 +03:00
Jani Nikula
2edc6a75f2 drm/i915: switch intel_wakeref_t underlying type to struct ref_tracker *
For intel_wakeref_t, opaque is reasonable, but disguising the underlying
struct ref_tracker * as an unsigned long is not so great. Update the
typedef to remove one level of disguise.

Although the kernel coding style strongly discourages pointer typedefs,
it's a better alternative, and an incremental improvement on the status
quo. It provides much better type safety than an unsigned long could,
and prevents passing magic -1 instead of INTEL_WAKEREF_DEF. Moreover, it
provides a gradual path for replacing intel_wakeref_t with struct
ref_tracker * if desired.

As an extra safety measure, check for error pointers in
intel_ref_tracker_free() before passing them on to ref_tracker_free(),
to catch any mistakes with mock gt special wakeref value.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/cca2b0631f816ad90461aa1bf4fe3f80c0e13464.1726680898.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-30 17:54:13 +03:00
Jani Nikula
61dabe8234 drm/i915/gt: add a macro for mock gt wakeref special value and use it
Add a dedicated macro for the special mock gt wakeref value, with a cast
to intel_wakeref_t, instead of assuming you can assign or compare the
wakeref to -ENODEV directly.

Arguably the whole thing is a hack that should not exist, but at least
make it slightly less hacky.

Side note: If this value were to ever end up in
intel_ref_tracker_free(), it would wreak havoc.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1da887a6b4fe1ec45355571ea7b56d91fadf0af2.1726680898.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-30 17:54:11 +03:00
Jani Nikula
e056857125 Merge drm/drm-next into drm-intel-next
Sync to v6.12-rc1.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-30 11:49:10 +03:00
Linus Torvalds
de848da12f drm next for 6.12-rc1
string:
 - add mem_is_zero()
 
 core:
 - support more device numbers
 - use XArray for minor ids
 - add backlight constants
 - Split dma fence array creation into alloc and arm
 
 fbdev:
 - remove usage of old fbdev hooks
 
 kms:
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 dma-buf:
 - docs cleanup
 
 buddy:
 - Add start address support for trim function
 
 printk:
 - pass description to kmsg_dump
 
 scheduler;
 - Remove full_recover from drm_sched_start
 
 ttm:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 panic:
 - add display QR code (in rust)
 
 displayport:
 - mst: GUID improvements
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
 - anx7625: simplify OF array handling
 - dw-hdmi: simplify clock handling
 - lontium-lt8912b: fix mode validation
 - nwl-dsi: fix mode vsync/hsync polarity
 
 xe:
 - Enable LunarLake and Battlemage support
 - Introducing Xe2 ccs modifiers for integrated and discrete graphics
 - rename xe perf to xe observation
 - use wb caching on DGFX for system memory
 - add fence timeouts
 - Lunar Lake graphics/media/display workarounds
 - Battlemage workarounds
 - Battlemage GSC support
 - GSC and HuC fw updates for LL/BM
 - use dma_fence_chain_free
 - refactor hw engine lookup and mmio access
 - enable priority mem read for Xe2
 - Add first GuC BMG fw
 - fix dma-resv lock
 - Fix DGFX display suspend/resume
 - Use xe_managed for kernel BOs
 - Use reserved copy engine for user binds on faulting devices
 - Allow mixing dma-fence jobs and long-running faulting jobs
 - fix media TLB invalidation
 - fix rpm in TTM swapout path
 - track resources and VF state by PF
 
 i915:
 - Type-C programming fix for MTL+
 - FBC cleanup
 - Calc vblank delay more accurately
 - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
 - Fix DP LTTPR detection
 - limit relocations to INT_MAX
 - fix long hangs in buddy allocator on DG2/A380
 
 amdgpu:
 - Per-queue reset support
 - SDMA devcoredump support
 - DCN 4.0.1 updates
 - GFX12/VCN4/JPEG4 updates
 - Convert vbios embedded EDID to drm_edid
 - GFX9.3/9.4 devcoredump support
 - process isolation framework for GFX 9.4.3/4
 - take IOMMU mappings into account for P2P DMA
 
 amdkfd:
 - CRIU fixes
 - HMM fix
 - Enable process isolation support for GFX 9.4.3/4
 - Allow users to target recommended SDMA engines
 - KFD support for targetting queues on recommended SDMA engines
 
 radeon:
 - remove .load and drm_dev_alloc
 - Fix vbios embedded EDID size handling
 - Convert vbios embedded EDID to drm_edid
 - Use GEM references instead of TTM
 - r100 cp init cleanup
 - Fix potential overflows in evergreen CS offset tracking
 
 msm:
 - DPU:
 - implement DP/PHY mapping on SC8180X
 - Enable writeback on SM8150, SC8180X, SM6125, SM6350
 - DP:
 - Enable widebus on all relevant chipsets
 - MSM8998 HDMI support
 - GPU:
 - A642L speedbin support
 - A615/A306/A621 support
 - A7xx devcoredump support
 
 ast:
 - astdp: Support AST2600 with VGA
 - Clean up HPD
 - Fix timeout loop for DP link training
 - reorganize output code by type (VGA, DP, etc)
 - convert to struct drm_edid
 - fix BMC handling for all outputs
 
 exynos:
 - drop stale MAINTAINERS pattern
 - constify struct
 
 loongson:
 - use GEM refcount over TTM
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 - transparently support BMC outputs
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 gm12u320:
 - convert to struct drm_edid
 
 gma500:
 - update i2c terms
 
 lcdif:
 - pixel clock fix
 
 host1x:
 - fix syncpoint IRQ during resume
 - use iommu_paging_domain_alloc()
 
 imx:
 - ipuv3: convert to struct drm_edid
 
 omapdrm:
 - improve error handling
 - use common helper for_each_endpoint_of_node()
 
 panel:
 - add support for BOE TV101WUM-LL2 plus DT bindings
 - novatek-nt35950: improve error handling
 - nv3051d: improve error handling
 - panel-edp: add support for BOE NE140WUM-N6G; revert support for
   SDC ATNA45AF01
 - visionox-vtdr6130: improve error handling; use
   devm_regulator_bulk_get_const()
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 - panel-edp: fix name for HKC MB116AN01
 - jd9365da: fix "exit sleep" commands
 - jdi-fhd-r63452: simplify error handling with DSI multi-style
   helpers
 - mantix-mlaf057we51: simplify error handling with DSI multi-style
   helpers
 - simple:
   support Innolux G070ACE-LH3 plus DT bindings
   support On Tat Industrial Company KD50G21-40NT-A1 plus DT bindings
 - st7701:
   decouple DSI and DRM code
   add SPI support
   support Anbernic RG28XX plus DT bindings
 
 mediatek:
 - support alpha blending
 - remove cl in struct cmdq_pkt
 - ovl adaptor fix
 - add power domain binding for mediatek DPI controller
 
 renesas:
 - rz-du: add support for RZ/G2UL plus DT bindings
 
 rockchip:
 - Improve DP sink-capability reporting
 - dw_hdmi: Support 4k@60Hz
 - vop: Support RGB display on Rockchip RK3066; Support 4096px width
 
 sti:
 - convert to struct drm_edid
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - gr3d: improve PM domain handling
 - convert to struct drm_edid
 - Call drm_atomic_helper_shutdown()
 
 vc4:
 - fix PM during detect
 - replace DRM_ERROR() with drm_error()
 - v3d: simplify clock retrieval
 
 v3d:
 - Clean up perfmon
 
 virtio:
 - add DRM capset
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmbq43gACgkQDHTzWXnE
 hr4+lg/+O/r41E7ioitcM0DWeWem0dTlvQr41pJ8jujHvw+bXNdg0BMGWtsTyTLA
 eOft2AwofsFjg+O7l8IFXOT37mQLdIdfjb3+w5brI198InL3OWC3QV8ZSwY9VGET
 n8crO9jFoxNmHZnFniBZbtI6egTyl6H+2ey3E0MTnKiPUKZQvsK/4+x532yVLPob
 UUOze5wcjyGZc7LJEIZPohPVneCb9ki7sabDQqh4cxIQ0Eg+nqPpWjYM4XVd+lTS
 8QmssbR49LrJ7z9m90qVE+8TjYUCn+ChDPMs61KZAAnc8k++nK41btjGZ23mDKPb
 YEguahCYthWJ4U8K18iXBPnLPxZv5+harQ8OIWAUYqdIOWSXHozvuJ2Z84eHV13a
 9mQ5vIymXang8G1nEXwX/vml9uhVhBCeWu3qfdse2jfaTWYUb1YzhqUoFvqI0R0K
 8wT03MyNdx965CSqAhpH5Jd559ueZmpd+jsHOfhAS+1gxfD6NgoPXv7lpnMUmGWX
 SnaeC9RLD4cgy7j2Swo7TEqQHrvK5XhZSwX94kU6RPmFE5RRKqWgFVQmwuikDMId
 UpNqDnPT5NL2UX4TNG4V4coyTXvKgVcSB9TA7j8NSLfwdGHhiz73pkYosaZXKyxe
 u6qKMwMONfZiT20nhD7RhH0AFnnKosAcO14dhn0TKFZPY6Ce9O8=
 =7jR+
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "This adds a couple of patches outside the drm core, all should be
  acked appropriately, the string and pstore ones are the main ones that
  come to mind.

  Otherwise it's the usual drivers, xe is getting enabled by default on
  some new hardware, we've changed the device number handling to allow
  more devices, and we added some optional rust code to create QR codes
  in the panic handler, an idea first suggested I think 10 years ago :-)

  string:
   - add mem_is_zero()

  core:
   - support more device numbers
   - use XArray for minor ids
   - add backlight constants
   - Split dma fence array creation into alloc and arm

  fbdev:
   - remove usage of old fbdev hooks

  kms:
   - Add might_fault() to drm_modeset_lock priming
   - Add dynamic per-crtc vblank configuration support

  dma-buf:
   - docs cleanup

  buddy:
   - Add start address support for trim function

  printk:
   - pass description to kmsg_dump

  scheduler:
   - Remove full_recover from drm_sched_start

  ttm:
   - Make LRU walk restartable after dropping locks
   - Allow direct reclaim to allocate local memory

  panic:
   - add display QR code (in rust)

  displayport:
   - mst: GUID improvements

  bridge:
   - Silence error message on -EPROBE_DEFER
   - analogix: Clean aup
   - bridge-connector: Fix double free
   - lt6505: Disable interrupt when powered off
   - tc358767: Make default DP port preemphasis configurable
   - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
   - anx7625: simplify OF array handling
   - dw-hdmi: simplify clock handling
   - lontium-lt8912b: fix mode validation
   - nwl-dsi: fix mode vsync/hsync polarity

  xe:
   - Enable LunarLake and Battlemage support
   - Introducing Xe2 ccs modifiers for integrated and discrete graphics
   - rename xe perf to xe observation
   - use wb caching on DGFX for system memory
   - add fence timeouts
   - Lunar Lake graphics/media/display workarounds
   - Battlemage workarounds
   - Battlemage GSC support
   - GSC and HuC fw updates for LL/BM
   - use dma_fence_chain_free
   - refactor hw engine lookup and mmio access
   - enable priority mem read for Xe2
   - Add first GuC BMG fw
   - fix dma-resv lock
   - Fix DGFX display suspend/resume
   - Use xe_managed for kernel BOs
   - Use reserved copy engine for user binds on faulting devices
   - Allow mixing dma-fence jobs and long-running faulting jobs
   - fix media TLB invalidation
   - fix rpm in TTM swapout path
   - track resources and VF state by PF

  i915:
   - Type-C programming fix for MTL+
   - FBC cleanup
   - Calc vblank delay more accurately
   - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
   - Fix DP LTTPR detection
   - limit relocations to INT_MAX
   - fix long hangs in buddy allocator on DG2/A380

  amdgpu:
   - Per-queue reset support
   - SDMA devcoredump support
   - DCN 4.0.1 updates
   - GFX12/VCN4/JPEG4 updates
   - Convert vbios embedded EDID to drm_edid
   - GFX9.3/9.4 devcoredump support
   - process isolation framework for GFX 9.4.3/4
   - take IOMMU mappings into account for P2P DMA

  amdkfd:
   - CRIU fixes
   - HMM fix
   - Enable process isolation support for GFX 9.4.3/4
   - Allow users to target recommended SDMA engines
   - KFD support for targetting queues on recommended SDMA engines

  radeon:
   - remove .load and drm_dev_alloc
   - Fix vbios embedded EDID size handling
   - Convert vbios embedded EDID to drm_edid
   - Use GEM references instead of TTM
   - r100 cp init cleanup
   - Fix potential overflows in evergreen CS offset tracking

  msm:
   - DPU:
      - implement DP/PHY mapping on SC8180X
      - Enable writeback on SM8150, SC8180X, SM6125, SM6350
   - DP:
      - Enable widebus on all relevant chipsets
      - MSM8998 HDMI support
   - GPU:
      - A642L speedbin support
      - A615/A306/A621 support
      - A7xx devcoredump support

  ast:
   - astdp: Support AST2600 with VGA
   - Clean up HPD
   - Fix timeout loop for DP link training
   - reorganize output code by type (VGA, DP, etc)
   - convert to struct drm_edid
   - fix BMC handling for all outputs

  exynos:
   - drop stale MAINTAINERS pattern
   - constify struct

  loongson:
   - use GEM refcount over TTM

  mgag200:
   - Improve BMC handling
   - Support VBLANK intterupts
   - transparently support BMC outputs

  nouveau:
   - Refactor and clean up internals
   - Use GEM refcount over TTM's

  gm12u320:
   - convert to struct drm_edid

  gma500:
   - update i2c terms

  lcdif:
   - pixel clock fix

  host1x:
   - fix syncpoint IRQ during resume
   - use iommu_paging_domain_alloc()

  imx:
   - ipuv3: convert to struct drm_edid

  omapdrm:
   - improve error handling
   - use common helper for_each_endpoint_of_node()

  panel:
   - add support for BOE TV101WUM-LL2 plus DT bindings
   - novatek-nt35950: improve error handling
   - nv3051d: improve error handling
   - panel-edp:
      - add support for BOE NE140WUM-N6G
      - revert support for SDC ATNA45AF01
   - visionox-vtdr6130:
      - improve error handling
      - use devm_regulator_bulk_get_const()
   - boe-th101mb31ig002:
      - Support for starry-er88577 MIPI-DSI panel plus DT
      - Fix porch parameter
   - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE
     NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
     CMN N116BCP-EA2, CSW MNB601LS1-4
   - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
   - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
   - jd9365da:
      - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT
      - Refactor for code sharing
   - panel-edp: fix name for HKC MB116AN01
   - jd9365da: fix "exit sleep" commands
   - jdi-fhd-r63452: simplify error handling with DSI multi-style
     helpers
   - mantix-mlaf057we51: simplify error handling with DSI multi-style
     helpers
   - simple:
      - support Innolux G070ACE-LH3 plus DT bindings
      - support On Tat Industrial Company KD50G21-40NT-A1 plus DT
        bindings
   - st7701:
      - decouple DSI and DRM code
      - add SPI support
      - support Anbernic RG28XX plus DT bindings

  mediatek:
   - support alpha blending
   - remove cl in struct cmdq_pkt
   - ovl adaptor fix
   - add power domain binding for mediatek DPI controller

  renesas:
   - rz-du: add support for RZ/G2UL plus DT bindings

  rockchip:
   - Improve DP sink-capability reporting
   - dw_hdmi: Support 4k@60Hz
   - vop:
      - Support RGB display on Rockchip RK3066
      - Support 4096px width

  sti:
   - convert to struct drm_edid

  stm:
   - Avoid UAF wih managed plane and CRTC helpers
   - Fix module owner
   - Fix error handling in probe
   - Depend on COMMON_CLK
   - ltdc:
      - Fix transparency after disabling plane
      - Remove unused interrupt

  tegra:
   - gr3d: improve PM domain handling
   - convert to struct drm_edid
   - Call drm_atomic_helper_shutdown()

  vc4:
   - fix PM during detect
   - replace DRM_ERROR() with drm_error()
   - v3d: simplify clock retrieval

  v3d:
   - Clean up perfmon

  virtio:
   - add DRM capset"

* tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits)
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume
  drm/xe/xe2hpg: Add Wa_15016589081
  drm/xe: Don't keep stale pointer to bo->ggtt_node
  drm/xe: fix missing 'xe_vm_put'
  drm/xe: fix build warning with CONFIG_PM=n
  drm/xe: Suppress missing outer rpm protection warning
  drm/xe: prevent potential UAF in pf_provision_vf_ggtt()
  drm/amd/display: Add all planes on CRTC to state for overlay cursor
  drm/i915/bios: fix printk format width
  drm/i915/display: Fix BMG CCS modifiers
  drm/amdgpu: get rid of bogus includes of fdtable.h
  drm/amdkfd: CRIU fixes
  drm/amdgpu: fix a race in kfd_mem_export_dmabuf()
  drm: new helper: drm_gem_prime_handle_to_dmabuf()
  drm/amdgpu/atomfirmware: Silence UBSAN warning
  drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare'
  drm/amd/amdgpu: apply command submission parser for JPEG v1
  drm/amd/amdgpu: apply command submission parser for JPEG v2+
  drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
  drm/amd/pm: update the features set on smu v14.0.2/3
  ...
2024-09-19 10:18:15 +02:00
Linus Torvalds
9ea925c806 Updates for timers and timekeeping:
- Core:
 
 	- Overhaul of posix-timers in preparation of removing the
 	  workaround for periodic timers which have signal delivery
 	  ignored.
 
         - Remove the historical extra jiffie in msleep()
 
 	  msleep() adds an extra jiffie to the timeout value to ensure
 	  minimal sleep time. The timer wheel ensures minimal sleep
 	  time since the large rewrite to a non-cascading wheel, but the
 	  extra jiffie in msleep() remained unnoticed. Remove it.
 
         - Make the timer slack handling correct for realtime tasks.
 
 	  The procfs interface is inconsistent and does neither reflect
 	  reality nor conforms to the man page. Show the correct 0 slack
 	  for real time tasks and enforce it at the core level instead of
 	  having inconsistent individual checks in various timer setup
 	  functions.
 
         - The usual set of updates and enhancements all over the place.
 
   - Drivers:
 
         - Allow the ACPI PM timer to be turned off during suspend
 
 	- No new drivers
 
 	- The usual updates and enhancements in various drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn7jQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobqnD/9COlU0nwsulABI/aNIrsh6iYvnCC9v
 14CcNta7Qn+157Wfw9BWOyHdNhR1/fPCXE8jJ71zTyIOeW27HV2JyTtxTwe9ZcdK
 ViHAaj7YcIjcVUEC3StCoRCPnvLslEw4qJA5AOQuDyMivdQn+YVa2c0baJxKaXZt
 xk4HZdMj4NAS0jRKnoZSwtKW/+Oz6rR4GAWrZo+Zs1/8ur3HfqnQfi8lJ1hJtLLW
 V7XDCVRvamVi6Ah3ocYPPp/1P6yeQDA1ge9aMddqaza5STWISXRtSnFMUmYP3rbS
 FaL8TyL+ilfny8pkGB2WlG6nLuSbtvogtdEh1gG1k1RmZt44kAtk8ba/KiWFPBSb
 zK9cjojRMBS71f9G4kmb5F4rnXoLsg1YbD1Nzhz3wq2Cs1Z90dc2QwMren0zoQ1x
 Fn56ueRyAiagBlnrSaKyso/2RvqJTNoSdi3RkpjYeAph0UoDCqvTvKjGAf1mWiw1
 T/1lUWSVqWHnzZbM7XXzzajIN9bl6A7bbqlcAJ2O9vZIDt7273DG+bQym9Vh6Why
 0LTGGERHxzKBsG7WRg+2Gmvv6S18UPKRo8tLtlA758rHlFuPTZCShWrIriwSNl1K
 Hxon+d4BparSnm1h9W/NHPKJA574UbWRCBjdk58IkAj8DxZZY4ORD9SMP+ggkV7G
 F6p9cgoDNP9KFg==
 =jE0N
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Core:

   - Overhaul of posix-timers in preparation of removing the workaround
     for periodic timers which have signal delivery ignored.

   - Remove the historical extra jiffie in msleep()

     msleep() adds an extra jiffie to the timeout value to ensure
     minimal sleep time. The timer wheel ensures minimal sleep time
     since the large rewrite to a non-cascading wheel, but the extra
     jiffie in msleep() remained unnoticed. Remove it.

   - Make the timer slack handling correct for realtime tasks.

     The procfs interface is inconsistent and does neither reflect
     reality nor conforms to the man page. Show the correct 0 slack for
     real time tasks and enforce it at the core level instead of having
     inconsistent individual checks in various timer setup functions.

   - The usual set of updates and enhancements all over the place.

  Drivers:

   - Allow the ACPI PM timer to be turned off during suspend

   - No new drivers

   - The usual updates and enhancements in various drivers"

* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  ntp: Make sure RTC is synchronized when time goes backwards
  treewide: Fix wrong singular form of jiffies in comments
  cpu: Use already existing usleep_range()
  timers: Rename next_expiry_recalc() to be unique
  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
  clocksource/drivers/jcore: Use request_percpu_irq()
  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
  clocksource: acpi_pm: Add external callback for suspend/resume
  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
  dt-bindings: timer: rockchip: Add rk3576 compatible
  timers: Annotate possible non critical data race of next_expiry
  timers: Remove historical extra jiffie for timeout in msleep()
  hrtimer: Use and report correct timerslack values for realtime tasks
  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
  timers: Add sparse annotation for timer_sync_wait_running().
  signal: Replace BUG_ON()s
  ...
2024-09-17 07:25:37 +02:00
Rodrigo Vivi
3de5774cb8
drm/i915/irq: Rename suspend/resume functions
Although these functions are used in runtime_pm, they are not
exclusively used there, so remove the misleading prefix.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240912172539.418957-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-16 12:09:02 -04:00
Zhang He
aa944281bd drm/i915/gt: Fixed "CPU" -> "GPU" typo
Column header should be GPU, not CPU

Signed-off-by: Zhang He <zhanghe9702@163.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240913140721.31165-1-zhanghe9702@163.com
2024-09-16 10:27:19 +02:00
Jani Nikula
02189ca841 Merge drm/drm-next into drm-intel-next
Sync the branches to resolve the conflict reported in the below link.

Link: https://lore.kernel.org/r/20240906131502.7a7d1962@canb.auug.org.au
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-11 10:57:18 +03:00
Dave Airlie
32bd3eb5fb Merge tag 'drm-intel-gt-next-2024-09-06' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:

- Expose fan speed via hwmon (Raag)
- Correction to Wa_14019159160 on ARL (John H)
- Whitelist COMMON_SLICE_CHICKEN1 for UMD access on DG2/MTL/ARL (Dnyaneshwar)
- Do not attempt to load the GSC multiple times to avoid hanging GSC HW (Daniele)

- Populate /sys/class/drm/cardX/engines/ even if one engine fails (Andi)
- Use kmemdup_array instead of kmemdup for multiple allocation (Yu)
- Remove extra unlikely() (Hongbo)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ztrfr_Wuurfa-3Rv@jlahtine-mobl.ger.corp.intel.com
2024-09-11 09:11:54 +10:00
Thomas Gleixner
2f7eedca6c Merge branch 'linus' into timers/core
To update with the latest fixes.
2024-09-10 13:49:53 +02:00
Nikita Zhandarovich
d3d37f7468 drm/i915/guc: prevent a possible int overflow in wq offsets
It may be possible for the sum of the values derived from
i915_ggtt_offset() and __get_parent_scratch_offset()/
i915_ggtt_offset() to go over the u32 limit before being assigned
to wq offsets of u64 type.

Mitigate these issues by expanding one of the right operands
to u64 to avoid any overflow issues just in case.

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: c2aa552ff0 ("drm/i915/guc: Add multi-lrc context registration")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://patchwork.freedesktop.org/patch/msgid/20240725155925.14707-1-n.zhandarovich@fintech.ru
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 1f1c1bd566)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-09-10 08:13:51 +01:00
Anna-Maria Behnsen
bd7c8ff9fe treewide: Fix wrong singular form of jiffies in comments
There are several comments all over the place, which uses a wrong singular
form of jiffies.

Replace 'jiffie' by 'jiffy'. No functional change.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-08 20:47:40 +02:00
Nikita Zhandarovich
1f1c1bd566
drm/i915/guc: prevent a possible int overflow in wq offsets
It may be possible for the sum of the values derived from
i915_ggtt_offset() and __get_parent_scratch_offset()/
i915_ggtt_offset() to go over the u32 limit before being assigned
to wq offsets of u64 type.

Mitigate these issues by expanding one of the right operands
to u64 to avoid any overflow issues just in case.

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: c2aa552ff0 ("drm/i915/guc: Add multi-lrc context registration")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://patchwork.freedesktop.org/patch/msgid/20240725155925.14707-1-n.zhandarovich@fintech.ru
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-06 15:00:32 -04:00
Jani Nikula
c96c834836 drm/i915: use IS_ENABLED() instead of defined() on config options
Prefer IS_ENABLED() instead of defined() for checking whether a kconfig
option is enabled.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240904145218.3902145-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-05 21:12:08 +03:00
Daniele Ceraolo Spurio
59d3cfdd7f drm/i915: Do not attempt to load the GSC multiple times
If the GSC FW fails to load the GSC HW hangs permanently; the only ways
to recover it are FLR or D3cold entry, with the former only being
supported on driver unload and the latter only on DGFX, for which we
don't need to load the GSC. Therefore, if GSC fails to load there is no
need to try again because the HW is stuck in the error state and the
submission to load the FW would just hang the GSCCS.

Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS
hangs to full GT resets, which would trigger a new attempt to load the
GSC FW in the post-reset HW re-init; this issue is also fixed by not
attempting to load the GSC FW after an error.

Fixes: 15bd4a67e9 ("drm/i915/gsc: GSC firmware loading")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.3+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 03ded4d432)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02 13:25:13 +03:00
Dave Airlie
6d0ebb3904 Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)
 
 Core (drm) Changes:
 - Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)
 
 Driver Changes:
 - General cleanup and more work moving towards intel_display isolation (Jani)
 - New display workaround (Suraj)
 - Use correct cp_irq_count on HDCP (Suraj)
 - eDP PSR fix when CRC is enabled (Jouni)
 - Fix DP MST state after a sink reset (Imre)
 - Fix Arrow Lake GSC firmware version (John)
 - Use chained DSBs for LUT programming (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmbQf+AACgkQ+mJfZA7r
 E8pKQQf+K6n6lMhtpfNUYkFS6IFkNDMuWZV6R262AWNFIJkBumfiqb+LLsfoxfb8
 jcrYHKw9cPsx2BWg645KbImhT1L6iISfY0OKkUmDLCyUUV8uhbbnKScF8OK01HVW
 XWeZo9LB9gM8SmVJMljHsko+4JCiOq8+99Jm/GjVCZmW69tQ2fFfjeF3Nh9v32uI
 +DcGlFP33Htp7oynOJxyfqGK1RfKjkJSZEQKODNHKJ7Q+2LRkA85xs3yfHkUkBX8
 AjNI82r1SqAodhCgw8k19yeawvsAx8eu0h6Cal5UQzHlnAYEjfh+flbIDJv7YAZX
 Nh5lpn71LBBGCytCI/MvnsmUtcor7A==
 =tLyy
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)

Core (drm) Changes:
- Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)

Driver Changes:
- General cleanup and more work moving towards intel_display isolation (Jani)
- New display workaround (Suraj)
- Use correct cp_irq_count on HDCP (Suraj)
- eDP PSR fix when CRC is enabled (Jouni)
- Fix DP MST state after a sink reset (Imre)
- Fix Arrow Lake GSC firmware version (John)
- Use chained DSBs for LUT programming (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZtCC0lJ0Zf3MoSdW@intel.com
2024-08-30 13:41:32 +10:00
Dave Airlie
4f7d8da5e3 drm-misc-next for v6.12:
UAPI Changes:
 
 devfs:
 - support device numbers up to MINORBITS limit
 
 Core Changes:
 
 ci:
 - increase job timeout
 
 devfs:
 - use XArray for minor ids
 
 displayport:
 - mst: GUID improvements
 
 docs:
 - add fixes and cleanups
 
 panic:
 - optionally display QR code
 
 Driver Changes:
 
 amdgpu:
 - faster vblank disabling
 - GUID improvements
 
 gm12u320
 - convert to struct drm_edid
 
 host1x:
 - fix syncpoint IRQ during resume
 - use iommu_paging_domain_alloc()
 
 imx:
 - ipuv3: convert to struct drm_edid
 
 omapdrm:
 - improve error handling
 
 panel:
 - add support for BOE TV101WUM-LL2 plus DT bindings
 - novatek-nt35950: improve error handling
 - nv3051d: improve error handling
 - panel-edp: add support for BOE NE140WUM-N6G; revert support for
   SDC ATNA45AF01
 - visionox-vtdr6130: improve error handling; use
   devm_regulator_bulk_get_const()
 
 renesas:
 - rz-du: add support for RZ/G2UL plus DT bindings
 
 sti:
 - convert to struct drm_edid
 
 tegra:
 - gr3d: improve PM domain handling
 - convert to struct drm_edid
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmbQiVAACgkQaA3BHVML
 eiNgVggAqN5f9i0Rbk5tTfasBBSq0qiNE3X7mHDFQsAY+iGRJUzuYlIjjATSunsB
 HnqcA0aoT3CaBpl1drRTg1wCWRXBZrnAG2mgVa/eGBjrSH2i2d9IgxcNT2XvQkI5
 K4Ac2Ulpr+57d8nHmeEjztQusD2vaDtNH7b6pU2wNmZkiqUCbzcLn9GuL9OF8tSh
 6EApiPExbBASQeV0+xVt7mbtasclzFf8wukQXlK8zlWDeHTTTRibBwRy1txyqdG3
 qnBCabVTorgah81vBezXegrro4yQ1ITo5A1ZTYYJroA70mqMN5cm5kYasIb1zqXP
 f/xXGLB/a96bV9zqEFhWGInlEqGthA==
 =1fX2
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-08-29' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.12:

UAPI Changes:

devfs:
- support device numbers up to MINORBITS limit

Core Changes:

ci:
- increase job timeout

devfs:
- use XArray for minor ids

displayport:
- mst: GUID improvements

docs:
- add fixes and cleanups

panic:
- optionally display QR code

Driver Changes:

amdgpu:
- faster vblank disabling
- GUID improvements

gm12u320
- convert to struct drm_edid

host1x:
- fix syncpoint IRQ during resume
- use iommu_paging_domain_alloc()

imx:
- ipuv3: convert to struct drm_edid

omapdrm:
- improve error handling

panel:
- add support for BOE TV101WUM-LL2 plus DT bindings
- novatek-nt35950: improve error handling
- nv3051d: improve error handling
- panel-edp: add support for BOE NE140WUM-N6G; revert support for
  SDC ATNA45AF01
- visionox-vtdr6130: improve error handling; use
  devm_regulator_bulk_get_const()

renesas:
- rz-du: add support for RZ/G2UL plus DT bindings

sti:
- convert to struct drm_edid

tegra:
- gr3d: improve PM domain handling
- convert to struct drm_edid

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240829144654.GA145538@linux.fritz.box
2024-08-30 13:40:38 +10:00
Raag Jadav
727eb1e3f0 drm/i915/hwmon: expose fan speed
Add hwmon support for fan1_input attribute, which will expose fan speed
in RPM. With this in place we can monitor fan speed using lm-sensors tool.

$ sensors
i915-pci-0300
Adapter: PCI adapter
in0:         653.00 mV
fan1:        3833 RPM
power1:           N/A  (max =  43.00 W)
energy1:      32.02 kJ

v2: Handle overflow, add mutex protection and ABI documentation
    Aesthetic adjustments (Riana)
v3: Change rotations data type, ABI date and version
v4: Fix wakeref leak
    Drop switch case and simplify hwm_fan_xx() (Andi)
v5: Rework time calculation, aesthetic adjustments (Andy)
v6: Drop redundant overflow logic (Andy)
    Split fan_input_read() into dedicated helper (Badal)
v7: Fix undefined reference to __udivdi3 for i386 (Andy)

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240823034548.2670032-1-raag.jadav@intel.com
2024-08-28 12:06:07 +05:30
Rodrigo Vivi
04cf420bbc
Merge drm/drm-next into drm-intel-next
Need to take some Xe bo definition in here before
we can add the BMG display 64k aligned size restrictions.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-27 17:06:28 -04:00
Daniele Ceraolo Spurio
03ded4d432 drm/i915: Do not attempt to load the GSC multiple times
If the GSC FW fails to load the GSC HW hangs permanently; the only ways
to recover it are FLR or D3cold entry, with the former only being
supported on driver unload and the latter only on DGFX, for which we
don't need to load the GSC. Therefore, if GSC fails to load there is no
need to try again because the HW is stuck in the error state and the
submission to load the FW would just hang the GSCCS.

Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS
hangs to full GT resets, which would trigger a new attempt to load the
GSC FW in the post-reset HW re-init; this issue is also fixed by not
attempting to load the GSC FW after an error.

Fixes: 15bd4a67e9 ("drm/i915/gsc: GSC firmware loading")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.3+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com
2024-08-27 08:08:09 -07:00
Daniel Vetter
3f53d7e442 Merge tag 'drm-intel-gt-next-2024-08-23' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
UAPI Changes:

- Limit the number of relocations to INT_MAX (Tvrtko)

  Only impact should be synthetic tests.

Driver Changes:

- Fix for #11396: GPU Hang and rcs0 reset on Cherrytrail platform
- Fix Virtual Memory mapping boundaries calculation (Andi)
- Fix for #11255: Long hangs in buddy allocator with DG2/A380 without
  Resizable BAR since 6.9 (David)
- Mark the GT as dead when mmio is unreliable (Chris, Andi)
- Workaround additions / fixes for MTL, ARL and DG2 (John H, Nitin)
- Enable partial memory mapping of GPU virtual memory (Andi, Chris)

- Prevent NULL deref on intel_memory_regions_hw_probe (Jonathan, Dan)
- Avoid UAF on intel_engines_release (Krzysztof)

- Don't update PWR_CLK_STATE starting Gen12 (Umesh)
- Code and dmesg cleanups (Andi, Jesus, Luca)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZshcfSqgfnl8Mh4P@jlahtine-mobl.ger.corp.intel.com
2024-08-27 12:55:17 +02:00
John Harrison
2955ae8186 drm/i915: ARL requires a newer GSC firmware
ARL and MTL share a single GSC firmware blob. However, ARL requires a
newer version of it.

So add differentiate of the PCI ids for ARL from MTL and create ARL as
a sub-platform of MTL. That way, all the existing workarounds and such
still treat ARL as MTL exactly as before. However, now the GSC code
can check for ARL and do an extra version check on the firmware before
committing to it.

Also, the version extraction code has various ways of failing but the
return code was being ignore and so the firmware load would attempt to
continue anyway. Fix that by propagating the return code to the next
level out.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Fixes: 213c43676b ("drm/i915/mtl: Remove the 'force_probe' requirement for Meteor Lake")
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802031051.3816392-1-John.C.Harrison@Intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 67733d7a71)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-08-27 13:23:58 +03:00
John Harrison
54f90b0333 drm/i915/guc: Fix missing enable of Wa_14019159160 on ARL
The previous update to enable the workaround on ARL only changed two
out of three places where the w/a needs to be enabled. That meant the
GuC side was operational but not the KMD side. And as the KMD side is
the trigger, it meant the w/a was not actually active. So fix that.

Fixes: 104bcfae57 ("drm/i915/arl: Enable Wa_14019159160 for ARL")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809000646.1747507-1-John.C.Harrison@Intel.com
2024-08-26 15:24:49 -07:00
Dnyaneshwar Bhadane
037f93434c drm/i915/gt: Whitelist COMMON_SLICE_CHICKEN1 for UMD access.
As part of the recommended tuning setting, whitelist COMMON_SLICE_CHICKEN1
for MTL/ARL and DG2.

The UMD will selectively enable or disable specific bits of the
register based on the type of workload and its requirements.

v2: Remove the KMD par of enabling specific bits(Matt R)

Bspec: 68331
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240825121156.2498810-1-dnyaneshwar.bhadane@intel.com
2024-08-26 09:57:20 -07:00
renjun wang
22bc22ccf9 drm: Fix kerneldoc for "Returns" section
The blank line between title "Returns:" and detail description is not
allowed, otherwise the title will goes under the description block in
generated .html file after running `make htmldocs`.

There are a few examples for current kerneldoc at [1][2][3].

v2:
- use Link tag with stable URLs

Signed-off-by: renjun wang <renjunw0@foxmail.com>
Link: https://www.kernel.org/doc/html/v6.10/gpu/drm-kms.html#c.drm_crtc_commit_wait # 1
Link: https://www.kernel.org/doc/html/v6.10/gpu/drm-kms.html#c.drm_atomic_get_crtc_state # 2
Link: https://www.kernel.org/doc/html/v6.10/gpu/i915.html#c.i915_vma_pin_fence # 3
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/tencent_37A873672B5CD20DECAF99DEDAC5E45C3106@qq.com
2024-08-26 16:40:09 +02:00
John Harrison
67733d7a71
drm/i915: ARL requires a newer GSC firmware
ARL and MTL share a single GSC firmware blob. However, ARL requires a
newer version of it.

So add differentiate of the PCI ids for ARL from MTL and create ARL as
a sub-platform of MTL. That way, all the existing workarounds and such
still treat ARL as MTL exactly as before. However, now the GSC code
can check for ARL and do an extra version check on the firmware before
committing to it.

Also, the version extraction code has various ways of failing but the
return code was being ignore and so the firmware load would attempt to
continue anyway. Fix that by propagating the return code to the next
level out.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Fixes: 213c43676b ("drm/i915/mtl: Remove the 'force_probe' requirement for Meteor Lake")
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802031051.3816392-1-John.C.Harrison@Intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-26 10:21:11 -04:00
Jani Nikula
e24b0ef20a drm/i915: remove unnecessary display includes
There are a number of leftover #include "display/..." directives that
are completely unnecessary. Remove them to make it easier to spot the
relevant ones. In one case, switch to a more specific include.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240823123318.3189503-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-08-26 10:56:51 +03:00
Yu Jiaoliang
3126d5fff5 drm/i915/gt: Use kmemdup_array instead of kmemdup for multiple allocation
Let the kememdup_array() take care about multiplication and possible
overflows.

v2:
- Change subject
- Leave one blank line between the commit log and the tag section
- Fix code alignment issue

v3:
- Fix code alignment
- Apply the patch on a clean drm-tip

Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240821144036.343556-1-andi.shyti@linux.intel.com
2024-08-24 01:13:08 +02:00
Andi Shyti
6628851159 drm/i915/gt: Continue creating engine sysfs files even after a failure
The i915 driver generates sysfs entries for each engine of the
GPU in /sys/class/drm/cardX/engines/.

The process is straightforward: we loop over the UABI engines and
for each one, we:

 - Create the object.
 - Create basic files.
 - If the engine supports timeslicing, create timeslice duration files.
 - If the engine supports preemption, create preemption-related files.
 - Create default value files.

Currently, if any of these steps fail, the process stops, and no
further sysfs files are created.

However, it's not necessary to stop the process on failure.
Instead, we can continue creating the remaining sysfs files for
the other engines. Even if some files fail to be created, the
list of engines can still be retrieved by querying i915.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240819113140.325235-1-andi.shyti@linux.intel.com
2024-08-24 00:28:55 +02:00
Luca Coelho
0523374e30 drm/i915/gt: remove stray declaration of intel_gt_release_all()
When intel_gt_release_all() was removed from the code in commit
e899505533 ("drm/i915: do not clean GT table on error path"), its
declaration in the header file remained.  Remove it.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813140618.387553-1-luciano.coelho@intel.com
2024-08-17 11:24:46 +02:00
Jesus Narvaez
437ad4534a drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI
This warning was supposed to catch a harmless issue, but changing to
guc_error should prevent kernel taints in CI runs.

Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240808204943.911727-1-jesus.narvaez@intel.com
2024-08-16 11:04:53 -07:00
Chris Wilson
a857add73e drm/i915/gt: Mark the GT as dead when mmio is unreliable
After we detect that mmio is returning all 0xff, we believe that the GPU
has dropped off the pci bus and is dead. Mark the device as wedged such
that we can propagate the failure back to userspace and wait for
recovery.

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240807091014.469992-1-andi.shyti@linux.intel.com
2024-08-09 12:51:17 +01:00
Andi Shyti
b7b930d104 drm/i915: Replace double blank with single blank after comma in gem/ and gt/
Do not use double blanks, ",  " in function parameters where it's
not required by any alignment purpose. Replase it with a single
blank, ", ".

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240807130516.491053-2-andi.shyti@linux.intel.com
2024-08-08 11:40:41 +01:00
Krzysztof Niemiec
fceff12e52 drm/i915/gt: Empty uabi engines list during intel_engines_release()
While the uabi_engines_llist is populated in intel_engines_init() during
driver load, the corresponding function intel_engines_release() does not
correctly get rid of it. This can lead to a UAF if, after failed
initialization (for example when gt is set wedged on init), we try to
access the engines.

Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801154047.115176-2-krzysztof.niemiec@intel.com
2024-08-05 23:10:46 +01:00
Nitin Gote
843f10ce65 drm/i915/gt: Add Wa_14019789679
Wa_14019789679 implementation for MTL, ARL and DG2.

v2: Corrected condition

v3:
   - Fix indentation (Jani Nikula)
   - dword size should be 0x1 and
     initialize dword to 0 instead of MI_NOOP (Tejas)
   - Use IS_GFX_GT_IP_RANGE() (Tejas)

v4:
   - 3DSTATE_MESH_CONTROL instruction is 3 dwords long
     Align with dword size. (Roper, Matthew D)
   - Add RCS engine check. (Tejas)

Bspec: 47083

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240731155614.3460645-1-nitin.r.gote@intel.com
2024-08-05 22:33:39 +01:00
Nitin Gote
65564157ae drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
We're seeing a GPU hang issue on a CHV platform, which was caused by commit
bac24f59f4 ("drm/i915/execlists: Enable coarse preemption boundaries for
Gen8").

The Gen8 platform only supports timeslicing and doesn't have a preemption
mechanism, as its engines do not have a preemption timer.

Commit 751f82b353 ("drm/i915/gt: Only disable preemption on Gen8 render
engines") addressed this issue only for render engines. This patch extends
that fix by ensuring that preemption is not considered for all engines on
Gen8 platforms.

v4:
 - Use the correct Fixes tag (Rodrigo Vivi)
 - Reworded commit log (Andi Shyti)

v3:
 - Inside need_preempt(), condition of can_preempt() is not required
   as simplified can_preempt() is enough. (Chris Wilson)

v2: Simplify can_preempt() function (Tvrtko Ursulin)

Fixes: 751f82b353 ("drm/i915/gt: Only disable preemption on gen8 render engines")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
CC: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711163208.1355736-1-nitin.r.gote@intel.com
(cherry picked from commit 7df0be6e62)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-07-23 09:34:09 +00:00
John Harrison
e4a0251d36 drm/i915/guc: Extend w/a 14019159160
There is a new part to an existing workaround, so enable that piece as
well.

v2: Extend even further.
v3: Drop DG2 as there are CI failures still to resolve. Also re-order
the parameters to a function to reduce excessive line wrapping.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-3-John.C.Harrison@Intel.com
2024-07-18 12:51:56 -07:00
John Harrison
104bcfae57 drm/i915/arl: Enable Wa_14019159160 for ARL
The context switch out workaround also applies to ARL.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240622004636.662081-2-John.C.Harrison@Intel.com
2024-07-18 12:51:55 -07:00
Nitin Gote
7df0be6e62 drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
We're seeing a GPU hang issue on a CHV platform, which was caused by commit
bac24f59f4 ("drm/i915/execlists: Enable coarse preemption boundaries for
Gen8").

The Gen8 platform only supports timeslicing and doesn't have a preemption
mechanism, as its engines do not have a preemption timer.

Commit 751f82b353 ("drm/i915/gt: Only disable preemption on Gen8 render
engines") addressed this issue only for render engines. This patch extends
that fix by ensuring that preemption is not considered for all engines on
Gen8 platforms.

v4:
 - Use the correct Fixes tag (Rodrigo Vivi)
 - Reworded commit log (Andi Shyti)

v3:
 - Inside need_preempt(), condition of can_preempt() is not required
   as simplified can_preempt() is enough. (Chris Wilson)

v2: Simplify can_preempt() function (Tvrtko Ursulin)

Fixes: 751f82b353 ("drm/i915/gt: Only disable preemption on gen8 render engines")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
CC: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711163208.1355736-1-nitin.r.gote@intel.com
2024-07-16 00:10:05 +02:00
Daniel Vetter
bfc109361c Merge tag 'drm-intel-gt-next-2024-07-04' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:

Fixes/improvements/new stuff:

- Downgrade stolen lmem setup warning [gem] (Jonathan Cavitt)
- Evaluate GuC priority within locks [gt/uc] (Andi Shyti)
- Fix potential UAF by revoke of fence registers [gt] (Janusz Krzysztofik)
- Return NULL instead of '0' [gem] (Andi Shyti)
- Use the correct format specifier for resource_size_t [gem] (Andi Shyti)
- Suppress oom warning in favour of ENOMEM to userspace [gem] (Nirmoy Das)

Miscellaneous:

- Evaluate forcewake usage within locks [gt] (Andi Shyti)
- Fix typo in comment [gt/uc] (Andi Shyti)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZoZP6mUSergfzFMh@linux
2024-07-05 12:14:59 +02:00
Daniel Vetter
86634fa4e6 Linux 6.10-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmaB0NweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkvwH/36UJRk/o6wvXnyH
 E6QjCSWo2226APyWks22NjtC3I/8Iqdvkneuh6wG0qL2sXAB078EMjUq5R81bF8H
 wWFBJwetjYTp8GEyLioMEb2wCH/J3R29dLFC4UYTplafXRGP6//xcpJaKmTxcgdR
 31IzvTPXbApZ7L3k1U6rA2bK9PNKcFCOvZlrNMUCuwMrabymHsDfOUt1DqXyg2xp
 zjqiWYBwlklozmgawSWt/mdEgkWuTcAbg+KyqDVQF59s9aj/OOwZ0j+HACq5V8CM
 quTPIAYL6CC9p7uxa69lGr/sgC0Is/BZLPX7RTZAwCgarGvnX+1HUsjDcaFCtrVg
 O6fPUV8=
 =pgUx
 -----END PGP SIGNATURE-----

Merge v6.10-rc6 into drm-next

The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.

Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-07-05 10:47:28 +02:00
Dave Airlie
a78313bb20 Merge tag 'drm-intel-gt-next-2024-06-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
UAPI Changes:

- Support replaying GPU hangs with captured context image (Tvrtko Ursulin)

Driver Changes:

Fixes/improvements/new stuff:

- Automate CCS Mode setting during engine resets [gt] (Andi Shyti)
- Revert "drm/i915: Remove extra multi-gt pm-references" (Janusz Krzysztofik)
- Fix HAS_REGION() usage in intel_gt_probe_lmem() (Ville Syrjälä)
- Disarm breadcrumbs if engines are already idle [gt] (Chris Wilson)
- Shadow default engine context image in the context (Tvrtko Ursulin)
- Support replaying GPU hangs with captured context image (Tvrtko Ursulin)
- avoid FIELD_PREP warning [guc] (Arnd Bergmann)
- Fix CCS id's calculation for CCS mode setting [gt] (Andi Shyti)
- Increase FLR timeout from 3s to 9s (Andi Shyti)
- Update workaround 14018575942 [mtl] (Angus Chen)

Future platform enablement:

- Enable w/a 16021333562 for DG2, MTL and ARL [guc] (John Harrison)

Miscellaneous:

- Pass the region ID rather than a bitmask to HAS_REGION() (Ville Syrjälä)
- Remove counter productive REGION_* wrappers (Ville Syrjälä)
- Fix typo [gem/i915_gem_ttm_move] (Deming Wang)
- Delete the live_hearbeat_fast selftest [gt] (Krzysztof Niemiec)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zmmazub+U9ewH9ts@linux
2024-06-27 17:21:44 +10:00
Janusz Krzysztofik
996c3412a0 drm/i915/gt: Fix potential UAF by revoke of fence registers
CI has been sporadically reporting the following issue triggered by
igt@i915_selftest@live@hangcheck on ADL-P and similar machines:

<6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
...
<6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
<6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
<3> [414.070354] Unable to pin Y-tiled fence; err:-4
<3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
...
<4>[  609.603992] ------------[ cut here ]------------
<2>[  609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
<4>[  609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G     U  W          6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
<4>[  609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4>[  609.604010] Workqueue: i915 __i915_gem_free_work [i915]
<4>[  609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
...
<4>[  609.604271] Call Trace:
<4>[  609.604273]  <TASK>
...
<4>[  609.604716]  __i915_vma_evict+0x2e9/0x550 [i915]
<4>[  609.604852]  __i915_vma_unbind+0x7c/0x160 [i915]
<4>[  609.604977]  force_unbind+0x24/0xa0 [i915]
<4>[  609.605098]  i915_vma_destroy+0x2f/0xa0 [i915]
<4>[  609.605210]  __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
<4>[  609.605330]  __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
<4>[  609.605440]  process_scheduled_works+0x351/0x690
...

In the past, there were similar failures reported by CI from other IGT
tests, observed on other platforms.

Before commit 63baf4f3d5 ("drm/i915/gt: Only wait for GPU activity
before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for
idleness of vma->active via fence_update().   That commit introduced
vma->fence->active in order for the fence_update() to be able to wait
selectively on that one instead of vma->active since only idleness of
fence registers was needed.  But then, another commit 0d86ee3509
("drm/i915/gt: Make fence revocation unequivocal") replaced the call to
fence_update() in i915_vma_revoke_fence() with only fence_write(), and
also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front.
No justification was provided on why we might then expect idleness of
vma->fence->active without first waiting on it.

The issue can be potentially caused by a race among revocation of fence
registers on one side and sequential execution of signal callbacks invoked
on completion of a request that was using them on the other, still
processed in parallel to revocation of those fence registers.  Fix it by
waiting for idleness of vma->fence->active in i915_vma_revoke_fence().

Fixes: 0d86ee3509 ("drm/i915/gt: Make fence revocation unequivocal")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.8+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 24bb052d3d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-24 13:05:15 +03:00
Dave Airlie
4552a6a42a Merge tag 'drm-intel-next-2024-06-19' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 feature pull for v6.11:

Features and functionality:
- Battlemage (BMG) Xe2 HPD display enabling (Balasubramani, Clint, Gustavo,
  José, Matt, Anusha, Lucas, Ravi, Radhakrishna, Nirmoy, Ankit, Matthew)
- Panel Replay enabling (Jouni, Animesh)
- DP AUX-less ALPM (Advanced Link Power Management) and LOBF (Link off between
  frames) enabling (Animesh, Jouni)
- Enable link training failure fallback for DP MST links (Imre)
- CMRR (Content Match Refresh Rate) enabling (Mitul)
- Allow the first async flip to change modifier (Ville)
- Enable eDP AUX based HDR backlight (Suraj)
- Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps (Ville)

Refactoring and cleanups:
- Stop using implicit dev_priv local variable in macros (Jani)
- Expand and clean up VBT table definitions (Ville)
- PSR/ALPM refactoring (Jouni, Animesh)
- Plane fb refactoring (Ville)
- Rawclk, FSB, and mem frequency refactoring (Jani)
- GVT register macro usage cleanups (Jani, Ville)
- Plane, cursor, wm and ddb register macro and usage cleanups (Ville)
- Pipe CRC register macro cleanups (Ville)
- PCI ID macro cleanups and refactoring to match xe style (Jani)
- Move drm-intel repo to gitlab.freedesktop.org (Ryszard)
- Identify all platforms/subplatforms in display probe (Jani)
- Move Intel drm headers under include/drm/intel (Jani)
- Drop local redundant W=1 warnings in favour of drm subsystem warnigs (Jani)
- Include cleanups; include what you use (Jani)
- Convert overlay and DMC error state printing to drm_printer (Jani)
- Joiner renames (Stan)
- DSB interface cleanups (Ville)
- Improve workaround for disabling FBC when VT-d is active (Vinod)
- State checker refactoring and cleanups for color, planes and cdclk (Ville)
- Cleanups around scanline arithmetic (Ville)
- Use drm_crtc_vblank_crtc() instead of open coding (Ville)
- DSC cleanups (Ville)

Fixes:
- Improve VBT array bounds check (Luca)
- LNL PSR fixes (Jouni)
- Audio workaround, disable min hblank fix (Uma)
- Stop selecting ACPI_BUTTON config (Jani)
- Add MTL Cx0 PHY config compare (Mika)
- Fix MTL C20 PHY port clock verification (Mika)
- Fix static analyzer warning for uapi.event access (Luca)
- HDCP fixes and workarounds (Suraj)
- Fix DP MST DSC input BPP computation (Imre)
- Fix assert on pending async-put power domain work (Imre)
- Fix documentation build for DMC wakelocks (Luca)
- Disable DSC on eDP when indicated by VBT (Ville)

DRM Core changes:
- Various DPCD register additions for panel replay and ALPM (Jouni)
- Add target_rr_divider to adaptive sync SDP (Mitul)

Xe driver changes:
- Remove unused xe->enabled_irq_mask and xe->sb_lock members (Jani)
- i915 display compat header cleanups (Jani)
- Remove redundant copy of intel_fbdev_fb.h (Ville)
- Add process name to devcoredump (José)
- Add xe_gt_err_once() (Matthew)
- Implement transient flush for BMG/Xe3 (Nirmoy)

Merges:
- Backmerges to sync with xe, drm-misc and upstream (Rodrigo, Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87y170eu80.fsf@intel.com
2024-06-21 13:11:24 +10:00
Dave Airlie
91c93e475c drm-misc-next for 6.11:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
  - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
  - bridge: Remove drm_bridge_chain_mode_fixup
  - ci: Require a more recent version of mesa, improve farm estup and
    test generation
  - mipi-dbi: Remove mipi_dbi_machine_little_endian, make SPI bits per
    word configurable, support RGB888, and allow pixel formats to be
    specified in the DT.
  - mm: Remove drm_mm_replace_node
  - panic: Allow to dump kmsg to the screen
  - print: Add a drm prefix to warn level messages too, remove
    ___drm_dbg, consolidate prefix handling
 
 Driver Changes:
  - sun4i: Rework the blender setup for DE2
  - bridges:
    - bridge-connector: Plumb in the new HDMI helpers
    - samsung-dsim: Fix timings calculation
    - tc358767: Plenty of small fixes
  - panels:
    - More cleanup of prepare / enable state tracking in drivers
    - New panel: PrimeView PM070WL4,
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZmqkZAAKCRDj7w1vZxhR
 xUUeAP9ZuOvD8dGBFvR95yXQpsoaxyrC37zsckCo87SWRsyg3QEAmxmzdhgsGzKX
 lXd3KsF/i1nOPflG9QeMj/lfroE28ww=
 =k2mC
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-06-13' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.11:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
 - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
 - bridge: Remove drm_bridge_chain_mode_fixup
 - ci: Require a more recent version of mesa, improve farm estup and
   test generation
 - mipi-dbi: Remove mipi_dbi_machine_little_endian, make SPI bits per
   word configurable, support RGB888, and allow pixel formats to be
   specified in the DT.
 - mm: Remove drm_mm_replace_node
 - panic: Allow to dump kmsg to the screen
 - print: Add a drm prefix to warn level messages too, remove
   ___drm_dbg, consolidate prefix handling

Driver Changes:
 - sun4i: Rework the blender setup for DE2
 - bridges:
   - bridge-connector: Plumb in the new HDMI helpers
   - samsung-dsim: Fix timings calculation
   - tc358767: Plenty of small fixes
 - panels:
   - More cleanup of prepare / enable state tracking in drivers
   - New panel: PrimeView PM070WL4,

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240613-cicada-of-infinite-unity-0955ca@houat
2024-06-21 10:50:04 +10:00
Janusz Krzysztofik
24bb052d3d drm/i915/gt: Fix potential UAF by revoke of fence registers
CI has been sporadically reporting the following issue triggered by
igt@i915_selftest@live@hangcheck on ADL-P and similar machines:

<6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
...
<6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
<6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
<3> [414.070354] Unable to pin Y-tiled fence; err:-4
<3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
...
<4>[  609.603992] ------------[ cut here ]------------
<2>[  609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
<4>[  609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G     U  W          6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
<4>[  609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4>[  609.604010] Workqueue: i915 __i915_gem_free_work [i915]
<4>[  609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
...
<4>[  609.604271] Call Trace:
<4>[  609.604273]  <TASK>
...
<4>[  609.604716]  __i915_vma_evict+0x2e9/0x550 [i915]
<4>[  609.604852]  __i915_vma_unbind+0x7c/0x160 [i915]
<4>[  609.604977]  force_unbind+0x24/0xa0 [i915]
<4>[  609.605098]  i915_vma_destroy+0x2f/0xa0 [i915]
<4>[  609.605210]  __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
<4>[  609.605330]  __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
<4>[  609.605440]  process_scheduled_works+0x351/0x690
...

In the past, there were similar failures reported by CI from other IGT
tests, observed on other platforms.

Before commit 63baf4f3d5 ("drm/i915/gt: Only wait for GPU activity
before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for
idleness of vma->active via fence_update().   That commit introduced
vma->fence->active in order for the fence_update() to be able to wait
selectively on that one instead of vma->active since only idleness of
fence registers was needed.  But then, another commit 0d86ee3509
("drm/i915/gt: Make fence revocation unequivocal") replaced the call to
fence_update() in i915_vma_revoke_fence() with only fence_write(), and
also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front.
No justification was provided on why we might then expect idleness of
vma->fence->active without first waiting on it.

The issue can be potentially caused by a race among revocation of fence
registers on one side and sequential execution of signal callbacks invoked
on completion of a request that was using them on the other, still
processed in parallel to revocation of those fence registers.  Fix it by
waiting for idleness of vma->fence->active in i915_vma_revoke_fence().

Fixes: 0d86ee3509 ("drm/i915/gt: Make fence revocation unequivocal")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.8+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com
2024-06-19 12:21:52 +02:00
Jani Nikula
d754ed2821 Merge drm/drm-next into drm-intel-next
Sync to v6.10-rc3.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-19 11:38:31 +03:00
Jani Nikula
d0a6e5015f drm/i915: use i9xx_fsb_freq() for GT clock frequency
Reuse i9xx_fsb_freq() for GT clock frequency initialization instead of
depending on rawclk_freq.

Note: If the init order was changed, we could use i915->fsb_freq
directly. However, GT clock initialization is done in
i915_driver_mmio_probe(), but intel_dram_detect() later in
i915_driver_hw_probe(), with a dependency on intel_pcode_init().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0678d8ec9772725b47d4fa5b14e3b3a34256d5cf.1718356614.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-17 11:54:31 +03:00
Jani Nikula
a4ad402078 drm/i915: convert fsb_freq and mem_freq to kHz
We'll want to use fsb frequency for deriving GT clock and rawclk
frequencies in the future. Increase the accuracy by converting to
kHz. Do the same for mem freq to be aligned.

Round the frequencies ending in 666 to 667.

v2: Also handle mem_freq in gen5_rps_init() (Ville)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/17fe2544b876549f63fac0f956273f5f282081b3.1718356614.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-17 11:54:31 +03:00
Jani Nikula
024a05a47e drm/i915/gt: remove mem freq from gt debugfs
It's a bit out of place, and only printed for VLV/CHV.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/bbfec4c67a81d1d3de1f40484a80b7164e69df21.1718356614.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-17 11:54:31 +03:00
Andi Shyti
ae45f07cad drm/i915/gt/uc: Evaluate GuC priority within locks
The ce->guc_state.lock was made to protect guc_prio, which
indicates the GuC priority level.

But at the begnning of the function we perform some sanity check
of guc_prio outside its protected section. Move them within the
locked region.

Use this occasion to expand the if statement to make it clearer.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240613222402.551625-2-andi.shyti@linux.intel.com
2024-06-15 13:58:18 +02:00
Andi Shyti
45ebbbbeaa drm/i915/gt/uc: Fix typo in comment
Replace "dynmically" with "dynamically".

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240613222837.552277-1-andi.shyti@linux.intel.com
2024-06-15 12:38:59 +02:00
Andi Shyti
c677f31c85 drm/i915/gt: debugfs: Evaluate forcewake usage within locks
The forcewake count and domains listing is multi process critical
and the uncore provides a spinlock for such cases.

Lock the forcewake evaluation section in the fw_domains_show()
debugfs interface.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607145131.217251-1-andi.shyti@linux.intel.com
2024-06-11 17:08:01 +02:00
Angus Chen
79655e867a drm/i915/mtl: Update workaround 14018575942
The WA should be extended to cover VDBOX engine. We found that
28-channels 1080p VP9 encoding may hit this issue.

v3: update the WA number and explain the reason why
    this workaround is needed
v2: add WA number
v1: initial version

Signed-off-by: Angus Chen <angus.chen@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426073638.45775-1-angus.chen@intel.com
2024-06-11 16:06:20 +02:00
John Harrison
6ef078383a drm/i915/guc: Enable w/a 16021333562 for DG2, MTL and ARL
Enable another workaround that is implemented inside the GuC.

v2: Use the correct Gen12 w/a id rather than the Xe version (review
feedback from Matthew R) also extend to include ARL.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240528230515.479395-1-John.C.Harrison@Intel.com
2024-06-06 17:28:55 -07:00
Michal Wajdeczko
ce79b73336
drm/i915: Don't use __func__ as prefix for drm_dbg_printer
Updated code of drm_dbg_printer() is already printing symbolic
name of the caller like drm_dbg() does.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517163406.2348-4-michal.wajdeczko@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-06 14:46:15 -04:00
Niemiec, Krzysztof
ca1a453361 drm/i915/gt: Delete the live_hearbeat_fast selftest
The test is trying to push the heartbeat frequency to the limit, which
might sometimes fail. Such a failure does not provide valuable
information, because it does not indicate that there is something
necessarily wrong with either the driver or the hardware.

Remove the test to prevent random, unnecessary failures from appearing
in CI.

Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Niemiec, Krzysztof <krzysztof.niemiec@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe2vu5h7v7ooxbhwpbfsypxg5mjrnt56gc3cgrqpnhgrgce334@qfrv2skxrp47
2024-06-06 03:46:13 +02:00
Jani Nikula
03c7918d0d drm: move i915_drm.h under include/drm/intel
Clean up the top level include/drm directory by grouping all the Intel
specific files under a common subdirectory.

v2: Also fix comment in intel_pci_config.h (Ilpo)

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0e344a72e9be596ac2b8b55a26fd674a96f03cdc.1717075103.git.jani.nikula@intel.com
2024-05-31 16:11:09 +03:00
Jani Nikula
1bb01bdab0 drm: move i915_component.h under include/drm/intel
Clean up the top level include/drm directory by grouping all the Intel
specific files under a common subdirectory.

v2: Also change Documentation/gpu/i915.rst (Andi)

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a8c07233a8234858eb6711140482ef8db4c91cf4.1717075103.git.jani.nikula@intel.com
2024-05-31 16:11:00 +03:00
Jani Nikula
0706d57100 drm: move i915_gsc_proxy_mei_interface.h under include/drm/intel
Clean up the top level include/drm directory by grouping all the Intel
specific files under a common subdirectory.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/461662d528c3f327c81b764b7c883cd4519d8729.1717075103.git.jani.nikula@intel.com
2024-05-31 16:10:56 +03:00
Jani Nikula
05255ccbf1 drm: move intel-gtt.h under include/drm/intel
Clean up the top level include/drm directory by grouping all the Intel
specific files under a common subdirectory.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ae224504d99cc6428da6dced9dcde2b7953624ef.1717075103.git.jani.nikula@intel.com
2024-05-31 16:10:50 +03:00
Andi Shyti
ee01b6a386 drm/i915/gt: Fix CCS id's calculation for CCS mode setting
The whole point of the previous fixes has been to change the CCS
hardware configuration to generate only one stream available to
the compute users. We did this by changing the info.engine_mask
that is set during device probe, reset during the detection of
the fused engines, and finally reset again when choosing the CCS
mode.

We can't use the engine_mask variable anymore, as with the
current configuration, it imposes only one CCS no matter what the
hardware configuration is.

Before changing the engine_mask for the third time, save it and
use it for calculating the CCS mode.

After the previous changes, the user reported a performance drop
to around 1/4. We have tested that the compute operations, with
the current patch, have improved by the same factor.

Fixes: 6db31251bb ("drm/i915/gt: Enable only one CCS for compute workload")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Gnattu OC <gnattuoc@me.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Jian Ye <jian.ye@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Tested-by: Gnattu OC <gnattuoc@me.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com
(cherry picked from commit a09d2327a9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-05-29 11:35:38 +03:00
Arnd Bergmann
d4f36db623 drm/i915/guc: avoid FIELD_PREP warning
With gcc-7 and earlier, there are lots of warnings like

In file included from <command-line>:0:0:
In function '__guc_context_policy_add_priority.isra.66',
    inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3,
    inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2:
include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
...
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP'
   FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \
   ^~~~~~~~~~

Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning.

Fixes: 77b6f79df6 ("drm/i915/guc: Update to GuC version 69.0.3")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com
(cherry picked from commit 364e039827)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-05-29 11:35:26 +03:00
Chris Wilson
70cb9188ff drm/i915/gt: Disarm breadcrumbs if engines are already idle
The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.

That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.

<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968]  <TASK>
<6> [299.373970]  __schedule+0x3bb/0xda0
<6> [299.373974]  schedule+0x41/0x110
<6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
<6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]

At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.

Fixes: 9d5612ca16 ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit fbad43ecca)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-05-29 11:35:15 +03:00
Andi Shyti
a09d2327a9 drm/i915/gt: Fix CCS id's calculation for CCS mode setting
The whole point of the previous fixes has been to change the CCS
hardware configuration to generate only one stream available to
the compute users. We did this by changing the info.engine_mask
that is set during device probe, reset during the detection of
the fused engines, and finally reset again when choosing the CCS
mode.

We can't use the engine_mask variable anymore, as with the
current configuration, it imposes only one CCS no matter what the
hardware configuration is.

Before changing the engine_mask for the third time, save it and
use it for calculating the CCS mode.

After the previous changes, the user reported a performance drop
to around 1/4. We have tested that the compute operations, with
the current patch, have improved by the same factor.

Fixes: 6db31251bb ("drm/i915/gt: Enable only one CCS for compute workload")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Gnattu OC <gnattuoc@me.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Jian Ye <jian.ye@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Tested-by: Gnattu OC <gnattuoc@me.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com
2024-05-22 08:04:40 +02:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Arnd Bergmann
364e039827 drm/i915/guc: avoid FIELD_PREP warning
With gcc-7 and earlier, there are lots of warnings like

In file included from <command-line>:0:0:
In function '__guc_context_policy_add_priority.isra.66',
    inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3,
    inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2:
include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
...
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP'
   FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \
   ^~~~~~~~~~

Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning.

Fixes: 77b6f79df6 ("drm/i915/guc: Update to GuC version 69.0.3")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com
2024-05-16 09:15:09 -07:00
Tvrtko Ursulin
0f1bb41bf3 drm/i915: Support replaying GPU hangs with captured context image
When debugging GPU hangs Mesa developers are finding it useful to replay
the captured error state against the simulator. But due various simulator
limitations which prevent replicating all hangs, one step further is being
able to replay against a real GPU.

This is almost doable today with the missing part being able to upload the
captured context image into the driver state prior to executing the
uploaded hanging batch and all the buffers.

To enable this last part we add a new context parameter called
I915_CONTEXT_PARAM_CONTEXT_IMAGE. It follows the existing SSEU
configuration pattern of being able to select which context to apply
against, paired with the actual image and its size.

Since this is adding a new concept of debug only uapi, we hide it behind
a new kconfig option and also require activation with a module parameter.
Together with a warning banner printed at driver load, all those combined
should be sufficient to guard against inadvertently enabling the feature.

In terms of implementation we allow the legacy context set param to be
used since that removes the need to record the per context data in the
proto context, while still allowing flexibility of specifying context
images for any context.

Mesa MR using the uapi can be seen at:
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594

v2:
 * Fix whitespace alignment as per checkpatch.
 * Added warning on userspace misuse.
 * Rebase for extracting ce->default_state shadowing.

v3:
 * Rebase for I915_CONTEXT_PARAM_LOW_LATENCY.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Carlos Santa <carlos.santa@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-2-tursulin@igalia.com
2024-05-16 07:37:05 +00:00
Tvrtko Ursulin
e1eb97c211 drm/i915: Shadow default engine context image in the context
To enable adding override of the default engine context image let us start
shadowing the per engine state in the context.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-1-tursulin@igalia.com
2024-05-16 07:37:05 +00:00
Tvrtko Ursulin
60a2f25de7 Merge drm/drm-next into drm-intel-gt-next
Some display refactoring patches are needed in order to allow conflict-
less merging.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-05-16 07:33:01 +00:00
Linus Torvalds
db5d28c0bf drm for 6.10-rc1
new drivers:
 - panthor: ARM Mali/Immortalis CSF-based GPU driver
 
 core:
 - add a CONFIG_DRM_WERROR option
 - make more headers self-contained
 - grab resv lock in pin/unpin
 - fix vmap resv locking
 - EDID/eDP panel matching
 - Kconfig cleanups
 - DT sound bindings
 - Add SIZE_HINTS property for cursor planes
 - Add struct drm_edid_product_id and helpers.
 - Use drm device based logging in more drm functions.
 - drop seq_file.h from a bunch of places
 - use drm_edid driver conversions
 
 dp:
 - DP Tunnel documentation
 - MST read sideband cap
 - Adaptive sync SDP prep work
 
 ttm:
 - improve placement for TTM BOs in idle/busy handling
 
 panic:
 - Fixes for drm-panic, and option to test it.
 - Add drm panic to simpledrm, mgag200, imx, ast
 
 bridge:
 - improve init ordering
 - adv7511: allow GPIO pin sharing
 - tc358775: add tc358675 support
 
 panel:
 - AUO B120XAN01.0
 - Samsung s6e3fa7
 - BOE NT116WHM-N44
 - CMN N116BCA-EA1,
 - CrystalClear CMT430B19N00
 - Startek KD050HDFIA020-C020A
 - powertip PH128800T006-ZHC01
 - Innolux G121X1-L03
 - LG sw43408
 - Khadas TS050 V2
 - EDO RM69380 OLED
 - CSOT MNB601LS1-1
 
 amdgpu:
 - HDCP/ODM/RAS fixes
 - Devcoredump improvements
 - Expose VCN activity via sysfs
 - SMY 13.0.x updates
 - Enable fast updates on DCN 3.1.4
 - Add dclk and vclk reporting on additional devices
 - Add ACA RAS infrastructure
 - Implement TLB flush fence
 - EEPROM handling fixes
 - SMUIO 14.0.2 support
 - SMU 14.0.1 Updates
 - SMU 14.0.2 support
 - Sync page table freeing with TLB flushes
 - DML2 refactor
 - DC debug improvements
 - DCN 3.5.x Updates
 - GPU reset fixes
 - HDP fix for second GFX pipe on GC 10.x
 - Enable secondary GFX pipe on GC 10.3
 - Refactor and clean up BACO/BOCO/BAMACO handling
 - Remove invalid TTM resource start check
 - UAF fix in VA IOCTL
 - GPUVM page fault redirection to secondary IH rings for IH 6.x
 - Initial support for mapping kernel queues via MES
 - Fix VRAM memory accounting
 
 amdkfd:
 - MQD handling cleanup
 - Preemption handling fixes for XCDs
 - TLB flush fix for GC 9.4.2
 - Properly clean up workqueue during module unload
 - Fix memory leak process create failure
 - Range check CP bad op exception targets to avoid reporting invalid exceptions to userspace
 - Fix eviction fence handling
 - Fix leak in GPU memory allocation failure case
 - DMABuf import handling fix
 - Enable SQ watchpoint for gfx10
 
 i915:
 - Adding new DG2 PCI ID
 - add context hints for GT frequency
 - enable only one CCS for compute workloads
 - new workarounds
 - Fix UAF on destroy against retire race and remove two earlier partial fixes
 - Limit the reserved VM space to only the platforms that need it
 - Fix gt reset with GuC submission is disable
 - Add and use gt_to_guc() wrapper
 
 i915/xe display:
 - Lunar Lake display enabling, including cdclk and other refactors
 - BIOS/VBT/opregion related refactor
 - Digital port related refactor/clean-up
 - Fix 2s boot time regression on DP panel replay init
 - Remove duplication on audio enable/disable on SDVO and g4x+ DP
 - Disable AuxCCS framebuffers if built for Xe
 - Make crtc disable more atomic
 - Increase DP idle pattern wait timeout to 2ms
 - Start using container_of_const() for some extra const safety
 - Fix Jasper Lake boot freeze
 - Enable MST mode for 128b/132b single-stream sideband
 - Enable Adaptive Sync SDP Support for DP
 - Fix MTL supported DP rates - removal of UHBR13.5
 - PLL refactoring
 - Limit eDP MSO pipe only for display version 20
 - More display refactor towards independence from i915 dev_priv
 - Convert i915/xe fbdev to DRM client
 - More initial work to make display code more independent from i915
 
 xe:
 - improved error capture
 - clean up some uAPI leftovers
 - devcoredump update
 - Add BMG mocs table
 - Handle GSCCS ER interrupt
 - Implement xe2- and GuC workarounds
 - struct xe_device cleanup
 - Hwmon updates
 - Add LRC parsing for more GPU instruction
 - Increase VM_BIND number of per-ioctl Ops
 - drm/xe: Add XE_BO_GGTT_INVALIDATE flag
 - Initial development for SR-IOV support
 - Add new PCI IDs to DG2 platform
 - Move userptr over to start using hmm_range_fault
 
 msm:
 - Switched to generating register header files during build process
   instead of shipping pre-generated headers
 - Merged DPU and MDP4 format databases.
 - DP:
 - Stop using compat string to distinguish DP and eDP cases
 - Added support for X Elite platform (X1E80100)
 - Reworked DP aux/audio support
 - Added SM6350 DP to the bindings
 - GPU:
 - a7xx perfcntr reg fixes
 - MAINTAINERS updates
 - a750 devcoredump support
 
 radeon:
 - Silence UBSAN warnings related to flexible arrays
 
 nouveau:
 - move some uAPI objects to uapi headers
 
 omapdrm:
 - console fix
 
 ast:
 - add i2c polling
 
 qaic:
 - add debugfs entries
 
 exynos:
 - fix platform_driver .owner
 - drop cleanup code
 
 mediatek:
 - Use devm_platform_get_and_ioremap_resource() in mtk_hdmi_ddc_probe()
 - Add GAMMA 12-bit LUT support for MT8188
 - Rename mtk_drm_* to mtk_*
 - Drop driver owner initialization
 - Correct calculation formula of PHY Timing
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmZEUU0ACgkQDHTzWXnE
 hr5qMBAAjUFF0w3YOQMsn0LEAm628kMRHpoVeSXmIfO9z9lTyad30EtiS4ggFgj7
 Q/oQ6hHCd5jdsvGSJDgtTTAsTQX+aCkXrgf/18ENbqR5mM3MdefUAPR/zawZ7HR4
 8+b2h6p7gHBw8wDjuIvQ5e9InHcnIkKWJc82qnJG5Urgxa05SDh3mu3cosPTJiBw
 a851vlWaYcxC0yAUwJlWaXDdN8yzdFaSQNboZBS/CMLXF/WE6Ht257uxJmaouc0Y
 Z0kBybok5x0TPQEXF9IV+kuSW3EYpYcwRi0BFFM9sJjkEBdH3rYRZwuYP1LR+7VZ
 HKsmIkie8YzCm2VwTquYzUvHgF+swZX4RRch9XJlGz7UvBLc0eBO/2n4X6fNd8Kl
 QGNNqEfsnUQrAHKvGsOUgoGjSCmEo8voGcMZ3JPIAdJ/GcnJwpMvNxtF6XB08hEu
 rDxuU6o7WkM4dJbtiaFEHNh0Fmjj6aXdBL23UD9pcqPT1fc9cT3xnUd5RJIRuRwV
 /tpb2WfkFAoxCkKFiunaC4rE8oG6ME6wr/trYjvoYuhCI5hCVaXRBGzJEtC30IP6
 lG2YZ8r0jHjktbgjZ0Cz/hY424H4sxSN9SJAnXXFDzcfjBJ/nOgo5nMD1jKajAD5
 SYfqWaD5Y+YygtyLJPMfZQI2XMOpCzteXD8uaNXXFJfpV7Apeyg=
 =ocVM
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-05-15' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "This is the main pull request for the drm subsystems for 6.10.

  In drivers the main thing is a new driver for ARM Mali firmware based
  GPUs, otherwise there are a lot of changes to amdgpu/xe/i915/msm and
  scattered changes to everything else.

  In the core a bunch of headers and Kconfig was refactored, along with
  the addition of a new panic handler which is meant to provide a user
  friendly message when a panic happens and graphical display is
  enabled.

  New drivers:
   - panthor: ARM Mali/Immortalis CSF-based GPU driver

  Core:
   - add a CONFIG_DRM_WERROR option
   - make more headers self-contained
   - grab resv lock in pin/unpin
   - fix vmap resv locking
   - EDID/eDP panel matching
   - Kconfig cleanups
   - DT sound bindings
   - Add SIZE_HINTS property for cursor planes
   - Add struct drm_edid_product_id and helpers.
   - Use drm device based logging in more drm functions.
   - drop seq_file.h from a bunch of places
   - use drm_edid driver conversions

  dp:
   - DP Tunnel documentation
   - MST read sideband cap
   - Adaptive sync SDP prep work

  ttm:
   - improve placement for TTM BOs in idle/busy handling

  panic:
   - Fixes for drm-panic, and option to test it.
   - Add drm panic to simpledrm, mgag200, imx, ast

  bridge:
   - improve init ordering
   - adv7511: allow GPIO pin sharing
   - tc358775: add tc358675 support

  panel:
   - AUO B120XAN01.0
   - Samsung s6e3fa7
   - BOE NT116WHM-N44
   - CMN N116BCA-EA1,
   - CrystalClear CMT430B19N00
   - Startek KD050HDFIA020-C020A
   - powertip PH128800T006-ZHC01
   - Innolux G121X1-L03
   - LG sw43408
   - Khadas TS050 V2
   - EDO RM69380 OLED
   - CSOT MNB601LS1-1

  amdgpu:
   - HDCP/ODM/RAS fixes
   - Devcoredump improvements
   - Expose VCN activity via sysfs
   - SMY 13.0.x updates
   - Enable fast updates on DCN 3.1.4
   - Add dclk and vclk reporting on additional devices
   - Add ACA RAS infrastructure
   - Implement TLB flush fence
   - EEPROM handling fixes
   - SMUIO 14.0.2 support
   - SMU 14.0.1 Updates
   - SMU 14.0.2 support
   - Sync page table freeing with TLB flushes
   - DML2 refactor
   - DC debug improvements
   - DCN 3.5.x Updates
   - GPU reset fixes
   - HDP fix for second GFX pipe on GC 10.x
   - Enable secondary GFX pipe on GC 10.3
   - Refactor and clean up BACO/BOCO/BAMACO handling
   - Remove invalid TTM resource start check
   - UAF fix in VA IOCTL
   - GPUVM page fault redirection to secondary IH rings for IH 6.x
   - Initial support for mapping kernel queues via MES
   - Fix VRAM memory accounting

  amdkfd:
   - MQD handling cleanup
   - Preemption handling fixes for XCDs
   - TLB flush fix for GC 9.4.2
   - Properly clean up workqueue during module unload
   - Fix memory leak process create failure
   - Range check CP bad op exception targets to avoid reporting invalid exceptions to userspace
   - Fix eviction fence handling
   - Fix leak in GPU memory allocation failure case
   - DMABuf import handling fix
   - Enable SQ watchpoint for gfx10

  i915:
   - Adding new DG2 PCI ID
   - add context hints for GT frequency
   - enable only one CCS for compute workloads
   - new workarounds
   - Fix UAF on destroy against retire race and remove two earlier partial fixes
   - Limit the reserved VM space to only the platforms that need it
   - Fix gt reset with GuC submission is disable
   - Add and use gt_to_guc() wrapper

  i915/xe display:
   - Lunar Lake display enabling, including cdclk and other refactors
   - BIOS/VBT/opregion related refactor
   - Digital port related refactor/clean-up
   - Fix 2s boot time regression on DP panel replay init
   - Remove duplication on audio enable/disable on SDVO and g4x+ DP
   - Disable AuxCCS framebuffers if built for Xe
   - Make crtc disable more atomic
   - Increase DP idle pattern wait timeout to 2ms
   - Start using container_of_const() for some extra const safety
   - Fix Jasper Lake boot freeze
   - Enable MST mode for 128b/132b single-stream sideband
   - Enable Adaptive Sync SDP Support for DP
   - Fix MTL supported DP rates - removal of UHBR13.5
   - PLL refactoring
   - Limit eDP MSO pipe only for display version 20
   - More display refactor towards independence from i915 dev_priv
   - Convert i915/xe fbdev to DRM client
   - More initial work to make display code more independent from i915

  xe:
   - improved error capture
   - clean up some uAPI leftovers
   - devcoredump update
   - Add BMG mocs table
   - Handle GSCCS ER interrupt
   - Implement xe2- and GuC workarounds
   - struct xe_device cleanup
   - Hwmon updates
   - Add LRC parsing for more GPU instruction
   - Increase VM_BIND number of per-ioctl Ops
   - drm/xe: Add XE_BO_GGTT_INVALIDATE flag
   - Initial development for SR-IOV support
   - Add new PCI IDs to DG2 platform
   - Move userptr over to start using hmm_range_fault

  msm:
   - Switched to generating register header files during build process
     instead of shipping pre-generated headers
   - Merged DPU and MDP4 format databases.
   - DP:
     - Stop using compat string to distinguish DP and eDP cases
     - Added support for X Elite platform (X1E80100)
     - Reworked DP aux/audio support
     - Added SM6350 DP to the bindings
   - GPU:
     - a7xx perfcntr reg fixes
     - MAINTAINERS updates
     - a750 devcoredump support

  radeon:
   - Silence UBSAN warnings related to flexible arrays

  nouveau:
   - move some uAPI objects to uapi headers

  omapdrm:
   - console fix

  ast:
   - add i2c polling

  qaic:
   - add debugfs entries

  exynos:
   - fix platform_driver .owner
   - drop cleanup code

  mediatek:
   - Use devm_platform_get_and_ioremap_resource() in mtk_hdmi_ddc_probe()
   - Add GAMMA 12-bit LUT support for MT8188
   - Rename mtk_drm_* to mtk_*
   - Drop driver owner initialization
   - Correct calculation formula of PHY Timing"

* tag 'drm-next-2024-05-15' of https://gitlab.freedesktop.org/drm/kernel: (1477 commits)
  drm/xe/ads: Use flexible-array
  drm/xe: Use ordered WQ for G2H handler
  drm/msm/gen_header: allow skipping the validation
  drm/msm/a6xx: Cleanup indexed regs const'ness
  drm/msm: Add devcoredump support for a750
  drm/msm: Adjust a7xx GBIF debugbus dumping
  drm/msm: Update a6xx registers XML
  drm/msm: Fix imported a750 snapshot header for upstream
  drm/msm: Import a750 snapshot registers from kgsl
  MAINTAINERS: Add Konrad Dybcio as a reviewer for the Adreno driver
  MAINTAINERS: Add a separate entry for Qualcomm Adreno GPU drivers
  drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails
  drm/msm/adreno: fix CP cycles stat retrieval on a7xx
  drm/msm/a7xx: allow writing to CP_BV counter selection registers
  drm: zynqmp_dpsub: Always register bridge
  Revert "drm/bridge: ti-sn65dsi83: Fix enable error path"
  drm/fb_dma: Add checks in drm_fb_dma_get_scanout_buffer()
  drm/fbdev-generic: Do not set physical framebuffer address
  drm/panthor: Fix the FW reset logic
  drm/panthor: Make sure we handle 'unknown group state' case properly
  ...
2024-05-15 09:43:42 -07:00
Chris Wilson
fbad43ecca drm/i915/gt: Disarm breadcrumbs if engines are already idle
The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.

That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.

<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968]  <TASK>
<6> [299.373970]  __schedule+0x3bb/0xda0
<6> [299.373974]  schedule+0x41/0x110
<6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
<6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]

At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.

Fixes: 9d5612ca16 ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com
2024-05-14 13:18:34 +02:00
Ville Syrjälä
d082c05a63 drm/i915: Pass the region ID rather than a bitmask to HAS_REGION()
The name 'HAS_REGION()' suggests we are checking for a single
region, so seem more sensible to pass in the region ID rather
than a bitmask.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-2-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-08 14:24:45 +03:00
Ville Syrjälä
8b69ac66d6 drm/i915: Fix HAS_REGION() usage in intel_gt_probe_lmem()
HAS_REGION() takes a bitmask, not the region ID. This causes the
GEM_BUG_ON() to assert that the SMEM region is available rather
than the intended LMEM region. No real harm since SMEM is always
available, but also not checking what was intended.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-08 14:22:34 +03:00
Andi Shyti
51c1b42a23
drm/i915/gt: Automate CCS Mode setting during engine resets
We missed setting the CCS mode during resume and engine resets.
Create a workaround to be added in the engine's workaround list.
This workaround sets the XEHP_CCS_MODE value at every reset.

The issue can be reproduced by running:

  $ clpeak --kernel-latency

Without resetting the CCS mode, we encounter a fence timeout:

  Fence expiration time out i915-0000:03:00.0:clpeak[2387]:2!

Fixes: 6db31251bb ("drm/i915/gt: Enable only one CCS for compute workload")
Reported-by: Gnattu OC <gnattuoc@me.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Tested-by: Gnattu OC <gnattuoc@me.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426000723.229296-1-andi.shyti@linux.intel.com
(cherry picked from commit 4cfca03f76)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-06 14:15:24 -04:00
Dave Airlie
9f9039c6ef Core DRM:
- Export drm_client_dev_unregister (Thomas Zimmermann)
 
 Display i915:
 - More initial work to make display code more independent from i915 (Jani)
 - Convert i915/xe fbdev to DRM client (Thomas Zimmermann)
 - VLV/CHV DPIO register cleanup (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmYxTr4ACgkQ+mJfZA7r
 E8pagAf+LArd0UeMypS7ZDQEBJ8Og6c/mU9OolAByvJeCJ9tqq0giEOwuqYfUQ1U
 nClUMkRk9M3LqdLq/n8hVUBKxCs+aPSgcxfo4+jAXnOsTvcXaDX+sCoSOHP7JmVd
 3DvJwr7SCHcyC2xFlxJ8jrbwdRu7TK2ezOq4W2LKiTIGjHtnQou2Gf15B7FgeKMR
 XaOGlXp50rZCGCLYbkStV8kjsjyJ8Md87JHSv9/Lr5mS4hcF4OR2dRXV4ojjXYrW
 IqRfAEcjXQIdkv8NLU0pXPqnI79DGb1CtLwQachNRY7OgPD4PTGVFr+meHF4DW7u
 RF32WQxu8O8/02HuXAEnLxzVrc6Zdg==
 =ASC/
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2024-04-30' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next

Core DRM:
- Export drm_client_dev_unregister (Thomas Zimmermann)

Display i915:
- More initial work to make display code more independent from i915 (Jani)
- Convert i915/xe fbdev to DRM client (Thomas Zimmermann)
- VLV/CHV DPIO register cleanup (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZjFPcSCTd_5c0XU_@intel.com
2024-05-02 14:30:31 +10:00
Andi Shyti
4cfca03f76 drm/i915/gt: Automate CCS Mode setting during engine resets
We missed setting the CCS mode during resume and engine resets.
Create a workaround to be added in the engine's workaround list.
This workaround sets the XEHP_CCS_MODE value at every reset.

The issue can be reproduced by running:

  $ clpeak --kernel-latency

Without resetting the CCS mode, we encounter a fence timeout:

  Fence expiration time out i915-0000:03:00.0:clpeak[2387]:2!

Fixes: 2bebae0112 ("drm/i915/gt: Enable only one CCS for compute workload")
Reported-by: Gnattu OC <gnattuoc@me.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Tested-by: Gnattu OC <gnattuoc@me.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426000723.229296-1-andi.shyti@linux.intel.com
2024-04-30 13:41:20 +02:00
Dave Airlie
68b89e23c2 Merge tag 'drm-intel-gt-next-2024-04-26' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next
UAPI Changes:

- drm/i915/guc: Use context hints for GT frequency

    Allow user to provide a low latency context hint. When set, KMD
    sends a hint to GuC which results in special handling for this
    context. SLPC will ramp the GT frequency aggressively every time
    it switches to this context. The down freq threshold will also be
    lower so GuC will ramp down the GT freq for this context more slowly.
    We also disable waitboost for this context as that will interfere with
    the strategy.

    We need to enable the use of SLPC Compute strategy during init, but
    it will apply only to contexts that set this bit during context
    creation.

    Userland can check whether this feature is supported using a new param-
    I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
    enabled platforms as they use SLPC for frequency management.

    The Mesa usage model for this flag is here -
    https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint

- drm/i915/gt: Enable only one CCS for compute workload

    Enable only one CCS engine by default with all the compute sices
    allocated to it.

    While generating the list of UABI engines to be exposed to the
    user, exclude any additional CCS engines beyond the first
    instance

    ***

    NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
    default to mitigate a hardware bug. All the EUs will still remain
    usable, and all the userspace drivers have been confirmed to be able
    to dynamically detect the change in number of CCS engines and adjust.

    For the smaller percent of applications that get perf benefit from
    letting the userspace driver dispatch across all 4 CCS engines we will
    be introducing a sysfs control as a later patch to choose 4 CCS each
    with 25% EUs (or 50% if 2 CCS).

    NOTE: A regression has been reported at

    https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895

    However Andi has been triaging the issue and we're closing in a fix
    to the gap in the W/A implementation:

    https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html

Driver Changes:

- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
  Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
  Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
  partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)

- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)

. Selftest improvements (Janusz, Nirmoy, Daniele)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com
2024-04-30 14:40:43 +10:00
Jani Nikula
4973e63240 drm/i915/display: split out intel_fbc_regs.h from i915_reg.h
Clean up i915_reg.h.

v2: Drop chicken regs and comments (Ville)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/aa9b5d8adefbe97e1e37c9cfada3ab1581b0e8d5.1714128645.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-04-29 12:30:01 +03:00
Kent Overstreet
0069455bcb fix missing vmalloc.h includes
Patch series "Memory allocation profiling", v6.

Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.

Example output:
  root@moria-kvm:~# sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation

sysctl:
  /proc/sys/vm/mem_profiling

Runtime info:
  /proc/allocinfo

Notes:

[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)  && CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y

Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:

                        kmalloc                 pgalloc
(1 baseline)            6.764s                  16.902s
(2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)

Memory overhead:
Kernel size:

   text           data        bss         dec         diff
(1) 26515311	      18890222    17018880    62424413
(2) 26524728	      19423818    16740352    62688898    264485
(3) 26524724	      19423818    16740352    62688894    264481
(4) 26524728	      19423818    16740352    62688898    264485
(5) 26541782	      18964374    16957440    62463596    39183

Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:           192 kB
PageExts:         262144 kB (256MB)
SlabExts:           9876 kB (9.6MB)
PcpuExts:            512 kB (0.5MB)

Total overhead is 0.2% of total memory.

Benchmarks:

Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
      baseline       disabled profiling           enabled profiling
avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
stdev 0.0137         0.0188                       0.0077


hackbench -l 10000
      baseline       disabled profiling           enabled profiling
avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
stdev 0.0933         0.0286                       0.0489

stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/

[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/


This patch (of 37):

The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.

[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
  Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
  Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
  Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
  Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:49 -07:00
Nirmoy Das
4d3421e04c drm/i915: Fix gt reset with GuC submission is disabled
Currently intel_gt_reset() kills the GuC and then resets requested
engines. This is problematic because there is a dedicated CSB FIFO
which only GuC can access and if that FIFO fills up, the hardware
will block on the next context switch until there is space that means
the system is effectively hung. If an engine is reset whilst actively
executing a context, a CSB entry will be sent to say that the context
has gone idle. Thus if reset happens on a very busy system then
killing GuC before killing the engines will lead to deadlock because
of filled up CSB FIFO.

To address this issue, the GuC should be killed only after resetting
the requested engines and before calling intel_gt_init_hw().

v2: Improve commit message(John)

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422201951.633-2-nirmoy.das@intel.com
2024-04-24 18:48:32 +02:00
Nirmoy Das
31c3c53ee3 drm/i915: Refactor confusing __intel_gt_reset()
__intel_gt_reset() is really for resetting engines though
the name might suggest something else. So add a helper function
to remove confusions with no functional changes.

v2: Move intel_gt_reset_all_engines() next to
    intel_gt_reset_engine() to make diff simple(John)

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422201951.633-1-nirmoy.das@intel.com
2024-04-24 18:48:31 +02:00
Dave Airlie
0208ca55aa Linux 6.9-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmYlapoeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGJLgH/RFrsXefpxAWACJv
 wRmIx/YFjI3Z4nypBZPXysX/JOuf/cf4bR22jMHC1vhQDtxVgB7eUpEYx0C0Y5RD
 ll7cunAZCpP+PB725/bOWbaF94eigl8xC4RrKPon4tG1CS9NAvCJFDyNgWpqh48s
 sSKKae68a16zMtIX5Za8nvmsBQOeD67ZFBgI4m7wIDDhTXY3XlG84r8Rz/ZwcdE/
 rExFkDMUqjpoJVAMBtETQNk4pvKZVMrW2B0f/xZSR+IKXw5l2FO0raXkIG/6TERs
 CxJvCuUjKIosqFLIK2O/cU9G+5V2axZ/WD8YRwDr2vlnnOPht+dFQzyPcuSqezbG
 ShIhQo8=
 =GI5F
 -----END PGP SIGNATURE-----

Backmerge tag 'v6.9-rc5' into drm-next

Linux 6.9-rc5

I've had a persistent msm failure on clang, and the fix is in fixes
so just pull it back to fix that.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-22 14:35:52 +10:00
Christophe JAILLET
2af231e1b8
drm/i915/guc: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().

Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_range() is inclusive. So a -1 has been added when needed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7108c1871c6cb08d403c4fa6534bc7e6de4cb23d.1705245316.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-14 15:09:18 -04:00
Jani Nikula
87816d6074 drm/i915/gt: drop display clock info from gt debugfs
The same info is available in i915_cdclk_info.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/50461f13ab09b162de25d3f3587890548f4db499.1712599670.git.jani.nikula@intel.com
2024-04-09 11:29:38 +03:00
John Harrison
152191e5e9
drm/i915/guc: Fix the fix for reset lock confusion
The previous fix for the circlular lock splat about the busyness
worker wasn't quite complete. Even though the reset-in-progress flag
is cleared at the start of intel_uc_reset_finish, the entire function
is still inside the reset mutex lock. Not sure why the patch appeared
to fix the issue both locally and in CI. However, it is now back
again.

There is a further complication that the wedge code path within
intel_gt_reset() jumps around so much that it results in nested
reset_prepare/_finish calls. That is, the call sequence is:
  intel_gt_reset
  | reset_prepare
  | __intel_gt_set_wedged
  | | reset_prepare
  | | reset_finish
  | reset_finish

The nested finish means that even if the clear of the in-progress flag
was moved to the end of _finish, it would still be clear for the
entire second call. Surprisingly, this does not seem to be causing any
other problems at present.

As an aside, a wedge on fini does not call the finish functions at
all. The reset_in_progress flag is left set (twice).

So instead of trying to cancel the worker anywhere at all in the reset
path, just add a cancel to intel_guc_submission_fini instead. Note
that it is not a problem if the worker is still active during a reset.
Either it will run before the reset path starts locking things and
will simply block the reset code for a tiny amount of time. Or it will
run after the locks have been acquired and will early exit due to the
try-lock.

Also, do not use the reset-in-progress flag to decide whether a
synchronous cancel is safe (from a lockdep perspective) or not.
Instead, use the actual reset mutex state (both the genuine one and
the custom rolled BACKOFF one).

Fixes: 0e00a8814e ("drm/i915/guc: Avoid circular locking issue on busyness flush")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com
(cherry picked from commit 3563d85531)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-08 13:09:37 -04:00
John Harrison
3563d85531 drm/i915/guc: Fix the fix for reset lock confusion
The previous fix for the circlular lock splat about the busyness
worker wasn't quite complete. Even though the reset-in-progress flag
is cleared at the start of intel_uc_reset_finish, the entire function
is still inside the reset mutex lock. Not sure why the patch appeared
to fix the issue both locally and in CI. However, it is now back
again.

There is a further complication that the wedge code path within
intel_gt_reset() jumps around so much that it results in nested
reset_prepare/_finish calls. That is, the call sequence is:
  intel_gt_reset
  | reset_prepare
  | __intel_gt_set_wedged
  | | reset_prepare
  | | reset_finish
  | reset_finish

The nested finish means that even if the clear of the in-progress flag
was moved to the end of _finish, it would still be clear for the
entire second call. Surprisingly, this does not seem to be causing any
other problems at present.

As an aside, a wedge on fini does not call the finish functions at
all. The reset_in_progress flag is left set (twice).

So instead of trying to cancel the worker anywhere at all in the reset
path, just add a cancel to intel_guc_submission_fini instead. Note
that it is not a problem if the worker is still active during a reset.
Either it will run before the reset path starts locking things and
will simply block the reset code for a tiny amount of time. Or it will
run after the locks have been acquired and will early exit due to the
try-lock.

Also, do not use the reset-in-progress flag to decide whether a
synchronous cancel is safe (from a lockdep perspective) or not.
Instead, use the actual reset mutex state (both the genuine one and
the custom rolled BACKOFF one).

Fixes: 0e00a8814e ("drm/i915/guc: Avoid circular locking issue on busyness flush")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240329235306.1559639-1-John.C.Harrison@Intel.com
2024-04-05 09:31:37 -07:00
Rodrigo Vivi
7406538860
drm/i915/guc: Remove bogus null check
This null check is bogus because we are already using 'ce' stuff
in many places before this function is called.

Having this here is useless and confuses static analyzer tools
that can see:

struct intel_engine_cs *engine = ce->engine;

before this check, in the same function.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202403101225.7AheJhZJ-lkp@intel.com/
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328213107.90632-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-03 14:44:22 -04:00
Andi Shyti
6db31251bb
drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.

While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance.

This change can be tested with igt i915_query.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com
(cherry picked from commit 2bebae0112)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-03 14:26:10 -04:00
Andi Shyti
ea315f98e5
drm/i915/gt: Do not generate the command streamer for all the CCS
We want a fixed load CCS balancing consisting in all slices
sharing one single user engine. For this reason do not create the
intel_engine_cs structure with its dedicated command streamer for
CCS slices beyond the first.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com
(cherry picked from commit c7a5aa4e57)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-03 14:26:10 -04:00
Andi Shyti
bc9a1ec012
drm/i915/gt: Disable HW load balancing for CCS
The hardware should not dynamically balance the load between CCS
engines. Wa_14019159160 recommends disabling it across all
platforms.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com
(cherry picked from commit f5d2904cf8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-03 14:26:10 -04:00
Andi Shyti
94bf3e60e1
drm/i915/gt: Limit the reserved VM space to only the platforms that need it
Commit 9bb66c179f ("drm/i915: Reserve some kernel space per
vm") reduces the available VM space of one page in order to apply
Wa_16018031267 and Wa_16018063123.

This page was reserved indiscrimitely in all platforms even when
not needed. Limit it to DG2 onwards.

Fixes: 9bb66c179f ("drm/i915: Reserve some kernel space per vm")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327200546.640108-1-andi.shyti@linux.intel.com
(cherry picked from commit 9721634441)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-03 14:26:10 -04:00
Rodrigo Vivi
5add703f6a
Merge drm/drm-next into drm-intel-next
Catching up on 6.9-rc2

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-02 08:17:13 -04:00
Andi Shyti
2bebae0112 drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.

While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance.

This change can be tested with igt i915_query.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com
2024-03-30 01:17:36 +01:00
Andi Shyti
c7a5aa4e57 drm/i915/gt: Do not generate the command streamer for all the CCS
We want a fixed load CCS balancing consisting in all slices
sharing one single user engine. For this reason do not create the
intel_engine_cs structure with its dedicated command streamer for
CCS slices beyond the first.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com
2024-03-30 01:17:35 +01:00
Andi Shyti
f5d2904cf8 drm/i915/gt: Disable HW load balancing for CCS
The hardware should not dynamically balance the load between CCS
engines. Wa_14019159160 recommends disabling it across all
platforms.

Fixes: d2eae8e98d ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com
2024-03-30 01:17:35 +01:00
Andi Shyti
9721634441 drm/i915/gt: Limit the reserved VM space to only the platforms that need it
Commit 9bb66c179f ("drm/i915: Reserve some kernel space per
vm") reduces the available VM space of one page in order to apply
Wa_16018031267 and Wa_16018063123.

This page was reserved indiscrimitely in all platforms even when
not needed. Limit it to DG2 onwards.

Fixes: 9bb66c179f ("drm/i915: Reserve some kernel space per vm")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240327200546.640108-1-andi.shyti@linux.intel.com
2024-03-28 20:19:11 +01:00
Chris Wilson
4a3859ea52
drm/i915/gt: Reset queue_priority_hint on parking
Originally, with strict in order execution, we could complete execution
only when the queue was empty. Preempt-to-busy allows replacement of an
active request that may complete before the preemption is processed by
HW. If that happens, the request is retired from the queue, but the
queue_priority_hint remains set, preventing direct submission until
after the next CS interrupt is processed.

This preempt-to-busy race can be triggered by the heartbeat, which will
also act as the power-management barrier and upon completion allow us to
idle the HW. We may process the completion of the heartbeat, and begin
parking the engine before the CS event that restores the
queue_priority_hint, causing us to fail the assertion that it is MIN.

<3>[  166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[  166.210781] Dumping ftrace buffer:
<0>[  166.210795] ---------------------------------
...
<0>[  167.302811] drm_fdin-1097      2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 }
<0>[  167.302861] drm_fdin-1097      2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646
<0>[  167.302928] drm_fdin-1097      2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0
<0>[  167.302992] drm_fdin-1097      2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659
<0>[  167.303044] drm_fdin-1097      2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40
<0>[  167.303095] drm_fdin-1097      2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 }
<0>[  167.303159] kworker/-89       11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2
<0>[  167.303208] kworker/-89       11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin
<0>[  167.303272] kworker/-89       11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2
<0>[  167.303321] kworker/-89       11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin
<0>[  167.303384] kworker/-89       11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660
<0>[  167.303434] kworker/-89       11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns }
<0>[  167.303484] kworker/-89       11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked
<0>[  167.303534]   <idle>-0         5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040
<0>[  167.303583] kworker/-89       11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns }
<0>[  167.303756] kworker/-89       11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns }
<0>[  167.303806] kworker/-89       11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[  167.303811] ---------------------------------
<4>[  167.304722] ------------[ cut here ]------------
<2>[  167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283!
<4>[  167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G        W          6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1
<4>[  167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
<4>[  167.304738] Workqueue: i915-unordered retire_work_handler [i915]
<4>[  167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915]
<4>[  167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48
<4>[  167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246
<4>[  167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006
<4>[  167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
<4>[  167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001
<4>[  167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000
<4>[  167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0
<4>[  167.304950] FS:  0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000
<4>[  167.304952] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0
<4>[  167.304955] PKRU: 55555554
<4>[  167.304957] Call Trace:
<4>[  167.304958]  <TASK>
<4>[  167.305573]  ____intel_wakeref_put_last+0x1d/0x80 [i915]
<4>[  167.305685]  i915_request_retire.part.0+0x34f/0x600 [i915]
<4>[  167.305800]  retire_requests+0x51/0x80 [i915]
<4>[  167.305892]  intel_gt_retire_requests_timeout+0x27f/0x700 [i915]
<4>[  167.305985]  process_scheduled_works+0x2db/0x530
<4>[  167.305990]  worker_thread+0x18c/0x350
<4>[  167.305993]  kthread+0xfe/0x130
<4>[  167.305997]  ret_from_fork+0x2c/0x50
<4>[  167.306001]  ret_from_fork_asm+0x1b/0x30
<4>[  167.306004]  </TASK>

It is necessary for the queue_priority_hint to be lower than the next
request submission upon waking up, as we rely on the hint to decide when
to kick the tasklet to submit that first request.

Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 98850e96cf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28 12:16:16 -04:00
Tejas Upadhyay
186bce6827
drm/i915/mtl: Update workaround 14018575942
Applying WA 14018575942 only on Compute engine has impact on
some apps like chrome. Updating this WA to apply on Render
engine as well as it is helping with performance on Chrome.

Note: There is no concern from media team thus not applying
WA on media engines. We will revisit if any issues reported
from media team.

V2(Matt):
 - Use correct WA number

Fixes: 668f37e1ee ("drm/i915/mtl: Update workaround 14018778641")
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240228103738.2018458-1-tejas.upadhyay@intel.com
(cherry picked from commit 7127128017)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28 12:16:15 -04:00