Commit Graph

1381929 Commits

Author SHA1 Message Date
Rafael J. Wysocki
d405ec23df ACPI: processor: perflib: Move problematic pr->performance check
Commit d33bd88ac0 ("ACPI: processor: perflib: Fix initial _PPC limit
application") added a pr->performance check that prevents the frequency
QoS request from being added when the given processor has no performance
object.  Unfortunately, this causes a WARN() in freq_qos_remove_request()
to trigger on an attempt to take the given CPU offline later because the
frequency QoS object has not been added for it due to the missing
performance object.

Address this by moving the pr->performance check before calling
acpi_processor_get_platform_limit() so it only prevents a limit from
being set for the CPU if the performance object is not present.  This
way, the frequency QoS request is added as it was before the above
commit and it is present all the time along with the CPU's cpufreq
policy regardless of whether or not the CPU is online.

Fixes: d33bd88ac0 ("ACPI: processor: perflib: Fix initial _PPC limit application")
Tested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2801421.mvXUDI8C0e@rafael.j.wysocki
2025-08-12 15:33:58 +02:00
Paolo Abeni
c04fdca8a9 ipsec-2025-08-11
Merge tag 'ipsec-2025-08-11' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2025-08-11

1) Fix flushing of all states in xfrm_state_fini.
   From Sabrina Dubroca.

2) Fix some IPsec software offload features. These
   got lost with some recent HW offload changes.
   From Sabrina Dubroca.

Please pull or let me know if there are problems.

* tag 'ipsec-2025-08-11' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  udp: also consider secpath when evaluating ipsec use for checksumming
  xfrm: bring back device check in validate_xmit_xfrm
  xfrm: restore GSO for SW crypto
  xfrm: flush all states in xfrm_state_fini
====================

Link: https://patch.msgid.link/20250811092008.731573-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 15:01:09 +02:00
Paolo Abeni
74078816f8 Merge branch 'net-prevent-deadlocks-and-mis-configuration-with-per-napi-threaded-config'
Jakub Kicinski says:

====================
net: prevent deadlocks and mis-configuration with per-NAPI threaded config

Running the test added with a recent fix on a driver with persistent
NAPI config leads to a deadlock. The deadlock is fixed by patch 3;
patch 2 addresses what I think is a more fundamental problem with the
way we implemented the config.

I hope the fix makes sense, my own thinking is definitely colored
by my preference (IOW how the per-queue config RFC was implemented).

v1: https://lore.kernel.org/20250808014952.724762-1-kuba@kernel.org
====================

Link: https://patch.msgid.link/20250809001205.1147153-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 14:43:08 +02:00
Jakub Kicinski
b3fc08ab9a net: prevent deadlocks when enabling NAPIs with mixed kthread config
The following order of calls currently deadlocks if:
 - device has threaded=1; and
 - NAPI has persistent config with threaded=0.

  netif_napi_add_weight_config()
    dev->threaded == 1
      napi_kthread_create()

  napi_enable()
    napi_restore_config()
      napi_set_threaded(0)
        napi_stop_kthread()
          while (NAPIF_STATE_SCHED)
            msleep(20)

We deadlock because disabled NAPI has STATE_SCHED set.
Creating a thread in netif_napi_add() just to destroy it in
napi_disable() is fairly ugly in the first place. Let's read
both the device config and the NAPI config in netif_napi_add().

Fixes: e6d7626881 ("net: Update threaded state in napi config in netif_set_threaded")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250809001205.1147153-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 14:43:05 +02:00
Jakub Kicinski
ccba9f6baa net: update NAPI threaded config even for disabled NAPIs
We have to make sure that all future NAPIs will have the right threaded
state when the state is configured on the device level.
We chose not to have an "unset" state for threaded, and not to wipe
the NAPI config clean when channels are explicitly disabled.
This means the persistent config structs "exist" even when their NAPIs
are not instantiated.

Differently put - the NAPI persistent state lives in the net_device
(ncfg == struct napi_config):

    ,--- [napi 0] - [napi 1]
 [dev]      |          |
    `--- [ncfg 0] - [ncfg 1]

so say we have a device with 2 queues but only 1 enabled:

    ,--- [napi 0]
 [dev]      |
    `--- [ncfg 0] - [ncfg 1]

now we set the device to threaded=1:

    ,---------- [napi 0 (thr:1)]
 [dev(thr:1)]      |
    `---------- [ncfg 0 (thr:1)] - [ncfg 1 (thr:?)]

Since [ncfg 1] was not attached to a NAPI during configuration we
skipped it. If we create a NAPI for it later it will have the old
setting (presumably disabled). One could argue if this is right
or not "in principle", but it's definitely not how things worked
before per-NAPI config.
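
The diagrams above can be modelled with a small sketch. Everything here is a hypothetical stand-in (struct toy_dev and both setters are invented for illustration and are not the kernel's data structures); it only contrasts updating the config slots that have a live NAPI with updating every slot.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 2

/* Hypothetical model: one persistent config slot per queue, NAPI optional. */
struct toy_dev {
    bool threaded;
    bool napi_present[TOY_MAX_QUEUES];   /* is a NAPI instantiated? */
    bool ncfg_threaded[TOY_MAX_QUEUES];  /* persistent per-queue config */
};

/* Behaviour described as the problem: only slots with a live NAPI are updated. */
static void toy_set_threaded_live_only(struct toy_dev *dev, bool threaded)
{
    dev->threaded = threaded;
    for (size_t i = 0; i < TOY_MAX_QUEUES; i++)
        if (dev->napi_present[i])
            dev->ncfg_threaded[i] = threaded;
}

/* Behaviour the commit title asks for: every config slot is updated. */
static void toy_set_threaded_all(struct toy_dev *dev, bool threaded)
{
    dev->threaded = threaded;
    for (size_t i = 0; i < TOY_MAX_QUEUES; i++)
        dev->ncfg_threaded[i] = threaded;
}
```

With queue 1 disabled, the first setter leaves its slot at the old value, which is the "thr:?" case in the last diagram; the second setter leaves no slot behind.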

Fixes: 2677010e77 ("Add support to set NAPI threaded for individual NAPI")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250809001205.1147153-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 14:43:05 +02:00
Jakub Kicinski
bda053d644 selftests: drv-net: don't assume device has only 2 queues
The test is implicitly assuming the device only has 2 queues.
A real device will likely have more. The exact problem is that
because NAPIs get added to the list from the head, the netlink
dump reports them in reverse order. So the naive napis[0] will
actually likely give us the _last_ NAPI, not the first one.
Re-enable all the NAPIs instead of hard-coding 2 in the test.
This way the NAPIs we operated on will always reappear,
no matter where they were in the registration order.
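
The ordering effect described above can be reproduced with a plain singly linked list. This is a minimal sketch with hypothetical names (struct toy_napi, toy_napi_add and toy_napi_dump are not kernel or netlink APIs); it only shows that head insertion makes a front-to-back walk return entries in reverse registration order.

```c
#include <stddef.h>

/* Hypothetical stand-in for a NAPI instance; not struct napi_struct. */
struct toy_napi {
    int id;
    struct toy_napi *next;
};

/* Add at the head, the way the commit message describes NAPI registration. */
static void toy_napi_add(struct toy_napi **head, struct toy_napi *n)
{
    n->next = *head;
    *head = n;
}

/* Walk from the head and record IDs in dump order; returns the count. */
static size_t toy_napi_dump(const struct toy_napi *head, int *ids, size_t max)
{
    size_t count = 0;

    for (; head && count < max; head = head->next)
        ids[count++] = head->id;
    return count;
}
```

After registering IDs 0, 1 and 2 in that order, the first dumped entry is ID 2, so indexing the dump at position 0 picks the last-registered instance.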

Fixes: e6d7626881 ("net: Update threaded state in napi config in netif_set_threaded")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250809001205.1147153-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 14:43:05 +02:00
Thorsten Blum
dcb82900b1 ASoC: codecs: Call strscpy() with correct size argument
In aw8xxxx_profile_info(), strscpy() is called with the length of the
source string "null" rather than the size of the destination buffer.

This is fine as long as the destination buffer is larger than the source
string, but we should still use the destination buffer size instead to
call strscpy() as intended. And since 'name' points to the fixed-size
buffer 'uinfo->value.enumerated.name', we can safely omit the size
argument and let strscpy() infer it using sizeof() and remove 'name'.
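
To illustrate why the size argument should describe the destination, here is a userspace sketch. toy_strscpy() and TOY_STRSCPY_ARRAY() are hypothetical approximations written for this example (the kernel's strscpy() is not available in userspace); the assumed semantics are: always NUL-terminate, return the copied length, or -E2BIG on truncation.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace approximation of strscpy(): copy at most
 * size - 1 bytes, always NUL-terminate when size > 0, and return the
 * copied length or -E2BIG on truncation.
 */
static long toy_strscpy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -E2BIG;

    len = strlen(src);
    if (len >= size) {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -E2BIG;
    }
    memcpy(dst, src, len + 1);
    return (long)len;
}

/* Only valid when dst is a real array, so sizeof() sees its full size. */
#define TOY_STRSCPY_ARRAY(dst, src) toy_strscpy((dst), (src), sizeof(dst))
```

Because the bound comes from sizeof(dst), a destination that is too small yields a truncated but terminated string and an error return, regardless of how long the source is.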

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20250810214144.1985-2-thorsten.blum@linux.dev
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-08-12 12:05:56 +01:00
Jordan Rife
e93f7af148 docs: Fix name for net.ipv4.udp_child_hash_entries
udp_child_ehash_entries -> udp_child_hash_entries

Fixes: 9804985bf2 ("udp: Introduce optional per-netns hash table.")
Signed-off-by: Jordan Rife <jordan@jrife.io>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250808185800.1189042-1-jordan@jrife.io
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 13:03:13 +02:00
Paolo Abeni
b3e8c3dfce Merge branch 'fix-broken-link-with-th1520-gmac-when-linkspeed-changes'
Yao Zi says:

====================
Fix broken link with TH1520 GMAC when linkspeed changes

It's noted that on TH1520 SoC, the GMAC's link becomes broken after
the link speed is changed (for example, running ethtool -s eth0 speed
100 on the peer when negotiated to 1Gbps), but the GMAC could function
normally if the speed is brought back to its initial value.

Just like many other SoCs utilizing STMMAC IP, we need to adjust the TX
clock supplying TH1520's GMAC through some SoC-specific glue registers
when linkspeed changes. But it's found that after the full kernel
startup, reading from them returns garbage and writing to them has
no effect, which is the cause of the broken link.

Further testing shows perisys-apb4-hclk must be ungated for normal
access to TH1520 GMAC APB glue registers, which is neither described in
dt-binding nor acquired by the driver.

This series expands the dt-binding of TH1520's GMAC to allow an extra
"APB glue registers interface clock", instructs the driver to acquire
and enable the clock, and finally supplies CLK_PERISYS_APB4_HCLK for
TH1520's GMACs in SoC devicetree.

v2: https://lore.kernel.org/netdev/20250801091240.46114-1-ziyao@disroot.org/
v1: https://lore.kernel.org/all/20250729093734.40132-1-ziyao@disroot.org/
====================

Link: https://patch.msgid.link/20250808093655.48074-2-ziyao@disroot.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:52:25 +02:00
Yao Zi
a7f75e2883 riscv: dts: thead: Add APB clocks for TH1520 GMACs
Describe perisys-apb4-hclk as the APB clock for TH1520 SoC, which is
essential for accessing GMAC glue registers.

Fixes: 7e756671a6 ("riscv: dts: thead: Add TH1520 ethernet nodes")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Tested-by: Drew Fustini <fustini@kernel.org>
Link: https://patch.msgid.link/20250808093655.48074-5-ziyao@disroot.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:52:21 +02:00
Yao Zi
4cc339ce48 net: stmmac: thead: Get and enable APB clock on initialization
It's necessary to adjust the MAC TX clock when the linkspeed changes,
but it's noted such adjustment always fails on TH1520 SoC, and reading
back from APB glue registers that control clock generation results in
garbage, causing broken link.

With some testing, it's found a clock must be ungated for access to APB
glue registers. Without any consumer, the clock is automatically
disabled during late kernel startup. Let's get and enable it if it's
described in devicetree.

For backward compatibility with older devicetrees, probing won't fail if
the APB clock isn't found. In this case, we emit a warning since the
link will break if the speed changes.

Fixes: 33a1a01e3a ("net: stmmac: Add glue layer for T-HEAD TH1520 SoC")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Tested-by: Drew Fustini <fustini@kernel.org>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Link: https://patch.msgid.link/20250808093655.48074-4-ziyao@disroot.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:52:21 +02:00
Yao Zi
c8a9a619c0 dt-bindings: net: thead,th1520-gmac: Describe APB interface clock
Besides ones for GMAC core and peripheral registers, the TH1520 GMAC
requires one more clock for configuring APB glue registers. Describe
it in the binding.

Fixes: f920ce04c3 ("dt-bindings: net: Add T-HEAD dwmac support")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Link: https://patch.msgid.link/20250808093655.48074-3-ziyao@disroot.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:52:21 +02:00
Nikunj A Dadhania
5eb1bcdb6a x86/sev: Improve handling of writes to intercepted TSC MSRs
Currently, when a Secure TSC enabled SNP guest attempts to write to the
intercepted GUEST_TSC_FREQ MSR (a read-only MSR), the guest kernel response
incorrectly implies a VMM configuration error, when in fact it is the usual
VMM configuration to intercept writes to read-only MSRs, unless explicitly
documented.

Modify the intercepted TSC MSR #VC handling:
* Write to GUEST_TSC_FREQ will generate a #GP instead of terminating the
  guest
* Write to MSR_IA32_TSC will generate a #GP instead of silently ignoring it

However, continue to terminate the guest when reading from intercepted
GUEST_TSC_FREQ MSR with Secure TSC enabled, as intercepted reads indicate an
improper VMM configuration for Secure TSC enabled SNP guests.

  [ bp: simplify comment. ]

Fixes: 38cc6495cd ("x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250722074853.22253-1-nikunj@amd.com
2025-08-12 12:33:58 +02:00
Buday Csaba
8ea25274eb net: mdiobus: release reset_gpio in mdiobus_unregister_device()
reset_gpio is claimed in mdiobus_register_device(), but it is not
released in mdiobus_unregister_device(). It is instead only
released when the whole MDIO bus is unregistered.
When a device uses the reset_gpio property, it becomes impossible
to unregister it and register it again, because the GPIO remains
claimed.
This patch resolves that issue.

Fixes: bafbdd527d ("phylib: Add device reset GPIO support") # see notes
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Cc: Csókás Bence <csokas.bence@prolan.hu>
[ csokas.bence: Resolve rebase conflict and clarify msg ]
Signed-off-by: Buday Csaba <buday.csaba@prolan.hu>
Link: https://patch.msgid.link/20250807135449.254254-2-csokas.bence@prolan.hu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:32:58 +02:00
Clark Wang
8ee90742cf net: phy: nxp-c45-tja11xx: fix the PHY ID mismatch issue when using C45
TJA1103/04/20/21 support both C22 and C45 accessing methods.

The TJA11xx driver has implemented the match_phy_device() API.
However, it does not handle the C45 ID. If C45 was used to access
TJA11xx, match_phy_device() would always return false due to
phydev->phy_id only used by C22 being empty, resulting in the
generic phy driver being used for TJA11xx PHYs.

Therefore, check phydev->c45_ids.device_ids[MDIO_MMD_PMAPMD] when
using C45.

Fixes: 1b76b2497a ("net: phy: nxp-c45-tja11xx: simplify .match_phy_device OP")
Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Link: https://patch.msgid.link/20250807040832.2455306-1-xiaoning.wang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:01:07 +02:00
Jouni Högander
184889dfe0 drm/i915/psr: Do not trigger Frame Change events from frontbuffer flush
We want to get rid of triggering "Frame Change" events from
frontbuffer flush calls. We are about to move to using the TRANS_PUSH
register for this on LunarLake and onwards. Touching the TRANS_PUSH
register from frontbuffer flush would be problematic as it's written by
DSB as well.

Fix this by using intel_psr_exit when flush or invalidate is done on
LunarLake and onwards. This is not possible on AlderLake and
MeteorLake due to HW bug in PSR2 disable.

This patch is also fixing problems with cursor plane where cursor is
disappearing or duplicate cursor is seen on the screen.

v2: Commit message updated

Bspec: 68927, 68934, 66624
Reported-by: Janna Martl <janna.martl109@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5522
Fixes: 411ad63877 ("drm/i915/psr: Use SFF_CTL on invalidate/flush for LunarLake onwards")
Tested-by: Janna Martl <janna.martl109@gmail.com>
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250801062905.564453-1-jouni.hogander@intel.com
(cherry picked from commit 46fb38cb20c0d185a6391ab524b23e0e0219c41f)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2025-08-12 09:05:11 +01:00
Vinod Govindapillai
fd56b9c950 drm/i915/fbc: fix the implementation of wa_18038517565
As per the wa_18038517565, we need to disable FBC compressor
clock gating before enabling FBC and enable after disabling
FBC. Placing the enabling of clock gating in the fbc deactivate
function can make the above wa logic go wrong in case of
frontbuffer rendering FBC mechanism. FBC deactivate can get
called during fb invalidate and then the corresponding FBC
activate can get called without properly disabling the clock
gating and can result in compression stalled. So move the
enable clock gating at the end of one FBC session after FBC
is completely disabled for a pipe.

Bspec: 74212, 72197, 69741, 65555
Fixes: 010363c461 ("drm/i915/display: implement wa_18038517565")
Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://lore.kernel.org/r/20250729124648.288497-1-vinod.govindapillai@intel.com
(cherry picked from commit 82dde0407ab126f8413fd6c51429e5057ced5ba2)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2025-08-12 09:05:07 +01:00
Adrián Larumbe
54d4f44551 drm/panfrost: Print RSS for tiler heap BO's in debugfs GEMS file
Otherwise it would display the virtual allocation size, which is often
much bigger than the RSS.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: e48ade5e23 ("drm/panfrost: show device-wide list of DRM GEM objects over DebugFS")
Tested-by: Christopher Healy <healych@amazon.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250808010235.2831853-1-adrian.larumbe@collabora.com
2025-08-12 08:31:47 +02:00
Jialin Wang
c0e1b774f6 proc: proc_maps_open allow proc_mem_open to return NULL
The commit 65c6604725 ("proc: fix the issue of proc_mem_open returning
NULL") caused proc_maps_open() to return -ESRCH when proc_mem_open()
returns NULL.  This breaks legitimate /proc/<pid>/maps access for kernel
threads since kernel threads have NULL mm_struct.

The regression causes perf to fail and exit when profiling a kernel
thread:

  # perf record -v -g -p $(pgrep kswapd0)
  ...
  couldn't open /proc/65/task/65/maps

This patch partially reverts the commit to fix it.

Link: https://lkml.kernel.org/r/20250807165455.73656-1-wjl.linux@gmail.com
Fixes: 65c6604725 ("proc: fix the issue of proc_mem_open returning NULL")
Signed-off-by: Jialin Wang <wjl.linux@gmail.com>
Cc: Penglei Jiang <superman.xpt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:01:00 -07:00
Lorenzo Stoakes
0b5be138ce mm/mremap: avoid expensive folio lookup on mremap folio pte batch
It was discovered in the attached report that commit f822a9a81a ("mm:
optimize mremap() by PTE batching") introduced a significant performance
regression on a number of metrics on x86-64, most notably
stress-ng.bigheap.realloc_calls_per_sec - indicating a 37.3% regression in
number of mremap() calls per second.

I was able to reproduce this locally on an intel x86-64 raptor lake
system, noting an average of 143,857 realloc calls/sec (with a stddev of
4,531 or 3.1%) prior to this patch being applied, and 81,503 afterwards
(stddev of 2,131 or 2.6%) - a 43.3% regression.

During testing I was able to determine that there was no meaningful
difference in efforts to optimise the folio_pte_batch() operation, nor
checking folio_test_large().

This is within expectation, as a regression this large is likely to
indicate we are accessing memory that is not yet in a cache line (and
perhaps may even cause a main memory fetch).

The expectation by those discussing this from the start was that
vm_normal_folio() (invoked by mremap_folio_pte_batch()) would likely be
the culprit due to having to retrieve memory from the vmemmap (which
mremap() page table moves does not otherwise do, meaning this is
inevitably cold memory).

I was able to definitively determine that this theory is indeed correct
and the cause of the issue.

The solution is to restore part of an approach previously discarded on
review, that is to invoke pte_batch_hint() which explicitly determines,
through reference to the PTE alone (thus no vmemmap lookup), what the PTE
batch size may be.

On platforms other than arm64 this is currently hardcoded to return 1, so
this naturally resolves the issue for x86-64, and for arm64 introduces
little to no overhead as the pte cache line will be hot.

With this patch applied, we move from 81,503 realloc calls/sec to 138,701
(stddev of 496.1 or 0.4%), which is a -3.6% regression, however accounting
for the variance in the original result, this is broadly restoring
performance to its prior state.
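
The percentages quoted above can be checked against the reported averages, taking 143,857 calls/sec as the baseline (rounding to one decimal place is an assumption about how the figures were derived):

```latex
\[
\frac{81{,}503 - 143{,}857}{143{,}857} \approx -43.3\%,
\qquad
\frac{138{,}701 - 143{,}857}{143{,}857} \approx -3.6\%
\]
```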

Link: https://lkml.kernel.org/r/20250807185819.199865-1-lorenzo.stoakes@oracle.com
Fixes: f822a9a81a ("mm: optimize mremap() by PTE batching")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202508071609.4e743d7c-lkp@intel.com
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:00:59 -07:00
Suren Baghdasaryan
aba6faec01 userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
When UFFDIO_MOVE encounters a migration PMD entry, it proceeds with
obtaining a folio and accessing it even though the entry is swp_entry_t. 
Add the missing check and let split_huge_pmd() handle migration entries. 
While at it also remove unnecessary folio check.

[surenb@google.com: remove extra folio check, per David]
  Link: https://lkml.kernel.org/r/20250807200418.1963585-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250806220022.926763-1-surenb@google.com
Fixes: adef440691 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: syzbot+b446dbe27035ef6bd6c2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68794b5c.a70a0220.693ce.0050.GAE@google.com/
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:00:59 -07:00
Dev Jain
cf1b80dc31 mm: pass page directly instead of using folio_page
In commit_anon_folio_batch(), we iterate over all pages pointed to by the
PTE batch.  Therefore we need to know the first page of the batch;
currently we derive that via folio_page(folio, 0), but, that takes us to
the first (head) page of the folio instead - our PTE batch may lie in the
middle of the folio, leading to incorrectness.

Bite the bullet and throw away the micro-optimization of reusing the folio
in favour of code simplicity.  Derive the page and the folio in
change_pte_range, and pass the page too to commit_anon_folio_batch to fix
the aforementioned issue.

Link: https://lkml.kernel.org/r/20250806145611.3962-1-dev.jain@arm.com
Fixes: cac1db8c3a ("mm: optimize mprotect() by PTE batching")
Reported-by: syzbot+57bcc752f0df8bb1365c@syzkaller.appspotmail.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Debugged-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:00:59 -07:00
Sukrut Heroorkar
ab5ac789ef selftests/proc: fix string literal warning in proc-maps-race.c
This change resolves the non-literal format string warnings emitted
while compiling proc-maps-race.c.

proc-maps-race.c:205:17: warning: format not a string literal and no format arguments [-Wformat-security]
 205 |                 printf(text);
     |                 ^~~~~~
proc-maps-race.c:209:17: warning: format not a string literal and no format arguments [-Wformat-security]
 209 |                 printf(text);
     |                 ^~~~~~
proc-maps-race.c: In function `print_last_lines':
proc-maps-race.c:224:9: warning: format not a string literal and no format arguments [-Wformat-security]
 224 |         printf(start);
     |         ^~~~~~

Add string format specifier %s for the printf calls in both
print_first_lines() and print_last_lines() thus resolving the warnings.

The test still runs fine after this change, so its functional behavior
is unaffected.

Link: https://lkml.kernel.org/r/20250804225633.841777-1-hsukrut3@gmail.com
Fixes: aadc099c48 ("selftests/proc: add verbose mode for /proc/pid/maps tearing tests")
Signed-off-by: Sukrut Heroorkar <hsukrut3@gmail.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-11 23:00:59 -07:00
Dmitry Antipov
d7f1affc55 cifs: avoid extra calls to strlen() in cifs_get_spnego_key()
Since 'snprintf()' returns the number of characters emitted, an
output position may be advanced with this return value rather
than using explicit calls to 'strlen()'. Compile tested only.
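
The technique can be sketched in userspace as follows. build_desc() and its "ver=...;host=..." layout are hypothetical and only illustrate the pattern; the truncation checks are an addition for safety in this sketch, since standard snprintf() returns the length it would have written even when the output was cut short.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "ver=0x<v>;host=<h>" by advancing an offset
 * with snprintf()'s return value instead of calling strlen() after each
 * step. Returns the total length, or -1 if the buffer is too small.
 */
static int build_desc(char *buf, size_t size, unsigned int version,
                      const char *host)
{
    size_t off = 0;
    int n;

    n = snprintf(buf + off, size - off, "ver=0x%x", version);
    if (n < 0 || (size_t)n >= size - off)
        return -1;
    off += (size_t)n;

    n = snprintf(buf + off, size - off, ";host=%s", host);
    if (n < 0 || (size_t)n >= size - off)
        return -1;
    off += (size_t)n;

    return (int)off;
}
```

Each step writes at buf + off and then moves off forward by the count just returned, so the string is never rescanned to find its end.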

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-11 23:20:26 -05:00
David Howells
b63335fb3d cifs: Fix collect_sample() to handle any iterator type
collect_sample() is used to gather samples of the data in a Write op for
analysis to try and determine if the compression algorithm is likely to
achieve anything more quickly than actually running the compression
algorithm.

However, collect_sample() assumes that the data it is going to be sampling
is stored in an ITER_XARRAY-type iterator (which it now should never be)
and doesn't actually check that it is before accessing the underlying
xarray directly.

Fix this by replacing the code with a loop that just uses the standard
iterator functions to sample every other 2KiB block, skipping the
intervening ones.  It's not quite the same as the previous algorithm as it
doesn't necessarily align to the pages within an ordinary write from the
pagecache.
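
The sampling pattern described above can be sketched over a flat buffer. This is an illustration under stated assumptions: sample_alternate_blocks() and SAMPLE_CHUNK are hypothetical names, and a contiguous byte array stands in for the iterator, so none of the kernel's iterator functions appear here.

```c
#include <stddef.h>
#include <string.h>

#define SAMPLE_CHUNK 2048u  /* 2 KiB, as in the commit message */

/*
 * Copy every other 2 KiB block of src into dst, skipping the blocks in
 * between. Returns the number of bytes sampled. A flat buffer stands in
 * for the iterator here.
 */
static size_t sample_alternate_blocks(const unsigned char *src, size_t len,
                                      unsigned char *dst, size_t dst_size)
{
    size_t off = 0, out = 0;

    while (off < len && out < dst_size) {
        size_t n = len - off;

        if (n > SAMPLE_CHUNK)
            n = SAMPLE_CHUNK;
        if (n > dst_size - out)
            n = dst_size - out;
        memcpy(dst + out, src + off, n);
        out += n;
        off += 2 * SAMPLE_CHUNK;  /* take one block, skip the next */
    }
    return out;
}
```

For a 10 KiB input this collects blocks 0, 2 and 4, which is 6 KiB of sample data.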

Note that the btrfs code from which this was derived samples the inode's
pagecache directly rather than the iterator - but that doesn't necessarily
work for network filesystems if O_DIRECT is in operation.

Fixes: 94ae8c3fee ("smb: client: compress: LZ77 code improvements cleanup")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: Enzo Matsumiya <ematsumiya@suse.de>
cc: Shyam Prasad N <sprasad@microsoft.com>
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-11 23:20:07 -05:00
Russell King (Oracle)
89886abd07 net: stmmac: dwc-qos: fix clk prepare/enable leak on probe failure
dwc_eth_dwmac_probe() gets bulk clocks, and then prepares and enables
them. Unfortunately, if dwc_eth_dwmac_config_dt() or stmmac_dvr_probe()
fail, we leave the clocks prepared and enabled. Fix this by using
devm_clk_bulk_get_all_enabled() to combine the steps and provide devm
based release of the prepare and enable state.

This also fixes a similar leak in dwc_eth_dwmac_remove() which wasn't
correctly retrieving the struct plat_stmmacenet_data. This becomes
unnecessary.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: a045e40645 ("net: stmmac: refactor clock management in EQoS driver")
Link: https://patch.msgid.link/E1ukM1X-0086qu-Td@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:11:48 -07:00
Russell King (Oracle)
de1e963ad0 net: stmmac: rk: put the PHY clock on remove
The PHY clock (bsp_priv->clk_phy) is obtained using of_clk_get(), which
doesn't take part in the devm release. Therefore, when a device is
unbound, this clock needs to be explicitly put. Fix this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: fecd4d7eef ("net: stmmac: dwmac-rk: Add integrated PHY support")
Link: https://patch.msgid.link/E1ukM1S-0086qo-PC@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:11:23 -07:00
Jeff Layton
52966bf71d ref_tracker: use %p instead of %px in debugfs dentry name
As Kees points out, this is a kernel address leak, and debugging is
not a sufficiently good reason to expose the real kernel address.

Fixes: 65b584f536 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/netdev/202507301603.62E553F93@keescook/
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:07:44 -07:00
Fabio Porcedda
61aaca8b89 net: usb: qmi_wwan: add Telit Cinterion FN990A w/audio composition
Add the following Telit Cinterion FN990A w/audio composition:

0x1077: tty (diag) + adb + rmnet + audio + tty (AT/NMEA) + tty (AT) +
tty (AT) + tty (AT)
T:  Bus=01 Lev=01 Prnt=01 Port=09 Cnt=01 Dev#=  8 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=1077 Rev=05.04
S:  Manufacturer=Telit Wireless Solutions
S:  Product=FN990
S:  SerialNumber=67e04c35
C:  #Ifs=10 Cfg#= 1 Atr=e0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
E:  Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 3 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=20 Driver=snd-usb-audio
I:  If#= 4 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio
E:  Ad=03(O) Atr=0d(Isoc) MxPS=  68 Ivl=1ms
I:  If#= 5 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio
E:  Ad=84(I) Atr=0d(Isoc) MxPS=  68 Ivl=1ms
I:  If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 7 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 9 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8c(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
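
As an illustration only, the dump above amounts to binding qmi_wwan to interface 2 of device 1bc7:1077. The following userspace sketch models that kind of ID match; the struct and function names are hypothetical and are not the driver's real table or macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a USB ID table entry (not struct usb_device_id). */
struct wwan_match {
	uint16_t vendor;
	uint16_t product;
	uint8_t  ifnum;		/* interface number carrying rmnet */
};

/* Values from the descriptor dump: If#= 2 is the one bound to qmi_wwan. */
static const struct wwan_match wwan_table[] = {
	{ 0x1bc7, 0x1077, 2 },
};

/* Return 1 if the (vendor, product, interface) triple is in the table. */
static int wwan_matches(uint16_t vendor, uint16_t product, uint8_t ifnum)
{
	for (size_t i = 0; i < sizeof(wwan_table) / sizeof(wwan_table[0]); i++) {
		const struct wwan_match *m = &wwan_table[i];

		if (m->vendor == vendor && m->product == product &&
		    m->ifnum == ifnum)
			return 1;
	}
	return 0;
}
```

The other interfaces of the same device (diag, adb, audio, AT ports) do not match and stay with their own drivers.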

Cc: stable@vger.kernel.org
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:07:16 -07:00
Dave Hansen
b132a3b0c2 MAINTAINERS: Remove bouncing T7XX reviewer
This reviewer's email no longer works. Remove it from MAINTAINERS.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Cc: Liu Haijun <haijun.liu@mediatek.com>
Cc: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Link: https://patch.msgid.link/20250808173925.FECE3782@davehans-spike.ostc.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:04:36 -07:00
Dave Hansen
b56e9fb1c9 MAINTAINERS: Mark Intel PTP DFL ToD as orphaned
This maintainer's email no longer works. Remove it from MAINTAINERS.
Also mark the code as an Orphan.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tianfei Zhang <tianfei.zhang@intel.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Link: https://patch.msgid.link/20250808175324.8C4B7354@davehans-spike.ostc.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:02:28 -07:00
Dave Hansen
7573980c70 MAINTAINERS: Mark Intel WWAN IOSM driver as orphaned
This maintainer's email no longer works. Remove it from MAINTAINERS.

I've been unable to locate a new maintainer for this at Intel. Mark
the driver as Orphaned.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Loic Poulain <loic.poulain@oss.qualcomm.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://patch.msgid.link/20250808174505.C9FF434F@davehans-spike.ostc.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11 20:02:17 -07:00
Fushuai Wang
31cd31c9e1 x86/fpu: Fix NULL dereference in avx512_status()
Problem
-------
With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status
causes a warning and a NULL pointer dereference.

This is because the AVX-512 timestamp code uses x86_task_fpu() but
doesn't check it for NULL. CONFIG_X86_DEBUG_FPU addles that function
for kernel threads (PF_KTHREAD specifically), making it return NULL.

The point of the warning was to ensure that kernel threads only access
task->fpu after going through kernel_fpu_begin()/_end(). Note: all
kernel tasks exposed in /proc have a valid task->fpu.

Solution
--------
One option is to silence the warning and check for NULL from
x86_task_fpu(). However, that warning is fairly fresh and seems like a
defense against misuse of the FPU state in kernel threads.

Instead, stop outputting AVX-512_elapsed_ms for kernel threads
altogether. The data was garbage anyway because avx512_timestamp is
only updated for user threads, not kernel threads.

If anyone ever wants to track kernel thread AVX-512 use, they can come
back later and do it properly, separate from this bug fix.
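
A minimal userspace sketch of the chosen approach, assuming a simplified task model: the status line is produced for user tasks only, and kernel threads return before any FPU state is looked at. The struct, the function name and the exact field string are assumptions for illustration, not the kernel code.

```c
#include <stddef.h>
#include <stdio.h>

#define PF_KTHREAD 0x00200000u	/* flag bit marking a kernel thread */

/* Hypothetical, simplified task model (not the kernel's task_struct). */
struct task {
	unsigned int flags;
	unsigned long avx512_timestamp;	/* 0 means "never used AVX-512" */
};

/*
 * Write the elapsed-time line for user tasks only.
 * Returns the number of characters written, or 0 for kernel threads.
 */
static int avx512_status_sketch(char *buf, size_t len, const struct task *t,
				unsigned long now)
{
	long delta;

	if (t->flags & PF_KTHREAD)
		return 0;	/* skip entirely: nothing is dereferenced */

	if (!t->avx512_timestamp)
		delta = -1;	/* never used */
	else
		delta = (long)(now - t->avx512_timestamp);

	/* Field string assumed here for illustration. */
	return snprintf(buf, len, "AVX512_elapsed_ms:\t%ld\n", delta);
}
```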

[ dhansen: mostly rewrite changelog ]

Fixes: 22aafe3bcb ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks")
Co-developed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250811185044.2227268-1-sohil.mehta%40intel.com
2025-08-11 13:28:07 -07:00
Srinivas Pandruvada
3ead77989c cpufreq: intel_pstate: Support Clearwater Forest OOB mode
Prevent intel_pstate from loading when OOB (Out Of Band) P-states mode is
enabled.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20250808145122.4057208-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11 21:49:47 +02:00
Rafael J. Wysocki
fa3fa55de0 cpuidle: governors: menu: Avoid using invalid recent intervals data
Marc has reported that commit 85975daeaa ("cpuidle: menu: Avoid
discarding useful information") caused the number of wakeup interrupts
to increase on an idle system [1], which was not expected to happen
after merely allowing shallower idle states to be selected by the
governor in some cases.

However, on the system in question, all of the idle states deeper than
WFI are rejected by the driver due to a firmware issue [2].  This causes
the governor to only consider the recent interval duration data
corresponding to attempts to enter WFI that are successful and the
recent intervals table is filled with values lower than the scheduler
tick period.  Consequently, the governor predicts an idle duration
below the scheduler tick period length and avoids stopping the tick
more often which leads to the observed symptom.

Address it by modifying the governor to update the recent intervals
table also when entering the previously selected idle state fails, so
it knows that the short idle intervals might have been the minority
had the selected idle states been actually entered every time.
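
The effect can be sketched in userspace as follows. The table size, the names and the use of UINT_MAX as the sample recorded for a failed entry are assumptions of this sketch, not a description of the governor's actual code.

```c
#include <limits.h>

#define INTERVALS 8	/* hypothetical table size for this sketch */

struct menu_sketch {
	unsigned int intervals[INTERVALS];
	int ptr;
};

/* Store one sample in the circular table. */
static void update_intervals(struct menu_sketch *d, unsigned int interval_us)
{
	d->intervals[d->ptr++] = interval_us;
	if (d->ptr >= INTERVALS)
		d->ptr = 0;
}

/*
 * Record the outcome of one idle attempt. A failed entry still adds a
 * sample (assumed here to be "very long"), so short intervals from the
 * successful shallow-state entries no longer fill the whole table.
 */
static void reflect(struct menu_sketch *d, int entered, unsigned int measured_us)
{
	update_intervals(d, entered ? measured_us : UINT_MAX);
}

/* Count the samples shorter than the tick period. */
static int count_below(const struct menu_sketch *d, unsigned int tick_us)
{
	int n = 0;

	for (int i = 0; i < INTERVALS; i++)
		if (d->intervals[i] < tick_us)
			n++;
	return n;
}
```

With alternating successful and failed attempts, only half of the table holds sub-tick values, so a predictor looking at it has less reason to assume a short idle duration.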

Fixes: 85975daeaa ("cpuidle: menu: Avoid discarding useful information")
Link: https://lore.kernel.org/linux-pm/86o6sv6n94.wl-maz@kernel.org/ [1]
Link: https://lore.kernel.org/linux-pm/7ffcb716-9a1b-48c2-aaa4-469d0df7c792@arm.com/ [2]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/2793874.mvXUDI8C0e@rafael.j.wysocki
2025-08-11 21:46:14 +02:00
Len Brown
e91a158b69 intel_idle: Allow loading ACPI tables for any family
There is no reason to limit intel_idle's loading of ACPI tables to
family 6.  Upcoming Intel processors are not in family 6.

The "Fixes" tag below really means "applies cleanly until".
That syntax-only commit didn't change the previous logic,
but it shows that this patch applies back 5 years.

Fixes: 4a9f45a053 ("intel_idle: Convert to new X86 CPU match macros")
Signed-off-by: Len Brown <len.brown@intel.com>
Link: https://patch.msgid.link/06101aa4fe784e5b0be1cb2c0bdd9afcf16bd9d4.1754681697.git.len.brown@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11 21:43:26 +02:00
Tang Yizhou
0452f08395 blk-wbt: doc: Update the doc of the wbt_lat_usec interface
The symbol wb_window_usec cannot be found. Update the doc to reflect the
latest implementation, in other words, the debugfs interface
'curr_win_nsec'.

Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250727173959.160835-4-yizhou.tang@shopee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 10:21:38 -06:00
Tang Yizhou
bccdfcd56d blk-wbt: Eliminate ambiguity in the comments of struct rq_wb
In the current implementation, the last_issue and last_comp members of
struct rq_wb are used only by read requests and not by non-throttled write
requests. Therefore, eliminate the ambiguity here.

Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20250727173959.160835-3-yizhou.tang@shopee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 10:21:38 -06:00
Tang Yizhou
d8b96a7962 blk-wbt: Optimize wbt_done() for non-throttled writes
In the current implementation, the sync_cookie and last_cookie members of
struct rq_wb are used only by read requests and not by non-throttled write
requests. Based on this, we can optimize wbt_done() by removing one if
condition check for non-throttled write requests.

Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250727173959.160835-2-yizhou.tang@shopee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 10:21:38 -06:00
Waiman Long
dfb36e4a8d futex: Use user_write_access_begin/_end() in futex_put_value()
Commit cec199c5e3 ("futex: Implement FUTEX2_NUMA") introduced the
futex_put_value() helper to write a value to the given user
address.

However, it uses user_read_access_begin() before the write. For
architectures that differentiate between read and write accesses, like
PowerPC, futex_put_value() fails with -EFAULT.

Fix that by using the user_write_access_begin/user_write_access_end() pair
instead.
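
To illustrate why the wrong begin/end pair matters, here is a small userspace model of an architecture that opens separate read and write windows for user access. All names in it are hypothetical; it only mimics the failure mode described above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of separate read/write user-access windows. */
enum window { WIN_NONE, WIN_READ, WIN_WRITE };

struct uaccess {
	enum window open;
};

static void read_access_begin(struct uaccess *ua)  { ua->open = WIN_READ; }
static void write_access_begin(struct uaccess *ua) { ua->open = WIN_WRITE; }
static void access_end(struct uaccess *ua)         { ua->open = WIN_NONE; }

/* A store to "user memory" only succeeds while the write window is open. */
static int put_user_model(struct uaccess *ua, uint32_t *to, uint32_t val)
{
	if (ua->open != WIN_WRITE)
		return -EFAULT;
	*to = val;
	return 0;
}

/* Buggy shape: opens the read window, then writes. */
static int put_value_buggy(struct uaccess *ua, uint32_t *to, uint32_t val)
{
	int ret;

	read_access_begin(ua);
	ret = put_user_model(ua, to, val);
	access_end(ua);
	return ret;
}

/* Fixed shape: opens the write window before writing. */
static int put_value_fixed(struct uaccess *ua, uint32_t *to, uint32_t val)
{
	int ret;

	write_access_begin(ua);
	ret = put_user_model(ua, to, val);
	access_end(ua);
	return ret;
}
```

On architectures where both windows are the same thing, the two shapes behave identically, which is why the problem only shows up on some of them.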

Fixes: cec199c5e3 ("futex: Implement FUTEX2_NUMA")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250811141147.322261-1-longman@redhat.com
2025-08-11 17:53:21 +02:00
David Kaplan
4fa7d880ae x86/bugs: Select best SRSO mitigation
The SRSO bug can theoretically be used to conduct user->user or guest->guest
attacks and requires a mitigation (namely IBPB instead of SBPB on context
switch) for these.  So mark SRSO as being applicable to the user->user and
guest->guest attack vectors.

Additionally, SRSO supports multiple mitigations which mitigate different
potential attack vectors.  Some CPUs are also immune to SRSO from
certain attack vectors (like user->kernel).

Use the specific attack vectors requiring mitigation to select the best
SRSO mitigation to avoid unnecessary performance hits.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250721160310.1804203-1-david.kaplan@amd.com
2025-08-11 17:32:36 +02:00
Linus Torvalds
53e760d894 nfsd-6.17 fixes:
- A correctness fix for delegated timestamps
 - Address an NFSD shutdown hang when LOCALIO is in use
 - Prevent a remotely exploitable crasher when TLS is in use
 
 These arrived too late to be included in the initial nfsd-6.17
 pull request.

Merge tag 'nfsd-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - A correctness fix for delegated timestamps

 - Address an NFSD shutdown hang when LOCALIO is in use

 - Prevent a remotely exploitable crasher when TLS is in use

* tag 'nfsd-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  sunrpc: fix handling of server side tls alerts
  nfsd: avoid ref leak in nfsd_open_local_fh()
  nfsd: don't set the ctime on delegated atime updates
2025-08-11 07:38:55 -07:00
Vasiliy Kovalev
b26e2afb38 ALSA: hda/realtek: Fix headset mic on HONOR BRB-X
Add a PCI quirk to enable microphone input on the headphone jack on
the HONOR BRB-X M1010 laptop.

Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250811132716.45076-1-kovalev@altlinux.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-11 16:17:31 +02:00
Zheng Qixing
343dc5423b block: fix kobject double initialization in add_disk
Device-mapper can call add_disk() multiple times for the same gendisk
due to its two-phase creation process (dm create + dm load). This leads
to kobject double initialization errors when the underlying iSCSI devices
become temporarily unavailable and then reappear.

Specifically, if the first add_disk() call fails and is retried, the
queue_kobj gets initialized twice, causing:

kobject: kobject (ffff88810c27bb90): tried to init an initialized object,
something is seriously wrong.
 Call Trace:
  <TASK>
  dump_stack_lvl+0x5b/0x80
  kobject_init.cold+0x43/0x51
  blk_register_queue+0x46/0x280
  add_disk_fwnode+0xb5/0x280
  dm_setup_md_queue+0x194/0x1c0
  table_load+0x297/0x2d0
  ctl_ioctl+0x2a2/0x480
  dm_ctl_ioctl+0xe/0x20
  __x64_sys_ioctl+0xc7/0x110
  do_syscall_64+0x72/0x390
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fix this by separating kobject initialization from sysfs registration:
 - Initialize queue_kobj early during gendisk allocation
 - add_disk() only adds the already-initialized kobject to sysfs
 - del_gendisk() removes from sysfs but doesn't destroy the kobject
 - Final cleanup happens when the disk is released
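
The lifecycle listed above can be sketched with a small userspace model. The types and function names are hypothetical stand-ins; the point is only that initialization happens once at allocation, so a failed and retried add never re-initializes the object.

```c
/* Hypothetical stand-in for a kobject; tracks only what the sketch needs. */
struct kobj_model {
	int initialized;
	int in_sysfs;
	int init_errors;	/* counts "tried to init an initialized object" */
};

struct disk_model {
	struct kobj_model queue_kobj;
};

static void kobj_init(struct kobj_model *k)
{
	if (k->initialized)
		k->init_errors++;
	k->initialized = 1;
}

/* Allocation: the only place where the kobject is initialized. */
static void disk_alloc(struct disk_model *d)
{
	d->queue_kobj = (struct kobj_model){ 0 };
	kobj_init(&d->queue_kobj);
}

/* add: only registers the already-initialized object; may fail and be retried. */
static int disk_add(struct disk_model *d, int simulate_failure)
{
	if (simulate_failure)
		return -1;
	d->queue_kobj.in_sysfs = 1;
	return 0;
}

/* del: removes from sysfs but leaves the object initialized. */
static void disk_del(struct disk_model *d)
{
	d->queue_kobj.in_sysfs = 0;
}

/* release: final cleanup when the disk goes away. */
static void disk_release(struct disk_model *d)
{
	d->queue_kobj.initialized = 0;
}
```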

Fixes: 2bd85221a6 ("block: untangle request_queue refcounting from sysfs")
Reported-by: Li Lingfeng <lilingfeng3@huawei.com>
Closes: https://lore.kernel.org/all/83591d0b-2467-433c-bce0-5581298eb161@huawei.com/
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Link: https://lore.kernel.org/r/20250808053609.3237836-1-zhengqixing@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 08:00:49 -06:00
Qianfeng Rong
196447c712 blk-cgroup: remove redundant __GFP_NOWARN
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.

Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant.  Let's clean up these
redundant flags across subsystems.
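
The redundancy is plain bit arithmetic. The sketch below uses made-up bit values and SK_-prefixed names (the kernel's real masks differ) to show that OR-ing in a bit the mask already contains changes nothing.

```c
/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define SK_KSWAPD_RECLAIM	0x1u
#define SK_NOWARN		0x2u

/* "Old" NOWAIT without the no-warn bit, "new" NOWAIT with it included. */
#define SK_NOWAIT_OLD		(SK_KSWAPD_RECLAIM)
#define SK_NOWAIT_NEW		(SK_KSWAPD_RECLAIM | SK_NOWARN)

/* What a caller does when it writes "flags | no-warn" explicitly. */
static unsigned int with_nowarn(unsigned int gfp)
{
	return gfp | SK_NOWARN;
}
```

With the new definition, with_nowarn(SK_NOWAIT_NEW) equals SK_NOWAIT_NEW, so the explicit flag can be dropped without changing behavior.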

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20250809141358.168781-1-rongqianfeng@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 07:59:40 -06:00
Qianfeng Rong
8f3e4e87b0 block, bfq: remove redundant __GFP_NOWARN
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.

Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant.  Let's clean up these
redundant flags across subsystems.

Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Link: https://lore.kernel.org/r/20250811081135.374315-1-rongqianfeng@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 07:59:22 -06:00
Caleb Sander Mateos
5058a62875 ublk: check for unprivileged daemon on each I/O fetch
Commit ab03a61c66 ("ublk: have a per-io daemon instead of a per-queue
daemon") allowed each ublk I/O to have an independent daemon task.
However, nr_privileged_daemon is only computed based on whether the last
I/O fetched in each ublk queue has an unprivileged daemon task.
Fix this by checking whether every fetched I/O's daemon is privileged.
Change nr_privileged_daemon from a count of queues to a boolean
indicating whether any I/Os have an unprivileged daemon.
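
A small userspace sketch of the difference, with hypothetical names: a "last fetch wins" update forgets an earlier unprivileged daemon, while a sticky boolean updated on every fetch does not.

```c
#include <stdbool.h>

/* Hypothetical device model; the field name is an assumption of this sketch. */
struct dev_model {
	bool unprivileged_daemons;	/* set once, never cleared by later fetches */
};

/* New shape: checked on every fetched I/O, sticky once set. */
static void on_io_fetch(struct dev_model *dev, bool daemon_is_privileged)
{
	if (!daemon_is_privileged)
		dev->unprivileged_daemons = true;
}

/* Old shape, for contrast: the last fetch in a queue overwrites the result. */
static void on_io_fetch_last_wins(bool *queue_privileged, bool daemon_is_privileged)
{
	*queue_privileged = daemon_is_privileged;
}
```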

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: ab03a61c66 ("ublk: have a per-io daemon instead of a per-queue daemon")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250808155216.296170-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 07:57:48 -06:00
Uday Shankar
212c928d01 ublk: don't quiesce in ublk_ch_release
ublk_ch_release currently quiesces the device's request_queue while
setting force_abort/fail_io.  This avoids data races by preventing
concurrent reads from the I/O path, but is not strictly needed - at this
point, canceling is already set and guaranteed to be observed by any
concurrently executing I/Os, so they will be handled properly even if
the changes to force_abort/fail_io propagate to the I/O path later.
Remove the quiesce/unquiesce calls from ublk_ch_release. This makes the
writes to force_abort/fail_io concurrent with the reads in the I/O path,
so make the accesses atomic.
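
As a userspace illustration of the resulting access pattern, the sketch below uses C11 relaxed atomics as a stand-in for whatever primitives the driver uses; the struct and function names are hypothetical. The writer no longer stops the readers, and correctness relies on canceling already being visible.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical queue model with the three flags mentioned above. */
struct queue_model {
	atomic_bool canceling;
	atomic_bool force_abort;
	atomic_bool fail_io;
};

/* Release path: plain atomic stores, no quiesce around them. */
static void release_model(struct queue_model *q, bool fail)
{
	atomic_store_explicit(&q->force_abort, true, memory_order_relaxed);
	atomic_store_explicit(&q->fail_io, fail, memory_order_relaxed);
}

/*
 * I/O path: may run concurrently with release_model(). It stops as soon
 * as it sees canceling, even if force_abort has not propagated yet.
 */
static bool io_should_stop(struct queue_model *q)
{
	return atomic_load_explicit(&q->canceling, memory_order_relaxed) ||
	       atomic_load_explicit(&q->force_abort, memory_order_relaxed);
}
```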

Before this change, the call to blk_mq_quiesce_queue was responsible for
most (90%) of the runtime of ublk_ch_release. With that call eliminated,
ublk_ch_release runs much faster. Here is a comparison of the total time
spent in calls to ublk_ch_release when a server handling 128 devices
exits, before and after this change:

before: 1.11s
after: 0.09s

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250808-ublk_quiesce2-v1-1-f87ade33fa3d@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 07:57:37 -06:00
Philipp Reisner
d5dd409812 drbd: Remove the open-coded page pool
If the network stack keeps a reference for too long, DRBD keeps
references on a higher number of pages as a consequence.

Fix all that by no longer relying on page reference counts dropping to
an expected value. Instead, DRBD gives up its reference and lets the
system handle everything else. While at it, remove the open-coded
custom page pool mechanism and use the page_pool included in the
kernel.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Link: https://lore.kernel.org/r/20250605103852.23029-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11 07:54:27 -06:00
Christoph Hellwig
f76823e3b2 xfs: split xfs_zone_record_blocks
xfs_zone_record_blocks not only records successfully written blocks that
now back file data, but is also used for blocks speculatively written by
garbage collection that were never linked to an inode and instantly
become invalid.

Split the latter functionality out to be easier to understand.  This also
makes it clear that we don't need to attach the rmap inode to a
transaction for the skipped blocks case as we never dirty any persistent
data structure.

Also make the argument order to xfs_zone_record_blocks a bit more
natural.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11 14:04:20 +02:00