Merge tag 'amd-pstate-v6.16-2025-05-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge an amd-pstate driver fix for 6.16 (5/15/25) from Mario
Limonciello:
"Fix an error caught with -Werror in amd-pstate-ut."
* tag 'amd-pstate-v6.16-2025-05-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver()
Lockdep reports a possible circular locking dependency [1] when
cpu_hotplug_lock is acquired inside store_local_boost(), after
policy->rwsem has already been taken by store().
However, the boost update is strictly per-policy and does not
access shared state or iterate over all policies.
Since policy->rwsem is already held, this is enough to serialize
against concurrent topology changes for the current policy.
Remove the cpus_read_lock() to resolve the lockdep warning and
avoid unnecessary locking.
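A minimal sketch of the resulting code path, assuming a simplified
store_local_boost(); set_boost_for_policy() is a stand-in for the
driver ->set_boost() call and is not a real helper:

static int set_boost_for_policy(struct cpufreq_policy *policy, bool enable);	/* stand-in */

static ssize_t store_local_boost(struct cpufreq_policy *policy,
				 const char *buf, size_t count)
{
	bool enable;
	int ret;

	if (kstrtobool(buf, &enable))
		return -EINVAL;

	/*
	 * policy->rwsem is already held by store(), which is enough to
	 * serialize this per-policy update, so the cpus_read_lock() /
	 * cpus_read_unlock() pair that used to wrap the call below is
	 * simply dropped.
	 */
	ret = set_boost_for_policy(policy, enable);
	if (ret)
		return ret;

	policy->boost_enabled = enable;
	return count;
}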
[1]
======================================================
WARNING: possible circular locking dependency detected
6.15.0-rc6-debug-gb01fc4eca73c #1 Not tainted
------------------------------------------------------
power-profiles-/588 is trying to acquire lock:
ffffffffb3a7d910 (cpu_hotplug_lock){++++}-{0:0}, at: store_local_boost+0x56/0xd0
but task is already holding lock:
ffff8b6e5a12c380 (&policy->rwsem){++++}-{4:4}, at: store+0x37/0x90
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&policy->rwsem){++++}-{4:4}:
down_write+0x29/0xb0
cpufreq_online+0x7e8/0xa40
cpufreq_add_dev+0x82/0xa0
subsys_interface_register+0x148/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #1 (subsys mutex#3){+.+.}-{4:4}:
__mutex_lock+0xc2/0x930
subsys_interface_register+0x7f/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
__lock_acquire+0x10ed/0x1850
lock_acquire.part.0+0x69/0x1b0
cpus_read_lock+0x2a/0xc0
store_local_boost+0x56/0xd0
store+0x50/0x90
kernfs_fop_write_iter+0x132/0x200
vfs_write+0x2b3/0x590
ksys_write+0x74/0xf0
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x56/0x5e
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://patch.msgid.link/20250513015726.1497-1-ImanDevel@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Clang warns (or errors with CONFIG_WERROR=y):
drivers/cpufreq/amd-pstate-ut.c:262:6: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized]
262 | if (ret)
| ^~~
ret is declared both at the top of the function and in the for loop, so
the initialization of ret is local to the loop. Remove the declaration
in the for loop so that ret is always initialized when used.
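A reduced illustration of the warning and the fix (do_step() is a
placeholder, not amd-pstate-ut code):

static int do_step(int i);	/* placeholder for a single selftest step */

/*
 * Before: the inner "int ret" shadows the outer one, so the outer ret
 * is read uninitialized after the loop:
 *
 *	int ret;
 *
 *	for (int i = 0; i < n; i++) {
 *		int ret = do_step(i);
 *		if (ret)
 *			break;
 *	}
 *	if (ret)		// outer ret, never initialized
 *		return ret;
 *
 * After: declare ret once so the loop and the check use the same
 * variable.
 */
static int run_steps(int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++) {
		ret = do_step(i);
		if (ret)
			break;
	}

	return ret;
}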
Fixes: d26d16438b ("amd-pstate-ut: Reset amd-pstate driver mode after running selftests")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505072226.g2QclhaR-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250512-amd-pstate-ut-uninit-ret-v1-1-fcb4104f502e@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
On some hybrid platforms some efficient CPUs (E-cores) are not connected
to the L3 cache, but there are no other differences between them and the
other E-cores that use L3. In that case, it is generally more efficient
to run "light" workloads on the E-cores that do not use L3 and allow all
of the cores using L3, including P-cores, to go into idle states.
For this reason, slightly increase the cost for all CPUs sharing the L3
cache to make EAS prefer CPUs that do not use it to the other CPUs of
the same type (if any).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2032776.usQuhbGJ8B@rjwysocki.net
Modify intel_pstate to register EM perf domains for CPUs on hybrid
platforms without SMT which causes EAS to be enabled on them when
schedutil is used as the cpufreq governor (which requires intel_pstate
to operate in the passive mode).
This change is targeting platforms (for example, Lunar Lake) where the
"little" CPUs (E-cores) are always more energy-efficient than the "big"
or "performance" CPUs (P-cores) when run at the same HWP performance
level, so it is sufficient to tell EAS that E-cores are always preferred
(so long as there is enough spare capacity on one of them to run the
given task). However, migrating tasks between CPUs of the same type
too often is not desirable because it may hurt both performance and
energy efficiency due to leaving warm caches behind.
For this reason, register a separate perf domain for each CPU and choose
the cost values for them so that the cost mostly depends on the CPU type,
but there is also a small component of it depending on the performance
level (utilization) which helps to balance the load between CPUs of the
same type.
The cost component related to the CPU type is computed with the help of
the observation that the IPC metric value for a given CPU is inversely
proportional to its performance-to-frequency scaling factor and the cost
of running code on it can be assumed to be roughly proportional to that
IPC ratio (in principle, the higher the IPC ratio, the more resources
are utilized when running at a given frequency, so the cost should be
higher).
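A rough sketch of such a cost function, with made-up names and an
arbitrary weighting; this is only an illustration of the split between
the type-dependent and performance-level-dependent components, not the
intel_pstate implementation:

/*
 * type_cost_base: component derived from the CPU type (IPC ratio);
 * it dominates the total so CPUs of a cheaper type always win.
 * perf_level / max_perf_level: small utilization-dependent component
 * that nudges load balancing between CPUs of the same type.
 */
static unsigned long example_em_cost(unsigned long type_cost_base,
				     unsigned long perf_level,
				     unsigned long max_perf_level)
{
	unsigned long perf_component;

	/* Weighting chosen arbitrarily for illustration. */
	perf_component = type_cost_base / 10 * perf_level / max_perf_level;

	return type_cost_base + perf_component;
}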
For all CPUs that are online at the system initialization time, EM perf
domains are registered when the driver starts up, after asymmetric
capacity support has been enabled. For the CPUs that become online
later, EM perf domains are registered after setting the asymmetric
capacity for them.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://patch.msgid.link/6057101.MhkbZ0Pkbq@rjwysocki.net
Policy locking was added to cpufreq_policy_is_good_for_eas() by commit
4854649b1f ("cpufreq/sched: Move cpufreq-specific EAS checks to
cpufreq") to address a theoretical race condition, but it turned out to
introduce a circular locking dependency between the policy rwsem and
sched_domains_mutex via cpuset_mutex. This leads to a board lockup on
OdroidN2 that is based on the ARM64 Amlogic Meson SoC.
Drop the policy locking from cpufreq_policy_is_good_for_eas() to address
this issue.
Fixes: 4854649b1f ("cpufreq/sched: Move cpufreq-specific EAS checks to cpufreq")
Closes: https://lore.kernel.org/linux-pm/1bf3df62-0641-459f-99fc-fd511e564b84@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2806514.mvXUDI8C0e@rjwysocki.net
This reduces log messages during boot from many pages to roughly 10
occurrences. I didn't investigate why it wasn't just one; maybe it's a
low-level service to other modules, re-probed by each of them?
Link: https://lkml.kernel.org/r/20250325235156.663269-4-jim.cromie@gmail.com
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Louis Chauvet <louis.chauvet@bootlin.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge tag 'amd-pstate-v6.16-2025-05-08' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge amd-pstate content for 6.16 (5/8/25) from Mario Limonciello:
- Add support for a new feature on some BIOS that allows setting
"lowest CPU minimum frequency".
- Fix the amd-pstate-ut unit tests to restore system settings when
done.
* tag 'amd-pstate-v6.16-2025-05-08' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
amd-pstate-ut: Reset amd-pstate driver mode after running selftests
cpufreq/amd-pstate: Add support for the "Requested CPU Min frequency" BIOS option
cpufreq/amd-pstate: Add offline, online and suspend callbacks for amd_pstate_driver
cpufreq/amd-pstate: Move max_perf limiting in amd_pstate_update
Intel hybrid processors have CPUs of different capacity. Populate the
interface /sys/devices/system/cpu/cpuN/cpu_capacity.
This interface uses the per-CPU variable `cpu_scale`. On x86 this
variable has no other use besides feeding the sysfs entries. Initialize
it when setting CPU capacity for the scheduler and scale-invariant code.
Feed it with arch_scale_cpu_capacity() as it gives capacity normalized
to the interval [0, SCHED_CAPACITY_SCALE].
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Doing cpufreq-specific EAS checks that require accessing policy
internals directly from sched_is_eas_possible() is a bit unfortunate,
so introduce cpufreq_ready_for_eas() in cpufreq, move those checks
into that new function and make sched_is_eas_possible() call it.
While at it, address a possible race between the EAS governor check
and governor change by doing the former under the policy rwsem.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://patch.msgid.link/2317800.iZASKD2KPV@rjwysocki.net
In amd-pstate-ut, one of the basic tests is to switch between all
possible mode combinations. After running this test, the amd-pstate
driver is left in active mode. Store the original mode and restore it
after the selftests have run.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250430064206.7402-1-swapnil.sapkal@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
For historic reasons there are some TSC-related functions in the
<asm/msr.h> header, even though there's an <asm/tsc.h> header.
To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h>
to <asm/tsc.h> and to eventually eliminate the inclusion of
<asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency
to the source files that reference definitions from <asm/msr.h>.
[ mingo: Clarified the changelog. ]
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
Replace cppc_get_auto_sel_caps() with cppc_get_auto_sel(), since using
a struct cppc_perf_caps to carry the value is unnecessary.
Add a check to ensure the pointer 'enable' is not null.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-8-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When turbo mode is unavailable on a Skylake-X system, executing the
command:
# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
results in an unchecked MSR access error:
WRMSR to 0x199 (attempted to write 0x0000000100001300).
This issue was reproduced on an OEM (Original Equipment Manufacturer)
system and is not a common problem across all Skylake-X systems.
This error occurs because the MSR 0x199 Turbo Engage Bit (bit 32) is set
when turbo mode is disabled. The issue arises when intel_pstate fails to
detect that turbo mode is disabled. Here intel_pstate relies on
MSR_IA32_MISC_ENABLE bit 38 to determine the status of turbo mode.
However, on this system, bit 38 is not set even when turbo mode is
disabled.
According to the Intel Software Developer's Manual (SDM), the BIOS sets
this bit during platform initialization to enable or disable
opportunistic processor performance operations. Logically, this bit
should be set in such cases. However, the SDM also specifies that "OS
and applications must use CPUID leaf 06H to detect processors with
opportunistic processor performance operations enabled."
Therefore, in addition to checking MSR_IA32_MISC_ENABLE bit 38, verify
that CPUID.06H:EAX[1] is 0 to accurately determine if turbo mode is
disabled.
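A sketch of the resulting check, assuming the usual x86 MSR/CPUID
helpers; this is not necessarily the exact intel_pstate code:

static bool example_turbo_is_disabled(void)
{
	u64 misc_en;

	rdmsrl(MSR_IA32_MISC_ENABLE, misc_en);

	/*
	 * Treat turbo as disabled if the MISC_ENABLE "turbo disable"
	 * bit (bit 38) is set *or* CPUID.06H:EAX[1] reports that
	 * opportunistic performance operations are not enabled.
	 */
	return (misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE) ||
	       !(cpuid_eax(6) & BIT(1));
}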
Fixes: 4521e1a0ce ("cpufreq: intel_pstate: Reflect current no_turbo state correctly")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Initialize lower frequency limit to the "Requested CPU Min frequency"
BIOS option (if it is set) value as part of the driver->init()
callback. The BIOS specified value is passed by the PMFW as min_perf in
CPPC_REQ MSR. To ensure that we don't mistake a stale min_perf value in
CPPC_REQ value as the "Requested CPU Min frequency" during a kexec wakeup,
reset the CPPC_REQ.min_perf value back to the BIOS specified one in the
offline, exit and suspend callbacks.
amd_pstate_target() and amd_pstate_epp_update_limit() which are invoked
as part of the resume() and online() callbacks will take care of restoring
the CPPC_REQ back to the correct values.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250428071623.4309-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Rename and use the existing amd_pstate_epp callbacks for amd_pstate driver
as well. Remove the debug print in online callback while at it.
These callbacks will be needed to support the "Requested CPU Min Frequency"
BIOS option.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20250428062520.4997-2-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Commit 7491cdf46b ("cpufreq: Avoid using inconsistent policy->min and
policy->max") overlooked the fact that policy->min and policy->max were
accessed directly in cpufreq_frequency_table_target() and in the
functions called by it. Consequently, the changes made by that commit
led to problems with setting policy limits.
Address this by passing the target frequency limits to __resolve_freq()
and cpufreq_frequency_table_target() and propagating them to the
functions called by the latter.
Fixes: 7491cdf46b ("cpufreq: Avoid using inconsistent policy->min and policy->max")
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Closes: https://lore.kernel.org/linux-pm/aAplED3IA_J0eZN0@linaro.org/
Reported-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/5896780.DvuYhMxLoT@rjwysocki.net
If the global boost flag is enabled and policy boost flag is disabled, a
call to `cpufreq_boost_trigger_state(true)` must enable the policy's
boost state.
The current code misses that because of an optimization. Fix it.
Suggested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/852ff11c589e6300730d207baac195b2d9d8b95f.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the global boost flag was enabled and policy boost flag was disabled
before a suspend resume cycle, cpufreq_online() will enable the policy
boost flag on resume.
While it is important for the policy boost flag to mirror the global
boost flag when a policy is first created, it should be avoided when the
policy is reinitialized (for example after a suspend resume cycle).
However, if the global boost flag is disabled at this point, we want to
make sure the policy boost flag is disabled too.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/de5c72a0af101049204305c73cd30eb3a3e7b4a0.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During CPU hotunplug events (such as those occurring during
suspend/resume cycles), platform firmware may modify the CPU boost
state.
If boost was disabled prior to CPU removal, it correctly remains
disabled upon re-plug. However, if firmware re-enables boost while the
CPU is offline, the CPU may return with boost enabled—even if it was
originally disabled—once it is hotplugged back in. This leads to
inconsistent behavior and violates user or kernel policy expectations.
To maintain consistency, ensure the boost state is re-synchronized with
the kernel policy when a CPU is hotplugged back in.
Note: This re-synchronization is not necessary during the initial call
to ->init() for a CPU, as the cpufreq core handles it via
cpufreq_online(). At that point, acpi_cpufreq_driver.boost_enabled is
initialized to the value returned by boost_state(0).
Fixes: 2b16c63183 ("cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()")
Reported-by: Nicholas Chin <nic.c3.14@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220013
Tested-by: Nicholas Chin <nic.c3.14@gmail.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/9c7de55fb06015c1b77e7dafd564b659838864e0.1745511526.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge tag 'cpufreq-arm-fixes-6.15-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge ARM cpufreq fixes for 6.15-rc from Viresh Kumar:
"- Fix possible out-of-bound / null-ptr-deref in drivers (Andre Przywara
and Henry Martin).
- Fix Kconfig issues with compile-test (Johan Hovold and Krzysztof
Kozlowski).
- Fix invalid return value in .get() (Marc Zyngier).
- Add SM8650 to cpufreq-dt-platdev blocklist (Pengyu Luo)."
* tag 'cpufreq-arm-fixes-6.15-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: fix compile-test defaults
cpufreq: cppc: Fix invalid return value in .get() callback
cpufreq: scpi: Fix null-ptr-deref in scpi_cpufreq_get_rate()
cpufreq: scmi: Fix null-ptr-deref in scmi_cpufreq_get_rate()
cpufreq: apple-soc: Fix null-ptr-deref in apple_soc_cpufreq_get_rate()
cpufreq: Do not enable by default during compile testing
cpufreq: Add SM8650 to cpufreq-dt-platdev blocklist
cpufreq: sun50i: prevent out-of-bounds access
Merge tag 'amd-pstate-v6.15-2025-04-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge amd-pstate content for 6.15 (4/15/25) from Mario Limonciello:
"Add a fix for X3D processors where depending upon what BIOS was
set initially rankings might be set improperly.
Add a fix for changing min/max limits while on the performance
governor."
* tag 'amd-pstate-v6.15-2025-04-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Enable ITMT support after initializing core rankings
cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
Since cpufreq_driver_resolve_freq() can run in parallel with
cpufreq_set_policy() and there is no synchronization between them,
the former may access policy->min and policy->max while the latter
is updating them and it may see intermediate values of them due
to the way the update is carried out. Also the compiler is free
to apply any optimizations it wants both to the stores in
cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq()
which may result in additional inconsistencies.
To address this, use WRITE_ONCE() when updating policy->min and
policy->max in cpufreq_set_policy() and use READ_ONCE() for reading
them in cpufreq_driver_resolve_freq(). Moreover, rearrange the update
in cpufreq_set_policy() to avoid storing intermediate values in
policy->min and policy->max with the help of the observation that
their new values are expected to be properly ordered upfront.
Also modify cpufreq_driver_resolve_freq() to take the possible reverse
ordering of policy->min and policy->max, which may happen depending on
the ordering of operations when this function and cpufreq_set_policy()
run concurrently, into account by always honoring the max when it
turns out to be less than the min (in case it comes from thermal
throttling or similar).
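A simplified sketch of the reader side; resolve_target() is an
illustrative stand-in for cpufreq_driver_resolve_freq():

static unsigned int resolve_target(struct cpufreq_policy *policy,
				   unsigned int target_freq)
{
	/* Snapshot the limits; they may be updated concurrently. */
	unsigned int min = READ_ONCE(policy->min);
	unsigned int max = READ_ONCE(policy->max);

	/* If the limits are momentarily reversed, honor the max. */
	if (min > max)
		min = max;

	return clamp_val(target_freq, min, max);
}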
Fixes: 1517176906 ("cpufreq: Make policy min/max hard requirements")
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/5907080.DvuYhMxLoT@rjwysocki.net
Commit 3f66425a4f ("cpufreq: Enable COMPILE_TEST on Arm drivers")
enabled compile testing of most Arm CPUFreq drivers but left the
existing default values unchanged so that many drivers are enabled by
default whenever COMPILE_TEST is selected.
This specifically results in the S3C64XX CPUFreq driver being enabled
and initialised during boot of non-S3C64XX platforms with the following
error logged:
cpufreq: Unable to obtain ARMCLK: -2
Commit d4f610a9ba ("cpufreq: Do not enable by default during compile
testing") recently fixed most of the default values, but two entries
were missed and two could use a more specific default condition.
Fix the default values for drivers that can be compile tested and that
should be enabled by default when not compile testing.
Fixes: 3f66425a4f ("cpufreq: Enable COMPILE_TEST on Arm drivers")
Cc: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
A subset of AMD systems supporting Preferred Core rankings can have
their rankings changed dynamically at runtime. Update the
"sg->asym_prefer_cpu" across the local hierarchy of CPU when the
preferred core ranking changes.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250409053446.23367-4-kprateek.nayak@amd.com
Returning a negative error code in a function with an unsigned
return type is a pretty bad idea. It is probably worse when the
justification for the change is "our static analysis tool found it".
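A minimal illustration of the bug class being fixed; query_hw() is a
placeholder, and a cpufreq ->get() callback returns an unsigned
frequency in kHz with 0 conventionally meaning "unknown":

static int query_hw(unsigned int cpu);	/* placeholder */

static unsigned int example_get_rate(unsigned int cpu)
{
	int ret = query_hw(cpu);

	if (ret < 0)
		return 0;	/* not "return ret", which would wrap to a huge value */

	return ret;
}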
Fixes: cf7de25878 ("cppc_cpufreq: Fix possible null pointer dereference")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Commit b52aaeeadf ("cpufreq: intel_pstate: Avoid SMP calls to get
cpu-type") introduced two issues into hwp_get_cpu_scaling(). First,
it made that function use the CPU type of the CPU running the code
even though the target CPU is passed as the argument to it and second,
it used smp_processor_id() for that even though hwp_get_cpu_scaling()
runs in preemptible context.
Fix both of these problems by simply passing "cpu" to cpu_data().
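The essence of the fix, sketched; lookup_scaling_for() is a placeholder
for the CPU-type-based lookup:

static int lookup_scaling_for(struct cpuinfo_x86 *c);	/* placeholder */

static int hwp_get_cpu_scaling(int cpu)
{
	/*
	 * Before: &cpu_data(smp_processor_id()), i.e. the CPU running
	 * the code, read in preemptible context.
	 * After: use the target CPU passed in.
	 */
	struct cpuinfo_x86 *c = &cpu_data(cpu);

	return lookup_scaling_for(c);
}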
Fixes: b52aaeeadf ("cpufreq: intel_pstate: Avoid SMP calls to get cpu-type")
Link: https://lore.kernel.org/linux-pm/20250412103434.5321-1-xry111@xry111.site/
Reported-by: Xi Ruoyao <xry111@xry111.site>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12659608.O9o76ZdvQC@rjwysocki.net
When working on dynamic ITMT priority support, it was observed that
"asym_prefer_cpu" on AMD systems supporting Preferred Core ranking
was always set to the first CPU in the sched group when the system boots
up despite another CPU in the group having a higher ITMT ranking.
"asym_prefer_cpu" is cached when the sched domain hierarchy is
constructed. On AMD systems that support Preferred Core rankings, sched
domains are rebuilt when ITMT support is enabled for the first time from
amd_pstate*_cpu_init().
Since amd_pstate*_cpu_init() is called to initialize the cpudata for
each CPU, ITMT support is enabled after the first CPU initializes its
asym priority, but this is too early: the other CPUs have not yet
initialized their asym priorities, and the sched domain is rebuilt only
once, when the support is toggled on for the first time.
Initialize the asym priorities first in amd_pstate*_cpu_init() and then
enable ITMT support later in amd_pstate_register_driver() to ensure all
CPUs have correctly initialized their asym priorities before sched
domain hierarchy is rebuilt.
Clear the ITMT support when the amd-pstate driver unregisters since core
rankings cannot be trusted unless the update_limits() callback is
operational.
Remove the delayed work mechanism now that ITMT support is only toggled
from the driver init path which is outside the cpuhp critical section.
Fixes: f3a0523918 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250411081439.27652-1-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
A recent change has introduced a bug into cpufreq_get_policy(), but this
function is not used, so it's better to drop it altogether.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/2802770.mvXUDI8C0e@rjwysocki.net
cpufreq_cpu_get_raw() can return NULL when the target CPU is not present
in the policy->cpus mask. scpi_cpufreq_get_rate() does not check for
this case, which results in a NULL pointer dereference.
Fixes: 343a8d17fa ("cpufreq: scpi: remove arm_big_little dependency")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
cpufreq_cpu_get_raw() can return NULL when the target CPU is not present
in the policy->cpus mask. scmi_cpufreq_get_rate() does not check for
this case, which results in a NULL pointer dereference.
Add NULL check after cpufreq_cpu_get_raw() to prevent this issue.
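The pattern common to these ->get() fixes, sketched;
read_rate_from_firmware() is a placeholder for the driver-specific
rate query:

static unsigned int read_rate_from_firmware(struct cpufreq_policy *policy);	/* placeholder */

static unsigned int example_get_rate(unsigned int cpu)
{
	struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);

	/* cpufreq_cpu_get_raw() returns NULL if cpu is not in policy->cpus. */
	if (!policy)
		return 0;

	return read_rate_from_firmware(policy);
}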
Fixes: 99d6bdf338 ("cpufreq: add support for CPU DVFS based on SCMI message protocol")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
cpufreq_cpu_get_raw() can return NULL when the target CPU is not present
in the policy->cpus mask. apple_soc_cpufreq_get_rate() does not check
for this case, which results in a NULL pointer dereference.
Fixes: 6286bbb405 ("cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Since cpufreq_update_limits() obtains a cpufreq policy pointer for the
given CPU and reference counts the corresponding policy object, it may
as well pass the policy pointer to the cpufreq driver's ->update_limits()
callback which allows that callback to avoid invoking cpufreq_cpu_get()
for the same CPU.
Accordingly, redefine ->update_limits() to take a policy pointer instead
of a CPU number and update both drivers implementing it, intel_pstate
and amd-pstate, as needed.
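Sketched as a prototype change (the surrounding struct cpufreq_driver
layout is not reproduced here, and the exact member placement is an
assumption):

/*
 * Before: the callback received a CPU number and had to look the
 * policy up (and reference it) itself:
 *
 *	void (*update_limits)(unsigned int cpu);
 *
 * After: the policy already obtained and reference-counted by
 * cpufreq_update_limits() is passed straight through:
 */
void (*update_limits)(struct cpufreq_policy *policy);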
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/8560367.NyiUUSuA9g@rjwysocki.net
Since cpufreq_update_limits() obtains a cpufreq policy pointer for the
given CPU and reference counts the object pointed to by it, calling
cpufreq_update_policy() from cpufreq_update_limits() is somewhat
wasteful because that function calls cpufreq_cpu_get() on the same
CPU again.
To avoid that unnecessary overhead, move the part of the code running
under the policy rwsem from cpufreq_update_policy() to a new function
called cpufreq_policy_refresh() and invoke that new function from
both cpufreq_update_policy() and cpufreq_update_limits().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/6047110.MhkbZ0Pkbq@rjwysocki.net
Use __free() for policy reference counting cleanup where applicable in
the cpufreq core.
No intentional functional impact.
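A sketch of the pattern, assuming a DEFINE_FREE() cleanup hook named
put_cpufreq_policy; the actual name used in the cpufreq core may
differ:

#include <linux/cleanup.h>

DEFINE_FREE(put_cpufreq_policy, struct cpufreq_policy *,
	    if (_T) cpufreq_cpu_put(_T))

static void example_use(unsigned int cpu)
{
	struct cpufreq_policy *policy __free(put_cpufreq_policy) =
						cpufreq_cpu_get(cpu);

	if (!policy)
		return;

	/*
	 * Work with the policy; the reference is dropped automatically
	 * when it goes out of scope, so no explicit cpufreq_cpu_put()
	 * is needed on any exit path.
	 */
}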
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/9437968.CDJkKcVGEf@rjwysocki.net
Since cpufreq_cpu_acquire() and cpufreq_cpu_release() have no more
users in the tree, remove them.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/3880470.kQq0lBPeGt@rjwysocki.net
Instead of using cpufreq_cpu_acquire() and cpufreq_cpu_release() in
cpufreq_update_policy(), which is the last user of these functions,
make it use __free() for policy reference counting cleanup and the
"write" locking guard for policy locking.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/22654186.EfDdHjke4D@rjwysocki.net
Rename __intel_pstate_update_max_freq() to intel_pstate_update_max_freq()
and move the cpufreq policy reference counting and locking into it (and
implement the locking with the recently introduced cpufreq policy "write"
locking guard).
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2315023.iZASKD2KPV@rjwysocki.net
Introduce "read" and "write" locking guards for cpufreq policies and use
them where applicable in the cpufreq core.
No intentional functional impact.
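A sketch of how such guards are typically built on <linux/cleanup.h>;
the guard name here is illustrative, not necessarily what the cpufreq
core defines:

#include <linux/cleanup.h>
#include <linux/rwsem.h>

DEFINE_GUARD(cpufreq_policy_write, struct cpufreq_policy *,
	     down_write(&_T->rwsem), up_write(&_T->rwsem))

static void example_update(struct cpufreq_policy *policy)
{
	guard(cpufreq_policy_write)(policy);

	/*
	 * policy->rwsem is held for writing here and released
	 * automatically on every return path.
	 */
}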
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/8518682.T7Z3S40VBb@rjwysocki.net
In preparation for the introduction of cpufreq policy locking guards,
move the part of cpufreq_online() that is carried out under the policy
rwsem into a separate function called cpufreq_policy_online().
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/3354747.aeNJFYEL58@rjwysocki.net
Notice that the policy->cpu update in cpufreq_policy_alloc() can be
moved to cpufreq_online() and then it can be carried out under the
policy rwsem, along with the clearing of policy->governor (unnecessary
in the "new policy" code branch, but also not harmful). If this is
done, the bottom parts of the "if (policy)" branches become identical
and they can be collapsed and moved below the conditional.
Modify the code accordingly which makes it somewhat easier to follow.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/13741234.uLZWGnKmhe@rjwysocki.net
Enabling the compile test should not cause automatic enabling of all
drivers.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
SM8650 is already supported by the qcom-cpufreq-hw driver, but it has
never been added to cpufreq-dt-platdev. This produces the following noise:
[ 0.388525] cpufreq-dt cpufreq-dt: failed register driver: -17
[ 0.388537] cpufreq-dt cpufreq-dt: probe with driver cpufreq-dt failed with error -17
Add it to the cpufreq-dt-platdev driver's blocklist to fix this.
Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
A KASAN enabled kernel reports an out-of-bounds access when handling the
nvmem cell in the sun50i cpufreq driver:
==================================================================
BUG: KASAN: slab-out-of-bounds in sun50i_cpufreq_nvmem_probe+0x180/0x3d4
Read of size 4 at addr ffff000006bf31e0 by task kworker/u16:1/38
This is because the DT specifies the nvmem cell as covering only two
bytes, but we use a u32 pointer to read the value. DTs for other SoCs
indeed specify 4 bytes, so we cannot just shorten the variable to a u16.
Fortunately nvmem_cell_read() can return the length of the nvmem cell,
in bytes, so we can use that information to only access the valid
portion of the data.
To cover multiple cell sizes, use memcpy() to copy the information into a
zeroed u32 buffer, then also make sure we always read the data in little
endian fashion, as this is how the data is stored in the SID efuses.
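A sketch of the resulting read path (names are illustrative, not the
sun50i driver code):

static int example_read_speed_bin(struct nvmem_cell *cell, u32 *speed)
{
	__le32 raw = 0;		/* zeroed so a 2-byte cell leaves the upper bytes 0 */
	size_t len;
	u8 *data;

	data = nvmem_cell_read(cell, &len);
	if (IS_ERR(data))
		return PTR_ERR(data);

	/* Copy only as many bytes as the cell actually provides. */
	memcpy(&raw, data, min(len, sizeof(raw)));
	kfree(data);

	/* The SID efuse data is stored little-endian. */
	*speed = le32_to_cpu(raw);

	return 0;
}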
Fixes: 6cc4bcceff ("cpufreq: sun50i: Refactor speed bin decoding")
Reported-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Škrabec <jernej.skrabec@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The min_limit perf and freq values can get disconnected with the
performance governor, as we only modify the perf value in the special
case. Fix that by modifying the perf and freq values together.
Fixes: 009d1c29a4 ("cpufreq/amd-pstate: Move perf values into a union")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250407081925.850473-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Prevent cpufreq_update_limits() from crashing the kernel due to a NULL
pointer dereference when it is called before registering a cpufreq
driver, for instance as a result of a notification triggered by the
platform firmware (Rafael Wysocki)"
* tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Reference count policy in cpufreq_update_limits()
Since acpi_processor_notify() can be called before registering a cpufreq
driver or even in cases when a cpufreq driver is not registered at all,
cpufreq_update_limits() needs to check if a cpufreq driver is present
and prevent it from being unregistered.
For this purpose, make it call cpufreq_cpu_get() to obtain a cpufreq
policy pointer for the given CPU and reference count the corresponding
policy object, if present.
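The resulting pattern, sketched; example_update_limits() mirrors the
described change in cpufreq_update_limits() in simplified form:

static void example_update_limits(unsigned int cpu)
{
	struct cpufreq_policy *policy;

	/*
	 * cpufreq_cpu_get() returns NULL if no cpufreq driver/policy is
	 * registered for cpu, and otherwise takes a reference that keeps
	 * the driver from being unregistered underneath us.
	 */
	policy = cpufreq_cpu_get(cpu);
	if (!policy)
		return;

	/* ... invoke ->update_limits() or cpufreq_update_policy() ... */

	cpufreq_cpu_put(policy);
}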
Fixes: 5a25e3f7cc ("cpufreq: intel_pstate: Driver-specific handling of _PPC updates")
Closes: https://lore.kernel.org/linux-acpi/Z-ShAR59cTow0KcR@mail-itl
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/1928789.tdWV9SEqCh@rjwysocki.net
Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Madhavan Srinivasan:
- Remove support for IBM Cell Blades
- SMP support for microwatt platform
- Support for inline static calls on PPC32
- Enable pmu selftests for power11 platform
- Enable hardware trace macro (HTM) hcall support
- Support for limited address mode capability
- Changes to RMA size from 512 MB to 768 MB to handle fadump
- Misc fixes and cleanups
Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom,
Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook,
Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani
(IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav
Jain, and Venkat Rao Bagalkote.
* tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits)
powerpc/kexec: fix physical address calculation in clear_utlb_entry()
crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
powerpc: Fix 'intra_function_call not a direct call' warning
powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'
KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests
powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
powerpc/microwatt: Add SMP support
powerpc: Define config option for processors with broadcast TLBIE
powerpc/microwatt: Define an idle power-save function
powerpc/microwatt: Device-tree updates
powerpc/microwatt: Select COMMON_CLK in order to get the clock framework
net: toshiba: Remove reference to PPC_IBM_CELL_BLADE
net: spider_net: Remove powerpc Cell driver
cpufreq: ppc_cbe: Remove powerpc Cell driver
genirq: Remove IRQ_EDGE_EOI_HANDLER
docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR
powerpc: Remove UDBG_RTAS_CONSOLE
powerpc/io: Use standard barrier macros in io.c
powerpc/io: Rename _insw_ns() etc.
powerpc/io: Use generic raw accessors
...
Merge tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are dominated by cpufreq updates which in turn are dominated by
updates related to boost support in the core and drivers and
amd-pstate driver optimizations.
Apart from the above, there are some cpuidle updates including a
rework of the most recent idle intervals handling in the venerable
menu governor that leads to significant improvements in some
performance benchmarks, as the governor is now more likely to predict
a shorter idle duration in some cases, and there are updates of the
core device power management code, mostly related to system suspend
and resume, that should help to avoid potential issues arising when
the drivers of devices depending on one another want to use different
optimizations.
There is also a usual collection of assorted fixes and cleanups,
including removal of some unused code.
Specifics:
- Manage sysfs attributes and boost frequencies efficiently from
cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)
- Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
Dhananjay Ugwekar, Imran Shaik, zuoqian)
- Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
Bai)
- cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)
- Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)
- Optimize the amd-pstate driver to avoid cases where call paths end
up calling the same writes multiple times and needlessly caching
variables through code reorganization, locking overhaul and tracing
adjustments (Mario Limonciello, Dhananjay Ugwekar)
- Make it possible to avoid enabling capacity-aware scheduling (CAS)
in the intel_pstate driver and relocate a check for out-of-band
(OOB) platform handling in it to make it detect OOB before checking
HWP availability (Rafael Wysocki)
- Fix dbs_update() to avoid inadvertent conversions of negative
integer values to unsigned int which causes CPU frequency selection
to be inaccurate in some cases when the "conservative" cpufreq
governor is in use (Jie Zhan)
- Update the handling of the most recent idle intervals in the menu
cpuidle governor to prevent useful information from being discarded
by it in some cases and improve the prediction accuracy (Rafael
Wysocki)
- Make it possible to tell the intel_idle driver to ignore its
built-in table of idle states for the given processor, clean up the
handling of auto-demotion disabling on Baytrail and Cherrytrail
chips in it, and update its MAINTAINERS entry (David Arcari, Artem
Bityutskiy, Rafael Wysocki)
- Make some cpuidle drivers use for_each_present_cpu() instead of
for_each_possible_cpu() during initialization to avoid issues
occurring when nosmp or maxcpus=0 are used (Jacky Bai)
- Clean up the Energy Model handling code somewhat (Rafael Wysocki)
- Use kfree_rcu() to simplify the handling of runtime Energy Model
updates (Li RongQing)
- Add an entry for the Energy Model framework to MAINTAINERS as
properly maintained (Lukasz Luba)
- Address RCU-related sparse warnings in the Energy Model code
(Rafael Wysocki)
- Remove ENERGY_MODEL dependency on SMP and allow it to be selected
when DEVFREQ is set without CPUFREQ so it can be used on a wider
range of systems (Jeson Gao)
- Unify error handling during runtime suspend and runtime resume in
the core to help drivers to implement more consistent runtime PM
error handling (Rafael Wysocki)
- Drop a redundant check from pm_runtime_force_resume() and rearrange
documentation related to __pm_runtime_disable() (Rafael Wysocki)
- Rework the handling of the "smart suspend" driver flag in the PM
core to avoid issues that may occur when drivers using it depend on
some other drivers and clean up the related PM core code (Rafael
Wysocki, Colin Ian King)
- Fix the handling of devices with the power.direct_complete flag set
if device_suspend() returns an error for at least one device to
avoid situations in which some of them may not be resumed (Rafael
Wysocki)
- Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
possible deadlock that may occur if the "compressor" hibernation
module parameter is accessed during the registration of a new
ieee80211 device (Lizhi Xu)
- Suppress sleeping parent warning in device_pm_add() in the case
when new children are added under a device with the
power.direct_complete set after it has been processed by
device_resume() (Xu Yang)
- Remove needless return in three void functions related to system
wakeup (Zijun Hu)
- Replace deprecated kmap_atomic() with kmap_local_page() in the
hibernation core code (David Reaver)
- Remove unused helper functions related to system sleep (David Alan
Gilbert)
- Clean up s2idle_enter() so it does not lock and unlock CPU offline
in vain and update comments in it (Ulf Hansson)
- Clean up broken white space in dpm_wait_for_children() (Geert
Uytterhoeven)
- Update the cpupower utility to fix lib version-ing in it and memory
leaks in error legs, remove hard-coded values, and implement CPU
physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
Khan, Yiwei Lin, Zhongqiu Han)"
* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
PM: sleep: Fix bit masking operation
dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
cpufreq: Init cpufreq only for present CPUs
PM: sleep: Fix handling devices with direct_complete set on errors
cpuidle: Init cpuidle only for present CPUs
PM: clk: Remove unused pm_clk_remove()
PM: sleep: core: Fix indentation in dpm_wait_for_children()
PM: s2idle: Extend comment in s2idle_enter()
PM: s2idle: Drop redundant locks when entering s2idle
PM: sleep: Remove unused pm_generic_ wrappers
cpufreq: tegra186: Share policy per cluster
cpupower: Make lib versioning scheme more obvious and fix version link
PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
PM: EM: Address RCU-related sparse warnings
cpupower: Implement CPU physical core querying
pm: cpupower: remove hard-coded topology depth values
pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
...
Perf and PMUs:
- Support for the "Rainier" CPU PMU from Arm
- Preparatory driver changes and cleanups that pave the way for BRBE
support
- Support for partial virtualisation of the Apple-M1 PMU
- Support for the second event filter in Arm CSPMU designs
- Minor fixes and cleanups (CMN and DWC PMUs)
- Enable EL2 requirements for FEAT_PMUv3p9
Power, CPU topology:
- Support for AMUv1-based average CPU frequency
- Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds
a generic topology_is_primary_thread() function overridden by x86 and
powerpc
New(ish) features:
- MOPS (memcpy/memset) support for the uaccess routines
Security/confidential compute:
- Fix the DMA address for devices used in Realms with Arm CCA. The
CCA architecture uses the address bit to differentiate between shared
and private addresses
- Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
default
Memory management clean-ups:
- Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
- Some minor page table accessor clean-ups
- PIE/POE (permission indirection/overlay) helpers clean-up
Kselftests:
- MTE: skip hugetlb tests if MTE is not supported on such mappings and
use correct naming for sync/async tag checking modes
Miscellaneous:
- Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
request)
- Sysreg updates for new register fields
- CPU type info for some Qualcomm Kryo cores
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmfjB2QACgkQa9axLQDI
XvGrfg//W3Bx9+jw1G/XHHEQqGEVFmvltvxZUkvgV0Qki0rPSMnappJhZRL9n0Nm
V6PvGd2KoKHZuL3g5ViZb3cs2R9BiD2JB6PncwBKuxumHGh3vz3kk1JMkDVfWdHv
qAceOckFJD9rXjPZn+PDsfYiEi2i3RRWIP5VglZ14ue8j3prHQ6DJXLUQF2GYvzE
/bgLSq44wp5N59ddy23+qH9rxrHzz3bgpbVv/F56W/LErvE873mRmyFwiuGJm+M0
Pn8ra572rI6a4sgSwrMTeNPBU+F9o5AbqwauVhkz428RdMvgfEuW6qHUBnGWJDmt
HotXmu+4Eb2KJks/iQkDo4OTJ38yUqvvZZJtP171ms3E4yqESSJngWP6O2A6LF+y
xhe0sESF/Ew6jLhM6/hvOmBcE2AyB14JE3ymqLkXbWub4NXddBn2AF1WXFjF4CBw
F8KSUhNLekrCYKv1k9M3nhvkcpoS9FkTF/TI+zEg546alI/GLPih6uDRkgMAODh1
RDJYixHsf2NDDRQbfwvt9Xua/KKpDF6qNkHLA4OiqqVUwh1hkas24Lrnp8vmce4o
wIpWCLqYWey8Rl3XWuWgWz2Xu58fHH4Dl2k72Z8I0pwp3abCDa9xEj79G0Svk7Si
Q+FCYrNlpKee1RXBC+1MUD/Gl5r/28dEUFkAzPD80F7AgafXPd0=
=Kc9c
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Nothing major this time around.
Apart from the usual perf/PMU updates, some page table cleanups, the
notable features are average CPU frequency based on the AMUv1
counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
the uaccess routines.
Perf and PMUs:
- Support for the 'Rainier' CPU PMU from Arm
- Preparatory driver changes and cleanups that pave the way for BRBE
support
- Support for partial virtualisation of the Apple-M1 PMU
- Support for the second event filter in Arm CSPMU designs
- Minor fixes and cleanups (CMN and DWC PMUs)
- Enable EL2 requirements for FEAT_PMUv3p9
Power, CPU topology:
- Support for AMUv1-based average CPU frequency
- Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
adds a generic topology_is_primary_thread() function overridden by
x86 and powerpc
New(ish) features:
- MOPS (memcpy/memset) support for the uaccess routines
Security/confidential compute:
- Fix the DMA address for devices used in Realms with Arm CCA. The
CCA architecture uses the address bit to differentiate between
shared and private addresses
- Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
default
Memory management clean-ups:
- Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
- Some minor page table accessor clean-ups
- PIE/POE (permission indirection/overlay) helpers clean-up
Kselftests:
- MTE: skip hugetlb tests if MTE is not supported on such mappings
and use correct naming for sync/async tag checking modes
Miscellaneous:
- Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
request)
- Sysreg updates for new register fields
- CPU type info for some Qualcomm Kryo cores"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
arm64: mm: Don't use %pK through printk
perf/arm_cspmu: Fix missing io.h include
arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
arm64: cputype: Add MIDR_CORTEX_A76AE
arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
arm64/sysreg: Enforce whole word match for open/close tokens
arm64/sysreg: Fix unbalanced closing block
arm64: Kconfig: Enable HOTPLUG_SMT
arm64: topology: Support SMT control on ACPI based system
arch_topology: Support SMT control for OF based system
cpu/SMT: Provide a default topology_is_primary_thread()
arm64/mm: Define PTDESC_ORDER
perf/arm_cspmu: Add PMEVFILT2R support
perf/arm_cspmu: Generalise event filtering
perf/arm_cspmu: Move register definitons to header
arm64/kernel: Always use level 2 or higher for early mappings
arm64/mm: Drop PXD_TABLE_BIT
arm64/mm: Check pmd_table() in pmd_trans_huge()
...
for_each_possible_cpu() is currently used to initialize cpufreq.
However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.
With nosmp or maxcpus=0, only the boot CPU is present, leading to
cpufreq probe failure or deferred probing because no CPU device is
available for the CPUs that are not present.
Change for_each_possible_cpu() to for_each_present_cpu() in the
affected cpufreq drivers to ensure cpufreq is only registered for
CPUs that are actually present.
Fixes: b0c69e1214 ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
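For illustration, a minimal sketch of the pattern being changed; the function name and per-CPU setup here are placeholders, not the actual drivers' code:

    #include <linux/cpu.h>
    #include <linux/cpumask.h>
    #include <linux/device.h>
    #include <linux/errno.h>

    /* Iterate present CPUs so that a CPU device exists for every CPU the
     * loop touches; with for_each_possible_cpu() the lookup below fails
     * for CPUs that are possible but not present (e.g. with nosmp).
     */
    static int example_cpufreq_driver_init(void)
    {
            int cpu;

            for_each_present_cpu(cpu) {     /* was: for_each_possible_cpu(cpu) */
                    struct device *cpu_dev = get_cpu_device(cpu);

                    if (!cpu_dev)
                            return -EPROBE_DEFER;
                    /* ... per-CPU setup ... */
            }
            return 0;
    }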
This functionally brings tegra186 in line with tegra210 and tegra194,
sharing a cpufreq policy between all cores in a cluster.
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
When the CPU goes offline there is no need to change the CPPC request
because the CPU will already go into the deepest C-state it supports.
Actually changing the CPPC request when it goes offline messes up the
cached values and can lead to the wrong values being restored when
it comes back.
Instead, drop those actions and, if the CPU comes back online, let
amd_pstate_epp_set_policy() restore it to the expected values.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
EPP values are cached in the cpudata structure per CPU. This is needless
though because they are also cached in the CPPC request variable.
Drop the separate cache for EPP values and always reference the CPPC
request variable when needed.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
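A minimal sketch of the idea, using an illustrative mask rather than the driver's actual definitions: the EPP value is decoded from the cached CPPC request word instead of being kept in a separate per-CPU field.

    #include <linux/bitfield.h>
    #include <linux/bits.h>
    #include <linux/types.h>

    /* Illustrative layout: assume EPP occupies bits 31:24 of the request. */
    #define EXAMPLE_CPPC_EPP_MASK   GENMASK_ULL(31, 24)

    static u8 example_get_epp(u64 cppc_req_cached)
    {
            /* No separate EPP cache: recover it from the request word. */
            return FIELD_GET(EXAMPLE_CPPC_EPP_MASK, cppc_req_cached);
    }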
The CPPC enable register is configured as "write once". That is,
any subsequent writes don't actually do anything.
Because of this, all the cleanup paths that currently exist for
CPPC disable are ineffective.
Rework CPPC enable to only enable after all the CAP registers have
been read to avoid enabling CPPC on CPUs with invalid _CPC or
unpopulated MSRs.
As the register is write once, remove all cleanup paths as well.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
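A rough sketch of the resulting ordering only; the accessor callbacks stand in for the driver's real MSR/CPPC helpers and are not its actual API:

    #include <linux/errno.h>
    #include <linux/types.h>

    /* Validate the capability value first, and only then set the
     * (write-once) enable bit.  Because the register is write once,
     * there is no corresponding "disable" cleanup path.
     */
    static int example_cppc_enable(u64 (*read_cap1)(void),
                                   int (*write_enable)(u64))
    {
            u64 cap1 = read_cap1();

            if (!cap1)      /* unpopulated MSR or invalid _CPC */
                    return -ENODEV;

            return write_enable(1);
    }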
There are now trace events for all amd-pstate modes that output
information right before programming the hardware.
This makes the existing debug statements nothing but unnecessary
overhead. Drop them.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
On EPP-only writes, update the cached variable so that the min/max
performance controls don't need to be updated again.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The EPP tracing is done by the caller today, but this precludes
including information about whether the CPPC request has changed.
Move it into the update_perf and set_epp functions and include information
about whether the request has changed from the last one.
amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy
as an argument instead of the cpudata.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
To avoid a potential redundant write in shmem_update_perf(), cache
the request in the cppc_req_cached variable that is normally only
used in the MSR case.
This adds symmetry to the code and potentially avoids extra writes.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
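As a sketch of the idea (the struct and callback names are made up for the example), the shared-memory path now compares the newly composed request against the cached one and skips the write when nothing changed:

    #include <linux/types.h>

    struct example_cpudata {
            u64 cppc_req_cached;    /* now maintained for the shmem path too */
    };

    static int example_shmem_update_perf(struct example_cpudata *cpudata,
                                         u64 new_req, int (*write_req)(u64))
    {
            if (new_req == cpudata->cppc_req_cached)
                    return 0;       /* nothing changed, avoid the extra write */

            cpudata->cppc_req_cached = new_req;
            return write_req(new_req);
    }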
Bitfield masks are easier to follow and less error prone.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
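A small example of the style being adopted, using illustrative mask names rather than the driver's own definitions:

    #include <linux/bitfield.h>
    #include <linux/bits.h>
    #include <linux/types.h>

    /* Illustrative request-word layout: one byte per field. */
    #define EX_CPPC_MAX_PERF_MASK   GENMASK_ULL(7, 0)
    #define EX_CPPC_MIN_PERF_MASK   GENMASK_ULL(15, 8)
    #define EX_CPPC_DES_PERF_MASK   GENMASK_ULL(23, 16)
    #define EX_CPPC_EPP_PERF_MASK   GENMASK_ULL(31, 24)

    static u64 example_compose_req(u8 max, u8 min, u8 des, u8 epp)
    {
            /* FIELD_PREP() places each value according to its mask,
             * replacing hand-rolled "(x & 0xff) << 16" style shifting.
             */
            return FIELD_PREP(EX_CPPC_MAX_PERF_MASK, max) |
                   FIELD_PREP(EX_CPPC_MIN_PERF_MASK, min) |
                   FIELD_PREP(EX_CPPC_DES_PERF_MASK, des) |
                   FIELD_PREP(EX_CPPC_EPP_PERF_MASK, epp);
    }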
In amd_pstate_ut_check_freq() and amd_pstate_ut_check_perf() the cpudata
variable is only needed in the scope of the for loop. Move it there.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
If a CPU is missing a policy or has been offlined, the unit test is
skipped for the rest of the CPUs on the system.
Instead, iterate over the online CPUs and skip any missing policies
so that the rest of them can still be tested.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
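The loop shape being described looks roughly like this (the per-policy check callback is a placeholder, not the unit test's actual function):

    #include <linux/cpufreq.h>
    #include <linux/cpumask.h>

    static void example_ut_check_all(int (*check_one)(struct cpufreq_policy *))
    {
            int cpu;

            for_each_online_cpu(cpu) {
                    struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);

                    if (!policy)
                            continue;       /* missing policy: skip, don't abort */

                    check_one(policy);
                    cpufreq_cpu_put(policy);
            }
    }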
The enums are effectively used as a boolean and don't convey the
return value of the failing call.
Instead of using enums, switch to returning the actual return
code from the unit test.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
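A sketch of the change in shape (the helper and its callback are hypothetical): instead of collapsing every failure into a pass/fail enum, the test propagates the errno from the call that failed.

    #include <linux/errno.h>
    #include <linux/types.h>

    static int example_ut_check_enabled(int (*read_enable)(u64 *))
    {
            u64 enabled;
            int ret;

            ret = read_enable(&enabled);
            if (ret)
                    return ret;     /* propagate the real error code */

            return enabled ? 0 : -EINVAL;   /* was: a pass/fail enum value */
    }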
Several Ryzen AI processors support the exact same value for lowest
nonlinear perf and lowest perf. Loosen up the unit tests to allow this
scenario.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
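Roughly, the relaxed check looks like this sketch; the names and exact set of conditions are illustrative, not the unit test's real code:

    #include <linux/errno.h>
    #include <linux/types.h>

    static int example_check_perf_order(u8 highest, u8 nominal,
                                        u8 lowest_nonlinear, u8 lowest)
    {
            if (highest >= nominal &&
                nominal >= lowest_nonlinear &&
                lowest_nonlinear >= lowest &&   /* was a strict '>' */
                lowest > 0)
                    return 0;

            return -EINVAL;
    }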
The `cppc_cap1_cached` variable isn't used at all, so there is no
need to read it at initialization for each CPU.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
update the policy state and have nothing to do with the amd-pstate
driver itself.
A global "limits" lock doesn't make sense because each CPU can have
policies changed independently. Each time a CPU changes values they
will atomically be written to the per-CPU perf member. Drop per CPU
locking cases.
The remaining "global" driver lock is used to ensure that only one
entity can change driver modes at a given time.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
By storing perf values in a union, all the writes and reads can
be done atomically, removing the need for some concurrency protections.
While making this change, also drop the cached frequency values,
using inline helpers to calculate them on demand from the perf values.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
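A minimal sketch of the data layout (field names are illustrative): all perf levels share one 64-bit word, so a single READ_ONCE()/WRITE_ONCE() replaces lock-protected access.

    #include <linux/compiler.h>
    #include <linux/types.h>

    union example_perf_cached {
            struct {
                    u8 highest_perf;
                    u8 nominal_perf;
                    u8 lowest_nonlinear_perf;
                    u8 lowest_perf;
                    u8 min_limit_perf;
                    u8 max_limit_perf;
                    u8 reserved[2];
            };
            u64 val;
    };

    static inline union example_perf_cached
    example_read_perf(union example_perf_cached *p)
    {
            union example_perf_cached perf;

            perf.val = READ_ONCE(p->val);   /* one atomic 64-bit load */
            return perf;
    }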
Use the perf_to_freq helpers to calculate this on the fly.
As the members are no longer cached, add an extra check to
amd_pstate_epp_update_limit() to avoid unnecessary calls in
amd_pstate_update_min_max_limit().
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
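For reference, an on-demand conversion helper might look like the following sketch; the scaling by nominal frequency over nominal perf follows the usual CPPC convention, but treat the exact formula and names as illustrative:

    #include <linux/math64.h>
    #include <linux/types.h>

    static inline u32 example_perf_to_freq(u8 perf, u32 nominal_freq_khz,
                                           u8 nominal_perf)
    {
            /* div_u64() keeps the 64-bit division safe on 32-bit builds. */
            return (u32)div_u64((u64)perf * nominal_freq_khz, nominal_perf);
    }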
I came across a system on which MSR_AMD_CPPC_CAP1 isn't populated
for some CPUs. This is unexpected behavior that is most likely a
BIOS bug. In the event it happens, I'd like users to report bugs
so the issue can be properly root-caused and fixed.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
During resume it's possible the firmware didn't restore the CPPC request
MSR but the kernel thinks the values line up. This leads to incorrect
performance after resume from suspend.
To fix the issue, invalidate the cached value at suspend. During resume,
use the saved values programmed as cached limits.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The clamping in freq_to_perf() is broken right now, as we first typecast
(read: wrap around) the overflowing value into a u8 and then clamp it down.
So, use a u32 to store the >255 value in certain edge cases and then clamp
it down into a u8.
Also, use an "explicit typecast + clamp" instead of just a "clamp_t", as
the latter typecasts first and then clamps between the limits, which
defeats our purpose.
Fixes: 620136ced3 ("cpufreq/amd-pstate: Modularize perf<->freq conversion")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
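The fix described above amounts to the following pattern (a sketch with illustrative names, not the driver's exact code): compute into a wider type, clamp, then narrow.

    #include <linux/math64.h>
    #include <linux/minmax.h>
    #include <linux/types.h>

    static inline u8 example_freq_to_perf(u32 freq_khz, u32 nominal_freq_khz,
                                          u8 nominal_perf, u8 lowest_perf)
    {
            /* Keep the intermediate result in a u32 so values above 255
             * are not wrapped before clamping; clamp_t(u8, ...) would
             * truncate first and then clamp the already-wrapped value.
             */
            u32 perf = (u32)div_u64((u64)freq_khz * nominal_perf,
                                    nominal_freq_khz);

            return (u8)clamp(perf, (u32)lowest_perf, 255U);
    }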
Support was added for Tegra234 in the referenced commit, but the Kconfig
was not updated to allow building for the arch.
Fixes: 273bc890a2 ("cpufreq: tegra194: Add support for Tegra234")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The intel_pstate driver relies on SMP calls to get the cpu-type of a given
CPU. Remove the SMP calls and instead use the cached value of the cpu-type,
which is more efficient.
[ mingo: Forward ported it. ]
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-2-2ae010f50370@linux.intel.com
This driver can no longer be built since support for IBM Cell Blades was
removed, in particular CBE_RAS.
Remove the driver.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241218105523.416573-22-mpe@ellerman.id.au
There is no need to take a driver wide lock while updating the
highest_perf value in the percpu cpudata struct. Hence remove it.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-13-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
There have been instances in the past where refcount decrementing was
missed while exiting a function. Use automatic scope-based cleanup to
avoid such errors.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-12-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
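A short example of the scope-based cleanup pattern from <linux/cleanup.h>; the cleanup class name here is made up for the illustration:

    #include <linux/cleanup.h>
    #include <linux/cpufreq.h>
    #include <linux/errno.h>

    /* Drop the policy reference automatically on every return path. */
    DEFINE_FREE(example_put_policy, struct cpufreq_policy *,
                if (_T) cpufreq_cpu_put(_T))

    static int example_touch_policy(unsigned int cpu)
    {
            struct cpufreq_policy *policy __free(example_put_policy) =
                                                    cpufreq_cpu_get(cpu);

            if (!policy)
                    return -ENODEV;

            /* ... use policy; no explicit cpufreq_cpu_put() needed ... */
            return 0;
    }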
Check if policy is NULL before dereferencing it in amd_pstate_update.
Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-11-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The update_limits callback is only called under two conditions:
* When the preferred core rankings change. In that case, we just need to
update the prefcore ranking in the cpudata struct. As there are no changes
to any of the perf values, there is no need to call cpufreq_update_policy()
* When the _PPC ACPI object changes, i.e. the highest allowed Pstate
changes. The _PPC object is only used by a table-based cpufreq driver
like acpi-cpufreq, hence it is irrelevant for the CPPC-based amd-pstate.
Hence, the cpufreq_update_policy() call becomes unnecessary and can be
removed.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-9-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Delegate the perf<->frequency conversion to helper functions to reduce
code duplication, and improve readability.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-8-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
All perf values are always within the 0-255 range, hence convert their
datatype to u8 everywhere.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-7-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>