set_boost is a per-policy function call, hence a driver wide lock is
unnecessary. Also this mutex_acquire can collide with the mutex_acquire
from the mode-switch path in status_store(), which can lead to a
deadlock. So, remove it.
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
In adjust_perf() callback, we are setting the max_perf to highest_perf,
as opposed to the correct limit value i.e. max_limit_perf. Fix that.
Fixes: 3f7b835fa4 ("cpufreq/amd-pstate: Move limit updating code")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-3-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Scope based guard/cleanup macros should not be used together with goto
labels. Hence, remove the goto label.
Fixes: 6c093d5a5b ("cpufreq/amd-pstate: convert mutex use to guard()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-2-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
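As a rough userspace illustration of the goto removal described in the commit above, the sketch below imitates a scope-based guard with the GCC/Clang `cleanup` variable attribute. The names `toy_lock`, `TOY_GUARD` and `toy_update` are hypothetical and are not the kernel's guard() implementation; the point is only that every exit path can be a plain return.

```c
/* Toy lock that only counts how often it is taken and released. */
struct toy_lock {
	int held;
	int releases;
};

/* Cleanup handler: receives a pointer to the guarded variable. */
static void toy_unlock(struct toy_lock **l)
{
	(*l)->held--;
	(*l)->releases++;
}

/* Take the lock now; the compiler calls toy_unlock() when the scope ends. */
#define TOY_GUARD(l) \
	struct toy_lock *toy_guard_ptr __attribute__((cleanup(toy_unlock))) = (l); \
	toy_guard_ptr->held++

/* Every error path is a plain return; no "goto out_unlock" label is needed. */
int toy_update(struct toy_lock *lock, int value)
{
	TOY_GUARD(lock);

	if (value < 0)
		return -1;	/* lock is released automatically here */

	return value * 2;	/* ... and here */
}
```

Because the release is tied to the scope of the guard variable, an explicit unlock label becomes redundant once the function uses a guard.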
Commit c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision
boost state") sets the per-policy boost flag to false when setting boost
fails. However, store_local_boost() and cpufreq_boost_trigger_state() in
cpufreq.c then flip this flag to its reverse value. As a result, the
per-policy boost flag ends up set to true when setting boost fails.
Remove the extra assignment in amd_pstate_set_boost() and keep all
operations on the per-policy boost flag outside of set_boost() to fix
this problem.
Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250110091949.3610770-1-zhenglifeng1@huawei.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
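The interaction described in the commit above can be modeled with a small sketch. It assumes, based only on that description, that the caller sets the flag before invoking the driver and reverses it on failure; `toy_policy`, `toy_set_boost` and `toy_store_boost` are hypothetical names, not the cpufreq core code.

```c
#include <stdbool.h>

struct toy_policy {
	bool boost_enabled;
};

/*
 * Driver callback that always fails. 'driver_touches_flag' models the old
 * behavior of clearing the per-policy flag inside the driver on failure.
 */
static int toy_set_boost(struct toy_policy *p, bool enable,
			 bool driver_touches_flag)
{
	(void)enable;
	if (driver_touches_flag)
		p->boost_enabled = false;
	return -1;
}

/* Caller side: set the flag first, then reverse it if the driver fails. */
int toy_store_boost(struct toy_policy *p, bool enable,
		    bool driver_touches_flag)
{
	int ret;

	p->boost_enabled = enable;
	ret = toy_set_boost(p, enable, driver_touches_flag);
	if (ret)
		p->boost_enabled = !p->boost_enabled;

	return ret;
}
```

In this model, a failed attempt to enable boost leaves the flag at true when the driver also clears it, and at false when only the caller manages it.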
The previous approach introduced roundoff errors during division when
calculating the boost ratio. This, in turn, affected the maximum
frequency calculation, often resulting in reporting lower frequency
values.
For example, on the Glinda SoC based board with the following
parameters:
max_perf = 208
nominal_perf = 100
nominal_freq = 2600 MHz
The Linux kernel previously calculated the frequency as:
freq = ((max_perf * 1024 / nominal_perf) * nominal_freq) / 1024
freq = 5405 MHz // Integer arithmetic.
With the updated formula:
freq = (max_perf * nominal_freq) / nominal_perf
freq = 5408 MHz
This change ensures more accurate frequency calculations by eliminating
unnecessary shifts and divisions, thereby improving precision.
Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com>
[ML: trim the changelog from commit message]
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241219201833.2750998-1-naresh.solanki@9elements.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
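The rounding difference in the Glinda example above can be reproduced with plain integer arithmetic. This is a minimal sketch of the two formulas as written in that commit message; the function names `max_freq_old` and `max_freq_new` are hypothetical.

```c
#include <stdint.h>

/* Old formula: scale by 1024, divide, then scale back. The first division
 * truncates the boost ratio, and that error is multiplied by the frequency. */
uint32_t max_freq_old(uint32_t max_perf, uint32_t nominal_perf,
		      uint32_t nominal_freq)
{
	uint64_t boost_ratio = ((uint64_t)max_perf * 1024) / nominal_perf;

	return (uint32_t)((boost_ratio * nominal_freq) / 1024);
}

/* New formula: multiply first, divide once. */
uint32_t max_freq_new(uint32_t max_perf, uint32_t nominal_perf,
		      uint32_t nominal_freq)
{
	return (uint32_t)(((uint64_t)max_perf * nominal_freq) / nominal_perf);
}
```

With max_perf = 208, nominal_perf = 100 and nominal_freq = 2600, the intermediate ratio 212992 / 100 truncates to 2129, which gives 5405 for the old formula, while the new formula gives 5408.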
commit 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as
highest perf again") updated the value stored for highest perf to no longer
store the highest perf value but instead the boost numerator.
This is a fixed value for systems with preferred cores and not appropriate
for use as ITMT rankings. Update the value used for ITMT rankings to be the
preferred core ranking.
Reported-and-tested-by: Sebastian <sobrus@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219640
Fixes: 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as highest perf again")
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20250102141204.3413202-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Currently boost_state is cached for every processor in cpudata structure
and driver boost state is set for every processor.
Both of these aren't necessary as the driver only needs to set once and
the policy stores whether boost is enabled.
Move the driver boost setting to registration and adjust all references
to cached value to pull from the policy instead.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-16-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
For Ryzen systems the EPP policy set by the BIOS is generally configured
to performance as this is the default register value for the CPPC request
MSR.
If a user doesn't use additional software to configure EPP then the system
will default to a performance bias and consume extra battery. Instead
configure the default to "balanced_performance" for this case.
Suggested-by: Artem S. Tashkinov <aros@gmx.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219526
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-15-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The ret variable is not necessary.
Reviewed-and-tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-14-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
For MSR systems the EPP value is in the same register as perf targets
and so dividing them into two separate MSR writes is wasteful.
In msr_update_perf(), update both EPP and perf values in one write to
MSR_AMD_CPPC_REQ, and cache them if successful.
To accomplish this plumb the EPP value into the update_perf call and
modify all its callers to check the return value.
As this unifies calls, ensure that the MSR write is necessary before
flushing a write out. Also drop the comparison from the passive flow
tracing.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-13-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
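The write-then-cache pattern described in the commit above (skip the write when nothing changed, update the cache only after a successful write) might look roughly like the sketch below. `toy_cpudata`, `toy_wrmsr` and `toy_update_perf` are hypothetical stand-ins; the `fail` parameter exists only to simulate a failing register write.

```c
#include <stdint.h>

struct toy_cpudata {
	uint64_t cppc_req_cached;	/* last value known to be in the register */
	int writes;			/* number of successful register writes */
};

/* Hypothetical register write; 'fail' lets a caller simulate an error. */
static int toy_wrmsr(struct toy_cpudata *d, uint64_t value, int fail)
{
	(void)value;
	if (fail)
		return -1;
	d->writes++;
	return 0;
}

int toy_update_perf(struct toy_cpudata *d, uint64_t value, int fail)
{
	int ret;

	if (value == d->cppc_req_cached)
		return 0;	/* nothing changed, skip the write */

	ret = toy_wrmsr(d, value, fail);
	if (ret)
		return ret;	/* leave the cache untouched on failure */

	d->cppc_req_cached = value;
	return 0;
}
```

Returning the error to the caller, rather than swallowing it, keeps the cached value consistent with what was last written successfully.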
Cache the value in cpudata->epp_cached, and use that for all callers.
As all callers use cached value merge amd_pstate_get_energy_pref_index()
into show_energy_performance_preference().
Check if the EPP value is changed before writing it to MSR or
shared memory region.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-12-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The limit updating code in amd_pstate_epp_update_limit() should not
only apply to EPP updates. Move it to amd_pstate_update_min_max_limit()
so other callers can benefit as well.
With this move it's not necessary to have clamp_t calls anymore because
the verify callback is called when setting limits.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-11-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
As msr_update_perf() calls an MSR it's possible that it fails. Pass
this return code up to the caller.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-10-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Storing values in the cpudata structure in different units leads
to confusion and hardcoded conversions elsewhere. After ratios are
calculated, store everything in kHz for any future use. Adjust all
relevant consumers for this change as well.
Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-9-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
If writing the MSR MSR_AMD_CPPC_REQ fails then the cached value in the
amd_cpudata structure should not be updated.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-8-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
In "active" mode the most important thing for debugging whether
an issue is hardware or software based is to look at what was the
last thing written to the CPPC request MSR or shared memory region.
The 'amd_pstate_epp_perf' trace event shows the values being written
for all CPUs.
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
amd_pstate_epp_offline() is only called from within
amd_pstate_epp_cpu_offline() and doesn't make much sense to have it at all.
Hence, remove it.
Also remove the unnecessary debug print in the offline path while at it.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-6-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Only amd_pstate_epp driver (i.e. cppc_state = ACTIVE) enters the
amd_pstate_epp_offline() and amd_pstate_epp_cpu_online() functions,
so remove the unnecessary if condition checking if cppc_state is
equal to AMD_PSTATE_ACTIVE.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Replace similar code chunks with amd_pstate_update_perf() and
amd_pstate_set_epp() function calls.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-4-Dhananjay.Ugwekar@amd.com
[ML: Fix LKP reported error about unused variable]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
amd_pstate_update_perf() should not be a part of shmem_set_epp() function,
so move it to the amd_pstate_epp_update_limit() function, where it is needed.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
MSR and shared memory based systems have different mechanisms to get and
set the epp value. Split those mechanisms into different functions and
assign them appropriately to the static calls at boot time. This eliminates
the need for the "if(cpu_feature_enabled(X86_FEATURE_CPPC))" checks at
runtime.
Also, propagate the error code from rdmsrl_on_cpu() and cppc_get_epp_perf()
to *_get_epp()'s caller, instead of returning -EIO unconditionally.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
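The idea in the commit above of picking the backend once at boot can be approximated in userspace with a function pointer. The kernel uses static calls for this; the sketch below is only an analogy, and `toy_msr_get_epp`, `toy_shmem_get_epp`, `toy_init` and the returned values are hypothetical placeholders.

```c
#include <stdbool.h>

/* Two backends with the same signature; a negative return would be an error
 * code that is passed through to the caller unchanged. */
static int toy_msr_get_epp(int cpu)
{
	(void)cpu;
	return 0x80;	/* placeholder value */
}

static int toy_shmem_get_epp(int cpu)
{
	(void)cpu;
	return 0xff;	/* placeholder value */
}

static int (*toy_get_epp)(int cpu) = toy_msr_get_epp;

/* Called once at init: bind the backend that matches the platform. */
void toy_init(bool has_cppc_msr)
{
	toy_get_epp = has_cppc_msr ? toy_msr_get_epp : toy_shmem_get_epp;
}

/* Runtime path: no feature check, just call whatever was bound at init. */
int toy_read_epp(int cpu)
{
	return toy_get_epp(cpu);
}
```

The feature test happens once in toy_init(), so the hot path carries no conditional on the platform type.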
commit 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits
when boost is disabled") introduced different semantics for min/max limits
based upon whether the user turned off boost from sysfs.
This however is not necessary when the highest perf value is the boost
numerator.
Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
Link: https://lore.kernel.org/r/20241209185248.16301-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
commit ad4caad58d ("cpufreq: amd-pstate: Merge
amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
changed the semantics for highest perf and commit 18d9b52271
("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
worked around those semantic changes.
This however is a confusing result and furthermore makes it awkward to
change frequency limits and boost due to the scaling differences. Restore
the boost numerator to highest perf again.
Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Link: https://lore.kernel.org/r/20241209185248.16301-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Booting with amd-pstate on 3rd Generation EPYC system incorrectly
enabled ITMT support despite the system not supporting Preferred Core
ranking. amd_pstate_init_prefcore() called during amd_pstate*_cpu_init()
requires "amd_pstate_prefcore" to be set correctly however the preferred
core support is detected only after driver registration which is too
late.
Swap the function calls around to detect preferred core support before
registering the driver via amd_pstate_register_driver(). This ensures
amd_pstate*_cpu_init() sees the correct value of "amd_pstate_prefcore"
considering the platform support.
Fixes: 279f838a61 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Fixes: ff2653ded4 ("cpufreq/amd-pstate: Move registration after static function call update")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241210032557.754-1-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Merge tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Borislav Petkov:
- Add a feature flag which denotes AMD CPUs supporting workload
classification with the purpose of using such hints when making
scheduling decisions
- Determine the boost numerator for each AMD core based on its type:
efficiency or performance, in the cppc driver
- Add the type of a CPU to the topology CPU descriptor with the goal of
supporting and making decisions based on the type of the respective
core
- Add a feature flag to denote AMD cores which have heterogeneous
topology and enable SD_ASYM_PACKING for those
- Check microcode revisions before disabling PCID on Intel
- Cleanups and fixlets
* tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu()
x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM()
x86/cpu: Fix formatting of cpuid_bits[] in scattered.c
x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit
x86/amd: Use heterogeneous core topology for identifying boost numerator
x86/cpu: Add CPU type to struct cpuinfo_topology
x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD
x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
x86/mm: Don't disable PCID when INVLPG has been fixed by microcode
On shared memory designs the static functions need to work before
registration is done or the system can hang at bootup.
Move the registration later in amd_pstate_init() to solve this.
Fixes: b427ac4084 ("cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call")
Reported-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/linux-pm/cf9c146d-bacf-444e-92e2-15ebf513af96@gmail.com/#t
Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
As the driver can be changed in and out of different modes it's possible
that adjust_perf is assigned when it shouldn't be.
This could happen if an MSR design is started up in passive mode and then
switches to active mode.
To solve this explicitly clear `adjust_perf` in amd_pstate_epp_cpu_init().
Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Set min_perf to lowest_perf for shared memory systems, similar to the MSR
based systems.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The EPP value being set in perf_ctrls.energy_perf is not being propagated
to the shared memory. Fix that.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
MSR_AMD_CPPC_ENABLE is a write once register, i.e. attempting to clear
it is futile, it will not take effect. Hence, return early if the disable (0)
argument is passed to msr_cppc_enable().
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Currently the default cpufreq driver for all the AMD EPYC servers is
acpi-cpufreq. Going forward, switch to amd-pstate as the default
driver on the AMD EPYC server platforms with CPU family 0x1A or
higher. The default mode will be active mode.
Testing shows that amd-pstate with active mode and performance
governor provides comparable or better performance per-watt against
acpi-cpufreq + performance governor.
Likewise, amd-pstate with active mode and powersave governor with the
energy_performance_preference=power (EPP=255) provides comparable or
better performance per-watt against acpi-cpufreq + schedutil governor
for a wide range of workloads.
Users can still revert to using acpi-cpufreq driver on these platforms
with the "amd_pstate=disable" kernel commandline parameter.
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-3-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The amd-pstate driver sets CPPC_REQ.min_perf to CPPC_REQ.max_perf when
in active mode with performance governor. Typically CPPC_REQ.max_perf
is set to CPPC.highest_perf. This causes frequency throttling on
power-limited platforms which causes performance regressions on
certain classes of workloads.
Hence, set the CPPC_REQ.min_perf to the CPPC.nominal_perf or
CPPC_REQ.max_perf, whichever is lower of the two.
Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-2-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
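The min_perf selection from the commit above reduces to taking the lower of two values. A minimal sketch, with the hypothetical helper name `perf_governor_min_perf`:

```c
#include <stdint.h>

/* Floor for the performance governor: nominal_perf, unless the current
 * max_perf limit is already below it. */
uint8_t perf_governor_min_perf(uint8_t nominal_perf, uint8_t max_perf)
{
	return nominal_perf < max_perf ? nominal_perf : max_perf;
}
```

For instance, with nominal_perf = 100 and max_perf = 208 the floor becomes 100 instead of 208, and with max_perf lowered to 80 the floor follows it down to 80.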
amd_pstate_set_driver() is called twice, once in amd_pstate_init() and once
as part of amd_pstate_register_driver(). Move around code and eliminate
the redundancy.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Replace the switch case with a more readable if condition.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Replace a similar chunk of code in amd_pstate_register_driver() with
amd_pstate_set_driver() call.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Replace a similar chunk of code in amd_pstate_init() with
amd_pstate_register() call.
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
According to the AMD architectural programmer's manual volume 2 [1], in
section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
as "Reports the most energy efficient performance level (in terms of
performance per watt). Above this threshold, lower performance levels
generally result in increased energy efficiency. Reducing performance
below this threshold does not result in total energy savings for a given
computation, although it reduces instantaneous power consumption". So
lowest_nonlinear_perf is the most power efficient performance level, and
going below that would lead to a worse performance/watt.
Also, setting the minimum frequency to lowest_nonlinear_freq (instead of
lowest_freq) allows the CPU to idle at a higher frequency which leads
to more time being spent in a deeper idle state (as trivial idle tasks
are completed sooner). This has shown a power benefit in some systems,
in other systems, power consumption has increased but so has the
throughput/watt.
Modify the initial policy_data->min set by cpufreq-core to
lowest_nonlinear_freq, in the ->verify() callback. Also set the
cpudata->req[0] to FREQ_QOS_MIN_DEFAULT_VALUE (i.e. 0), so that it also
gets overridden by the check in verify function.
Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
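One way to picture the verify-time override from the commit above is the sketch below. It assumes the check simply compares the requested minimum against the default value of 0; `toy_policy_data`, `toy_verify` and `TOY_FREQ_QOS_MIN_DEFAULT` are hypothetical names and this is not the driver's actual callback.

```c
#include <stdint.h>

#define TOY_FREQ_QOS_MIN_DEFAULT 0	/* mirrors "i.e. 0" in the text above */

struct toy_policy_data {
	uint32_t min;	/* kHz */
	uint32_t max;	/* kHz */
};

/* If nobody asked for a specific minimum, start from lowest_nonlinear_freq;
 * an explicit user request is left alone. */
void toy_verify(struct toy_policy_data *pd, uint32_t lowest_nonlinear_freq)
{
	if (pd->min == TOY_FREQ_QOS_MIN_DEFAULT)
		pd->min = lowest_nonlinear_freq;
}
```

Keying the override on the default value means a minimum chosen explicitly through sysfs still takes precedence.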
Merge the two verify() callback functions and rename the
cpufreq_policy_data argument for better readability.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The EPP value doesn't need to be cached to the CPPC request in
amd_pstate_epp_update_limit() because it's passed as an argument
at the end to amd_pstate_set_epp() and stored at that time.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
When the EPP updates are set the maximum capable frequency for the
CPU is used to set the upper limit instead of that of the policy.
Adjust amd_pstate_epp_update_limit() to reuse policy calculation code
from amd_pstate_update_min_max_limit().
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
When boost is changed the CPPC value is changed in amd_pstate_cpu_boost_update()
but then changed again when refresh_frequency_limits() and all its callbacks
occur. The first is a pointless write, so instead just update the limits for
the policy and let the policy refresh anchor everything properly.
Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Existing function names "cppc_*" and "pstate_*" for shared memory and
MSR based systems are not intuitive enough, replace them with "shmem_*" and
"msr_*" respectively.
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240917091434.10685-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
When boost has been disabled the limit for perf should be nominal perf not
the highest perf. Using the latter to do calculations will lead to
incorrect values that are still above nominal.
Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Reported-by: Peter Jung <ptr1337@cachyos.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219348
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
While switching the driver mode between active and passive, Collaborative
Processor Performance Control (CPPC) is disabled in
amd_pstate_unregister_driver(). But, it is not enabled back while registering
the new driver (passive or active). This leads to the new driver mode not
working correctly, so enable it back in amd_pstate_register_driver().
Fixes: 3ca7bc818d ("cpufreq: amd-pstate: Add guided mode control support via sysfs")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241004122303.94283-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Merge tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge the second round of amd-pstate changes for 6.12 from Mario
Limonciello:
"* Move the calculation of the AMD boost numerator outside of
amd-pstate, correcting acpi-cpufreq on systems with preferred cores
* Harden preferred core detection to avoid potential false positives
* Add extra unit test coverage for mode state machine"
* tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
cpufreq/amd-pstate-ut: Add test case for mode switches
cpufreq/amd-pstate: Export symbols for changing modes
amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
x86/amd: Move amd_get_highest_perf() out of amd-pstate
ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
ACPI: CPPC: Drop check for non zero perf ratio
x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
In order to effectively test all mode switch combinations export
everything necessary for amd-pstate-ut to trigger a mode switch.
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Don't take and release the mutex when prefcore isn't present and
avoid initialization of variables that will be initially set
in the function.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The special case in amd_pstate_highest_perf_set() is the value used
for calculating the boost numerator. Merge this into
amd_get_boost_ratio_numerator() and then use that to calculate boost
ratio.
This allows dropping more special casing of the highest perf value.
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
AMD systems that support preferred cores will use "166" as their
numerator for max frequency calculations instead of "255".
Add a function for detecting preferred cores by looking at the
highest perf value on all cores.
If preferred cores are enabled, return 166; if they are disabled, return
the value in the highest perf register. As the function will be called
multiple times, cache the boost numerator and whether preferred cores
are enabled in global variables.
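For illustration only, a minimal userspace sketch of the detection and
caching described above. It assumes that preferred cores show up as cores
reporting differing highest perf values; that rule, the array argument and
all names below are hypothetical and are not the kernel's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

#define PREFCORE_NUMERATOR 166U /* numerator named in this commit message */

/* Cached results, so repeated calls skip the scan over all cores. */
static bool cached_valid;
static bool cached_prefcore;
static unsigned int cached_numerator;

/*
 * Assumption: preferred cores are treated as enabled when the cores do
 * not all report the same highest perf value.
 */
static bool detect_prefcore(const unsigned int *highest_perf, size_t ncpus)
{
	for (size_t i = 1; i < ncpus; i++)
		if (highest_perf[i] != highest_perf[0])
			return true;
	return false;
}

/* Return the boost numerator, computing and caching it on first use. */
static unsigned int get_boost_ratio_numerator(const unsigned int *highest_perf,
					      size_t ncpus)
{
	if (ncpus == 0)
		return 0;

	if (!cached_valid) {
		cached_prefcore = detect_prefcore(highest_perf, ncpus);
		cached_numerator = cached_prefcore ? PREFCORE_NUMERATOR
						   : highest_perf[0];
		cached_valid = true;
	}
	return cached_numerator;
}
```

Once the first call has filled the cache, later calls return the cached
numerator without scanning the cores again.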
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
amd_pstate_get_highest_perf() is a helper used to get the highest perf
value on AMD systems. It's used in amd-pstate as part of preferred
core handling, but is applicable to acpi-cpufreq as well.
Move it out to cppc handling code as amd_get_highest_perf().
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Merge tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge ARM cpufreq updates for 6.12 from Viresh Kumar:
"- Several OF related cleanups in cpufreq drivers (Rob Herring).
- Enable COMPILE_TEST for ARM drivers (Rob Herring).
- Introduce quirks for syscon failures and use socinfo to get revision
for TI cpufreq driver (Dhruva Gole and Nishanth Menon).
- Minor cleanups in amd-pstate driver (Anastasia Belova and Dhananjay
Ugwekar).
- Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
(Danila Tikhonov, Huacai Chen, and Liu Jing)."
* tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: ti-cpufreq: Use socinfo to get revision in AM62 family
cpufreq: Fix the cacography in powernv-cpufreq.c
cpufreq: ti-cpufreq: Introduce quirks to handle syscon fails appropriately
cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
cpufreq: Add SM7325 to cpufreq-dt-platdev blocklist
cpufreq: Fix warning on unused of_device_id tables for !CONFIG_OF
cpufreq/amd-pstate: Add the missing cpufreq_cpu_put()
cpufreq: Drop CONFIG_ARM and CONFIG_ARM64 dependency on Arm drivers
cpufreq: Enable COMPILE_TEST on Arm drivers
cpufreq: armada-8k: Avoid excessive stack usage
cpufreq: omap: Drop asm includes
cpufreq: qcom: Add explicit io.h include for readl/writel_relaxed
cpufreq: spear: Use of_property_for_each_u32() instead of open coding
cpufreq: Use of_property_present()
Merge tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge an amd-pstate driver update for 6.12 from Mario Limonciello:
"amd-pstate development for 6.12:
* Validate return of any attempt to update EPP limits, which fixes
the masking hardware problems."
* tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Catch failures for amd_pstate_epp_update_limit()
amd_pstate_set_epp() calls cppc_set_epp_perf() which can fail for
a variety of reasons but this is ignored. Change the return flow
to allow failures.
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
cpufreq_cpu_get() may return NULL. To avoid a NULL dereference, check its
return value and return early on error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Fix the reference counting of cpufreq_policy object in amd_pstate_update()
function by adding the missing cpufreq_cpu_put().
Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Commit bff7d13c19 ("cpufreq: amd-pstate: add debug message while
CPPC is supported and disabled by SBIOS") issues a warning on platforms
where X86_FEATURE_CPPC is expected to be enabled but is not, because
it has been disabled in the BIOS.
This feature bit corresponds to CPUID 0x80000008.ebx[27] which is a
reserved bit on the Zen1 processors and a reserved bit on Zen2 based
models 0x70-0x7F, and is expected to be cleared on these
platforms. Thus printing the warning message for these models when
X86_FEATURE_CPPC is unavailable is incorrect. Fix this.
Modify some of the comments, and use switch-case for model range
checking for improved readability while at it.
Fixes: bff7d13c19 ("cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS")
Cc: Xiaojian Du <xiaojian.du@amd.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/lkml/20240730140111.4491-1-00107082@163.com/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
After commit 63edbaa48a ("x86/cpu/topology: Add support for the
AMD 0x80000026 leaf"), the topology_logical_die_id() function returns
the logical Core Chiplet Die (CCD) ID instead of the logical socket
ID.
Since this is currently used to set MSR_AMD_CPPC_ENABLE, which needs
to be set on any one of the threads of the socket, it is prudent to
use topology_logical_package_id() in place of
topology_logical_die_id().
Fixes: 63edbaa48a ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
cc: stable@vger.kernel.org # 6.10
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Link: https://lore.kernel.org/lkml/20240801124509.3650-1-Dhananjay.Ugwekar@amd.com/
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Smatch complains that "ret" could be uninitialized:
drivers/cpufreq/amd-pstate.c:734 amd_pstate_cpu_boost_update()
error: uninitialized symbol 'ret'.
This seems like it probably is a real issue. Initialize "ret" to zero to
be safe.
Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/lkml/7ff53543-6c04-48a0-8d99-7dc010b93b3a@stanley.mountain/T/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The cpufreq core doesn't check the return value of the exit() callback
and there is not much the core can do on failures at that point. Just
drop the returned value and make it return void.
Signed-off-by: Lizhe <sensor1010@163.com>
[ Viresh: Reworked the patches to fix all missing changes together. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # Mediatek
Acked-by: Sudeep Holla <sudeep.holla@arm.com> # scpi, scmi, vexpress
Acked-by: Mario Limonciello <mario.limonciello@amd.com> # amd
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> # bmips
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com> # omap
On shared memory CPPC systems, with amd_pstate=active mode, the change
in scaling_max_freq doesn't get written to the shared memory
region. Due to this, the writes to the scaling_max_freq sysfs file
don't take effect. Fix this by propagating the scaling_max_freq
changes to the shared memory region.
Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Reported-by: David Arcari <darcari@redhat.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240702081413.5688-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
When Core Performance Boost is disabled by the user, the
CPPC_REQ.max_perf should not exceed the nominal_perf since by definition
the frequencies between nominal_perf and the highest_perf are in the
boost range. Fix this in amd_pstate_update().
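For illustration only, a minimal userspace sketch of the rule stated
above: with boost disabled, the requested max_perf is capped at
nominal_perf. The struct and function names are hypothetical and are not
the driver's actual code.

```c
#include <stdbool.h>

/* Hypothetical subset of the per-CPU perf levels. */
struct perf_caps {
	unsigned int nominal_perf;
	unsigned int highest_perf;
};

/*
 * Cap the requested max_perf at nominal_perf when boost is disabled,
 * since everything above nominal_perf lies in the boost range.
 */
static unsigned int limit_max_perf(const struct perf_caps *caps,
				   unsigned int max_perf, bool boost_enabled)
{
	if (!boost_enabled && max_perf > caps->nominal_perf)
		return caps->nominal_perf;
	return max_perf;
}
```

Requests at or below nominal_perf pass through unchanged, so only the
boost range is affected.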
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/66f55232be01092c423f0523f68b82b80c293943.1718988436.git.perry.yuan@amd.com
Link: https://lore.kernel.org/r/20240626042733.3747-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The Core Performance Boost (CPB) feature, when enabled in the BIOS,
allows the OS to control the highest performance for each individual
core. The active, passive and the guided modes of the amd-pstate driver
do support controlling the core frequency boost when this BIOS feature
is enabled. Additionally, the amd-pstate driver provides a sysfs
interface allowing the user to activate/deactivate this core performance
boost feature at runtime.
Add support for the set_boost callback in the active mode driver to
enable boost control via the cpufreq core. This ensures a consistent
boost control interface across all pstate modes, including passive
mode, guided mode, and active mode.
With this addition, all three pstate modes support the same boost
control interface and global CPB control. Global CPB control changes
the boost state of all cores simultaneously. Each CPU also supports
individual boost control, so specific CPUs can update their boost
states separately while all cores' boost states stay synchronized.
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20240626042733.3747-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
If driver registration fails then immediately return the failure
instead of continuing to register attributes.
This fixes issues of falling back from amd-pstate to other drivers
when cpufreq init has failed for any reason.
Reported-by: alex.s.cochran@proton.me
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/20240623200918.52104-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
When the scaling min/max freq values are set,
the value of policy->cur needs to be updated.
Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240227071133.3405003-1-li.meng@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
If the `amd-pstate` driver is not loaded automatically by default,
it is because the kernel command line parameter has not been added.
To resolve this issue, call the `amd_pstate_set_driver()` function to
enable the desired mode (passive/active/guided) before registering the
driver instance.
This ensures that the driver is loaded correctly without relying on the kernel
command line parameter.
When no parameter is added to the command line, the kernel config
provides the default mode to load.
A driver mode given on the command line overrides the kernel config
default option.
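For illustration only, a minimal sketch of the precedence described
above: a mode given on the command line wins, otherwise the kernel
config default is used. The enum and function names are hypothetical
and are not the driver's actual code.

```c
/* Hypothetical mode values; MODE_UNDEFINED means "not given on the command line". */
enum pstate_mode {
	MODE_UNDEFINED,
	MODE_PASSIVE,
	MODE_ACTIVE,
	MODE_GUIDED,
};

/* Pick the mode to register: the command line overrides the config default. */
static enum pstate_mode select_mode(enum pstate_mode cmdline_mode,
				    enum pstate_mode kconfig_default)
{
	if (cmdline_mode != MODE_UNDEFINED)
		return cmdline_mode;
	return kconfig_default;
}
```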
Reported-by: Andrei Amuraritei <andamu@posteo.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218705
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/83301c4cea4f92fb19e14b23f2bac7facfd8bdbb.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The amd-pstate-epp driver has been implemented and resolves the
performance drop issue seen in passive mode for shared memory type
CPPC systems. Users who enable the active mode driver will not
experience a performance drop compared to the passive mode driver.
Therefore, the EPP driver should be loaded by default for shared
memory type CPPC systems to get better performance.
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/c705507cf3ee790e544251cfd897ed11e8e57712.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Replace the usage of the deprecated boot_cpu_has() function with
the modern cpu_feature_enabled() function. The switch to cpu_feature_enabled()
ensures compatibility with the latest CPU feature detection mechanisms and
improves code maintainability.
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/f1567593ac5e1d38343067e9c681a8c4b0707038.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
If the CPPC feature is supported by the CPU but the CPUID flag bit is not
set by SBIOS, `amd_pstate` will fail to load during system boot.
Add a debug message to inform the user to check the SBIOS setting.
The message also helps maintainers debug why the amd_pstate driver failed
to load at boot when the processor supports CPPC.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/42c953616ac121bd1e5c329e83d015a02e6b32c7.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
To make driver loading failures caused by broken CPPC ACPI tables easier
to debug, move the verification of `min_freq`, `nominal_freq`, and other
dependent values to the `amd_pstate_init_freq()` function where they are
initialized.
If any of these values are incorrect, the `amd-pstate` driver will not be registered.
Ensuring that these values are correct before they are used facilitates
debugging driver loading failures due to faulty CPPC ACPI tables from the
BIOS.
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/f9793f8451c1832e34cc9dc35f89c653b39cfe38.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The EPP string for 'default' represents what the firmware had configured
as the default EPP value but once a user changes EPP to another string
they can't reset it back to 'default'.
Cache the firmware EPP value and allow the user to write 'default' using
this value.
Reported-by: Artem S. Tashkinov <aros@gmx.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931#c61
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
The nominal frequency in cpudata is maintained in MHz whereas all other
frequencies are in KHz. This means we have to convert nominal frequency
value to KHz before we do any interaction with other frequency values.
In amd_pstate_set_boost(), this conversion from MHz to KHz is missing.
Fix that.
Tested on an AMD Zen4 EPYC server.
Before:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151
409422
After:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151000
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151000
1799527
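For illustration only, a minimal userspace sketch of the unit handling
described above. It assumes that the nominal frequency becomes the
maximum when boost is disabled; the struct, the constant and the
function name are hypothetical and are not the driver's actual code.

```c
#include <stdbool.h>

#define KHZ_PER_MHZ 1000U /* hypothetical local constant */

/* Hypothetical subset of cpudata: nominal is kept in MHz, max in KHz. */
struct cpudata_freqs {
	unsigned int nominal_freq_mhz;
	unsigned int max_freq_khz;
};

/*
 * Return the maximum frequency in KHz. With boost disabled the nominal
 * frequency is used, and it must be converted from MHz to KHz first.
 */
static unsigned int scaling_max_khz(const struct cpudata_freqs *d,
				    bool boost_enabled)
{
	if (boost_enabled)
		return d->max_freq_khz;
	return d->nominal_freq_mhz * KHZ_PER_MHZ;
}
```

Omitting the multiplication would return the nominal value in MHz, which
is the kind of mismatch visible in the "Before" output.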
Fixes: ec437d71db ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When extra warnings are enabled, gcc points out a global variable
definition in a header:
In file included from drivers/cpufreq/amd-pstate-ut.c:29:
include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=]
123 | static const char * const amd_pstate_mode_string[] = {
| ^~~~~~~~~~~~~~~~~~~~~~
This header is only included from two files in the same directory,
and one of them uses only a single definition from it, so clean it
up by moving most of the contents into the driver that uses them,
and making shared bits a local header file.
Fixes: 36c5014e54 ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpudata memory from kzalloc() in amd_pstate_epp_cpu_init() is
not freed in the analogous exit function, so fix that.
Signed-off-by: Peng Ma <andypma@tencent.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To address the performance drop issue, an optimization has been
implemented. The cause was identified as an incorrect highest performance
value set by the low-level power firmware for AMD CPUs with Family ID 0x19
and Model IDs in the range 0x70 to 0x7F.
To resolve this, a check has been implemented to accurately determine the
CPU family and model ID. The correct highest performance value is now set
and the performance drop caused by the incorrect highest performance value
is eliminated.
Before the fix, the highest frequency was set to 4200MHz; now it is set
to 4971MHz, which is correct.
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  1    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  2    0      0    1 1:1:1:0          yes 4971.0000 400.0000 4865.8140
  3    0      0    1 1:1:1:0          yes 4971.0000 400.0000  400.0000
Fixes: f3a0523918 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218759
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Gaha Bana <gahabana@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the unused variable `lowest_nonlinear_freq` to fix a build warning.
This variable was defined and assigned a value in the previous code,
but it was not used in the subsequent code.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271038.em6nJjzy-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix some code format problems in the amd-pstate driver.
Changes Made:
- Fixed incorrect comment format in the functions.
- Removed unnecessary blank line.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271148.HK9yHBlB-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a quirks table to fix CPPC capabilities issues by providing
correct perf or frequency values while the driver is loading.
If CPPC capabilities are not defined in the ACPI tables, or are wrongly
defined by platform firmware, a quirk is needed to supply correct
workaround values so that the pstate driver can be loaded even though
there are CPPC capabilities errors.
The workaround matches broken BIOSes that lack the CPPC capabilities
nominal_freq and lowest_freq definitions in the ACPI table.
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make the pstate driver initially retrieve the P-state transition delay and
latency values from the BIOS ACPI tables, which have more reasonable
delay and latency values according to the platform design and
requirements.
Previously these values were hardcoded to specific values, which may
have conflicted with the platform and might not reflect the most
accurate or optimized setting for the processor.
[054h 0084 8] Preserve Mask : FFFFFFFF00000000
[05Ch 0092 8] Write Mask : 0000000000000001
[064h 0100 4] Command Latency : 00000FA0
[068h 0104 4] Maximum Access Rate : 0000EA60
[06Ch 0108 2] Minimum Turnaround Time : 0000
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The amd-pstate driver cannot work when the min_freq, nominal_freq or
the max_freq is zero. When this happens it is prudent to error out
early on rather than waiting to fail at the time of the governor
initialization.
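For illustration only, a minimal userspace sketch of the early check
described above. The struct, the function name and the choice of -EINVAL
as the error code are assumptions and are not the driver's actual code.

```c
#include <errno.h>

/* Hypothetical subset of the cached per-CPU frequencies. */
struct cpu_freqs {
	unsigned int min_freq;
	unsigned int nominal_freq;
	unsigned int max_freq;
};

/* Fail early if any frequency the driver depends on is zero. */
static int validate_freqs(const struct cpu_freqs *f)
{
	if (!f->min_freq || !f->nominal_freq || !f->max_freq)
		return -EINVAL; /* assumed error code */
	return 0;
}
```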
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.
There is no loss in readability in replacing their invocations by
accesses to the corresponding members of cpudata.
Do so and remove these helper functions.
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
helpers compute the values of min_freq, max_freq, nominal_freq and
lowest_nonlinear_freq respectively afresh from
cppc_get_perf_caps(). This is not necessary as there are fields in
cpudata to cache these values.
To simplify this, add a single helper function named
amd_pstate_init_freq() which computes all these frequencies at once, and
caches them in cpudata.
Use the cached values everywhere else in the code.
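For illustration only, a minimal userspace sketch of computing all
frequencies once and caching them. It assumes that frequency scales
linearly with perf relative to the nominal point and keeps everything in
KHz; the structs, field names and formula are hypothetical and are not
the driver's actual code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the capabilities read once from CPPC. */
struct perf_caps {
	uint32_t highest_perf;
	uint32_t nominal_perf;
	uint32_t lowest_nonlinear_perf;
	uint32_t lowest_freq_khz;
	uint32_t nominal_freq_khz;
};

/* Hypothetical cache for the derived frequencies, all in KHz. */
struct cpudata {
	uint32_t min_freq;
	uint32_t max_freq;
	uint32_t nominal_freq;
	uint32_t lowest_nonlinear_freq;
};

/* Compute every frequency in one pass and store it in the cache. */
static int init_freq(struct cpudata *d, const struct perf_caps *c)
{
	if (!c->nominal_perf)
		return -1; /* avoid dividing by zero on broken tables */

	d->min_freq = c->lowest_freq_khz;
	d->nominal_freq = c->nominal_freq_khz;
	/* Assumed linear scaling: freq = nominal_freq * perf / nominal_perf */
	d->max_freq = (uint32_t)((uint64_t)c->nominal_freq_khz *
				 c->highest_perf / c->nominal_perf);
	d->lowest_nonlinear_freq = (uint32_t)((uint64_t)c->nominal_freq_khz *
					      c->lowest_nonlinear_perf /
					      c->nominal_perf);
	return 0;
}
```

Callers then read the cached fields directly instead of querying the
capabilities again.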
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Co-developed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The min/max limit perf values calculated based on frequency
may fall outside the valid perf range between lowest perf and highest perf.
Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the function amd_pstate_adjust_perf(), the 'min_perf' variable is set
to 'highest_perf' instead of 'lowest_perf'.
Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Preferred core rankings can be changed dynamically by the
platform based on the workload and platform conditions and
accounting for thermals and aging.
When this occurs, the CPU priority needs to be set.
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The amd-pstate driver utilizes the functions and data structures
provided by the ITMT architecture to enable the scheduler to
favor scheduling on cores which can get a higher frequency
with lower voltage. We call it amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The amd-pstate driver uses the highest performance value to indicate
the priority of a CPU. A higher value means a higher priority.
The initial core rankings are set up by amd-pstate when the
system boots.
Add a variable hw_prefcore in the cpudata structure. It records
whether the processor and power firmware support the preferred core
feature.
Add one new early parameter `disable` to allow the user to disable
the preferred core.
Only when the hardware supports preferred core and the user sets `enabled`
in the early parameter does the amd-pstate driver support the preferred
core feature.
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Scaling min/max freq values were being cached and lagged one setting behind
each time. Fix the ordering of the clamp call to ensure they work.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Fixes: febab20cae ("cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wkarny@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When amd_pstate is running, writing to scaling_min_freq and
scaling_max_freq has no effect. These values are only passed to the
policy level, but not to the platform level. This means that the
platform does not know about the frequency limits set by the user.
To fix this, update the min_perf and max_perf values at the platform
level whenever the user changes the scaling_min_freq and scaling_max_freq
values.
Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq_driver->fast_switch() callback expects a frequency as a return
value. amd_pstate_fast_switch() was returning the return value of
amd_pstate_update_freq(), which only indicates a success or failure.
Fix this by making amd_pstate_fast_switch() return the target_freq
when the call to amd_pstate_update_freq() is successful, and return
the current frequency from policy->cur when the call to
amd_pstate_update_freq() is unsuccessful.
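For illustration only, a minimal userspace sketch of the return
convention described above. The struct, the callback type and the two
stubs are hypothetical stand-ins and are not the driver's actual code.

```c
/* Hypothetical subset of the cpufreq policy. */
struct policy {
	unsigned int cur; /* current frequency in KHz */
};

/* Hypothetical stand-in for amd_pstate_update_freq(): 0 on success. */
typedef int (*update_freq_fn)(struct policy *policy, unsigned int target_freq);

/*
 * Return a frequency rather than a status code: the target on success,
 * the unchanged current frequency on failure.
 */
static unsigned int fast_switch(struct policy *policy,
				unsigned int target_freq,
				update_freq_fn update_freq)
{
	if (!update_freq(policy, target_freq))
		return target_freq;
	return policy->cur;
}

/* Demo stubs that model a successful and a failed update. */
static int update_ok(struct policy *policy, unsigned int target_freq)
{
	(void)policy;
	(void)target_freq;
	return 0;
}

static int update_fail(struct policy *policy, unsigned int target_freq)
{
	(void)policy;
	(void)target_freq;
	return -1;
}
```

Passing update_fail makes fast_switch() fall back to policy->cur, so the
caller never sees a status code where a frequency is expected.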
Fixes: 4badf2eb1e ("cpufreq: amd-pstate: Add ->fast_switch() callback")
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Cc: 6.4+ <stable@vger.kernel.org> # v6.4+
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In commit 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
the "amd_pstate" attributes were moved from a dedicated kobject to the
cpu root kobject.
While the dedicated kobject expects to contain kobj_attributes the root
kobject needs device_attributes.
As the changed arguments are not used by the callbacks it works most of
the time.
However CFI will detect this issue:
[ 4947.849350] CFI failure at dev_attr_show+0x24/0x60 (target: show_status+0x0/0x70; expected type: 0x8651b1de)
...
[ 4947.849409] Call Trace:
[ 4947.849410] <TASK>
[ 4947.849411] ? __warn+0xcf/0x1c0
[ 4947.849414] ? dev_attr_show+0x24/0x60
[ 4947.849415] ? report_cfi_failure+0x4e/0x60
[ 4947.849417] ? handle_cfi_failure+0x14c/0x1d0
[ 4947.849419] ? __cfi_show_status+0x10/0x10
[ 4947.849420] ? handle_bug+0x4f/0x90
[ 4947.849421] ? exc_invalid_op+0x1a/0x60
[ 4947.849422] ? asm_exc_invalid_op+0x1a/0x20
[ 4947.849424] ? __cfi_show_status+0x10/0x10
[ 4947.849425] ? dev_attr_show+0x24/0x60
[ 4947.849426] sysfs_kf_seq_show+0xa6/0x110
[ 4947.849433] seq_read_iter+0x16c/0x4b0
[ 4947.849436] vfs_read+0x272/0x2d0
[ 4947.849438] ksys_read+0x72/0xe0
[ 4947.849439] do_syscall_64+0x76/0xb0
[ 4947.849440] ? do_user_addr_fault+0x252/0x650
[ 4947.849442] ? exc_page_fault+0x7a/0x1b0
[ 4947.849443] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fixes: 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
Reported-by: Jannik Glückert <jannik.glueckert@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217765
Link: https://lore.kernel.org/lkml/c7f1bf9b-b183-bf6e-1cbb-d43f72494083@gmail.com/
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>