Commit Graph

197 Commits

Author SHA1 Message Date
Dhananjay Ugwekar
db1cafc77a cpufreq: amd-pstate: Remove unnecessary driver_lock in set_boost
set_boost is a per-policy function call, hence a driver wide lock is
unnecessary. Also this mutex_acquire can collide with the mutex_acquire
from the mode-switch path in status_store(), which can lead to a
deadlock. So, remove it.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-06 09:04:30 +05:30
Dhananjay Ugwekar
55db9b73c3 cpufreq/amd-pstate: Fix max_perf updation with schedutil
In adjust_perf() callback, we are setting the max_perf to highest_perf,
as opposed to the correct limit value i.e. max_limit_perf. Fix that.

Fixes: 3f7b835fa4 ("cpufreq/amd-pstate: Move limit updating code")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-3-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Dhananjay Ugwekar
d364eee14c cpufreq/amd-pstate: Remove the goto label in amd_pstate_update_limits
Scope based guard/cleanup macros should not be used together with goto
labels. Hence, remove the goto label.

Fixes: 6c093d5a5b ("cpufreq/amd-pstate: convert mutex use to guard()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-2-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Lifeng Zheng
fa803513ab cpufreq/amd-pstate: Fix per-policy boost flag incorrect when fail
Commit c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision
boost state") sets the per-policy boost flag to false when setting boost
fails. However, store_local_boost() and cpufreq_boost_trigger_state() in
cpufreq.c then set this flag to its reverse value, so the per-policy
boost flag ends up true when setting boost fails. Remove the extra
assignment in amd_pstate_set_boost() and keep all operations on the
per-policy boost flag outside of set_boost() to fix this problem.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250110091949.3610770-1-zhenglifeng1@huawei.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-03 00:04:23 -06:00
Naresh Solanki
857a61c2ce cpufreq/amd-pstate: Refactor max frequency calculation
The previous approach introduced roundoff errors during division when
calculating the boost ratio. This, in turn, affected the maximum
frequency calculation, often resulting in reporting lower frequency
values.

For example, on the Glinda SoC based board with the following
parameters:

max_perf = 208
nominal_perf = 100
nominal_freq = 2600 MHz

The Linux kernel previously calculated the frequency as:
freq = ((max_perf * 1024 / nominal_perf) * nominal_freq) / 1024
freq = 5405 MHz  // Integer arithmetic.

With the updated formula:
freq = (max_perf * nominal_freq) / nominal_perf
freq = 5408 MHz

This change ensures more accurate frequency calculations by eliminating
unnecessary shifts and divisions, thereby improving precision.

Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com>
[ML: trim the changelog from commit message]
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241219201833.2750998-1-naresh.solanki@9elements.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 23:44:07 -06:00
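
The arithmetic in commit 857a61c2ce can be reproduced in a few lines of C. This is a minimal sketch, not the driver code: the function names max_freq_old_mhz() and max_freq_new_mhz() are hypothetical, and only the two formulas and the Glinda example values come from the commit message.

```c
#include <stdint.h>

/* Hypothetical helper: the older two-step form. The boost ratio is
 * truncated by the first integer division before it is multiplied
 * by the nominal frequency, so the rounding error is amplified. */
static uint32_t max_freq_old_mhz(uint32_t max_perf, uint32_t nominal_perf,
                                 uint32_t nominal_freq_mhz)
{
    uint32_t boost_ratio = (max_perf * 1024u) / nominal_perf;

    return (boost_ratio * nominal_freq_mhz) / 1024u;
}

/* Hypothetical helper: the updated form. Multiply first, divide once. */
static uint32_t max_freq_new_mhz(uint32_t max_perf, uint32_t nominal_perf,
                                 uint32_t nominal_freq_mhz)
{
    return (uint32_t)(((uint64_t)max_perf * nominal_freq_mhz) / nominal_perf);
}
```

With max_perf = 208, nominal_perf = 100 and nominal_freq = 2600 MHz, the old form yields 5405 and the new form yields 5408, matching the numbers quoted in the commit message.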
Mario Limonciello
fd604ae6c2 cpufreq/amd-pstate: Fix prefcore rankings
commit 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as
highest perf again") updated the value stored for highest perf to no longer
store the highest perf value but instead the boost numerator.

This is a fixed value for systems with preferred cores and not appropriate
for use as ITMT rankings. Update the value used for ITMT rankings to be the
preferred core ranking.

Reported-and-tested-by: Sebastian <sobrus@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219640
Fixes: 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as highest perf again")
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20250102141204.3413202-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 10:27:32 -06:00
Mario Limonciello
95fad7fb58 cpufreq/amd-pstate: Drop boost_state variable
Currently boost_state is cached for every processor in cpudata structure
and driver boost state is set for every processor.

Neither of these is necessary, as the driver only needs to set it once and
the policy stores whether boost is enabled.

Move the driver boost setting to registration and adjust all references
to cached value to pull from the policy instead.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-16-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
f9a378ff64 cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen
For Ryzen systems the EPP policy set by the BIOS is generally configured
to performance as this is the default register value for the CPPC request
MSR.

If a user doesn't use additional software to configure EPP then the system
will default to being biased towards performance and consume extra battery.
Instead configure the default to "balanced_performance" for this case.

Suggested-by: Artem S. Tashkinov <aros@gmx.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219526
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-15-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
f8fde687c9 cpufreq/amd-pstate: Drop ret variable from amd_pstate_set_energy_pref_index()
The ret variable is not necessary.

Reviewed-and-tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-14-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
fff3957969 cpufreq/amd-pstate: Always write EPP value when updating perf
For MSR systems the EPP value is in the same register as perf targets
and so dividing them into two separate MSR writes is wasteful.

In msr_update_perf(), update both EPP and perf values in one write to
MSR_AMD_CPPC_REQ, and cache them if successful.

To accomplish this plumb the EPP value into the update_perf call and
modify all its callers to check the return value.

As this unifies calls, ensure that the MSR write is necessary before
flushing a write out. Also drop the comparison from the passive flow
tracing.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-13-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
b3781f30bf cpufreq/amd-pstate: Cache EPP value and use that everywhere
Cache the value in cpudata->epp_cached, and use that for all callers.
As all callers use cached value merge amd_pstate_get_energy_pref_index()
into show_energy_performance_preference().

Check if the EPP value is changed before writing it to MSR or
shared memory region.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-12-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
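
The caching behaviour described in commit b3781f30bf, combined with the "only update the cache on success" rule from commit 474e7218e8, might look roughly like the sketch below. Everything here is an assumption for illustration: struct sk_cpudata, sk_hw_write_epp() and sk_set_epp() are hypothetical stand-ins, and the real driver writes an MSR or a shared memory region rather than bumping a counter.

```c
#include <stdint.h>

/* Hypothetical per-CPU state; only epp_cached mirrors a name from the log. */
struct sk_cpudata {
    uint8_t epp_cached;   /* last EPP value known to be in hardware */
    unsigned int writes;  /* counts simulated hardware writes */
    int fail_next;        /* when non-zero, the simulated write fails */
};

/* Hypothetical stand-in for the MSR or shared-memory write. */
static int sk_hw_write_epp(struct sk_cpudata *d, uint8_t epp)
{
    (void)epp;
    if (d->fail_next)
        return -5;        /* arbitrary negative error code */
    d->writes++;
    return 0;
}

static int sk_set_epp(struct sk_cpudata *d, uint8_t epp)
{
    int ret;

    /* Skip the write entirely when the value has not changed. */
    if (epp == d->epp_cached)
        return 0;

    ret = sk_hw_write_epp(d, epp);
    if (ret)
        return ret;       /* leave the cache untouched on failure */

    d->epp_cached = epp;
    return 0;
}
```

Setting the same EPP value twice results in a single simulated write, and a failed write leaves epp_cached at its previous value.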
Mario Limonciello
3f7b835fa4 cpufreq/amd-pstate: Move limit updating code
The limit updating code in amd_pstate_epp_update_limit() should not
only apply to EPP updates.  Move it to amd_pstate_update_min_max_limit()
so other callers can benefit as well.

With this move it's not necessary to have clamp_t calls anymore because
the verify callback is called when setting limits.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-11-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
942718f2a2 cpufreq/amd-pstate: Change amd_pstate_update_perf() to return an int
As msr_update_perf() calls an MSR it's possible that it fails. Pass
this return code up to the caller.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-10-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
68cb0e77b6 cpufreq/amd-pstate: store all values in cpudata struct in khz
Storing values in the cpudata structure in different units leads
to confusion and hardcoded conversions elsewhere.  After ratios are
calculated store everything in khz for any future use. Adjust all
relevant consumers for this change as well.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-9-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
474e7218e8 cpufreq/amd-pstate: Only update the cached value in msr_set_epp() on success
If writing the MSR MSR_AMD_CPPC_REQ fails then the cached value in the
amd_cpudata structure should not be updated.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-8-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello
88a95ba066 cpufreq/amd-pstate: Use FIELD_PREP and FIELD_GET macros
The FIELD_PREP and FIELD_GET macros improve readability and help
to avoid shifting bugs.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-7-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
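
To illustrate why mask-based field helpers help avoid shifting bugs, here is a user-space sketch. The SK_* macros are simplified stand-ins for the kernel's GENMASK(), FIELD_PREP() and FIELD_GET(), which additionally perform compile-time checks. The field layout (max perf in bits 7:0, min perf in 15:8, desired perf in 23:16, EPP in 31:24) is an assumption made for this example, as is the function name sk_pack_req().

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel helpers (32-bit only). */
#define SK_GENMASK(h, l)        ((~0u >> (31 - (h))) & (~0u << (l)))
#define SK_LOW_BIT(mask)        ((mask) & (~(mask) + 1u))
#define SK_FIELD_PREP(mask, v)  (((uint32_t)(v) * SK_LOW_BIT(mask)) & (mask))
#define SK_FIELD_GET(mask, reg) (((reg) & (mask)) / SK_LOW_BIT(mask))

/* Assumed layout of the request register, for illustration only. */
#define SK_MAX_PERF_MASK  SK_GENMASK(7, 0)
#define SK_MIN_PERF_MASK  SK_GENMASK(15, 8)
#define SK_DES_PERF_MASK  SK_GENMASK(23, 16)
#define SK_EPP_MASK       SK_GENMASK(31, 24)

/* Build one register value from its fields; no shift counts appear here. */
static uint32_t sk_pack_req(uint8_t max_perf, uint8_t min_perf,
                            uint8_t des_perf, uint8_t epp)
{
    return SK_FIELD_PREP(SK_MAX_PERF_MASK, max_perf) |
           SK_FIELD_PREP(SK_MIN_PERF_MASK, min_perf) |
           SK_FIELD_PREP(SK_DES_PERF_MASK, des_perf) |
           SK_FIELD_PREP(SK_EPP_MASK, epp);
}
```

Because each field is described once by its mask, the position of a field cannot drift between the code that writes it and the code that reads it back with SK_FIELD_GET().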
Mario Limonciello
3b43739824 cpufreq/amd-pstate: Drop cached epp_policy variable
epp_policy is not used by any of the current code and there
is no need to cache it.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-6-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Mario Limonciello
6c093d5a5b cpufreq/amd-pstate: convert mutex use to guard()
Using scoped guard declaration will unlock mutexes automatically.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-5-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
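
The idea behind a scoped guard can be approximated in user space with the GCC/Clang cleanup attribute. This is only an analogy to the kernel's guard() macro: struct sk_lock, SK_GUARD() and sk_update_limits() are hypothetical names, and the "lock" is a plain counter so that the effect is easy to observe.

```c
/* Hypothetical lock: a counter that is non-zero while the lock is held. */
struct sk_lock {
    int held;
};

static void sk_lock_acquire(struct sk_lock *l)
{
    l->held++;
}

/* Cleanup handlers receive a pointer to the guarded variable. */
static void sk_lock_release(struct sk_lock **lp)
{
    (*lp)->held--;
}

/* Take the lock now and release it when the enclosing scope ends. */
#define SK_GUARD(l) \
    struct sk_lock *sk_guard_ \
        __attribute__((cleanup(sk_lock_release), unused)) = \
        (sk_lock_acquire(l), (l))

static struct sk_lock sk_limits_lock;

static int sk_update_limits(int cur_high, int prev_high, int *stored)
{
    SK_GUARD(&sk_limits_lock);

    /* Early return instead of "goto unlock": the guard releases the lock. */
    if (cur_high == prev_high)
        return 0;

    *stored = cur_high;
    return 1;
}
```

Both return paths leave the counter at zero, which is the property that makes explicit unlock labels unnecessary once a guard is in place.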
Mario Limonciello
4dcd130151 cpufreq/amd-pstate: Add trace event for EPP perf updates
In "active" mode the most important thing for debugging whether
an issue is hardware or software based is to look at what was the
last thing written to the CPPC request MSR or shared memory region.

The 'amd_pstate_epp_perf' trace event shows the values being written
for all CPUs.

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar
53ec2101df cpufreq/amd-pstate: Merge amd_pstate_epp_cpu_offline() and amd_pstate_epp_offline()
amd_pstate_epp_offline() is only called from within
amd_pstate_epp_cpu_offline() and it doesn't make much sense to have it at
all. Hence, remove it.

Also remove the unnecessary debug print in the offline path while at it.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-6-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar
b78f8c87ec cpufreq/amd-pstate: Remove the cppc_state check in offline/online functions
Only amd_pstate_epp driver (i.e. cppc_state = ACTIVE) enters the
amd_pstate_epp_offline() and amd_pstate_epp_cpu_online() functions,
so remove the unnecessary if condition checking if cppc_state is
equal to AMD_PSTATE_ACTIVE.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar
b1089e0c88 cpufreq/amd-pstate: Refactor amd_pstate_epp_reenable() and amd_pstate_epp_offline()
Replace similar code chunks with amd_pstate_update_perf() and
amd_pstate_set_epp() function calls.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-4-Dhananjay.Ugwekar@amd.com
[ML: Fix LKP reported error about unused variable]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar
57a2b25e45 cpufreq/amd-pstate: Move the invocation of amd_pstate_update_perf()
amd_pstate_update_perf() should not be a part of shmem_set_epp() function,
so move it to the amd_pstate_epp_update_limit() function, where it is needed.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar
16c977f817 cpufreq/amd-pstate: Convert the amd_pstate_get/set_epp() to static calls
MSR and shared memory based systems have different mechanisms to get and
set the epp value. Split those mechanisms into different functions and
assign them appropriately to the static calls at boot time. This eliminates
the need for the "if(cpu_feature_enabled(X86_FEATURE_CPPC))" checks at
runtime.

Also, propagate the error code from rdmsrl_on_cpu() and cppc_get_epp_perf()
to *_get_epp()'s caller, instead of returning -EIO unconditionally.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
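
The boot-time selection described in commit 16c977f817 can be pictured with an ordinary function pointer. This is a loose analogy only: the kernel's static calls patch the call sites rather than going through a pointer, and every name below (sk_get_epp, sk_msr_get_epp(), sk_shmem_get_epp(), sk_init()) as well as the returned values are hypothetical placeholders.

```c
#include <stdbool.h>

typedef int (*sk_get_epp_fn)(void);

/* Placeholder for the MSR-based read path. */
static int sk_msr_get_epp(void)
{
    return 0x80;
}

/* Placeholder for the shared-memory read path. */
static int sk_shmem_get_epp(void)
{
    return 0xff;
}

/* Selected once at init; callers never test the CPU feature again. */
static sk_get_epp_fn sk_get_epp = sk_msr_get_epp;

static void sk_init(bool has_cppc_msr)
{
    sk_get_epp = has_cppc_msr ? sk_msr_get_epp : sk_shmem_get_epp;
}
```

After sk_init() runs, every caller simply invokes sk_get_epp() and receives the result, including any error code, from whichever backend was selected.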
Mario Limonciello
2993b29b2a cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies
commit 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits
when boost is disabled") introduced different semantics for min/max limits
based upon whether the user turned off boost from sysfs.

This however is not necessary when the highest perf value is the boost
numerator.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
Link: https://lore.kernel.org/r/20241209185248.16301-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10 10:17:43 -06:00
Mario Limonciello
50a062a762 cpufreq/amd-pstate: Store the boost numerator as highest perf again
commit ad4caad58d ("cpufreq: amd-pstate: Merge
amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
changed the semantics for highest perf and commit 18d9b52271
("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
worked around those semantic changes.

This however is a confusing result and furthermore makes it awkward to
change frequency limits and boost due to the scaling differences. Restore
the boost numerator to highest perf again.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Link: https://lore.kernel.org/r/20241209185248.16301-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10 10:17:43 -06:00
K Prateek Nayak
919bfa9b2d cpufreq/amd-pstate: Detect preferred core support before driver registration
Booting with amd-pstate on a 3rd Generation EPYC system incorrectly
enabled ITMT support despite the system not supporting Preferred Core
ranking. amd_pstate_init_prefcore(), called during amd_pstate*_cpu_init(),
requires "amd_pstate_prefcore" to be set correctly. However, the preferred
core support is detected only after driver registration, which is too
late.

Swap the function calls around to detect preferred core support before
registering the driver via amd_pstate_register_driver(). This ensures
amd_pstate*_cpu_init() sees the correct value of "amd_pstate_prefcore"
considering the platform support.

Fixes: 279f838a61 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Fixes: ff2653ded4 ("cpufreq/amd-pstate: Move registration after static function call update")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241210032557.754-1-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-09 21:57:34 -06:00
Linus Torvalds
d8d78a90e7 Merge tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpuid updates from Borislav Petkov:

 - Add a feature flag which denotes AMD CPUs supporting workload
   classification with the purpose of using such hints when making
   scheduling decisions

 - Determine the boost numerator for each AMD core based on its type:
   efficiency or performance, in the cppc driver

 - Add the type of a CPU to the topology CPU descriptor with the goal of
   supporting and making decisions based on the type of the respective
   core

 - Add a feature flag to denote AMD cores which have heterogeneous
   topology and enable SD_ASYM_PACKING for those

 - Check microcode revisions before disabling PCID on Intel

 - Cleanups and fixlets

* tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu()
  x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM()
  x86/cpu: Fix formatting of cpuid_bits[] in scattered.c
  x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit
  x86/amd: Use heterogeneous core topology for identifying boost numerator
  x86/cpu: Add CPU type to struct cpuinfo_topology
  x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD
  x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
  x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
  x86/mm: Don't disable PCID when INVLPG has been fixed by microcode
2024-11-19 12:27:19 -08:00
Mario Limonciello
ff2653ded4 cpufreq/amd-pstate: Move registration after static function call update
On shared memory designs the static functions need to work before
registration is done or the system can hang at bootup.

Move the registration later in amd_pstate_init() to solve this.

Fixes: b427ac4084 ("cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call")
Reported-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/linux-pm/cf9c146d-bacf-444e-92e2-15ebf513af96@gmail.com/#t
Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-29 08:50:39 -05:00
Mario Limonciello
3ac757e8db cpufreq/amd-pstate: Push adjust_perf vfunc init into cpu_init
As the driver can be changed in and out of different modes it's possible
that adjust_perf is assigned when it shouldn't be.

This could happen if an MSR design is started up in passive mode and then
switches to active mode.

To solve this, explicitly clear `adjust_perf` in amd_pstate_epp_cpu_init().

Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-29 08:31:57 -05:00
Dhananjay Ugwekar
a6960e6b1b cpufreq/amd-pstate: Align offline flow of shared memory and MSR based systems
Set min_perf to lowest_perf for shared memory systems, similar to the MSR
based systems.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar
796ff50e12 cpufreq/amd-pstate: Call cppc_set_epp_perf in the reenable function
The EPP value being set in perf_ctrls.energy_perf is not being propagated
to the shared memory, fix that.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar
73070a9169 cpufreq/amd-pstate: Do not attempt to clear MSR_AMD_CPPC_ENABLE
MSR_AMD_CPPC_ENABLE is a write-once register, i.e. attempting to clear
it is futile because it will not take effect. Hence, return if the disable
(0) argument is passed to msr_cppc_enable().

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar
7fb463aac8 cpufreq/amd-pstate: Rename functions that enable CPPC
Explicitly rename functions that enable CPPC as *_cppc_*.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Swapnil Sapkal
54ab7d7c59 amd-pstate: Switch to amd-pstate by default on some Server platforms
Currently the default cpufreq driver for all the AMD EPYC servers is
acpi-cpufreq. Going forward, switch to amd-pstate as the default
driver on the AMD EPYC server platforms with CPU family 0x1A or
higher. The default mode will be active mode.

Testing shows that amd-pstate with active mode and performance
governor provides comparable or better performance per-watt against
acpi-cpufreq + performance governor.

Likewise, amd-pstate with active mode and powersave governor with the
energy_performance_preference=power (EPP=255) provides comparable or
better performance per-watt against acpi-cpufreq + schedutil governor
for a wide range of workloads.

Users can still revert to using acpi-cpufreq driver on these platforms
with the "amd_pstate=disable" kernel commandline parameter.

Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-3-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Gautham R. Shenoy
0c411b39e4 amd-pstate: Set min_perf to nominal_perf for active mode performance gov
The amd-pstate driver sets CPPC_REQ.min_perf to CPPC_REQ.max_perf when
in active mode with performance governor. Typically CPPC_REQ.max_perf
is set to CPPC.highest_perf. This causes frequency throttling on
power-limited platforms which causes performance regressions on
certain classes of workloads.

Hence, set the CPPC_REQ.min_perf to the CPPC.nominal_perf or
CPPC_REQ.max_perf, whichever is lower of the two.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-2-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
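
The rule introduced by commit 0c411b39e4 reduces to taking the lower of two values. The sketch below assumes 8-bit perf values and uses a hypothetical function name, sk_perf_gov_min_perf(); it is not the driver's code.

```c
#include <stdint.h>

/* min_perf for active mode with the performance governor:
 * the lower of nominal_perf and the requested max_perf. */
static uint8_t sk_perf_gov_min_perf(uint8_t nominal_perf, uint8_t max_perf)
{
    return nominal_perf < max_perf ? nominal_perf : max_perf;
}
```

With nominal_perf = 100 and max_perf = 208 the result is 100, so the floor no longer sits at the boost level. If a policy limit pushes max_perf down to 80, the result follows it to 80.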
Dhananjay Ugwekar
b427ac4084 cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call
amd_pstate_set_driver() is called twice, once in amd_pstate_init() and once
as part of amd_pstate_register_driver(). Move around code and eliminate
the redundancy.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar
162cfa4eba cpufreq/amd-pstate: Remove the switch case in amd_pstate_init()
Replace the switch case with a more readable if condition.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar
e3591eebec cpufreq/amd-pstate: Call amd_pstate_set_driver() in amd_pstate_register_driver()
Replace a similar chunk of code in amd_pstate_register_driver() with
amd_pstate_set_driver() call.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar
6f241fa50a cpufreq/amd-pstate: Call amd_pstate_register() in amd_pstate_init()
Replace a similar chunk of code in amd_pstate_init() with
amd_pstate_register() call.

Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar
5d9a354cf8 cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
According to the AMD architectural programmer's manual volume 2 [1], in
section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
as "Reports the most energy efficient performance level (in terms of
performance per watt). Above this threshold, lower performance levels
generally result in increased energy efficiency. Reducing performance
below this threshold does not result in total energy savings for a given
computation, although it reduces instantaneous power consumption". So
lowest_nonlinear_perf is the most power efficient performance level, and
going below that would lead to a worse performance/watt.

Also, setting the minimum frequency to lowest_nonlinear_freq (instead of
lowest_freq) allows the CPU to idle at a higher frequency which leads
to more time being spent in a deeper idle state (as trivial idle tasks
are completed sooner). This has shown a power benefit in some systems;
in other systems, power consumption has increased but so has the
throughput/watt.

Modify the initial policy_data->min set by cpufreq-core to
lowest_nonlinear_freq, in the ->verify() callback. Also set the
cpudata->req[0] to FREQ_QOS_MIN_DEFAULT_VALUE (i.e. 0), so that it also
gets overridden by the check in verify function.

Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
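
One way to read the verify() change in commit 5d9a354cf8 is sketched below. It assumes that a minimum of 0 (the FREQ_QOS_MIN_DEFAULT_VALUE mentioned in the message) means "no explicit request"; struct sk_policy_data, sk_verify() and the clamp against the maximum are hypothetical simplifications rather than the actual callback.

```c
#include <stdint.h>

#define SK_FREQ_QOS_MIN_DEFAULT 0u  /* mirrors the "i.e. 0" in the message */

/* Hypothetical, reduced view of the policy limits (in kHz). */
struct sk_policy_data {
    uint32_t min_khz;
    uint32_t max_khz;
};

static void sk_verify(struct sk_policy_data *pd, uint32_t lowest_nonlinear_khz)
{
    /* Only replace the minimum when nothing explicit was requested. */
    if (pd->min_khz == SK_FREQ_QOS_MIN_DEFAULT)
        pd->min_khz = lowest_nonlinear_khz;

    /* Assumed safety clamp: never let the minimum exceed the maximum. */
    if (pd->min_khz > pd->max_khz)
        pd->min_khz = pd->max_khz;
}
```

A minimum that was set explicitly is left alone, which keeps a user-requested lower value in effect.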
Dhananjay Ugwekar
205cb215d0 cpufreq/amd-pstate: Remove the redundant verify() function
Merge the two verify() callback functions and rename the
cpufreq_policy_data argument for better readability.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello
508239724b cpufreq/amd-pstate: Drop needless EPP initialization
The EPP value doesn't need to be cached to the CPPC request in
amd_pstate_epp_update_limit() because it's passed as an argument
at the end to amd_pstate_set_epp() and stored at that time.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello
047a2d0c83 cpufreq/amd-pstate: Use amd_pstate_update_min_max_limit() for EPP limits
When the EPP updates are set, the maximum capable frequency for the
CPU is used to set the upper limit instead of that of the policy.

Adjust amd_pstate_epp_update_limit() to reuse policy calculation code
from amd_pstate_update_min_max_limit().

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello
67c08d303e cpufreq/amd-pstate: Don't update CPPC request in amd_pstate_cpu_boost_update()
When boost is changed the CPPC value is changed in amd_pstate_cpu_boost_update()
but then changed again when refresh_frequency_limits() and all its callbacks
occur.  The first is a pointless write, so instead just update the limits for
the policy and let the policy refresh anchor everything properly.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello
7820e8050d cpufreq/amd-pstate: Fix non kerneldoc comment
The comment for amd_cppc_supported() isn't meant to be kernel doc.

Fixes: cb817ec667 ("cpufreq: amd-pstate: show CPPC debug message if CPPC is not supported")
Link: https://lore.kernel.org/r/20240905162351.1345560-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar
1bfe6a54d2 cpufreq/amd-pstate: Rename MSR and shared memory specific functions
Existing function names "cppc_*" and "pstate_*" for shared memory and
MSR based systems are not intuitive enough, replace them with "shmem_*" and
"msr_*" respectively.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240917091434.10685-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello
104edc6efc x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
This feature is an AMD unique feature of some processors, so put
AMD into the name.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241025171459.1093-2-mario.limonciello@amd.com
2024-10-25 20:09:16 +02:00
Mario Limonciello
18d9b52271 cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
When boost has been disabled the limit for perf should be nominal perf not
the highest perf.  Using the latter to do calculations will lead to
incorrect values that are still above nominal.
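
The idea can be sketched in user-space C as follows. This is a simplified illustration, not the driver code: the struct layout, the helper names and the linear frequency-to-perf scaling are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU capability record. */
struct perf_caps {
	uint8_t highest_perf;	/* perf level at max_freq */
	uint8_t nominal_perf;	/* perf level at nominal_freq */
	uint32_t max_freq;	/* kHz, boost frequency */
	uint32_t nominal_freq;	/* kHz, non-boost frequency */
	bool boost_enabled;
};

/* With boost disabled, nominal perf is the ceiling, not highest perf. */
static uint8_t perf_ceiling(const struct perf_caps *c)
{
	return c->boost_enabled ? c->highest_perf : c->nominal_perf;
}

/* Frequency that corresponds to the perf ceiling chosen above. */
static uint32_t freq_ceiling(const struct perf_caps *c)
{
	return c->boost_enabled ? c->max_freq : c->nominal_freq;
}

/* Scale a policy frequency limit (kHz) into perf units under the ceiling. */
static uint8_t limit_perf_from_freq(const struct perf_caps *c,
				    uint32_t policy_freq_khz)
{
	uint8_t ceiling = perf_ceiling(c);
	uint32_t ceiling_freq = freq_ceiling(c);
	uint64_t perf;

	assert(ceiling_freq > 0);
	perf = (uint64_t)policy_freq_khz * ceiling / ceiling_freq;
	return perf > ceiling ? ceiling : (uint8_t)perf;
}
```

With boost disabled, even a stale policy limit above the nominal frequency yields a value no higher than nominal_perf.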

Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Reported-by: Peter Jung <ptr1337@cachyos.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219348
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-15 23:54:15 -05:00
Dhananjay Ugwekar
c10e50a469 cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems
While switching the driver mode between active and passive, Collaborative
Processor Performance Control (CPPC) is disabled in
amd_pstate_unregister_driver(). But, it is not enabled back while registering
the new driver (passive or active). This leads to the new driver mode not
working correctly, so enable it back in amd_pstate_register_driver().

Fixes: 3ca7bc818d ("cpufreq: amd-pstate: Add guided mode control support via sysfs")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241004122303.94283-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-07 11:32:05 -05:00
Rafael J. Wysocki
9bcf30348f second round of amd-pstate changes for 6.12 (second try):
* Move the calculation of the AMD boost numerator outside of
   amd-pstate, correcting acpi-cpufreq on systems with preferred cores
 * Harden preferred core detection to avoid potential false positives
 * Add extra unit test coverage for mode state machine

Merge tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge the second round of amd-pstate changes for 6.12 from Mario
Limonciello:

"* Move the calculation of the AMD boost numerator outside of
   amd-pstate, correcting acpi-cpufreq on systems with preferred cores
 * Harden preferred core detection to avoid potential false positives
 * Add extra unit test coverage for mode state machine"

* tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
  cpufreq/amd-pstate-ut: Add test case for mode switches
  cpufreq/amd-pstate: Export symbols for changing modes
  amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
  cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
  cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
  cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
  x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
  x86/amd: Move amd_get_highest_perf() out of amd-pstate
  ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
  ACPI: CPPC: Drop check for non zero perf ratio
  x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
  ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
  x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
2024-09-11 18:22:23 +02:00
Mario Limonciello
8d916815b0 cpufreq/amd-pstate: Export symbols for changing modes
In order to effectively test all mode switch combinations, export
everything necessary for amd-pstate-ut to trigger a mode switch.

Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello
45722e777f cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
Don't take and release the mutex when prefcore isn't present and
avoid initialization of variables that will be initially set
in the function.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello
ad4caad58d cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
The special case in amd_pstate_highest_perf_set() is the value used
for calculating the boost numerator.  Merge this into
amd_get_boost_ratio_numerator() and then use that to calculate boost
ratio.

This allows dropping more special casing of the highest perf value.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello
279f838a61 x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
AMD systems that support preferred cores will use "166" as their
numerator for max frequency calculations instead of "255".

Add a function for detecting preferred cores by looking at the
highest perf value on all cores.

If preferred cores are enabled, return 166; if disabled, return the
value in the highest perf register. As the function will be called
multiple times, cache the boost numerator and whether preferred
cores are enabled in global variables.
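
A minimal user-space sketch of this logic is shown below. The detection rule used here (preferred cores are assumed to be present when cores report differing highest perf values) and all function and variable names are assumptions for illustration; only the numerator 166 and the caching in global variables come from the commit text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PREFCORE_NUMERATOR 166	/* value quoted in the commit message */

/* Cached results, since the lookup is expected to run many times. */
static bool cached;
static bool cached_prefcore;
static uint32_t cached_numerator;

/* Assumption: differing highest perf values indicate preferred cores. */
static bool detect_prefcore(const uint8_t *highest_perf, size_t ncpus)
{
	for (size_t i = 1; i < ncpus; i++)
		if (highest_perf[i] != highest_perf[0])
			return true;
	return false;
}

/* Uncached computation: 166 with preferred cores, else the register value. */
static uint32_t compute_numerator(const uint8_t *highest_perf, size_t ncpus)
{
	if (ncpus == 0)
		return 0;
	if (detect_prefcore(highest_perf, ncpus))
		return PREFCORE_NUMERATOR;
	return highest_perf[0];
}

/* Cached wrapper: the first successful call decides the result. */
static uint32_t boost_ratio_numerator(const uint8_t *highest_perf, size_t ncpus)
{
	if (!cached && ncpus > 0) {
		cached_prefcore = detect_prefcore(highest_perf, ncpus);
		cached_numerator = compute_numerator(highest_perf, ncpus);
		cached = true;
	}
	return cached_numerator;
}
```

Splitting the pure computation from the cached wrapper keeps the detection easy to exercise on its own.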

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello
2819bfef64 x86/amd: Move amd_get_highest_perf() out of amd-pstate
amd_pstate_get_highest_perf() is a helper used to get the highest perf
value on AMD systems.  It's used in amd-pstate as part of preferred
core handling, but applicable for acpi-cpufreq as well.

Move it out to cppc handling code as amd_get_highest_perf().

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Rafael J. Wysocki
6af3aab6c7 ARM cpufreq updates for 6.12
- Several OF related cleanups in cpufreq drivers (Rob Herring).
 
 - Enable COMPILE_TEST for ARM drivers (Rob Herring).
 
 - Introduce quirks for syscon failures and use socinfo to get revision
   for TI cpufreq driver (Dhruva Gole and Nishanth Menon).
 
 - Minor cleanups in amd-pstate driver (Anastasia Belova and Dhananjay
   Ugwekar).
 
 - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
   (Danila Tikhonov, Huacai Chen, and Liu Jing).

Merge tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.12 from Viresh Kumar:

"- Several OF related cleanups in cpufreq drivers (Rob Herring).

 - Enable COMPILE_TEST for ARM drivers (Rob Herring).

 - Introduce quirks for syscon failures and use socinfo to get revision
   for TI cpufreq driver (Dhruva Gole and Nishanth Menon).

 - Minor cleanups in amd-pstate driver (Anastasia Belova and Dhananjay
   Ugwekar).

 - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
   (Danila Tikhonov, Huacai Chen, and Liu Jing)."

* tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: ti-cpufreq: Use socinfo to get revision in AM62 family
  cpufreq: Fix the cacography in powernv-cpufreq.c
  cpufreq: ti-cpufreq: Introduce quirks to handle syscon fails appropriately
  cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
  cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
  cpufreq: Add SM7325 to cpufreq-dt-platdev blocklist
  cpufreq: Fix warning on unused of_device_id tables for !CONFIG_OF
  cpufreq/amd-pstate: Add the missing cpufreq_cpu_put()
  cpufreq: Drop CONFIG_ARM and CONFIG_ARM64 dependency on Arm drivers
  cpufreq: Enable COMPILE_TEST on Arm drivers
  cpufreq: armada-8k: Avoid excessive stack usage
  cpufreq: omap: Drop asm includes
  cpufreq: qcom: Add explicit io.h include for readl/writel_relaxed
  cpufreq: spear: Use of_property_for_each_u32() instead of open coding
  cpufreq: Use of_property_present()
2024-09-06 20:50:46 +02:00
Rafael J. Wysocki
222caf5520 amd-pstate development for 6.12:
* Validate return of any attempt to update EPP limits, which fixes
   the masking hardware problems.

Merge tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge an amd-pstate driver update for 6.12 from Mario Limonciello:

"amd-pstate development for 6.12:
 * Validate return of any attempt to update EPP limits, which fixes
   the masking hardware problems."

* tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate: Catch failures for amd_pstate_epp_update_limit()
2024-09-05 13:04:46 +02:00
Mario Limonciello
c3e093efbc cpufreq/amd-pstate: Catch failures for amd_pstate_epp_update_limit()
amd_pstate_set_epp() calls cppc_set_epp_perf() which can fail for
a variety of reasons but this is ignored.  Change the return flow
to allow failures.

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-04 15:07:34 -05:00
Anastasia Belova
5493f9714e cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
cpufreq_cpu_get may return NULL. To avoid NULL-dereference check it
and return in case of error.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-09-04 20:43:59 +05:30
Dhananjay Ugwekar
49243adc71 cpufreq/amd-pstate: Add the missing cpufreq_cpu_put()
Fix the reference counting of cpufreq_policy object in amd_pstate_update()
function by adding the missing cpufreq_cpu_put().
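
The get/put pairing, together with the NULL check from the neighbouring cpufreq_cpu_get() fix, can be illustrated with a small mock. The functions policy_get() and policy_put() and the struct below are hypothetical stand-ins written for this sketch, not the cpufreq core API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted policy object. */
struct policy {
	int refcount;
	unsigned int cur;	/* current frequency, kHz */
};

/* Mock lookup: returns NULL when no policy exists for the CPU. */
static struct policy *policy_get(struct policy *table[], size_t n, size_t cpu)
{
	if (cpu >= n || !table[cpu])
		return NULL;
	table[cpu]->refcount++;
	return table[cpu];
}

/* Mock release: drops the reference taken by policy_get(). */
static void policy_put(struct policy *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Every successful get is balanced by a put on the way out. */
static int update_cur(struct policy *table[], size_t n, size_t cpu,
		      unsigned int freq)
{
	struct policy *p = policy_get(table, n, cpu);

	if (!p)
		return -1;	/* nothing to release on this path */

	p->cur = freq;
	policy_put(p);
	return 0;
}
```

After update_cur() returns, the reference count is back to its starting value whether or not the lookup succeeded.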

Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-09-04 20:43:58 +05:30
Gautham R. Shenoy
9c68a3b03e cpufreq/amd-pstate: Remove warning for X86_FEATURE_CPPC on certain Zen models
commit bff7d13c19 ("cpufreq: amd-pstate: add debug message while
CPPC is supported and disabled by SBIOS") issues a warning on platforms
where the X86_FEATURE_CPPC is expected to be enabled, but is not due
to it being disabled in the BIOS.

This feature bit corresponds to CPUID 0x80000008.ebx[27] which is a
reserved bit on the Zen1 processors and a reserved bit on Zen2 based
models 0x70-0x7F, and is expected to be cleared on these
platforms. Thus printing the warning message for these models when
X86_FEATURE_CPPC is unavailable is incorrect. Fix this.
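
The decision described above can be sketched as a pure function over the CPUID value and the CPU generation. The enum, the function names and the treatment of all other models as "should warn" are assumptions for this illustration; the bit position and the reserved ranges are taken from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPPC_CPUID_BIT 27	/* CPUID 0x80000008 EBX bit, per the commit text */

static_assert(CPPC_CPUID_BIT < 32, "bit index must fit in EBX");

/* Hypothetical generation tag; the kernel uses its own feature flags. */
enum zen_gen { ZEN1, ZEN2, ZEN_OTHER };

static bool cppc_bit_set(uint32_t ebx)
{
	return (ebx >> CPPC_CPUID_BIT) & 1u;
}

/* The bit is reserved on Zen1 and on Zen2 models 0x70-0x7F. */
static bool cppc_bit_reserved(enum zen_gen gen, uint8_t model)
{
	switch (gen) {
	case ZEN1:
		return true;
	case ZEN2:
		return model >= 0x70 && model <= 0x7F;
	default:
		return false;
	}
}

/* Warn only when the bit is clear on a model where it is meaningful. */
static bool should_warn_cppc_disabled(uint32_t ebx, enum zen_gen gen,
				      uint8_t model)
{
	if (cppc_bit_set(ebx))
		return false;
	return !cppc_bit_reserved(gen, model);
}
```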

Modify some of the comments, and use switch-case for model range
checking for improved readability while at it.

Fixes: bff7d13c19 ("cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS")
Cc: Xiaojian Du <xiaojian.du@amd.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/lkml/20240730140111.4491-1-00107082@163.com/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-28 10:15:00 -05:00
Gautham R. Shenoy
0d8584d288 cpufreq/amd-pstate: Use topology_logical_package_id() instead of logical_die_id()
After the commit 63edbaa48a ("x86/cpu/topology: Add support for the
AMD 0x80000026 leaf"), the topology_logical_die_id() function returns
the logical Core Chiplet Die (CCD) ID instead of the logical socket
ID.

Since this is currently used to set MSR_AMD_CPPC_ENABLE, which needs
to be set on any one of the threads of the socket, it is prudent to
use topology_logical_package_id() in place of
topology_logical_die_id().

Fixes: 63edbaa48a ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
cc: stable@vger.kernel.org # 6.10
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Link: https://lore.kernel.org/lkml/20240801124509.3650-1-Dhananjay.Ugwekar@amd.com/
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22 15:42:10 -05:00
Dan Carpenter
67d95303c8 cpufreq: amd-pstate: Fix uninitialized variable in amd_pstate_cpu_boost_update()
Smatch complains that "ret" could be uninitialized:

  drivers/cpufreq/amd-pstate.c:734 amd_pstate_cpu_boost_update()
  error: uninitialized symbol 'ret'.

This seems like it probably is a real issue.  Initialize "ret" to zero to
be safe.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/lkml/7ff53543-6c04-48a0-8d99-7dc010b93b3a@stanley.mountain/T/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22 15:42:10 -05:00
Rafael J. Wysocki
7ad9eab9d4 ARM cpufreq updates for 6.11
- cpufreq: Add Loongson-3 CPUFreq driver support (Huacai Chen).
 - Make exit() callback return void (Lizhe and Viresh Kumar).
 - Minor cleanups and fixes in several drivers (Bryan Brattlof,
   Javier Carrasco, Jagadeesh Kona, Jeff Johnson, Nícolas F. R. A. Prado,
   Primoz Fiser, Raphael Gallais-Pou, and Riwen Lu).

Merge tag 'cpufreq-arm-updates-6.11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.11 from Viresh Kumar:

"- cpufreq: Add Loongson-3 CPUFreq driver support (Huacai Chen).
 - Make exit() callback return void (Lizhe and Viresh Kumar).
 - Minor cleanups and fixes in several drivers (Bryan Brattlof,
   Javier Carrasco, Jagadeesh Kona, Jeff Johnson, Nícolas F. R. A. Prado,
   Primoz Fiser, Raphael Gallais-Pou, and Riwen Lu)."

* tag 'cpufreq-arm-updates-6.11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (21 commits)
  cpufreq: sti: fix build warning
  cpufreq: mediatek: Use dev_err_probe in every error path in probe
  cpufreq: Add Loongson-3 CPUFreq driver support
  cpufreq: Make cpufreq_driver->exit() return void
  cpufreq: pcc: Remove empty exit() callback
  cpufreq: loongson2: Remove empty exit() callback
  cpufreq: nforce2: Remove empty exit() callback
  cpufreq: sti: add missing MODULE_DEVICE_TABLE entry for stih418
  cpufreq: ti: update OPP table for AM62Px SoCs
  cpufreq: ti: update OPP table for AM62Ax SoCs
  cpufreq: sun50i: add Allwinner H700 speed bin
  cpufreq/cppc: Don't compare desired_perf in target()
  OPP: ti: Fix ti_opp_supply_probe wrong return values
  cpufreq: ti-cpufreq: Handle deferred probe with dev_err_probe()
  cpufreq: dt-platdev: add missing MODULE_DESCRIPTION() macro
  cpufreq: longhaul: Fix kernel-doc param for longhaul_setstate
  cpufreq: qcom-nvmem: eliminate uses of of_node_put()
  cpufreq: qcom-nvmem: fix memory leaks in probe error paths
  cpufreq: scmi: Avoid overflow of target_freq in fast switch
  cpufreq: sun50i: replace of_node_put() with automatic cleanup handler
  ...
2024-07-09 17:58:20 +02:00
Lizhe
b4b1ddc9df cpufreq: Make cpufreq_driver->exit() return void
The cpufreq core doesn't check the return value of the exit() callback
and there is not much the core can do on failures at that point. Just
drop the returned value and make it return void.

Signed-off-by: Lizhe <sensor1010@163.com>
[ Viresh: Reworked the patches to fix all missing changes together. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # Mediatek
Acked-by: Sudeep Holla <sudeep.holla@arm.com> # scpi, scmi, vexpress
Acked-by: Mario Limonciello <mario.limonciello@amd.com> # amd
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> # bmips
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com> # omap
2024-07-09 08:45:30 +05:30
Dhananjay Ugwekar
738d7d0357 cpufreq/amd-pstate: Fix the scaling_max_freq setting on shared memory CPPC systems
On shared memory CPPC systems, with amd_pstate=active mode, the change
in scaling_max_freq doesn't get written to the shared memory
region. Due to this, the writes to the scaling_max_freq sysfs file
don't take effect. Fix this by propagating the scaling_max_freq
changes to the shared memory region.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Reported-by: David Arcari <darcari@redhat.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240702081413.5688-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-07-07 10:32:48 -05:00
Perry Yuan
89ac482d51 cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off
When Core Performance Boost is disabled by the user, the
CPPC_REQ.max_perf should not exceed the nominal_perf since by definition
the frequencies between nominal_perf and the highest_perf are in the
boost range. Fix this in amd_pstate_update()

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/66f55232be01092c423f0523f68b82b80c293943.1718988436.git.perry.yuan@amd.com
Link: https://lore.kernel.org/r/20240626042733.3747-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-26 15:48:21 -05:00
Perry Yuan
c8c68c38b5 cpufreq: amd-pstate: initialize core precision boost state
The Core Performance Boost (CPB) feature, when enabled in the BIOS,
allows the OS to control the highest performance for each individual
core. The active, passive and the guided modes of the amd-pstate driver
do support controlling the core frequency boost when this BIOS feature
is enabled. Additionally, the amd-pstate driver provides a sysfs
interface allowing the user to activate/deactivate this core performance
boost feature at runtime.

Add support for the set_boost callback in the active mode driver to
enable boost control via the cpufreq core. This ensures a consistent
boost control interface across all pstate modes, including passive
mode, guided mode, and active mode.

With this addition, all three pstate modes can support the same boost
control interface with unique interface and global CPB control. Each
CPU also supports individual boost control, allowing global CPB to
change all cores' boost states simultaneously. Specific CPUs can
update their boost states separately, ensuring all cores' boost
states are synchronized.

Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20240626042733.3747-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-26 15:48:21 -05:00
Mario Limonciello
bc76f57574 cpufreq: amd-pstate: Don't create attributes when registration fails
If driver registration fails then immediately return the failure
instead of continuing to register attributes.

This fixes issues of falling back from amd-pstate to other drivers
when cpufreq init has failed for any reason.

Reported-by: alex.s.cochran@proton.me
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/20240623200918.52104-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-24 13:17:56 -05:00
Meng Li
e8f555daac cpufreq/amd-pstate: fix setting policy current frequency value
When the scaling min/max freq values are set,
the value of policy->cur needs to be updated.

Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240227071133.3405003-1-li.meng@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
4e4f600ee7 cpufreq: amd-pstate: auto-load pstate driver by default
If the `amd-pstate` driver is not loaded automatically by default,
it is because the kernel command line parameter has not been added.
To resolve this issue, it is necessary to call the `amd_pstate_set_driver()`
function to enable the desired mode (passive/active/guided) before registering
the driver instance.

This ensures that the driver is loaded correctly without relying on the kernel
command line parameter.

When no parameter is added to the command line, the kernel config
provides the default mode to load.

Meanwhile, the user can add a driver mode on the command line, which
overrides the kernel config default option.

Reported-by: Andrei Amuraritei <andamu@posteo.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218705
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/83301c4cea4f92fb19e14b23f2bac7facfd8bdbb.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
918263938c cpufreq: amd-pstate: enable shared memory type CPPC by default
The amd-pstate-epp driver has been implemented and resolves the
performance drop issue seen in passive mode for shared memory type
CPPC systems. Users who enable the active mode driver will not
experience a performance drop compared to the passive mode driver.
Therefore, the EPP driver should be loaded by default for shared
memory type CPPC system to get better performance.

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/c705507cf3ee790e544251cfd897ed11e8e57712.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
c9fdaba836 cpufreq: amd-pstate: switch boot_cpu_has() to cpu_feature_enabled()
Replace the usage of the deprecated boot_cpu_has() function with
the modern cpu_feature_enabled() function. The switch to cpu_feature_enabled()
ensures compatibility with the latest CPU feature detection mechanisms and
improves code maintainability.

Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/f1567593ac5e1d38343067e9c681a8c4b0707038.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
bff7d13c19 cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS
If the CPPC feature is supported by the CPU but the CPUID flag bit is not
set by the SBIOS, `amd_pstate` will fail to load during system
boot.
Add one more debug message to inform the user to check the SBIOS setting.
The change can also help maintainers debug why the amd_pstate driver failed
to load at boot if the processor supports CPPC.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/42c953616ac121bd1e5c329e83d015a02e6b32c7.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
cb817ec667 cpufreq: amd-pstate: show CPPC debug message if CPPC is not supported
Add CPU ID checking in case the driver attempts to load on systems where
CPPC functionality is unavailable. The warning message will not
be shown if CPPC is not supported.

Also print a debug message if the CPU has no CPPC support, which
helps to debug driver loading failures.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/linux-pm/CYYPR12MB8655D32EA18574C9497E888A9C122@CYYPR12MB8655.namprd12.prod.outlook.com/T/#t
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/437dbd581a4119465581330081d9b1e289482ba2.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
7bf7f22906 cpufreq: amd-pstate: remove unused variable nominal_freq
Remove the unused variable `nominal_freq` to fix a build warning.
This variable was defined and assigned a value in the previous code,
but it was not used in the subsequent code.

Closes: https://lore.kernel.org/oe-kbuild-all/202405080431.BPU6Yg9s-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/b7ef41557f71d40d098393ddb27f0fe1f23648ae.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan
8f8b42c1fc cpufreq: amd-pstate: optimize the initial frequency values verification
To make driver loading failures caused by broken CPPC ACPI tables easier
to debug, move the verification of `min_freq`, `nominal_freq`, and the other
dependent values into the `amd_pstate_init_freq()` function, where they are
initialized.
If any of these values is incorrect, the `amd-pstate` driver will not be registered.

Verifying these values before they are used makes it easier to debug
driver loading failures caused by faulty CPPC ACPI tables from the BIOS.

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/f9793f8451c1832e34cc9dc35f89c653b39cfe38.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Mario Limonciello
fc6e083726 cpufreq: amd-pstate: Allow users to write 'default' EPP string
The EPP string for 'default' represents what the firmware had configured
as the default EPP value but once a user changes EPP to another string
they can't reset it back to 'default'.

Cache the firmware EPP value and allow the user to write 'default' using
this value.
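
As a rough illustration of the caching idea (a userspace sketch, not the
driver code): `struct epp_state`, `epp_init()`, `epp_store()` and the numeric
EPP encodings below are hypothetical names and values chosen for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-CPU state for this sketch. */
struct epp_state {
	uint8_t epp_default; /* value the firmware had programmed at init */
	uint8_t epp_current; /* value currently in effect */
};

/* Remember the firmware-provided EPP once, before anyone changes it. */
static void epp_init(struct epp_state *s, uint8_t firmware_epp)
{
	s->epp_default = firmware_epp;
	s->epp_current = firmware_epp;
}

/*
 * Apply an EPP string. "default" restores the cached firmware value.
 * The numeric encodings are illustrative assumptions.
 * Returns 0 on success, -1 for an unknown string.
 */
static int epp_store(struct epp_state *s, const char *name)
{
	if (strcmp(name, "default") == 0)
		s->epp_current = s->epp_default;
	else if (strcmp(name, "performance") == 0)
		s->epp_current = 0x00;
	else if (strcmp(name, "balance_performance") == 0)
		s->epp_current = 0x80;
	else if (strcmp(name, "balance_power") == 0)
		s->epp_current = 0xBF;
	else if (strcmp(name, "power") == 0)
		s->epp_current = 0xFF;
	else
		return -1;
	return 0;
}
```

Without the cached `epp_default`, there would be nothing to restore once the
current value had been overwritten, which is the situation described above.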

Reported-by: Artem S. Tashkinov <aros@gmx.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931#c61
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Xiaojian Du
c00d476cbc cpufreq: amd-pstate: change cpu freq transition delay for some models
Some AMD Zen 4 APUs/CPUs can adjust the CPU core clock more quickly and
precisely according to the CPU workload.
This is advertised by the Fast CPPC x86 feature.
This change is only effective in the *passive mode* of the
AMD pstate driver. Based on test results with different
transition delay values, 600us was chosen as a balance
between performance and power consumption.

Some test results on AMD Ryzen 7840HS(Phoenix) APU:

1. Tbench
(Energy less is better, Throughput more is better,
PPW--Performance per Watt more is better)
============= =================== ============== =============== ============== =============== ============== =============== ===============
 Trans Delay   Tbench              governor:schedutil, 3-iterations average
============= =================== ============== =============== ============== =============== ============== =============== ===============
 1000us        Clients             1              2               4              8              12             16              32
               Energy/Joules       2010           2804            8768           17171          16170          15132           15027
               Throughput/(MB/s)   114            259             1041           3010           3135           4851            4605
               PPW                 0.0567         0.0923          0.1187         0.1752         0.1938         0.3205          0.3064
 600us         Clients             1              2               4              8              12             16              32
               Energy/Joules       2115  (5.22%)  2388  (-14.84%) 10700(22.03%)  16716 (-2.65%) 15939 (-1.43%) 15053 (-0.52%)  15083 (0.37% )
               Throughput/(MB/s)   122   (7.02%)  234   (-9.65% ) 1188 (14.12%)  3003  (-0.23%) 3143  (0.26% ) 4842  (-0.19%)  4603  (-0.04%)
               PPW                 0.0576(1.59%)  0.0979(6.07%  ) 0.111(-6.49%)  0.1796(2.51% ) 0.1971(1.70% ) 0.3216(0.34% )  0.3051(-0.42%)
============= =================== ============== ================ ============= =============== ============== =============== ===============

2.Dbench
(Energy less is better, Throughput more is better,
PPW--Performance per Watt more is better)
============= =================== ============== =============== ============== =============== ============== =============== ===============
 Trans Delay   Dbench              governor:schedutil, 3-iterations average
============= =================== ============== =============== ============== =============== ============== =============== ===============
 1000us        Clients             1             2               4              8               12             16              32
               Energy/Joules       4890          3779            3567           5157            5611           6500            8163
               Throughput/(MB/s)   327           167             220            577             775            938             1397
               PPW                 0.0668        0.0441          0.0616         0.1118          0.1381         0.1443          0.1711
 600us         Clients             1             2               4              8               12             16              32
               Energy/Joules       4915  (0.51%) 4912  (29.98%)  3506  (-1.71%) 4907  (-4.85% ) 5011 (-10.69%) 5672  (-12.74%) 8141  (-0.27%)
               Throughput/(MB/s)   348   (6.42%) 284   (70.06%)  220   (0.00% ) 518   (-10.23%) 712  (-8.13% ) 854   (-8.96% ) 1475  (5.58% )
               PPW                 0.0708(5.99%) 0.0578(31.07%)  0.0627(1.79% ) 0.1055(-5.64% ) 0.142(2.82%  ) 0.1505(4.30%  ) 0.1811(5.84% )
============= =================== ============== =============== ============== =============== ============== =============== ===============

3.Hackbench(less time is better)
============= =========================== ==========================
  hackbench     governor:schedutil
============= =========================== ==========================
  Trans Delay   Process Mode Ave time(s)  Thread Mode Ave time(s)
  1000us        14.484                      14.484
  600us         14.418(-0.46%)              15.41(+6.39%)
============= =========================== ==========================

4.Perf_sched_bench(less time is better)
============= =================== ============== ============== ============== =============== =============== =============
 Trans Delay  perf_sched_bench    governor:schedutil
============= =================== ============== ============== ============== =============== =============== =============
  1000us        Groups             1             2              4              8               12              24
                AveTime(s)        1.64          2.851          5.878          11.636          16.093          26.395
  600us         Groups             1             2              4              8               12              24
                AveTime(s)        1.69(3.05%)   2.845(-0.21%)  5.843(-0.60%)  11.576(-0.52%)  16.092(-0.01%)  26.32(-0.28%)
============= ================== ============== ============== ============== =============== =============== ==============

5.Sysbench(higher is better)
============= ================== ============== ================= ============== ================ =============== =================
  Sysbench    governor:schedutil
============= ================== ============== ================= ============== ================ =============== =================
  1000us      Thread             1               2                4              8                12               24
              Ave events         6020.98         12273.39         24119.82       46171.57         47074.37         47831.72
  600us       Thread             1               2                4              8                12               24
              Ave events         6154.82(2.22%)  12271.63(-0.01%) 24392.5(1.13%) 46117.64(-0.12%) 46852.19(-0.47%) 47678.92(-0.32%)
============= ================== ============== ================= ============== ================ =============== =================

In conclusion, a shorter transition delay
of the CPU clock has a clearly positive effect on PPW
in the Dbench test, while performance stays stable
on Tbench, Hackbench, Perf_sched_bench and Sysbench.
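
For readers who want to re-derive the table entries: the PPW rows appear to be
throughput divided by energy (for example 114 / 2010 is about 0.0567 for Tbench
with one client at 1000us), and the percentages are relative changes against
the 1000us row. A minimal sketch under that assumption, with the hypothetical
helper names `ppw()` and `pct_change()`:

```c
/* PPW as it appears in the tables: throughput (MB/s) divided by energy (J). */
static double ppw(double throughput_mb_s, double energy_joules)
{
	return throughput_mb_s / energy_joules;
}

/* Relative change of new_val against base_val, in percent. */
static double pct_change(double base_val, double new_val)
{
	return (new_val - base_val) / base_val * 100.0;
}
```

For instance, `pct_change(2804, 2388)` corresponds to the Tbench energy entry
for two clients at 600us. The table values look truncated rather than rounded
in some cells, so small differences in the last digit are expected.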

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-11 16:12:12 -05:00
Dhananjay Ugwekar
e4731baaf2 cpufreq: amd-pstate: Fix the inconsistency in max frequency units
The nominal frequency in cpudata is maintained in MHz whereas all other
frequencies are in KHz. This means we have to convert nominal frequency
value to KHz before we do any interaction with other frequency values.

In amd_pstate_set_boost(), this conversion from MHz to KHz is missed,
fix that.
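
The unit mismatch can be sketched as follows. This is a simplified userspace
illustration, not the driver code: `struct cpudata_sketch` and
`max_freq_for_boost()` are hypothetical, and only the MHz/KHz split between the
fields is taken from the description above.

```c
#include <stdint.h>

#define KHZ_PER_MHZ 1000u

/* Hypothetical, reduced view of the per-CPU data. */
struct cpudata_sketch {
	uint32_t max_freq;     /* KHz */
	uint32_t nominal_freq; /* MHz, unlike the other frequencies */
};

/*
 * Pick the policy's maximum frequency in KHz.
 * With boost off the limit is the nominal frequency, which has to be
 * converted from MHz to KHz before it is mixed with KHz values.
 */
static uint32_t max_freq_for_boost(const struct cpudata_sketch *c, int boost_on)
{
	if (boost_on)
		return c->max_freq;
	return c->nominal_freq * KHZ_PER_MHZ;
}
```

Dropping the multiplication reproduces the "Before" output below, where
scaling_max_freq reads 2151 instead of 2151000.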

Tested on an AMD Zen4 EPYC server

Before:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151
409422

After:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151000
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151000
1799527

Fixes: ec437d71db ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-28 22:03:11 +02:00
Arnd Bergmann
779b8a14af cpufreq: amd-pstate: remove global header file
When extra warnings are enabled, gcc points out a global variable
definition in a header:

In file included from drivers/cpufreq/amd-pstate-ut.c:29:
include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=]
  123 | static const char * const amd_pstate_mode_string[] = {
      |                           ^~~~~~~~~~~~~~~~~~~~~~

This header is only included from two files in the same directory,
and one of them uses only a single definition from it, so clean it
up by moving most of the contents into the driver that uses them,
and making shared bits a local header file.

Fixes: 36c5014e54 ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-28 21:59:39 +02:00
Peng Ma
cea04f3d9a cpufreq: amd-pstate: fix memory leak on CPU EPP exit
The cpudata memory from kzalloc() in amd_pstate_epp_cpu_init() is
not freed in the analogous exit function, so fix that.

Signed-off-by: Peng Ma <andypma@tencent.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-16 10:38:07 +02:00
Perry Yuan
bf202e654b cpufreq: amd-pstate: fix the highest frequency issue which limits performance
The low-level power firmware sets an incorrect highest performance value
on AMD CPUs with Family ID 0x19 and Model IDs ranging from 0x70 to 0x7F,
which has been identified as the cause of a performance drop.

To resolve this, check the CPU family and model ID and set the correct
highest performance value, which eliminates the performance drop caused by
the incorrect value.

Before the fix, the highest frequency was set to 4200MHz; now it is set
to 4971MHz, which is correct.

CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  1    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  2    0      0    1 1:1:1:0          yes 4971.0000 400.0000 4865.8140
  3    0      0    1 1:1:1:0          yes 4971.0000 400.0000  400.0000

Fixes: f3a0523918 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218759
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Gaha Bana <gahabana@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 13:56:00 +02:00
Perry Yuan
5c3fd1edaa cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
Remove the unused variable `lowest_nonlinear_freq` to fix a build warning.
The variable was defined and assigned a value in the previous code,
but it was never used afterwards.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271038.em6nJjzy-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-30 12:23:06 +02:00
Perry Yuan
5131a3ca35 cpufreq: amd-pstate: fix code format problems
Fix some code format problems in the amd-pstate driver.

Changes Made:

- Fixed incorrect comment format in the functions.

- Removed unnecessary blank line.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271148.HK9yHBlB-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-30 12:23:06 +02:00
Perry Yuan
eb8b6c3682 cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing
Add a quirks table to work around CPPC capability issues by providing
correct perf or frequency values while the driver is loading.

If CPPC capabilities are not defined in the ACPI tables, or are wrongly
defined by platform firmware, a quirk is needed to supply correct
workaround values so that the pstate driver can be loaded despite the
CPPC capability errors.

The workaround matches broken BIOSes that lack the CPPC capabilities
nominal_freq and lowest_freq definitions in the ACPI table.

$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan
069a2bb8c4 cpufreq: amd-pstate: get transition delay and latency value from ACPI tables
Make the pstate driver initially retrieve the P-state transition delay and
latency values from the BIOS ACPI tables, which have more reasonable
delay and latency values according to the platform design and
requirements.

Previously these values were hardcoded, which may have conflicted with
the platform and might not reflect the most accurate or optimized
setting for the processor.

[054h 0084   8]                Preserve Mask : FFFFFFFF00000000
[05Ch 0092   8]                   Write Mask : 0000000000000001
[064h 0100   4]              Command Latency : 00000FA0
[068h 0104   4]          Maximum Access Rate : 0000EA60
[06Ch 0108   2]      Minimum Turnaround Time : 0000

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan
2ddb8a3946 cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
The amd-pstate driver cannot work when the min_freq, nominal_freq or
the max_freq is zero. When this happens it is prudent to error out
early on rather than failing later at the time of governor
initialization.
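
A minimal sketch of such an early check, in plain userspace C. The names
`struct freq_sketch` and `check_freqs()` are hypothetical, and the choice of
-EINVAL as the error code is an assumption for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the three frequencies named above. */
struct freq_sketch {
	uint32_t min_freq;
	uint32_t nominal_freq;
	uint32_t max_freq;
};

/*
 * Reject unusable frequency data up front.
 * Returns 0 when all three values are non-zero, a negative errno
 * otherwise (the specific errno is an assumption of this sketch).
 */
static int check_freqs(const struct freq_sketch *f)
{
	if (!f->min_freq || !f->nominal_freq || !f->max_freq)
		return -EINVAL;
	return 0;
}
```

Failing at this point lets the caller report which input was bad, instead of
surfacing as a less specific error when the governor starts.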

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Gautham R. Shenoy
3cbbe8871a cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()
amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.

There is no loss in readability in replacing their invocations by
accesses to the corresponding members of cpudata.

Do so and remove these helper functions.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan
5547c0ebfc cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq
Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
helpers compute the values of min_freq, max_freq, nominal_freq and
lowest_nonlinear_freq respectively afresh from
cppc_get_perf_caps(). This is not necessary as there are fields in
cpudata to cache these values.

To simplify this, add a single helper function named
amd_pstate_init_freq() which computes all these frequencies at once, and
caches them in cpudata.

Use the cached values everywhere else in the code.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Co-developed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Meng Li
8164f74332 cpufreq: amd-pstate: adjust min/max limit perf
The min/max limit perf values calculated based on frequency
may exceed the reasonable range of perf(highest perf, lowest perf).
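
One way to keep a computed limit inside that range is a plain clamp. The
sketch below is illustrative only: `clamp_perf()` is a hypothetical helper, not
a function taken from the driver.

```c
#include <stdint.h>

/*
 * Restrict a perf value derived from a frequency to the range
 * [lowest_perf, highest_perf]. Assumes lowest_perf <= highest_perf.
 */
static uint32_t clamp_perf(uint32_t perf, uint32_t lowest_perf,
			   uint32_t highest_perf)
{
	if (perf < lowest_perf)
		return lowest_perf;
	if (perf > highest_perf)
		return highest_perf;
	return perf;
}
```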

Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-29 20:11:40 +01:00
Tor Vic
b26ffbf800 cpufreq: amd-pstate: Fix min_perf assignment in amd_pstate_adjust_perf()
In the function amd_pstate_adjust_perf(), the 'min_perf' variable is set
to 'highest_perf' instead of 'lowest_perf'.

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 16:40:28 +01:00
Meng Li
e571a5e206 cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically
Preferred core rankings can be changed dynamically by the
platform based on the workload and platform conditions and
accounting for thermals and aging.
When this occurs, the CPU priority needs to be updated.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-31 14:54:50 +01:00
Meng Li
f3a0523918 cpufreq: amd-pstate: Enable amd-pstate preferred core support
The amd-pstate driver uses the functions and data structures
provided by the ITMT architecture to enable the scheduler to
favor scheduling on cores which can reach a higher frequency
at a lower voltage. We call this the amd-pstate preferred core.

Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.

The initial core rankings are set up by amd-pstate when the
system boots.

Add a variable hw_prefcore to the cpudata structure. It indicates
whether the processor and power firmware support the preferred core
feature.

Add one new early parameter `disable` to allow the user to disable
the preferred core.

The amd-pstate driver supports the preferred core feature only when the
hardware supports it and the user has set `enabled` in the early parameter.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-31 14:54:50 +01:00
Mario Limonciello
22fb4f0419 cpufreq/amd-pstate: Fix setting scaling max/min freq values
Scaling min/max freq values were being cached and lagging a setting
each time.  Fix the ordering of the clamp call to ensure they work.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Fixes: febab20cae ("cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wkarny@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-22 20:35:58 +01:00
Ayush Jain
142c169b31 cpufreq/amd-pstate: Only print supported EPP values for performance governor
Make show_energy_performance_available_preferences() show only the
supported values, which is just 'performance' under the performance
governor policy.

-------Before--------
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
amd-pstate-epp
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

-------After--------
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
amd-pstate-epp
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences
performance

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Suggested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Ayush Jain <ayush.jain3@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29 22:04:15 +01:00
Wyes Karny
febab20cae cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update
When amd_pstate is running, writing to scaling_min_freq and
scaling_max_freq has no effect. These values are only passed to the
policy level, but not to the platform level. This means that the
platform does not know about the frequency limits set by the user.

To fix this, update the min_perf and max_perf values at the platform
level whenever the user changes the scaling_min_freq and scaling_max_freq
values.
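
A sketch of how user-set frequency limits might be translated into perf
limits. The linear scaling (perf = freq * highest_perf / max_freq) and all
names here (`struct limits_sketch`, `freq_to_perf()`, `update_limits()`) are
assumptions made for illustration, not the driver's code.

```c
#include <stdint.h>

/* Hypothetical, reduced per-CPU state. Frequencies are in KHz. */
struct limits_sketch {
	uint32_t highest_perf;
	uint32_t max_freq;
	uint32_t min_limit_perf;
	uint32_t max_limit_perf;
};

/* Map a frequency onto the perf scale, assuming a linear relationship. */
static uint32_t freq_to_perf(const struct limits_sketch *l, uint32_t freq_khz)
{
	/* 64-bit intermediate so the multiplication cannot overflow. */
	return (uint32_t)(((uint64_t)freq_khz * l->highest_perf) / l->max_freq);
}

/* Recompute both perf limits whenever the user changes scaling_{min,max}_freq. */
static void update_limits(struct limits_sketch *l, uint32_t scaling_min_khz,
			  uint32_t scaling_max_khz)
{
	l->min_limit_perf = freq_to_perf(l, scaling_min_khz);
	l->max_limit_perf = freq_to_perf(l, scaling_max_khz);
}
```

The point is only that the perf limits are recomputed from the policy values
on every change, so the platform sees what the user asked for.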

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29 17:40:16 +01:00
Gautham R. Shenoy
bb87be267b cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch()
cpufreq_driver->fast_switch() callback expects a frequency as a return
value. amd_pstate_fast_switch() was returning the return value of
amd_pstate_update_freq(), which only indicates a success or failure.

Fix this by making amd_pstate_fast_switch() return the target_freq
when the call to amd_pstate_update_freq() is successful, and return
the current frequency from policy->cur when the call to
amd_pstate_update_freq() is unsuccessful.
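
The return-value contract described above can be sketched in userspace C.
`fast_switch_sketch()` and the two stand-in update functions are hypothetical;
they only model "0 means success" for the update step.

```c
/* Stand-ins for the update step: 0 on success, non-zero on failure. */
static int update_ok(unsigned int target_freq)
{
	(void)target_freq;
	return 0;
}

static int update_fail(unsigned int target_freq)
{
	(void)target_freq;
	return -1;
}

/*
 * Return a frequency, never a status code: the requested frequency
 * when the update succeeded, otherwise the frequency still in effect.
 */
static unsigned int fast_switch_sketch(unsigned int cur_freq,
				       unsigned int target_freq,
				       int (*update_freq)(unsigned int))
{
	if (!update_freq(target_freq))
		return target_freq;
	return cur_freq;
}
```

Returning the raw status instead would hand the caller 0 on success, which it
would interpret as a frequency.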

Fixes: 4badf2eb1e ("cpufreq: amd-pstate: Add ->fast_switch() callback")
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Cc: 6.4+ <stable@vger.kernel.org> # v6.4+
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 15:06:25 +01:00
Thomas Weißschuh
5e720f8c8c cpufreq: amd-pstate: fix global sysfs attribute type
In commit 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
the "amd_pstate" attributes were moved from a dedicated kobject to the
cpu root kobject.

While the dedicated kobject expects to contain kobj_attributes, the root
kobject needs device_attributes.

As the changed arguments are not used by the callbacks it works most of
the time.
However, CFI will detect this issue:

[ 4947.849350] CFI failure at dev_attr_show+0x24/0x60 (target: show_status+0x0/0x70; expected type: 0x8651b1de)
...
[ 4947.849409] Call Trace:
[ 4947.849410]  <TASK>
[ 4947.849411]  ? __warn+0xcf/0x1c0
[ 4947.849414]  ? dev_attr_show+0x24/0x60
[ 4947.849415]  ? report_cfi_failure+0x4e/0x60
[ 4947.849417]  ? handle_cfi_failure+0x14c/0x1d0
[ 4947.849419]  ? __cfi_show_status+0x10/0x10
[ 4947.849420]  ? handle_bug+0x4f/0x90
[ 4947.849421]  ? exc_invalid_op+0x1a/0x60
[ 4947.849422]  ? asm_exc_invalid_op+0x1a/0x20
[ 4947.849424]  ? __cfi_show_status+0x10/0x10
[ 4947.849425]  ? dev_attr_show+0x24/0x60
[ 4947.849426]  sysfs_kf_seq_show+0xa6/0x110
[ 4947.849433]  seq_read_iter+0x16c/0x4b0
[ 4947.849436]  vfs_read+0x272/0x2d0
[ 4947.849438]  ksys_read+0x72/0xe0
[ 4947.849439]  do_syscall_64+0x76/0xb0
[ 4947.849440]  ? do_user_addr_fault+0x252/0x650
[ 4947.849442]  ? exc_page_fault+0x7a/0x1b0
[ 4947.849443]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixes: 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
Reported-by: Jannik Glückert <jannik.glueckert@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217765
Link: https://lore.kernel.org/lkml/c7f1bf9b-b183-bf6e-1cbb-d43f72494083@gmail.com/
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-07 19:41:48 +02:00