Commit Graph

4017 Commits

Author SHA1 Message Date
Rafael J. Wysocki
ece898da38 cpufreq: Use __free() for policy reference counting cleanup
Use __free() for policy reference counting cleanup where applicable in
the cpufreq core.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/9437968.CDJkKcVGEf@rjwysocki.net
2025-04-09 21:22:14 +02:00
Rafael J. Wysocki
c7282dce25 cpufreq: Drop cpufreq_cpu_acquire() and cpufreq_cpu_release()
Since cpufreq_cpu_acquire() and cpufreq_cpu_release() have no more
users in the tree, remove them.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/3880470.kQq0lBPeGt@rjwysocki.net
2025-04-09 21:22:13 +02:00
Rafael J. Wysocki
9a74bfdfd0 cpufreq: Use locking guard and __free() in cpufreq_update_policy()
Instead of using cpufreq_cpu_acquire() and cpufreq_cpu_release() in
cpufreq_update_policy(), which is the last user of these functions,
make it use __free() for policy reference counting cleanup and the
"write" locking guard for policy locking.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/22654186.EfDdHjke4D@rjwysocki.net
2025-04-09 21:22:10 +02:00
Rafael J. Wysocki
973207ae3d cpufreq: intel_pstate: Rearrange max frequency updates handling code
Rename __intel_pstate_update_max_freq() to intel_pstate_update_max_freq()
and move the cpufreq policy reference counting and locking into it (and
implement the locking with the recently introduced cpufreq policy "write"
locking guard).

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2315023.iZASKD2KPV@rjwysocki.net
2025-04-09 21:22:08 +02:00
Rafael J. Wysocki
6fec833b9d cpufreq: Add and use cpufreq policy locking guards
Introduce "read" and "write" locking guards for cpufreq policies and use
them where applicable in the cpufreq core.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/8518682.T7Z3S40VBb@rjwysocki.net
2025-04-09 21:22:03 +02:00
Rafael J. Wysocki
68974e3a15 cpufreq: Split cpufreq_online()
In preparation for the introduction of cpufreq policy locking guards,
move the part of cpufreq_online() that is carried out under the policy
rwsem into a separate function called cpufreq_policy_online().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/3354747.aeNJFYEL58@rjwysocki.net
2025-04-09 21:21:58 +02:00
Rafael J. Wysocki
387b51709d cpufreq: Consolidate some code in cpufreq_online()
Notice that the policy->cpu update in cpufreq_policy_alloc() can be
moved to cpufreq_online() and then it can be carried out under the
policy rwsem, along with the clearing of policy->governor (unnecessary
in the "new policy" code branch, but also not harmful).  If this is
done, the bottom parts of the "if (policy)" branches become identical
and they can be collapsed and moved below the conditional.

Modify the code accordingly which makes it somewhat easier to follow.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/13741234.uLZWGnKmhe@rjwysocki.net
2025-04-09 21:18:50 +02:00
Krzysztof Kozlowski
d4f610a9ba cpufreq: Do not enable by default during compile testing
Enabling the compile test should not cause automatic enabling of all
drivers.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-04-08 11:36:33 +05:30
Pengyu Luo
fc5414a477 cpufreq: Add SM8650 to cpufreq-dt-platdev blocklist
SM8650 have already been supported by qcom-cpufreq-hw driver, but
never been added to cpufreq-dt-platdev. This makes noise

[    0.388525] cpufreq-dt cpufreq-dt: failed register driver: -17
[    0.388537] cpufreq-dt cpufreq-dt: probe with driver cpufreq-dt failed with error -17

So adding it to the cpufreq-dt-platdev driver's blocklist to fix it.

Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-04-08 11:36:32 +05:30
Andre Przywara
14c8a41815 cpufreq: sun50i: prevent out-of-bounds access
A KASAN enabled kernel reports an out-of-bounds access when handling the
nvmem cell in the sun50i cpufreq driver:
==================================================================
BUG: KASAN: slab-out-of-bounds in sun50i_cpufreq_nvmem_probe+0x180/0x3d4
Read of size 4 at addr ffff000006bf31e0 by task kworker/u16:1/38

This is because the DT specifies the nvmem cell as covering only two
bytes, but we use a u32 pointer to read the value. DTs for other SoCs
indeed specify 4 bytes, so we cannot just shorten the variable to a u16.

Fortunately nvmem_cell_read() allows to return the length of the nvmem
cell, in bytes, so we can use that information to only access the valid
portion of the data.
To cover multiple cell sizes, use memcpy() to copy the information into a
zeroed u32 buffer, then also make sure we always read the data in little
endian fashion, as this is how the data is stored in the SID efuses.

Fixes: 6cc4bcceff ("cpufreq: sun50i: Refactor speed bin decoding")
Reported-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Škrabec <jernej.skrabec@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-04-08 11:35:30 +05:30
Dhananjay Ugwekar
56a49e19e1 cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
The min_limit perf and freq values can get disconnected with performance
governor, as we only modify the perf value in the special case. Fix that
by modifying the perf and freq values together

Fixes: 009d1c29a4 ("cpufreq/amd-pstate: Move perf values into a union")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250407081925.850473-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-04-07 08:56:21 -05:00
Thomas Gleixner
8fa7292fee treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-05 10:30:12 +02:00
Linus Torvalds
e0a02923c2 Power management fix for 6.15-rc1
Prevent cpufreq_update_limits() from crashing the kernel due to a NULL
 pointer dereference when it is called before registering a cpufreq
 driver, for instance as a result of a notification triggered by the
 platform firmware (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmftQCYSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1itkIAIsiuSOfLUDRkvX/8LwHJVr9Eyv6mcqp
 sPvb3plNl1xQzH3l+SnJgWrPaZp8PEvaoO45TvDc1i4gld3hzs4qnTx4GHOsYSEf
 bGmSmz0gYj6Fa8Yy0wUSxFRmLF9cANz0gciyFNp2roAkiOyXTcvZUhZ9B0OORhFg
 qKcoYJinmgo9XAzWP0Q/V1aufn38QDvpvpqgTY4I8H70V9a5hw0zsS9q51KtcXSj
 nN9W7YDgbxqopig1aFYImsTEB00fH+X7tgcCDmoGYX/5XEEbQCee8NmaeTrwpVhI
 OHnkNapJVfYIvunElwWXGL/I0yLE9xljiVlCYdXssDj6k0TxBaTHjLQ=
 =fCku
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Prevent cpufreq_update_limits() from crashing the kernel due to a NULL
  pointer dereference when it is called before registering a cpufreq
  driver, for instance as a result of a notification triggered by the
  platform firmware (Rafael Wysocki)"

* tag 'pm-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Reference count policy in cpufreq_update_limits()
2025-04-02 15:22:22 -07:00
Rafael J. Wysocki
9e4e249018 cpufreq: Reference count policy in cpufreq_update_limits()
Since acpi_processor_notify() can be called before registering a cpufreq
driver or even in cases when a cpufreq driver is not registered at all,
cpufreq_update_limits() needs to check if a cpufreq driver is present
and prevent it from being unregistered.

For this purpose, make it call cpufreq_cpu_get() to obtain a cpufreq
policy pointer for the given CPU and reference count the corresponding
policy object, if present.

Fixes: 5a25e3f7cc ("cpufreq: intel_pstate: Driver-specific handling of _PPC updates")
Closes: https://lore.kernel.org/linux-acpi/Z-ShAR59cTow0KcR@mail-itl
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/1928789.tdWV9SEqCh@rjwysocki.net
2025-04-01 18:44:12 +02:00
Linus Torvalds
7b667acd69 powerpc updates for 6.15
- Removal of support for IBM Cell Blades
 
  - SMP support for microwatt platform
 
  - Support for inline static calls on PPC32
 
  - Enable pmu selftests for power11 platform
 
  - Enable hardware trace macro (HTM) hcall support
 
  - Support for limited address mode capability
 
  - Changes to RMA size from 512 MB to 768 MB to handle fadump
 
  - Misc fixes and cleanups
 
 Thanks to: Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
 Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav
 Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar,
 Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy,
 Segher Boessenkool, Sourabh Jain, Vaibhav Jain, Venkat Rao Bagalkote.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmfjZnYACgkQpnEsdPSH
 ZJTmKg/+NGW7wyFY9d8Iai9ncYY7GSzsMSDTaan7qg0QWOd5gjHbsdbava7TM/DW
 8p9XsC+17kSeftRNUjtc52bSN8Ei2gBdsXIagQG1alfB2X2e6wkNauifK+dz3Su6
 usMEZZTO5R/jFDotFXNM1nsUj+8dvjnPgOUrji/P8k7PT5295wpza0hz1fy5SrOA
 hM5cliBP36UgFe5Efvgm4OUX2gQIhbc3stt9MVfymW/k0Mit5f41UIPuVGiTWowY
 s0cUJGkhxUlGXT3VfOVKuZfn4u9KMha7UCl9afSceJzXOdnUIKIbskui1VEv6cD/
 iSIxi839uErAobFHlsLYprgYFciYLII3xe2qNZCA/ZxeIMS/Mm6xokESeWLhBnfa
 P7ke6l0z3GDtTvgI2eSeU9BdrVveF1NgbP9GYSKgT6gtw/kRRnxgHF8tzmLON5PT
 KXpQlzz8VuSBRtF2jnLFU89+FFwSA1bRUhDrp89HyYFqw1B5g4N7kFFTUJWHOuKS
 fwPGy+cveKehmCUBedeTRFqHvvqdwpD/WnPlQzCly3WxqdL8U/eTXYftMiAwuK28
 ovLuSs3vRThKRQ8DnUa5oB0UGsjMpRV5LdvYkhw+x8mZKUR59oj4fx2ae4TtPakg
 dbAYuPPkCORdaSga/nV6vQgsLprFpcGX3dq6E19+BVBAY5D+1PE=
 =GFcj
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Madhavan Srinivasan:

 - Remove support for IBM Cell Blades

 - SMP support for microwatt platform

 - Support for inline static calls on PPC32

 - Enable pmu selftests for power11 platform

 - Enable hardware trace macro (HTM) hcall support

 - Support for limited address mode capability

 - Changes to RMA size from 512 MB to 768 MB to handle fadump

 - Misc fixes and cleanups

Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom,
Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook,
Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani
(IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav
Jain, and Venkat Rao Bagalkote.

* tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits)
  powerpc/kexec: fix physical address calculation in clear_utlb_entry()
  crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
  powerpc: Fix 'intra_function_call not a direct call' warning
  powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'
  KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests
  powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
  powerpc/microwatt: Add SMP support
  powerpc: Define config option for processors with broadcast TLBIE
  powerpc/microwatt: Define an idle power-save function
  powerpc/microwatt: Device-tree updates
  powerpc/microwatt: Select COMMON_CLK in order to get the clock framework
  net: toshiba: Remove reference to PPC_IBM_CELL_BLADE
  net: spider_net: Remove powerpc Cell driver
  cpufreq: ppc_cbe: Remove powerpc Cell driver
  genirq: Remove IRQ_EDGE_EOI_HANDLER
  docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR
  powerpc: Remove UDBG_RTAS_CONSOLE
  powerpc/io: Use standard barrier macros in io.c
  powerpc/io: Rename _insw_ns() etc.
  powerpc/io: Use generic raw accessors
  ...
2025-03-27 19:39:08 -07:00
Linus Torvalds
7d20aa5c32 Power management updates for 6.15-rc1
- Manage sysfs attributes and boost frequencies efficiently from
    cpufreq core to reduce boilerplate code in drivers (Viresh Kumar).
 
  - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
    Dhananjay Ugwekar, Imran Shaik, zuoqian).
 
  - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
    Bai).
 
  - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
 
  - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng).
 
  - Optimize the amd-pstate driver to avoid cases where call paths end
    up calling the same writes multiple times and needlessly caching
    variables through code reorganization, locking overhaul and tracing
    adjustments (Mario Limonciello, Dhananjay Ugwekar).
 
  - Make it possible to avoid enabling capacity-aware scheduling (CAS) in
    the intel_pstate driver and relocate a check for out-of-band (OOB)
    platform handling in it to make it detect OOB before checking HWP
    availability (Rafael Wysocki).
 
  - Fix dbs_update() to avoid inadvertent conversions of negative integer
    values to unsigned int which causes CPU frequency selection to be
    inaccurate in some cases when the "conservative" cpufreq governor is
    in use (Jie Zhan).
 
  - Update the handling of the most recent idle intervals in the menu
    cpuidle governor to prevent useful information from being discarded
    by it in some cases and improve the prediction accuracy (Rafael
    Wysocki).
 
  - Make it possible to tell the intel_idle driver to ignore its built-in
    table of idle states for the given processor, clean up the handling
    of auto-demotion disabling on Baytrail and Cherrytrail chips in it,
    and update its MAINTAINERS entry (David Arcari, Artem Bityutskiy,
    Rafael Wysocki).
 
  - Make some cpuidle drivers use for_each_present_cpu() instead of
    for_each_possible_cpu() during initialization to avoid issues
    occurring when nosmp or maxcpus=0 are used (Jacky Bai).
 
  - Clean up the Energy Model handling code somewhat (Rafael Wysocki).
 
  - Use kfree_rcu() to simplify the handling of runtime Energy Model
    updates (Li RongQing).
 
  - Add an entry for the Energy Model framework to MAINTAINERS as
    properly maintained (Lukasz Luba).
 
  - Address RCU-related sparse warnings in the Energy Model code (Rafael
    Wysocki).
 
  - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
    when DEVFREQ is set without CPUFREQ so it can be used on a wider
    range of systems (Jeson Gao).
 
  - Unify error handling during runtime suspend and runtime resume in the
    core to help drivers to implement more consistent runtime PM error
    handling (Rafael Wysocki).
 
  - Drop a redundant check from pm_runtime_force_resume() and rearrange
    documentation related to __pm_runtime_disable() (Rafael Wysocki).
 
  - Rework the handling of the "smart suspend" driver flag in the PM core
    to avoid issues hat may occur when drivers using it depend on some
    other drivers and clean up the related PM core code (Rafael Wysocki,
    Colin Ian King).
 
  - Fix the handling of devices with the power.direct_complete flag set
    if device_suspend() returns an error for at least one device to avoid
    situations in which some of them may not be resumed (Rafael Wysocki).
 
  - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
    possible deadlock that may occur if the "compressor" hibernation
    module parameter is accessed during the registration of a new
    ieee80211 device (Lizhi Xu).
 
  - Suppress sleeping parent warning in device_pm_add() in the case when
    new children are added under a device with the power.direct_complete
    set after it has been processed by device_resume() (Xu Yang).
 
  - Remove needless return in three void functions related to system
    wakeup (Zijun Hu).
 
  - Replace deprecated kmap_atomic() with kmap_local_page() in the
    hibernation core code (David Reaver).
 
  - Remove unused helper functions related to system sleep (David Alan
    Gilbert).
 
  - Clean up s2idle_enter() so it does not lock and unlock CPU offline
    in vain and update comments in it (Ulf Hansson).
 
  - Clean up broken white space in dpm_wait_for_children() (Geert
    Uytterhoeven).
 
  - Update the cpupower utility to fix lib version-ing in it and memory
    leaks in error legs, remove hard-coded values, and implement CPU
    physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
    Khan, Yiwei Lin, Zhongqiu Han).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmfhhTYSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO16/gIAKuRiG1fFgUcUSXC1iFu42vrB/1i4wpA
 02GICACqM3K6/5jd3ct/WOU28GUgDs+xcmqH7CnMaM6y9nXEWjWarmSfFekAO+0q
 TPtQ7xTy0hBCB3he1P2uLKBJBin4Wn47U9/rvs4J7mQd5zDxTINKIiVoHg2lEE+s
 HAeSoNRb2sp5IZDm9+/LfhHNYRP1mJ97cbZlymqctGB3xgDL7qMLid/1+gFPHAQS
 4/LXj3IgyU8DpA/j5nhtpaAqjN5g2QxIUfQgADRIcESK99Y/7aAMs1/G0WhJKaay
 9yx+4/xmkGvVCZQx1DphksFLISEzltY0SFWLsoppPzBTGVEW2GQQsNI=
 =LqVy
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These are dominated by cpufreq updates which in turn are dominated by
  updates related to boost support in the core and drivers and
  amd-pstate driver optimizations.

  Apart from the above, there are some cpuidle updates including a
  rework of the most recent idle intervals handling in the venerable
  menu governor that leads to significant improvements in some
  performance benchmarks, as the governor is now more likely to predict
  a shorter idle duration in some cases, and there are updates of the
  core device power management code, mostly related to system suspend
  and resume, that should help to avoid potential issues arising when
  the drivers of devices depending on one another want to use different
  optimizations.

  There is also a usual collection of assorted fixes and cleanups,
  including removal of some unused code.

  Specifics:

   - Manage sysfs attributes and boost frequencies efficiently from
     cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)

   - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
     Dhananjay Ugwekar, Imran Shaik, zuoqian)

   - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
     Bai)

   - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)

   - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)

   - Optimize the amd-pstate driver to avoid cases where call paths end
     up calling the same writes multiple times and needlessly caching
     variables through code reorganization, locking overhaul and tracing
     adjustments (Mario Limonciello, Dhananjay Ugwekar)

   - Make it possible to avoid enabling capacity-aware scheduling (CAS)
     in the intel_pstate driver and relocate a check for out-of-band
     (OOB) platform handling in it to make it detect OOB before checking
     HWP availability (Rafael Wysocki)

   - Fix dbs_update() to avoid inadvertent conversions of negative
     integer values to unsigned int which causes CPU frequency selection
     to be inaccurate in some cases when the "conservative" cpufreq
     governor is in use (Jie Zhan)

   - Update the handling of the most recent idle intervals in the menu
     cpuidle governor to prevent useful information from being discarded
     by it in some cases and improve the prediction accuracy (Rafael
     Wysocki)

   - Make it possible to tell the intel_idle driver to ignore its
     built-in table of idle states for the given processor, clean up the
     handling of auto-demotion disabling on Baytrail and Cherrytrail
     chips in it, and update its MAINTAINERS entry (David Arcari, Artem
     Bityutskiy, Rafael Wysocki)

   - Make some cpuidle drivers use for_each_present_cpu() instead of
     for_each_possible_cpu() during initialization to avoid issues
     occurring when nosmp or maxcpus=0 are used (Jacky Bai)

   - Clean up the Energy Model handling code somewhat (Rafael Wysocki)

   - Use kfree_rcu() to simplify the handling of runtime Energy Model
     updates (Li RongQing)

   - Add an entry for the Energy Model framework to MAINTAINERS as
     properly maintained (Lukasz Luba)

   - Address RCU-related sparse warnings in the Energy Model code
     (Rafael Wysocki)

   - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
     when DEVFREQ is set without CPUFREQ so it can be used on a wider
     range of systems (Jeson Gao)

   - Unify error handling during runtime suspend and runtime resume in
     the core to help drivers to implement more consistent runtime PM
     error handling (Rafael Wysocki)

   - Drop a redundant check from pm_runtime_force_resume() and rearrange
     documentation related to __pm_runtime_disable() (Rafael Wysocki)

   - Rework the handling of the "smart suspend" driver flag in the PM
     core to avoid issues hat may occur when drivers using it depend on
     some other drivers and clean up the related PM core code (Rafael
     Wysocki, Colin Ian King)

   - Fix the handling of devices with the power.direct_complete flag set
     if device_suspend() returns an error for at least one device to
     avoid situations in which some of them may not be resumed (Rafael
     Wysocki)

   - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
     possible deadlock that may occur if the "compressor" hibernation
     module parameter is accessed during the registration of a new
     ieee80211 device (Lizhi Xu)

   - Suppress sleeping parent warning in device_pm_add() in the case
     when new children are added under a device with the
     power.direct_complete set after it has been processed by
     device_resume() (Xu Yang)

   - Remove needless return in three void functions related to system
     wakeup (Zijun Hu)

   - Replace deprecated kmap_atomic() with kmap_local_page() in the
     hibernation core code (David Reaver)

   - Remove unused helper functions related to system sleep (David Alan
     Gilbert)

   - Clean up s2idle_enter() so it does not lock and unlock CPU offline
     in vain and update comments in it (Ulf Hansson)

   - Clean up broken white space in dpm_wait_for_children() (Geert
     Uytterhoeven)

   - Update the cpupower utility to fix lib version-ing in it and memory
     leaks in error legs, remove hard-coded values, and implement CPU
     physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
     Khan, Yiwei Lin, Zhongqiu Han)"

* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
  PM: sleep: Fix bit masking operation
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  PM: sleep: Fix handling devices with direct_complete set on errors
  cpuidle: Init cpuidle only for present CPUs
  PM: clk: Remove unused pm_clk_remove()
  PM: sleep: core: Fix indentation in dpm_wait_for_children()
  PM: s2idle: Extend comment in s2idle_enter()
  PM: s2idle: Drop redundant locks when entering s2idle
  PM: sleep: Remove unused pm_generic_ wrappers
  cpufreq: tegra186: Share policy per cluster
  cpupower: Make lib versioning scheme more obvious and fix version link
  PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
  PM: EM: Address RCU-related sparse warnings
  cpupower: Implement CPU physical core querying
  pm: cpupower: remove hard-coded topology depth values
  pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
  ...
2025-03-25 15:00:18 -07:00
Linus Torvalds
2d09a9449e arm64 updates for 6.15:
Perf and PMUs:
 
  - Support for the "Rainier" CPU PMU from Arm
 
  - Preparatory driver changes and cleanups that pave the way for BRBE
    support
 
  - Support for partial virtualisation of the Apple-M1 PMU
 
  - Support for the second event filter in Arm CSPMU designs
 
  - Minor fixes and cleanups (CMN and DWC PMUs)
 
  - Enable EL2 requirements for FEAT_PMUv3p9
 
 Power, CPU topology:
 
  - Support for AMUv1-based average CPU frequency
 
  - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds
    a generic topology_is_primary_thread() function overridden by x86 and
    powerpc
 
 New(ish) features:
 
  - MOPS (memcpy/memset) support for the uaccess routines
 
 Security/confidential compute:
 
  - Fix the DMA address for devices used in Realms with Arm CCA. The
    CCA architecture uses the address bit to differentiate between shared
    and private addresses
 
  - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
    default
 
 Memory management clean-ups:
 
  - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
 
  - Some minor page table accessor clean-ups
 
  - PIE/POE (permission indirection/overlay) helpers clean-up
 
 Kselftests:
 
  - MTE: skip hugetlb tests if MTE is not supported on such mappings and
    user correct naming for sync/async tag checking modes
 
 Miscellaneous:
 
  - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
    request)
 
  - Sysreg updates for new register fields
 
  - CPU type info for some Qualcomm Kryo cores
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmfjB2QACgkQa9axLQDI
 XvGrfg//W3Bx9+jw1G/XHHEQqGEVFmvltvxZUkvgV0Qki0rPSMnappJhZRL9n0Nm
 V6PvGd2KoKHZuL3g5ViZb3cs2R9BiD2JB6PncwBKuxumHGh3vz3kk1JMkDVfWdHv
 qAceOckFJD9rXjPZn+PDsfYiEi2i3RRWIP5VglZ14ue8j3prHQ6DJXLUQF2GYvzE
 /bgLSq44wp5N59ddy23+qH9rxrHzz3bgpbVv/F56W/LErvE873mRmyFwiuGJm+M0
 Pn8ra572rI6a4sgSwrMTeNPBU+F9o5AbqwauVhkz428RdMvgfEuW6qHUBnGWJDmt
 HotXmu+4Eb2KJks/iQkDo4OTJ38yUqvvZZJtP171ms3E4yqESSJngWP6O2A6LF+y
 xhe0sESF/Ew6jLhM6/hvOmBcE2AyB14JE3ymqLkXbWub4NXddBn2AF1WXFjF4CBw
 F8KSUhNLekrCYKv1k9M3nhvkcpoS9FkTF/TI+zEg546alI/GLPih6uDRkgMAODh1
 RDJYixHsf2NDDRQbfwvt9Xua/KKpDF6qNkHLA4OiqqVUwh1hkas24Lrnp8vmce4o
 wIpWCLqYWey8Rl3XWuWgWz2Xu58fHH4Dl2k72Z8I0pwp3abCDa9xEj79G0Svk7Si
 Q+FCYrNlpKee1RXBC+1MUD/Gl5r/28dEUFkAzPD80F7AgafXPd0=
 =Kc9c
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Nothing major this time around.

  Apart from the usual perf/PMU updates, some page table cleanups, the
  notable features are average CPU frequency based on the AMUv1
  counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
  the uaccess routines.

  Perf and PMUs:

   - Support for the 'Rainier' CPU PMU from Arm

   - Preparatory driver changes and cleanups that pave the way for BRBE
     support

   - Support for partial virtualisation of the Apple-M1 PMU

   - Support for the second event filter in Arm CSPMU designs

   - Minor fixes and cleanups (CMN and DWC PMUs)

   - Enable EL2 requirements for FEAT_PMUv3p9

  Power, CPU topology:

   - Support for AMUv1-based average CPU frequency

   - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
     adds a generic topology_is_primary_thread() function overridden by
     x86 and powerpc

  New(ish) features:

   - MOPS (memcpy/memset) support for the uaccess routines

  Security/confidential compute:

   - Fix the DMA address for devices used in Realms with Arm CCA. The
     CCA architecture uses the address bit to differentiate between
     shared and private addresses

   - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
     default

  Memory management clean-ups:

   - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs

   - Some minor page table accessor clean-ups

   - PIE/POE (permission indirection/overlay) helpers clean-up

  Kselftests:

   - MTE: skip hugetlb tests if MTE is not supported on such mappings
     and user correct naming for sync/async tag checking modes

  Miscellaneous:

   - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
     request)

   - Sysreg updates for new register fields

   - CPU type info for some Qualcomm Kryo cores"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
  arm64: mm: Don't use %pK through printk
  perf/arm_cspmu: Fix missing io.h include
  arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
  arm64: cputype: Add MIDR_CORTEX_A76AE
  arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
  arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
  arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
  arm64/sysreg: Enforce whole word match for open/close tokens
  arm64/sysreg: Fix unbalanced closing block
  arm64: Kconfig: Enable HOTPLUG_SMT
  arm64: topology: Support SMT control on ACPI based system
  arch_topology: Support SMT control for OF based system
  cpu/SMT: Provide a default topology_is_primary_thread()
  arm64/mm: Define PTDESC_ORDER
  perf/arm_cspmu: Add PMEVFILT2R support
  perf/arm_cspmu: Generalise event filtering
  perf/arm_cspmu: Move register definitons to header
  arm64/kernel: Always use level 2 or higher for early mappings
  arm64/mm: Drop PXD_TABLE_BIT
  arm64/mm: Check pmd_table() in pmd_trans_huge()
  ...
2025-03-25 13:16:16 -07:00
Rafael J. Wysocki
7a6589f1aa ARM cpufreq updates for 6.15
- manage sysfs attributes and boost frequencies efficiently from cpufreq
   core to reduce boilerplate code from drivers (Viresh Kumar).
 
 - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
   Dhananjay Ugwekar, Imran Shaik, and zuoqian).
 
 - Migrate to using for_each_present_cpu (Jacky Bai).
 
 - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
 
 - Use str_enable_disable() helper (Lifeng Zheng).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmfXwv8ACgkQ0rkcPK6B
 Ehyx9xAAvPbByizNlClE4Hp8Dg5sv15fpN1klvQuhVbMLQsFeTHryE2+pw/kWZYz
 mmDQoFjUVOBjr8f4ixznlGaUjU2Of6pSS5JZi4G8hRP5aGvDXEielSK/P2AFLRvb
 e33J5/Efb4DEilQVbS1oQfg1kpVZ53bVhtz8CGY/Yk1Dfh/IoUzlM9CCMKUI2h+P
 c12DyGzNeaH9Ne4A4SKcAG//JzOWUc12OAxt4M0a71T4/Hn0qiRb/pz0xVGv8rfa
 0CfSDyFs7fxt4BWHzqHa1q9a6Zvel7Mib0aWqKa9F5ptDzNkFphb6UN0WuaKeBmw
 LHmaxoq7Cn9xWlxK2sHHOGak6CksaBmSFNkrmulXxO9o0Bpt8nqaaXUp2BjqwLlG
 fcnwGGlYCp46mmTDX4NQXYAQpw4Iy7qMpKSlzbjPq/cXsYvlJ6O4+6OHHtVZtj+I
 exjs2HTDe+tS2DEkRSBRtxNYwndhKsnGeRndtSx7oTb9zJGDUPTqIBKH1VuGwHsY
 NBbR5y+b2cRz2LYI1SkurGIKLKEuYP9luR+sxCsqFl7R+yKoSnvPfQk0BXu31rgf
 l+9o/8qYzuZED79unhmqCsTsQCpH+7/s00J1ZOtdZbwNGT78+WuxcP8X2ClAJ8nH
 joJrU0XUBJhexwoDHp4nInpp1k8yjDDGwbRGJK4ZDbZAKCxJZ7k=
 =fyCe
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.15 from Viresh Kumar:

"- manage sysfs attributes and boost frequencies efficiently from cpufreq
   core to reduce boilerplate code from drivers (Viresh Kumar).

 - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
   Dhananjay Ugwekar, Imran Shaik, and zuoqian).

 - Migrate to using for_each_present_cpu (Jacky Bai).

 - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).

 - Use str_enable_disable() helper (Lifeng Zheng)."

* tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (59 commits)
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  cpufreq: tegra186: Share policy per cluster
  cpufreq: tegra194: Allow building for Tegra234
  cpufreq: enable 1200Mhz clock speed for armada-37xx
  cpufreq: Remove cpufreq_enable_boost_support()
  cpufreq: staticize policy_has_boost_freq()
  cpufreq: qcom: Set .set_boost directly
  cpufreq: dt: Set .set_boost directly
  cpufreq: scmi: Set .set_boost directly
  cpufreq: powernv: Set .set_boost directly
  cpufreq: loongson: Set .set_boost directly
  cpufreq: apple: Set .set_boost directly
  cpufreq: Restrict enabling boost on policies with no boost frequencies
  cpufreq: cppc: Set policy->boost_supported
  cpufreq: amd: Set policy->boost_supported
  cpufreq: acpi: Set policy->boost_supported
  ...
2025-03-22 15:00:00 +01:00
Jacky Bai
45f589b716 cpufreq: Init cpufreq only for present CPUs
for_each_possible_cpu() is currently used to initialize cpufreq.
However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.

With nosmp or maxcpus=0, only the boot CPU is present, lead
to the cpufreq probe failure or defer probe due to no cpu device
available for not present CPUs.

Change for_each_possible_cpu() to for_each_present_cpu() in the
above cpufreq drivers to ensure it only registers cpufreq for
CPUs that are actually present.

Fixes: b0c69e1214 ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-03-17 11:29:18 +05:30
Aaron Kling
be4ae8c194 cpufreq: tegra186: Share policy per cluster
This functionally brings tegra186 in line with tegra210 and tegra194,
sharing a cpufreq policy between all cores in a cluster.

Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-03-10 11:20:05 +05:30
Rafael J. Wysocki
7983a0b565 Merge back earlier cpufreq material for 6.15 2025-03-06 21:30:26 +01:00
Mario Limonciello
efb758c8c8 cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline()
When the CPU goes offline there is no need to change the CPPC request
because the CPU will go into the deepest C-state it supports already.

Actually changing the CPPC request when it goes offline messes up the
cached values and can lead to the wrong values being restored when
it comes back.

Instead drop the actions and if the CPU comes back online let
amd_pstate_epp_set_policy() restore it to expected values.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:26 -06:00
Mario Limonciello
4e16c11752 cpufreq/amd-pstate: Stop caching EPP
EPP values are cached in the cpudata structure per CPU. This is needless
though because they are also cached in the CPPC request variable.

Drop the separate cache for EPP values and always reference the CPPC
request variable when needed.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:26 -06:00
Mario Limonciello
2064543f5b cpufreq/amd-pstate: Rework CPPC enabling
The CPPC enable register is configured as "write once".  That is
any future writes don't actually do anything.

Because of this, all the cleanup paths that currently exist for
CPPC disable are non-effective.

Rework CPPC enable to only enable after all the CAP registers have
been read to avoid enabling CPPC on CPUs with invalid _CPC or
unpopulated MSRs.

As the register is write once, remove all cleanup paths as well.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
93039a60fb cpufreq/amd-pstate: Drop debug statements for policy setting
There are trace events that exist now for all amd-pstate modes that
will output information right before programming to the hardware.

This makes the existing debug statements unnecessary remaining
overhead.  Drop them.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
1905fac6f9 cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
On EPP only writes update the cached variable so that the min/max
performance controls don't need to be updated again.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
77fbea69b0 cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions
The EPP tracing is done by the caller today, but this precludes the
information about whether the CPPC request has changed.

Move it into the update_perf and set_epp functions and include information
about whether the request has changed from the last one.
amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy
as an argument instead of the cpudata.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
9f5daa2f2f cpufreq/amd-pstate: Cache CPPC request in shared mem case too
In order to prevent a potential write for shmem_update_perf()
cache the request into the cppc_req_cached variable normally only
used for the MSR case.

This adds symmetry into the code and potentially avoids extra writes.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
b4cc466b97 cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
Bitfield masks are easier to follow and less error prone.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
c630458c7a cpufreq/amd-pstate-ut: Adjust variable scope
In amd_pstate_ut_check_freq() and amd_pstate_ut_check_perf() the cpudata
variable is only needed in the scope of the for loop. Move it there.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
2aac38ac06 cpufreq/amd-pstate-ut: Run on all of the correct CPUs
If a CPU is missing a policy or one has been offlined then the unit test
is skipped for the rest of the CPUs on the system.

Instead; iterate online CPUs and skip any missing policies to allow
continuing to test the rest of them.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
a7875346c6 cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enums
Enums are effectively used as a boolean and don't show
the return value of the failing call.

Instead of using enums switch to returning the actual return
code from the unit test.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
66030cc1c5 cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the same
Several Ryzen AI processors support the exact same value for lowest
nonlinear perf and lowest perf.  Loosen up the unit tests to allow this
scenario.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
93984d3cea cpufreq/amd-pstate-ut: Use _free macro to free put policy
Using a scoped cleanup macro simplifies cleanup code.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
f458cf79d7 cpufreq/amd-pstate: Drop cppc_cap1_cached
The `cppc_cap1_cached` variable isn't used at all, there is no
need to read it at initialization for each CPU.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello
6f0b13f16f cpufreq/amd-pstate: Overhaul locking
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
update the policy state and have nothing to do with the amd-pstate
driver itself.

A global "limits" lock doesn't make sense because each CPU can have
policies changed independently.  Each time a CPU changes values they
will atomically be written to the per-CPU perf member. Drop per CPU
locking cases.

The remaining "global" driver lock is used to ensure that only one
entity can change driver modes at a given time.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello
009d1c29a4 cpufreq/amd-pstate: Move perf values into a union
By storing perf values in a union all the writes and reads can
be done atomically, removing the need for some concurrency protections.

While making this change, also drop the cached frequency values,
using inline helpers to calculate them on demand from perf value.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello
a9b9b4c2a4 cpufreq/amd-pstate: Drop min and max cached frequencies
Use the perf_to_freq helpers to calculate this on the fly.
As the members are no longer cached add an extra check into
amd_pstate_epp_update_limit() to avoid unnecessary calls in
amd_pstate_update_min_max_limit().

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello
a9ba0fd452 cpufreq/amd-pstate: Show a warning when a CPU fails to setup
I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
populated.  This is an unexpected behavior that is most likely a
BIOS bug. In the event it happens I'd like users to report bugs
to properly root cause and get this fixed.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello
b7a4115658 cpufreq/amd-pstate: Invalidate cppc_req_cached during suspend
During resume it's possible the firmware didn't restore the CPPC request
MSR but the kernel thinks the values line up. This leads to incorrect
performance after resume from suspend.

To fix the issue invalidate the cached value at suspend. During resume use
the saved values programmed as cached limits.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Dhananjay Ugwekar
a1d1d8fb65 cpufreq/amd-pstate: Fix the clamping of perf values
The clamping in freq_to_perf() is broken right now, as we first typecast
(read wraparound) the overflowing value into a u8 and then clamp it down.
So, use a u32 to store the >255 value in certain edge cases and then clamp
it down into a u8.

Also, use a "explicit typecast + clamp" instead of just a "clamp_t" as the
latter typecasts first and then clamps between the limits, which defeats
our purpose.

Fixes: 620136ced3 ("cpufreq/amd-pstate: Modularize perf<->freq conversion")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:17 -06:00
Aaron Kling
4a1e3bf61f cpufreq: tegra194: Allow building for Tegra234
Support was added for Tegra234 in the referenced commit, but the Kconfig
was not updated to allow building for the arch.

Fixes: 273bc890a2 ("cpufreq: tegra194: Add support for Tegra234")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-03-04 16:15:25 +05:30
Pawan Gupta
b52aaeeadf cpufreq: intel_pstate: Avoid SMP calls to get cpu-type
Intel pstate driver relies on SMP calls to get the cpu-type of a given CPU.
Remove the SMP calls and instead use the cached value of cpu-type which is
more efficient.

[ mingo: Forward ported it. ]

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-2-2ae010f50370@linux.intel.com
2025-02-27 13:34:52 +01:00
Michael Ellerman
16479389cf cpufreq: ppc_cbe: Remove powerpc Cell driver
This driver can no longer be built since support for IBM Cell Blades was
removed, in particular CBE_RAS.

Remove the driver.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241218105523.416573-22-mpe@ellerman.id.au
2025-02-26 21:15:09 +05:30
Dhananjay Ugwekar
3e93edc58a cpufreq/amd-pstate: Remove the unncecessary driver_lock in amd_pstate_update_limits
There is no need to take a driver wide lock while updating the
highest_perf value in the percpu cpudata struct. Hence remove it.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-13-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
97a705dc1a cpufreq/amd-pstate: Use scope based cleanup for cpufreq_policy refs
There have been instances in past where refcount decrementing is missed
while exiting a function. Use automatic scope based cleanup to avoid
such errors.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-12-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
426db24d4d cpufreq/amd-pstate: Add missing NULL ptr check in amd_pstate_update
Check if policy is NULL before dereferencing it in amd_pstate_update.

Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-11-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
b899434857 cpufreq/amd-pstate: Remove the unnecessary cpufreq_update_policy call
The update_limits callback is only called in two conditions.

* When the preferred core rankings change. In which case, we just need to
change the prefcore ranking in the cpudata struct. As there are no changes
to any of the perf values, there is no need to call cpufreq_update_policy()

* When the _PPC ACPI object changes, i.e. the highest allowed Pstate
changes. The _PPC object is only used for a table based cpufreq driver
like acpi-cpufreq, hence is irrelevant for CPPC based amd-pstate.

Hence, the cpufreq_update_policy() call becomes unnecessary and can be
removed.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-9-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
620136ced3 cpufreq/amd-pstate: Modularize perf<->freq conversion
Delegate the perf<->frequency conversion to helper functions to reduce
code duplication, and improve readability.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-8-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
555bbe67a6 cpufreq/amd-pstate: Convert all perf values to u8
All perf values are always within 0-255 range, hence convert their
datatype to u8 everywhere.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-7-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
e9869c836b cpufreq/amd-pstate: Pass min/max_limit_perf as min/max_perf to amd_pstate_update
Currently, amd_pstate_update_freq passes the hardware perf limits as
min/max_perf to amd_pstate_update, which eventually gets programmed into
the min/max_perf fields of the CPPC_REQ register.

Instead pass the effective perf limits i.e. min/max_limit_perf values to
amd_pstate_update as min/max_perf.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-6-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
932da64896 cpufreq/amd-pstate: Remove the redundant des_perf clamping in adjust_perf
des_perf is later on clamped between min_perf and max_perf in
amd_pstate_update. So, remove the redundant clamping from
amd_pstate_adjust_perf.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-5-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar
6ceb877d5c cpufreq/amd-pstate: Modify the min_perf calculation in adjust_perf callback
Instead of setting a fixed floor at lowest_nonlinear_perf, use the
min_limit_perf value, so that it gives the user the freedom to lower the
floor further.

There are two minimum frequency/perf limits that we need to consider in
the adjust_perf callback. One provided by schedutil i.e. the sg_cpu->bw_min
value passed in _min_perf arg, another is the effective value of
min_freq_qos request that is updated in cpudata->min_limit_perf. Modify the
code to use the bigger of these two values.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-4-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Rafael J. Wysocki
ed7cad0504 cpufreq: intel_pstate: Relocate platform preference check
Move the invocation of intel_pstate_platform_pwr_mgmt_exists() before
checking whether or not HWP is enabled because it does not depend on
any code running before it except for the vendor check and if CPU
performance scaling is going to be carried out by the platform, all of
the code that runs before that function (again, except for the vendor
check) is redundant.

This is not expected to alter any functionality except for the ordering
of messages printed by intel_pstate_init() when it is going to return an
error before attempting to register the driver.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2776745.mvXUDI8C0e@rjwysocki.net
2025-02-21 18:09:21 +01:00
Jie Zhan
3698dd6b13 cpufreq: governor: Fix negative 'idle_time' handling in dbs_update()
We observed an issue that the CPU frequency can't raise up with a 100% CPU
load when NOHZ is off and the 'conservative' governor is selected.

'idle_time' can be negative if it's obtained from get_cpu_idle_time_jiffy()
when NOHZ is off.  This was found and explained in commit 9485e4ca0b
("cpufreq: governor: Fix handling of special cases in dbs_update()").

However, commit 7592019634 ("cpufreq: governors: Fix long idle detection
logic in load calculation") introduced a comparison between 'idle_time' and
'samling_rate' to detect a long idle interval.  While 'idle_time' is
converted to int before comparison, it's actually promoted to unsigned
again when compared with an unsigned 'sampling_rate'.  Hence, this leads to
wrong idle interval detection when it's in fact 100% busy and sets
policy_dbs->idle_periods to a very large value.  'conservative' adjusts the
frequency to minimum because of the large 'idle_periods', such that the
frequency can't raise up.  'Ondemand' doesn't use policy_dbs->idle_periods
so it fortunately avoids the issue.

Correct negative 'idle_time' to 0 before any use of it in dbs_update().

Fixes: 7592019634 ("cpufreq: governors: Fix long idle detection logic in load calculation")
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Link: https://patch.msgid.link/20250213035510.2402076-1-zhanjie9@hisilicon.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-20 20:27:19 +01:00
Benjamin Schneider
f2d3294202 cpufreq: enable 1200Mhz clock speed for armada-37xx
This frequency was disabled because of stability problems whose source could
not be accurately identified[1]. After seven months of testing, the evidence
points to an incorrectly configured bootloader as the source of the historical
instability. Testing was performed on two A3720 devices with this frequency
enabled and the ondemand policy in use. Marvell merged[2] changes to their
bootloader source needed to address the stability issue. This driver should
expose this frequency option to users.

[1] 484f2b7c61
[2] https://github.com/MarvellEmbeddedProcessors/mv-ddr-marvell/pull/44

Signed-off-by: Benjamin Schneider <ben@bens.haus>
Reviewed-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-19 11:21:46 +05:30
Rafael J. Wysocki
7802fce7dc cpufreq: intel_pstate: Make it possible to avoid enabling CAS
Capacity-aware scheduling (CAS) is enabled by default by intel_pstate on
hybrid systems without SMT, but in some usage scenarios it may be more
attractive to place tasks for maximum CPU performance regardless of the
extra cost in terms of energy, which is the case on such systems when
CAS is not enabled, so introduce a command line option to forbid
intel_pstate to enable CAS.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by:Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/2781262.mvXUDI8C0e@rjwysocki.net
2025-02-18 20:47:33 +01:00
Beata Michalska
fbb4a4759b cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
Currently the CPUFreq core exposes two sysfs attributes that can be used
to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
and scaling_cur_freq. Both provide slightly different view on the
subject and they do come with their own drawbacks.

cpuinfo_cur_freq provides higher precision though at a cost of being
rather expensive. Moreover, the information retrieved via this attribute
is somewhat short lived as frequency can change at any point of time
making it difficult to reason from.

scaling_cur_freq, on the other hand, tends to be less accurate but then
the actual level of precision (and source of information) varies between
architectures making it a bit ambiguous.

The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
distinct interface, exposing an average frequency of a given CPU(s), as
reported by the hardware, over a time frame spanning no more than a few
milliseconds. As it requires appropriate hardware support, this
interface is optional.

Note that under the hood, the new attribute relies on the information
provided by arch_freq_get_on_cpu, which, up to this point, has been
feeding data for scaling_cur_freq attribute, being the source of
ambiguity when it comes to interpretation. This has been amended by
restoring the intended behavior for scaling_cur_freq, with a new
dedicated config option to maintain status quo for those, who may need
it.

CC: Jonathan Corbet <corbet@lwn.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Phil Auld <pauld@redhat.com>
CC: x86@kernel.org
CC: linux-doc@vger.kernel.org

Signed-off-by: Beata Michalska <beata.michalska@arm.com>
Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20250131162439.3843071-3-beata.michalska@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-17 18:09:31 +00:00
Beata Michalska
38e480d4fc cpufreq: Allow arch_freq_get_on_cpu to return an error
Allow arch_freq_get_on_cpu to return an error for cases when retrieving
current CPU frequency is not possible, whether that being due to lack of
required arch support or due to other circumstances when the current
frequency cannot be determined at given point of time.

Signed-off-by: Beata Michalska <beata.michalska@arm.com>
Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20250131162439.3843071-2-beata.michalska@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-17 18:09:20 +00:00
Viresh Kumar
0322f3e89b cpufreq: Remove cpufreq_enable_boost_support()
Remove the now unused helper, cpufreq_enable_boost_support().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:16 +05:30
Viresh Kumar
c952775a3d cpufreq: staticize policy_has_boost_freq()
policy_has_boost_freq() isn't used outside of freq_table.c now, mark it
static.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:16 +05:30
Viresh Kumar
e8b08af135 cpufreq: qcom: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:16 +05:30
Viresh Kumar
707e222314 cpufreq: dt: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
11847a5c12 cpufreq: scmi: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
2025-02-07 09:45:15 +05:30
Viresh Kumar
3fd9203778 cpufreq: powernv: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
13e92357b6 cpufreq: loongson: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
ddef17bb86 cpufreq: apple: Set .set_boost directly
The boost feature can be controlled at two levels currently, driver
level (applies to all policies) and per-policy.

Currently the driver enables driver level boost support from the
per-policy ->init() callback, which isn't really efficient as that gets
called for each policy and then there is online/offline path too where
this gets done unnecessarily.

Instead set the .set_boost field directly and always enable the boost
support. If a policy doesn't support boost feature, the core will not
enable it for that policy.

Keep the initial state of driver level boost to disabled and let the
user enable it if required as ideally the boost frequencies must be used
only when really required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
691b321278 cpufreq: Restrict enabling boost on policies with no boost frequencies
It is possible to have a scenario where not all cpufreq policies support
boost frequencies. And letting sysfs (or other parts of the kernel)
enable boost feature for that policy isn't correct.

Now that all drivers (that required a change) are updated to set the
policy->boost_supported properly, check this flag before enabling boost
feature for a policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
a3f48fb2e5 cpufreq: cppc: Set policy->boost_supported
With a later commit, the cpufreq core will call the ->set_boost()
callback only if the policy supports boost frequency. The
boost_supported flag is set by the cpufreq core if policy->freq_table is
set and one or more boost frequencies are present.

For other drivers, the flag must be set explicitly.

With this, the local variable boost_supported isn't required anymore.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
98f39e93d1 cpufreq: amd: Set policy->boost_supported
With a later commit, the cpufreq core will call the ->set_boost()
callback only if the policy supports boost frequency. The
boost_supported flag is set by the cpufreq core if policy->freq_table is
set and one or more boost frequencies are present.

For other drivers, the flag must be set explicitly.

The policy->boost_enabled flag is set by the cpufreq core once the
policy is initialized, don't set it anymore.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
be6b8681a0 cpufreq: acpi: Set policy->boost_supported
With a later commit, the cpufreq core will call the ->set_boost()
callback only if the policy supports boost frequency. The
boost_supported flag is set by the cpufreq core if policy->freq_table is
set and one or more boost frequencies are present.

For other drivers, the flag must be set explicitly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Viresh Kumar
1f7d1bab50 cpufreq: Introduce policy->boost_supported flag
It is possible to have a scenario where not all cpufreq policies support
boost frequencies. And letting sysfs (or other parts of the kernel)
enable boost feature for that policy isn't correct.

Add a new flag, boost_supported, which will be set to true by the
cpufreq core only if the freq table contains valid boost frequencies.

Some cpufreq drivers though don't have boost frequencies in the
freq-table, they can set this flag from their ->init() callbacks.

Once all the drivers are updated to set the flag correctly, we can check
it before enabling boost feature for a policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
9a23eb8b2b cpufreq: Export cpufreq_boost_set_sw()
This will be used directly by cpufreq driver going forward, export it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
1f04815057 cpufreq: staticize cpufreq_boost_trigger_state()
cpufreq_boost_trigger_state() is only used by cpufreq core, mark it
static.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
38bcdb635a cpufreq: Stop checking for duplicate available/boost freq attributes
None of the drivers set these attributes directly now, remove the
unnecessary check.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
486729c601 cpufreq: Remove cpufreq_generic_attrs
All users of cpufreq_generic_attr are migrated now, remove it. While at
it, also stop exporting attributes for available and boost frequencies
as they are only used by cpufreq core now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
0df09bf56e cpufreq: virtual: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
260d6cdc7b cpufreq: vexpress: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
f577fab0cc cpufreq: tegra: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
63c778aa15 cpufreq: speedstep: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:14 +05:30
Viresh Kumar
c3245e78b5 cpufreq: spear: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
7b748fa7f3 cpufreq: sh: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
ad3f116fe3 cpufreq: scpi: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
50b8cd5c91 cpufreq: scmi: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
e2079dcc2b cpufreq: sc520_freq: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
e382146efa cpufreq: qoriq: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
ac0bcf38f3 cpufreq: qcom: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
792e6a8ec2 cpufreq: powernv: Stop setting common freq attributes
The cpufreq core handles this now, the driver can skip setting it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
5b6fc62eff cpufreq: powernow: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
6cdc8c3ca9 cpufreq: pmac: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:13 +05:30
Viresh Kumar
d3d57f9d2e cpufreq: pasemi: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
047124e431 cpufreq: p4: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
ef282f6bef cpufreq: omap: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
1a867c7ce6 cpufreq: mediatek: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
06e9a34aa8 cpufreq: loongson: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
d4a3b9572b cpufreq: longhaul: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
25e4d8c131 cpufreq: kirkwood: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
03973e997f cpufreq: imx6q: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
32ada732b6 cpufreq: elanfreq: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
b9b60007e6 cpufreq: e_powersaver: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
6f80f75511 cpufreq: davinci: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:12 +05:30
Viresh Kumar
80f9f241bb cpufreq: brcmstb: Stop setting common freq attributes
The cpufreq core handles this now, the driver can skip setting it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Viresh Kumar
818c3748ad cpufreq: bmips: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Viresh Kumar
8b04d1435f cpufreq: apple: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Viresh Kumar
5c840223ab cpufreq: acpi: Stop setting common freq attributes
The core handles this now, the driver can skip setting it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Viresh Kumar
991e0a064b cpufreq: dt: Stop setting cpufreq_driver->attr field
The cpufreq core now handles this for basic attributes, including boost
frequencies, the driver can skip setting them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Viresh Kumar
dc47f23f1d cpufreq: Always create freq-table related sysfs file
Currently it is left for the individual drivers to set the available and
boost frequencies related attributes in the cpufreq_driver->attr field.
Some drivers provide them, while others don't.

A quick search revealed that only the drivers that set the
policy->freq_table field, enable these attributes. Which makes sense as
well, since the show_available_freqs() helper works only if the
freq_table is present.

In order to simplify drivers, create the relevant sysfs files forcefully
from cpufreq core.

For now, skip adding them twice. This can be removed once all the
drivers are updated.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2025-02-07 09:45:11 +05:30
Lifeng Zheng
4ba6d37ccc cpufreq: Use str_enable_disable() helper
Commit f994c1cb6c ("cpufreq: Use str_enable_disable()-like helpers") has
already introduced helpers from string_choices.h and replaced ternary
syntax with it. Use str_enable_disable() helper in this line to stay
consistent.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:02 +05:30
Rafael J. Wysocki
b3cc5afc4d amd-pstate fixes 2/6/25
* Fix some error cleanup paths with mutex use and boost
 * Fix a ref counting issue
 * Fix a schedutil issue
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmelDV8aHG1hcmlvLmxp
 bW9uY2llbGxvQGFtZC5jb20ACgkQLRkspiR3Anbo3g/+IiTRN6zM+LfsdbHKNFzx
 X7ksJYlUPIPyF/osIUMuWvphQRucd5NEB44kaM60Iwc+2xcQmZFTL8DadIDEbYKY
 0y+Qm9bPPYWTDUrczSwqoYrGU5XP7jWFbtIt6HVfBhiFshL1w0kbj4c7VSxRBI4h
 vzMHu2WWE+qJ2520gCMXXlhQy5bu7o3VG+qNQexl32/VnZWCTeFp2G+DKzBZ61sL
 hLlXdgokNrR2MIHtEKZG9mMQ6Vvuov/P8d4D6I2wtYKc2I7JMdJd/tjIOdvjraAi
 7P9+MOnfGyUejG6m6k6u0bxDR+kWCGQXbZ+JRBJoTdDSI1STyxq4i2UoTFBHQ2M+
 xOnGOHFKWyN8OXvRIAhdnfAKTqrIn/wOSxsMMI3/d1yvFkV7CFG79R1hpipVMSxx
 WPi9a03mMP9nzt+3MsAu0O7Ma1F6RKe5O4rwDsbIueMZAbkEp54DRJq+mJEeqiEW
 mZffJOMkz+XzpEAZ4fCzHjOA09FLu0KnJclBUOzlWVZr+Dyt9dP+k8mNiBJSUoWO
 uxKMz7lwC/2Z0zSVJWnMMN66tkkhrCXg2PM9/Akx74UEmHaODUJHrRxam+SoMXta
 y1KCIXmNoNHnvPjWEE2H+TjBwVKjB4h22GpyRmpaTD/z968jUXLFcZGmgntryRWS
 NfDVzBVrCHOKix8ohNF9IO4=
 =xfUQ
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v6.14-2025-02-06' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge amd-pstate driver fixes for 6.14-rc2 from Mario Limonciello:

"* Fix some error cleanup paths with mutex use and boost
 * Fix a ref counting issue
 * Fix a schedutil issue"

* tag 'amd-pstate-v6.14-2025-02-06' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate: Fix cpufreq_policy ref counting
  cpufreq/amd-pstate: Fix max_perf updation with schedutil
  cpufreq/amd-pstate: Remove the goto label in amd_pstate_update_limits
  cpufreq/amd-pstate: Fix per-policy boost flag incorrect when fail
2025-02-06 20:39:43 +01:00
Dhananjay Ugwekar
3ace20038e cpufreq/amd-pstate: Fix cpufreq_policy ref counting
amd_pstate_update_limits() takes a cpufreq_policy reference but doesn't
decrement the refcount in one of the exit paths, fix that.

Fixes: 45722e777f ("cpufreq: amd-pstate: Optimize amd_pstate_update_limits()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-10-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-06 13:19:36 -06:00
Dhananjay Ugwekar
db1cafc77a cpufreq: amd-pstate: Remove unnecessary driver_lock in set_boost
set_boost is a per-policy function call, hence a driver wide lock is
unnecessary. Also this mutex_acquire can collide with the mutex_acquire
from the mode-switch path in status_store(), which can lead to a
deadlock. So, remove it.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-06 09:04:30 +05:30
zuoqian
4742da9774 cpufreq: scpi: compare kHz instead of Hz
The CPU rate from clk_get_rate() may not be divisible by 1000
(e.g., 133333333). But the rate calculated from frequency(kHz) is
always divisible by 1000 (e.g., 133333000).
Comparing the rate causes a warning during CPU scaling:
"cpufreq: __target_index: Failed to change cpu frequency: -5".
When we choose to compare kHz here, the issue does not occur.

Fixes: 343a8d17fa ("cpufreq: scpi: remove arm_big_little dependency")
Signed-off-by: zuoqian <zuoqian113@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-06 09:04:30 +05:30
Aboorva Devarajan
0813fd2e14 cpufreq: prevent NULL dereference in cpufreq_online()
Ensure cpufreq_driver->set_boost is non-NULL before using it in
cpufreq_online() to prevent a potential NULL pointer dereference.

Reported-by: Gautam Menghani <gautam@linux.ibm.com>
Closes: https://lore.kernel.org/all/c9e56c5f54cc33338762c94e9bed7b5a0d5de812.camel@linux.ibm.com/
Fixes: dd016f379e ("cpufreq: Introduce a more generic way to set default per-policy boost flag")
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Link: https://patch.msgid.link/20250205181347.2079272-1-aboorvad@linux.ibm.com
[ rjw: Minor edits in the subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-05 21:02:39 +01:00
Arnd Bergmann
90508a1bb8 cpufreq: airoha: modify CONFIG_OF dependency
Compile-testing without CONFIG_OF leads to a harmless build warning:

drivers/cpufreq/airoha-cpufreq.c:109:34: error: 'airoha_cpufreq_match_list' defined but not used [-Werror=unused-const-variable=]
  109 | static const struct of_device_id airoha_cpufreq_match_list[] __initconst = {
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~

It would be possible to mark the variable as __maybe_unused to shut up
that warning, but a Kconfig dependency seems more appropriate as this still
allows build testing in allmodconfig and randconfig builds on all
architectures.

An earlier commit, b865a84046 ("cpufreq: airoha: Depends on OF"),
tried to fix it incorrectly. ARCH_AIROHA already requires CONFIG_OF, so
this change does nothing, and the dependency is still missing for the
COMPILE_TEST case.

Fix it properly.

Fixes: 84cf9e541c ("cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver")
Fixes: b865a84046 ("cpufreq: airoha: Depends on OF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[ Viresh: updated commit log and fixed rebase conflict ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/9d51d2710061dfa7f2568287c6ed125b858b7318.1738580005.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-05 20:59:42 +01:00
Dhananjay Ugwekar
55db9b73c3 cpufreq/amd-pstate: Fix max_perf updation with schedutil
In adjust_perf() callback, we are setting the max_perf to highest_perf,
as opposed to the correct limit value i.e. max_limit_perf. Fix that.

Fixes: 3f7b835fa4 ("cpufreq/amd-pstate: Move limit updating code")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-3-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Dhananjay Ugwekar
d364eee14c cpufreq/amd-pstate: Remove the goto label in amd_pstate_update_limits
Scope based guard/cleanup macros should not be used together with goto
labels. Hence, remove the goto label.

Fixes: 6c093d5a5b ("cpufreq/amd-pstate: convert mutex use to guard()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-2-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Lifeng Zheng
fa803513ab cpufreq/amd-pstate: Fix per-policy boost flag incorrect when fail
Commit c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision
boost state") sets per-policy boost flag to false when boost fail.
However, this boost flag will be set to reverse value in
store_local_boost() and cpufreq_boost_trigger_state() in cpufreq.c. This
will cause the per-policy boost flag set to true when fail to set boost.
Remove the extra assignment in amd_pstate_set_boost() and keep all
operations on per-policy boost flag outside of set_boost() to fix this
problem.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250110091949.3610770-1-zhenglifeng1@huawei.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-03 00:04:23 -06:00
Viresh Kumar
b865a84046 cpufreq: airoha: Depends on OF
The Airoha cpufreq depends on OF and must be marked as such. With the
kernel compiled without OF support, we get following warning:

drivers/cpufreq/airoha-cpufreq.c:109:34: warning: 'airoha_cpufreq_match_list' defined but not used [-Wunused-const-variable=]
    109 | static const struct of_device_id airoha_cpufreq_match_list[] __initconst = {
        |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501251941.0fXlcd1D-lkp@intel.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/455e18c947bd9529701a2f1c796f0f934d1354d7.1738050679.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-29 10:56:11 +01:00
Lifeng Zheng
2b16c63183 cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()
At the end of cpufreq_online() in cpufreq.c, set_boost is executed and
the per-policy boost flag is set to mirror the cpufreq_driver boost, so
it is not necessary to run set_boost in acpi_cpufreq_cpu_init().

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250117101457.1530653-5-zhenglifeng1@huawei.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 21:06:33 +01:00
Lifeng Zheng
03d8b4e762 cpufreq: CPPC: Fix wrong max_freq in policy initialization
In policy initialization, policy->max and policy->cpuinfo.max_freq are
always set to the value calculated from caps->nominal_perf.

This will cause the frequency stay on base frequency even if the policy
is already boosted when a CPU is going online.

Fix this by using policy->boost_enabled to determine which value should
be set.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250117101457.1530653-4-zhenglifeng1@huawei.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 21:06:33 +01:00
Lifeng Zheng
dd016f379e cpufreq: Introduce a more generic way to set default per-policy boost flag
In cpufreq_online() of cpufreq.c, the per-policy boost flag is already
set to mirror the cpufreq_driver boost during init but using freq_table
to judge if the policy has boost frequency. There are two drawbacks to
this approach:

 1. It doesn't work for the cpufreq drivers that do not use a frequency
    table. For now, acpi-cpufreq and amd-pstate have to enable boost in
    policy initialization. And cppc_cpufreq never set policy to boost
    when going online no matter what the cpufreq_driver boost flag is.

 2. If the CPU goes offline when cpufreq_driver boost is enabled and
    then goes online when cpufreq_driver boost is disabled, the
    per-policy boost flag will incorrectly remain true.

Running set_boost at the end of the online process is a more generic way
for all cpufreq drivers.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250117101457.1530653-3-zhenglifeng1@huawei.com
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 21:05:54 +01:00
Lifeng Zheng
1608f02305 cpufreq: Fix re-boost issue after hotplugging a CPU
It turns out that CPUX will stay on the base frequency after performing
these operations:

 1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost

 2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online

 3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost

 4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online

 5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost

This is because max_freq_req of the policy is not updated during the
online process, and the value of max_freq_req before the last offline is
retained.

When the CPU is boosted again, freq_qos_update_request() will do nothing
because the old value is the same as the new one. This causes the CPU to
stay at the base frequency. Updating max_freq_req  in cpufreq_online()
will solve this problem.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250117101457.1530653-2-zhenglifeng1@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 21:04:21 +01:00
Viresh Kumar
43855ac614 cpufreq: s3c64xx: Fix compilation warning
The driver generates following warning when regulator support isn't
enabled in the kernel. Fix it.

   drivers/cpufreq/s3c64xx-cpufreq.c: In function 's3c64xx_cpufreq_set_target':
>> drivers/cpufreq/s3c64xx-cpufreq.c:55:22: warning: variable 'old_freq' set but not used [-Wunused-but-set-variable]
      55 |         unsigned int old_freq, new_freq;
         |                      ^~~~~~~~
>> drivers/cpufreq/s3c64xx-cpufreq.c:54:30: warning: variable 'dvfs' set but not used [-Wunused-but-set-variable]
      54 |         struct s3c64xx_dvfs *dvfs;
         |                              ^~~~

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501191803.CtfT7b2o-lkp@intel.com/
Cc: 5.4+ <stable@vger.kernel.org> # v5.4+
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/236b227e929e5adc04d1e9e7af6845a46c8e9432.1737525916.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-23 20:47:32 +01:00
Linus Torvalds
f4b9d3bf44 Power management updates for 6.14-rc1
- Use str_enable_disable()-like helpers in cpufreq (Krzysztof
    Kozlowski).
 
  - Extend the Apple cpufreq driver to support more SoCs (Hector Martin,
    Nick Chan).
 
  - Add new cpufreq driver for Airoha SoCs (Christian Marangi).
 
  - Fix using cpufreq-dt as module (Andreas Kemnade).
 
  - Minor fixes for Sparc, SCMI, and Qcom cpufreq drivers (Ethan Carter
    Edwards, Sibi Sankar, Manivannan Sadhasivam).
 
  - Fix the maximum supported frequency computation in the ACPI cpufreq
    driver to avoid relying on unfounded assumptions (Gautham Shenoy).
 
  - Fix an amd-pstate driver regression with preferred core rankings not
    being used (Mario Limonciello).
 
  - Fix a precision issue with frequency calculation in the amd-pstate
    driver (Naresh Solanki).
 
  - Add ftrace event to the amd-pstate driver for active mode (Mario
    Limonciello).
 
  - Set default EPP policy on Ryzen processors in amd-pstate (Mario
    Limonciello).
 
  - Clean up the amd-pstate cpufreq driver and optimize it to increase
    code reuse (Mario Limonciello, Dhananjay Ugwekar).
 
  - Use CPPC to get scaling factors between HWP performance levels and
    frequency in the intel_pstate driver and make it stop using a built
    -in scaling factor for Arrow Lake processors (Rafael Wysocki).
 
  - Make intel_pstate initialize epp_policy to CPUFREQ_POLICY_UNKNOWN for
    consistency with CPU offline (Christian Loehle).
 
  - Fix superfluous updates caused by need_freq_update in the schedutil
    cpufreq governor (Sultan Alsawaf).
 
  - Allow configuring the system suspend-resume (DPM) watchdog to warn
    earlier than panic (Douglas Anderson).
 
  - Implement devm_device_init_wakeup() helper and introduce a device-
    managed variant of dev_pm_set_wake_irq() (Joe Hattori, Peng Fan).
 
  - Remove direct inclusions of 'pm_wakeup.h' which should be only
    included via 'device.h' (Wolfram Sang).
 
  - Clean up two comments in the core system-wide PM code (Rafael
    Wysocki, Randy Dunlap).
 
  - Add Clearwater Forest processor support to the intel_idle cpuidle
    driver (Artem Bityutskiy).
 
  - Clean up the Exynos devfreq driver and devfreq core (Markus Elfring,
    Jeongjun Park).
 
  - Minor cleanups and fixes for OPP (Dan Carpenter, Neil Armstrong, Joe
    Hattori).
 
  - Implement dev_pm_opp_get_bw() (Neil Armstrong).
 
  - Expose OPP reference counting helpers for Rust (Viresh Kumar).
 
  - Fix TSC MHz calculation in cpupower (He Rongguang).
 
  - Add install and uninstall options to bindings Makefile and add header
    changes for cpufreq.h to SWIG bindings in cpupower (John B. Wyatt IV).
 
  - Add missing residency header changes in cpuidle.h to SWIG bindings in
    cpupower (John B. Wyatt IV).
 
  - Add output files to .gitignore and clean them up in "make clean" in
    selftests/cpufreq (Li Zhijian).
 
  - Fix cross-compilation in cpupower Makefile (Peng Fan).
 
  - Revise the is_valid flag handling for idle_monitor in the cpupower
    utility (wangfushuai).
 
  - Extend and clean up AMD processors support in cpupower (Mario
    Limonciello).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmeOthsSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxqQsP/ivDt8nqDnxdKB7cKFQIsEK+tl0RnFVD
 o5regvYeRcGWpUXuMaqBtTmCMjsB8bUkcj2yLquM54ubjHAGF6zJuw9ZytMPHVcC
 b2xk3RCFlXSBFXVK8eOh3XRviA9nGhuY97ZnPsQOlvoECrxT2xyeL+mWo7s+t+q9
 2NUH+yfRoi5FM+nqqDhsm0xXxJuPaNg6eAjIASuMjXap48rNk3L5kW6W/6nw7i0I
 xQWd/pKLHaI5e7DRF/QdMKu8+Fm4BbN0jMqLblKPOmTe9KggvBkck5q1Um20sYkJ
 vdKMAT02ClGavIC7DtY092Xik84NZfID4ZUchS6e2hJIQ3Uaw/eDvAo/jlT8gIzq
 fnXPdApRIzQGDvMxFaAsKaGlwxiVlAGHPDSTH6MVWzsp+1DSkbloSwVPAfeYIn44
 Jhov+6Ydux3597sSjo+YmD58acimXl7urVuk8P6m3U5+gb8/jlgbxpIn+vbxH3Ka
 o44Vt7axD63gezOQY134sj5gic5JL0GuZovOlvzrF6+FsjvVqcax6FZ4n3uIXu7P
 C1nwai+Wdzo7wvuz7RfO0g15Y15wYLQLYsRq/osRlf+sOmGVv7nA9tSzZ0LUdD5D
 Pp6PxppF6anM0Kjen8Ppuu+Bcr11JfVvhnVTJqhs6u71XdAy4TnG1JjL4lPWYJ4D
 Gfz2hyPNjiQX
 =AoMC
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The majority of changes here are cpufreq updates which are dominated
  by amd-pstate driver changes, like in the previous cycle. Moreover,
  changes related to amd-pstate are also the majority of cpupower
  utility updates.

  Included are some pieces of new hardware support, like the addition of
  Clearwater Forest processors support to intel_idle, new cpufreq driver
  for Airoha SoCs, and Apple cpufreq driver extensions to support more
  SoCs. The intel_pstate driver is also extended to be able to support
  new platforms by using ACPI CPPC to compute scaling factors between
  HWP performance states and frequency.

  The rest is mostly fixes and cleanups in assorted pieces of power
  management code.

  Specifics:

   - Use str_enable_disable()-like helpers in cpufreq (Krzysztof
     Kozlowski)

   - Extend the Apple cpufreq driver to support more SoCs (Hector
     Martin, Nick Chan)

   - Add new cpufreq driver for Airoha SoCs (Christian Marangi)

   - Fix using cpufreq-dt as module (Andreas Kemnade)

   - Minor fixes for Sparc, SCMI, and Qcom cpufreq drivers (Ethan Carter
     Edwards, Sibi Sankar, Manivannan Sadhasivam)

   - Fix the maximum supported frequency computation in the ACPI cpufreq
     driver to avoid relying on unfounded assumptions (Gautham Shenoy)

   - Fix an amd-pstate driver regression with preferred core rankings
     not being used (Mario Limonciello)

   - Fix a precision issue with frequency calculation in the amd-pstate
     driver (Naresh Solanki)

   - Add ftrace event to the amd-pstate driver for active mode (Mario
     Limonciello)

   - Set default EPP policy on Ryzen processors in amd-pstate (Mario
     Limonciello)

   - Clean up the amd-pstate cpufreq driver and optimize it to increase
     code reuse (Mario Limonciello, Dhananjay Ugwekar)

   - Use CPPC to get scaling factors between HWP performance levels and
     frequency in the intel_pstate driver and make it stop using a
     built-in scaling factor for Arrow Lake processors (Rafael Wysocki)

   - Make intel_pstate initialize epp_policy to CPUFREQ_POLICY_UNKNOWN
     for consistency with CPU offline (Christian Loehle)

   - Fix superfluous updates caused by need_freq_update in the schedutil
     cpufreq governor (Sultan Alsawaf)

   - Allow configuring the system suspend-resume (DPM) watchdog to warn
     earlier than panic (Douglas Anderson)

   - Implement devm_device_init_wakeup() helper and introduce a device-
     managed variant of dev_pm_set_wake_irq() (Joe Hattori, Peng Fan)

   - Remove direct inclusions of 'pm_wakeup.h' which should be only
     included via 'device.h' (Wolfram Sang)

   - Clean up two comments in the core system-wide PM code (Rafael
     Wysocki, Randy Dunlap)

   - Add Clearwater Forest processor support to the intel_idle cpuidle
     driver (Artem Bityutskiy)

   - Clean up the Exynos devfreq driver and devfreq core (Markus
     Elfring, Jeongjun Park)

   - Minor cleanups and fixes for OPP (Dan Carpenter, Neil Armstrong,
     Joe Hattori)

   - Implement dev_pm_opp_get_bw() (Neil Armstrong)

   - Expose OPP reference counting helpers for Rust (Viresh Kumar)

   - Fix TSC MHz calculation in cpupower (He Rongguang)

   - Add install and uninstall options to bindings Makefile and add
     header changes for cpufreq.h to SWIG bindings in cpupower (John B.
     Wyatt IV)

   - Add missing residency header changes in cpuidle.h to SWIG bindings
     in cpupower (John B. Wyatt IV)

   - Add output files to .gitignore and clean them up in "make clean" in
     selftests/cpufreq (Li Zhijian)

   - Fix cross-compilation in cpupower Makefile (Peng Fan)

   - Revise the is_valid flag handling for idle_monitor in the cpupower
     utility (wangfushuai)

   - Extend and clean up AMD processors support in cpupower (Mario
     Limonciello)"

* tag 'pm-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits)
  PM / OPP: Add reference counting helpers for Rust implementation
  PM: sleep: wakeirq: Introduce device-managed variant of dev_pm_set_wake_irq()
  cpufreq: Use str_enable_disable()-like helpers
  cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver
  PM: sleep: Allow configuring the DPM watchdog to warn earlier than panic
  PM: sleep: convert comment from kernel-doc to plain comment
  cpufreq: ACPI: Fix max-frequency computation
  pm: cpupower: Add missing residency header changes in cpuidle.h to SWIG
  PM / devfreq: exynos: remove unused function parameter
  OPP: OF: Fix an OF node leak in _opp_add_static_v2()
  cpufreq/amd-pstate: Refactor max frequency calculation
  cpufreq/amd-pstate: Fix prefcore rankings
  pm: cpupower: Add header changes for cpufreq.h to SWIG bindings
  cpufreq: sparc: change kzalloc to kcalloc
  cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocks
  cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is not available
  cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq support
  cpufreq: apple-soc: Set fallback transition latency to APPLE_DVFS_TRANSITION_TIMEOUT
  cpufreq: apple-soc: Increase cluster switch timeout to 400us
  cpufreq: apple-soc: Use 32-bit read for status register
  ...
2025-01-22 11:16:14 -08:00
Linus Torvalds
1d6d399223 Kthreads affinity follow either of 4 existing different patterns:
1) Per-CPU kthreads must stay affine to a single CPU and never execute
    relevant code on any other CPU. This is currently handled by smpboot
    code which takes care of CPU-hotplug operations. Affinity here is
    a correctness constraint.
 
 2) Some kthreads _have_ to be affine to a specific set of CPUs and can't
    run anywhere else. The affinity is set through kthread_bind_mask()
    and the subsystem takes care by itself to handle CPU-hotplug
    operations. Affinity here is assumed to be a correctness constraint.
 
 3) Per-node kthreads _prefer_ to be affine to a specific NUMA node. This
    is not a correctness constraint but merely a preference in terms of
    memory locality. kswapd and kcompactd both fall into this category.
    The affinity is set manually like for any other task and CPU-hotplug
    is supposed to be handled by the relevant subsystem so that the task
    is properly reaffined whenever a given CPU from the node comes up.
    Also care should be taken so that the node affinity doesn't cross
    isolated (nohz_full) cpumask boundaries.
 
 4) Similar to the previous point except kthreads have a _preferred_
    affinity different than a node. Both RCU boost kthreads and RCU
    exp kworkers fall into this category as they refer to "RCU nodes"
    from a distinctly distributed tree.
 
 Currently the preferred affinity patterns (3 and 4) have at least 4
 identified users, with more or less success when it comes to handle
 CPU-hotplug operations and CPU isolation. Each of which do it in its own
 ad-hoc way.
 
 This is an infrastructure proposal to handle this with the following API
 changes:
 
 _ kthread_create_on_node() automatically affines the created kthread to
   its target node unless it has been set as per-cpu or bound with
   kthread_bind[_mask]() before the first wake-up.
 
 - kthread_affine_preferred() is a new function that can be called right
   after kthread_create_on_node() to specify a preferred affinity
   different than the specified node.
 
 When the preferred affinity can't be applied because the possible
 targets are offline or isolated (nohz_full), the kthread is affine
 to the housekeeping CPUs (which means to all online CPUs most of the
 time or only the non-nohz_full CPUs when nohz_full= is set).
 
 kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
 converted, along with a few old drivers.
 
 Summary of the changes:
 
 * Consolidate a bunch of ad-hoc implementations of kthread_run_on_cpu()
 
 * Introduce task_cpu_fallback_mask() that defines the default last
   resort affinity of a task to become nohz_full aware
 
 * Add some correctness check to ensure kthread_bind() is always called
   before the first kthread wake up.
 
 * Default affine kthread to its preferred node.
 
 * Convert kswapd / kcompactd and remove their halfway working ad-hoc
   affinity implementation
 
 * Implement kthreads preferred affinity
 
 * Unify kthread worker and kthread API's style
 
 * Convert RCU kthreads to the new API and remove the ad-hoc affinity
   implementation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmeNf8gACgkQhSRUR1CO
 jHedQQ/+IxTjjqQiItzrq41TES2S0desHDq8lNJFb7rsR/DtKFyLx3s67cOYV+cM
 Yx54QHg2m/Fz4nXMQ7Po5ygOtJGCKBc5C5QQy7y0lVKeTQK+daDfEtBSa3oG7j3C
 u+E3tTY6qxkbCzymUyaKkHN4/ay2vLvjFS50luV7KMyI3x47Aji+t7VdCX4LCPP2
 eAwOALWD0+7qLJ/VF6gsmQLKA4Qx7PQAzBa3KSBmUN9UcN8Gk1bQHCTIQKDHP9LQ
 v8BXrNZtYX1o2+snNYpX2z6/ECjxkdwriOgqqZY5306hd9RAQ1u46Dx3byrIqjGn
 ULG/XQ2istPyhTqb/h+RbrobdOcwEUIeqk8hRRbBXE8bPpqUz9EMuaCMxWDbQjgH
 NTuKG4ifKJ/IqstkkuDkdOiByE/ysMmwqrTXgSnu2ITNL9yY3BEgFbvA95hgo42s
 f7QCxEfZb1MHcNEMENSMwM3xw5lLMGMpxVZcMQ3gLwyotMBRrhFZm1qZJG7TITYW
 IDIeCbH4JOMdQwLs3CcWTXio0N5/85NhRNFV+IDn96OrgxObgnMtV8QwNgjXBAJ5
 wGeJWt8s34W1Zo3qS9gEuVzEhW4XaxISQQMkHe8faKkK6iHmIB/VjSQikDwwUNQ/
 AspYj82RyWBCDZsqhiYh71kpxjvS6Xp0bj39Ce1sNsOnuksxKkQ=
 =g8In
 -----END PGP SIGNATURE-----

Merge tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks

Pull kthread updates from Frederic Weisbecker:
 "Kthreads affinity follow either of 4 existing different patterns:

   1) Per-CPU kthreads must stay affine to a single CPU and never
      execute relevant code on any other CPU. This is currently handled
      by smpboot code which takes care of CPU-hotplug operations.
      Affinity here is a correctness constraint.

   2) Some kthreads _have_ to be affine to a specific set of CPUs and
      can't run anywhere else. The affinity is set through
      kthread_bind_mask() and the subsystem takes care by itself to
      handle CPU-hotplug operations. Affinity here is assumed to be a
      correctness constraint.

   3) Per-node kthreads _prefer_ to be affine to a specific NUMA node.
      This is not a correctness constraint but merely a preference in
      terms of memory locality. kswapd and kcompactd both fall into this
      category. The affinity is set manually like for any other task and
      CPU-hotplug is supposed to be handled by the relevant subsystem so
      that the task is properly reaffined whenever a given CPU from the
      node comes up. Also care should be taken so that the node affinity
      doesn't cross isolated (nohz_full) cpumask boundaries.

   4) Similar to the previous point except kthreads have a _preferred_
      affinity different than a node. Both RCU boost kthreads and RCU
      exp kworkers fall into this category as they refer to "RCU nodes"
      from a distinctly distributed tree.

  Currently the preferred affinity patterns (3 and 4) have at least 4
  identified users, with more or less success when it comes to handle
  CPU-hotplug operations and CPU isolation. Each of which do it in its
  own ad-hoc way.

  This is an infrastructure proposal to handle this with the following
  API changes:

   - kthread_create_on_node() automatically affines the created kthread
     to its target node unless it has been set as per-cpu or bound with
     kthread_bind[_mask]() before the first wake-up.

   - kthread_affine_preferred() is a new function that can be called
     right after kthread_create_on_node() to specify a preferred
     affinity different than the specified node.

  When the preferred affinity can't be applied because the possible
  targets are offline or isolated (nohz_full), the kthread is affine to
  the housekeeping CPUs (which means to all online CPUs most of the time
  or only the non-nohz_full CPUs when nohz_full= is set).

  kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
  converted, along with a few old drivers.

  Summary of the changes:

   - Consolidate a bunch of ad-hoc implementations of
     kthread_run_on_cpu()

   - Introduce task_cpu_fallback_mask() that defines the default last
     resort affinity of a task to become nohz_full aware

   - Add some correctness check to ensure kthread_bind() is always
     called before the first kthread wake up.

   - Default affine kthread to its preferred node.

   - Convert kswapd / kcompactd and remove their halfway working ad-hoc
     affinity implementation

   - Implement kthreads preferred affinity

   - Unify kthread worker and kthread API's style

   - Convert RCU kthreads to the new API and remove the ad-hoc affinity
     implementation"

* tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
  kthread: modify kernel-doc function name to match code
  rcu: Use kthread preferred affinity for RCU exp kworkers
  treewide: Introduce kthread_run_worker[_on_cpu]()
  kthread: Unify kthread_create_on_cpu() and kthread_create_worker_on_cpu() automatic format
  rcu: Use kthread preferred affinity for RCU boost
  kthread: Implement preferred affinity
  mm: Create/affine kswapd to its preferred node
  mm: Create/affine kcompactd to its preferred node
  kthread: Default affine kthread to its preferred NUMA node
  kthread: Make sure kthread hasn't started while binding it
  sched,arm64: Handle CPU isolation on last resort fallback rq selection
  arm64: Exclude nohz_full CPUs from 32bits el0 support
  lib: test_objpool: Use kthread_run_on_cpu()
  kallsyms: Use kthread_run_on_cpu()
  soc/qman: test: Use kthread_run_on_cpu()
  arm/bL_switcher: Use kthread_run_on_cpu()
2025-01-21 17:10:05 -08:00
Rafael J. Wysocki
a5c16f29a8 Merge branch 'pm-cpufreq'
Merge cpufreq updates for 6.14:

 - Use str_enable_disable()-like helpers in cpufreq (Krzysztof
   Kozlowski).

 - Extend the Apple cpufreq driver to support more SoCs (Hector Martin,
   Nick Chan).

 - Add new cpufreq driver for Airoha SoCs (Christian Marangi).

 - Fix using cpufreq-dt as module (Andreas Kemnade).

 - Minor fixes for Sparc, SCMI, and Qcom cpufreq drivers (Ethan Carter
   Edwards, Sibi Sankar, Manivannan Sadhasivam).

 - Fix the maximum supported frequency computation in the ACPI cpufreq
   driver to avoid relying on unfounded assumptions (Gautham Shenoy).

 - Fix an amd-pstate driver regression with preferred core rankings not
   being used (Mario Limonciello).

 - Fix a precision issue with frequency calculation in the amd-pstate
   driver (Naresh Solanki).

 - Add ftrace event to the amd-pstate driver for active mode (Mario
   Limonciello).

 - Set default EPP policy on Ryzen processors in amd-pstate (Mario
   Limonciello).

 - Clean up the amd-pstate cpufreq driver and optimize it to increase
   code reuse (Mario Limonciello, Dhananjay Ugwekar).

 - Use CPPC to get scaling factors between HWP performance levels and
   frequency in the intel_pstate driver and make it stop using a built
   -in scaling factor for the Arrow Lake processor (Rafael Wysocki).

 - Make intel_pstate initialize epp_policy to CPUFREQ_POLICY_UNKNOWN for
   consistency with CPU offline (Christian Loehle).

 - Fix superfluous updates caused by need_freq_update in the schedutil
   cpufreq governor (Sultan Alsawaf).

* pm-cpufreq: (40 commits)
  cpufreq: Use str_enable_disable()-like helpers
  cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver
  cpufreq: ACPI: Fix max-frequency computation
  cpufreq/amd-pstate: Refactor max frequency calculation
  cpufreq/amd-pstate: Fix prefcore rankings
  cpufreq: sparc: change kzalloc to kcalloc
  cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocks
  cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is not available
  cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq support
  cpufreq: apple-soc: Set fallback transition latency to APPLE_DVFS_TRANSITION_TIMEOUT
  cpufreq: apple-soc: Increase cluster switch timeout to 400us
  cpufreq: apple-soc: Use 32-bit read for status register
  cpufreq: apple-soc: Allow per-SoC configuration of APPLE_DVFS_CMD_PS1
  cpufreq: apple-soc: Drop setting the PS2 field on M2+
  dt-bindings: cpufreq: apple,cluster-cpufreq: Add A7-A11, T2 compatibles
  dt-bindings: cpufreq: Document support for Airoha EN7581 CPUFreq
  cpufreq: fix using cpufreq-dt as module
  cpufreq: scmi: Register for limit change notifications
  cpufreq: schedutil: Fix superfluous updates caused by need_freq_update
  cpufreq: intel_pstate: Use CPUFREQ_POLICY_UNKNOWN
  ...
2025-01-20 20:34:50 +01:00
Rafael J. Wysocki
1225bb42b8 Merge branches 'pm-sleep', 'pm-cpuidle' and 'pm-em'
Merge updates related to system sleep, a cpuidle update and an Energy
Model handling code update for 6.14-rc1:

 - Allow configuring the system suspend-resume (DPM) watchdog to warn
   earlier than panic (Douglas Anderson).

 - Implement devm_device_init_wakeup() helper and introduce a device-
   managed variant of dev_pm_set_wake_irq() (Joe Hattori, Peng Fan).

 - Remove direct inclusions of 'pm_wakeup.h' which should be only
   included via 'device.h' (Wolfram Sang).

 - Clean up two comments in the core system-wide PM code (Rafael
   Wysocki, Randy Dunlap).

 - Add Clearwater Forest processor support to the intel_idle cpuidle
   driver (Artem Bityutskiy).

 - Move sched domains rebuild function from the schedutil cpufreq
   governor to the Energy Model handling code (Rafael Wysocki).

* pm-sleep:
  PM: sleep: wakeirq: Introduce device-managed variant of dev_pm_set_wake_irq()
  PM: sleep: Allow configuring the DPM watchdog to warn earlier than panic
  PM: sleep: convert comment from kernel-doc to plain comment
  PM: wakeup: implement devm_device_init_wakeup() helper
  PM: sleep: sysfs: don't include 'pm_wakeup.h' directly
  PM: sleep: autosleep: don't include 'pm_wakeup.h' directly
  PM: sleep: Update stale comment in device_resume()

* pm-cpuidle:
  intel_idle: add Clearwater Forest SoC support

* pm-em:
  PM: EM: Move sched domains rebuild function from schedutil to EM
2025-01-20 19:14:15 +01:00
Rafael J. Wysocki
251be0b542 ARM cpufreq updates for 6.14
- Extended support for more SoCs in apple cpufreq driver (Hector Martin
   and Nick Chan).
 
 - Add new cpufreq driver for Airoha SoCs (Christian Marangi).
 
 - Fix using cpufreq-dt as module (Andreas Kemnade).
 
 - Minor fixes for Sparc, scmi, and Qcom drivers (Ethan Carter Edwards,
   Sibi Sankar and Manivannan Sadhasivam).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmeNxBsACgkQ0rkcPK6B
 EhyhQBAArv2A97hZz0frOz4PWNpZ7DUSsp8UUhBZ5pzM+ZK3hckgReyxm/n14qbJ
 iOvX+S8VenJrartsWDH9ucDiK8r2JvJpc+TNliPKvGkMd1qGDizC3+mgb/zdtJwz
 ZrspIdE/5HQ/yVyOylXQr8mpUe6o4Jz/n1vjPZV6CcbxCj8OUvdBriqFvUVAm+Dq
 s5EUWeziSdXgholUDlCM2E03yzkLSnf9aJJKzdRhm4raDoMm4rT8RkPXUvPrB+TY
 tv1ykTWyLnn6gzRT5o7odAi0lenjWNLJWZcixcY/CAjax6/5RjirYze2vdwZ+g8k
 EWd+y84eGSvmcBK5LVvuxN6XZja8pP5sEaZLGknmOsNkIOuNeXKPQXGKBg+SzQlp
 atWiHeoTLorkDkCW5Hu6NFnHOKJM+nqPo8vqQPpQty2fUarc3L/k6jZTsyx0+0gX
 46axncUenJasvXNQ1Oab3tW4SJ9hpv8AaHUY6TQEwIvGlkBctq/wr37wKDAN0KRy
 WhtvCuybIvlaJ2FES61nPMDITV8ZYbKRHYbPP074qqiI2W4sIAjbT+qlx3f/3XWx
 Ese7PaPiPg7ZoqUfXPdk77KKSDtbxsg8rCY3eaoKfusgMn5XkVFZgK5N19fAYPtP
 5VA2vHQxLpRV8C/yZrmyhrQWFSFdFdtFeMvQrPeBn4QGYR+Rcms=
 =J/XA
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-6.14' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.14 from Viresh Kumar:

"- Extended support for more SoCs in apple cpufreq driver (Hector Martin
   and Nick Chan).

 - Add new cpufreq driver for Airoha SoCs (Christian Marangi).

 - Fix using cpufreq-dt as module (Andreas Kemnade).

 - Minor fixes for Sparc, scmi, and Qcom drivers (Ethan Carter Edwards,
   Sibi Sankar and Manivannan Sadhasivam)."

* tag 'cpufreq-arm-updates-6.14' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver
  cpufreq: sparc: change kzalloc to kcalloc
  cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocks
  cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is not available
  cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq support
  cpufreq: apple-soc: Set fallback transition latency to APPLE_DVFS_TRANSITION_TIMEOUT
  cpufreq: apple-soc: Increase cluster switch timeout to 400us
  cpufreq: apple-soc: Use 32-bit read for status register
  cpufreq: apple-soc: Allow per-SoC configuration of APPLE_DVFS_CMD_PS1
  cpufreq: apple-soc: Drop setting the PS2 field on M2+
  dt-bindings: cpufreq: apple,cluster-cpufreq: Add A7-A11, T2 compatibles
  dt-bindings: cpufreq: Document support for Airoha EN7581 CPUFreq
  cpufreq: fix using cpufreq-dt as module
  cpufreq: scmi: Register for limit change notifications
2025-01-20 12:53:59 +01:00
Krzysztof Kozlowski
f994c1cb6c cpufreq: Use str_enable_disable()-like helpers
Replace ternary (condition ? "enable" : "disable") syntax with helpers
from string_choices.h because:

 1. Simple function call with one argument is easier to read.  Ternary
    operator has three arguments and with wrapping might lead to quite
    long code.
 2. Is slightly shorter thus also easier to read.
 3. It brings uniformity in the text - same string.
 4. Allows deduping by the linker, which results in a smaller binary
    file.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250114190600.846651-1-krzysztof.kozlowski@linaro.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-17 18:54:57 +01:00
Christian Marangi
84cf9e541c cpufreq: airoha: Add EN7581 CPUFreq SMCCC driver
Add simple CPU Freq driver for Airoha EN7581 SoC that control CPU
frequency scaling with SMC APIs and register a generic "cpufreq-dt"
device.

All CPU share the same frequency and can't be controlled independently.
CPU frequency is controlled by the attached PM domain.

Add SoC compatible to cpufreq-dt-plat block list as a dedicated cpufreq
driver is needed with OPP v2 nodes declared in DTS.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-01-17 09:05:07 +05:30
Rafael J. Wysocki
423124ab97 Merge back earlier cpufreq material for 6.14 2025-01-16 20:20:20 +01:00
Gautham R. Shenoy
0834667545 cpufreq: ACPI: Fix max-frequency computation
Commit 3c55e94c0a ("cpufreq: ACPI: Extend frequency tables to cover
boost frequencies") introduced an assumption in acpi_cpufreq_cpu_init()
that the first entry in the P-state table was the nominal frequency.
This assumption is incorrect. The frequency corresponding to the P0
P-State need not be the same as the nominal frequency advertised via
CPPC.

Since the driver is using the CPPC.highest_perf and CPPC.nominal_perf
to compute the boost-ratio, it makes sense to use CPPC.nominal_freq to
compute the max-frequency. CPPC.nominal_freq is advertised on
platforms supporting CPPC revisions 3 or higher.

Hence, fallback to using the first entry in the P-State table only on
platforms that do not advertise CPPC.nominal_freq.

Fixes: 3c55e94c0a ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies")
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250113044107.566-1-gautham.shenoy@amd.com
[ rjw: Retain reverse X-mas tree ordering of local variable declarations ]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-14 21:10:07 +01:00
Viresh Kumar
7e265fc046 cpufreq: Move endif to the end of Kconfig file
It is possible to enable few cpufreq drivers, without the framework
being enabled. This happened due to a bug while moving the entries
earlier. Fix it.

Fixes: 7ee1378736 ("cpufreq: Move CPPC configs to common Kconfig and add RISC-V")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/84ac7a8fa72a8fe20487bb0a350a758bce060965.1736488384.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-14 20:54:04 +01:00
Rafael J. Wysocki
7420a7e867 amd-pstate-6.14 content 1/7/25
Fix a regression with preferred core rankings not being used.
 Fix a precision issue with frequency calculation.
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmd9nfoaHG1hcmlvLmxp
 bW9uY2llbGxvQGFtZC5jb20ACgkQLRkspiR3AnZwjBAA4KrGmCNlV3At/v82MQXx
 Bwe+I6ngIweHDglV9UDDGgtqtQJfP3cJfXp36ppZSHn+xidMoEt3TjVpnCF3dP7X
 GYQOsMWQlPnsBQfh8TQusPoDGZK+UnS5GQ6F9jHkUxo1p4PsOcajW6fu6EZGQBc9
 dAf+roHDZqEcAWwmxAzycpYrFxNU2YWvWW6gX24yYdO+mbNQpkgw6zMmzLKUtlCm
 uWBtOXxOepee65dMqPMgsRyjWJPoGXWB4mnAo17147LW0qYQ0mPw4D6wyKJsOj5y
 UTLxz4QxHH5WbRPdMBcdX5m85CqxOqj+KeauWHsnQ4XGrA1n+D5SOt0ieAIWgSOm
 k9/KCYvw2+ZZ8dFYEvwtuImrDbBrJDQGjAQr1HAiPl5f4TE7vfbZY6SLYO8gQ0RH
 U4POkGUDWdGoKwCjyBReZ6dVIerCCH1gYmLp+HJqPK7k4BoX7bHNOCJ9sc+hCop5
 x9SZFA5Gmf8TkXRcSMctatm9CxvIawh4THkkzctQErKZs0qJ/+KdTVEwg0i44sKm
 XNH3iRjWPttFI3b+wJZMwwoBmUSTvOMycz4oJZ5jN0wYXfvdPh3FcF7XUpdn9wNq
 8qWE00zMPy9u9wCpjOiIQ1BPHBaqXxfvTDfcoi2zoxpv/f9QM6tsqtjUhZG0+n5t
 oKkupREcIpKqUIvgdCf1v8M=
 =vmI+
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v6.14-2025-01-07' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge amd-pstate changes for 6.14 from Mario Limonciello:

"Fix a regression with preferred core rankings not being used.
 Fix a precision issue with frequency calculation."

* tag 'amd-pstate-v6.14-2025-01-07' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate: Refactor max frequency calculation
  cpufreq/amd-pstate: Fix prefcore rankings
2025-01-10 14:17:22 +01:00
Frederic Weisbecker
b04e317b52 treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-01-08 18:15:03 +01:00
Naresh Solanki
857a61c2ce cpufreq/amd-pstate: Refactor max frequency calculation
The previous approach introduced roundoff errors during division when
calculating the boost ratio. This, in turn, affected the maximum
frequency calculation, often resulting in reporting lower frequency
values.

For example, on the Glinda SoC based board with the following
parameters:

max_perf = 208
nominal_perf = 100
nominal_freq = 2600 MHz

The Linux kernel previously calculated the frequency as:
freq = ((max_perf * 1024 / nominal_perf) * nominal_freq) / 1024
freq = 5405 MHz  // Integer arithmetic.

With the updated formula:
freq = (max_perf * nominal_freq) / nominal_perf
freq = 5408 MHz

This change ensures more accurate frequency calculations by eliminating
unnecessary shifts and divisions, thereby improving precision.

Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com>
[ML: trim the changelog from commit message]
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241219201833.2750998-1-naresh.solanki@9elements.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 23:44:07 -06:00
Mario Limonciello
fd604ae6c2 cpufreq/amd-pstate: Fix prefcore rankings
commit 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as
highest perf again") updated the value stored for highest perf to no longer
store the highest perf value but instead the boost numerator.

This is a fixed value for systems with preferred cores and not appropriate
for use ITMT rankings. Update the value used for ITMT rankings to be the
preferred core ranking.

Reported-and-tested-by: Sebastian <sobrus@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219640
Fixes: 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as highest perf again")
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20250102141204.3413202-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 10:27:32 -06:00
Ethan Carter Edwards
af6cc45af3 cpufreq: sparc: change kzalloc to kcalloc
Refactor to use kcalloc instead of kzalloc when multiplying
allocation size by count. This refactor prevents unintentional
memory overflows. Discovered by checkpatch.pl.

Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-24 09:48:58 +05:30
Manivannan Sadhasivam
a9ba290d0b cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocks
determine_rate() callback is used by the clk_set_rate() API to get the
closest rate of the target rate supported by the clock. If this callback
is not implemented (nor round_rate() callback), then the API will assume
that the clock cannot set the requested rate. And since there is no parent,
it will return -EINVAL.

This is not an issue right now as clk_set_rate() mistakenly compares the
target rate with cached rate and bails out early. But once that is fixed
to compare the target rate with the actual rate of the clock (returned by
recalc_rate()), then clk_set_rate() for this clock will start to fail as
below:

cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22

So implement the determine_rate() callback that just returns the actual
rate at which the clock is passed to the CPUs in a domain.

Fixes: 4370232c72 ("cpufreq: qcom-hw: Add CPU clock provider support")
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Suggested-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:49 +05:30
Manivannan Sadhasivam
85d8b11351 cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is not available
Currently, qcom_cpufreq_hw_recalc_rate() returns the LMh throttled
frequency for the domain even if LMh IRQ is not available. But as per
qcom_cpufreq_hw_get(), the driver has to query LUT entries to get the
actual frequency of the domain. So do the same in
qcom_cpufreq_hw_recalc_rate().

While doing so, refactor the existing qcom_cpufreq_hw_get() function so
that qcom_cpufreq_hw_recalc_rate() can make use of the existing code and
avoid code duplication. This also requires setting the
qcom_cpufreq_data::policy even if LMh IRQ is not available.

Fixes: 4370232c72 ("cpufreq: qcom-hw: Add CPU clock provider support")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:49 +05:30
Nick Chan
1a4ddf6ab9 cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq support
These SoCs only use 3 bits for p-states, and have a different
APPLE_DVFS_CMD_PS1 mask value.

Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:48 +05:30
Nick Chan
13b147b2a9 cpufreq: apple-soc: Set fallback transition latency to APPLE_DVFS_TRANSITION_TIMEOUT
The driver already assumes transitions will not take longer than
APPLE_DVFS_TRANSITION_TIMEOUT in apple_soc_cpufreq_set_target(), so it
makes little sense to set CPUFREQ_ETERNAL as the transition latency
when the transistion latency is not given by the opp-table.

Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:47 +05:30
Nick Chan
0dc21f6091 cpufreq: apple-soc: Increase cluster switch timeout to 400us
Apple A11 SoC takes a long time to switch. Maximum switch time
observed is 345us, so increase the cluster switch timeout to 400us
to be safe.

Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:47 +05:30
Nick Chan
55aac9f570 cpufreq: apple-soc: Use 32-bit read for status register
Apple A7-A9(X) SoCs requires 32-bit reads on the status register. Newer
SoCs accepts 32-bit reads on the status register as well.

Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:46 +05:30
Nick Chan
0755a9376e cpufreq: apple-soc: Allow per-SoC configuration of APPLE_DVFS_CMD_PS1
Support for SoC that has a different APPLE_DVFS_CMD_PS1 will be added soon,
so modify the driver first to allow it to be configured per-SoC.

Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:46 +05:30
Hector Martin
4a06c250ab cpufreq: apple-soc: Drop setting the PS2 field on M2+
Newer device do not use this. It is not known what this field does,
but change the behavior to be same as macOS to be safe.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Nick Chan <towinchenmi@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:45 +05:30
Andreas Kemnade
f1f010c9d9 cpufreq: fix using cpufreq-dt as module
This driver can be built as a module since commit 3b062a0869 ("cpufreq:
dt-platdev: Support building as module"), but unfortunately this caused
a regression because the cputfreq-dt-platdev.ko module does not autoload.

Usually, this is solved by just using the MODULE_DEVICE_TABLE() macro to
export all the device IDs as module aliases. But this driver is special
due how matches with devices and decides what platform supports.

There are two of_device_id lists, an allow list that are for CPU devices
that always match and a deny list that's for devices that must not match.

The driver registers a cpufreq-dt platform device for all the CPU device
nodes that either are in the allow list or contain an operating-points-v2
property and are not in the deny list.

Enforce builtin compile of cpufreq-dt-platdev to make autoload work.

Fixes: 3b062a0869 ("cpufreq: dt-platdev: Support building as module")
Link: https://lore.kernel.org/all/20241104201424.2a42efdd@akair/
Link: https://lore.kernel.org/all/20241119111918.1732531-1-javierm@redhat.com/
Cc: stable@vger.kernel.org
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Reported-by: Radu Rendec <rrendec@redhat.com>
Reported-by: Javier Martinez Canillas <javierm@redhat.com>
[ Viresh: Picked commit log from Javier, updated tags ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:43 +05:30
Sibi Sankar
34059ed0f3 cpufreq: scmi: Register for limit change notifications
Register for limit change notifications if supported and use the throttled
frequency from the notification to apply HW pressure.

Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Tested-by: Mike Tipton <quic_mdtipton@quicinc.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23 16:26:42 +05:30
Rafael J. Wysocki
2dfed74038 amd-pstate changes for 6.14
Mostly cleanups and optimizations to increase code reuse by
 shuffling around and using helpers.
 
 Notable other changes:
  * Add ftrace event for active mode to use
  * Set default EPP policy on Ryzen
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmdjJMEaHG1hcmlvLmxp
 bW9uY2llbGxvQGFtZC5jb20ACgkQLRkspiR3AnYr0g/+KbniaTKVcEVa1IVz55+i
 X5O72FSJjrKP2plMtK52wVZS7hVSawA/hfKeOqX1+40Ynss1LI/s7NzeUElLbQiZ
 QapgY6LyQCxz7lso0F85gbjlU3DTvA5twwUEjk0rUWSwqfWOODkN1dF6NlskcIMQ
 9tF+5SDORnQp4eGJbUhFasfxpCLexQP4EprWhq62lgYJcvZePQnm2ccMK/pJWBw0
 5OgIX9T0dzpC/lCiHeyVAInB/Zri51o3ZdLwJAUwEOjILrJ6de4b7HeoPxAGV8Y7
 21rU5QhWZgowmmGX8tbz3VM8sl5eaPXqQYS1n+b8+sIzzBgoE4rO5Ns6pYAzMHIN
 7WehN724gD+RVPs7072OuhzENAKmWbpZEeu6YDaXTe5JA4XJQ/uiez7irFWtcr6k
 mxUFhBoiQxEdq1Rjvt7HfWpPSUZrH3syh1OsxAecm5QNGUU+9W/L7Fbyl2OHFL5V
 stTufe3i2igd4cczjrczgKheKsCVU6cfulysFnN4u0TiRnn8PHRKxT5K5KZX4O67
 U7F+S8nAG2bNgBmVuwHterrypBmsAa9NAvD+3M/IWBGlgxJTDcAizwTugMZT1Nqb
 lmupOJnAL7eRmJS0Bdd9Pcd3C62c1ayTVITN3hCKLMTRJPxEgChAPs2DgUOt72cR
 ZQT3nbM2FAg0LFKl+kewnl0=
 =44/p
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v6.14-2024-12-18' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge amd-pstate changes for 6.14 from Mario Limonciello:

"Mostly cleanups and optimizations to increase code reuse by
 shuffling around and using helpers.

 Notable other changes:
  * Add ftrace event for active mode to use
  * Set default EPP policy on Ryzen"

* tag 'amd-pstate-v6.14-2024-12-18' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (21 commits)
  cpufreq/amd-pstate: Drop boost_state variable
  cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen
  cpufreq/amd-pstate: Drop ret variable from amd_pstate_set_energy_pref_index()
  cpufreq/amd-pstate: Always write EPP value when updating perf
  cpufreq/amd-pstate: Cache EPP value and use that everywhere
  cpufreq/amd-pstate: Move limit updating code
  cpufreq/amd-pstate: Change amd_pstate_update_perf() to return an int
  cpufreq/amd-pstate: store all values in cpudata struct in khz
  cpufreq/amd-pstate: Only update the cached value in msr_set_epp() on success
  cpufreq/amd-pstate: Use FIELD_PREP and FIELD_GET macros
  cpufreq/amd-pstate: Drop cached epp_policy variable
  cpufreq/amd-pstate: convert mutex use to guard()
  cpufreq/amd-pstate: Add trace event for EPP perf updates
  cpufreq/amd-pstate: Merge amd_pstate_epp_cpu_offline() and amd_pstate_epp_offline()
  cpufreq/amd-pstate: Remove the cppc_state check in offline/online functions
  cpufreq/amd-pstate: Refactor amd_pstate_epp_reenable() and amd_pstate_epp_offline()
  cpufreq/amd-pstate: Move the invocation of amd_pstate_update_perf()
  cpufreq/amd-pstate: Convert the amd_pstate_get/set_epp() to static calls
  cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies
  cpufreq/amd-pstate: Store the boost numerator as highest perf again
  ...
2024-12-19 20:49:50 +01:00
Rafael J. Wysocki
ebeeee390b PM: EM: Move sched domains rebuild function from schedutil to EM
Function sugov_eas_rebuild_sd() defined in the schedutil cpufreq governor
implements generic functionality that may be useful in other places.  In
particular, there is a plan to use it in the intel_pstate driver in the
future.

For this reason, move it from schedutil to the energy model code and
rename it to em_rebuild_sched_domains().

This also helps to get rid of some #ifdeffery in schedutil which is a
plus.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
2024-12-18 20:32:13 +01:00