Now that the driver core can properly handle constant struct bus_type,
change the local driver core bus_type variables to be a constant
structure as well, placing them into read-only memory which can not be
modified at runtime.
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Len Brown <len.brown@intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Acked-by: Dave Ertman <david.m.ertman@intel.com>
Link: https://lore.kernel.org/r/2023121908-paver-follow-cc21@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the driver core can properly handle constant struct bus_type,
move the container_subsys variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023121919-chatter-grumbling-9ef3@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The functions subsys_register() and subsys_virtual_register() should be
taking a constant pointer to a struct bus_type, as they do not actually
modify anything in it, so fix up the function definitions to do so
properly.
This also changes the pointer type in struct subsys_interface to be
constant as well, as again, that's the proper signature of it.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/2023121908-grove-genetics-f8af@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some reason, during the big "clean up the driver core for a const
struct bus_type" work, the bus_sort_breadthfirst() call was missed. Fix
this up by changing the type to be a const * as it should be.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023121935-stinking-ditzy-fd5d@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When compiling with gcc version 14.0.0 20231220 (experimental)
and W=1, I've noticed a bunch of four similar warnings like:
drivers/base/regmap/regmap-ram.c: In function '__regmap_init_ram':
drivers/base/regmap/regmap-ram.c:68:37: warning: 'kcalloc' sizes specified with
'sizeof' in the earlier argument and not in the later argument [-Wcalloc-transposed-args]
68 | data->read = kcalloc(sizeof(bool), config->max_register + 1,
| ^~~~
Since 'n' and 'size' arguments of 'kcalloc()' are multiplied to
calculate the final size, their actual order doesn't affect the
result and so this is not a bug. But it's still worth to fix it.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://msgid.link/r/20231220175829.533700-1-dmantipov@yandex.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
New device support
------------------
adi,hmc425a
* Add support for ADRF5740 attenuators. Minor changes to driver needed
alongside new IDs.
aosong,ags02ma
* New driver for this volatile organic compounds sensor.
bosch,bmp280
* Add BMP390 (small amount of refactoring + ID)
bosch,bmi323
* New driver to support the BMI323 6-axis IMU.
honeywell,hsc030pa
* New driver supporting a huge number of SSC and HSC series pressure and
temperature sensors.
isil,isl76682
* New driver for this simple Ambient Light sensor.
liteon,ltr390
* New driver for this ambient and ultraviolet light sensor.
maxim,max34408
* New driver to support the MAX34408 and MAX34409 current monitoring ADCs.
melexis,mlx90635
* New driver for this Infrared contactless temperature sensor.
mirochip,mcp9600
* New driver for this thermocouple EMF convertor.
ti,hdc3020
* New driver for this integrated relative humidity and temperature
sensor.
vishay,veml6075
* New driver for this UVA and UVB light sensor.
General features
----------------
Device properties
* Add fwnode_property_match_property_string() helper to allow matching
single value property against an array of predefined strings.
* Use fwnode_property_string_array_count() inside
fwnode_property_match_string() instead of open coding the same.
checkpatch.pl
* Add exclusion of __aligned() from a warning reducing false positives
on IIO drivers (and hopefully beyond)
IIO Features
------------
core
* New light channel modifiers for UVA and UVB.
* Add IIO_CHAN_INFO_TROUGH as counterpart to IIO_CHAN_INFO_PEAK so that
we can support device that keep running track of the lowest value they
have seen in similar fashion to the existing peak tracking.
adi,adis library
* Use spi cs inactive delay even when a burst reading is performed.
As it's now used every time, can centralize the handling in the SPI
setup code in the driver.
adi,ad2s1210
* Support for fixed-mode to this resolver driver where the A0 and A1
pins are hard wired to config mode in which case position and config
must be read from appropriate config registers.
* Support reset GPIO if present.
adi,ad5791
* Allow configuration of presence of external amplifier in DT binding.
adi,adis16400
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
adi,adis16475
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
bosch,bmp280
* Enable multiple chip IDs per family of devices.
rohm,bu27008
* Add an illuminance channel calculated from RGB and IR data.
Cleanup
-------
Minor white space, typos and tidy up not explicitly called out.
Core
* Check that the available_scan_masks array passed to the IIO core
by a driver is sensible by ensuring the entries are ordered so the
minimum number of channels is enabled in the earlier entries (as they
will be selected if sufficient for the requested channels).
* Document that the available_scan_masks infrastructure doesn't currently
handle masks that don't fit in a long int.
* Improve intensity documentation to reflect that there is no expectation
of sensible units (it's dependent on a frequency sensitivity curve)
Various
* Use new device_property_match_property_string() to replace open coded
versions of the same thing.
* Fix a few MAINTAINERS filenames.
* i2c_get_match_data() and spi_get_device_match_data() pushed into
more drivers reducing boilerplate handling.
* Some unnecessary headers removed.
* ACPI_PTR() removals. It's rarely worth using this.
adi,ad7091r (early part of a series adding device support - useful in
their own right)
* Pass iio_dev directly an event handler rather than relying
on broken use of dev_get_drvdata() as drvdata is never set in this driver.
* Make sure alert is turned on.
adi,ad9467 (general driver fixing up as precursor to iio-backend proposal
which is under review for 6.9)
* Fix reset gpio handling to match expected polarity.
* Always handle error codes from spi_writes.
* Add a driver instance local mutex to avoid some races.
* Fix scale setting to align with available scale values.
* Split array of chip_info structures up into named individual elements.
* Convert to regmap.
honeywell,mprls0025pa
* Drop now unnecessary type references in DT binding for properties in
pascals.
invensense,mpu6050
* Don't eat a potentially useful return value from regmap_bulk_write()
invensense,icm42600
* Use max macro to improve code readability and save a few lines.
liteon,ltrf216a
* Improve prevision of light intensity.
microchip,mcp3911
* Use cleanup.h magic.
qcom,spmi*
* Fix wrong descriptions of SPMI reg fields in bindings.
Other
----
mailmap
* Update for Matt Ranostay
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmWDAIIRHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0Foi37g//f9PEMOdLToWM6GoNidJmX+V8QTEKkDF5
omUTSjMTAa7aXMt5dgoevrRrg4tIZs5TfheEfHg2io+Dg3aGQxxPpgyZnVQSC4MN
bJwd+dVLjvi5sr7dEnwtJRfYCnt9LLI900xCg6b6YUFgJxY02HH6REPZVqU7M7Jl
ANTAycSrYdLJlMReHsuDZT/kMJaWo4M0azSBxWtICTtLmlfii8dWTfSex4xzk1Pf
nu13+ivz3FHpQOtLAHdtHf5CCDwcc+9DQw5O8Ihr8RQY9JEnwolaNgc0Fsmotfww
sJ8bhFy8Zyc7bYwskGl1vEX/lJdk+drDt1LmbxxaQPKISpUx3A4ITVvwiG60KGI3
gBwJmirxG9CFCa3MY4V608dvOkc58IyOX/GFq9l+5MlZvMdRqeRCFPOs8UT0G3zi
Kd422wjVpAwjV70wOb9NjnkVxQD69FaMiSA1QBCp8PYams/okwqT3ZiBqdqahwjb
77vx3gKL+Pa8hJJynA9eP8uRl6KpTxmallSuHsK4c1qdj+NaJfHPbSu0pmPALbE0
N5fcv+vA0WS/J2fWVoJuqPQ/+yYk6WddjpcOn0imlW6Uo4CP6CQISnbAuNWO6TTe
c8YAQiyoHMOk8slyU+++sPH6qS7hcyoPdBVJgCPwEN12PaZxnM7WaTN2WJX+TFXP
KARwsZQ+rUE=
=jH+G
-----END PGP SIGNATURE-----
Merge tag 'iio-for-6.8a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
1st set of IIO new device support, features and cleanup for 6.8
New device support
------------------
adi,hmc425a
* Add support for ADRF5740 attenuators. Minor changes to driver needed
alongside new IDs.
aosong,ags02ma
* New driver for this volatile organic compounds sensor.
bosch,bmp280
* Add BMP390 (small amount of refactoring + ID)
bosch,bmi323
* New driver to support the BMI323 6-axis IMU.
honeywell,hsc030pa
* New driver supporting a huge number of SSC and HSC series pressure and
temperature sensors.
isil,isl76682
* New driver for this simple Ambient Light sensor.
liteon,ltr390
* New driver for this ambient and ultraviolet light sensor.
maxim,max34408
* New driver to support the MAX34408 and MAX34409 current monitoring ADCs.
melexis,mlx90635
* New driver for this Infrared contactless temperature sensor.
mirochip,mcp9600
* New driver for this thermocouple EMF convertor.
ti,hdc3020
* New driver for this integrated relative humidity and temperature
sensor.
vishay,veml6075
* New driver for this UVA and UVB light sensor.
General features
----------------
Device properties
* Add fwnode_property_match_property_string() helper to allow matching
single value property against an array of predefined strings.
* Use fwnode_property_string_array_count() inside
fwnode_property_match_string() instead of open coding the same.
checkpatch.pl
* Add exclusion of __aligned() from a warning reducing false positives
on IIO drivers (and hopefully beyond)
IIO Features
------------
core
* New light channel modifiers for UVA and UVB.
* Add IIO_CHAN_INFO_TROUGH as counterpart to IIO_CHAN_INFO_PEAK so that
we can support device that keep running track of the lowest value they
have seen in similar fashion to the existing peak tracking.
adi,adis library
* Use spi cs inactive delay even when a burst reading is performed.
As it's now used every time, can centralize the handling in the SPI
setup code in the driver.
adi,ad2s1210
* Support for fixed-mode to this resolver driver where the A0 and A1
pins are hard wired to config mode in which case position and config
must be read from appropriate config registers.
* Support reset GPIO if present.
adi,ad5791
* Allow configuration of presence of external amplifier in DT binding.
adi,adis16400
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
adi,adis16475
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
bosch,bmp280
* Enable multiple chip IDs per family of devices.
rohm,bu27008
* Add an illuminance channel calculated from RGB and IR data.
Cleanup
-------
Minor white space, typos and tidy up not explicitly called out.
Core
* Check that the available_scan_masks array passed to the IIO core
by a driver is sensible by ensuring the entries are ordered so the
minimum number of channels is enabled in the earlier entries (as they
will be selected if sufficient for the requested channels).
* Document that the available_scan_masks infrastructure doesn't currently
handle masks that don't fit in a long int.
* Improve intensity documentation to reflect that there is no expectation
of sensible units (it's dependent on a frequency sensitivity curve)
Various
* Use new device_property_match_property_string() to replace open coded
versions of the same thing.
* Fix a few MAINTAINERS filenames.
* i2c_get_match_data() and spi_get_device_match_data() pushed into
more drivers reducing boilerplate handling.
* Some unnecessary headers removed.
* ACPI_PTR() removals. It's rarely worth using this.
adi,ad7091r (early part of a series adding device support - useful in
their own right)
* Pass iio_dev directly an event handler rather than relying
on broken use of dev_get_drvdata() as drvdata is never set in this driver.
* Make sure alert is turned on.
adi,ad9467 (general driver fixing up as precursor to iio-backend proposal
which is under review for 6.9)
* Fix reset gpio handling to match expected polarity.
* Always handle error codes from spi_writes.
* Add a driver instance local mutex to avoid some races.
* Fix scale setting to align with available scale values.
* Split array of chip_info structures up into named individual elements.
* Convert to regmap.
honeywell,mprls0025pa
* Drop now unnecessary type references in DT binding for properties in
pascals.
invensense,mpu6050
* Don't eat a potentially useful return value from regmap_bulk_write()
invensense,icm42600
* Use max macro to improve code readability and save a few lines.
liteon,ltrf216a
* Improve prevision of light intensity.
microchip,mcp3911
* Use cleanup.h magic.
qcom,spmi*
* Fix wrong descriptions of SPMI reg fields in bindings.
Other
----
mailmap
* Update for Matt Ranostay
* tag 'iio-for-6.8a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (83 commits)
iio: adc: ad7091r: Align arguments to function call parenthesis
iio: adc: ad7091r: Set alert bit in config register
iio: adc: ad7091r: Pass iio_dev to event handler
scripts: checkpatch: Add __aligned to the list of attribute notes
iio: chemical: add support for Aosong AGS02MA
dt-bindings: iio: chemical: add aosong,ags02ma
dt-bindings: vendor-prefixes: add aosong
iio: accel: bmi088: update comments and Kconfig
dt-bindings: iio: humidity: Add TI HDC302x support
iio: humidity: Add driver for ti HDC302x humidity sensors
iio: ABI: document temperature and humidity peak/trough raw attributes
iio: core: introduce trough info element for minimum values
iio: light: driver for Lite-On ltr390
dt-bindings: iio: light: add ltr390
iio: light: isl76682: remove unreachable code
iio: pressure: driver for Honeywell HSC/SSC series
dt-bindings: iio: pressure: add honeywell,hsc030
doc: iio: Document intensity scale as poorly defined
dt-bindings: iio: temperature: add MLX90635 device
iio: temperature: mlx90635 MLX90635 IR Temperature sensor
...
It seems reasonable to collect the core parts for the generic PM domain,
along with its corresponding provider drivers. Therefore let's move the
files from drivers/base/power/ to drivers/pmdomain/ and while at it, let's
also rename the files accordingly.
Moreover, let's also update MAINTAINERS to reflect the update.
Cc: Kevin Hilman <khilman@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20231213113305.29098-1-ulf.hansson@linaro.org
Specs don't say anything about UIP being cleared within 10ms. They
only say that UIP won't occur for another 244uS. If a long NMI occurs
while UIP is still updating it might not be possible to get valid
data in 10ms.
This has been observed in the wild that around s2idle some calls can
take up to 480ms before UIP is clear.
Adjust callers from outside an interrupt context to wait for up to a
1s instead of 10ms.
Cc: <stable@vger.kernel.org> # 6.1.y
Fixes: ec5895c0f2 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
Reported-by: Carsten Hatger <xmb8dsv4@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217626
Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20231128053653.101798-5-mario.limonciello@amd.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The UIP timeout is hardcoded to 10ms for all RTC reads, but in some
contexts this might not be enough time. Add a timeout parameter to
mc146818_get_time() and mc146818_get_time_callback().
If UIP timeout is configured by caller to be >=100 ms and a call
takes this long, log a warning.
Make all callers use 10ms to ensure no functional changes.
Cc: <stable@vger.kernel.org> # 6.1.y
Fixes: ec5895c0f2 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Acked-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Link: https://lore.kernel.org/r/20231128053653.101798-4-mario.limonciello@amd.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Describing the usage of dev_err_probe() as being (only?) "deemed
acceptable" has a bad connotation. In fact dev_err_probe() fulfills
three tasks:
- handling of EPROBE_DEFER (even more than degrading to dev_dbg())
- symbolic output of the error code
- return err for compact error code paths
Advertise these better and claim the usage to be "fine" to get rid of
the bad connotation.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231215174540.2438601-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 7c41cdcd3b ("OPP: Simplify the over-designed pstate <->
level dance"), there is no longer any users of the
pm_genpd_opp_to_performance_state() API. Let's therefore drop it and its
corresponding ->opp_to_performance_state() callback, which also no longer
has any users.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20231127151931.47055-1-ulf.hansson@linaro.org
Fix kernel-doc warnings found when using "W=1".
domain_governor.c:54: warning: No description found for return value of 'default_suspend_ok'
domain_governor.c:266: warning: No description found for return value of '_default_power_down_ok'
domain_governor.c:412: warning: cannot understand function prototype: 'struct dev_power_governor pm_domain_always_on_gov = '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20231205225856.32739-1-rdunlap@infradead.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Here are some small fixes for 6.7-rc5 for a variety of small driver
subsystems. Included in here are:
- debugfs revert for reported issue
- greybus revert for reported issue
- greybus fixup for endian build warning
- coresight driver fixes
- nvmem driver fixes
- devcoredump fix
- parport new device id
- ndtest build fix
All of these have ben in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZXRxAQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykphwCfaF0Dh6oajneYbo/pq70+an876uYAnjwALPfr
g2EezrYYUAkkPACOd27t
=q7gC
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are some small fixes for 6.7-rc5 for a variety of small driver
subsystems. Included in here are:
- debugfs revert for reported issue
- greybus revert for reported issue
- greybus fixup for endian build warning
- coresight driver fixes
- nvmem driver fixes
- devcoredump fix
- parport new device id
- ndtest build fix
All of these have ben in linux-next with no reported issues"
* tag 'char-misc-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nvmem: Do not expect fixed layouts to grab a layout driver
parport: Add support for Brainboxes IX/UC/PX parallel cards
Revert "greybus: gb-beagleplay: Ensure le for values in transport"
greybus: gb-beagleplay: Ensure le for values in transport
greybus: BeaglePlay driver needs CRC_CCITT
Revert "debugfs: annotate debugfs handlers vs. removal with lockdep"
devcoredump: Send uevent once devcd is ready
ndtest: fix typo class_regster -> class_register
misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
misc: mei: client.c: return negative error code in mei_cl_write
mei: pxp: fix mei_pxp_send_message return value
coresight: ultrasoc-smb: Fix uninitialized before use buf_hw_base
coresight: ultrasoc-smb: Config SMB buffer before register sink
coresight: ultrasoc-smb: Fix sleep while close preempt in enable_smb
Documentation: coresight: fix `make refcheckdocs` warning
hwtracing: hisi_ptt: Don't try to attach a task
hwtracing: hisi_ptt: Handle the interrupt in hardirq context
hwtracing: hisi_ptt: Add dummy callback pmu::read()
coresight: Fix crash when Perf and sysfs modes are used concurrently
coresight: etm4x: Remove bogous __exit annotation for some functions
The remainder address post-6.6 issues or aren't considered serious enough
to justify backporting.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZXKEfwAKCRDdBJ7gKXxA
jlRpAQCiAp1nSqIz/fOKTzoQRaTDXU/m+C+6ZAXdKLDfvQBhpwEAnxxjZ8IgF+8Z
Klz/GirHX5w5o7jE2wb8iObo1nR75Qo=
=omRq
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"31 hotfixes. Ten of these address pre-6.6 issues and are marked
cc:stable. The remainder address post-6.6 issues or aren't considered
serious enough to justify backporting"
* tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
mm/madvise: add cond_resched() in madvise_cold_or_pageout_pte_range()
nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
mm/hugetlb: have CONFIG_HUGETLB_PAGE select CONFIG_XARRAY_MULTI
scripts/gdb: fix lx-device-list-bus and lx-device-list-class
MAINTAINERS: drop Antti Palosaari
highmem: fix a memory copy problem in memcpy_from_folio
nilfs2: fix missing error check for sb_set_blocksize call
kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP
units: add missing header
drivers/base/cpu: crash data showing should depends on KEXEC_CORE
mm/damon/sysfs-schemes: add timeout for update_schemes_tried_regions
scripts/gdb/tasks: fix lx-ps command error
mm/Kconfig: make userfaultfd a menuconfig
selftests/mm: prevent duplicate runs caused by TEST_GEN_PROGS
mm/damon/core: copy nr_accesses when splitting region
lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
checkstack: fix printed address
mm/memory_hotplug: fix error handling in add_memory_resource()
mm/memory_hotplug: add missing mem_hotplug_lock
.mailmap: add a new address mapping for Chester Lin
...
Ending a boot log with
platform 3f202000.mmc: deferred probe pending
is already a nice hint about the problem. Sometimes there is a more
detailed error indicator available, add that to the output.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231122093332.274145-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink=on has stabilized and is working correctly. Let's start using
device links created by fw_devlink to also enforce runtime PM ordering.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20231113220948.80089-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All three fwnode_property_get_reference_args() implemantations now allow
args argument to be NULL. Document this.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20231109101010.1329587-4-sakari.ailus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fwnode_get_property_reference_args() may not be called with args argument
NULL and while OF already supports this. Add the missing NULL check.
The purpose is to be able to count the references.
Fixes: b06184acf7 ("software node: Add software_node_get_reference_args()")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20231109101010.1329587-3-sakari.ailus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code registers the node as available in the node array
before initializing the accessor list. This makes it so that
anything which might access the accessor list as a result of
allocations will cause an undefined memory access.
In one example, an extension to access hmat data during interleave
caused this undefined access as a result of a bulk allocation
that occurs during node initialization but before the accessor
list is initialized.
Initialize the accessor list before making the node generally
available to the global system.
Fixes: 08d9dbe72b ("node: Link memory nodes to their compute nodes")
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/20231030044239.971756-1-gregory.price@memverge.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit 88a6f89944 ("crash: memory and CPU hotplug sysfs
attributes"), on x86_64, if only below kernel configs related to kdump are
set, compiling error are triggered.
----
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_HOTPLUG=y
------
------------------------------------------------------
drivers/base/cpu.c: In function `crash_hotplug_show':
drivers/base/cpu.c:309:40: error: implicit declaration of function `crash_hotplug_cpu_support'; did you mean `crash_hotplug_show'? [-Werror=implicit-function-declaration]
309 | return sysfs_emit(buf, "%d\n", crash_hotplug_cpu_support());
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| crash_hotplug_show
cc1: some warnings being treated as errors
------------------------------------------------------
CONFIG_KEXEC is used to enable kexec_load interface, the
crash_notes/crash_notes_size/crash_hotplug showing depends on
CONFIG_KEXEC is incorrect. It should depend on KEXEC_CORE instead.
Fix it now.
Link: https://lkml.kernel.org/r/20231128055248.659808-1-bhe@redhat.com
Fixes: 88a6f89944 ("crash: memory and CPU hotplug sysfs attributes")
Signed-off-by: Baoquan He <bhe@redhat.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com> [compile-time only]
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Eric DeVolder <eric_devolder@yahoo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
From Documentation/core-api/memory-hotplug.rst:
When adding/removing/onlining/offlining memory or adding/removing
heterogeneous/device memory, we should always hold the mem_hotplug_lock
in write mode to serialise memory hotplug (e.g. access to global/zone
variables).
mhp_(de)init_memmap_on_memory() functions can change zone stats and
struct page content, but they are currently called w/o the
mem_hotplug_lock.
When memory block is being offlined and when kmemleak goes through each
populated zone, the following theoretical race conditions could occur:
CPU 0: | CPU 1:
memory_offline() |
-> offline_pages() |
-> mem_hotplug_begin() |
... |
-> mem_hotplug_done() |
| kmemleak_scan()
| -> get_online_mems()
| ...
-> mhp_deinit_memmap_on_memory() |
[not protected by mem_hotplug_begin/done()]|
Marks memory section as offline, | Retrieves zone_start_pfn
poisons vmemmap struct pages and updates | and struct page members.
the zone related data |
| ...
| -> put_online_mems()
Fix this by ensuring mem_hotplug_lock is taken before performing
mhp_init_memmap_on_memory(). Also ensure that
mhp_deinit_memmap_on_memory() holds the lock.
online/offline_pages() are currently only called from
memory_block_online/offline(), so it is safe to move the locking there.
Link: https://lkml.kernel.org/r/20231120145354.308999-2-sumanthk@linux.ibm.com
Fixes: a08a2ae346 ("mm,memory_hotplug: allocate memmap from the added memory range")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: kernel test robot <lkp@intel.com>
Cc: <stable@vger.kernel.org> [5.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
loongarch, mips, parisc, riscv and sh all print a warning if
register_cpu() returns an error. Architectures that use
GENERIC_CPU_DEVICES call panic() instead.
Errors in this path indicate something is wrong with the firmware
description of the platform, but the kernel is able to keep running.
Downgrade this to a warning to make it easier to debug this issue.
This will allow architectures that switching over to GENERIC_CPU_DEVICES
to drop their warning, but keep the existing behaviour.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3W-00CszU-GM@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NUMA systems require the node descriptions to be ready before CPUs are
registered. This is so that the node symlinks can be created in sysfs.
Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs
are registered by arch code, instead of cpu_dev_init().
Move cpu_dev_init() after node_dev_init() so that NUMA architectures
can use GENERIC_CPU_DEVICES.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3R-00CszO-C0@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The differences between architecture specific implementations of
arch_register_cpu() are down to whether the CPU is hotpluggable or not.
Rather than overriding the weak version of arch_register_cpu(), provide
a function that can be used to provide this detail instead.
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3M-00CszH-6r@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add arch_unregister_cpu() to allow the ACPI machinery to call
unregister_cpu(). This is enough for arm64, riscv and loongarch, but
needs to be overridden by x86 and ia64 who need to do more work.
CC: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3H-00CszC-2n@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Architectures often have extra per-cpu work that needs doing
before a CPU is registered, often to determine if a CPU is
hotpluggable.
To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move
the cpu_register() call into arch_register_cpu(), which is made __weak
so architectures with extra work can override it.
This aligns with the way x86, ia64 and loongarch register hotplug CPUs
when they become present.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3B-00Csz6-Uh@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Three of the five ACPI architectures create sysfs entries using
register_cpu() for present CPUs, whereas arm64, riscv and all
GENERIC_CPU_DEVICES do this for possible CPUs.
Registering a CPU is what causes them to show up in sysfs.
It makes very little sense to register all possible CPUs. Registering
a CPU is what triggers the udev notifications allowing user-space to
react to newly added CPUs.
To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change
it to use for_each_present_cpu().
Making the ACPI architectures use GENERIC_CPU_DEVICES is a pre-requisite
step to centralise their register_cpu() logic, before moving it into the
ACPI processor driver. When we add support for register CPUs from ACPI
in a later patch, we will avoid registering CPUs in this path.
Of the ACPI architectures that register possible CPUs, arm64 and riscv
do not support making possible CPUs present as they use the weak 'always
fails' version of arch_register_cpu().
Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
distinction between present and possible CPUs.
The following architectures use GENERIC_CPU_DEVICES but are not SMP,
so possible == present:
* m68k
* microblaze
* nios2
The following architectures use GENERIC_CPU_DEVICES and consider
possible == present:
* csky: setup_smp()
* processor_probe() sets possible for all CPUs and present for all CPUs
except the boot cpu, which will have been done by
init/main.c::start_kernel().
um appears to be a subarchitecture of x86.
The remaining architecture using GENERIC_CPU_DEVICES are:
* openrisc and hexagon:
where smp_init_cpus() makes all CPUs < NR_CPUS possible,
whereas smp_prepare_cpus() only makes CPUs < setup_max_cpus present.
After this change, openrisc and hexagon systems that use the max_cpus
command line argument would not see the other CPUs present in sysfs.
This should not be a problem as these CPUs can't be brought online as
_cpu_up() checks cpu_present().
After this change, only CPUs which are present appear in sysfs.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R36-00Csz0-Px@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
register_cpu_capacity_sysctl() adds a property to sysfs that describes
the CPUs capacity. This is done from a subsys_initcall() that assumes
all possible CPUs are registered.
With CPU hotplug, possible CPUs aren't registered until they become
present, (or for arm64 enabled). This leads to messages during boot:
| register_cpu_capacity_sysctl: too early to get CPU1 device!
and once these CPUs are added to the system, the file is missing.
Move this to a cpuhp callback, so that the file is created once
CPUs are brought online. This covers CPUs that are added late by
mechanisms like hotplug.
One observable difference is the file is now missing for offline CPUs.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R2g-00CsyV-Ss@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In current code, init_irq_stacks() will call cpu_to_node().
The cpu_to_node() depends on percpu "numa_node" which is initialized in:
arch_call_rest_init() --> rest_init() -- kernel_init()
--> kernel_init_freeable() --> smp_prepare_cpus()
But init_irq_stacks() is called in init_IRQ() which is before
arch_call_rest_init().
So in init_irq_stacks(), the cpu_to_node() does not work, it
always return 0. In NUMA, it makes the node 1 cpu accesses the IRQ stack which
is in the node 0.
This patch fixes it by:
1.) export the early_cpu_to_node(), and use it in the init_irq_stacks().
2.) change init_irq_stacks() to __init function.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231124031513.81548-1-shijie@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 0ec7731655 ("regmap: Ensure range selector registers
are updated after cache sync") opening pcm512x based soundcards fail
with EINVAL and dmesg shows sync cache and pm_runtime_get errors:
[ 228.794676] pcm512x 1-004c: Failed to sync cache: -22
[ 228.794740] pcm512x 1-004c: ASoC: error at snd_soc_pcm_component_pm_runtime_get on pcm512x.1-004c: -22
This is caused by the cache check result leaking out into the
regcache_sync return value.
Fix this by making the check local-only, as the comment above the
regcache_read call states a non-zero return value means there's
nothing to do so the return value should not be altered.
Fixes: 0ec7731655 ("regmap: Ensure range selector registers are updated after cache sync")
Cc: stable@vger.kernel.org
Signed-off-by: Matthias Reichl <hias@horus.com>
Link: https://lore.kernel.org/r/20231203222216.96547-1-hias@horus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Add fwnode_name_eq() to implement the functionality of of_node_name_eq()
on fwnode property API. The same convention of ending the comparison at
'@' (besides NUL) is applied on also both ACPI and swnode. The function
is intended for comparing unit address-less node names on DT and firmware
or swnodes compliant with DT bindings.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
The function device_is_dependent() is only called by the driver core
internally and should not, at this time, be called by anyone else
outside of it, so mark it as static so as not to give driver authors the
wrong idea.
Cc: Saravana Kannan <saravanak@google.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023112815-faculty-thud-add8@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dev_coredumpm() creates a devcoredump device and adds it
to the core kernel framework which eventually end up
sending uevent to the user space and later creates a
symbolic link to the failed device. An application
running in userspace may be interested in this symbolic
link to get the name of the failed device.
In a issue scenario, once uevent sent to the user space
it start reading '/sys/class/devcoredump/devcdX/failing_device'
to get the actual name of the device which might not been
created and it is in its path of creation.
To fix this, suppress sending uevent till the failing device
symbolic link gets created and send uevent once symbolic
link is created successfully.
Fixes: 833c95456a ("device coredump: add new device coredump class")
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1700232572-25823-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No error code are available to signal an invalid firmware content.
Drivers that can check the firmware content validity can not return this
specific failure to the user-space
Expand the firmware error code with an additional code:
- "firmware invalid" code which can be used when the provided firmware
is invalid
Sync lib/test_firmware.c file accordingly.
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231122-feature_firmware_error_code-v3-1-04ec753afb71@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sometimes the users want to match the single value string property
against an array of predefined strings. Create a helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230808162800.61651-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Use fwnode_property_string_array_count() instead of open coded variant.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230808162800.61651-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Add a test for writing to a noinc register, which verifies that the
write does not touch adjacent registers. This test succeeds with [1]
applied and fails without it.
[1] 984a4afdc8 ("regmap: prevent noinc writes from clobbering cache")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Link: https://lore.kernel.org/r/20231102203039.3069305-2-ben.wolsieffer@hefring.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Support noinc semantics in RAM backed regmaps, for testing purposes. Add
a new callback that selects registers which should have noinc behavior.
Bulk writes to a noinc register will cause the last value in the buffer
to be assigned to the register, while bulk reads will copy the same
value repeatedly into the buffer.
This patch only adds support to regmap-raw-ram, since regmap-ram does
not support bulk operations.
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Link: https://lore.kernel.org/r/20231102203039.3069305-1-ben.wolsieffer@hefring.com
Signed-off-by: Mark Brown <broonie@kernel.org>
One fix here, for an interaction between noinc registers and caches - if
a device uses noinc registers (which is rare) then we could corrupt
registers after the noinc register in the cache.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmVKGw8ACgkQJNaLcl1U
h9C8Bwf/U2a00UtJqvg+deoXHQ/7QY8w093DnHxdTLyUcDvwffbfTMrPPOmPtRLh
1OYTtPPAgcF2PjyMaazHON2Y3a1/jPLEyjfRI3UmMWoVCVObUzh30ARASdRlvZj0
1bnj88v32W2hFu/TuUPVVii3g2qH3XccEvX/JaNm0yG9lBHfugPGbVOI2S+MzWL9
iFq+nYyqi6RzD7gq99ZaoA6UdacbCyPWgvHKxoZqhknTlX/KQQDdBlq2alhmbOoU
lW3t9dwtZlcoskSTmKDI7Ukz0EnRvK1fcrkhZ/3XrQl349eLmtqcG7rM/ZMoORqL
wx8eoxNCyCDnjxV+QfEL6SSQEX+8Mg==
=M81g
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.7-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"One fix here, for an interaction between noinc registers and caches.
If a device uses noinc registers (which is rare) then we could corrupt
registers after the noinc register in the cache"
* tag 'regmap-fix-v6.7-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: prevent noinc writes from clobbering cache
Here is the set of driver core updates for 6.7-rc1. Nothing major in
here at all, just a small number of changes including:
- minor cleanups and updates from Andy Shevchenko
- __counted_by addition
- firmware_loader update for aborting loads cleaner
- other minor changes, details in the shortlog
- documentation update
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZUTe8A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynP0QCfT2jQx3OcL22MoqCvdTuZJKPiHSIAoMxrliJF
d4cUeICW17ywlTFzsKg8
=nTeu
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core updates for 6.7-rc1. Nothing major in
here at all, just a small number of changes including:
- minor cleanups and updates from Andy Shevchenko
- __counted_by addition
- firmware_loader update for aborting loads cleaner
- other minor changes, details in the shortlog
- documentation update
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (21 commits)
firmware_loader: Abort all upcoming firmware load request once reboot triggered
firmware_loader: Refactor kill_pending_fw_fallback_reqs()
Documentation: security-bugs.rst: linux-distros relaxed their rules
driver core: Release all resources during unbind before updating device links
driver core: class: remove boilerplate code
driver core: platform: Annotate struct irq_affinity_devres with __counted_by
resource: Constify resource crosscheck APIs
resource: Unify next_resource() and next_resource_skip_children()
resource: Reuse for_each_resource() macro
PCI: Implement custom llseek for sysfs resource entries
kernfs: sysfs: support custom llseek method for sysfs entries
debugfs: Fix __rcu type comparison warning
device property: Replace custom implementation of COUNT_ARGS()
drivers: base: test: Make property entry API test modular
driver core: Add missing parameter description to __fwnode_link_add()
device property: Clarify usage scope of some struct fwnode_handle members
devres: rename the first parameter of devm_add_action(_or_reset)
driver core: platform: Unify the firmware node type check
driver core: platform: Use temporary variable in platform_device_add()
driver core: platform: Refactor error path in a couple places
...
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series "Fixes and cleanups to compaction".
- Joel Fernandes has a patchset ("Optimize mremap during mutual
alignment within PMD") which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested.
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series "Do not try to access unaccepted memory" Adrian Hunter
provides some fixups for the recently-added "unaccepted memory' feature.
To increase the feature's checking coverage. "Plug a few gaps where
RAM is exposed without checking if it is unaccepted memory".
- In the series "cleanups for lockless slab shrink" Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code.
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series "use refcount+RCU method to implement
lockless slab shrink".
- David Hildenbrand contributes some maintenance work for the rmap code
in the series "Anon rmap cleanups".
- Kefeng Wang does more folio conversions and some maintenance work in
the migration code. Series "mm: migrate: more folio conversion and
unification".
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series "Add and use bdev_getblk()".
- In the series "Use nth_page() in place of direct struct page
manipulation" Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames.
- In the series "mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO" has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of gigantic
pages are in use.
- Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
rationalization and folio conversions in the hugetlb code.
- Yin Fengwei has improved mlock()'s handling of large folios in the
series "support large folio for mlock"
- In the series "Expose swapcache stat for memcg v1" Liu Shixin has
added statistics for memcg v1 users which are available (and useful)
under memcg v2.
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named "MDWE
without inheritance".
- Kefeng Wang has provided the series "mm: convert numa balancing
functions to use a folio" which does what it says.
- In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
makes is possible for a process to propagate KSM treatment across
exec().
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use "high
bandwidth memory" in addition to Optane Data Center Persistent Memory
Modules (DCPMM). The series is named "memory tiering: calculate
abstract distance based on ACPI HMAT"
- In the series "Smart scanning mode for KSM" Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans.
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
series "mm: memcg: fix tracking of pending stats updates values".
- In the series "Implement IOCTL to get and optionally clear info about
PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
us to atomically read-then-clear page softdirty state. This is mainly
used by CRIU.
- Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
- a bunch of relatively minor maintenance tweaks to this code.
- Matthew Wilcox has increased the use of the VMA lock over file-backed
page faults in the series "Handle more faults under the VMA lock". Some
rationalizations of the fault path became possible as a result.
- In the series "mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
and folio conversions.
- In the series "various improvements to the GUP interface" Lorenzo
Stoakes has simplified and improved the GUP interface with an eye to
providing groundwork for future improvements.
- Andrey Konovalov has sent along the series "kasan: assorted fixes and
improvements" which does those things.
- Some page allocator maintenance work from Kemeng Shi in the series
"Two minor cleanups to break_down_buddy_pages".
- In thes series "New selftest for mm" Breno Leitao has developed
another MM self test which tickles a race we had between madvise() and
page faults.
- In the series "Add folio_end_read" Matthew Wilcox provides cleanups
and an optimization to the core pagecache code.
- Nhat Pham has added memcg accounting for hugetlb memory in the series
"hugetlb memcg accounting".
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series "Abstract vma_merge() and split_vma()".
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series "Fix page_owner's use of free timestamps".
- Lorenzo Stoakes has fixed the handling of new mappings of sealed files
in the series "permit write-sealed memfd read-only shared mappings".
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series "Batch hugetlb vmemmap modification operations".
- Some buffer_head folio conversions and cleanups from Matthew Wilcox in
the series "Finish the create_empty_buffers() transition".
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the series
"mm: PCP high auto-tuning".
- Roman Gushchin has contributed the patchset "mm: improve performance
of accounted kernel memory allocations" which improves their performance
by ~30% as measured by a micro-benchmark.
- folio conversions from Kefeng Wang in the series "mm: convert page
cpupid functions to folios".
- Some kmemleak fixups in Liu Shixin's series "Some bugfix about
kmemleak".
- Qi Zheng has improved our handling of memoryless nodes by keeping them
off the allocation fallback list. This is done in the series "handle
memoryless nodes more appropriately".
- khugepaged conversions from Vishal Moola in the series "Some
khugepaged folio conversions".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
FgeUPAD1oasg6CP+INZvCj34waNxwAc=
=E+Y4
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series 'Fixes and cleanups to compaction'
- Joel Fernandes has a patchset ('Optimize mremap during mutual
alignment within PMD') which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i
the following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series 'Do not try to access unaccepted memory' Adrian
Hunter provides some fixups for the recently-added 'unaccepted
memory' feature. To increase the feature's checking coverage. 'Plug
a few gaps where RAM is exposed without checking if it is
unaccepted memory'
- In the series 'cleanups for lockless slab shrink' Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series 'use refcount+RCU method to
implement lockless slab shrink'
- David Hildenbrand contributes some maintenance work for the rmap
code in the series 'Anon rmap cleanups'
- Kefeng Wang does more folio conversions and some maintenance work
in the migration code. Series 'mm: migrate: more folio conversion
and unification'
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series 'Add and use bdev_getblk()'
- In the series 'Use nth_page() in place of direct struct page
manipulation' Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames
- In the series 'mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO' has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of
gigantic pages are in use
- Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
rationalization and folio conversions in the hugetlb code
- Yin Fengwei has improved mlock()'s handling of large folios in the
series 'support large folio for mlock'
- In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
added statistics for memcg v1 users which are available (and
useful) under memcg v2
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named 'MDWE
without inheritance'
- Kefeng Wang has provided the series 'mm: convert numa balancing
functions to use a folio' which does what it says
- In the series 'mm/ksm: add fork-exec support for prctl' Stefan
Roesch makes is possible for a process to propagate KSM treatment
across exec()
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use 'high
bandwidth memory' in addition to Optane Data Center Persistent
Memory Modules (DCPMM). The series is named 'memory tiering:
calculate abstract distance based on ACPI HMAT'
- In the series 'Smart scanning mode for KSM' Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in
the series 'mm: memcg: fix tracking of pending stats updates
values'
- In the series 'Implement IOCTL to get and optionally clear info
about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
which permits us to atomically read-then-clear page softdirty
state. This is mainly used by CRIU
- Hugh Dickins contributed the series 'shmem,tmpfs: general
maintenance', a bunch of relatively minor maintenance tweaks to
this code
- Matthew Wilcox has increased the use of the VMA lock over
file-backed page faults in the series 'Handle more faults under the
VMA lock'. Some rationalizations of the fault path became possible
as a result
- In the series 'mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()' David Hildenbrand has implemented some
cleanups and folio conversions
- In the series 'various improvements to the GUP interface' Lorenzo
Stoakes has simplified and improved the GUP interface with an eye
to providing groundwork for future improvements
- Andrey Konovalov has sent along the series 'kasan: assorted fixes
and improvements' which does those things
- Some page allocator maintenance work from Kemeng Shi in the series
'Two minor cleanups to break_down_buddy_pages'
- In thes series 'New selftest for mm' Breno Leitao has developed
another MM self test which tickles a race we had between madvise()
and page faults
- In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
and an optimization to the core pagecache code
- Nhat Pham has added memcg accounting for hugetlb memory in the
series 'hugetlb memcg accounting'
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series 'Abstract vma_merge() and split_vma()'
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series 'Fix page_owner's use of free timestamps'
- Lorenzo Stoakes has fixed the handling of new mappings of sealed
files in the series 'permit write-sealed memfd read-only shared
mappings'
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series 'Batch hugetlb vmemmap modification operations'
- Some buffer_head folio conversions and cleanups from Matthew Wilcox
in the series 'Finish the create_empty_buffers() transition'
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the
series 'mm: PCP high auto-tuning'
- Roman Gushchin has contributed the patchset 'mm: improve
performance of accounted kernel memory allocations' which improves
their performance by ~30% as measured by a micro-benchmark
- folio conversions from Kefeng Wang in the series 'mm: convert page
cpupid functions to folios'
- Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
kmemleak'
- Qi Zheng has improved our handling of memoryless nodes by keeping
them off the allocation fallback list. This is done in the series
'handle memoryless nodes more appropriately'
- khugepaged conversions from Vishal Moola in the series 'Some
khugepaged folio conversions'"
[ bcachefs conflicts with the dynamically allocated shrinkers have been
resolved as per Stephen Rothwell in
https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/
with help from Qi Zheng.
The clone3 test filtering conflict was half-arsed by yours truly ]
* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
mm/damon/sysfs: update monitoring target regions for online input commit
mm/damon/sysfs: remove requested targets when online-commit inputs
selftests: add a sanity check for zswap
Documentation: maple_tree: fix word spelling error
mm/vmalloc: fix the unchecked dereference warning in vread_iter()
zswap: export compression failure stats
Documentation: ubsan: drop "the" from article title
mempolicy: migration attempt to match interleave nodes
mempolicy: mmap_lock is not needed while migrating folios
mempolicy: alloc_pages_mpol() for NUMA policy without vma
mm: add page_rmappable_folio() wrapper
mempolicy: remove confusing MPOL_MF_LAZY dead code
mempolicy: mpol_shared_policy_init() without pseudo-vma
mempolicy trivia: use pgoff_t in shared mempolicy tree
mempolicy trivia: slightly more consistent naming
mempolicy trivia: delete those ancient pr_debug()s
mempolicy: fix migrate_pages(2) syscall return nr_failed
kernfs: drop shared NUMA mempolicy hooks
hugetlbfs: drop shared NUMA mempolicy pretence
mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
...
To help make the move of sysctls out of kernel/sysctl.c not incur a size
penalty sysctl has been changed to allow us to not require the sentinel, the
final empty element on the sysctl array. Joel Granados has been doing all this
work. On the v6.6 kernel we got the major infrastructure changes required to
support this. For v6.7-rc1 we have all arch/ and drivers/ modified to remove
the sentinel. Both arch and driver changes have been on linux-next for a bit
less than a month. It is worth re-iterating the value:
- this helps reduce the overall build time size of the kernel and run time
memory consumed by the kernel by about ~64 bytes per array
- the extra 64-byte penalty is no longer inncurred now when we move sysctls
out from kernel/sysctl.c to their own files
For v6.8-rc1 expect removal of all the sentinels and also then the unneeded
check for procname == NULL.
The last 2 patches are fixes recently merged by Krister Johansen which allow
us again to use softlockup_panic early on boot. This used to work but the
alias work broke it. This is useful for folks who want to detect softlockups
super early rather than wait and spend money on cloud solutions with nothing
but an eventual hung kernel. Although this hadn't gone through linux-next it's
also a stable fix, so we might as well roll through the fixes now.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmVCqKsSHG1jZ3JvZkBr
ZXJuZWwub3JnAAoJEM4jHQowkoinEgYQAIpkqRL85DBwems19Uk9A27lkctwZ6Fc
HdslQCObQTsbuKVimZFP4IL2beUfUE0cfLZCXlzp+4nRDOf6vyhyf3w19jPQtI0Q
YdqwTk9y6G5VjDsb35QK0+UBloY/kZ1H3/LW4uCwjXTuksUGmWW2Qvey35696Scv
hDMLADqKQmdpYxLUaNi9QyYbEAjYtOai2ezg3+i7hTG168t1k/Ab2BxIFrPVsCR2
FAiq05L4ugWjNskdsWBjck05JZsx9SK/qcAxpIPoUm4nGiFNHApXE0E0hs3vsnmn
WIHIbxCQw8ZlUDlmw4S+0YH3NFFzFbWfmW8k2b0f2qZTJm/rU4KiJfcJVknkAUVF
raFox6XDW0AUQ9L/NOUJ9ip5rup57GcFrMYocdJ3PPAvvmHKOb1D1O741p75RRcc
9j7zwfIRrzjPUqzhsQS/GFjdJu3lJNmEBK1AcgrVry6WoItrAzJHKPPDC7TwaNmD
eXpjxMl1sYzzHqtVh4hn+xkUYphj/6gTGMV8zdo+/FopFswgeJW9G8kHtlEWKDPk
MRIKwACmfetP6f3ngHunBg+BOipbjCANL7JI0nOhVOQoaULxCCPx+IPJ6GfSyiuH
AbcjH8DGI7fJbUkBFoF0dsRFZ2gH8ds1PYMbWUJ6x3FtuCuv5iIuvQYoaWU6itm7
6f0KvCogg0fU
=Qf50
-----END PGP SIGNATURE-----
Merge tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull sysctl updates from Luis Chamberlain:
"To help make the move of sysctls out of kernel/sysctl.c not incur a
size penalty sysctl has been changed to allow us to not require the
sentinel, the final empty element on the sysctl array. Joel Granados
has been doing all this work. On the v6.6 kernel we got the major
infrastructure changes required to support this. For v6.7-rc1 we have
all arch/ and drivers/ modified to remove the sentinel. Both arch and
driver changes have been on linux-next for a bit less than a month. It
is worth re-iterating the value:
- this helps reduce the overall build time size of the kernel and run
time memory consumed by the kernel by about ~64 bytes per array
- the extra 64-byte penalty is no longer inncurred now when we move
sysctls out from kernel/sysctl.c to their own files
For v6.8-rc1 expect removal of all the sentinels and also then the
unneeded check for procname == NULL.
The last two patches are fixes recently merged by Krister Johansen
which allow us again to use softlockup_panic early on boot. This used
to work but the alias work broke it. This is useful for folks who want
to detect softlockups super early rather than wait and spend money on
cloud solutions with nothing but an eventual hung kernel. Although
this hadn't gone through linux-next it's also a stable fix, so we
might as well roll through the fixes now"
* tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (23 commits)
watchdog: move softlockup_panic back to early_param
proc: sysctl: prevent aliased sysctls from getting passed to init
intel drm: Remove now superfluous sentinel element from ctl_table array
Drivers: hv: Remove now superfluous sentinel element from ctl_table array
raid: Remove now superfluous sentinel element from ctl_table array
fw loader: Remove the now superfluous sentinel element from ctl_table array
sgi-xp: Remove the now superfluous sentinel element from ctl_table array
vrf: Remove the now superfluous sentinel element from ctl_table array
char-misc: Remove the now superfluous sentinel element from ctl_table array
infiniband: Remove the now superfluous sentinel element from ctl_table array
macintosh: Remove the now superfluous sentinel element from ctl_table array
parport: Remove the now superfluous sentinel element from ctl_table array
scsi: Remove now superfluous sentinel element from ctl_table array
tty: Remove now superfluous sentinel element from ctl_table array
xen: Remove now superfluous sentinel element from ctl_table array
hpet: Remove now superfluous sentinel element from ctl_table array
c-sky: Remove now superfluous sentinel element from ctl_talbe array
powerpc: Remove now superfluous sentinel element from ctl_table arrays
riscv: Remove now superfluous sentinel element from ctl_table array
x86/vdso: Remove now superfluous sentinel element from ctl_table array
...
The highlights for the driver support this time are
- Qualcomm platforms gain support for the Qualcomm Secure Execution
Environment firmware interface to access EFI variables on certain
devices, and new features for multiple platform and firmware drivers.
- Arm FF-A firmware support gains support for v1.1 specification features,
in particular notification and memory transaction descriptor changes.
- SCMI firmware support now support v3.2 features for clock and DVFS
configuration and a new transport for Qualcomm platforms.
- Minor cleanups and bugfixes are added to pretty much all the active
platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive, amlogic,
atmel, tegra, aspeed, vexpress, mediatek, samsung and more.
In particular, this contains portions of the treewide conversion to
use __counted_by annotations and the device_get_match_data helper.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC10IACgkQYKtH/8kJ
UifFoQ//Tw7aux88EA2UkyL2Wulv80NwRQn3tQlxI/6ltjBX64yeQ6Y8OzmYdSYK
20NEpbU7VWOFftN+D6Jp1HLrvfi0OV9uJn3WiTX3ChgDXixpOXo4TYgNNTlb9uZ4
MrSTG3NkS27m/oTaCmYprOObgSNLq1FRCGIP7w4U9gyMk9N9FSKMpSJjlH06qPz6
WBLTaIwPgBsyrLfCdxfA1y7AFCAHVxQJO4bp0VWSIalTrneGTeQrd2FgYMUesQ2e
fIUNCaU4mpmj8XnQ/W19Wsek8FRB+fOh0hn/Gl+iHYibpxusIsn7bkdZ5BOJn2J0
OY3C1biopaaxXcZ+wmnX9X0ieZ3TDsHzYOEf0zmNGzMZaZkV8kQt4/Ykv77xz6Gc
4Bl6JI5QZ4rTZvlHYGMYxhy3hKuB31mO2rHbei7eR7J7UmjzWcl5P6HYfCgj7wzH
crIWj1IR1Nx6Dt/wXf3HlRcEiAEJ2D0M3KIFjAVT239TsxacBfDrRk+YedF2bKbn
WMYfVM6jJnPOykGg/gMRlttS/o/7TqHBl3y/900Idiijcm3cRPbQ+uKfkpHXftN/
2vOtsw7pzEg7QQI9GVrb4drTrLvYJ7GQOi4o0twXTCshlXUk2V684jvHt0emFkdX
ew9Zft4YLAYSmuJ3XqGhhMP63FsHKMlB1aSTKKPeswdIJmrdO80=
=QIut
-----END PGP SIGNATURE-----
Merge tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"The highlights for the driver support this time are
- Qualcomm platforms gain support for the Qualcomm Secure Execution
Environment firmware interface to access EFI variables on certain
devices, and new features for multiple platform and firmware
drivers.
- Arm FF-A firmware support gains support for v1.1 specification
features, in particular notification and memory transaction
descriptor changes.
- SCMI firmware support now support v3.2 features for clock and DVFS
configuration and a new transport for Qualcomm platforms.
- Minor cleanups and bugfixes are added to pretty much all the active
platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive,
amlogic, atmel, tegra, aspeed, vexpress, mediatek, samsung and
more.
In particular, this contains portions of the treewide conversion to
use __counted_by annotations and the device_get_match_data helper"
* tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (156 commits)
soc: qcom: pmic_glink_altmode: Print return value on error
firmware: qcom: scm: remove unneeded 'extern' specifiers
firmware: qcom: scm: add a missing forward declaration for struct device
firmware: qcom: move Qualcomm code into its own directory
soc: samsung: exynos-chipid: Convert to platform remove callback returning void
soc: qcom: apr: Add __counted_by for struct apr_rx_buf and use struct_size()
soc: qcom: pmic_glink: fix connector type to be DisplayPort
soc: ti: k3-socinfo: Avoid overriding return value
soc: ti: k3-socinfo: Fix typo in bitfield documentation
soc: ti: knav_qmss_queue: Use device_get_match_data()
firmware: ti_sci: Use device_get_match_data()
firmware: qcom: qseecom: add missing include guards
soc/pxa: ssp: Convert to platform remove callback returning void
soc/mediatek: mtk-mmsys: Convert to platform remove callback returning void
soc/mediatek: mtk-devapc: Convert to platform remove callback returning void
soc/loongson: loongson2_guts: Convert to platform remove callback returning void
soc/litex: litex_soc_ctrl: Convert to platform remove callback returning void
soc/ixp4xx: ixp4xx-qmgr: Convert to platform remove callback returning void
soc/ixp4xx: ixp4xx-npe: Convert to platform remove callback returning void
soc/hisilicon: kunpeng_hccs: Convert to platform remove callback returning void
...
Currently, noinc writes are cached as if they were standard incrementing
writes, overwriting unrelated register values in the cache. Instead, we
want to cache the last value written to the register, as is done in the
accelerated noinc handler (regmap_noinc_readwrite).
Fixes: cdf6b11daa ("regmap: Add regmap_noinc_write API")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Link: https://lore.kernel.org/r/20231101142926.2722603-2-ben.wolsieffer@hefring.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The main change here is a fix for an issue where we were letting the
selector for windowed register ranges get out of sync with the hardware
during a cache sync plus associated KUnit tests. This was reported just
at the end of the release cycle and only in -next for a day prior to the
merge window so it seemed better to hold off for now, the bug had been
present for more than a decade so wasn't causing too many practical
problems hopefully. There's also a fix for error handling in the
debugfs output from Christope Jaillet.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmU/qxsACgkQJNaLcl1U
h9CvIgf+JL9LrFApTZliWjEPg1K8CxhRzoFRxUb0tP47f0cJ7zJ6+5uv5GGbfj3N
liZvRIMetTs8jk4T/avahS1t55XZ9EkGlItsAFny83otfOw5xA6WOqFTRl5D39G6
yHSJw1gyX+4L5lhJrWbEZG7/SgpKjptZQH6j1/4cwXYTU19gpriypJi4P9nBOzyw
GXJqQzTgq8LMdF7twPLR6QipgoE5Z3Oum6qM6C8FfibH/+AsbctzboUhXpJxIO2j
CdnYxCkW2Bu8Y0ngkayz0Hj6UnEntRpHklzJeSiP0cVy8xm4DemFSbOB5l6dQ20O
RWwpnXfQPYtaMzruKRd3FYNQRrOM9Q==
=mrBL
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The main change here is a fix for an issue where we were letting the
selector for windowed register ranges get out of sync with the
hardware during a cache sync plus associated KUnit tests. This was
reported just at the end of the release cycle and only in -next for a
day prior to the merge window so it seemed better to hold off for now,
the bug had been present for more than a decade so wasn't causing too
many practical problems hopefully.
There's also a fix for error handling in the debugfs output from
Christope Jaillet"
* tag 'regmap-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Ensure range selector registers are updated after cache sync
regmap: kunit: Add test for cache sync interaction with ranges
regmap: kunit: Fix marking of the range window as volatile
regmap: debugfs: Fix a erroneous check after snprintf()
There could be following scenario where there is a ongoing reboot
is going from processA which tries to call all the reboot notifier
callback and one of them is firmware reboot call which tries to
abort all the ongoing firmware userspace request under fw_lock but
there could be another processB which tries to do request firmware,
which came just after abort done from ProcessA and ask for userspace
to load the firmware and this can stop the ongoing reboot ProcessA
to stall for next 60s(default timeout) which may not be expected
behaviour everyone like to see, instead we should abort any firmware
load request which came once firmware knows about the reboot through
notification.
ProcessA ProcessB
kernel_restart_prepare
blocking_notifier_call_chain
fw_shutdown_notify
kill_pending_fw_fallback_reqs
__fw_load_abort
fw_state_aborted request_firmware
__fw_state_set firmware_fallback_sysfs
... fw_load_from_user_helper
.. ...
. ..
usermodehelper_read_trylock
fw_load_sysfs_fallback
fw_sysfs_wait_timeout
usermodehelper_disable
__usermodehelper_disable
down_write()
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/1698330459-31776-2-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rename 'only_kill_custom' and refactor logic related to it
to be more meaningful.
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/1698330459-31776-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we sync the register cache we do so with the cache bypassed in order
to avoid overhead from writing the synced values back into the cache. If
the regmap has ranges and the selector register for those ranges is in a
register which is cached this has the unfortunate side effect of meaning
that the physical and cached copies of the selector register can be out of
sync after a cache sync. The cache will have whatever the selector was when
the sync started and the hardware will have the selector for the register
that was synced last.
Fix this by rewriting all cached selector registers after every sync,
ensuring that the hardware and cache have the same content. This will
result in extra writes that wouldn't otherwise be needed but is simple
so hopefully robust. We don't read from the hardware since not all
devices have physical read support.
Given that nobody noticed this until now it is likely that we are rarely if
ever hitting this case.
Reported-by: Hector Martin <marcan@marcan.st>
Cc: stable@vger.kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231026-regmap-fix-selector-sync-v1-1-633ded82770d@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Hector Martin reports that since when doing a cache sync we enable cache
bypass if the selector register for a range is cached then we might leave
the physical selector register pointing to a different value to that which
we have in the cache. If we then try to write to the page that our cache
tells us is selected we will not update the selector register and write to
the wrong page. Add a test case covering this.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231023-regmap-test-window-cache-v1-2-d8a71f441968@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
For some reason the regmap used for testing ranges was not including the
end of the range of paged registers as volatile since it found the end by
counting from the selector register rather than the base of the window.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231023-regmap-test-window-cache-v1-1-d8a71f441968@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
In commit f26b3fa046 ("mm/page_alloc: limit number of high-order pages
on PCP during bulk free"), the PCP (Per-CPU Pageset) will be drained when
PCP is mostly used for high-order pages freeing to improve the cache-hot
pages reusing between page allocating and freeing CPUs.
On system with small per-CPU data cache slice, pages shouldn't be cached
before draining to guarantee cache-hot. But on a system with large
per-CPU data cache slice, some pages can be cached before draining to
reduce zone lock contention.
So, in this patch, instead of draining without any caching, "pcp->batch"
pages will be cached in PCP before draining if the size of the per-CPU
data cache slice is more than "3 * batch".
In theory, if the size of per-CPU data cache slice is more than "2 *
batch", we can reuse cache-hot pages between CPUs. But considering the
other usage of cache (code, other data accessing, etc.), "3 * batch" is
used.
Note: "3 * batch" is chosen to make sure the optimization works on recent
x86_64 server CPUs. If you want to increase it, please check whether it
breaks the optimization.
On a 2-socket Intel server with 128 logical CPU, with the patch, the
network bandwidth of the UNIX (AF_UNIX) test case of lmbench test suite
with 16-pair processes increase 70.5%. The cycles% of the spinlock
contention (mostly for zone lock) decreases from 46.1% to 21.3%. The
number of PCP draining for high order pages freeing (free_high) decreases
89.9%. The cache miss rate keeps 0.2%.
Link: https://lkml.kernel.org/r/20231016053002.756205-4-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This can be used to estimate the size of the data cache slice that can be
used by one CPU under ideal circumstances. Both DATA caches and UNIFIED
caches are used in calculation. So, the users need to consider the impact
of the code cache usage.
Because the cache inclusive/non-inclusive information isn't available now,
we just use the size of the per-CPU slice of LLC to make the result more
predictable across architectures. This may be improved when more cache
information is available in the future.
A brute-force algorithm to iterate all online CPUs is used to avoid to
allocate an extra cpumask, especially in offline callback.
Link: https://lkml.kernel.org/r/20231016053002.756205-3-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This commit fixes a bug in commit 9ed9895370 ("driver core: Functional
dependencies tracking support") where the device link status was
incorrectly updated in the driver unbind path before all the device's
resources were released.
Fixes: 9ed9895370 ("driver core: Functional dependencies tracking support")
Cc: stable <stable@kernel.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/
Signed-off-by: Saravana Kannan <saravanak@google.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: James Clark <james.clark@arm.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A straightforward fix from Johan for a long standing bug in cases where
we both have regmaps without devices and something is using
dev_get_regmap().
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmUv6g4ACgkQJNaLcl1U
h9BAFAf/Xcx9jWPmYJ4Peon5j25nkzuMn0A4aka2uqfr5juTFZYXi3milp5hTpfD
tTC0cgBaADKCTXPU2aDrQeiHd1Rr18sOxAUdQWr7POPsuT+LJFpTjl5agNgPRa9a
LqdPA+KfrYV1snIC+sQDhEuHX1UtxDdXCS4Q0KpxIiEZGt3/Z6kDysWEQCPR8m1e
uNauKz2GaDB1ejXnEnZFMjkleKbBPc8qMF8A/c2lTayIjDn1xYVbj9IzHKoIP+au
S9eMBLrnPNPqhOJxe0pQKIfSz5nbMls/NCZRD/IQN8TTQ/Z6LActfUoKMS09hR6p
t8aEB9YCm8Wb+5ad8A0e3molsylvGQ==
=RWQo
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A straightforward fix from Johan for a long standing bug in cases
where we both have regmaps without devices and something is using
dev_get_regmap()"
* tag 'regmap-fix-v6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: fix NULL deref on lookup
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)
Remove sentinel from firmware_config_table
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Not all regmaps have a name so make sure to check for that to avoid
dereferencing a NULL pointer when dev_get_regmap() is used to lookup a
named regmap.
Fixes: e84861fec3 ("regmap: dev_get_regmap_match(): fix string comparison")
Cc: stable@vger.kernel.org # 5.8
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20231006082104.16707-1-johan+linaro@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct
irq_affinity_devres.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: linux-hardening@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20231006201749.work.432-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is the merge of immutable point in PM OPP tree shared with SCMI so
that the SCMI changes based on these OPP changes can be merged via the
SCMI tree.
* 'opp/pm-domain-scmi' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
OPP: Extend support for the opp-level beyond required-opps
OPP: Switch to use dev_pm_domain_set_performance_state()
OPP: Extend dev_pm_opp_data with a level
OPP: Add dev_pm_opp_add_dynamic() to allow more flexibility
PM: domains: Implement the ->set_performance_state() callback for genpd
PM: domains: Introduce dev_pm_domain_set_performance_state()
To enable generic support for performance scaling for PM domains, let's
implement the ->set_performance_state() callback for genpd.
Beyond this change, users of the corresponding genpd specific API,
dev_pm_genpd_set_performance_state() are encouraged to switch to the common
dev_pm_domain_set_performance_state() API.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The generic PM domain is currently the only PM domain variant that supports
performance scaling. To allow performance scaling to be supported through a
common interface, let's add an optional callback ->set_performance_state(),
in the struct dev_pm_domain.
Moreover, let's add a function, dev_pm_domain_set_performance_state(), that
may be called by consumers to request a new performance state for a device
through its PM domain.
Note that, in most cases it's preferred that a consumer use the OPP library
to request a new performance state for its device. Although, this requires
some additional changes to be supported, which are being implemented from
subsequent changes.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There is no reason why the KUnit Tests for the property entry API can
only be built-in. Add support for building these tests as a loadable
module, like is supported by most other tests.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/98388154383df9d4ced73946efd18318aeea50e2.1695820382.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel documentation validator is not happy with:
drivers/base/core.c:67: warning: Function parameter or member 'flags' not described in '__fwnode_link_add'
Add missing parameter description.
Fixes: 6a6dfdf8b3 ("driver core: fw_devlink: Allow marking a fwnode link as being part of a cycle")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230919195048.3197551-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
OF and ACPI currently are using asymmetrical APIs to check
for the firmware node type. Unify them by using is_*_node()
against struct fwnode_handle pointer.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231003142122.3072824-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the temporary variable for the struct device pointer the code
looks better and slightly easier to read and parse by human being.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231003142122.3072824-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The usual pattern is to bail out on the error case. Besides that
one of the labels is redundant as we may return directly. Refactor
platform_device_add() and platform_dma_configure() accordingly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231003142122.3072824-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting from the commit 37c12e7497 ("[DRIVER MODEL] Improved
dynamically allocated platform_device interface") the pdev expects
to be allocated beforehand or guaranteed to be non-NULL.
Hence the leftover check is now redundant (as we have no combined
calls like platform_device_add(platform_device_alloc(...)) in the
entire kernel source code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231003142122.3072824-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A fix for a long standing issue where when we create a new node in an
rbtree register cache we were failing to convert the register address
of the new register into a bitmask correctly and marking the wrong
register as being present in the newly created node. This would only
have affected devices with a register stride other than 1 but would
corrupt data on those devices.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmUcVmcACgkQJNaLcl1U
h9A4sgf+K/KziZcfrjzZhsuQ02qfNVDwBrYcsJCW8CjomFq1kb7QMwX1NgDkZRRD
+RbBuPvj1t/BMzsePJWrY7wDcOQHPijQPeOufcz6ZxASX0LQN3SCQR2GV+JJJSXb
XW0RLws6WcpG15XZvV8ddxYgeIeAKOhrECZg8bcHvsdin5rU19H4FydiLDS7ZCvV
jN1XYtwtFosaV/Zi28cE5nOJ3xFZeYzzx6SbdpCiQSY4llqiMxVBfffpS3wDQHWb
hLHDHeiD/kNqS+sDVDsHEOgHiWMv1JN8P6pNvPMdNqVDAgq5lVlAB3zex1pFCTV9
FRVBTgKLZEafeMqIpMus6WJe0oKCTQ==
=LK1Y
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for a long standing issue where when we create a new node in an
rbtree register cache we were failing to convert the register address
of the new register into a bitmask correctly and marking the wrong
register as being present in the newly created node.
This would only have affected devices with a register stride other
than 1 but would corrupt data on those devices"
* tag 'regmap-fix-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: rbtree: Fix wrong register marked as in-cache when creating new node
When regcache_rbtree_write() creates a new rbtree_node it was passing the
wrong bit number to regcache_rbtree_set_register(). The bit number is the
offset __in number of registers__, but in the case of creating a new block
regcache_rbtree_write() was not dividing by the address stride to get the
number of registers.
Fix this by dividing by map->reg_stride.
Compare with regcache_rbtree_read() where the bit is checked.
This bug meant that the wrong register was marked as present. The register
that was written to the cache could not be read from the cache because it
was not marked as cached. But a nearby register could be marked as having
a cached value even if it was never written to the cache.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 3f4ff561bc ("regmap: rbtree: Make cache_present bitmap per node")
Link: https://lore.kernel.org/r/20230922153711.28103-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In some cases the OPP tables aren't specified in device tree, but rather
encoded in the FW. To allow a genpd provider to specify them dynamically
instead, let's add a new genpd flag, GENPD_FLAG_OPP_TABLE_FW.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20230825112633.236607-13-ulf.hansson@linaro.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The commit d21fdd07ce ("driver core: Return proper error code when
dev_set_name() fails") rewrote the logic of handling the dev_set_name()
error codes, but missed the point that initially set error value to
-EINVAL might be rewritten and hence the error path can't be triggered
at some circumstances. To fix this, make sure that error variable is
set to -EINVAL when other conditionals are false.
Reported-by: syzbot+bdfb03b1ec8b342c12cb@syzkaller.appspotmail.com
Fixes: d21fdd07ce ("driver core: Return proper error code when dev_set_name() fails")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230828145824.3895288-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is a small set of driver core updates and additions for 6.6-rc1.
Included in here are:
- stable kernel documentation updates
- class structure const work from Ivan on various subsystems
- kernfs tweaks
- driver core tests!
- kobject sanity cleanups
- kobject structure reordering to save space
- driver core error code handling fixups
- other minor driver core cleanups
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH77Q8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylZMACePk8SitfaJc6FfFf5I7YK7Nq0V8MAn0nUjgsR
i8NcNpu/Yv4HGrDgTdh/
=PJbk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is a small set of driver core updates and additions for 6.6-rc1.
Included in here are:
- stable kernel documentation updates
- class structure const work from Ivan on various subsystems
- kernfs tweaks
- driver core tests!
- kobject sanity cleanups
- kobject structure reordering to save space
- driver core error code handling fixups
- other minor driver core cleanups
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
driver core: Call in reversed order in device_platform_notify_remove()
driver core: Return proper error code when dev_set_name() fails
kobject: Remove redundant checks for whether ktype is NULL
kobject: Add sanity check for kset->kobj.ktype in kset_register()
drivers: base: test: Add missing MODULE_* macros to root device tests
drivers: base: test: Add missing MODULE_* macros for platform devices tests
drivers: base: Free devm resources when unregistering a device
drivers: base: Add basic devm tests for platform devices
drivers: base: Add basic devm tests for root devices
kernfs: fix missing kernfs_iattr_rwsem locking
docs: stable-kernel-rules: mention that regressions must be prevented
docs: stable-kernel-rules: fine-tune various details
docs: stable-kernel-rules: make the examples for option 1 a proper list
docs: stable-kernel-rules: move text around to improve flow
docs: stable-kernel-rules: improve structure by changing headlines
base/node: Remove duplicated include
kernfs: attach uuid for every kernfs and report it in fsid
kernfs: add stub helper for kernfs_generic_poll()
x86/resctrl: make pseudo_lock_class a static const structure
x86/MSR: make msr_class a static const structure
...
changes to the clk framework this time around. That's probably because everyone
was on vacation (yours truly included). We did lose a couple clk drivers this
time around because nobody was using those devices. That skews the diffstat a
bit, but either way, nothing looks out of the ordinary here. The usual suspects
are chugging along adding support for more SoCs and fixing bugs.
If I had to choose, I'd say the theme for the past few months has been
"polish". There's quite a few patches that migrate to
devm_platform_ioremap_resource() in here. And there's more than a handful of
patches that move the NR_CLKS define from the DT binding header to the driver.
There's even patches that migrate drivers to use clk_parent_data and clk_hw to
describe clk tree topology. It seems that the spring (summer?) cleaning bug got
some folks, or the semiconductor shortage finally hit the software side.
New Drivers:
- StarFive JH7110 SoC clock drivers
- Qualcomm IPQ5018 Global Clock Controller driver
- Versa3 clk generator to support 48KHz playback/record with audio codec on
RZ/G2L SMARC EVK
Removed Drivers:
- Remove non-OF mmp clk drivers
- Remove OXNAS clk driver
Updates:
- Add __counted_by to struct clk_hw_onecell_data and struct spmi_pmic_div_clk_cc
- Move defines for numbers of clks (NR_CLKS) from DT headers to drivers
- Introduce kstrdup_and_replace() and use it
- Add PLL rates for Rockchip rk3568
- Add the display clock tree for Rockchip rv1126
- Add Audio Clock Generator (ADG) clocks on Renesas R-Car Gen3 and RZ/G2 SoCs
- Convert sun9i-mmc clock to use devm_platform_get_and_ioremap_resource()
- Fix function name in a comment in ccu_mmc_timing.c
- Parameter name correction for ccu_nkm_round_rate()
- Implement CLK_SET_RATE_PARENT for Allwinner NKM clocks, i.e. consider alternative
parent rates when determining clock rates
- Set CLK_SET_RATE_PARENT for Allwinner A64 pll-mipi
- Support finding closest (as opposed to closest but not higher) clock rate
for NM, NKM, mux and div type clocks, as use it for Allwinner A64 pll-video0
- Prefer current parent rate if able to generate ideal clock rate for Allwinner NKM clocks
- Clean up Qualcomm SMD RPM driver, with interconnect bus clocks moved out to
the interconnect drivers
- Fix various PM runtime bugs across many Qualcomm clk drivers
- Migrate Qualcomm MDM9615 is to parent_hw and parent_data
- Add network related resets on Qualcomm IPQ4019
- Add a couple missing USB related clocks to Qualcomm IPQ9574
- Add missing gpll0_sleep_clk_src to Qualcomm MSM8917 global clock controller
- In the Qualcomm QDU1000 global clock controller, GDSCs, clkrefs, and GPLL1 are
added, while PCIe pipe clock, SDCC rcg ops are corrected
- Add missing GDSCs to and correct GDSCs for the SC8280XP global clock controller driver
- Support retention for the Qualcomm SC8280XP display clock controller GDSCs.
- Qualcommm's SDCC apps_clk_src is marked with CLK_OPS_PARENT_ENABLE to fix
issues with missing parent clocks across sc7180, sm7150, sm6350 and sm8250,
while sm8450 is corrected to use floor ops
- Correct Qualcomm SM6350 GPU clock controller's clock supplies
- Drop unwanted clocks from the Qualcomm IPQ5332 GCC driver
- Add missing OXILICX GDSC to Qualcomm MSM8226 GCC
- Change the delay in the Qualcomm reset controller to fsleep() for correctness
- Extend the Qualcomm SM83550 Video clock controller to support SC8280XP
- Add graphics clock support on Renesas RZ/G2M, RZ/G2N, RZ/G2E, and R-Car H3,
M3-W, and M3-N SoCs
- Add Clocked Serial Interface (CSI) clocks on Renesas RZ/V2M
- Add PWM (MTU3) clock and reset on Renesas RZ/G2UL and RZ/Five
- Add the PDM IPC clock for i.MX93
- Add 519.75MHz frequency support for i.MX9 PLL
- Simplify the .determine_rate() implementation for i.MX GPR mux
- Make the i.MX8QXP LPCG clock use devm_platform_ioremap_resource()
- Add the audio mux clock to i.MX8
- Fix the SPLL2 MULT range for PLLv4
- Update the SPLL2 type in i.MX8ULP
- Fix the SAI4 clock on i.MX8MP
- Add silicon revision print for i.MX25 on clocks init
- Drop the return value from __mx25_clocks_init()
- Fix the clock pauses on no-op set_rate for i.MX8M composite clock
- Drop restrictions for i.MX PLL14xx and fix its max prediv value
- Drop the 393216000 and 361267200 from i.MX PLL14xx rate table to allow
glitch free switching
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmTv2wkRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSW1LRAAuHR2HoyB4bRHmCa1bfOfYYDfSWsBWEav
tWIfBl86Nl/Je50Gk2NJ9vqU5OPqRZ57TIniijHHoX5n7/kYcr8KVmlomY07hUeg
CzWyothkxg4k7+rQwVAWvmlR2YAVwzHDKcwq7gkMZOnW/y26LXip99cjopu2CJLx
zVwTgvWollmd4KVlicnAlx4zUjgNkWR24iA4Lcf5ir+Dr6FYNjxLI+akBA8EPxxi
wLixZbScgBSgpGn6KVgoFhclCToPS0gt5m6HfQxJ/svOCU54l+jRKpzkNZGWvyu4
A8t3CRrwL2iS/mfCGk2yRlaKySoLLpjlpW1AI7fHTWbG2P6p8ZphtN7jOeeAEsbq
TNpzWEjtY6B/lfRzxxINXkrtLaqmlnFY/P5np5fDrf/61gRFxLFQemyRdY/xCSJf
Kwq8ja1mrSGWoDGG9XhDqTf9Yek9LRObNzlDrEmn/i/qLTcxhOIz58pzHg4iAlx5
9HDtnJ8hKg4uE1TtT12Bmasb1+WzG7GYYESNfKWZhCvbRqEUzcDOHk7xpwYa1ffx
yZIgMs7Sb/exNW8LMPYmgnyj/f9eo5IdjiQvune+Zy5NrdzfyN6Sf/LSibrqCF2z
X5aFHqQrR8+PifD+se+g5HPa0ezSmBIhXzYUTOC6f+nywlrJjhwDXPDYI6Lcd//p
r4mpOmJS+G4=
=h2Jz
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk subsystem updates from Stephen Boyd:
"This pull request is full of clk driver changes. In fact, there aren't
any changes to the clk framework this time around. That's probably
because everyone was on vacation (yours truly included). We did lose a
couple clk drivers this time around because nobody was using those
devices. That skews the diffstat a bit, but either way, nothing looks
out of the ordinary here. The usual suspects are chugging along adding
support for more SoCs and fixing bugs.
If I had to choose, I'd say the theme for the past few months has been
"polish". There's quite a few patches that migrate to
devm_platform_ioremap_resource() in here. And there's more than a
handful of patches that move the NR_CLKS define from the DT binding
header to the driver. There's even patches that migrate drivers to use
clk_parent_data and clk_hw to describe clk tree topology. It seems
that the spring (summer?) cleaning bug got some folks, or the
semiconductor shortage finally hit the software side.
New Drivers:
- StarFive JH7110 SoC clock drivers
- Qualcomm IPQ5018 Global Clock Controller driver
- Versa3 clk generator to support 48KHz playback/record with audio
codec on RZ/G2L SMARC EVK
Removed Drivers:
- Remove non-OF mmp clk drivers
- Remove OXNAS clk driver
Updates:
- Add __counted_by to struct clk_hw_onecell_data and struct
spmi_pmic_div_clk_cc
- Move defines for numbers of clks (NR_CLKS) from DT headers to
drivers
- Introduce kstrdup_and_replace() and use it
- Add PLL rates for Rockchip rk3568
- Add the display clock tree for Rockchip rv1126
- Add Audio Clock Generator (ADG) clocks on Renesas R-Car Gen3 and
RZ/G2 SoCs
- Convert sun9i-mmc clock to use
devm_platform_get_and_ioremap_resource()
- Fix function name in a comment in ccu_mmc_timing.c
- Parameter name correction for ccu_nkm_round_rate()
- Implement CLK_SET_RATE_PARENT for Allwinner NKM clocks, i.e.
consider alternative parent rates when determining clock rates
- Set CLK_SET_RATE_PARENT for Allwinner A64 pll-mipi
- Support finding closest (as opposed to closest but not higher)
clock rate for NM, NKM, mux and div type clocks, as use it for
Allwinner A64 pll-video0
- Prefer current parent rate if able to generate ideal clock rate for
Allwinner NKM clocks
- Clean up Qualcomm SMD RPM driver, with interconnect bus clocks
moved out to the interconnect drivers
- Fix various PM runtime bugs across many Qualcomm clk drivers
- Migrate Qualcomm MDM9615 is to parent_hw and parent_data
- Add network related resets on Qualcomm IPQ4019
- Add a couple missing USB related clocks to Qualcomm IPQ9574
- Add missing gpll0_sleep_clk_src to Qualcomm MSM8917 global clock
controller
- In the Qualcomm QDU1000 global clock controller, GDSCs, clkrefs,
and GPLL1 are added, while PCIe pipe clock, SDCC rcg ops are
corrected
- Add missing GDSCs to and correct GDSCs for the SC8280XP global
clock controller driver
- Support retention for the Qualcomm SC8280XP display clock
controller GDSCs.
- Qualcommm's SDCC apps_clk_src is marked with CLK_OPS_PARENT_ENABLE
to fix issues with missing parent clocks across sc7180, sm7150,
sm6350 and sm8250, while sm8450 is corrected to use floor ops
- Correct Qualcomm SM6350 GPU clock controller's clock supplies
- Drop unwanted clocks from the Qualcomm IPQ5332 GCC driver
- Add missing OXILICX GDSC to Qualcomm MSM8226 GCC
- Change the delay in the Qualcomm reset controller to fsleep() for
correctness
- Extend the Qualcomm SM83550 Video clock controller to support
SC8280XP
- Add graphics clock support on Renesas RZ/G2M, RZ/G2N, RZ/G2E, and
R-Car H3, M3-W, and M3-N SoCs
- Add Clocked Serial Interface (CSI) clocks on Renesas RZ/V2M
- Add PWM (MTU3) clock and reset on Renesas RZ/G2UL and RZ/Five
- Add the PDM IPC clock for i.MX93
- Add 519.75MHz frequency support for i.MX9 PLL
- Simplify the .determine_rate() implementation for i.MX GPR mux
- Make the i.MX8QXP LPCG clock use devm_platform_ioremap_resource()
- Add the audio mux clock to i.MX8
- Fix the SPLL2 MULT range for PLLv4
- Update the SPLL2 type in i.MX8ULP
- Fix the SAI4 clock on i.MX8MP
- Add silicon revision print for i.MX25 on clocks init
- Drop the return value from __mx25_clocks_init()
- Fix the clock pauses on no-op set_rate for i.MX8M composite clock
- Drop restrictions for i.MX PLL14xx and fix its max prediv value
- Drop the 393216000 and 361267200 from i.MX PLL14xx rate table to
allow glitch free switching"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (207 commits)
clk: qcom: Fix SM_GPUCC_8450 dependencies
clk: lmk04832: Support using PLL1_LD as SPI readback pin
clk: lmk04832: Don't disable vco clock on probe fail
clk: lmk04832: Set missing parent_names for output clocks
clk: mvebu: Convert to devm_platform_ioremap_resource()
clk: nuvoton: Convert to devm_platform_ioremap_resource()
clk: socfpga: agilex: Convert to devm_platform_ioremap_resource()
clk: ti: Use devm_platform_get_and_ioremap_resource()
clk: mediatek: Convert to devm_platform_ioremap_resource()
clk: hsdk-pll: Convert to devm_platform_ioremap_resource()
clk: gemini: Convert to devm_platform_ioremap_resource()
clk: fsl-sai: Convert to devm_platform_ioremap_resource()
clk: bm1880: Convert to devm_platform_ioremap_resource()
clk: axm5516: Convert to devm_platform_ioremap_resource()
clk: actions: Convert to devm_platform_ioremap_resource()
clk: cdce925: Remove redundant of_match_ptr()
clk: pxa910: Move number of clocks to driver source
clk: pxa1928: Move number of clocks to driver source
clk: pxa168: Move number of clocks to driver source
clk: mmp2: Move number of clocks to driver source
...
DT core:
- Add support for generating DT nodes for PCI devices. This is the
groundwork for applying overlays to PCI devices containing
non-discoverable downstream devices.
- DT unittest additions to check reverted changesets, to test for
refcount issues, and to test unresolved symbols. Also, various
clean-ups of the unittest along the way.
- Refactor node and property manipulation functions to better share code
with old API and changeset API
- Refactor changeset print functions to a common implementation
- Move some platform_device specific functions into of_platform.c
Bindings:
- Treewide fixing of typos
- Treewide clean-up of SPDX tags to use 'OR' consistently
- Last chunk of dropping unnecessary quotes. With that, the check
for unnecessary quotes is enabled in yamllint.
- Convert ftgmac100, zynqmp-genpd, pps-gpio, syna,rmi4, and qcom,ssbi
bindings to DT schema format
- Add Allwinner V3s xHCI USB, Saef SF-TC154B display, QCom SM8450 Inline
Crypto Engine, QCom SM6115 UFS, QCom SDM670 PDC interrupt controller,
Arm 2022 Cortex cores, and QCom IPQ9574 Crypto bindings
- Fixes for Rockchip DWC PCI binding
- Ensure all properties are evaluated on USB connector schema
- Fix dt-check-compatible script to find of_device_id instances with
compiler annotations
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmTubyoACgkQ+vtdtY28
YcPamA//feXFYNPiIbSa7XqfAu1PE5XSg3PqCe77QvLBGJU7saTwRJApc88iTjlA
hc5EELnZKp3FE9N7DJdmvEjYxKDqtJOukO+txKy3mFBWo+gZQURthZVcbLxUZmpw
XmYA4b/GrIv5h8YWG1wokyaGTtSfTcf0+RmAtVepiDk5kWQKaC04Let356fKn9xi
ePgLTZV6BJvPoGpMWd08o+1szUAc6Vihs9qWu7g0+mtb5K5xi/l05YMz3REu7kpf
iz06CE/uzYvHpFBJZ6izN+9Qqxh52DnWckXX68v8kStHUON2h1YmZYvjhGrfay4k
rHeDnHoRBrepDDCytXQ/fxzGtURr3b8yBnlhzEQadMLXmf25mm+TRBDmf6GnX5ij
QmHlj+eSARIafcbb4fqF1Hdyv8c7XM0AkEnj1XrIWLtXPuRNSHlS25dngCztbII/
lqmtBaH1ifCKj2VQ8YL8sVX7k208YU9vDNKZHQyA8dPEYwhknrWmp1F0OAnBB+wz
F11kDE7xkZ0/gE7mUHwe9mP94hC6Ceks4IuBvsTzBmSwqXxyCz8gM2KHK4U3gNUr
Sk2hWgZn+k2HM9zLb38FE18C6hqws6RBUWnJwZ4V3qPo2eYJ8Jzkvm7oonxjHgCC
4FmYYAoCQhBEkZPOJ4map0eO5VbShn9Hrgs46Jj4WoXmm7dFDLc=
=kl+z
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DT core:
- Add support for generating DT nodes for PCI devices. This is the
groundwork for applying overlays to PCI devices containing
non-discoverable downstream devices.
- DT unittest additions to check reverted changesets, to test for
refcount issues, and to test unresolved symbols. Also, various
clean-ups of the unittest along the way.
- Refactor node and property manipulation functions to better share
code with old API and changeset API
- Refactor changeset print functions to a common implementation
- Move some platform_device specific functions into of_platform.c
Bindings:
- Treewide fixing of typos
- Treewide clean-up of SPDX tags to use 'OR' consistently
- Last chunk of dropping unnecessary quotes. With that, the check for
unnecessary quotes is enabled in yamllint.
- Convert ftgmac100, zynqmp-genpd, pps-gpio, syna,rmi4, and qcom,ssbi
bindings to DT schema format
- Add Allwinner V3s xHCI USB, Saef SF-TC154B display, QCom SM8450
Inline Crypto Engine, QCom SM6115 UFS, QCom SDM670 PDC interrupt
controller, Arm 2022 Cortex cores, and QCom IPQ9574 Crypto bindings
- Fixes for Rockchip DWC PCI binding
- Ensure all properties are evaluated on USB connector schema
- Fix dt-check-compatible script to find of_device_id instances with
compiler annotations"
* tag 'devicetree-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (64 commits)
dt-bindings: usb: Add V3s compatible string for OHCI
dt-bindings: usb: Add V3s compatible string for EHCI
dt-bindings: display: panel: mipi-dbi-spi: add Saef SF-TC154B
dt-bindings: vendor-prefixes: document Saef Technology
dt-bindings: thermal: lmh: update maintainer address
of: unittest: Fix of_unittest_pci_node() kconfig dependencies
dt-bindings: crypto: ice: Document sm8450 inline crypto engine
dt-bindings: ufs: qcom: Add ICE to sm8450 example
dt-bindings: ufs: qcom: Add sm6115 binding
dt-bindings: ufs: qcom: Add reg-names property for ICE
dt-bindings: yamllint: Enable quoted string check
dt-bindings: Drop remaining unneeded quotes
of: unittest-data: Fix whitespace - angular brackets
of: unittest-data: Fix whitespace - indentation
of: unittest-data: Fix whitespace - blank lines
of: unittest-data: Convert remaining overlay DTS files to sugar syntax
of: overlay: unittest: Add test for unresolved symbol
of: unittest: Add separators to of_unittest_overlay_high_level()
of: unittest: Cleanup partially-applied overlays
of: unittest: Merge of_unittest_apply{,_revert}_overlay_check()
...
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
cNY=
=kVVr
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-maping updates from Christoph Hellwig:
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: optimize get_max_slots()
swiotlb: move slot allocation explanation comment where it belongs
swiotlb: search the software IO TLB only if the device makes use of it
swiotlb: allocate a new memory pool when existing pools are full
swiotlb: determine potential physical address limit
swiotlb: if swiotlb is full, fall back to a transient memory pool
swiotlb: add a flag whether SWIOTLB is allowed to grow
swiotlb: separate memory pool data from other allocator data
swiotlb: add documentation and rename swiotlb_do_find_slots()
swiotlb: make io_tlb_default_mem local to swiotlb.c
swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
dma-contiguous: check for memory region overlap
dma-contiguous: support numa CMA for specified node
dma-contiguous: support per-numa CMA for all architectures
dma-mapping: move arch_dma_set_mask() declaration to header
swiotlb: unexport is_swiotlb_active
x86: always initialize xen-swiotlb when xen-pcifront is enabling
xen/pci: add flag for PCI passthrough being possible
("refactor Kconfig to consolidate KEXEC and CRASH options").
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h").
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands").
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions").
- Switch the handling of kdump from a udev scheme to in-kernel handling,
by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
un/plug").
- Many singleton patches to various parts of the tree
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
=WJQa
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support tracking
KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the GENERIC_IOREMAP
ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency improvements
("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation, from
Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code ("Two
minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table range
API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM subsystem
documentation ("Improve mm documentation").
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
=Afws
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in
add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support
tracking KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with
UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the
GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
architectures to take GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency
improvements ("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation,
from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code
("Two minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
most file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
memmap on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock
migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table
range API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM
subsystem documentation ("Improve mm documentation").
* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
maple_tree: shrink struct maple_tree
maple_tree: clean up mas_wr_append()
secretmem: convert page_is_secretmem() to folio_is_secretmem()
nios2: fix flush_dcache_page() for usage from irq context
hugetlb: add documentation for vma_kernel_pagesize()
mm: add orphaned kernel-doc to the rst files.
mm: fix clean_record_shared_mapping_range kernel-doc
mm: fix get_mctgt_type() kernel-doc
mm: fix kernel-doc warning from tlb_flush_rmaps()
mm: remove enum page_entry_size
mm: allow ->huge_fault() to be called without the mmap_lock held
mm: move PMD_ORDER to pgtable.h
mm: remove checks for pte_index
memcg: remove duplication detection for mem_cgroup_uncharge_swap
mm/huge_memory: work on folio->swap instead of page->private when splitting folio
mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
mm/swap: use dedicated entry for swap in folio
mm/swap: stop using page->private on tail pages for THP_SWAP
selftests/mm: fix WARNING comparing pointer to 0
selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
...
This is a much quieter release than the past few, there's one small API
addition that I noticed a user for in ALSA and a bunch of cleanups:
- Provide an interface for determining if a register is present in the
cache and add a user of it in ALSA.
- Full support for dynamic allocations, following the temporary bodges
that were done as fixes in the previous release.
- Remove the unused and questionably working 64 bit support.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmTp43QACgkQJNaLcl1U
h9BjUQf/RDJztYDL/ELIxVPfdFWAgTfAErcRgKpdpnrhMR0GljGYNyrTMwUBbnuF
+EL2v+k34wDNGobLbnGaHJIMb+r26QrHCoX9Zm0r/b5z9v3yS3QXco/MueZyMz45
SYbiEUnXj/WjYUUtVpVw6nxnREH6XpOZ1080XF4Bg9YpyCiq7128WJwivuT8XLJM
qZSb4DOQXFV26nitpTuJMkRIS1FgIh5ZKAPWpvZfP+wOStZhW3IFGlWIAkOQQ5gK
PqFA0BQV1fOVttjRGAR2sKCS4MfuIkIHmCfSfvNfLQG2GW86zb+Bpi0BFEUnl4Py
dqzztIknq19fWe9mCElSrdww25M2vg==
=fqmz
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This is a much quieter release than the past few, there's one small
API addition that I noticed a user for in ALSA and a bunch of
cleanups:
- Provide an interface for determining if a register is present in
the cache and add a user of it in ALSA.
- Full support for dynamic allocations, following the temporary
bodges that were done as fixes in the previous release.
- Remove the unused and questionably working 64 bit support"
* tag 'regmap-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix the type used for a bitmap pointer
regmap: Remove dynamic allocation warnings for rbtree and maple
regmap: rbtree: Use alloc_flags for memory allocations
regmap: maple: Use alloc_flags for memory allocations
regmap: Reject fast_io regmap configurations with RBTREE and MAPLE caches
ALSA: hda: Use regcache_reg_cached() rather than open coding
regmap: Provide test for regcache_reg_present()
regmap: Let users check if a register is cached
regmap: Provide user selectable option to enable regmap
regmap: mmio: Remove unused 64-bit support code
regmap: cache: Revert "Add 64-bit mode support"
regmap: Revert "add 64-bit mode support" and Co.
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace. These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.
For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory. For example:
# udevadm info --attribute-walk /sys/devices/system/memory/memory81
looking at device '/devices/system/memory/memory81':
KERNEL=="memory81"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="00000051"
ATTR{removable}=="1"
ATTR{state}=="online"
ATTR{valid_zones}=="Movable"
looking at parent device '/devices/system/memory':
KERNELS=="memory"
SUBSYSTEMS==""
DRIVERS==""
ATTRS{auto_online_blocks}=="offline"
ATTRS{block_size_bytes}=="8000000"
ATTRS{crash_hotplug}=="1"
For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory. For example:
# udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
looking at device '/devices/system/cpu/cpu0':
KERNEL=="cpu0"
SUBSYSTEM=="cpu"
DRIVER=="processor"
ATTR{crash_notes}=="277c38600"
ATTR{crash_notes_size}=="368"
ATTR{online}=="1"
looking at parent device '/devices/system/cpu':
KERNELS=="cpu"
SUBSYSTEMS==""
DRIVERS==""
ATTRS{crash_hotplug}=="1"
ATTRS{isolated}==""
ATTRS{kernel_max}=="8191"
ATTRS{nohz_full}==" (null)"
ATTRS{offline}=="4-7"
ATTRS{online}=="0-3"
ATTRS{possible}=="0-7"
ATTRS{present}=="0-3"
With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.
For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):
# The kernel updates the crash elfcorehdr for CPU and memory changes
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.
CPU and memory checks are separated in accordance with CONFIG_HOTPLUG_CPU
and CONFIG_MEMORY_HOTPLUG kernel config options. If an architecture
supports, for example, memory hotplug but not CPU hotplug, then the
/sys/devices/system/memory/crash_hotplug attribute file is present, but
the /sys/devices/system/cpu/crash_hotplug attribute file will NOT be
present. Thus the udev rule skips userspace processing of memory hot
un/plug events, but the udev rule will evaluate false for CPU events, thus
allowing userspace to process CPU hot un/plug events (ie the
unload-then-reload of the kdump capture kernel).
Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
It's logically correct to call the removal notifiers in the reversed order
as it might be dependent to each other. Luckily, platform_notify_remove()
currently is not used and the others have no dependency use, but theoretically
it's still possible.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230818133654.767986-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Whe device_add() tries to assign a device name with help of
dev_set_name() the error path explicitly uses -EINVAL, while
the function may return something different.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230817091221.463721-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add one more space to FileHugePages and FilePmdMapped, so the output is
aligned with other rows in /sys/devices/system/node/nodeN/meminfo.
Link: https://lkml.kernel.org/r/be861b50-a790-e041-bcb0-2a987dcfd1a@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With memmap on memory, some architecture needs more details w.r.t altmap
such as base_pfn, end_pfn, etc to unmap vmemmap memory. Instead of
computing them again when we remove a memory block, embed vmem_altmap
details in struct memory_block if we are using memmap on memory block
feature.
[yangyingliang@huawei.com: fix error return code in add_memory_resource()]
Link: https://lkml.kernel.org/r/20230809081552.1351184-1-yangyingliang@huawei.com
Link: https://lkml.kernel.org/r/20230808091501.287660-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 06188bc80c ("drivers: base: Add basic devm tests for root
devices") introduced a new set of tests for root devices that could be
compiled as a module, but didn't have the usual module macros.
Make sure they're there.
Fixes: 06188bc80c ("drivers: base: Add basic devm tests for root devices")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230816073019.1446155-2-mripard@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit b4cc44301b ("drivers: base: Add basic devm tests for platform
devices") introduced a new set of tests for platform devices that could
be compiled as a module, but didn't have the usual module macros.
Make sure they're there.
Fixes: b4cc44301b ("drivers: base: Add basic devm tests for platform devices")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230816073019.1446155-1-mripard@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the current code, devres_release_all() only gets called if the device
has a bus and has been probed.
This leads to issues when using bus-less or driver-less devices where
the device might never get freed if a managed resource holds a reference
to the device. This is happening in the DRM framework for example.
We should thus call devres_release_all() in the device_del() function to
make sure that the device-managed actions are properly executed when the
device is unregistered, even if it has neither a bus nor a driver.
This is effectively the same change than commit 2f8d16a996 ("devres:
release resources on device_del()") that got reverted by commit
a525a3ddea ("driver core: free devres in device_release") over
memory leaks concerns.
This patch effectively combines the two commits mentioned above to
release the resources both on device_del() and device_release() and get
the best of both worlds.
Fixes: a525a3ddea ("driver core: free devres in device_release")
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230720-kunit-devm-inconsistencies-test-v3-3-6aa7e074f373@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Platform devices show some inconsistencies with how devm resources are
released when the device has been probed and when it hasn't.
In particular, we can register device-managed actions no matter if the
device has be bound to a driver or not, but devres_release_all() will
only be called if it was bound to a driver or if there's no reference
held to it anymore.
If it wasn't bound to a driver and we still have a reference,
devres_release_all() will never get called. This is surprising
considering that if the driver isn't bound but doesn't have any
reference to it anymore, that function will get called, and if it was
bound to a driver but still has references, that function will get
called as well.
Even if that case is fairly unusual, it can easily lead to memory leaks.
The plan is, with the next patch, to make it consistent and enforce that
devres_release_all() is called no matter what situation we're in. For
now, it just tests for the current behaviour and skips over the
inconsistencies.
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230720-kunit-devm-inconsistencies-test-v3-2-6aa7e074f373@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The root devices show some odd behaviours compared to regular "bus" devices
that have been probed through the usual mechanism, so let's create kunit
tests to exercise those paths and odd cases.
It's not clear whether root devices are even allowed to use device
managed resources, but the fact that it works in some cases but not
others like shown in that test suite shouldn't happen either way: we
want to make it consistent and documented.
These tests will (after the following patches) ensure that consistency
and effectively document that it's allowed.
If it ever turns out to be a bad idea, we can always roll back and
modify the tests then.
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230720-kunit-devm-inconsistencies-test-v3-1-6aa7e074f373@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In
6524c798b7 ("driver core: cpu: Make cpu_show_not_affected() static")
I fat-fingered the name of cpu_show_gds(). Usually, I'd rebase but since
those are extraordinary embargoed times, the commit above was already
pulled into another tree so no no.
Therefore, fix it ontop.
Fixes: 6524c798b7 ("driver core: cpu: Make cpu_show_not_affected() static")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230811095831.27513-1-bp@alien8.de
Make them all a weak function, aliasing to a single function which
issues the "Not affected" string.
No functional changes.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20230809102700.29449-3-bp@alien8.de
* Add Base GDS mitigation
* Support GDS_NO under KVM
* Fix a documentation typo
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTJh5YACgkQaDWVMHDJ
krAzAw/8DzjhAYEa7a1AodCBMNg8uNOPnLNoRPPNhaN5Iw6W3zXYDBDKT9PyjAIx
RoIM0aHx/oY9nCpK441o25oCWAAyzk6E5/+q9hMa7B4aHUGKqiDUC6L9dC8UiiSN
yvoBv4g7F81QnmyazwYI64S6vnbr4Cqe7K/mvVqQ/vbJiugD25zY8mflRV9YAuMk
Oe7Ff/mCA+I/kqyKhJE3cf3qNhZ61FsFI886fOSvIE7g4THKqo5eGPpIQxR4mXiU
Ri2JWffTaeHr2m0sAfFeLH4VTZxfAgBkNQUEWeG6f2kDGTEKibXFRsU4+zxjn3gl
xug+9jfnKN1ceKyNlVeJJZKAfr2TiyUtrlSE5d+subIRKKBaAGgnCQDasaFAluzd
aZkOYz30PCebhN+KTrR84FySHCaxnev04jqdtVGAQEDbTvyNagFUdZFGhWijJShV
l2l4A0gFSYJmPfPVuuAwOJnnZtA1sRH9oz/Sny3+z9BKloZh+Nc/+Cu9zC8SLjaU
BF3Qv2gU9HKTJ+MSy2JrGS52cONfpO5ngFHoOMilZ1KBHrfSb1eiy32PDT+vK60Y
PFEmI8SWl7bmrO1snVUCfGaHBsHJSu5KMqwBGmM4xSRzJpyvRe493xC7+nFvqNLY
vFOFc4jGeusOXgiLPpfGduppkTGcM7sy75UMLwTSLcQbDK99mus=
=ZAPY
-----END PGP SIGNATURE-----
Merge tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/gds fixes from Dave Hansen:
"Mitigate Gather Data Sampling issue:
- Add Base GDS mitigation
- Support GDS_NO under KVM
- Fix a documentation typo"
* tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/x86: Fix backwards on/off logic about YMM support
KVM: Add GDS_NO support to KVM
x86/speculation: Add Kconfig option for GDS
x86/speculation: Add force option to GDS mitigation
x86/speculation: Add Gather Data Sampling mitigation
vulnerability on AMD processors. In short, this is yet another issue
where userspace poisons a microarchitectural structure which can then be
used to leak privileged information through a side channel.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTQs1gACgkQEsHwGGHe
VUo1UA/8C34PwJveZDcerdkaxSF+WKx7AjOI/L2ws1qn9YVFA3ItFMgVuFTrlY6c
1eYKYB3FS9fVN3KzGOXGyhho6seHqfY0+8cyYupR+PVLn9rSy7GqHaIMr37FdQ2z
yb9xu26v+gsvuPEApazS6MxijYS98u71rHhmg97qsHCnUiMJ01+TaGucntukNJv8
FfwjZJvgeUiBPQ/6IeA/O0413tPPJ9weawPyW+sV1w7NlXjaUVkNXwiq/Xxbt9uI
sWwMBjFHpSnhBRaDK8W5Blee/ZfsS6qhJ4jyEKUlGtsElMnZLPHbnrbpxxqA9gyE
K+3ZhoHf/W1hhvcZcALNoUHLx0CvVekn0o41urAhPfUutLIiwLQWVbApmuW80fgC
DhPedEFu7Wp6Okj5+Bqi/XOsOOWN2WRDSzdAq10o1C+e+fzmkr6y4E6gskfz1zXU
ssD9S4+uAJ5bccS5lck4zLffsaA03nAYTlvl1KRP4pOz5G9ln6eyO20ar1WwfGAV
o5ZsTJVGQMyVA49QFkksj+kOI3chkmDswPYyGn2y8OfqYXU4Ip4eN+VkjorIAo10
zIec3Z0bCGZ9UUMylUmdtH3KAm8q0wVNoFrUkMEmO8j6nn7ew2BhwLMn4uu+nOnw
lX2AG6PNhRLVDVaNgDsWMwejaDsitQPoWRuCIAZ0kQhbeYuwfpM=
=73JY
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/srso fixes from Borislav Petkov:
"Add a mitigation for the speculative RAS (Return Address Stack)
overflow vulnerability on AMD processors.
In short, this is yet another issue where userspace poisons a
microarchitectural structure which can then be used to leak privileged
information through a side channel"
* tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/srso: Tie SBPB bit setting to microcode patch detection
x86/srso: Add a forgotten NOENDBR annotation
x86/srso: Fix return thunks in generated code
x86/srso: Add IBPB on VMEXIT
x86/srso: Add IBPB
x86/srso: Add SRSO_NO support
x86/srso: Add IBPB_BRTYPE support
x86/srso: Add a Speculative RAS Overflow mitigation
x86/bugs: Increase the x86 bugs vector size to two u32s
Booting the kernel with "maxcpus=1" is a common technique for CPU
partitioning and isolation. It delays the CPU bringup process until
when the bootup scripts are ready to bring CPUs online by writing 1 to
/sys/device/system/cpu/cpu<X>/online. However, it was found that not
all the CPUs were online after bootup. The collection of offline CPUs
are different after every reboot.
Further investigation reveals that some "online" write operations
fail with an -EBUSY error. This error is returned when CPU hotplug is
temporiarly disabled when cpu_hotplug_disable() is called.
During bootup, the main caller of cpu_hotplug_disable() is
pci_call_probe() for PCI device initialization. By measuring the times
spent with cpu_hotplug_disabled set in a typical 2-socket server, most
of them last less than 10ms. However, there are a few that can last
hundreds of ms. Note that the cpu_hotplug_disabled period of different
devices can overlap leading to longer cpu_hotplug_disabled hold time.
Since the CPU hotplug disable condition is transient and it is not
that easy to modify all the existing bootup scripts to handle this
condition, the kernel can help by retrying the online operation when
an -EBUSY error is returned. This patch retries the online operation
in cpu_subsys_online() when an -EBUSY error is returned for up to 5
times after an exponentially increasing delay that can last a total of
at least 620ms of waiting time by calling msleep().
With this patch in place, booting up the patched kernel with "maxcpus=1"
does not leave any CPU in an offline state in 10 reboot attempts.
Reported-by: Vishal Agrawal <vagrawal@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20230724143826.3996163-1-longman@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When test_remove is enabled really_probe() does not properly pair
dma_configure() with dma_remove(), it will end up calling dma_configure()
twice. This corrupts the owner_cnt and renders the group unusable with
VFIO/etc.
Add the missing cleanup before going back to re_probe.
Fixes: 25f3bcfc54 ("driver core: Add dma_cleanup callback in bus_type")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Closes: https://lore.kernel.org/all/6472f254-c3c4-8610-4a37-8d9dfdd54ce8@huawei.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v2-4deed94e283e+40948-really_probe_dma_cleanup_jgg@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The test_platform_device_register_node() function should return error
pointers instead of NULL. That is what the callers are expecting.
Fixes: 57ea974fb8 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/1e11ed19-e1f6-43d8-b352-474134b7c008@moroto.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Replace open coded functionality of kstrdup_and_replace() with a call.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230804143910.15504-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
There's no reason the generic platform bus code needs to call
of_platform_register_reconfig_notifier(). The notifier can be setup
before the platform bus is. Let's move it into of_core_init() which is
called just before platform_bus_init() instead to keep more of the DT
bits in the DT code.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230717143718.1715773-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
SWIOTLB implementation details should not be exposed to the rest of the
kernel. This will allow to make changes to the implementation without
modifying non-swiotlb code.
To avoid breaking existing users, provide helper functions for the few
required fields.
As a bonus, using a helper function to initialize struct device allows to
get rid of an #ifdef in driver core.
Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Add a mitigation for the speculative return address stack overflow
vulnerability found on AMD processors.
The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence. To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.
To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference. In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.
In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to Retbleed one: srso_untrain_ret() and
srso_safe_ret().
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Drop the wake-irq enable and disable helpers which have not been used
since commit bed570307e ("PM / wakeirq: Fix dedicated wakeirq for
drivers not using autosuspend").
Note that these functions are essentially just leftovers from the first
iteration of the wake-irq implementation where device drivers were
supposed to call these functions themselves instead of PM core (as
is also indicated by the bogus kernel doc comments).
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The decision whether to enable a wake irq during suspend can not be done
based on the runtime PM state directly as a driver may use wake irqs
without implementing runtime PM. Such drivers specifically leave the
state set to the default 'suspended' and the wake irq is thus never
enabled at suspend.
Add a new wake irq flag to track whether a dedicated wake irq has been
enabled at runtime suspend and therefore must not be enabled at system
suspend.
Note that pm_runtime_enabled() can not be used as runtime PM is always
disabled during late suspend.
Fixes: 69728051f5 ("PM / wakeirq: Fix unbalanced IRQ enable for wakeirq")
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thanks to Dan and Guenter's very prompt updates of the rbtree and maple
caches to support GPF_ATOMIC allocations and since the update shook out
a bunch of users at least some of whom have been suitably careful about
ensuring that the cache is prepoulated so there are no dynamic
allocations after init let's revert the warnings.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230721-regmap-enable-kmalloc-v1-1-f78287e794d3@kernel.org
The kunit tests discovered a sleeping in atomic bug. The allocations
in the regcache-rbtree code should use the map->alloc_flags instead of
GFP_KERNEL.
[ 5.005510] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
[ 5.005960] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 117, name: kunit_try_catch
[ 5.006219] preempt_count: 1, expected: 0
[ 5.006414] 1 lock held by kunit_try_catch/117:
[ 5.006590] #0: 833b9010 (regmap_kunit:86:(config)->lock){....}-{2:2}, at: regmap_lock_spinlock+0x14/0x1c
[ 5.007493] irq event stamp: 162
[ 5.007627] hardirqs last enabled at (161): [<80786738>] crng_make_state+0x1a0/0x294
[ 5.007871] hardirqs last disabled at (162): [<80c531ec>] _raw_spin_lock_irqsave+0x7c/0x80
[ 5.008119] softirqs last enabled at (0): [<801110ac>] copy_process+0x810/0x2138
[ 5.008356] softirqs last disabled at (0): [<00000000>] 0x0
[ 5.008688] CPU: 0 PID: 117 Comm: kunit_try_catch Tainted: G N 6.4.4-rc3-g0e8d2fdfb188 #1
[ 5.009011] Hardware name: Generic DT based system
[ 5.009277] unwind_backtrace from show_stack+0x18/0x1c
[ 5.009497] show_stack from dump_stack_lvl+0x38/0x5c
[ 5.009676] dump_stack_lvl from __might_resched+0x188/0x2d0
[ 5.009860] __might_resched from __kmem_cache_alloc_node+0x1dc/0x25c
[ 5.010061] __kmem_cache_alloc_node from kmalloc_trace+0x30/0xc8
[ 5.010254] kmalloc_trace from regcache_rbtree_write+0x26c/0x468
[ 5.010446] regcache_rbtree_write from _regmap_write+0x88/0x140
[ 5.010634] _regmap_write from regmap_write+0x44/0x68
[ 5.010803] regmap_write from basic_read_write+0x8c/0x270
[ 5.010980] basic_read_write from kunit_try_run_case+0x48/0xa0
Fixes: 28644c809f ("regmap: Add the rbtree cache support")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/ee59d128-413c-48ad-a3aa-d9d350c80042@roeck-us.net/
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/58f12a07-5f4b-4a8f-ab84-0a42d1908cb9@moroto.mountain
Signed-off-by: Mark Brown <broonie@kernel.org>
REGCACHE_MAPLE needs to allocate memory for regmap operations.
This results in lockdep splats if used with fast_io since fast_io uses
spinlocks for locking.
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 167, name: kunit_try_catch
preempt_count: 1, expected: 0
1 lock held by kunit_try_catch/167:
#0: 838e9c10 (regmap_kunit:86:(config)->lock){....}-{2:2}, at: regmap_lock_spinlock+0x14/0x1c
irq event stamp: 146
hardirqs last enabled at (145): [<8078bfa8>] crng_make_state+0x1a0/0x294
hardirqs last disabled at (146): [<80c5f62c>] _raw_spin_lock_irqsave+0x7c/0x80
softirqs last enabled at (0): [<80110cc4>] copy_process+0x810/0x216c
softirqs last disabled at (0): [<00000000>] 0x0
CPU: 0 PID: 167 Comm: kunit_try_catch Tainted: G N 6.5.0-rc1-00028-gc4be22597a36-dirty #6
Hardware name: Generic DT based system
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x38/0x5c
dump_stack_lvl from __might_resched+0x188/0x2d0
__might_resched from __kmem_cache_alloc_node+0x1f4/0x258
__kmem_cache_alloc_node from __kmalloc+0x48/0x170
__kmalloc from regcache_maple_write+0x194/0x248
regcache_maple_write from _regmap_write+0x88/0x140
_regmap_write from regmap_write+0x44/0x68
regmap_write from basic_read_write+0x8c/0x27c
basic_read_write from kunit_generic_run_threadfn_adapter+0x1c/0x28
kunit_generic_run_threadfn_adapter from kthread+0xf8/0x120
kthread from ret_from_fork+0x14/0x3c
Exception stack(0x881a5fb0 to 0x881a5ff8)
5fa0: 00000000 00000000 00000000 00000000
5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Use map->alloc_flags instead of GFP_KERNEL for memory allocations to fix
the problem.
Fixes: f033c26de5 ("regmap: Add maple tree based register cache")
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230720172021.2617326-1-linux@roeck-us.net
Signed-off-by: Mark Brown <broonie@kernel.org>
REGCACHE_RBTREE and REGCACHE_MAPLE dynamically allocate memory for regmap
operations. This is incompatible with spinlock based locking which is used
for fast_io operations. Reject affected configurations.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230720032848.1306349-2-linux@roeck-us.net
Signed-off-by: Mark Brown <broonie@kernel.org>
REGCACHE_RBTREE and REGCACHE_MAPLE dynamically allocate memory
for regmap operations. This is incompatible with spinlock based locking
which is used for fast_io operations. Disable locking for the associated
unit tests to avoid lockdep splashes.
Fixes: f033c26de5 ("regmap: Add maple tree based register cache")
Fixes: 2238959b6a ("regmap: Add some basic kunit tests")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230720032848.1306349-1-linux@roeck-us.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Gather Data Sampling (GDS) is a hardware vulnerability which allows
unprivileged speculative access to data which was previously stored in
vector registers.
Intel processors that support AVX2 and AVX512 have gather instructions
that fetch non-contiguous data elements from memory. On vulnerable
hardware, when a gather instruction is transiently executed and
encounters a fault, stale data from architectural or internal vector
registers may get transiently stored to the destination vector
register allowing an attacker to infer the stale data using typical
side channel techniques like cache timing attacks.
This mitigation is different from many earlier ones for two reasons.
First, it is enabled by default and a bit must be set to *DISABLE* it.
This is the opposite of normal mitigation polarity. This means GDS can
be mitigated simply by updating microcode and leaving the new control
bit alone.
Second, GDS has a "lock" bit. This lock bit is there because the
mitigation affects the hardware security features KeyLocker and SGX.
It needs to be enabled and *STAY* enabled for these features to be
mitigated against GDS.
The mitigation is enabled in the microcode by default. Disable it by
setting gather_data_sampling=off or by disabling all mitigations with
mitigations=off. The mitigation status can be checked by reading:
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Currently the regcache core unconditionally enables async I/O for all cache
types, causing problems for the maple tree cache which dynamically allocates
the buffers used to write registers to the device since async requires the
buffers to be kept around until the I/O has been completed.
This use of async I/O is mainly for the rbtree cache which stores data in
a format directly usable for regmap_raw_write(), though there is a special
case for single register writes which would also have allowed it to be used
with the flat cache. It is a bit of a landmine for other caches since it
implicitly converts sync operations to async, and with modern hardware it
is not clear that async I/O is actually a performance win as shown by the
performance work David Jander did with SPI. In multi core systems the cost
of managing concurrency ends up swamping the performance benefit and almost
all modern systems are multi core.
Address this by pushing the enablement of async I/O down into the rbtree
cache where it is actively used, avoiding surprises for other cache
implementations.
Reported-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Fixes: bfa0b38c14 ("regmap: maple: Implement block sync for the maple tree cache")
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230719-regcache-async-rbtree-v1-1-b03d30cf1daf@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The HDA driver has a use case for checking if a register is cached which
it bodges in awkwardly and unclearly. Provide an API which allows it to
directly do what it's trying to do.
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230717-regmap-cache-check-v1-1-73ef688afae3@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The SMBus I2C buses have limits on the size of transfers they can do but
do not factor in the register length meaning we may try to do a transfer
longer than our length limit, the core will not take care of this.
Future changes will factor this out into the core but there are a number
of users that assume current behaviour so let's just do something
conservative here.
This does not take account padding bits but practically speaking these
are very rarely if ever used on I2C buses given that they generally run
slowly enough to mean there's no issue.
Cc: stable@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Xu Yilun <yilun.xu@intel.com>
Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-2-80e2aed22e83@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
When problems were noticed with the register address not being taken
into account when limiting raw transfers with I2C devices we fixed this
in the core. Unfortunately it has subsequently been realised that a lot
of buses were relying on the prior behaviour, partly due to unclear
documentation not making it obvious what was intended in the core. This
is all more involved to fix than is sensible for a fix commit so let's
just drop the original fixes, a separate commit will fix the originally
observed problem in an I2C specific way
Fixes: 3981514180 ("regmap: Account for register length when chunking")
Fixes: c8e796895e ("regmap: spi-avmm: Fix regmap_bus max_raw_write")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Xu Yilun <yilun.xu@intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-1-80e2aed22e83@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Since apparently enabling all the KUnit tests shouldn't enable any new
subsystems it is hard to enable the regmap KUnit tests in normal KUnit
testing scenarios that don't enable any drivers. Add a Kconfig option
to help with this and include it in the KUnit all tests config.
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230712-regmap-kunit-enable-v1-1-13e296bd0204@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
When allocating the 2D array for handling IRQ type registers in
regmap_add_irq_chip_fwnode(), the intent is to allocate a matrix
with num_config_bases rows and num_config_regs columns.
This is currently handled by allocating a buffer to hold a pointer for
each row (i.e. num_config_bases). After that, the logic attempts to
allocate the memory required to hold the register configuration for
each row. However, instead of doing this allocation for each row
(i.e. num_config_bases allocations), the logic erroneously does this
allocation num_config_regs number of times.
This scenario can lead to out-of-bounds accesses when num_config_regs
is greater than num_config_bases. Fix this by updating the terminating
condition of the loop that allocates the memory for holding the register
configuration to allocate memory only for each row in the matrix.
Amit Pundir reported a crash that was occurring on his db845c device
due to memory corruption (see "Closes" tag for Amit's report). The KASAN
report below helped narrow it down to this issue:
[ 14.033877][ T1] ==================================================================
[ 14.042507][ T1] BUG: KASAN: invalid-access in regmap_add_irq_chip_fwnode+0x594/0x1364
[ 14.050796][ T1] Write of size 8 at addr 06ffff8081021850 by task init/1
[ 14.242004][ T1] The buggy address belongs to the object at ffffff8081021850
[ 14.242004][ T1] which belongs to the cache kmalloc-8 of size 8
[ 14.255669][ T1] The buggy address is located 0 bytes inside of
[ 14.255669][ T1] 8-byte region [ffffff8081021850, ffffff8081021858)
Fixes: faa87ce919 ("regmap-irq: Introduce config registers for irq types")
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Closes: https://lore.kernel.org/all/CAMi1Hd04mu6JojT3y6wyN2YeVkPR5R3qnkKJ8iR8if_YByCn4w@mail.gmail.com/
Tested-by: John Stultz <jstultz@google.com>
Tested-by: Amit Pundir <amit.pundir@linaro.org> # tested on Dragonboard 845c
Cc: stable@vger.kernel.org # v6.0+
Cc: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Isaac J. Manjarres" <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20230711193059.2480971-1-isaacmanjarres@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
With unsigned int type we never ever can pass 64-bit value.
Remove never properly worked code.
Note, there are no users in kernel for this size of register
offsets or data.
This reverts commit afcc00b91f.
Also revert other 64-bit code excerpts in the regmap implementation
that had been induced by the false impression made by the above
mentioned change that there is a support of that data size.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230622183613.58762-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Here are a small set of changes for 6.5-rc1 for some driver core
changes. Included in here are:
- device property cleanups to make it easier to write "agnostic"
drivers when regards to the firmware layer underneath them (DT vs.
ACPI)
- debugfs documentation updates
- devres additions
- sysfs documentation and changes to handle empty directory creation
logic better
- tiny kernfs optimizations
- other tiny changes
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZKKSEQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymoowCfT+Joha+cz4edAFUvd55lKPPJJFsAoNiprHmX
di37sirvn6vo54Hk0Nyq
=qqTo
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here are a small set of changes for 6.5-rc1 for some driver core
changes. Included in here are:
- device property cleanups to make it easier to write "agnostic"
drivers when regards to the firmware layer underneath them (DT vs.
ACPI)
- debugfs documentation updates
- devres additions
- sysfs documentation and changes to handle empty directory creation
logic better
- tiny kernfs optimizations
- other tiny changes
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: Skip empty folders creation
sysfs: Improve readability by following the kernel coding style
drivers: fwnode: fix fwnode_irq_get[_byname]()
ata: ahci_platform: Make code agnostic to OF/ACPI
device property: Implement device_is_compatible()
ACPI: Move ACPI_DEVICE_CLASS() to mod_devicetable.h
base/node: Use 'property' to identify an access parameter
driver core: device.h: add some missing kerneldocs
kernfs: fix missing kernfs_idr_lock to remove an ID from the IDR
isa: Remove unnecessary checks
MAINTAINERS: add entry for auxiliary bus
debugfs: Correct the 'debugfs_create_str' docs
serial: qcom_geni: Comment use of devm_krealloc rather than devm_krealloc_array
iio: adc: Use devm_krealloc_array
hwmon: pmbus: Use devm_krealloc_array
Another busy release for regmap with the second half fo the maple tree
register cache implementation, there's some smaller optimisations that
could be done but this should now be able to replace the rbtree cache
for most devices.
We also had a followup from Aidan MacDonald's refactoring of some of the
regmap-irq interfaces, the conversion is complete so the old interfaces
are removed. This means that even with the new features for the maple
tree cache we'd have a nice negative diffstat were it not for the
addition of a bunch more KUnit coverage.
There's one GPIO patch in here, it was a dependency for a cleanup of an
API in the regmap-irq code for which the gpio-104-dio-48e driver was the
only user.
Highlights:
- The maple tree cache can now load in default values more efficiently,
and is capabale of syncing multiple registers in a single write
during cache sync.
- More KUnit coverage, including some coverage for raw I/O and a dummy
RAM backed cache to support it.
- Removal of several old interfaces in regmap-irq now all the users
have been modernised.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSZj0cACgkQJNaLcl1U
h9BNdAf+Kjw+WbqDj/glPTAfrth5aPOyheq9stklzY3mE0E+AxAsHd2PrvYwn1Rt
jiAb96CqsVNP3WFVj6vmM/kfonE7xiBzPng2QtIWKJjxM/PYyLJZkElF6VqZz4cz
ftS4GMAWJadOvIMZgtCFOOujTaBoN0ik2ryZYofVvI8e98W89ifLvCv4aRiJ3Qn8
2R4Wk37JvZtIc2q5kaZ+wo+JIXVijlt7gU+9ZMT7BvJu1ot/KXY3Q3npfSK2Bv5u
qDkMWNZoy9kNd035E8rXt2OTzMlxgUu766wpg3YGU2Hqt15N0n5rpgLc2uB24niG
0yiCYbA2NT5J6+P+/VsE/VTBmJwK8w==
=E+Tk
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"Another busy release for regmap with the second half of the maple tree
register cache implementation, there's some smaller optimisations that
could be done but this should now be able to replace the rbtree cache
for most devices.
We also had a followup from Aidan MacDonald's refactoring of some of
the regmap-irq interfaces, the conversion is complete so the old
interfaces are removed. This means that even with the new features for
the maple tree cache we'd have a nice negative diffstat were it not
for the addition of a bunch more KUnit coverage.
There's one GPIO patch in here, it was a dependency for a cleanup of
an API in the regmap-irq code for which the gpio-104-dio-48e driver
was the only user.
Highlights:
- The maple tree cache can now load in default values more
efficiently, and is capabale of syncing multiple registers
in a single write during cache sync
- More KUnit coverage, including some coverage for raw I/O
and a dummy RAM backed cache to support it
- Removal of several old interfaces in regmap-irq now all
users have been modernised"
* tag 'regmap-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (23 commits)
regmap: Allow reads from write only registers with the flat cache
regmap: Drop early readability check
regmap: Check for register readability before checking cache during read
regmap: Add test to make sure we don't sync to read only registers
regmap: Add a test case for write only registers
regmap: Add test that writes to write only registers are prevented
regmap: Add debugfs file for forcing field writes
regmap: Don't check for changes in regcache_set_val()
regmap: maple: Implement block sync for the maple tree cache
regmap: Provide basic KUnit coverage for the raw register I/O
regmap: Provide a ram backed regmap with raw support
regmap: Add missing cache_only checks
regmap: regmap-irq: Move handle_post_irq to before pm_runtime_put
regmap: Load register defaults in blocks rather than register by register
regmap: mmio: Allow passing an empty config->reg_stride
regmap-irq: Drop backward compatibility for inverted mask/unmask
regmap-irq: Minor adjustments to .handle_mask_sync()
regmap-irq: Remove support for not_fixed_stride
regmap-irq: Remove type registers
regmap-irq: Remove virtual registers
...
- Yosry has also eliminated cgroup's atomic rstat flushing.
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability.
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning.
- Lorenzo Stoakes has done some maintanance work on the get_user_pages()
interface.
- Liam Howlett continues with cleanups and maintenance work to the maple
tree code. Peng Zhang also does some work on maple tree.
- Johannes Weiner has done some cleanup work on the compaction code.
- David Hildenbrand has contributed additional selftests for
get_user_pages().
- Thomas Gleixner has contributed some maintenance and optimization work
for the vmalloc code.
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code.
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting.
- Christoph Hellwig has some cleanups for the filemap/directio code.
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the provided
APIs rather than open-coding accesses.
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings.
- John Hubbard has a series of fixes to the MM selftesting code.
- ZhangPeng continues the folio conversion campaign.
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock.
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
128 to 8.
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management.
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code.
- Vishal Moola also has done some folio conversion work.
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
=B7yQ
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
- Yosry Ahmed brought back some cgroup v1 stats in OOM logs
- Yosry has also eliminated cgroup's atomic rstat flushing
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning
- Lorenzo Stoakes has done some maintanance work on the
get_user_pages() interface
- Liam Howlett continues with cleanups and maintenance work to the
maple tree code. Peng Zhang also does some work on maple tree
- Johannes Weiner has done some cleanup work on the compaction code
- David Hildenbrand has contributed additional selftests for
get_user_pages()
- Thomas Gleixner has contributed some maintenance and optimization
work for the vmalloc code
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting
- Christoph Hellwig has some cleanups for the filemap/directio code
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the
provided APIs rather than open-coding accesses
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings
- John Hubbard has a series of fixes to the MM selftesting code
- ZhangPeng continues the folio conversion campaign
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
from 128 to 8
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code
- Vishal Moola also has done some folio conversion work
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch
* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
mm/hugetlb: remove hugetlb_set_page_subpool()
mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
hugetlb: revert use of page_cache_next_miss()
Revert "page cache: fix page_cache_next/prev_miss off by one"
mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
mm: memcg: rename and document global_reclaim()
mm: kill [add|del]_page_to_lru_list()
mm: compaction: convert to use a folio in isolate_migratepages_block()
mm: zswap: fix double invalidate with exclusive loads
mm: remove unnecessary pagevec includes
mm: remove references to pagevec
mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
mm: remove struct pagevec
net: convert sunrpc from pagevec to folio_batch
i915: convert i915_gpu_error to use a folio_batch
pagevec: rename fbatch_count()
mm: remove check_move_unevictable_pages()
drm: convert drm_gem_put_pages() to use a folio_batch
i915: convert shmem_sg_free_table() to use a folio_batch
scatterlist: add sg_set_folio()
...
- Introduce power capping core support for Intel TPMI (Topology Aware
Register and PM Capsule Interface) and a TPMI interface driver for
Intel RAPL (Zhang Rui, Dan Carpenter).
- Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping
driver (Zhang Rui).
- Fix invalid initialization for pl4_supported field in the Intel RAPL
power capping driver (Sumeet Pawnikar).
- Clean up the intel_idle driver, make it work with VM guests that
cannot use the MWAIT instruction and address the case in which the
host may enter a deep idle state when the guest is idle (Arjan van
de Ven).
- Prevent cpufreq drivers that provide the ->adjust_perf() callback
without a ->fast_switch() one which is used as a fallback from the
former in some cases (Wyes Karny).
- Fix some issues related to the AMD P-state cpufreq driver (Mario
Limonciello, Wyes Karny).
- Fix the energy_performance_preference attribute handling in the
intel_pstate driver in passive mode (Tero Kristo).
- Fix the handling of pm_suspend_target_state when CONFIG_PM is unset
(Kai-Heng Feng).
- Correct spelling mistake in a comment in the hibernation code (Wang
Honghui).
- Add arch_resume_nosmt() prototype to avoid a "missing prototypes"
build warning (Arnd Bergmann).
- Restrict pm_pr_dbg() to system-wide power transitions and use it in
a few additional places (Mario Limonciello).
- Drop verification of in-params from genpd_add_device() and ensure
that all of its callers will do it (Ulf Hansson).
- Prevent possible integer overflows from occurring in
genpd_parse_state() (Nikita Zhandarovich).
- Reorder fieldls in 'struct devfreq_dev_status' to reduce its size
somewhat (Christophe JAILLET).
- Ensure that the Exynos PPMU driver is already loaded before the
Exynos Bus driver starts probing so as to avoid a possible freeze
loading of the kernel modules (Marek Szyprowski).
- Fix variable deferencing before NULL check in the mtk-cci devfreq
driver (Sukrut Bellary).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmSZwPASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxu44P/AouvVFDMt+eE76nPfNc10X1lswS0vrT
X7LmylSDPyWiuz6p8MWGXB6T2nQ+3DbvSfVyBJ960ymkBnE/F9me8o3wB8eGbd6z
ZvOD8+wVPXS4Cq8gUxy2zV1ul+o5IwwT20cYC6mWjasvByl13vTevN5d6ZQ9o6hS
1hAQQDd6JjsdLIUyU0EbE4aD+l4h96o45IFxbV86qVH77ywa6VMNdulRKmDcONj3
kM7jHFYL4xl0TfMjHp4IhGWXK32qGYgX1zYTOU5kSc11IExJfVzQcL2uQ9A0KSLp
RJ0c93loUsHdMhenNkN4nSBFWBIaftKDLbS+5Ubt0DBuNN7kxWivEVts4DM/wxuB
72PNl5h8YglcW7LHH2IXb/6HEerzbj42+6y459o+M0DcNTq18gu19OQTK5IGtRrQ
Yf6+5BhgLR3R1REg0eaBg6njtGq0f5fmW7Iqo52eA8cXhHU0MTDJE1p6ytfN40gH
ViA+T8HB6Mh91lWHVftbwo3wONHGcfJy+S2hGM45V5LKEGeKHILgmw0nUzO7epWE
7VIPKGzkVd7h/Drk7X3nQR3DJFA/x5eNhjxt5LZD83cVVg34SS3ST5oH13FI9S7Q
8zwG5KoHTDrmYug3sxQ+Q9pq8MnOl0ZbgqlVfwyiWjKYmNMbg4elsQG/2zD3t0kv
y8zXbr7Kr4QB
=SZV7
-----END PGP SIGNATURE-----
Merge tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add Intel TPMI (Topology Aware Register and PM Capsule
Interface) support to the power capping subsystem, extend the
intel_idle driver to work in VM guests where MWAIT is not available,
extend the system-wide power management diagnostics, fix bugs and
clean up code.
Specifics:
- Introduce power capping core support for Intel TPMI (Topology Aware
Register and PM Capsule Interface) and a TPMI interface driver for
Intel RAPL (Zhang Rui, Dan Carpenter)
- Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping
driver (Zhang Rui)
- Fix invalid initialization for pl4_supported field in the Intel
RAPL power capping driver (Sumeet Pawnikar)
- Clean up the intel_idle driver, make it work with VM guests that
cannot use the MWAIT instruction and address the case in which the
host may enter a deep idle state when the guest is idle (Arjan van
de Ven)
- Prevent cpufreq drivers that provide the ->adjust_perf() callback
without a ->fast_switch() one which is used as a fallback from the
former in some cases (Wyes Karny)
- Fix some issues related to the AMD P-state cpufreq driver (Mario
Limonciello, Wyes Karny)
- Fix the energy_performance_preference attribute handling in the
intel_pstate driver in passive mode (Tero Kristo)
- Fix the handling of pm_suspend_target_state when CONFIG_PM is unset
(Kai-Heng Feng)
- Correct spelling mistake in a comment in the hibernation code (Wang
Honghui)
- Add arch_resume_nosmt() prototype to avoid a "missing prototypes"
build warning (Arnd Bergmann)
- Restrict pm_pr_dbg() to system-wide power transitions and use it in
a few additional places (Mario Limonciello)
- Drop verification of in-params from genpd_add_device() and ensure
that all of its callers will do it (Ulf Hansson)
- Prevent possible integer overflows from occurring in
genpd_parse_state() (Nikita Zhandarovich)
- Reorder fieldls in 'struct devfreq_dev_status' to reduce its size
somewhat (Christophe JAILLET)
- Ensure that the Exynos PPMU driver is already loaded before the
Exynos Bus driver starts probing so as to avoid a possible freeze
loading of the kernel modules (Marek Szyprowski)
- Fix variable deferencing before NULL check in the mtk-cci devfreq
driver (Sukrut Bellary)"
* tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (42 commits)
intel_idle: Add a "Long HLT" C1 state for the VM guest mode
cpufreq: intel_pstate: Fix energy_performance_preference for passive
cpufreq: amd-pstate: Add a kernel config option to set default mode
cpufreq: amd-pstate: Set a fallback policy based on preferred_profile
ACPI: CPPC: Add definition for undefined FADT preferred PM profile value
cpufreq: amd-pstate: Set default governor to schedutil
PM: domains: Move the verification of in-params from genpd_add_device()
cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated
cpufreq: amd-pstate: Write CPPC enable bit per-socket
intel_idle: Add support for using intel_idle in a VM guest using just hlt
cpufreq: Fail driver register if it has adjust_perf without fast_switch
intel_idle: clean up the (new) state_update_enter_method function
intel_idle: refactor state->enter manipulation into its own function
platform/x86/amd: pmc: Use pm_pr_dbg() for suspend related messages
pinctrl: amd: Use pm_pr_dbg to show debugging messages
ACPI: x86: Add pm_debug_messages for LPS0 _DSM state tracking
include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resume
powercap: RAPL: Fix a NULL vs IS_ERR() bug
powercap: RAPL: Fix CONFIG_IOSF_MBI dependency
powercap: RAPL: fix invalid initialization for pl4_supported field
...
The gist of it all is that Intel TDX and AMD SEV-SNP confidential
computing guests define the notion of accepting memory before using it
and thus preventing a whole set of attacks against such guests like
memory replay and the like.
There are a couple of strategies of how memory should be accepted
- the current implementation does an on-demand way of accepting.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZ0f4ACgkQEsHwGGHe
VUpasw//RKoNW9HSU1csY+XnG9uuaT6QKgji+gIEZWWIGPO9iibvbBj6P5WxJE8T
fe7yb6CGa6d6thoU0v+mQGVVvCd7OjCFwPD5wAo4mXToD7Ig+4mI6jMkaKifqa2f
N1Uuy8u/zQnGyWrP5Y//WH5bJYfsmds4UGwXI2nLvKlhE7MG90/ePjt7iqnnwZsy
waLp6a0Q1VeOvnfRszFLHZw/SoER5RSJ4qeVqttkFNmPPEKMK1Kirrl2poR56OQJ
nMr6LqVtD7erlSJ36VRXOKzLI443A4iIEIg/wBjIOU6L5ZEWJGNqtCDnIqFJ6+TM
XatsejfRYkkMZH0qXtX9+M0u+HJHbZPCH5rEcA21P3Nbd7od/ANq91qCGoMjtUZ4
7pZohMG8M6IDvkLiOb8fQVkR5k/9Jbk8UvdN/8jdPx1ERxYMFO3BDvJpV2gzrW4B
KYtFTPR7j2nY3eKfDpe3flanqYzKUBsKoTlLnlH7UHaiMZ2idwG8AQjlrhC/erCq
/Lq1LXt4Mq46FyHABc+PSHytu0WWj1nBUftRt+lviY/Uv7TlkBldOTT7wm7itsfF
HUCTfLWl0CJXKPq8rbbZhAG/exN6Ay6MO3E3OcNq8A72E5y4cXenuG3ic/0tUuOu
FfjpiMk35qE2Qb4hnj1YtF3XINtd1MpKcuwzGSzEdv9s3J7hrS0=
=FS95
-----END PGP SIGNATURE-----
Merge tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 confidential computing update from Borislav Petkov:
- Add support for unaccepted memory as specified in the UEFI spec v2.9.
The gist of it all is that Intel TDX and AMD SEV-SNP confidential
computing guests define the notion of accepting memory before using
it and thus preventing a whole set of attacks against such guests
like memory replay and the like.
There are a couple of strategies of how memory should be accepted -
the current implementation does an on-demand way of accepting.
* tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
virt: sevguest: Add CONFIG_CRYPTO dependency
x86/efi: Safely enable unaccepted memory in UEFI
x86/sev: Add SNP-specific unaccepted memory support
x86/sev: Use large PSC requests if applicable
x86/sev: Allow for use of the early boot GHCB for PSC requests
x86/sev: Put PSC struct on the stack in prep for unaccepted memory support
x86/sev: Fix calculation of end address based on number of pages
x86/tdx: Add unaccepted memory support
x86/tdx: Refactor try_accept_one()
x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub
efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory
efi: Add unaccepted memory support
x86/boot/compressed: Handle unaccepted memory
efi/libstub: Implement support for unaccepted memory
efi/x86: Get full memory map in allocate_e820()
mm: Add support for unaccepted memory
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmSV8dwQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpilGD/9Yys1oxIXJpRf00fzrylAlBthRxMjFQVWw
zAut106hAQiBHvU8IkmGA3MvEFVHxtzwYhHI7IR8K3aZBIqscweCqmVI9JyogJw9
U9Twnzel47VmuKdM94FeoN+hbj1fP8EWTjzmy67/zEEfFCdmHvNlMi3lSrGYIpFy
39LxTB99Y4UarM5PtWbes37GYYljzMSWKuo4AfBkvq1eQa+sZ0Vq2xAABKq3UM7f
apqhgHtkJooRePDP0eQp+kAyyVMgW2jIK+oIdJDxNF3CKTu2w40RzaYz6fp+jVSU
H4R/xS59GW4/xql+VBJDh/qJg9K62DPPYjlW8BmSR8+IjvfFpsyH3/MacE50CD3P
20fs/Mnj49H79fDrQEHJI53cOOb2EmUitbwLbvOcColNTPpt8loBtdQxjF2RMU8R
Nyort9DJPFclYCxky1LYg1CNEC2Ln4Zy/jD47wPvqRmOQphOoVlV/hPnOEqvjaZC
49Vn70W2DeE9cXvYI7ha+XIg6/oj+Gs3iusEbV08Ci7EAtXgI+ZUUsQ97K8UNiUh
h2lqSJtuI7lBpYP9sf+BeCch5UCC+xGYyTdoM5f58lehWBBPtbs0g7S9RyRyOYxe
n+yxEUo3dAGzJ/xsKAjinbZfeWIpr0b1TkAh4w3Cq/BKzRr9Bp8lBAxYuancbQ+Y
1ADPteUOTA==
=zP4Y
-----END PGP SIGNATURE-----
Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith:
- Various cleanups all around (Irvin, Chaitanya, Christophe)
- Better struct packing (Christophe JAILLET)
- Reduce controller error logs for optional commands (Keith)
- Support for >=64KiB block sizes (Daniel Gomez)
- Fabrics fixes and code organization (Max, Chaitanya, Daniel
Wagner)
- bcache updates via Coly:
- Fix a race at init time (Mingzhe Zou)
- Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye)
- use page pinning in the block layer for dio (David)
- convert old block dio code to page pinning (David, Christoph)
- cleanups for pktcdvd (Andy)
- cleanups for rnbd (Guoqing)
- use the unchecked __bio_add_page() for the initial single page
additions (Johannes)
- fix overflows in the Amiga partition handling code (Michael)
- improve mq-deadline zoned device support (Bart)
- keep passthrough requests out of the IO schedulers (Christoph, Ming)
- improve support for flush requests, making them less special to deal
with (Christoph)
- add bdev holder ops and shutdown methods (Christoph)
- fix the name_to_dev_t() situation and use cases (Christoph)
- decouple the block open flags from fmode_t (Christoph)
- ublk updates and cleanups, including adding user copy support (Ming)
- BFQ sanity checking (Bart)
- convert brd from radix to xarray (Pankaj)
- constify various structures (Thomas, Ivan)
- more fine grained persistent reservation ioctl capability checks
(Jingbo)
- misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan,
Jordy, Li, Min, Yu, Zhong, Waiman)
* tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits)
scsi/sg: don't grab scsi host module reference
ext4: Fix warning in blkdev_put()
block: don't return -EINVAL for not found names in devt_from_devname
cdrom: Fix spectre-v1 gadget
block: Improve kernel-doc headers
blk-mq: don't insert passthrough request into sw queue
bsg: make bsg_class a static const structure
ublk: make ublk_chr_class a static const structure
aoe: make aoe_class a static const structure
block/rnbd: make all 'class' structures const
block: fix the exclusive open mask in disk_scan_partitions
block: add overflow checks for Amiga partition support
block: change all __u32 annotations to __be32 in affs_hardblocks.h
block: fix signed int overflow in Amiga partition support
block: add capacity validation in bdev_add_partition()
block: fine-granular CAP_SYS_ADMIN for Persistent Reservation
block: disallow Persistent Reservation on partitions
reiserfs: fix blkdev_put() warning from release_journal_dev()
block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions()
block: document the holder argument to blkdev_get_by_path
...
Merge updates related to system-wide power management and generic power
domains (genpd) updates for 6.5-rc1:
- Fix the handling of pm_suspend_target_state when CONFIG_PM is unset
(Kai-Heng Feng).
- Correct spelling mistake in a comment in the hibernation code (Wang
Honghui).
- Add arch_resume_nosmt() prototype to avoid a "missing prototypes"
build warning (Arnd Bergmann).
- Restrict pm_pr_dbg() to system-wide power transitions and use it in
a few additional places (Mario Limonciello).
- Drop verification of in-params from genpd_add_device() and ensure
that all of its callers will do it (Ulf Hansson).
- Prevent possible integer overflows from occurring in
genpd_parse_state() (Nikita Zhandarovich).
* pm-sleep:
platform/x86/amd: pmc: Use pm_pr_dbg() for suspend related messages
pinctrl: amd: Use pm_pr_dbg to show debugging messages
ACPI: x86: Add pm_debug_messages for LPS0 _DSM state tracking
include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resume
PM: suspend: add a arch_resume_nosmt() prototype
PM: hibernate: Correct spelling mistake in a comment
PM: suspend: Fix pm_suspend_target_state handling for !CONFIG_PM
* pm-domains:
PM: domains: Move the verification of in-params from genpd_add_device()
PM: domains: fix integer overflow issues in genpd_parse_state()
The earlier fix to take account of the register data size when limiting
raw register writes exposed the fact that the Intel AVMM bus was
incorrectly specifying too low a limit on the maximum data transfer, it
is only capable of transmitting one register so had set a transfer size
limit that couldn't fit both the value and the the register address into
a single message.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSS8bEACgkQJNaLcl1U
h9B1zwf/emwrogeLq+eQ03Q9O4oE7OmbYmbTKKhZHtdwtUxZ4qsZ9aNekcgPw1Kv
riZHiL2EC7EKFAqTl1KiGsoJTWQONt32DFcc+27fW6IyXAFc4AMDf2JPnKMtZC83
0R+H0vC/FDVO6lzxNerPuC62ydaBr38XdimyCwYgLtzNWwsZUNh6leXgjFbF3sh7
MEi/SaLH9hcBr0suERnXh3DZUAT7d2kvwL1krSDHF7pvhQ+Z2pn96B1NVnaVVhM1
OlyfQLieRZz1Q2yF2LASI3I9n2IWeToR9i2iaWn2RlGiBzXpo5E5YpEFtH7umCH0
AjoI5Cf44xFXh5asJskRydgI7iKwPQ==
=yQ+K
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"One more fix for v6.4
The earlier fix to take account of the register data size when
limiting raw register writes exposed the fact that the Intel AVMM bus
was incorrectly specifying too low a limit on the maximum data
transfer, it is only capable of transmitting one register so had set a
transfer size limit that couldn't fit both the value and the the
register address into a single message"
* tag 'regmap-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: spi-avmm: Fix regmap_bus max_raw_write
The max_raw_write member of the regmap_spi_avmm_bus structure is defined
as:
.max_raw_write = SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT
SPI_AVMM_VAL_SIZE == 4 and MAX_WRITE_CNT == 1 so this results in a
maximum write transfer size of 4 bytes which provides only enough space to
transfer the address of the target register. It provides no space for the
value to be transferred. This bug became an issue (divide-by-zero in
_regmap_raw_write()) after the following was accepted into mainline:
commit 3981514180 ("regmap: Account for register length when chunking")
Change max_raw_write to include space (4 additional bytes) for both the
register address and value:
.max_raw_write = SPI_AVMM_REG_SIZE + SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT
Fixes: 7f9fb67358 ("regmap: add Intel SPI Slave to AVMM Bus Bridge support")
Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20230620202824.380313-1-russell.h.weight@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit f38d1a6d00 ("PM: domains: Allocate governor data dynamically
based on a genpd governor") started to use the in-parameters in
genpd_add_device(), without first doing a verification of them.
This isn't really a big problem, as most callers do a verification already.
Therefore, let's drop the verification from genpd_add_device() and make
sure all the callers take care of it instead.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: f38d1a6d00 ("PM: domains: Allocate governor data dynamically based on a genpd governor")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We have some drivers that have a use case for cached write only
registers, doing read/modify/writes on read only registers in order to
work more easily with bitfields. Go back to trying the cache before we
check if we can read from the device.
Fixes: eab5abdeb7 ("regmap: Check for register readability before checking cache during read")
Reported-by: Konrad Dybcio <konradybcio@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230615-regmap-drop-early-readability-v1-1-8135094362de@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Another fix for the maple tree cache, Takashi noticed that unlike other
caches the maple tree cache didn't check for read only registers before
trying to sync which would result in spurious syncs for read only
registers where we don't have a default. This was due to the check
being open coded in the caches, we now check in the shared "does this
register need sync" function so that is fixed for this and future
caches.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSK99kACgkQJNaLcl1U
h9B9YAf/cfOLwuU+OoHp6pvCZHQjojwRlXenp7pEzMlVvlzl/cm4Y0XDGpuP/vGc
MipxznnCzxLGWN2eVhoQcp2Ay8+bHJ9EohQ68HJ/WUv1GVjR+my7sqVVflqorIwm
gwDy2NBEBFe7m1MEITveLVWcV5Hviy7X0SMC7XPeAOs8Edg1AOqrsJ4Ta1+6z6zg
7/jlCwozDggV+qO/+jDGrKGCRidqH1E5KgTm+oWHV4HMJEPqt0GYYK+/5ELAQIsb
uWhye5LsIKbwLbOhcbkxqLf1r0Zeg4bwgubhfvgs8aky+uWCJrw9eVQD2S04mdVI
Rata8+XNkhYAv0BWRN87kg9wKyXyBQ==
=z6Cs
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"Another fix for the maple tree cache, Takashi noticed that unlike
other caches the maple tree cache didn't check for read only registers
before trying to sync which would result in spurious syncs for read
only registers where we don't have a default.
This was due to the check being open coded in the caches, we now check
in the shared 'does this register need sync' function so that is fixed
for this and future caches"
* tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: regcache: Don't sync read-only registers
The fwnode_irq_get() and the fwnode_irq_get_byname() return 0 upon
device-tree IRQ mapping failure. This is contradicting the
fwnode_irq_get_byname() function documentation and can potentially be a
source of errors like:
int probe(...) {
...
irq = fwnode_irq_get_byname();
if (irq <= 0)
return irq;
...
}
Here we do correctly check the return value from fwnode_irq_get_byname()
but the driver probe will now return success. (There was already one
such user in-tree).
Change the fwnode_irq_get_byname() to work as documented and make also the
fwnode_irq_get() follow same common convention returning a negative errno
upon failure.
Fixes: ca0acb511c ("device property: Add fwnode_irq_get_byname")
Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Suggested-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-ID: <3e64fe592dc99e27ef9a0b247fc49fa26b6b8a58.1685340157.git.mazziesaccount@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from Mark Brown <broonie@kernel.org>:
Since Takashi found an issue with maple tree syncing registers it
shouldn't do add some test cases that catch that case and some more
potential issues, ideally we'd run through the combination of
readability with all possible I/O calls but that's lifting for another
day. We did find one issue with missing readability checks which will
be fixed separately.
`_regmap_update_bits()` checks if the current register value differs
from the new value, and only writes to the register if they differ. When
testing hardware drivers, it might be desirable to always force a
register write, for example when writing to a `regmap_field`. This
enables and simplifies testing and verification of the hardware
interaction. For example, when using a hardware mock/simulation model,
one can then more easily verify that the driver makes the correct
expected register writes during certain events.
Add a bool variable `force_write_field` and a corresponding debugfs
entry to enable this. Since this feature could interfere with driver
operation, guard it with a macro.
Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
Link: https://lore.kernel.org/r/pnd1qifa7sj.fsf@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
regcache_maple_sync() tries to sync all cached values no matter
whether it's writable or not. OTOH, regache_sync_val() does care the
wrtability and returns -EIO for a read-only register. This results in
an error message like:
snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2f0009. -5
and the sync loop is aborted incompletely.
This patch adds the writable register check to regcache_sync_val() for
addressing the bug above.
Note that, although we may add the check in the caller side
(regcache_maple_sync()), here we put in regcache_sync_val(), so that a
similar case like this can be avoided in future.
Fixes: f033c26de5 ("regmap: Add maple tree based register cache")
Link: https://lore.kernel.org/r/877cs7g6f1.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20230613112240.3361-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Mark Brown <broonie@kernel.org>:
Our existing coverage only deals with buses that provide single register
read and write operations, extend it to cover raw buses using a similar
approach with a RAM backed register map that the tests can inspect to
check operations. This coverage could be more complete but provides a
good start.
The only user of regcache_set_val() ignores the return value so we may as
well not bother checking if the value we are trying to set is the same as
the value already stored.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230609-regcache-set-val-no-ret-v1-1-9a6932760cf8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
For register maps where we can write multiple values in a single bus
operation it is generally much faster to do so. Improve the performance of
maple tree cache syncs on such devices by identifying blocks of adjacent
registers that need to be written out and combining them into a single
operation.
Combining writes does mean that we need to allocate a scratch buffer and
format the data into it but it is expected that for most cases where caches
are in use the cost of I/O will be much greater than the cost of doing the
allocation and format.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230609-regcache-maple-sync-raw-v1-1-8ddeb4e2b9ab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmSGPiIeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxWcH/AoUClfYYbYZYaoI
spQiFQiZEdniE3W/36GKtfJok4ur1Aogb/q99KvfQxFGkH+aLbTgXxJlTqrlmfWk
Kj4uVUecQl2OHOuEM6AVFyK6S4xdTjWS7TKMGIVe4LDQCAsFidgcYbcVrAhQ+s1s
eAOhlfrKMzVgRQ0KsJkn60SoSbVP6RyRfu+QQu24GfNtD8SzN8y0adB1PYXGVWmr
WW8+MqfZ1WIkn+NWW5bn3AEz8AYjQC66EvcWhlAWmyyBuUjIoNhpIgPHp6Vm7s+a
oPvw/VXwDX8NzpN0XYI+eVpsgClWtJ+HUcCSeuExTxrWu7Bewp6MMLeCjvi/lPpe
zKRDgnw=
=hx4k
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSHIsUACgkQJNaLcl1U
h9AbDQf+OnspdI5uW+X7ObBdB7KvPAXzDd7rMsXOTdp13bciSmaOzl74f0jPNWYU
GVo3a84eIU0YsdBP7M9+42vyUDQctgoBa7yyxZ+hW8gZQbXwgOZGmvWMncankm9n
hTlUbuLiowPzzjLERMDS0T2iJF0aFG7eB8Xghk/Su38HOCJI6fBx7sQlvt+aFK6J
/WzUuAqHng+6T7FHqxerGz/XIfvnRdS8qC9FZ7ySurmopn2AGY4YlifSGmSlPuFt
MM+nGNTOYn8scBAPRQlVCOwv7XOg6XqEvP5UNe8KpKN1rHfgP/B0X3ZbyE58bpPI
lqZzTXmY/dWD27FUc+3779EGLZ+L8w==
=PyzM
-----END PGP SIGNATURE-----
regmap: Merge up v6.4-rc6
The fix for maple tree RCU locking on sync is a dependency for the
block sync code for the maple tree.
Simple tests that cover basic raw I/O, plus basic coverage of cache sync
since the caches generate bulk I/O with raw register maps. This could be
more comprehensive but it is good for testing generic code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230610-regcache-raw-kunit-v1-2-583112cd28ac@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
UEFI Specification version 2.9 introduces the concept of memory
acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, require memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific to the Virtual Machine
platform.
There are several ways the kernel can deal with unaccepted memory:
1. Accept all the memory during boot. It is easy to implement and it
doesn't have runtime cost once the system is booted. The downside is
very long boot time.
Accept can be parallelized to multiple CPUs to keep it manageable
(i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
memory bandwidth and does not scale beyond the point.
2. Accept a block of memory on the first use. It requires more
infrastructure and changes in page allocator to make it work, but
it provides good boot time.
On-demand memory accept means latency spikes every time kernel steps
onto a new memory block. The spikes will go away once workload data
set size gets stabilized or all memory gets accepted.
3. Accept all memory in background. Introduce a thread (or multiple)
that gets memory accepted proactively. It will minimize time the
system experience latency spikes on memory allocation while keeping
low boot time.
This approach cannot function on its own. It is an extension of #2:
background memory acceptance requires functional scheduler, but the
page allocator may need to tap into unaccepted memory before that.
The downside of the approach is that these threads also steal CPU
cycles and memory bandwidth from the user's workload and may hurt
user experience.
Implement #1 and #2 for now. #2 is the default. Some workloads may want
to use #1 with accept_memory=eager in kernel command line. #3 can be
implemented later based on user's demands.
Support of unaccepted memory requires a few changes in core-mm code:
- memblock accepts memory on allocation. It serves early boot memory
allocations and doesn't limit them to pre-accepted pool of memory.
- page allocator accepts memory on the first allocation of the page.
When kernel runs out of accepted memory, it accepts memory until the
high watermark is reached. It helps to minimize fragmentation.
EFI code will provide two helpers if the platform supports unaccepted
memory:
- accept_memory() makes a range of physical addresses accepted.
- range_contains_unaccepted_memory() checks anything within the range
of physical addresses requires acceptance.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mike Rapoport <rppt@linux.ibm.com> # memblock
Link: https://lore.kernel.org/r/20230606142637.5171-2-kirill.shutemov@linux.intel.com
bool is the most sensible return value for a yes/no return. Also
add __init as this funtion is only called from the early boot code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZHxDNg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl1ywCg0uz+E/GYKx5cP9chPFmbbaFwxH4AnRpn/kIH
xz6nbAqSf7CBbtxmED11
=4J1c
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues"
* tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits)
test_firmware: fix the memory leak of the allocated firmware buffer
test_firmware: fix a memory leak with reqs buffer
test_firmware: prevent race conditions by a correct implementation of locking
firmware_loader: Fix a NULL vs IS_ERR() check
MAINTAINERS: Vaibhav Gupta is the new ipack maintainer
dt-bindings: fpga: replace Ivan Bornyakov maintainership
MAINTAINERS: update Microchip MPF FPGA reviewers
misc: fastrpc: reject new invocations during device removal
misc: fastrpc: return -EPIPE to invocations on device removal
misc: fastrpc: Reassign memory ownership only for remote heap
misc: fastrpc: Pass proper scm arguments for secure map request
iio: imu: inv_icm42600: fix timestamp reset
iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
iio: dac: mcp4725: Fix i2c_master_send() return value handling
iio: accel: kx022a fix irq getting
iio: bu27034: Ensure reset is written
iio: dac: build ad5758 driver when AD5758 is selected
iio: addac: ad74413: fix resistance input processing
iio: light: vcnl4035: fixed chip ID check
...
Here are 2 small driver core cacheinfo fixes for 6.4-rc5 that resolve a
number of reported issues with that file. These changes have been in
linux-next this past week with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZHxChg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykrLACeJBLCDThdooct8G/7MzfpJhFcjSYAn1/EhJDA
GxgOmZrsB1HcO3Bo587a
=Cucq
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two small driver core cacheinfo fixes for 6.4-rc5 that
resolve a number of reported issues with that file. These changes have
been in linux-next this past week with no reported problems"
* tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
The current behaviour around cache_only is slightly inconsistent,
most paths will only check cache_only if cache_bypass is false,
and will return -EBUSY if a read attempts to go to the hardware
whilst cache_only is true. However, a couple of paths will not check
cache_only at all. The most notable of these being regmap_raw_read
which will check cache_only in the case it processes the transaction
one register at a time, but not in the case it handles them as a
block. In the typical case a device has been put into cache_only
whilst powered down this can cause physical reads to happen whilst the
device is unavailable.
Add a check in regmap_raw_read and move the check in regmap_noinc_read,
adding a check for cache_bypass, such that all paths are covered and
consistent.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230601101036.1499612-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Typically handle_post_irq is going to be used to manage some
additional chip specific hardware operations required on each IRQ,
these are very likely to want the chip to be resumed. For example the
current in tree user max77620 uses this to toggle a global mask bit,
which would obviously want the device resumed. It is worth noting this
device does not specify the runtime_pm flag in regmap_irq_chip, so
there is no actual issue.
Move the callback to before the pm_runtime_put, so it will be called
whilst the device is still resumed.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230601101036.1499612-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Until commit 5c2712387d ("cacheinfo: Fix LLC is not exported through
sysfs"), cacheinfo called populate_cache_leaves() for CPU coming online
which let the arch specific functions handle (at least on x86)
populating the shared_cpu_map. However, with the changes in the
aforementioned commit, populate_cache_leaves() is not called when a CPU
comes online as a result of hotplug since last_level_cache_is_valid()
returns true as the cacheinfo data is not discarded. The CPU coming
online is not present in shared_cpu_map, however, it will not be added
since the cpu_cacheinfo->cpu_map_populated flag is set (it is set in
populate_cache_leaves() when cacheinfo is first populated for x86)
This can lead to inconsistencies in the shared_cpu_map when an offlined
CPU comes online again. Example below depicts the inconsistency in the
shared_cpu_list in cacheinfo when CPU8 is offlined and onlined again on
a 3rd Generation EPYC processor:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
9-15,136-143
Clear the flag when the CPU is removed from shared_cpu_map when
cache_shared_cpu_map_remove() is called during CPU hotplug. This will
allow cache_shared_cpu_map_setup() to add the CPU coming back online in
the shared_cpu_map. Set the flag again when the shared_cpu_map is setup.
Following are results of performing the same test as described above with
the changes:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
8,136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
8-15,136-143
Fixes: 5c2712387d ("cacheinfo: Fix LLC is not exported through sysfs")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-3-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While building the shared_cpu_map, check if the cache level and cache
type matches. On certain systems that build the cache topology based on
the instance ID, there are cases where the same ID may repeat across
multiple cache levels, leading inaccurate topology.
In event of CPU offlining, the cache_shared_cpu_map_remove() does not
consider if IDs at same level are being compared. As a result, when same
IDs repeat across different cache levels, the CPU going offline is not
removed from all the shared_cpu_map.
Below is the output of cache topology of CPU8 and it's SMT sibling after
CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor
(2 x 64C/128T) running kernel release v6.3:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of
index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the
SMT siblings, and they have the same cache instance ID, CPU 2 is only
removed from the first index with matching ID which is index1 (L1i) in
this case. With this fix, the results are as expected when performing
the same experiment on the same system:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
When rebuilding topology, the same problem appears as
cache_shared_cpu_map_setup() implements a similar logic. Consider the
same 3rd Generation EPYC processor: CPUs in Core 1, that share the L1
and L2 caches, have L1 and L2 instance ID as 1. For all the CPUs on
the second chiplet, the L3 ID is also 1 leading to grouping on CPUs from
Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs
sharing one cache domain. This went undetected since x86 processors
depended on arch specific populate_cache_leaves() method to repopulate
the shared_cpus_map when CPU came back online until kernel release
v6.3-rc5.
Fixes: 198102c910 ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Usage of 'attr' and 'name' in the context of a sysfs attribute
definition are confusing because those read as being related to:
struct attribute .name
Rename 'name' to 'property' in preparation for renaming 'struct
node_hmem_attr' to a more generic name that can be used in more contexts
('struct access_coordinate'), and not be confused with 'struct
attribute'.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168332213518.2189163.18377767521423011290.stgit@djiang5-mobl3
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The isa_dev->dev.platform_data is initialized with incoming
parameter isa_driver. After it isa_dev->dev.platform_data is
checked for NULL, but incoming parameter isa_driver is not
NULL since it is dereferenced many times before this check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20230517125025.434005-1-VEfanov@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The most important fix here is for missing dropping of the RCU read lock
when syncing maple tree register caches, the physical devices I have
that use the code don't do any syncing so I'd only ever tested this with
virtual devices and missed the fact that we need to drop the lock in
order to write to buses that need to sleep. Otherwise there's a fix for
an edge case when splitting up large batch writes which has been lurking
for a long time, a check to make sure nobody writes new drivers with a
bug that was found in several SoundWire drivers and a tweak to the way
the new kunit tests are enabled to ensure they don't cause regmap to be
enabled when it wouldn't otherwise be.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmR15dQACgkQJNaLcl1U
h9BGKgf+NFD98ZNPsvOoFqAdG33yTDWISTf8FtIakM074vhz0xrBTFolqf/wmBGF
8VEabgfpmLSyuGEizjK/YLDK4f143XBgivyXUH04vGeAQUx4MfBaf0DSxBlQLh4K
K3LebH5QNQqdv9Qx7xMy7Y06r3KiQUxUvM8AINukspY0pmY8LtxyGLvv8T3kSpjz
WU28umwPrNsr36TcSuowncvZmMJ7Y1kSEnNjer1cdVlJVn4cFzkl/Sashmy8Rg3g
5Qmbg6Kuc6mLetoshtsculYntoLUJeCt6oX01Iq3cxVqkNi+mpOgwMOgjFFlwtrc
fela1GXUFEKe31FKyQPE0DRKv697eg==
=iAaD
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"The most important fix here is for missing dropping of the RCU read
lock when syncing maple tree register caches, the physical devices I
have that use the code don't do any syncing so I'd only ever tested
this with virtual devices and missed the fact that we need to drop the
lock in order to write to buses that need to sleep.
Otherwise there's a fix for an edge case when splitting up large batch
writes which has been lurking for a long time, a check to make sure
nobody writes new drivers with a bug that was found in several
SoundWire drivers and a tweak to the way the new kunit tests are
enabled to ensure they don't cause regmap to be enabled when it
wouldn't otherwise be"
* tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: maple: Drop the RCU read lock while syncing registers
regmap: sdw: check for invalid multi-register writes config
regmap: Account for register length when chunking
regmap: REGMAP_KUNIT should not select REGMAP
Move the pm_suspend_target_state definition for CONFIG_SUSPEND
unset from the wakeup code into the headers so as to allow it
to still be used elsewhere when CONFIG_SUSPEND is not set.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
[ rjw: Changelog and subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, while calculating residency and latency values, right
operands may overflow if resulting values are big enough.
To prevent this, albeit unlikely case, play it safe and convert
right operands to left ones' type s64.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: 30f604283e ("PM / Domains: Allow domain power states to be read from DT")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently we use the normal single register write function to load the
default values into the cache, resulting in a large number of reallocations
when there are blocks of registers as we extend the memory region we are
using to store the values. Instead scan through the list of defaults for
blocks of adjacent registers and do a single allocation and insert for each
such block. No functional change.
We do not take advantage of the maple tree preallocation, this is purely at
the regcache level. It is not clear to me yet if the maple tree level would
help much here or if we'd have more overhead from overallocating and then
freeing maple tree data.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-load-defaults-v1-1-0c04336f005d@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Unfortunately the maple tree requires us to explicitly lock it so we need
to take the RCU read lock while iterating. When syncing this means that we
end up trying to write out register values while holding the RCU read lock
which triggers lockdep issues since that is an atomic context but most
buses can't be used in atomic context. Pause the iteration and drop the
lock for each register we check to avoid this.
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-sync-lock-v1-1-530e4d68dfab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
SoundWire code as it is only supports Bulk register writes and
it does not support multi-register writes.
Any drivers that set can_multi_write and use regmap_multi_reg_write() will
easily endup with programming the hardware incorrectly without any errors.
So, add this check in bus code to be able to validate the drivers config.
Fixes: 522272047d ("regmap: sdw: Remove 8-bit value size restriction")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20230523154747.5429-1-srinivas.kandagatla@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
When class_dev_iter is initialized, the reference count for the subsys
private structure is incremented, but never decremented, causing a
memory leak over time. To resolve this, save off a pointer to the
internal structure into the class_dev_iter structure and then when the
iterator is finished, drop the reference count.
Reported-and-tested-by: syzbot+e7afd76ad060fa0d2605@syzkaller.appspotmail.com
Fixes: 7b884b7f24 ("driver core: class.c: convert to only use class_to_subsys")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/r/2023051610-stove-condense-9a77@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, when regmap_raw_write() splits the data, it uses the
max_raw_write value defined for the bus. For any bus that includes
the target register address in the max_raw_write value, the chunked
transmission will always exceed the maximum transmission length.
To avoid this problem, subtract the length of the register and the
padding from the maximum transmission.
Signed-off-by: Jim Wylder <jwylder@google.com
Link: https://lore.kernel.org/r/20230517152444.3690870-2-jwylder@google.com
Signed-off-by: Mark Brown <broonie@kernel.org
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>:
This is a straightforward patch series, mostly just removing a bunch
of old features that were only used by a handful of drivers.
- 1/4 and 2/4 remove unused, deprecated functionality
- 3/4 makes the behavior of .handle_mask_sync() a bit more consistent
w.r.t. mask and unmask registers, to aid maintainability.
- 4/4 removes now-unused "inverted mask/unmask" compatibility code.
Regmap's stride is used for MMIO regmaps to check the correctness of
reg_width. However, it's acceptable to pass an empty config->reg_stride,
in that case the actual stride used is 1.
There are valid cases now to pass an empty stride, when using
down/upshifting of register address. In this case, the stride value
loses its sense, so ignore the reg_width when the stride isn't set.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com
Link: https://lore.kernel.org/r/20230511142735.316445-1-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org
All users must now specify .mask_unmask_non_inverted = true to
ensure they are using the expected semantics: 1s disable IRQs
in the mask registers, and enable IRQs in the unmask registers.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com
Link: https://lore.kernel.org/r/20230511091342.26604-5-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org
If a .handle_mask_sync() callback is provided it supersedes all
inbuilt handling of mask registers, and judging by the commit
69af4bcaa0 ("regmap-irq: Add handle_mask_sync() callback") it
is intended to completely replace all default IRQ masking logic.
The implementation has two minor inconsistencies, which can be
fixed without breaking compatibility:
(1) mask_base must be set to enable .handle_mask_sync(), even
though mask_base is otherwise unused. This is easily fixed
because mask_base is already optional.
(2) Unmask registers aren't accounted for -- they are part of
the default IRQ masking logic and are just a bit-inverted
version of mask registers. It would be a bad idea to allow
them to be used at the same time as .handle_mask_sync(),
as the result would be confusing and unmaintainable, so
make sure this can't happen.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com
Link: https://lore.kernel.org/r/20230511091342.26604-4-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org
Enabling a (modular) test should not silently enable additional kernel
functionality, as that may increase the attack vector of a product.
Fix this by:
1. making REGMAP visible if CONFIG_KUNIT_ALL_TESTS is enabled,
2. making REGMAP_KUNIT depend on REGMAP instead of selecting it.
After this, one can safely enable CONFIG_KUNIT_ALL_TESTS=m to build
modules for all appropriate tests for ones system, without pulling in
extra unwanted functionality, while still allowing a tester to manually
enable REGMAP and its test suite on a system where REGMAP is not enabled
by default.
Fixes: 2238959b6a ("regmap: Add some basic kunit tests")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org
Link: https://lore.kernel.org/r/b0a5dbb17c1d5ea482e052e585ae83bb69c48806.1682516005.git.geert@linux-m68k.org
Signed-off-by: Mark Brown <broonie@kernel.org
Remove the map parameter from the struct regmap_irq_chip callback
handle_mask_sync() because it can be passed via the irq_drv_data
parameter instead. The gpio-104-dio-48e driver is the only consumer of
this callback and is thus updated accordingly.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org
Signed-off-by: William Breathitt Gray <william.gray@linaro.org
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com
Link: https://lore.kernel.org/r/1f44fb0fbcd3dccea3371215b00f1b9a956c1a12.1679323449.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page().
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
than exclusive, and has fixed a bunch of errors which were caused by its
unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
=J+Dj
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening in
the driver core in the quest to be able to move "struct bus" and "struct
class" into read-only memory, a task now complete with these changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules for
all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most of
them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
LEGadNS38k5fs+73UaxV
=7K4B
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
- First part of DT header detangling dropping cpu.h from of_device.h
and replacing some includes with forward declarations. A handful of
drivers needed some adjustment to their includes as a result.
- Refactor of_device.h to be used by bus drivers rather than various
device drivers. This moves non-bus related functions out of
of_device.h. The end goal is for of_platform.h and of_device.h to stop
including each other.
- Refactor open coded parsing of "ranges" in some bus drivers to use DT
address parsing functions
- Add some new address parsing functions of_property_read_reg(),
of_range_count(), and of_range_to_resource() in preparation to convert
more open coded parsing of DT addresses to use them.
- Treewide clean-ups to use of_property_read_bool() and
of_property_present() as appropriate. The ones here are the ones
that didn't get picked up elsewhere.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmRIOrkACgkQ+vtdtY28
YcN9WA//R+QrmSPExhfgio5y+aOJDWucqnAcyAusPctLcF7h7j0CdzpwaSRkdaH4
KiLjeyt6tKn8wt8w7m/+SmCsSYXPn81GH/Y5I2F40x6QMrY3cVOXUsulKQA+6ZjZ
PmW3bMcz0Dw9IhUK3R/WX96+9UdoytKg5qoTzNzPTKpvKA1yHa/ogl2FnHJS5W+8
Rxz+1oJ70VMIWGpBOc0acHuB2S0RHZ46kPKkPTBgFYEwtmJ8qobvV3r3uQapNaIP
2jnamPu0tAaQoSaJKKSulToziT+sd1sNB+9oyu/kP+t3PXzq4qwp2Gr4jzUYKs4A
ZF3DPhMR3YLLN41g/L3rtB0T/YIS287sZRuaLhCqldNpRerSDk4b0HRAksGk1XrI
HqYXjWPbRxqYiIUWkInfregSTYJfGPxeLfLKrawNO34/eEV4JrkSKy8d0AJn04EK
jTRqI3L7o23ZPxs29uH/3+KK90J3emPZkF7GWVJTEAMsM8jYZduGh7EpsttJLaz/
QnxbTBm9295ahIdCfo/OQhqjWnaNhpbTzf31pyrBZ/itXV7gQ0xjwqPwiyFwI+o/
F/r81xqdwQ3Ni8MKt2c7zLyVA95JHPe95KQ3GrDXR68aByJr4RuhKG8Y2Pj1VOb3
V+Hsu5uhwKrK7Yqe+rHDnJBO00OCO8nwbWhMy2xVxoTkSFCjDmo=
=89Zj
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull more devicetree updates from Rob Herring:
- First part of DT header detangling dropping cpu.h from of_device.h
and replacing some includes with forward declarations. A handful of
drivers needed some adjustment to their includes as a result.
- Refactor of_device.h to be used by bus drivers rather than various
device drivers. This moves non-bus related functions out of
of_device.h. The end goal is for of_platform.h and of_device.h to
stop including each other.
- Refactor open coded parsing of "ranges" in some bus drivers to use DT
address parsing functions
- Add some new address parsing functions of_property_read_reg(),
of_range_count(), and of_range_to_resource() in preparation to
convert more open coded parsing of DT addresses to use them.
- Treewide clean-ups to use of_property_read_bool() and
of_property_present() as appropriate. The ones here are the ones that
didn't get picked up elsewhere.
* tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits)
bus: tegra-gmi: Replace of_platform.h with explicit includes
hte: Use of_property_present() for testing DT property presence
w1: w1-gpio: Use of_property_read_bool() for boolean properties
virt: fsl: Use of_property_present() for testing DT property presence
soc: fsl: Use of_property_present() for testing DT property presence
sbus: display7seg: Use of_property_read_bool() for boolean properties
sparc: Use of_property_read_bool() for boolean properties
sparc: Use of_property_present() for testing DT property presence
bus: mvebu-mbus: Remove open coded "ranges" parsing
of/address: Add of_property_read_reg() helper
of/address: Add of_range_count() helper
of/address: Add support for 3 address cell bus
of/address: Add of_range_to_resource() helper
of: unittest: Add bus address range parsing tests
of: Drop cpu.h include from of_device.h
OPP: Adjust includes to remove of_device.h
irqchip: loongson-eiointc: Add explicit include for cpuhotplug.h
cpuidle: Adjust includes to remove of_device.h
cpufreq: sun50i: Add explicit include for cpu.h
cpufreq: Adjust includes to remove of_device.h
...
- Fix the frequency unit in cpufreq_verify_current_freq checks()
(Sanjay Chandrashekara).
- Make mode_state_machine in amd-pstate static (Tom Rix).
- Make the cpufreq core require drivers with target_index() to set
freq_table (Viresh Kumar).
- Fix typo in the ARM_BRCMSTB_AVS_CPUFREQ Kconfig entry (Jingyu Wang).
- Use of_property_read_bool() for boolean properties in the pmac32
cpufreq driver (Rob Herring).
- Make the cpufreq sysfs interface return proper error codes on
obviously invalid input (qinyu).
- Add guided autonomous mode support to the AMD P-state driver (Wyes
Karny).
- Make the Intel P-state driver enable HWP IO boost on all server
platforms (Srinivas Pandruvada).
- Add opp and bandwidth support to tegra194 cpufreq driver (Sumit
Gupta).
- Use of_property_present() for testing DT property presence (Rob
Herring).
- Remove MODULE_LICENSE in non-modules (Nick Alcock).
- Add SM7225 to cpufreq-dt-platdev blocklist (Luca Weiss).
- Optimizations and fixes for qcom-cpufreq-hw driver (Krzysztof
Kozlowski, Konrad Dybcio, and Bjorn Andersson).
- DT binding updates for qcom-cpufreq-hw driver (Konrad Dybcio and
Bartosz Golaszewski).
- Updates and fixes for mediatek driver (Jia-Wei Chang and
AngeloGioacchino Del Regno).
- Use of_property_present() for testing DT property presence in the
cpuidle code (Rob Herring).
- Drop unnecessary (void *) conversions from the PM core (Li zeming).
- Add sysfs files to represent time spent in a platform sleep state
during suspend-to-idle and make AMD and Intel PMC drivers use them
(Mario Limonciello).
- Use of_property_present() for testing DT property presence (Rob
Herring).
- Add set_required_opps() callback to the 'struct opp_table', to make
the code paths cleaner (Viresh Kumar).
- Update the pm-graph siute of utilities to v5.11 with the following
changes:
* New script which allows users to install the latest pm-graph
from the upstream github repo.
* Update all the dmesg suspend/resume PM print formats to be able to
process recent timelines using dmesg only.
* Add ethtool output to the log for the system's ethernet device if
ethtool exists.
* Make the tool more robustly handle events where mangled dmesg or
ftrace outputs do not include all the requisite data.
- Make the sleepgraph utility recognize "CPU killed" messages (Xueqin
Luo).
- Remove unneeded SRCU selection in Kconfig because it's always set
from devfreq core (Paul E. McKenney).
- Drop of_match_ptr() macro from exynos-bus.c because this driver is
always using the DT table for driver probe (Krzysztof Kozlowski).
- Use the preferred of_property_present() instead of the low-level
of_get_property() on exynos-bus.c (Rob Herring).
- Use devm_platform_get_and_ioream_resource() in exyno-ppmu.c (Yang Li).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRGvX4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxcwsQAK5wK1HWLZDap8nTGGAyvpX+bNJ3YM+l
TS1zSzWV97K6kq2bg4GTgDi6EXJJNgfP9sThOEIee5GrWAjrk9yaxjEyIcrUBjfl
oyFN8SEuYbMN5t9Bir3GRqkL+tWErUiVafplML6vTT8W8GlL2rbxPXM6ifmK9IJq
7r3Rt+tlMrookTzV+ykSGVmC5cpnlNGsvMlGGw91Z8rlICy7MI/ecg8O6Zsy25dR
Vchrg0M+jVxtaFU9/ikQaNHx0B3AF7fpi472CYYWgk1ABfIfNyQATeHsCkKan/ZV
i4+gfgIhIQnO1Ov/05aGYbBhxVpFGQIcLkG0vEmdbHsnC/WDuMCrr5wg1HCgCdpQ
+0eQem5bWxrzKp0g9tL07QG8LuiJTfjuA4DrRZNhudKFU9oglZfZeywRk+s6ta4v
rQFzz7qdlKpcM87pz/Bm8tSTc8UYNCDd7hLe+ZI940CMs/vQ4CfQJ2tlYaIl0AiO
q33Nz1iqhEycQ9OZDzBDyQtK+Xm6lsXUehIBtbqBsFsP3Ry+nxe/fz6UMs5tVNeM
BYaaNhhkiZMhXgJncMi2oR8/LRLYtOHjn1rdOGSMu9Rck5i5TVPsxqzUOzkhvuM9
eXAwts6SwFVYxtaPJs+i6yl8cdLOFORsntIBWFKuwsgH8BFx7pNFuZA33eMOA+Iw
UFey2fKDn3W5
=p/5G
-----END PGP SIGNATURE-----
Merge tag 'pm-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These update several cpufreq drivers and the cpufreq core, add sysfs
interface for exposing the time really spent in the platform low-power
state during suspend-to-idle, update devfreq (core and drivers) and
the pm-graph suite of tools and clean up code.
Specifics:
- Fix the frequency unit in cpufreq_verify_current_freq checks()
Sanjay Chandrashekara)
- Make mode_state_machine in amd-pstate static (Tom Rix)
- Make the cpufreq core require drivers with target_index() to set
freq_table (Viresh Kumar)
- Fix typo in the ARM_BRCMSTB_AVS_CPUFREQ Kconfig entry (Jingyu Wang)
- Use of_property_read_bool() for boolean properties in the pmac32
cpufreq driver (Rob Herring)
- Make the cpufreq sysfs interface return proper error codes on
obviously invalid input (qinyu)
- Add guided autonomous mode support to the AMD P-state driver (Wyes
Karny)
- Make the Intel P-state driver enable HWP IO boost on all server
platforms (Srinivas Pandruvada)
- Add opp and bandwidth support to tegra194 cpufreq driver (Sumit
Gupta)
- Use of_property_present() for testing DT property presence (Rob
Herring)
- Remove MODULE_LICENSE in non-modules (Nick Alcock)
- Add SM7225 to cpufreq-dt-platdev blocklist (Luca Weiss)
- Optimizations and fixes for qcom-cpufreq-hw driver (Krzysztof
Kozlowski, Konrad Dybcio, and Bjorn Andersson)
- DT binding updates for qcom-cpufreq-hw driver (Konrad Dybcio and
Bartosz Golaszewski)
- Updates and fixes for mediatek driver (Jia-Wei Chang and
AngeloGioacchino Del Regno)
- Use of_property_present() for testing DT property presence in the
cpuidle code (Rob Herring)
- Drop unnecessary (void *) conversions from the PM core (Li zeming)
- Add sysfs files to represent time spent in a platform sleep state
during suspend-to-idle and make AMD and Intel PMC drivers use them
Mario Limonciello)
- Use of_property_present() for testing DT property presence (Rob
Herring)
- Add set_required_opps() callback to the 'struct opp_table', to make
the code paths cleaner (Viresh Kumar)
- Update the pm-graph siute of utilities to v5.11 with the following
changes:
* New script which allows users to install the latest pm-graph
from the upstream github repo.
* Update all the dmesg suspend/resume PM print formats to be able
to process recent timelines using dmesg only.
* Add ethtool output to the log for the system's ethernet device
if ethtool exists.
* Make the tool more robustly handle events where mangled dmesg
or ftrace outputs do not include all the requisite data.
- Make the sleepgraph utility recognize "CPU killed" messages (Xueqin
Luo)
- Remove unneeded SRCU selection in Kconfig because it's always set
from devfreq core (Paul E. McKenney)
- Drop of_match_ptr() macro from exynos-bus.c because this driver is
always using the DT table for driver probe (Krzysztof Kozlowski)
- Use the preferred of_property_present() instead of the low-level
of_get_property() on exynos-bus.c (Rob Herring)
- Use devm_platform_get_and_ioream_resource() in exyno-ppmu.c (Yang
Li)"
* tag 'pm-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits)
platform/x86/intel/pmc: core: Report duration of time in HW sleep state
platform/x86/intel/pmc: core: Always capture counters on suspend
platform/x86/amd: pmc: Report duration of time in hw sleep state
PM: Add sysfs files to represent time spent in hardware sleep state
cpufreq: use correct unit when verify cur freq
cpufreq: tegra194: add OPP support and set bandwidth
cpufreq: amd-pstate: Make varaiable mode_state_machine static
PM: core: Remove unnecessary (void *) conversions
cpufreq: drivers with target_index() must set freq_table
PM / devfreq: exynos-ppmu: Use devm_platform_get_and_ioremap_resource()
OPP: Move required opps configuration to specialized callback
OPP: Handle all genpd cases together in _set_required_opps()
cpufreq: qcom-cpufreq-hw: Revert adding cpufreq qos
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCM2290
dt-bindings: cpufreq: cpufreq-qcom-hw: Sanitize data per compatible
dt-bindings: cpufreq: cpufreq-qcom-hw: Allow just 1 frequency domain
cpufreq: Add SM7225 to cpufreq-dt-platdev blocklist
cpufreq: qcom-cpufreq-hw: fix double IO unmap and resource release on exit
cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623
cpufreq: mediatek: raise proc/sram max voltage for MT8516
...
This is a much bigger change for regmap than is normal, the main things
being the addition of some KUnit coverage and a maple tree based
register cache which longer term is likely to replace the rbtree cache
except possibly for very small register maps. While it's complete
overkill for most applications the code for maple trees is there and
there are some larger, sparser devices where the data structure is a
better fit.
The maple tree support is still a work in progress but already useful,
there's some conversions of drivers ready to go after the merge window.
- Support for shifting register addresses up as well as down, there's a
use cases with memory mapped MDIO.
- Refactoring of the type configuration in regmap-irq to allow access
to driver data in the handler, needed by some GPIO devices.
- Some initial KUnit coverage, the bulk of the driver facing API is
covered but there's holes and things like the data marshalling for
bytestream buses are just not covered in the slightest.
- Removal of the compressed cache type, it had zero users and was
getting in the way of KUnit.
- Addition of a maple tree based register cache, there's more work to
do but it's already useful for some devices with a flatter data
structure than rbtree and getting to use all the optimisation work
Liam is doing.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmRGe7QACgkQJNaLcl1U
h9BNSwf9ElqCWZgVGjMoF4b5+1M0rD6jQXbWshwRYjnvVIBaMtTXuCIZmzdmV6pH
oQTiDaTCiZGQVne3Cobncf0Pk1FTtuE3IL39CB5X355cqrkMOOxw57NYufDhhQTk
UJNTORbHPZGLOmLigVeK5nVFyGUTcSt9s9Haqz1S85Ao3rnXMw9IrC5L1QAv73zQ
pNQTLOdZDyg+vdW5Yuc0sVY5PnmVLIF5abI6E1MXumCLyQ8wS2Px1fUYXnQHmtJ1
2pZFYoqjxxIfObzC1SIsZFh0lkDdPe7WBeSKsGORhWiiCY5nYba1CiZ5DFNWR8hf
hJkxbALhdk6Iksvz0X14nAP6ESv66A==
=1hr2
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This is a much bigger change for regmap than is normal, the main
things being the addition of some KUnit coverage and a maple tree
based register cache which longer term is likely to replace the rbtree
cache except possibly for very small register maps.
While it's complete overkill for most applications the code for maple
trees is there and there are some larger, sparser devices where the
data structure is a better fit.
The maple tree support is still a work in progress but already useful,
there's some conversions of drivers ready to go after the merge
window.
Summary:
- Support for shifting register addresses up as well as down, there's
a use cases with memory mapped MDIO.
- Refactoring of the type configuration in regmap-irq to allow access
to driver data in the handler, needed by some GPIO devices.
- Some initial KUnit coverage, the bulk of the driver facing API is
covered but there's holes and things like the data marshalling for
bytestream buses are just not covered in the slightest.
- Removal of the compressed cache type, it had zero users and was
getting in the way of KUnit.
- Addition of a maple tree based register cache, there's more work to
do but it's already useful for some devices with a flatter data
structure than rbtree and getting to use all the optimisation work
Liam is doing"
* tag 'regmap-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: allow upshifting register addresses before performing operations
regmap: Pass irq_drv_data as a parameter for set_type_config()
regmap: Use mas_walk() instead of mas_find()
regmap: Fix double unlock in the maple cache
regmap: Add maple tree based register cache
regmap: Factor out single value register syncing
regmap: Add some basic kunit tests
regmap: Add RAM backed register map
regmap: Removed compressed cache support
regmap: Support paging for buses with reg_read()/reg_write()
regmap: Clarify error for unknown cache types
regmap: Handle sparse caches in the default sync
regmap: add a helper to translate the register address
regmap: cache: Silence checkpatch warning
regmap: cache: Return error in cache sync operations for REGCACHE_NONE
regmap-irq: Place kernel doc of struct regmap_irq_chip in order
regmap-irq: Add no_status support
regmap: sdw: Remove 8-bit value size restriction
regmap: sdw: Update misleading comment
o MAINTAINERS files additions and changes.
o Fix hotplug warning in nohz code.
o Tick dependency changes by Zqiang.
o Lazy-RCU shrinker fixes by Zqiang.
o rcu-tasks stall reporting improvements by Neeraj.
o Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
name for robustness.
o Documentation Updates:
o Significant changes to srcu_struct size.
o Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
o rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
o Other misc changes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
+Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
=Rgbj
-----END PGP SIGNATURE-----
Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux
Pull RCU updates from Joel Fernandes:
- Updates and additions to MAINTAINERS files, with Boqun being added to
the RCU entry and Zqiang being added as an RCU reviewer.
I have also transitioned from reviewer to maintainer; however, Paul
will be taking over sending RCU pull-requests for the next merge
window.
- Resolution of hotplug warning in nohz code, achieved by fixing
cpu_is_hotpluggable() through interaction with the nohz subsystem.
Tick dependency modifications by Zqiang, focusing on fixing usage of
the TICK_DEP_BIT_RCU_EXP bitmask.
- Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
kernels, fixed by Zqiang.
- Improvements to rcu-tasks stall reporting by Neeraj.
- Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
increased robustness, affecting several components like mac802154,
drbd, vmw_vmci, tracing, and more.
A report by Eric Dumazet showed that the API could be unknowingly
used in an atomic context, so we'd rather make sure they know what
they're asking for by being explicit:
https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/
- Documentation updates, including corrections to spelling,
clarifications in comments, and improvements to the srcu_size_state
comments.
- Better srcu_struct cache locality for readers, by adjusting the size
of srcu_struct in support of SRCU usage by Christoph Hellwig.
- Teach lockdep to detect deadlocks between srcu_read_lock() vs
synchronize_srcu() contributed by Boqun.
Previously lockdep could not detect such deadlocks, now it can.
- Integration of rcutorture and rcu-related tools, targeted for v6.4
from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
module parameter, and more
- Miscellaneous changes, various code cleanups and comment improvements
* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
checkpatch: Error out if deprecated RCU API used
mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
rcu/trace: use strscpy() to instead of strncpy()
tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
...
device_property functions do not modify the device pointer passed to them.
The underlying of_device and fwnode_ functions actually already take
const * arguments. Mark the parameter constant to simplify conversion
from of_property to device_property functions, and to let the calling code
use const device pointers where possible.
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230419164127.3773278-1-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Document that some subsystems are still going to use device_rename for
the time being, so it is not a good idea to assume it's not used. Also
remove mentions of a plan to stop renaming net devices.
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Link: https://lore.kernel.org/r/20230406045435.19452-1-wedsonaf@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().
Fixes: eb7fbc9fb1 ("driver core: Add missing '\n' in log messages")
Cc: stable <stable@kernel.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Brian Norris <briannorris@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The crypto dependencies for the firmwware loader are incomplete,
in particular a built-in FW_LOADER fails to link against a modular
crypto hash driver:
ld.lld: error: undefined symbol: crypto_alloc_shash
ld.lld: error: undefined symbol: crypto_shash_digest
ld.lld: error: undefined symbol: crypto_destroy_tfm
>>> referenced by main.c
>>> drivers/base/firmware_loader/main.o:(fw_log_firmware_info) in archive vmlinux.a
Rework this to use the usual 'select' from the driver module,
to respect the built-in vs module dependencies, and add a
more verbose crypto dependency to the debug option to prevent
configurations that lead to a link failure.
Fixes: 02fe26f253 ("firmware_loader: Add debug message with checksum for FW file")
Reviewed-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230414080329.76176-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Having helped an user recently figure out why the customized path being
specified was not taken into account landed on a subtle difference
between using:
echo "/xyz/firmware" > /sys/module/firmware_class/parameters/path
which inserts an additional newline which is passed as is down to
fw_get_filesystem_firmware() and ultimately kernel_read_file_from_path()
and fails.
Strip off \n from the customized firmware path such that users do not
run into these hard to debug situations.
Link: https://lore.kernel.org/all/20230402135423.3235-1-f.fainelli@gmail.com/
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230413191757.1949088-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cache information can be extracted from either a Device Tree(DT),
the PPTT ACPI table, or arch registers (clidr_el1 for arm64).
When the DT is used but no cache properties are advertised, the current
code doesn't correctly fallback to using arch information. The changes
fixes the same and also assuse the that L1 data/instruction caches
are private and L2/higher caches are shared when the cache information
is missing in DT/ACPI and is derived form clidr_el1/arch registers.
Currently the cacheinfo is built from the primary CPU prior to secondary
CPUs boot, if the DT/ACPI description contains cache information.
However, if not present, it still reverts to the old behavior, which
allocates the cacheinfo memory on each secondary CPUs which causes
RT kernels to triggers a "BUG: sleeping function called from invalid
context".
The changes here attempts to enable automatic detection for RT kernels
when no DT/ACPI cache information is available, by pre-allocating
cacheinfo memory on the primary CPU.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmQ9bmEACgkQAEG6vDF+
4pggDhAAo75vFky/U63PV9x4IMvVTkNGIK/DSihvG+s5GsujiSS78f4tSSpLAY1X
yWTEYsU+1q03cE8rv0Xmkw+cxLerjOGisvPZCdnKVljUF5TWez6OlwKw5V1QqWk9
faOmWBSb8fH4X0ys73e0SBsF3bzyJB/cORmbOL8OnCE7rGkyMQ0plhYVOBQ3CoV8
KMrw7rnPlc5Aoq/8LgTY+Gojqf82njacCIQPrn1TjS/V0SdobC8xm7ZoepZgG9Q1
MHzBtTmKzGrVxWawxcfTaE89t2rudAafa3noYr4xy+dI8ptSkTQNKCG9pvTxIcFo
KXKfBO6+hTWNREf7a4ginKdKcIKDD7Oo2oTnPBYr9O9/2r727c5Wa5FEGsXa4c14
0dN3vFbhwWIDc5sA7eBOCDYycJm6XyTGJ6iPR+fmaWKwgXhIM2WxakeO8aptuwhn
sV+XO7IJxSF79GrrAkSN6AtVj6NHvNpzwRx9mBwHkT3ajDGb04P23AOuqoNDt7kh
wcz217ng3hG4CDEbOk9JCQqtFhJeSuIWP5T3wofUfYeXy2opQhwzmiOL+NVROuGY
h/cwQCcXCEYUohJhln2PvbMS7w0bBNserpetzEFJfc5aLLdocaH3aQ/AFnCXmCdN
YPXP6bp0WdbmfcMskVY8vlb/TBlcOzu6+W0qB1dXTWuib7cwm3o=
=4hhJ
-----END PGP SIGNATURE-----
Merge tag 'cacheinfo-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
cacheinfo and arch_topology updates for v6.4
The cache information can be extracted from either a Device Tree(DT),
the PPTT ACPI table, or arch registers (clidr_el1 for arm64).
When the DT is used but no cache properties are advertised, the current
code doesn't correctly fallback to using arch information. The changes
fixes the same and also assuse the that L1 data/instruction caches
are private and L2/higher caches are shared when the cache information
is missing in DT/ACPI and is derived form clidr_el1/arch registers.
Currently the cacheinfo is built from the primary CPU prior to secondary
CPUs boot, if the DT/ACPI description contains cache information.
However, if not present, it still reverts to the old behavior, which
allocates the cacheinfo memory on each secondary CPUs which causes
RT kernels to triggers a "BUG: sleeping function called from invalid
context".
The changes here attempts to enable automatic detection for RT kernels
when no DT/ACPI cache information is available, by pre-allocating
cacheinfo memory on the primary CPU.
* tag 'cacheinfo-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
The cache information can be extracted from either a Device
Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
for arm64).
The clidr_el1 register is used only if DT/ACPI information is not
available. It does not states how caches are shared among CPUs.
Add a use_arch_cache_info field/function to identify when the
DT/ACPI doesn't provide cache information. Use this information
to assume L1 caches are privates and L2 and higher are shared among
all CPUs.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230414081453.244787-5-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
fetch_cache_info() tries to get the number of cache leaves/levels
for each CPU in order to pre-allocate memory for cacheinfo struct.
Allocating this memory later triggers a:
'BUG: sleeping function called from invalid context'
in PREEMPT_RT kernels.
If there is no cache related information available in DT or ACPI,
fetch_cache_info() fails and an error message is printed:
'Early cacheinfo failed, ret = ...'
Not having cache information should be a valid configuration.
Remove the error message if fetch_cache_info() fails with -ENOENT.
Suggested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
If a Device Tree (DT) is used, the presence of cache properties is
assumed. Not finding any is not considered. For arm64 platforms,
cache information can be fetched from the clidr_el1 register.
Checking whether cache information is available in the DT
allows to switch to using clidr_el1.
init_of_cache_level()
\-of_count_cache_leaves()
will assume there a 2 cache leaves (L1 data/instruction caches), which
can be different from clidr_el1 information.
cache_setup_of_node() tries to read cache properties in the DT.
If there are none, this is considered a success. Knowing no
information was available would allow to switch to using clidr_el1.
Fixes: de0df442ee ("cacheinfo: Check 'cache-unified' property to count cache leaves")
Reported-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
If there is no ACPI/DT information, it is assumed that L1 caches
are private and L2 (and higher) caches are shared. A cache is
'shared' between two CPUs if it is accessible from these two
CPUs.
Each CPU owns a representation (i.e. has a dedicated cacheinfo struct)
of the caches it has access to. cache_leaves_are_shared() tries to
identify whether two representations are designating the same actual
cache.
In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher)
and 'sib_leaf' is a L1 cache, the caches are detected as shared as
only this_leaf's cache level is checked.
This is leads to setting sib_leaf as being shared with another CPU,
which is incorrect as this is a L1 cache.
Check 'sib_leaf->level'. Also update the comment as the function is
called when populating 'shared_cpu_map'.
Fixes: f16d1becf9 ("cacheinfo: Use cache identifiers to check if the caches are shared if available")
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-2-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
implicitly including other includes, and is no longer needed. Update the
includes to use of.h instead of of_device.h.
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230329-dt-cpu-header-cleanups-v1-10-581e2605fe47@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Recent work enables cacheinfo memory for secondary CPUs to be allocated
early, while still running on the primary CPU. That allows cacheinfo
memory to be allocated safely on RT kernels. To make that work, the
number of cache levels/leaves must be defined in the device tree or ACPI
tables. Further work adds a path for early detection of the number of
cache levels/leaves, which makes it possible to allocate the cacheinfo
memory early without requiring extra DT/ACPI information.
This patch addresses a specific issue with ACPI systems with no PPTT. In
that case, parse_acpi_topology() returns an error code, which in turn
makes init_cpu_topology() return early, before fetch_cache_info() is
called. In that case, the early cache level detection doesn't run.
The solution is to simply remove the "return" statement and let the code
flow fall through to calling fetch_cache_info().
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reported-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/all/dea94484-797f-3034-7b86-6d88801c0d91@arm.com/
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230412185759.755408-4-rrendec@redhat.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
This patch gives architecture specific code the ability to initialize
the cache level and allocate cacheinfo memory early, when cache level
initialization runs on the primary CPU for all possible CPUs.
This is part of a patch series that attempts to further the work in
commit 5944ce092b ("arch_topology: Build cacheinfo from primary CPU").
Previously, in the absence of any DT/ACPI cache info, architecture
specific cache detection and info allocation for secondary CPUs would
happen in non-preemptible context during early CPU initialization and
trigger a "BUG: sleeping function called from invalid context" splat on
an RT kernel.
More specifically, this patch adds the early_cache_level() function,
which is called by fetch_cache_info() as a fallback when the number of
cache leaves cannot be extracted from DT/ACPI. In the default generic
(weak) implementation, this new function returns -ENOENT, which
preserves the original behavior for architectures that do not implement
the function.
Since early detection can get the number of cache leaves wrong in some
cases*, additional logic is added to still call init_cache_level() later
on the secondary CPU, therefore giving the architecture specific code an
opportunity to go back and fix the initial guess. Again, the original
behavior is preserved for architectures that do not implement the new
function.
* For example, on arm64, CLIDR_EL1 detection works only when it runs on
the current CPU. In other words, a CPU cannot detect the cache depth
for any other CPU than itself.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230412185759.755408-2-rrendec@redhat.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Similar to the existing reg_downshift mechanism, that is used to
translate register addresses on busses that have a smaller address
stride, it's also possible to want to upshift register addresses.
Such a case was encountered when network PHYs and PCS that usually sit
on a MDIO bus (16-bits register with a stride of 1) are integrated
directly as memory-mapped devices. Here, the same register layout
defined in 802.3 is used, but the register now have a larger stride.
Introduce a mechanism to also allow upshifting register addresses.
Re-purpose reg_downshift into a more generic, signed reg_shift, whose
sign indicates the direction of the shift. To avoid confusion, also
introduce macros to explicitly indicate if we want to downshift or
upshift.
For bisectability, change any use of reg_downshift to use reg_shift.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230407152604.105467-1-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Assignments from pointer variables of type (void *) do not require
explicit type casts, so remove such type cases from the code in
drivers/base/power/main.c where applicable.
Signed-off-by: Li zeming <zeming@nfschina.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge series from William Breathitt Gray <william.gray@linaro.org>:
The regmap API supports IO port accessors so we can take advantage of
regmap abstractions rather than handling access to the device registers
directly in the driver.
A patch to pass irq_drv_data as a parameter for struct regmap_irq_chip
set_type_config() is included. This is needed by the
idio_24_set_type_config() and ws16c48_set_type_config() callbacks in
order to update the type configuration on their respective devices.
For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU causing
test failures. Such failure happens on all architectures.
Fix the repeated error messages thrown by this (even if the hotplug errors are
harmless) by asking the opinion of the nohz subsystem on whether the CPU can be
hotplugged.
[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]
For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: rcu <rcu@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 2987557f52 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Liam recommends using mas_walk() instead of mas_find() for our use case so
let's do that, it avoids some minor overhead associated with being able to
restart the operation which we don't need since we do a simple search.
Suggested-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-walk-fine-v2-1-c07371c8a867@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Doing the dance to drop the maple tree's internal spinlock means we need
multiple exit paths in our error handling.
Reported-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-unlock-v1-1-89998991b16c@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The add_dev and remove_dev callbacks in struct class_interface currently
pass in a pointer back to the class_interface structure that is calling
them, but none of the callback implementations actually use this pointer
as it is pointless (the structure is known, the driver passed it in in
the first place if it is really needed again.)
So clean this up and just remove the pointer from the callbacks and fix
up all callback functions.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Weiyang <wangweiyang2@huawei.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Jakob Koschel <jakobkoschel@gmail.com>
Cc: Cai Xinchen <caixinchen1@huawei.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/2023040250-pushover-platter-509c@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct class pointer in struct class_interface is never modified, so
mark it as const so that no one accidentally tries to modify it in the
future.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040249-handball-gruffly-5da7@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the class code is cleaned up to not modify the class pointer
registered with it, change class_register() to take a const * to allow
the structure to be placed into read-only memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040248-customary-release-4aec@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct class callback, class_release(), is only called in 2 places,
the pcmcia cardservices code, and in the class driver core code. Both
places it is safe to mark the structure as a const *, to allow us to
in the future mark all struct class usages as constant and move into
read-only memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040248-outrage-obsolete-5a9a@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device_create() and device_create_with_groups() function comments
incorrectly state that they only work with a struct class that was
created using class_create(), but that is not true now and I am not sure
if it ever was. So just remove the comment as it's not needed now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040218-scouts-unplowed-24d2@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current state of the art for sparse register maps is the
rbtree cache. This works well for most applications but isn't
always ideal for sparser register maps since the rbtree can get
deep, requiring a lot of walking. Fortunately the kernel has a
data structure intended to address this very problem, the maple
tree. Provide an initial implementation of a register cache
based on the maple tree to start taking advantage of it.
The entries stored in the maple tree are arrays of register
values, with the maple tree keys holding the register addresses.
We store data in host native format rather than device native
format as we do for rbtree, this will be a benefit for devices
where we don't marshal data within regmap and simplifies the code
but will result in additional CPU overhead when syncing the cache
on devices where we do marshal data in regmap.
This should work well for a lot of devices, though there's some
additional areas that could be looked at such as caching the
last accessed entry like we do for rbtree and trying to minimise
the maple tree level locking. We should also use bulk writes
rather than single register writes when resyncing the cache where
possible, even if we don't store in device native format.
Very small register maps may continue to to better with rbtree
longer term.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230325-regcache-maple-v3-2-23e271f93dc7@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
In order to support sparse caches that don't store data in raw format
factor out the parts of the raw block sync implementation that deal with
writing a single register via _regmap_write().
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230325-regcache-maple-v3-1-23e271f93dc7@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syzbot found that we had forgotten to unregister the lock_class_key when
using it in commit dcfbb67e48 ("driver core: class: use lock_class_key
already present in struct subsys_private") so fix that up and correctly
release it when done.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-and-tested-by: <syzbot+41d665317c811d4d88aa@syzkaller.appspotmail.com>
Fixes: dcfbb67e48 ("driver core: class: use lock_class_key already present in struct subsys_private")
Link: https://lore.kernel.org/r/2023040126-blandness-duckling-bd55@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nothing outside of drivers/base/core.c uses sysfs_dev_char_kobj, so
make it static and document what it is used for so we remember it the
next time we touch it 15 years from now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nothing outside of drivers/base/core.c uses sysfs_dev_block_kobj, so
make it static and document what it is used for so we remember it the
next time we touch it 15 years from now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dev_kobj field in struct class is now only written to, but never
read from, so it can be removed as it is useless.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a dev_t is set in a struct device, an symlink in /sys/dev/ is
created for it either under /sys/dev/block/ or /sys/dev/char/ depending
on the device type.
The logic to determine this would trigger off of the class of the
object, and the kobj_type set in that location. But it turns out that
this deep nesting isn't needed at all, as it's either a choice of block
or "everything else" which is a char device. So make the logic a lot
more simple and obvious, and remove the incorrect comments in the code
that tried to document something that was not happening at all (it is
impossible to set class->dev_kobj to NULL as the class core prevented
that from happening.
This removes the only place that class->dev_kobj was being used, so
after this, it can be removed entirely.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the last users of the subsystem private pointer in struct class
are gone, the pointer can be removed, as no one is using it. One step
closer to allowing struct class to be const and moved into read-only
memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some classes (i.e. gpio), want to know if they have been registered or
not, and poke around in the class's internal structures to try to figure
this out. Because this is not really a good idea, provide a function
for classes to call to try to figure this out.
Note, this is racy as the state of the class could change at any moment
in time after the call is made, but as usually a class only wants to
know if it has been registered yet or not, it should be fairly safe to
use, and is just as safe as the previous "poke at the class internals"
check was.
Move the gpiolib code to use this function as proof that it works
properly.
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-gpio@vger.kernel.org
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a number of places in core.c that need access to the private
subsystem structure of struct class, so move them to use
class_to_subsys() instead of accessing it directly.
This requires exporting class_to_subsys() out of class.c, but keeping it
local to the driver core.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On the theory that it's better to make a start let's add some KUnit tests
for regmap. Currently this is a bit of a mess but it passes and hopefully
will at some point help catch problems. We provide very basic cover for
most of the core functionality that operates at the register level,
repeating each test for each cache type in order to exercise the caches.
There is no coverage of anything to do with the bulk operations at the bus
level or formatting for byte stream buses yet.
Each test creates it's own regmap since the cache structures are built
incrementally, meaning we gain coverage from the different access
patterns, and some of the tests cover different init scenarios.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regmap-kunit-v2-2-b208801dc2c8@kernel.org
Add a register map that is a simple array of memory, for use in
KUnit testing of the framework. This is not exposed in regmap.h
since I can't think of a non-test use case, it is purely for use
internally. To facilitate testing we track if registers have been
read or written to.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regmap-kunit-v2-1-b208801dc2c8@kernel.org
The compressed register cache support has assumptions that make it hard to
cover in testing, mainly that it requires raw registers defaults be
provided. Rather than either address these assumptions or leave it untested
by the forthcoming KUnit tests let's remove it, the use case is quite thin
and there are no current users.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regcache-lzo-v1-1-08c5d63e2a5e@kernel.org
Several SoC drivers use the same of-based mechanism to populate the machine
name. Therefore move this to the core and try to populate the machine name
in soc_device_register if it's not set yet.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/6dbdf458-9f46-613e-de58-b4a56a6cdd9f@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After entering 6.3-rc1 the LLC cacheinfo is not exported on our ACPI
based arm64 server. This is because the LLC cacheinfo is partly reset
when secondary CPUs boot up. On arm64 the primary cpu will allocate
and setup cacheinfo:
init_cpu_topology()
for_each_possible_cpu()
fetch_cache_info() // Allocate cacheinfo and init levels
detect_cache_attributes()
cache_shared_cpu_map_setup()
if (!last_level_cache_is_valid()) // not valid, setup LLC
cache_setup_properties() // setup LLC
On secondary CPU boot up:
detect_cache_attributes()
populate_cache_leaves()
get_cache_type() // Get cache type from clidr_el1,
// for LLC type=CACHE_TYPE_NOCACHE
cache_shared_cpu_map_setup()
if (!last_level_cache_is_valid()) // Valid and won't go to this branch,
// leave LLC's type=CACHE_TYPE_NOCACHE
The last_level_cache_is_valid() use cacheinfo->{attributes, fw_token} to
test it's valid or not, but populate_cache_leaves() will only reset
LLC's type, so we won't try to re-setup LLC's type and leave it
CACHE_TYPE_NOCACHE and won't export it through sysfs.
This patch tries to fix this by not re-populating the cache leaves if
the LLC is valid.
Fixes: 5944ce092b ("arch_topology: Build cacheinfo from primary CPU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230328114915.33340-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that class_to_subsys() can be used to get access to the internal
class private pointer, convert the remaining few places in class.c that
were accessing the pointer directly to use class_to_subsys() instead.
By doing this, the need for class_get() and class_put() goes away as no
one actually tries to increment the class structures anymore, only the
internal dynamic one.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325194234.46588-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Much like what was done in commit 273afac615 ("driver core: bus:
implement bus_get/put() without the private pointer"), it is time to
move the driver core away from using the internal private pointer in
struct class in order to enable it to be always a constant and be placed
in read-only memory in the future.
First step in doing this is to create a helper function that turns a
'struct class' into 'struct subsys_private' called class_to_subsys().
class_to_subsys() walks the list of registered busses in the system and
finds the matching one based on the pointer to the class itself. As
this is a short list, and this function is not on any fast path, it
should not be noticable.
Implement class_get() and class_put() using this new helper function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325194234.46588-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct class should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
class to be moved to read-only memory.
While we are touching all class sysfs callbacks also mark the attribute
as constant as it can not be modified. The bonding code still uses this
structure so it can not be removed from the function callbacks.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steve French <sfrench@samba.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: linux-cifs@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: netdev@vger.kernel.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230325084537.3622280-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a build time equivalent of fw_devlink.sync_state=timeout so that
board specific kernels could enable it and not have to deal with setting
or cluttering the kernel commandline.
Cc: Doug Anderson <dianders@chromium.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20230317205134.964098-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_unregister() and class_destroy() function should be taking a
const * to struct class, not just a *, so fix that up.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325084526.3622123-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The structure sysfs_dev_char_kobj is local only to the driver core code,
so move it out of the global class.h file and into the internal base.h
file as no one else should be touching this symbol.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230327160319.513974-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit dcfbb67e48 ("driver core: class: use lock_class_key already
present in struct subsys_private") we removed the key parameter to the
function class_create() but forgot to remove it from the kerneldoc,
which causes a build warning. Fix that up by removing the key parameter
from the documentation as it is now gone.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: dcfbb67e48 ("driver core: class: use lock_class_key already present in struct subsys_private")
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230327081828.1087364-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The error message printed when we fail to locate the cache type the map
requested says it can't find a compress type rather than a cache type,
fix that. Since the compressed type is the only one currently compiled
conditionally it's likely to be the missing type but that might not always
be true and is still unclear.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regcache-unknown-v1-1-80deecbf196b@kernel.org
In commit 37e98d9bed ("driver core: bus: move lock_class_key into
dynamic structure"), the lock_key variable moved out of struct bus_type
and into struct subsys_private, yet the documentation for it did not
move. Fix that up and place the documentation comment in the correct
location.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Fixes: 37e98d9bed ("driver core: bus: move lock_class_key into dynamic structure")
Link: https://lore.kernel.org/r/20230324090814.386654-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from Maxime Chevallier <maxime.chevallier@bootlin.com>:
This introduces a helper that factors out register rewriting, it
will be the basis for further work that will need cross tree
merges so is on a branch.
Register addresses passed to regmap operations can be offset with
regmap.reg_base and downshifted with regmap.reg_downshift.
Add a helper to apply both these operations and return the translated
address, that we can then use to perform the actual register operation
ont the underlying bus.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://lore.kernel.org/r/20230324093644.464704-2-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/base/physical_location.h as
they are not needed.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324122711.2664537-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/base/base.h as they are not
needed.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324122711.2664537-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 37e98d9bed ("driver core: bus: move lock_class_key into
dynamic structure"), we moved the lock_class_key into the internal
structure shared by busses and classes, but only used it for buses.
Move the class code to use this structure as it is already present and
being allocated, instead of the statically allocated on-the-stack
variable that class_create() was using as part of a macro wrapper around
the core function call.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324100132.1633647-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fwnode parameter is not altered in the following APIs:
- fwnode_get_next_parent_dev()
- fwnode_is_ancestor_of()
- fwnode_graph_get_endpoint_count()
so constify them.
Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230324112720.71315-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fwnode_get_phy_mode() does not modify the fwnode argument, merely
using it to obtain the phy-mode property value. Therefore, it can
be made const.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/E1pfdh9-00EQ8t-HB@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's funny to think about getting a reference count of a constant
structure pointer, but this locks into place the private data
"underneath" the struct bus_type() which is important to not go away
while we are working with the bus structure for some callbacks.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-27-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bus_rescan_devices() function was missed in the previous change of
the bus_for_each* constant pointer changes, so fix it up now to take a
const * to struct bus_type.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-25-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bus_register() is now safe to take a constant * to bus_type, so make
that change and mark the subsys_private bus_type * constant as well.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-24-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that all accesses of dev_root is through the bus_get_dev_root()
call, move the pointer out of struct bus_type and into the private
dynamic structure, subsys_private.
With this change, there is no modifiable portions of struct bus_type so
it can be marked as a constant structure and moved to read-only memory.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-22-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The functions device_create() and device_create_with_groups() do not
modify the struct class passed into it, so enforce this by changing the
function parameters to be struct const class.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-12-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pointer to a struct class in a struct device should never be used to
change anything in that class. So mark it as constant to enforce this
requirement.
This requires a few minor changes to some internal driver core functions
to enforce the const * being used here now.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_create_file*() and class_remove_file*() functions do not
modify the struct class at all, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_find_device*() functions do not modify the struct class or the
struct device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
class_for_each_device() does not modify the struct class or the struct
device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
class_dev_iter_init() does not modify the struct class or the struct
device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to manually set the owner of a struct class, as the
registering function does it automatically, so remove all of the
explicit settings from various drivers that did so as it is unneeded.
This allows us to remove this pointer entirely from this structure going
forward.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's no need to manually have to set the module owner of a class, the
compiler should do it automatically for you, so add a module * to the
__class_register() function and allow it to set the module owner
automatically.
This will let us move the module pointer out of struct class eventually,
as it should not be embedded in there if we wish for it to be a
read-only structure eventually.
And, funny story, this module pointer isn't even being used for
anything, so while we will keep it around for now, it's not like it even
matters.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
checkpatch.pl warned:
WARNING: ENOSYS means 'invalid syscall nr' and nothing else
Align the return value to regcache_drop_region().
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20230313071812.13577-2-alexander.stein@ew.tq-group.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no sense in doing a cache sync on REGCACHE_NONE regmaps.
Instead of panicking the kernel due to missing cache_ops, return an error
to client driver.
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20230313071812.13577-1-alexander.stein@ew.tq-group.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Some of the functions do not provide Return: section on absence of which
kernel-doc complains. Besides that several functions return the fwnode
handle with incremented reference count. Add a respective note to make sure
that the caller decrements it when it's not needed anymore.
While at it, unify the style of the Return: sections.
Reported-by: Daniel Kaehn <kaehndan@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230217133344.79278-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the file is written to and sync_state() hasn't been called for the
device yet, then call sync_state() for the device independent of the
state of its consumers.
This is useful for supplier devices that have one or more consumers that
don't have a driver but the consumers are in a state that don't use the
resources supplied by the supplier device.
This gives finer grained control than using the
fw_devlink.sync_state=timeout kernel commandline parameter.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When all devices that could probe have finished probing (based on
deferred_probe_timeout configuration or late_initcall() when
!CONFIG_MODULES), this parameter controls what to do with devices that
haven't yet received their sync_state() calls.
fw_devlink.sync_state=strict is the default and the driver core will
continue waiting on all consumers of a device to probe successfully
before sync_state() is called for the device. This is the default
behavior since calling sync_state() on a device when all its consumers
haven't probed could make some systems unusable/unstable. When this
option is selected, we also print the list of devices that haven't had
sync_state() called on them by the time all devices the could probe have
finished probing.
fw_devlink.sync_state=timeout will cause the driver core to give up
waiting on consumers and call sync_state() on any devices that haven't
yet received their sync_state() calls. This option is provided for
systems that won't become unusable/unstable as they might be able to
save power (depends on state of hardware before kernel starts) if all
devices get their sync_state().
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In removing the CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2
config options, I messed up in the __class_register() function and got
the logic incorrect. Fix this all up by just removing the special case
of a block device class logic in this function, as that is what is
intended.
In testing, this solves the boot problem on my systems, hopefully on
others as well.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 721da5cee9 ("driver core: remove CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2")
Link: https://lore.kernel.org/r/20230307075102.3537-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from William Breathitt Gray <william.gray@linaro.org>:
There are devices which have interrupt support with mask and ack
registers but no status register. Add a flag which lets us support
them, we just assume that all the interrupts fired.
CONFIG_SYSFS_DEPRECATED was added in commit 88a22c985e
("CONFIG_SYSFS_DEPRECATED") in 2006 to allow systems with older versions
of some tools (i.e. Fedora 3's version of udev) to boot properly. Four
years later, in 2010, the option was attempted to be removed as most of
userspace should have been fixed up properly by then, but some kernel
developers clung to those old systems and refused to update, so we added
CONFIG_SYSFS_DEPRECATED_V2 in commit e52eec13cd ("SYSFS: Allow boot
time switching between deprecated and modern sysfs layout") to allow
them to continue to boot properly, and we allowed a boot time parameter
to be used to switch back to the old format if needed.
Over time, the logic that was covered under these config options was
slowly removed from individual driver subsystems successfully, removed,
and the only thing that is now left in the kernel are some changes in
the block layer's representation in sysfs where real directories are
used instead of symlinks like normal.
Because the original changes were done to userspace tools in 2006, and
all distros that use those tools are long end-of-life, and older
non-udev-based systems do not care about the block layer's sysfs
representation, it is time to finally remove this old logic and the
config entries from the kernel.
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-block@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20230223073326.2073220-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices lack status registers, yet expect to handle interrupts.
Introduce a no_status flag to indicate such a configuration, where
rather than read a status register to verify, all interrupts received
are assumed to be active.
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/bd501b4b5ff88da24d467f75e8c71b4e0e6f21e2.1677515341.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Some SoundWire devices have larger width device specific register
maps, in addition to the standard SoundWire 8-bit map. Update the
helpers to allow accessing arbitrarily sized register values and remove
the explicit 8-bit restriction from regmap_sdw_config_check.
Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230112171840.2098463-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In the regmap config reg_bits represents the number of address bits not
the number of value bits. Correct the misleading comment which looks a
lot like it suggests the register value itself is 32-bits wide.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230112171840.2098463-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
- Prevent possible NULL pointer derefences in irq_data_get_affinity_mask()
and irq_domain_create_hierarchy().
- Take the per device MSI lock before invoking code which relies
on it being hold.
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted to
use core infrastructure and results in a fals positive warning.
- Remove dead code in the MSI subsystem.
- Clarify the documentation for pci_msix_free_irq().
- More kobj_type constification.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmQEVToTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYodc9EACg5HBOGsh5OV8pwnuEDqfThK/dOZ5k
LJ8xQjGAx29JeNWu4gkkHaSDEGjZhwLiZlB6qUeH+LPQ1NgmAlLL3T2NxEOOWa6y
z/xQv+1Ceu6XxazpCSFRWR/6w4Nyup92jhlsUIkmmsWkVvKH/pV6Uo+3ta0WagWg
heb3vqts6J0AOJaMepF8azYGbwAPSIElNLI1UtiEuQYEKU55N8jLK20VJTL6lzJ2
FyRg/0ghNWDAaBdnv4cZCQ/MzoG5UkoU3f2cqhdSce5mqnq2fKRfgBjzllNgaRgA
zxOxIR88QaKTMHIr+WKD1dyWxDQlotFbBOkmVW39XAa13rn42s4GIeW88VCjJGww
RAm52SbC48cCIyNQlh4A6Vhb4vjPx2DndWbWnnVj5fWlUevdAPRxSlm4BjfxFxe+
LbuZCRRL1jjlC0fXmhVXTTxeE1/K7jarAZwRV7Nxhr3g0gT+Zv1jyaaW9rWuHq5U
3pS+xBl89LA/VYp9tv6jDfJlocmRwgrFbGX4UlfikqtObdTFqcH0FtmqisE61fZS
n0194BMWNDfPSibSpDohf/CDPoHZ6pNxeuqkVDiisUJHPpIYOt8+lH+8//DgBL7a
oi9zS0JazPIn2VM6NB4f/WXOYmS9GZq5+loiYEWb52AYtodKUKmOoWG0SUyy6XFr
E7yJzemsUwrJVg==
=jWot
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"A set of updates for the interrupt susbsystem:
- Prevent possible NULL pointer derefences in
irq_data_get_affinity_mask() and irq_domain_create_hierarchy()
- Take the per device MSI lock before invoking code which relies on
it being hold
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted
to use core infrastructure and results in a fals positive warning
- Remove dead code in the MSI subsystem
- Clarify the documentation for pci_msix_free_irq()
- More kobj_type constification"
* tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced
genirq/msi: Drop dead domain name assignment
irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy()
genirq/irqdesc: Make kobj_type structures constant
PCI/MSI: Clarify usage of pci_msix_free_irq()
genirq/msi: Take the per-device MSI lock before validating the control structure
genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
Here is another small set of driver core patches for 6.3-rc1
They resolve some reported problems with the previous driver core
patches that are in your tree.
They solve a problem with the bus_type cleanup as reported and fixced by
Geert, and 2 fw_devlink changes to make debugging problems easier.
There is one known outstanding problem with the fw_deflink changes in
your tree that is still being worked on, and it looks like a clk core
change will be submitted soon for that, probably after 6.3-rc1.
All 3 of these have been in linux-next with no reported problems (only
reports that they fixed problems.)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZAB4LQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymOqgCfWIGanNU4fg2DOB4eyJu2JBQmq00AoJeHmJhR
/pAjBOMYYJYVCqQCR6ik
=1UhM
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1_2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here is another small set of driver core patches.
They resolve some reported problems with the previous driver core
patches that are in your tree.
They solve a problem with the bus_type cleanup as reported and fixed
by Geert, and two fw_devlink changes to make debugging problems
easier.
There is one known outstanding problem with the fw_deflink changes in
your tree that is still being worked on, and it looks like a clk core
change will be submitted soon for that, probably after 6.3-rc1.
All three of these have been in linux-next with no reported problems
(only reports that they fixed problems)"
* tag 'driver-core-6.3-rc1_2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: fw_devlink: Print full path and name of fwnode
driver core: fw_devlink: Avoid spurious error message
driver core: bus: Handle early calls to bus_to_subsys()
Miquel reported a warning in the MSI core which is triggered when
interrupts are freed via platform_msi_device_domain_free().
This code got reworked to use core functions for freeing the MSI
descriptors, but nothing took care to clear the msi_desc->irq entry, which
then triggers the warning in msi_free_msi_desc() which uses desc->irq to
validate that the descriptor has been torn down. The same issue exists in
msi_domain_populate_irqs().
Up to the point that msi_free_msi_descs() grew a warning for this case,
this went un-noticed.
Provide the counterpart of msi_domain_populate_irqs() and invoke it in
platform_msi_device_domain_free() before freeing the interrupts and MSI
descriptors and also in the error path of msi_domain_populate_irqs().
Fixes: 2f2940d168 ("genirq/msi: Remove filter from msi_free_descs_free_range()")
Reported-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87mt4wkwnv.ffs@tglx
case with the CLK_OPS_PARENT_ENABLE flag combined with clk_core_is_enabled()
where it hangs the system. We'll simply assume the clk is disabled if the
parent is disabled and the flag is set. Trying to turn on the parent to check
the enable state of the clk runs into system hangs at boot. We let this bake in
-next for a couple weeks to make sure there aren't any more issues because the
last attempt to fix this ran into hangs and had to be reverted.
Note: There were some more patches to the core framework around sync_state and
disabling unused clks, but I asked for that to be reverted from the qcom PR
because it isn't ready and we're still discussing the best solution on the
list.
Outside of the core clk framework, we have the usual collection of clk driver
updates and support for new SoCs (which seems to never stop). The dirstat is
dominated by Qualcomm because they added support for quite a few SoCs this time
around and also migrated quite a few of their drivers to clk_parent_data. The
other big diff is in the Mediatek clk drivers that saw a significant rework
this cycle to similarly modernize the code, and we'll see that work continue in
the next cycle as well. Nothing really jumps out as scary here, except that the
significant churn in parent data descriptions can have typos that go unnoticed.
More details below.
Core:
- Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
New Drivers:
- Add a new clk-gpr-mux clock type and use it on i.MX6Q to add ENET ref
clocks
- Support for Mediatek MT7891 SoC clks
- Support for many Qualcomm clk controllers:
- QDU1000/QRU1000 global clock controller
- SA8775P global clock controller
- SM8550 TCSR and display clock controller
- SM6350 clock controller
- MSM8996 CBF and APCS clock controllers
Updates:
- Various cleanups and improvements to Mediatek clk drivers to reduce
code size and modernize the drivers
- Support for Versa 5P49V60 clks
- Disable R-Car H3 ES1.*, as it was only available to an internal
development group and needed a lot of quirks and workarounds
- Add PWM, Compare-Match Timer (TIM), USB, SDHI, and eMMC clocks and
resets on Renesas RZ/V2M
- Add display clocks on Renesas R-Car V4H
- Add Camera Receiving Unit (CRU) clocks and resets on Renesas RZ/G2L
- Free the imx_uart_clocks even if imx_register_uart_clocks returns early
- Get the stdout clocks count from device tree on i.MX
- Drop the clock count argument from imx_register_uart_clocks()
- Keep the uart clocks on i.MX93 for when earlycon is used
- Fix SPDX comment in i.MX6SLL clocks bindings header
- Drop some unnecessary spaces from i.MX8ULP clocks bindings header
- Add imx_obtain_fixed_of_clock() for allowing to add a clock that is
not configured via devicetree
- Fix the ENET1 gate configuration for i.MX6UL according to the
reference manual
- Add ENET refclock mux support for i.MX6UL
- Add support for USB host/device configuration on Renesas RZ/N1
- Add PLL2 programming support, and CAN-FD clocks on Renesas R-Car V4H
- Add D1 CAN bus gates and resets for Allwinner
- Mark D1 CPUX clock as critical on Allwinner
- Reuse D1 driver for Allwinner R528/T113
- Cleanup sunxi-ng Kconfig
- Fix sunxi-ng kernel-doc issues
- Model Allwinner H3/H5 DRAM clock as fixed clock
- Use .determine_rate() instead of .round_rate() for the dualdiv, mpll,
sclk-div and cpu-dyn-div amlogic clock drivers
- DDR clocks were marked as critical in the proper clock driver for each
AT91 SoC such that drivers/memory/atmel-sdramc.c to be deleted
in the next releases as it only does clock enablement
- Patch to avoid compiling dt-compat.o for all AT91 SoCs as only some of
them may use it
- Support synchronous power_off requests in the qcom GDSC driver for proper
GPU power collapse
- Drop test clocks from various Qualcomm clk drivers
- Update parent references to use clk_parent_data/clk_hw in various Qualcomm clk drivers
- Fixes for the Qualcomm MSM8996 CPU clock controller
- Transition Qualcomm MSM8974 GCC off the externally defined sleep_clk
- Add GDSCs in the global clock controller for Qualcomm QCS404
- The SDCC core clocks on Qualcomm SM6115 are moved to floor_ops
- Programming of clk_dis_wait for GPU CX GDSC on Qualcomm SC7180 and SDM845 are
moved to use the recently introduced properties in the GDSC struct
- Qualcomm's RPMh clock driver gains SM8550 and SA8775P clocks, and the IPA clock
is added on a variety of platforms
- De-duplicate identical clks in Qualcomm SMD RPM clk driver
- Add a few missing clocks across msm8998, msm8992, msm8916, qcs404 to
Qualcomm SDM RPM clk driver
- Various Qualcomm clk drivers use devm_pm_runtime_enable() to simplify
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmP5L68RHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXLxRAAx5C2PBxGnQS5Dqy7yGFBKhoyM6MnD131
4wsWunTzw/fx3MWTRUBu1FYq7ZN38dmeUNeKVNO7QkfIzXe/Htxa5DJp8oJjeFkA
WBlJr/S9pnCKJ1+jGgnEJ3AL8rtssc2nasS8Gj66eu3Zs3dA1MlUz1M0wqiGeD/Y
2crg1nowHurxhsmdUM+6sBRZsCoUz1DxAynqOK25Ip08ygBGYRdkk3aCoyx1bICF
02RwSTQP9pGykgkO7BMkr1pA000mlcawXflzfbY0bA57GKvITBaXh3PhVWwsAnqk
utagT3G2/mQNBF+DVX4Xr5rRqYttDeATiS2D0B31x68Ovjw6kaA28QoY19oIjc1p
D7CabcPnrdK6JFimJL/uEnjIpVnMW2kbTAkTdgGFNKqYUC1O+Mm5ZscKo8RUaiQ7
8XAgJXnGxG7RlZDIMCj69xiZBR0I0wgyx6E7sqMK5j4/v9ezhUMemj4u/sNe3R1n
ih43L2vXjftAhl7jlGQb6eFQjPU/n8kf1rte5miJxX2vFBgWqiCfKHnVSe8KSNU+
gqhar55G9ACnYBjqApmySd/7IFBzmJSyWujXg+fwIu0ZI5ir+ZPr+rQ7RkAUqMij
QCPvJa6XlRIyl6j54E5GaSFO3ZDc3wAZEs3I2N4yhYTDUGv342G3YdwNU+b0ioHB
bDW5Gec9u2s=
=Fle/
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"We have one small patch to the clk core this time around. It fixes a
corner case with the CLK_OPS_PARENT_ENABLE flag combined with
clk_core_is_enabled() where it hangs the system. We'll simply assume
the clk is disabled if the parent is disabled and the flag is set.
Trying to turn on the parent to check the enable state of the clk runs
into system hangs at boot. We let this bake in -next for a couple
weeks to make sure there aren't any more issues because the last
attempt to fix this ran into hangs and had to be reverted.
Note: There were some more patches to the core framework around
sync_state and disabling unused clks, but I asked for that to be
reverted from the qcom PR because it isn't ready and we're still
discussing the best solution on the list.
Outside of the core clk framework, we have the usual collection of clk
driver updates and support for new SoCs (which seems to never stop).
The dirstat is dominated by Qualcomm because they added support for
quite a few SoCs this time around and also migrated quite a few of
their drivers to clk_parent_data. The other big diff is in the
Mediatek clk drivers that saw a significant rework this cycle to
similarly modernize the code, and we'll see that work continue in the
next cycle as well. Nothing really jumps out as scary here, except
that the significant churn in parent data descriptions can have typos
that go unnoticed. More details below.
Core:
- Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
New Drivers:
- Add a new clk-gpr-mux clock type and use it on i.MX6Q to add ENET
ref clocks
- Support for Mediatek MT7891 SoC clks
- Support for many Qualcomm clk controllers:
- QDU1000/QRU1000 global clock controller
- SA8775P global clock controller
- SM8550 TCSR and display clock controller
- SM6350 clock controller
- MSM8996 CBF and APCS clock controllers
Updates:
- Various cleanups and improvements to Mediatek clk drivers to reduce
code size and modernize the drivers
- Support for Versa 5P49V60 clks
- Disable R-Car H3 ES1.*, as it was only available to an internal
development group and needed a lot of quirks and workarounds
- Add PWM, Compare-Match Timer (TIM), USB, SDHI, and eMMC clocks and
resets on Renesas RZ/V2M
- Add display clocks on Renesas R-Car V4H
- Add Camera Receiving Unit (CRU) clocks and resets on Renesas RZ/G2L
- Free the imx_uart_clocks even if imx_register_uart_clocks returns
early
- Get the stdout clocks count from device tree on i.MX
- Drop the clock count argument from imx_register_uart_clocks()
- Keep the uart clocks on i.MX93 for when earlycon is used
- Fix SPDX comment in i.MX6SLL clocks bindings header
- Drop some unnecessary spaces from i.MX8ULP clocks bindings header
- Add imx_obtain_fixed_of_clock() for allowing to add a clock that is
not configured via devicetree
- Fix the ENET1 gate configuration for i.MX6UL according to the
reference manual
- Add ENET refclock mux support for i.MX6UL
- Add support for USB host/device configuration on Renesas RZ/N1
- Add PLL2 programming support, and CAN-FD clocks on Renesas R-Car
V4H
- Add D1 CAN bus gates and resets for Allwinner
- Mark D1 CPUX clock as critical on Allwinner
- Reuse D1 driver for Allwinner R528/T113
- Cleanup sunxi-ng Kconfig
- Fix sunxi-ng kernel-doc issues
- Model Allwinner H3/H5 DRAM clock as fixed clock
- Use .determine_rate() instead of .round_rate() for the dualdiv,
mpll, sclk-div and cpu-dyn-div amlogic clock drivers
- DDR clocks were marked as critical in the proper clock driver for
each AT91 SoC such that drivers/memory/atmel-sdramc.c to be deleted
in the next releases as it only does clock enablement
- Patch to avoid compiling dt-compat.o for all AT91 SoCs as only some
of them may use it
- Support synchronous power_off requests in the qcom GDSC driver for
proper GPU power collapse
- Drop test clocks from various Qualcomm clk drivers
- Update parent references to use clk_parent_data/clk_hw in various
Qualcomm clk drivers
- Fixes for the Qualcomm MSM8996 CPU clock controller
- Transition Qualcomm MSM8974 GCC off the externally defined
sleep_clk
- Add GDSCs in the global clock controller for Qualcomm QCS404
- The SDCC core clocks on Qualcomm SM6115 are moved to floor_ops
- Programming of clk_dis_wait for GPU CX GDSC on Qualcomm SC7180 and
SDM845 are moved to use the recently introduced properties in the
GDSC struct
- Qualcomm's RPMh clock driver gains SM8550 and SA8775P clocks, and
the IPA clock is added on a variety of platforms
- De-duplicate identical clks in Qualcomm SMD RPM clk driver
- Add a few missing clocks across msm8998, msm8992, msm8916, qcs404
to Qualcomm SDM RPM clk driver
- Various Qualcomm clk drivers use devm_pm_runtime_enable() to
simplify"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (228 commits)
clk: qcom: apcs-msm8986: Include bitfield.h for FIELD_PREP
clk: qcom: Revert sync_state based clk_disable_unused
clk: imx: pll14xx: fix recalc_rate for negative kdiv
clk: rs9: Drop unused pin_xin field
MAINTAINERS: clk: imx: Add Peng Fan as reviewer
clk: sprd: Add dependency for SPRD_UMS512_CLK
clk: ralink: fix 'mt7621_gate_is_enabled()' function
clk: mediatek: clk-mtk: Remove unneeded semicolon
dt-bindings: clock: remove stih416 bindings
dt-bindings: clock: add loongson-2 clock
dt-bindings: clock: add loongson-2 clock include file
clk: imx: fix compile testing imxrt1050
clk: Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
clk: imx: set imx_clk_gpr_mux_ops storage-class-specifier to static
clk: renesas: rcar-gen3: Disable R-Car H3 ES1.*
dt-bindings: clock: Merge qcom,gpucc-sm8350 into qcom,gpucc.yaml
clk: qcom: gpucc-sdm845: fix clk_dis_wait being programmed for CX GDSC
clk: qcom: gpucc-sc7180: fix clk_dis_wait being programmed for CX GDSC
dt-bindings: clock: qcom,sa8775p-gcc: add the power-domains property
clk: qcom: cpu-8996: add missing cputype include
...
Some of the log messages were printing just the fwnode name. While it's
short, it's not always uniquely identifiable in system. So print the
full path and name to make debugging easier.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230225065443.278284-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink can sometimes try to create a device link with the consumer
and supplier as the same device. These attempts will fail (correctly),
but are harmless. So, avoid printing an error for these cases. Also, add
more detail to the error message.
Fixes: 3fb16866b5 ("driver core: fw_devlink: Make cycle detection more robust")
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230225064148.274376-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling soc_device_match() from early_initcall(), bus_kset is still
NULL, causing a crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
...
Call trace:
__lock_acquire+0x530/0x20f0
lock_acquire.part.0+0xc8/0x210
lock_acquire+0x64/0x80
_raw_spin_lock+0x4c/0x60
bus_to_subsys+0x24/0xac
bus_for_each_dev+0x30/0xcc
soc_device_match+0x4c/0xe0
r8a7795_sysc_init+0x18/0x60
rcar_sysc_pd_init+0xb0/0x33c
do_one_initcall+0x128/0x2bc
Before, bus_for_each_dev() handled this gracefully by checking that
the back-pointer to the private structure was valid.
Fix this by adding a NULL check for bus_kset to bus_to_subsys().
Fixes: 83b9148df2 ("driver core: bus: bus iterator cleanups")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/0a92979f6e790737544638e8a4c19b0564e660a2.1676983596.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work falls
into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be moved
into read-only memory (i.e. const) The recent work with Rust has
pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only making
things safer overall. This is the contuation of that work (started
last release with kobject changes) in moving struct bus_type to be
constant. We didn't quite make it for this release, but the
remaining patches will be finished up for the release after this
one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
6oeFOjD3JDju3cQsfGgd
=Su6W
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work
falls into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be
moved into read-only memory (i.e. const) The recent work with Rust
has pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only
making things safer overall. This is the contuation of that work
(started last release with kobject changes) in moving struct
bus_type to be constant. We didn't quite make it for this release,
but the remaining patches will be finished up for the release after
this one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems"
[ Geert Uytterhoeven points out that that last sentence isn't true, and
that there's a pending report that has a fix that is queued up - Linus ]
* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
OPP: fix error checking in opp_migrate_dentry()
debugfs: update comment of debugfs_rename()
i3c: fix device.h kernel-doc warnings
dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
Revert "driver core: add error handling for devtmpfs_create_node()"
Revert "devtmpfs: add debug info to handle()"
Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
driver core: cpu: don't hand-override the uevent bus_type callback.
devtmpfs: remove return value of devtmpfs_delete_node()
devtmpfs: add debug info to handle()
driver core: add error handling for devtmpfs_create_node()
driver core: bus: update my copyright notice
driver core: bus: add bus_get_dev_root() function
driver core: bus: constify bus_unregister()
driver core: bus: constify some internal functions
driver core: bus: constify bus_get_kset()
driver core: bus: constify bus_register/unregister_notifier()
driver core: remove private pointer from struct bus_type
...
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in ths
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
=MlGs
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
A quiet release for regmap, we've seen several cleanups, an update for a
change in the MDIO APIs and one small fix.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmPzatMACgkQJNaLcl1U
h9BJjwf/Z8lGWykBdFhQw2vvauWMjm4cBCpLONK70mXOVouApepVW8xQAXT8sQ86
ADyeSGJmB1P+b/TyO4q7gqF0qBKE5T9gOhjhmp++robCEHXzX3/h2YtobTWpFdzO
34UOAi0PBIGhnEwgK9HWb/rYMS3DBy6oKd6IxVK1TX6LXxjW3OoTuxd+lNjDE7nM
eGMnyMF1TcxAsUdlTMiyUgt0e/xxir4UOvmzYz7rkbVQf6AyrmOUyMt+iTuW7jQ6
DUUwpAfVilEN5K6Drc+4Lv+ydCw4PATXLdYX3cBQygeptvQ6PCxgulXvT7OcQMph
rCirnu9gcYcKLFcFI5LhxQOZ7VAcBQ==
=tjm7
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A quiet release for regmap: we've seen several cleanups, an update for
a change in the MDIO APIs and one small fix"
* tag 'regmap-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Remove unused mask_invert flag
regmap-irq: Remove unused type_invert flag
regmap: Reorder fields in 'struct regmap_bus' to save some memory
regmap: apply reg_base and reg_downshift for single register ops
Core
----
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used
to describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols
---------
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP
path manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF
---
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key
to better support decap on GRE tunnel devices not operating
in collect metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols
by livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter
---------
- Remove the CLUSTERIP target. It has been marked as obsolete
for years, and we still have WARN splats wrt. races of
the out-of-band /proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to
the existing 'delete' commands, but do not return an error if
the referenced object (set, chain, rule...) did not exist.
Driver API
----------
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into multiple
files, drop some of the unnecessarily granular locks and factor out
common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless Extensions
for Wi-Fi 7 devices at all. Everyone should switch to using nl80211
interface instead.
- Improve the CAN bit timing configuration. Use extack to return error
messages directly to user space, update the SJW handling, including
the definition of a new default value that will benefit CAN-FD
controllers, by increasing their oscillator tolerance.
New hardware / drivers
----------------------
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers
-------
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- enetc: support XDP_REDIRECT for XDP non-linear buffers
- enetc: improve reconfig, avoid link flap and waiting for idle
- enetc: support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP1VIYACgkQMUZtbf5S
IrvsChAApz0rNL/sPKxXTEfxZ1tN7D3sYxYKQPomxvl5BV+MvicrLddJy3KmzEFK
nnJNO3nuRNuH422JQ/ylZ4mGX1opa6+5QJb0UINImXUI7Fm8HHBIuPGkv7d5CheZ
7JexFqjPJXUy9nPyh1Rra+IA9AcRd2U7jeGEZR38wb99bHJQj5Bzdk20WArEB0el
n44aqg49LXH71bSeXRz77x5SjkwVtYiccQxLcnmTbjLU2xVraLvI2J+wAhHnVXWW
9lrU1+V4Ex2Xcd1xR0L0cHeK+meP1TrPRAeF+JDpVI3a/zJiE7cZjfHdG/jH5xWl
leZJqghVozrZQNtewWWO7XhUFhMDgFu3W/1vNLjSHPZEqaz1JpM67J1+ql6s63l4
LMWoXbcYZz+SL9ZRCoPkbGue/5fKSHv8/Jl9Sh58+eTS+c/zgN8uFGRNFXLX1+EP
n8uvt985PxMd6x1+dHumhOUzxnY4Sfi1vjitSunTsNFQ3Cmp4SO0IfBVJWfLUCuC
xz5hbJGJJbSpvUsO+HWyCg83E5OWghRE/Onpt2jsQSZCrO9HDg4FRTEf3WAMgaqc
edb5KfbRZPTJQM08gWdluXzSk1nw3FNP2tXW4XlgUrEbjb+fOk0V9dQg2gyYTxQ1
Nhvn8ZQPi6/GMMELHAIPGmmW1allyOGiAzGlQsv8EmL+OFM6WDI=
=xXhC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya).
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang).
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney).
- Enable thermal cooling for Tegra194 (Yi-Wei Wang).
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss).
- Various dt binding updates for qcom-cpufreq-nvmem and opp-v2-kryo-cpu
(Christian Marangi).
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh).
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König).
- Make the TEO cpuidle governor check CPU utilization in order to refine
idle state selection (Kajetan Puchalski).
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in that
driver with arch_cpu_idle() to allow MWAIT to be used (Li RongQing).
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann).
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh).
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki).
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski).
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald).
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap).
- Fix possible name leak in powercap_register_zone() (Yang Yingliang).
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui).
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada).
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui).
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman).
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring).
- Fix error checking in opp_migrate_dentry() (Qi Zheng).
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio).
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler).
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmPuJfMSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/5kQAJNOVImLEPLerLP8xufw30//LuDU5Gi0
STsyDOMql/I2MpkeqeCcgrSbpy6NlEglOvg16gfpQ3qqTCLF9ypENxs9E5BGGvW0
aEdCzvaoqmvi9PCr/jmj0EPP70/U+rIX5m/k0QdjLh9x0aLoAEe3uRJTfR9QVqXf
I7JX0N9kjKi7YxpA5DlkHrS7J7GPPiWlesJ3p4wXuHMo3jf+6fgkoPFt8yRrGWeh
AHzGT2BLrsy7aAUjGZB65Qx9q3fnSXMmXOjmn0Xh2njQah+zRZDwrNzwoY2HTLL/
KQ6/Ww16USYRZtCS1fmGwAj9I+ddq6AOvhPCMn0vLXXmKVAMUrVVWnQS/0+vpm9y
suUMK9Tndkgxd1vjby2246ThJn27uDd/ERFan4ouQo2j22uICY+SDo3osj2hMXka
wq4zthXkY8KgjZ+MuXnZxPhcOvo8KRvfxAU0fy5efQnSkbtwY9UlMvjPBMBHm/RA
21/6kjQNtq5vMmI37oC8DH+oPrRQ7sUKuY7HNqwO9P3QNKWVmNe7cF5UtXXxME7Q
ULvP1d+u+TNNdHFLryPwCSzBO34wQEccdRZBjalZ8tBe6JiDWUFHC3giSURZSuzZ
GDvzVaNX6PkgToyv4inBTB8lTp6pAuUjaWNvNJzVvUXiEKHB0ihzg5vpJW5NdwlH
15Tn8cjH7pp0
=lZLx
-----END PGP SIGNATURE-----
Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add EPP support to the AMD P-state cpufreq driver, add support
for new platforms to the Intel RAPL power capping driver, intel_idle
and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
drop the custom cpufreq driver for loongson1 that is not necessary any
more (and the corresponding cpufreq platform device), fix assorted
issues and clean up code.
Specifics:
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya)
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang)
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney)
- Enable thermal cooling for Tegra194 (Yi-Wei Wang)
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)
- Various dt binding updates for qcom-cpufreq-nvmem and
opp-v2-kryo-cpu (Christian Marangi)
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh)
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König)
- Make the TEO cpuidle governor check CPU utilization in order to
refine idle state selection (Kajetan Puchalski)
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in
that driver with arch_cpu_idle() to allow MWAIT to be used (Li
RongQing)
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy)
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann)
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh)
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki)
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski)
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald)
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap)
- Fix possible name leak in powercap_register_zone() (Yang Yingliang)
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui)
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada)
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui)
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman)
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring)
- Fix error checking in opp_migrate_dentry() (Qi Zheng)
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio)
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler)
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap)"
* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
Documentation: amd-pstate: disambiguate user space sections
cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
PM: Add EXPORT macros for exporting PM functions
cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
MIPS: loongson32: Drop obsolete cpufreq platform device
powercap: intel_rapl: Fix handling for large time window
cpuidle: driver: Update microsecond values of state parameters as needed
cpuidle: sysfs: make kobj_type structures constant
cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
PM: EM: fix memory leak with using debugfs_lookup()
PM: domains: fix memory leak with using debugfs_lookup()
cpufreq: Make kobj_type structure constant
cpufreq: davinci: Fix clk use after free
cpufreq: amd-pstate: avoid uninitialized variable use
cpufreq: Make cpufreq_unregister_driver() return void
OPP: fix error checking in opp_migrate_dentry()
dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
...
This pull request contains the following branches:
doc.2023.01.05a: Documentation updates.
fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably:
o Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number
of callbacks.
o Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized.
o Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have doen this for mnay years.)
o Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled,
so this should not (yet) affect production use cases.)
kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of
polled grace periods, thus reducing memory footprint by almost
two orders of magnitude, admittedly on a microbenchmark.
This series also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs
where kfree_rcu(p), which can block, was typed instead of the
intended kfree_rcu(p, rh).
srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that
causes SRCU to fail when booted on a system with a non-zero boot
CPU. This surprising situation actually happens for kdump kernels
on the powerpc architecture. It also adds an srcu_down_read()
and srcu_up_read(), which act like srcu_read_lock() and
srcu_read_unlock(), but allow an SRCU read-side critical section
to be handed off from one task to another.
srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option.
There are a few more commits that are not yet acked or pulled
into maintainer trees, and these will be in a pull request for
a later merge window.
tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes:
o A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang.
o A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can result
in a too-short grace period.
o A race between shrinking RCU tasks down to a single callback list
and queuing a new callback to some other CPU, but where that
queuing is delayed for more than an RCU grace period. This can
result in that callback being stranded on the non-boot CPU.
torture.2023.01.05a: Torture-test updates and fixes.
torturescript.2023.01.03a: Torture-test scripting updates and fixes.
stall.2023.01.09a: Provide additional RCU CPU stall-warning information
in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and
restore the full five-minute timeout limit for expedited RCU
CPU stall warnings.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPq29UTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jAhVEACEAKJY1VJ9IUqz7CwzAYkzgRJfiygh
oDUXmlqtm6ew9pr2GdLUVCVsUSldzBc0K7Djb/G1niv4JPs+v7YwupIV33+UbStU
Qxt6ztTdxc4lKospLm1+2vF9ZdzVEmiP4wVCc4iDarv5FM3FpWSTNc8+L7qmlC+X
myjv+GqMTxkXZBvYJOgJGFjDwN8noTd7Fr3mCCVLFm3PXMDa7tcwD6HRP5AqD2N8
qC5M6LEqepKVGmz0mYMLlSN1GPaqIsEcexIFEazRsPEivPh/iafyQCQ/cqxwhXmV
vEt7u+dXGZT/oiDq9cJ+/XRDS2RyKIS6dUE14TiiHolDCn1ONESahfA/gXWKykC2
BaGPfjWXrWv/hwbeZ+8xEdkAvTIV92tGpXir9Fby1Z5PjP3balvrnn6hs5AnQBJb
NdhRPLzy/dCnEF+CweAYYm1qvTo8cd5nyiNwBZHn7rEAIu3Axrecag1rhFl3AJ07
cpVMQXZtkQVa2X8aIRTUC+ijX6yIqNaHlu0HqNXgIUTDzL4nv5cMjOMzpNQP9/dZ
FwAMZYNiOk9IlMiKJ8ZiVcxeiA8ouIBlkYM3k6vGrmiONZ7a/EV/mSHoJqI8bvqr
AxUIJ2Ayhg3bxPboL5oKgCiLql0A7ZVvz6quX6McitWGMgaSvel1fDzT3TnZd41e
4AFBFd/+VedUGg==
=bBYK
-----END PGP SIGNATURE-----
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably:
- Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number of
callbacks
- Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized
- Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years)
- Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled, so
this should not (yet) affect production use cases)
- Make kfree_rcu() and friends take advantage of polled grace periods,
thus reducing memory footprint by almost two orders of magnitude,
admittedly on a microbenchmark
This also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs where
kfree_rcu(p), which can block, was typed instead of the intended
kfree_rcu(p, rh)
- SRCU updates, perhaps most notably fixing a bug that causes SRCU to
fail when booted on a system with a non-zero boot CPU. This
surprising situation actually happens for kdump kernels on the
powerpc architecture
This also adds an srcu_down_read() and srcu_up_read(), which act like
srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
critical section to be handed off from one task to another
- Clean up the now-useless SRCU Kconfig option
There are a few more commits that are not yet acked or pulled into
maintainer trees, and these will be in a pull request for a later
merge window
- RCU-tasks updates, perhaps most notably these fixes:
- A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang
- A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can
result in a too-short grace period
- A race between shrinking RCU tasks down to a single callback
list and queuing a new callback to some other CPU, but where
that queuing is delayed for more than an RCU grace period. This
can result in that callback being stranded on the non-boot CPU
- Torture-test updates and fixes
- Torture-test scripting updates and fixes
- Provide additional RCU CPU stall-warning information in kernels built
with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
timeout limit for expedited RCU CPU stall warnings
* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
kernel/notifier: Remove CONFIG_SRCU
init: Remove "select SRCU"
fs/quota: Remove "select SRCU"
fs/notify: Remove "select SRCU"
fs/btrfs: Remove "select SRCU"
fs: Remove CONFIG_SRCU
drivers/pci/controller: Remove "select SRCU"
drivers/net: Remove "select SRCU"
drivers/md: Remove "select SRCU"
drivers/hwtracing/stm: Remove "select SRCU"
drivers/dax: Remove "select SRCU"
drivers/base: Remove CONFIG_SRCU
rcu: Disable laziness if lazy-tracking says so
rcu: Track laziness during boot and suspend
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
rcu: Align the output of RCU CPU stall warning messages
rcu: Add RCU stall diagnosis information
sched: Add helper nr_context_switches_cpu()
...
- Improve the scalability of the CFS bandwidth unthrottling logic
with large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with
the generic scheduler code. Add __cpuidle methods as noinstr to
objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS,
to query previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period,
to improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- ... Misc other cleanups, fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmPzbJwRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iIvA//ZcEaB8Z6ChLRQjM+bsaudKJu3pdLQbPK
iYbP8Da+LsAfxbEfYuGV3m+jIp0LlBOtsI/EezxQrXV+V7FvNyAX9Y00eEu/zlj8
7Jn3LMy/DBYTwH7LwVdcU0MyIVI8ZPc6WNnkx0LOtGZn8n+qfHPSDzcP3CW+a5AV
UvllPYpYyEmsX0Eby7CF4Ue8mSmbViw/xR3rNr8ZSve0c25XzKabw8O9kE3jiHxP
d/zERJoAYeDyYUEuZqhfn5dTlB4an4IjNEkAfRE5SQ09RA8Gkxsa5Ar8gob9e9M1
eQsdd4/bdhnrkM8L5qDZczqmgCTZ2bukQrxkBXhRDhLgoFxwAn77b+2ZjmIW3Lae
AyGqRcDSg1q2oxaYm5ZiuO/t26aDOZu9vPHyHRDGt95EGbZlrp+GgeePyfCigJYz
UmPdZAAcHdSymnnnlcvdG37WVvaVkpgWZzd8LbtBi23QR+Zc4WQ2IlgnUS5WKNNf
VOBcAcP6E1IslDotZDQCc2dPFFQoQQEssVooyUc5oMytm7BsvxXLOeHG+Ncu/8uc
H+U8Qn8jnqTxJbC5hkWQIJlhVKCq2FJrHxxySYTKROfUNcDgCmxboFeAcXTCIU1K
T0S+sdoTS/CvtLklRkG0j6B8N4N98mOd9cFwUV3tX+/gMLMep3hCQs5L76JagvC5
skkQXoONNaM=
=l1nN
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Improve the scalability of the CFS bandwidth unthrottling logic with
large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with the
generic scheduler code. Add __cpuidle methods as noinstr to objtool's
noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query
previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period, to
improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- Misc other cleanups, fixes
* tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
sched/rt: pick_next_rt_entity(): check list_entry
sched/deadline: Add more reschedule cases to prio_changed_dl()
sched/fair: sanitize vruntime of entity being placed
sched/fair: Remove capacity inversion detection
sched/fair: unlink misfit task from cpu overutilized
objtool: mem*() are not uaccess safe
cpuidle: Fix poll_idle() noinstr annotation
sched/clock: Make local_clock() noinstr
sched/clock/x86: Mark sched_clock() noinstr
x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read()
x86/atomics: Always inline arch_atomic64*()
cpuidle: tracing, preempt: Squash _rcuidle tracing
cpuidle: tracing: Warn about !rcu_is_watching()
cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
cpuidle: drivers: firmware: psci: Dont instrument suspend code
KVM: selftests: Fix build of rseq test
exit: Detect and fix irq disabled state in oops
cpuidle, arm64: Fix the ARM64 cpuidle logic
cpuidle: mvebu: Fix duplicate flags assignment
sched/fair: Limit sched slice duration
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NlQAKCRCRxhvAZXjc
orOaAP9i2h3OJy95nO2Fpde0Bt2UT+oulKCCcGlvXJ8/+TQpyQD/ZQq47gFQ0EAz
Br5NxeyGeecAb0lHpFz+CpLGsxMrMwQ=
=+BG5
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs idmapping updates from Christian Brauner:
- Last cycle we introduced the dedicated struct mnt_idmap type for
mount idmapping and the required infrastucture in 256c8aed2b ("fs:
introduce dedicated idmap type for mounts"). As promised in last
cycle's pull request message this converts everything to rely on
struct mnt_idmap.
Currently we still pass around the plain namespace that was attached
to a mount. This is in general pretty convenient but it makes it easy
to conflate namespaces that are relevant on the filesystem with
namespaces that are relevant on the mount level. Especially for
non-vfs developers without detailed knowledge in this area this was a
potential source for bugs.
This finishes the conversion. Instead of passing the plain namespace
around this updates all places that currently take a pointer to a
mnt_userns with a pointer to struct mnt_idmap.
Now that the conversion is done all helpers down to the really
low-level helpers only accept a struct mnt_idmap argument instead of
two namespace arguments.
Conflating mount and other idmappings will now cause the compiler to
complain loudly thus eliminating the possibility of any bugs. This
makes it impossible for filesystem developers to mix up mount and
filesystem idmappings as they are two distinct types and require
distinct helpers that cannot be used interchangeably.
Everything associated with struct mnt_idmap is moved into a single
separate file. With that change no code can poke around in struct
mnt_idmap. It can only be interacted with through dedicated helpers.
That means all filesystems are and all of the vfs is completely
oblivious to the actual implementation of idmappings.
We are now also able to extend struct mnt_idmap as we see fit. For
example, we can decouple it completely from namespaces for users that
don't require or don't want to use them at all. We can also extend
the concept of idmappings so we can cover filesystem specific
requirements.
In combination with the vfs{g,u}id_t work we finished in v6.2 this
makes this feature substantially more robust and thus difficult to
implement wrong by a given filesystem and also protects the vfs.
- Enable idmapped mounts for tmpfs and fulfill a longstanding request.
A long-standing request from users had been to make it possible to
create idmapped mounts for tmpfs. For example, to share the host's
tmpfs mount between multiple sandboxes. This is a prerequisite for
some advanced Kubernetes cases. Systemd also has a range of use-cases
to increase service isolation. And there are more users of this.
However, with all of the other work going on this was way down on the
priority list but luckily someone other than ourselves picked this
up.
As usual the patch is tiny as all the infrastructure work had been
done multiple kernel releases ago. In addition to all the tests that
we already have I requested that Rodrigo add a dedicated tmpfs
testsuite for idmapped mounts to xfstests. It is to be included into
xfstests during the v6.3 development cycle. This should add a slew of
additional tests.
* tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits)
shmem: support idmapped mounts for tmpfs
fs: move mnt_idmap
fs: port vfs{g,u}id helpers to mnt_idmap
fs: port fs{g,u}id helpers to mnt_idmap
fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
fs: port i_{g,u}id_{needs_}update() to mnt_idmap
quota: port to mnt_idmap
fs: port privilege checking helpers to mnt_idmap
fs: port inode_owner_or_capable() to mnt_idmap
fs: port inode_init_owner() to mnt_idmap
fs: port acl to mnt_idmap
fs: port xattr to mnt_idmap
fs: port ->permission() to pass mnt_idmap
fs: port ->fileattr_set() to pass mnt_idmap
fs: port ->set_acl() to pass mnt_idmap
fs: port ->get_acl() to pass mnt_idmap
fs: port ->tmpfile() to pass mnt_idmap
fs: port ->rename() to pass mnt_idmap
fs: port ->mknod() to pass mnt_idmap
fs: port ->mkdir() to pass mnt_idmap
...
Merge updates of the powercap framework, generic PM domains, Energy
Model and operating performance points for 6.3-rc1:
- Fix possible name leak in powercap_register_zone() (Yang Yingliang).
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui).
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada).
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui).
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman).
- Add missing 'cache-unified' property in example for kryo OPP bindings
(Rob Herring).
- Fix error checking in opp_migrate_dentry() (Qi Zheng).
- Remove "select SRCU" (Paul E. McKenney).
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio).
* powercap:
powercap: intel_rapl: Fix handling for large time window
powercap: idle_inject: Support 100% idle injection
powercap: intel_rapl: add support for Emerald Rapids
powercap: intel_rapl: add support for Meteor Lake
powercap: fix possible name leak in powercap_register_zone()
* pm-domains:
PM: domains: fix memory leak with using debugfs_lookup()
* pm-em:
PM: EM: fix memory leak with using debugfs_lookup()
* pm-opp:
OPP: fix error checking in opp_migrate_dentry()
dt-bindings: opp: v2-qcom-level: Let qcom,opp-fuse-level be a 2-long array
drivers/opp: Remove "select SRCU"
dt-bindings: opp: opp-v2-kryo-cpu: Add missing 'cache-unified' property in example
Merge cpuidle updates, PM core updates and changes related to system
sleep handling for 6.3-rc1:
- Make the TEO cpuidle governor check CPU utilization in order to refine
idle state selection (Kajetan Puchalski).
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in that
driver with arch_cpu_idle() which allows MWAIT to be used (Li
RongQing).
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann).
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh).
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki).
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski).
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald).
- Drop "select SRCU" from system sleep Kconfig (Paul E. McKenney).
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap).
* pm-cpuidle:
cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
cpuidle: driver: Update microsecond values of state parameters as needed
cpuidle: sysfs: make kobj_type structures constant
cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
intel_idle: add Emerald Rapids Xeon support
cpuidle-haltpoll: Replace default_idle() with arch_cpu_idle()
cpuidle-haltpoll: select haltpoll governor
cpuidle: teo: Introduce util-awareness
cpuidle: teo: Optionally skip polling states in teo_find_shallower_state()
* pm-core:
PM: Add EXPORT macros for exporting PM functions
PM: runtime: Document that force_suspend() is incompatible with SMART_SUSPEND
* pm-sleep:
PM: sleep: Remove "select SRCU"
PM: hibernate: swap: don't use /** for non-kernel-doc comments
For some reason, the drivers/base/class.c file still had the "old style"
of exports, at the end of the file. Move the exports to the proper
location, right after the function, to be correct.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230214144117.158956-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 31b4b6730f as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 90a9d5ff22 as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 9d3fe6aa6b as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of having to change the uevent bus_type callback by hand at
runtime, set it at build time based on the build configuration options,
making this much simpler to maintain and understand (and allow to make
the structure constant.)
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230210102408.1083177-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The only caller of device_del() does not check the return value. And
there's nothing we can do when cleaning things up on a remove path.
Let's make it a void function.
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230210095444.4067307-4-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Because handle() is the core function for processing devtmpfs requests,
Let's add some debug info in handle() to help users know why failed.
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230210095444.4067307-3-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's been some work done recently to the drivers/base/bus.c file so
update the copyright notice in it to make those who track those types of
things have an easier job.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230210091318.733561-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of poking around in the struct bus_type directly for the
dev_root pointer, provide a function to return it properly reference
counted, if it is present in the bus. This will be needed to move the
pointer out of struct bus_type in the future.
Use the function in the driver core code at the same time it is
introduced to verify that it works properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230209093556.19132-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The functions add_probe_files() and remove_probe_files() should be
taking a const * to bus_type, not just a *, so fix that up. These
functions should really be removed entirely and an attribute group used
instead, but for now, make this change so that other const work can
continue.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-21-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bus_register_notifier() and bus_unregister_notifier() functions
should be taking a const * to bus_type, not just a * so fix that up.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-19-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the driver code has been refactored to not rely on the pointer
from a struct bus_type to the private structure it can be safely removed
from the structure entirely.
This will allow most bus_type structures to now be marked as const.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-18-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A local function to the driver core to determine if a bus really is
registered with the kernel or not. To be used only by the driver core
code, as part of the driver registration path as it's not really "safe"
because the bus could be unregistered instantly after being called.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function really is a bus function, not a driver one, so move it
from driver.c to bus.c so that we can clean up some internal bus logic
easier.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-15-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_sort_breadthfirst() function to use bus_to_subsys() and
not use the back-pointer to the private structure.
This also allows us to get rid of bus_get_device_klist() which was only
being used by this one internal function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-14-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_for_each_dev(), bus_find_device, and bus_for_each_drv()
functions to use bus_to_subsys() and not use the back-pointer to the
private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-13-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_add_driver() and bus_remove_driver() functions to use
bus_to_subsys() and not use the back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-12-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_register_notifier() and bus_unregister_notifier() public
functions to use bus_to_subsys() and not use the back-pointer to the
private structure as well as the bus_notify() function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_get_kset() function function to use bus_to_subsys() and
not use the back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-10-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the subsys_interface_register and subsys_interface_unregister()
functions to use bus_to_subsys() and not use the back-pointer to the
private structure.
This also requires changing the parameters on subsys_dev_iter_init() to
iterate over the list properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-9-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_register() and bus_unregister() functions to use
bus_to_subsys() and not use the back-pointer to the private structure.
Because bus_add_groups() and bus_remove_groups() were only called in one
place, remove those one-line-wrapper functions and call the real sysfs
group function where it is needed instead, saving another layer of
indirection.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_add_device(), bus_probe_device(), and
bus_remove_device() functions to use bus_to_subsys() and not use the
back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the drivers_autoprobe show/store and uevent sysfs callbacks to
use bus_to_subsys() and not use the back-pointer to the private
structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bus_create_file() and bus_remove_file() can be made to take a constant
bus pointer, as it should not be modifying anything in the bus
structure. Make this change and move the functions to use the internal
subsys_get/put() logic as well, to prevent the use of the back-pointer
in struct bus_type.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All of the bus find and iterator functions do not modify the struct
bus_type passed to them, so mark them as constant to enforce this rule.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the quest to make 'struct bus_type' constant and in read-only memory,
we need to stop using the private pointer to the subsys_private
structure. First step in doing this is to create a helper function that
turns a 'struct bus_type' into 'struct subsys_private' called
bus_to_subsys().
bus_to_subsys() walks the list of registered busses in the system and
finds the matching one based on the pointer to the bus_type itself. As
this is a short list, and this function is not on any fast path, it
should not be noticable.
Implement bus_get() and bus_put() using this new helper function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need to control the reference count of the subsys private structure
instead of directly manipulating the kset reference count of it, so wrap
that logic up in a subsys_get() and subsys_put() function to make it more
obvious as to what is happening.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink could only detect a single and simple cycle because it relied
mainly on device link cycle detection code that only checked for cycles
between devices. The expectation was that the firmware wouldn't have
complicated cycles and multiple cycles between devices. That expectation
has been proven to be wrong.
For example, fw_devlink could handle:
+-+ +-+
|A+------> |B+
+-+ +++
^ |
| |
+----------+
But it couldn't handle even something as "simple" as:
+---------------------+
| |
v |
+-+ +-+ +++
|A+------> |B+------> |C|
+-+ +++ +-+
^ |
| |
+----------+
But firmware has even more complicated cycles like:
+---------------------+
| |
v |
+-+ +---+ +++
+--+A+------>| B +-----> |C|<--+
| +-+ ++--+ +++ |
| ^ | ^ | |
| | | | | |
| +---------+ +---------+ |
| |
+------------------------------+
And this is without including parent child dependencies or nodes in the
cycle that are just firmware nodes that'll never have a struct device
created for them.
The proper way to treat these devices it to not force any probe ordering
between them, while still enforce dependencies between node in the
cycles (A, B and C) and their consumers.
So this patch goes all out and just deals with all types of cycles. It
does this by:
1. Following dependencies across device links, parent-child and fwnode
links.
2. When it find cycles, it mark the device links and fwnode links as
such instead of just deleting them or making the indistinguishable
from proxy SYNC_STATE_ONLY device links.
This way, when new nodes get added, we can immediately find and mark any
new cycles whether the new node is a device or firmware node.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-9-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Consolidate the code that computes the flags to be used when creating a
device link from a fwnode link.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-8-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To improve detection and handling of dependency cycles, we need to be
able to mark fwnode links as being part of cycles. fwnode links marked
as being part of a cycle should not block their consumers from probing.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-7-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
purposes:
1. To allow a parent device to proxy its child device's dependency on a
supplier so that the supplier doesn't get its sync_state() callback
before the child device/consumer can be added and probed. In this
usage scenario, we need to ignore cycles for ensure correctness of
sync_state() callbacks.
2. When there are dependency cycles in firmware, we don't know which of
those dependencies are valid. So, we have to ignore them all wrt
probe ordering while still making sure the sync_state() callbacks
come correctly.
However, when detecting dependency cycles, there can be multiple
dependency cycles between two devices that we need to detect. For
example:
A -> B -> A and A -> C -> B -> A.
To detect multiple cycles correct, we need to be able to differentiate
DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
To allow this differentiation, add a DL_FLAG_CYCLE that can be use to
mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
dependency cycles.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-6-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink shouldn't defer the probe of a device to wait on a supplier
that'll never have a struct device or will never be probed by a driver.
We currently check if a supplier falls into this category, but don't
check its ancestors. We need to check the ancestors too because if the
ancestor will never probe, then the supplier will never probe either.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a device X is bound successfully to a driver, if it has a child
firmware node Y that doesn't have a struct device created by then, we
delete fwnode links where the child firmware node Y is the supplier. We
did this to avoid blocking the consumers of the child firmware node Y
from deferring probe indefinitely.
While that a step in the right direction, it's better to make the
consumers of the child firmware node Y to be consumers of the device X
because device X is probably implementing whatever functionality is
represented by child firmware node Y. By doing this, we capture the
device dependencies more accurately and ensure better
probe/suspend/resume ordering.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definitions to prevent
modification at runtime.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20230204-kobj_type-driver-core-v1-1-b9f809419f2c@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patch series "Introduce per NUMA node memory error statistics", v2.
Background
==========
In the RFC for Kernel Support of Memory Error Detection [1], one advantage
of software-based scanning over hardware patrol scrubber is the ability to
make statistics visible to system administrators. The statistics include
2 categories:
* Memory error statistics, for example, how many memory error are
encountered, how many of them are recovered by the kernel. Note these
memory errors are non-fatal to kernel: during the machine check
exception (MCE) handling kernel already classified MCE's severity to be
unnecessary to panic (but either action required or optional).
* Scanner statistics, for example how many times the scanner have fully
scanned a NUMA node, how many errors are first detected by the scanner.
The memory error statistics are useful to userspace and actually not
specific to scanner detected memory errors, and are the focus of this
patchset.
Motivation
==========
Memory error stats are important to userspace but insufficient in kernel
today. Datacenter administrators can better monitor a machine's memory
health with the visible stats. For example, while memory errors are
inevitable on servers with 10+ TB memory, starting server maintenance when
there are only 1~2 recovered memory errors could be overreacting; in cloud
production environment maintenance usually means live migrate all the
workload running on the server and this usually causes nontrivial
disruption to the customer. Providing insight into the scope of memory
errors on a system helps to determine the appropriate follow-up action.
In addition, the kernel's existing memory error stats need to be
standardized so that userspace can reliably count on their usefulness.
Today kernel provides following memory error info to userspace, but they
are not sufficient or have disadvantages:
* HardwareCorrupted in /proc/meminfo: number of bytes poisoned in total,
not per NUMA node stats though
* ras:memory_failure_event: only available after explicitly enabled
* /dev/mcelog provides many useful info about the MCEs, but doesn't
capture how memory_failure recovered memory MCEs
* kernel logs: userspace needs to process log text
Exposing memory error stats is also a good start for the in-kernel memory
error detector. Today the data source of memory error stats are either
direct memory error consumption, or hardware patrol scrubber detection
(either signaled as UCNA or SRAO). Once in-kernel memory scanner is
implemented, it will be the main source as it is usually configured to
scan memory DIMMs constantly and faster than hardware patrol scrubber.
How Implemented
===============
As Naoya pointed out [2], exposing memory error statistics to userspace is
useful independent of software or hardware scanner. Therefore we
implement the memory error statistics independent of the in-kernel memory
error detector. It exposes the following per NUMA node memory error
counters:
/sys/devices/system/node/node${X}/memory_failure/total
/sys/devices/system/node/node${X}/memory_failure/recovered
/sys/devices/system/node/node${X}/memory_failure/ignored
/sys/devices/system/node/node${X}/memory_failure/failed
/sys/devices/system/node/node${X}/memory_failure/delayed
These counters describe how many raw pages are poisoned and after the
attempted recoveries by the kernel, their resolutions: how many are
recovered, ignored, failed, or delayed respectively. This approach can be
easier to extend for future use cases than /proc/meminfo, trace event, and
log. The following math holds for the statistics:
* total = recovered + ignored + failed + delayed
These memory error stats are reset during machine boot.
The 1st commit introduces these sysfs entries. The 2nd commit populates
memory error stats every time memory_failure attempts memory error
recovery. The 3rd commit adds documentations for introduced stats.
[1] https://lore.kernel.org/linux-mm/7E670362-C29E-4626-B546-26530D54F937@gmail.com/T/#mc22959244f5388891c523882e61163c6e4d703af
[2] https://lore.kernel.org/linux-mm/7E670362-C29E-4626-B546-26530D54F937@gmail.com/T/#m52d8d7a333d8536bd7ce74253298858b1c0c0ac6
This patch (of 3):
Today kernel provides following memory error info to userspace, but each
has its own disadvantage
* HardwareCorrupted in /proc/meminfo: number of bytes poisoned in total,
not per NUMA node stats though
* ras:memory_failure_event: only available after explicitly enabled
* /dev/mcelog provides many useful info about the MCEs, but
doesn't capture how memory_failure recovered memory MCEs
* kernel logs: userspace needs to process log text
Exposes per NUMA node memory error stats as sysfs entries:
/sys/devices/system/node/node${X}/memory_failure/total
/sys/devices/system/node/node${X}/memory_failure/recovered
/sys/devices/system/node/node${X}/memory_failure/ignored
/sys/devices/system/node/node${X}/memory_failure/failed
/sys/devices/system/node/node${X}/memory_failure/delayed
These counters describe how many raw pages are poisoned and after the
attempted recoveries by the kernel, their resolutions: how many are
recovered, ignored, failed, or delayed respectively. The following math
holds for the statistics:
* total = recovered + ignored + failed + delayed
Link: https://lkml.kernel.org/r/20230120034622.2698268-1-jiaqiyan@google.com
Link: https://lkml.kernel.org/r/20230120034622.2698268-2-jiaqiyan@google.com
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in conditional compilation based on CONFIG_SRCU.
Therefore, remove the #ifdef and throw away the #else clause.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Use the pr_fmt() macro to prefix all the output with "devtmpfs: ".
while at it, convert printk(<LEVEL>) to pr_<level>().
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230202033203.1239239-2-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the lock_class_key structure out of struct bus_type and into the
dynamic structure we create already for all bus_types registered with
the kernel. This saves on static space and removes one more writable
field in struct bus_type.
In the future, the same field can be moved out of the struct class logic
because it shares this same private structure.
Most everyone will never notice this change, as lockdep is not enabled
in real systems so no memory or logic changes are happening for them.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230201083349.4038660-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__platform_driver_probe() pokes around in some bus and driver private
lists and locks in a way that is not needed at all. The code only wants
to know if a device was bound to the driver that was registered, so walk
all devices on the bus to see if there was a match. If there is not a
match, return an error. This is the same logic as was originally
present, but just done in a simpler and more obvious way that is not a
layering violation.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230131082459.301603-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the reworking of the function __platform_driver_probe() over the
years, it turns out that the variable 'code' does not actually do
anything or mean anything anymore and can be removed to simplify the
logic when trying to read and understand what this function is actually
doing.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230131082459.301603-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge tag 'v6.2-rc6' into sched/core, to pick up fixes
Pick up fixes before merging another batch of cpuidle updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
reg_base and reg_downshift currently don't have any effect if used with
a regmap_bus or regmap_config which only offers single register
operations (ie. reg_read, reg_write and optionally reg_update_bits).
Fix that and take them into account also for regmap_bus with only
reg_read and read_write operations by applying reg_base and
reg_downshift in _regmap_bus_reg_write, _regmap_bus_reg_read.
Also apply reg_base and reg_downshift in _regmap_update_bits, but only
in case the operation is carried out with a reg_update_bits call
defined in either regmap_bus or regmap_config.
Fixes: 0074f3f2b1 ("regmap: allow a defined reg_base to be added to every address")
Fixes: 86fc59ef81 ("regmap: add configurable downshift for addresses")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/Y9clyVS3tQEHlUhA@makrotopia.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The soc_bus code pokes around in the internal bus structures assuming
that it "knows" if a field is not set that it has not been registered
yet. That isn't a safe assumption, so just remove the layering
violation entirely and keep track if the bus has been registered or not
ourselves.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230130171059.1784057-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The uevent() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up all existing uevent() callbacks to
have the correct signature to preserve the build.
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
device_get_devnode() should take a constant * to struct device as it
does not modify it in any way, so modify the function definition to do
this and move it out of device.h as it does not need to be exposed to
the whole kernel tree.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Won Chung <wonchung@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clear the class private pointer if __class_register() fails for it, so
as to allow its users to verify that the class is usable by checking
the value of that pointer.
For consistency, clear that pointer before freeing the object pointed
to by it in class_release().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/4463268.LvFx2qVVIh@kreacher
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The normal call sequence of using transport class is:
Add path:
transport_setup_device()
transport_setup_classdev() // call sas_host_setup() here
transport_add_device() // if fails, need call transport_destroy_device()
transport_configure_device()
Remove path:
transport_remove_device()
transport_remove_classdev // call sas_host_remove() here
transport_destroy_device()
If transport_add_device() fails, need call transport_destroy_device()
to free memory, but in this case, ->remove() is not called, and the
resources allocated in ->setup() are leaked. So fix these leaks by
calling ->remove() in transport_add_class_device() if it returns error.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221115031638.3816551-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct acpi_pld_info *pld should be freed before the return of allocation
failure, to prevent memory leak, add the ACPI_FREE() to fix it.
Fixes: bc443c31de ("driver core: location: Check for allocations failure")
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/1669102648-11517-1-git-send-email-guohanjun@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling kobject_add() failed in device_add(), it will call
cleanup_glue_dir() to free resource. But in kobject_add(),
dev->kobj.parent has been set to NULL. This will cause resource leak.
The process is as follows:
device_add()
get_device_parent()
class_dir_create_and_add()
kobject_add() //kobject_get()
...
dev->kobj.parent = kobj;
...
kobject_add() //failed, but set dev->kobj.parent = NULL
...
glue_dir = get_glue_dir(dev) //glue_dir = NULL, and goto
//"Error" label
...
cleanup_glue_dir() //becaues glue_dir is NULL, not call
//kobject_put()
The preceding problem may cause insmod mac80211_hwsim.ko to failed.
sysfs: cannot create duplicate filename '/devices/virtual/mac80211_hwsim'
Call Trace:
<TASK>
dump_stack_lvl+0x8e/0xd1
sysfs_warn_dup.cold+0x1c/0x29
sysfs_create_dir_ns+0x224/0x280
kobject_add_internal+0x2aa/0x880
kobject_add+0x135/0x1a0
get_device_parent+0x3d7/0x590
device_add+0x2aa/0x1cb0
device_create_groups_vargs+0x1eb/0x260
device_create+0xdc/0x110
mac80211_hwsim_new_radio+0x31e/0x4790 [mac80211_hwsim]
init_mac80211_hwsim+0x48d/0x1000 [mac80211_hwsim]
do_one_initcall+0x10f/0x630
do_init_module+0x19f/0x5e0
load_module+0x64b7/0x6eb0
__do_sys_finit_module+0x140/0x200
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
kobject_add_internal failed for mac80211_hwsim with -EEXIST, don't try to
register things with the same name in the same directory.
Fixes: cebf8fd169 ("driver core: fix race between creating/querying glue dir and its cleanup")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221123012042.335252-1-shaozhengchao@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to 'admin-guide/mm/memory-hotplug.rst', the memory block ID,
instead of the section index, is shown by '/sys/devices/system/memory/
memoryX/phys_index'.
Fix the comments to match with 'admin-guide/mm/memory-hotplug.rst'.
Besides, use the existing helper memory_block_id() to convert the section
index to the memory block index.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230120055727.355483-2-gshan@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The main change is to build the cache topology information for all
the CPUs from the primary CPU. Currently the cacheinfo for secondary CPUs
is created during the early boot on the respective CPU itself. Preemption
and interrupts are disabled at this stage. On PREEMPT_RT kernels, allocating
memory and even parsing the PPTT table for ACPI based systems triggers a:
'BUG: sleeping function called from invalid context'
To prevent this bug, the cacheinfo is now allocated from the primary CPU
when preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on any architecture specific mechanism if done so
early.
The other minor change included here is to handle shared caches at
different levels when not all the CPUs on the system have the same
cache hierarchy.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmPKg60ACgkQAEG6vDF+
4pi91hAAoSluqjbUHzzCW+OIIjKAAvQrAw6bsKGvSpcUfYno1Lry+9y76L6TMYSy
OPtiKGxcJyzhdlCwIpJzgaX9nTz7uiu70euNZiAp11XA2KlphtLoI3TMUa60jD4i
ZGfn9UiAp719Vog5m3CmZXjHZ6drI0HloL8ZWTl4VDATUu5pfcx4uYPT2o63Xc62
k5QglaRJWFhFAJ+R6R9vQS2zfeMI9xvehl72445wb8pxxPW2f91dvBhJqJgKlziw
gHKx+D1DnpAUd+v+7HAEmzjXKlY6JnQybmBHmRayllVAa8kGUtvhTcBlRGNsNBzR
m7VBFKq+eSk7VgxOgka1qXVtHUrlaEWf5qWnG+w4XEiE1VgzNagjaFRaGQQneKI/
z3yNKG8Xjp+3BdSz0pUDJVEWFnnjueAUEh6/xODEXnWdX166abQZslLIHCvmcnM8
q7blasuj2mxyCZFC1tyK9WHI7/KCe0cmHbdau3qs0j9bvhzfdB3DwMLsdRjXQTOv
8FVX0Z5EKY9/bW2oqCg/mb3KOWbmFX2ZHho4cds3IV+9GGB8JkD/6b8vpGejmmx2
E2vInzhP3gLd9WiWQWDjg4+aklE29P/nDAA8BSPnW3TtEGAFJMZZQRgNlCZnBW56
Tx2/lE0VD5/UX+1MqSFGchi+KEX3mcZykcra9VNt+uZH26a8gBE=
=nH4b
-----END PGP SIGNATURE-----
Merge tag 'archtopo-cacheinfo-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
"cacheinfo and arch_topology updates for v6.3
The main change is to build the cache topology information for all
the CPUs from the primary CPU. Currently the cacheinfo for secondary CPUs
is created during the early boot on the respective CPU itself. Preemption
and interrupts are disabled at this stage. On PREEMPT_RT kernels, allocating
memory and even parsing the PPTT table for ACPI based systems triggers a:
'BUG: sleeping function called from invalid context'
To prevent this bug, the cacheinfo is now allocated from the primary CPU
when preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on any architecture specific mechanism if done so
early.
The other minor change included here is to handle shared caches at
different levels when not all the CPUs on the system have the same
cache hierarchy."
* tag 'archtopo-cacheinfo-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
cacheinfo: Fix shared_cpu_map to handle shared caches at different levels
arch_topology: Build cacheinfo from primary CPU
ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
ACPI: PPTT: Remove acpi_find_cache_levels()
cacheinfo: Check 'cache-unified' property to count cache leaves
cacheinfo: Return error code in init_of_cache_level()
cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
In test_async_probe_init, second set of asynchronous devices are saved
in sync_dev[sync_id], which should be async_dev[async_id].
This makes these devices not unregistered when exit.
> modprobe test_async_driver_probe && \
> modprobe -r test_async_driver_probe && \
> modprobe test_async_driver_probe
...
> sysfs: cannot create duplicate filename '/devices/platform/test_async_driver.4'
> kobject_add_internal failed for test_async_driver.4 with -EEXIST,
don't try to register things with the same name in the same directory.
Fixes: 57ea974fb8 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Link: https://lore.kernel.org/r/20221125063541.241328-1-chenzhongjin@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'parent' returned by fwnode_graph_get_port_parent()
with refcount incremented when 'prev' is not NULL, it
needs be put when finish using it.
Because the parent is const, introduce a new variable to
store the returned fwnode, then put it before returning
from fwnode_graph_get_next_endpoint().
Fixes: b5b41ab6b0 ("device property: Check fwnode->secondary in fwnode_graph_get_next_endpoint()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-tested-by: Daniel Scally <djrscally@gmail.com>
Link: https://lore.kernel.org/r/20221123022542.2999510-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
The cacheinfo sets up the shared_cpu_map by checking whether the caches
with the same index are shared between CPUs. However, this will trigger
slab-out-of-bounds access if the CPUs do not have the same cache hierarchy.
Another problem is the mismatched shared_cpu_map when the shared cache does
not have the same index between CPUs.
CPU0 I D L3
index 0 1 2 x
^ ^ ^ ^
index 0 1 2 3
CPU1 I D L2 L3
This patch checks each cache is shared with all caches on other CPUs.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Link: https://lore.kernel.org/r/20230117105133.4445-2-yongxuan.wang@sifive.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
commit 3fcbf1c77d ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.
The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce ("arch_topology: Drop LLC identifier stash from
the CPU topology")
allocate_cache_info() tries to build the cacheinfo from the primary
CPU prior secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fallback to the current state
for the cacheinfo allocation. [1] will be triggered in such case.
When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].
Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.
[1]:
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
| preempt_count: 1, expected: 0
| RCU nest depth: 1, expected: 1
| 3 locks held by swapper/111/0:
| #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
| #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
| #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last disabled at (0): 0x0
| Preemption disabled at:
| migrate_enable+0x30/0x130
| CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
| Call trace:
| __kmalloc+0xbc/0x1e8
| detect_cache_attributes+0x2d4/0x5f0
| update_siblings_masks+0x30/0x368
| store_cpu_topology+0x78/0xb8
| secondary_start_kernel+0xd0/0x198
| __secondary_switched+0xb0/0xb4
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The DeviceTree Specification v0.3 specifies that the cache node
'[d-|i-|]cache-size' property is required. The 'cache-unified'
property is specifies whether the cache level is separate
or unified.
If the cache-size property is missing, no cache leaves is accounted.
This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.
Check 'cache-unified' property and always account for at least
one cache leaf when parsing the device tree.
[1] https://lore.kernel.org/all/0f19cb3f-d6cf-4032-66d2-dedc9d09a0e3@linaro.org/
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Tested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230104183033.755668-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The logic to touch the bus notifier was open-coded in numberous places
in the driver core. Clean that up by creating a local bus_notify()
function and have everyone call this function instead, making the
reading of the caller code simpler and easier to maintain over time.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make init_of_cache_level() return an error code when the cache
information parsing fails to help detecting missing information.
init_of_cache_level() is only called for riscv. Returning an error
code instead of 0 will prevent detect_cache_attributes() to allocate
memory if an incomplete DT is parsed.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
RISC-V's implementation of init_of_cache_level() is following
the Devicetree Specification v0.3 regarding caches, cf.:
- s3.7.3 'Internal (L1) Cache Properties'
- s3.8 'Multi-level and Shared Cache Nodes'
Allow reusing the implementation by moving it.
Also make 'levels', 'leaves' and 'level' unsigned int.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-2-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
When CONFIG_OF_IRQ is not enabled, there will be a stub method that always
returns 0 when getting IRQ. Thus, the if-branch can be removed safely.
Signed-off-by: Soha Jin <soha@lohu.info>
Link: https://lore.kernel.org/r/20221111094542.270540-1-soha@lohu.info
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are no more users of software_node_register_nodes() and
software_node_unregister_nodes(). Remove them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Daniel Scally <dan.scally@ideasonboard.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221228094922.84119-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Switch property entry test to use software_node_register_node_group() API.
The current one is going to be removed soon.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Daniel Scally <dan.scally@ideasonboard.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221228094922.84119-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct platform_driver::remove returning an integer made driver authors
expect that returning an error code was proper error handling. However
the driver core ignores the error and continues to remove the device
because there is nothing the core could do anyhow and reentering the
remove callback again is only calling for trouble.
So this is an source for errors typically yielding resource leaks in the
error path.
As there are too many platform drivers to neatly convert them all to
return void in a single go, do it in several steps after this patch:
a) Convert all drivers to implement .remove_new() returning void instead
of .remove() returning int;
b) Change struct platform_driver::remove() to return void and so make
it identical to .remove_new();
c) Change all drivers back to .remove() now with the better prototype;
d) drop struct platform_driver::remove_new().
While this touches all drivers eventually twice, steps a) and c) can be
done one driver after another and so reduces coordination efforts
immensely and simplifies review.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221209150914.3557650-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MDIO subsystem is getting rid of MII_ADDR_C45 and thus also
encoding associated encoding of the C45 device address and register
address into one value. regmap-mdio also uses this encoding for the
C45 bus.
Move to the new C45 helpers for MDIO access and provide regmap-mdio
helper macros.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/r/20230116111509.4086236-1-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
pm_runtime_force_suspend() cannot be used with DPM_FLAG_SMART_SUSPEND, so
note this in the kerneldoc.
If DPM_FLAG_SMART_SUSPEND is set and the PM core cannot skip system resume
it will call pm_runtime_active() on the driver. This can lead to an
inconsistent state where:
pm_runtime_force_suspend() called ->runtime_suspend
but
device_resume_noirq() called pm_runtime_set_active()
This leaves the driver actually suspended but marked as active.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
OMAP was the one and only user.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195541.782536366@infradead.org
The macro to_subsys_private() needs to switch to using
container_of_const() as it turned out to being incorrectly casting a
const pointer to a non-const one. Make this change and fix up the one
offending user to be correctly handling a const pointer properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230111093327.3955063-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is not used outside of its compilation unit, so there's no need to
export this variable.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I got the following null-ptr-deref report while doing fault injection test:
BUG: kernel NULL pointer dereference, address: 0000000000000058
CPU: 2 PID: 278 Comm: 37-i2c-ds2482 Tainted: G B W N 6.1.0-rc3+
RIP: 0010:klist_put+0x2d/0xd0
Call Trace:
<TASK>
klist_remove+0xf1/0x1c0
device_release_driver_internal+0x196/0x210
bus_remove_device+0x1bd/0x240
device_add+0xd3d/0x1100
w1_add_master_device+0x476/0x490 [wire]
ds2482_probe+0x303/0x3e0 [ds2482]
This is how it happened:
w1_alloc_dev()
// The dev->driver is set to w1_master_driver.
memcpy(&dev->dev, device, sizeof(struct device));
device_add()
bus_add_device()
dpm_sysfs_add() // It fails, calls bus_remove_device.
// error path
bus_remove_device()
// The dev->driver is not null, but driver is not bound.
__device_release_driver()
klist_remove(&dev->p->knode_driver) <-- It causes null-ptr-deref.
// normal path
bus_probe_device() // It's not called yet.
device_bind_driver()
If dev->driver is set, in the error path after calling bus_add_device()
in device_add(), bus_remove_device() is called, then the device will be
detached from driver. But device_bind_driver() is not called yet, so it
causes null-ptr-deref while access the 'knode_driver'. To fix this, set
dev->driver to null in the error path before calling bus_remove_device().
Fixes: 57eee3d23e ("Driver core: Call device_pm_add() after bus_add_device() in device_add()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221205034904.2077765-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some genpd providers doesn't ensure that it has turned off at hardware.
This is fine until the consumer really requires during some special
scenarios that the power domain collapse at hardware before it is
turned ON again.
An example is the reset sequence of Adreno GPU which requires that the
'gpucc cx gdsc' power domain should move to OFF state in hardware at
least once before turning in ON again to clear the internal state.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230102161757.v5.1.I3e6b1f078ad0f1ca9358c573daa7b70ec132cdbe@changeid
struct subsys_dev_iter is not used by any code outside of
drivers/base/bus.c so move it into that file and out of the global bus.h
file.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function subsys_dev_iter_exit() is not used outside of
drivers/base/bus.c so make it static to that file and remove the global
export.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function subsys_dev_iter_next() is only used in drivers/base/bus.c
so make it static to that file and remove the global export.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function has not been called by any code in the kernel tree in many
many years so remove it as it is unused.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No one calls this function outside of drivers/base/bus.c so make it
static so it does not need to be exported anymore.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Support zstd-compressed debug info
- Allow W=1 builds to detect objects shared among multiple modules
- Add srcrpm-pkg target to generate a source RPM package
- Make the -s option detection work for future GNU Make versions
- Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
- Allow W=1 builds to detect -Wundef warnings in any preprocessed files
- Raise the minimum supported version of binutils to 2.25
- Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
- Use $(file ...) to read a file if GNU Make >= 4.2 is used
- Print error if GNU Make older than 3.82 is used
- Allow modpost to detect section mismatches with Clang LTO
- Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmOeImsVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsG06IP/iVjuWFvnjDZT4X8X6zN8aKp1vtR
EMkmoRtt5cD4CLb1MG4N7irYHgedQSx4rYceP45MyW1I3egl6Ct14RDyeQ1xSIZb
XFTLDCZvfl/up3MdiqNAqKRS7x5lk9++7F0t+2SoQxKQyJvm735XreX+VhZ1FeLB
qcHrmzJ5veky5Ry/3OkNUgKFBjKEAL+qKMc55uvkXqfTb3KoBa2r4VC1OaoYGRru
R8oF9qQRnGVQAl/LbBVchmgSjxryxPrCvBGiKlK03VkXdzEMHMimEJh3BQ6e0PGo
gajdk+4liy7z+jQnI7jFhvJjGKzkEP/Bc99M/uS92QX5MgpH6mqpHMoqqPiqW87K
RmZH37FqRu1Vo8dpibmH6r2K6YD/HHRjaDHk1VuuCQYEn0dsNmokPXOqd/1v0I1i
TXPjWOw1AID5vMJWllqxFhpeVvf0vx5BT/UNrh68MLqlJZzv2eMVJb4fNy6640ml
U0NclMnOa3eOmf5z1T7/LqDRTa63Q0kpanRrBpcmVOaqW+ZpQ3SQjh4uBN1PyJHL
cX3Skc341DyRlFiT54QhGKlm57MEb2gjhBZ3Z4J+b7sEFgvjXH/W8vcOGIKlppmA
CfYMyres4OV+fJc89ONkWsvLiOP1OeUGPvytm33J5QMKXc8SzOLP0D/F8kjrDflm
EROKuZ4EA5ej/rOy
=Ig/Y
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Support zstd-compressed debug info
- Allow W=1 builds to detect objects shared among multiple modules
- Add srcrpm-pkg target to generate a source RPM package
- Make the -s option detection work for future GNU Make versions
- Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
- Allow W=1 builds to detect -Wundef warnings in any preprocessed files
- Raise the minimum supported version of binutils to 2.25
- Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
- Use $(file ...) to read a file if GNU Make >= 4.2 is used
- Print error if GNU Make older than 3.82 is used
- Allow modpost to detect section mismatches with Clang LTO
- Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
* tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
buildtar: fix tarballs with EFI_ZBOOT enabled
modpost: Include '.text.*' in TEXT_SECTIONS
padata: Mark padata_work_init() as __ref
kbuild: ensure Make >= 3.82 is used
kbuild: refactor the prerequisites of the modpost rule
kbuild: change module.order to list *.o instead of *.ko
kbuild: use .NOTINTERMEDIATE for future GNU Make versions
kconfig: refactor Makefile to reduce process forks
kbuild: add read-file macro
kbuild: do not sort after reading modules.order
kbuild: add test-{ge,gt,le,lt} macros
Documentation: raise minimum supported version of binutils to 2.25
kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds
kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS
kbuild: Port silent mode detection to future gnu make.
init/version.c: remove #include <generated/utsrelease.h>
firmware_loader: remove #include <generated/utsrelease.h>
modpost: Mark uuid_le type to be suitable only for MEI
kbuild: add ability to make source rpm buildable using koji
kbuild: warn objects shared among multiple modules
...
Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass in
a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be used
no matter what the const value is. This prevents every subsystem from
having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject, objects
as being "non-mutable". The changes to the kobject and driver core in
this pull request are the result of that, as there are lots of paths
where kobjects and device pointers are not modified at all, so marking
them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object rules.
All of this has been bike-shedded in quite a lot of detail on lkml with
different names and implementations resulting in the tiny version we
have in here, much better than my original proposal. Lots of subsystem
maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with no
problems, OTHER than some merge issues with other trees that should be
obvious when you hit them (block tree deletes a driver that this tree
modifies, iommufd tree modifies code that this tree also touches). If
there are merge problems with these trees, please let me know.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wz3A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yks0ACeKYUlVgCsER8eYW+x18szFa2QTXgAn2h/VhZe
1Fp53boFaQkGBjl8mGF8
=v+FB
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass
in a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be
used no matter what the const value is. This prevents every subsystem
from having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject,
objects as being "non-mutable". The changes to the kobject and driver
core in this pull request are the result of that, as there are lots of
paths where kobjects and device pointers are not modified at all, so
marking them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object
rules.
All of this has been bike-shedded in quite a lot of detail on lkml
with different names and implementations resulting in the tiny version
we have in here, much better than my original proposal. Lots of
subsystem maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with
no problems"
* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
device property: Fix documentation for fwnode_get_next_parent()
firmware_loader: fix up to_fw_sysfs() to preserve const
usb.h: take advantage of container_of_const()
device.h: move kobj_to_dev() to use container_of_const()
container_of: add container_of_const() that preserves const-ness of the pointer
driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
driver core: fix up missed scsi/cxlflash class.devnode() conversion.
driver core: fix up some missing class.devnode() conversions.
driver core: make struct class.devnode() take a const *
driver core: make struct class.dev_uevent() take a const *
cacheinfo: Remove of_node_put() for fw_token
device property: Add a blank line in Kconfig of tests
device property: Rename goto label to be more precise
device property: Move PROPERTY_ENTRY_BOOL() a bit down
device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
kernfs: fix all kernel-doc warnings and multiple typos
driver core: pass a const * into of_device_uevent()
kobject: kset_uevent_ops: make name() callback take a const *
kobject: kset_uevent_ops: make filter() callback take a const *
kobject: make kobject_namespace take a const *
...
- Convert flexible array members, fix -Wstringop-overflow warnings,
and fix KCFI function type mismatches that went ignored by
maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
add more __alloc_size attributes, and introduce full testing
of all allocator functions. Finally remove the ksize() side-effect
so that each allocation-aware checker can finally behave without
exceptions.
- Introduce oops_limit (default 10,000) and warn_limit (default off)
to provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook).
- Introduce overflows_type() and castable_to_type() helpers for
cleaner overflow checking.
- Improve code generation for strscpy() and update str*() kern-doc.
- Convert strscpy and sigphash tests to KUnit, and expand memcpy
tests.
- Always use a non-NULL argument for prepare_kernel_cred().
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
- Adjust orphan linker section checking to respect CONFIG_WERROR
(Xin Li).
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
- Fix um vs FORTIFY warnings for always-NULL arguments.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
AHXcibguPRQBPAdiPQ==
=yaaN
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
- Convert flexible array members, fix -Wstringop-overflow warnings, and
fix KCFI function type mismatches that went ignored by maintainers
(Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
more __alloc_size attributes, and introduce full testing of all
allocator functions. Finally remove the ksize() side-effect so that
each allocation-aware checker can finally behave without exceptions
- Introduce oops_limit (default 10,000) and warn_limit (default off) to
provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook)
- Introduce overflows_type() and castable_to_type() helpers for cleaner
overflow checking
- Improve code generation for strscpy() and update str*() kern-doc
- Convert strscpy and sigphash tests to KUnit, and expand memcpy tests
- Always use a non-NULL argument for prepare_kernel_cred()
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)
- Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
Li)
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)
- Fix um vs FORTIFY warnings for always-NULL arguments
* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
ksmbd: replace one-element arrays with flexible-array members
hpet: Replace one-element array with flexible-array member
um: virt-pci: Avoid GCC non-NULL warning
signal: Initialize the info in ksignal
lib: fortify_kunit: build without structleak plugin
panic: Expose "warn_count" to sysfs
panic: Introduce warn_limit
panic: Consolidate open-coded panic_on_warn checks
exit: Allow oops_limit to be disabled
exit: Expose "oops_count" to sysfs
exit: Put an upper limit on how often we can oops
panic: Separate sysctl logic from CONFIG_SMP
mm/pgtable: Fix multiple -Wstringop-overflow warnings
mm: Make ksize() a reporting-only function
kunit/fortify: Validate __alloc_size attribute results
drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
driver core: Add __alloc_size hint to devm allocators
overflow: Introduce overflows_type() and castable_to_type()
coredump: Proactively round up to kmalloc bucket size
...
- More userfaultfs work from Peter Xu.
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying.
- Some filemap cleanups from Vishal Moola.
- David Hildenbrand added the ability to selftest anon memory COW handling.
- Some cpuset simplifications from Liu Shixin.
- Addition of vmalloc tracing support by Uladzislau Rezki.
- Some pagecache folioifications and simplifications from Matthew Wilcox.
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it.
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword. This series shold have been in the
non-MM tree, my bad.
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages.
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages.
- Peter Xu utilized the PTE marker code for handling swapin errors.
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient.
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand.
- zram support for multiple compression streams from Sergey Senozhatsky.
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway.
- Mel Gorman altered the page allocator so that local IRQs can remnain
enabled during per-cpu page allocations.
- Vishal Moola removed the try_to_release_page() wrapper.
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache.
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking.
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend.
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range().
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen.
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect.
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages().
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting.
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines.
- Many singleton patches, as usual.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5j6ZwAKCRDdBJ7gKXxA
jkDYAP9qNeVqp9iuHjZNTqzMXkfmJPsw2kmy2P+VdzYVuQRcJgEAgoV9d7oMq4ml
CodAgiA51qwzId3GRytIo/tfWZSezgA=
=d19R
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- More userfaultfs work from Peter Xu
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying
- Some filemap cleanups from Vishal Moola
- David Hildenbrand added the ability to selftest anon memory COW
handling
- Some cpuset simplifications from Liu Shixin
- Addition of vmalloc tracing support by Uladzislau Rezki
- Some pagecache folioifications and simplifications from Matthew
Wilcox
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
it
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword.
This series should have been in the non-MM tree, my bad
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages
- Peter Xu utilized the PTE marker code for handling swapin errors
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand
- zram support for multiple compression streams from Sergey Senozhatsky
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway
- Mel Gorman altered the page allocator so that local IRQs can remnain
enabled during per-cpu page allocations
- Vishal Moola removed the try_to_release_page() wrapper
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range()
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages()
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines
- Many singleton patches, as usual
* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
mm: mmu_gather: allow more than one batch of delayed rmaps
mm: fix typo in struct pglist_data code comment
kmsan: fix memcpy tests
mm: add cond_resched() in swapin_walk_pmd_entry()
mm: do not show fs mm pc for VM_LOCKONFAULT pages
selftests/vm: ksm_functional_tests: fixes for 32bit
selftests/vm: cow: fix compile warning on 32bit
selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
mm,thp,rmap: fix races between updates of subpages_mapcount
mm: memcg: fix swapcached stat accounting
mm: add nodes= arg to memory.reclaim
mm: disable top-tier fallback to reclaim on proactive reclaim
selftests: cgroup: make sure reclaim target memcg is unprotected
selftests: cgroup: refactor proactive reclaim code to reclaim_until()
mm: memcg: fix stale protection of reclaim target memcg
mm/mmap: properly unaccount memory on mas_preallocate() failure
omfs: remove ->writepage
jfs: remove ->writepage
...
A few new APIs here, support for the FSI bus (which is used in some
PowerPC systems) plus a couple of new APIs, one allowing abstractions
built on top of regmap to tell if the regmap can be used in an atomic
context and one providing a callback for an in flight device which can't
do interrupt masking very well.
There's also a fix that I never got round to sending because it really
should be fixed better but that's not happened yet and it does avoid the
problem, the fix was in -next for a long time.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmOXG3wACgkQJNaLcl1U
h9BqHgf/YmdUx3/jh8hmDufqJCKbML0SlIb1ODQlLHsjpMSuCGlmQGSCHa/peVMk
1c6Tn2FboJ5+mHUQixMx6jhlSsJ1fO0i0TbRjj9vL6eLJKCtfdiBS2UmJuGFtyKB
swOISPEsVIWrc2t/e8/DjZ3JznwdFup81vjcYUhlA6Xglk5Ch0szb5+p2ElSWwI9
GA6wDUe0YB3eqU6vSAsjHN/hhUUC2BkGPv1fLzW11kNsoxJbxJ7KsUVmbQQMEMRg
HXXmdlooZqH9og47jGLH+3v3onJb7ZnKkx+wU6no98mb++v0OuiLUzj0IA3TLKk4
OacxbPLBk3cLmpdaPD9eimwV7ZcdVQ==
=a/WH
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A few new APIs here, support for the FSI bus (which is used in some
PowerPC systems) plus a couple of new APIs, one allowing abstractions
built on top of regmap to tell if the regmap can be used in an atomic
context and one providing a callback for an in flight device which
can't do interrupt masking very well.
There's also a fix that I never got round to sending because it really
should be fixed better but that's not happened yet and it does avoid
the problem, the fix was in -next for a long time"
* tag 'regmap-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Add handle_mask_sync() callback
regmap: Add FSI bus support
regmap: add regmap_might_sleep()
regmap-irq: Use the new num_config_regs property in regmap_add_irq_chip_fwnode
- Fix nasty and hard to debug race condition introduced by mistake
in the runtime PM core code and clean up that code somewhat on
top of the fix (Rafael Wysocki).
- Generalize of_perf_domain_get_sharing_cpumask phandle format (Hector
Martin).
- Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin).
- Update Qualcomm cpufreq driver, including:
* CPU clock provider support,
* Generic cleanups or reorganization.
* Potential memleak fix.
* Fix of the return value of cpufreq_driver->get().
(Manivannan Sadhasivam, Chen Hui).
- Update Qualcomm cpufreq driver's DT bindings, including:
* Support for CPU clock provider.
* Missing cache-related properties fixes.
* Support for QDU1000/QRU1000.
(Manivannan Sadhasivam, Rob Herring, Melody Olvera).
- Add support for ti,am625 SoC and enable build of ti-cpufreq for
ARCH_K3 (Dave Gerlach, and Vibhore Vardhan).
- Use flexible array to simplify memory allocation in the tegra186
cpufreq driver (Christophe JAILLET).
- Convert cpufreq statistics code to use sysfs_emit_at() (ye xingchen).
- Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
Gherdovich).
- Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq driver
(Xiongfeng Wang).
- Initialize the kobj_unregister completion before calling
kobject_init_and_add() in the cpufreq core code (Yongqiang Liu).
- Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
Nathan Chancellor).
- Make intel_pstate accept initial EPP value of 0x80 (Srinivas
Pandruvada).
- Make read-only array sys_clk_src in the SPEAr cpufreq driver static
(Colin Ian King).
- Make array speeds in the longhaul cpufreq driver static (Colin Ian
King).
- Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
Shevchenko).
- Drop a reference to CVS from cpufreq documentation (Conghui Wang).
- Improve kernel messages printed by the PSCI cpuidle driver (Ulf
Hansson).
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson).
- Modify the tasks freezing code to avoid using pr_cont() and refine an
error message printed by it (Rafael Wysocki).
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo).
- Fix mistake in a kerneldoc comment in the hibernation code (xiongxin).
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa).
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in the
generic power domains code (Abel Vesa).
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo).
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo).
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo).
- Fix compiler warnings with make W=1 in the idle_inject power capping
driver (Srinivas Pandruvada).
- Use kstrtobool() instead of strtobool() in the power capping sysfs
interface (Christophe JAILLET).
- Add SCMI Powercap based power capping driver (Cristian Marussi).
- Add Emerald Rapids support to the intel-uncore-freq driver (Artem
Bityutskiy).
- Repair slips in kernel-doc comments in the generic notifier code
(Lukas Bulwahn).
- Fix several DT issues in the OPP library reorganize code around
opp-microvolt-<named> DT property (Viresh Kumar).
- Allow any of opp-microvolt, opp-microamp, or opp-microwatt properties
to be present without the others present (James Calligeros).
- Fix clock-latency-ns property in DT example (Serge Semin).
- Add a private governor_data for devfreq governors (Kant Fan).
- Reorganize devfreq code to use device_match_of_node() and
devm_platform_get_and_ioremap_resource() instead of open coding
them (ye xingchen, Minghao Chi).
- Make cpupower choose base_cpu to display default cpupower details
instead of picking CPU 0 (Saket Kumar Bhaskar).
- Add Georgian translation to cpupower documentation (Zurab
Kargareteli).
- Introduce powercap intel-rapl library, powercap-info command, and
RAPL monitor into cpupower (Thomas Renninger).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmOXWKsSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxzKsP/jEKIwMSG4KHyXjJSDopFvppA13468ma
ao2G5EnbtPgZOiN66BOcAfPB+pzBM8WBCnpy8sfNzcpQSaGaJr+flQDqQV/1QG/H
GNQ0MUYN6TF/zfz/hKawDtQJihw9OrJgqQfUJIyc7Djo8ntSBu299XAt3X8VB5D1
azU1WwfOnEhr8evkqd8DS81fwm6b5cWvLkfG3Qvk2VxlwC/BCFdygqNjwOXmMNMb
DPYWv1xoVhSKzJsPHbAtzFq6veLsw2Glf2xPDyjf9ZPB0ujrftFoRoeCrC/neBDb
5bB4P5Injg3IB7SAHf97XgGAH2biUKwVnQhVUOTWXdQ7u/xDbH5fOLFJkBOBP6n6
gZiEOqzg5wVXk+ZfKx4fjsf4LvB1r+nM2tmx/bzhxyt9UDLUfB9kY0PMXLRuYqyn
ITvk00CJ/hkwD98pql4pCnc1PYZLUv/CHiaqTjwwOKuue3Jb3OTSPrSWtYIyTyNx
s2eBz/CxGSg4Q25u3loIiNVAaCOul6SZq+Iz6BlVP8sy3q62LWi8mp5b+kb8HFWH
lk8GpavqOLF6brxpPL/n0vav2bCmdwblMjTcowtGbLgiGSZaD97AkPFTN2H7tGPv
iUZDTdK3H24aqY62yKzo2HK3PhwNCg06gF0VTsuvJ7iIQmfeUpLjB/3qGeJNjlEQ
20fQ6YU/NytB
=B9Uu
-----END PGP SIGNATURE-----
Merge tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These include two new drivers (cpufreq driver for Apple SoC CPU
P-states and the SCMI Powercap based power capping driver), other new
hardware support and driver extensions (Qualcomm cpufreq driver and
its DT bindings, TI cpufreq driver, intel_pstate, intel-uncore-freq),
a bunch of fixes and cleanups all over and a cpupower utility update
including new features related to RAPL support.
Specifics:
- Fix nasty and hard to debug race condition introduced by mistake in
the runtime PM core code and clean up that code somewhat on top of
the fix (Rafael Wysocki)
- Generalize of_perf_domain_get_sharing_cpumask phandle format
(Hector Martin)
- Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin)
- Update Qualcomm cpufreq driver (Manivannan Sadhasivam, Chen Hui):
- CPU clock provider support
- Generic cleanups or reorganization
- Potential memleak fix
- Fix of the return value of cpufreq_driver->get()
- Update Qualcomm cpufreq driver's DT bindings (Manivannan
Sadhasivam, Rob Herring, Melody Olvera):
- Support for CPU clock provider
- Missing cache-related properties fixes
- Support for QDU1000/QRU1000
- Add support for ti,am625 SoC and enable build of ti-cpufreq for
ARCH_K3 (Dave Gerlach, and Vibhore Vardhan)
- Use flexible array to simplify memory allocation in the tegra186
cpufreq driver (Christophe JAILLET)
- Convert cpufreq statistics code to use sysfs_emit_at() (ye
xingchen)
- Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
Gherdovich)
- Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq
driver (Xiongfeng Wang)
- Initialize the kobj_unregister completion before calling
kobject_init_and_add() in the cpufreq core code (Yongqiang Liu)
- Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
Nathan Chancellor)
- Make intel_pstate accept initial EPP value of 0x80 (Srinivas
Pandruvada)
- Make read-only array sys_clk_src in the SPEAr cpufreq driver static
(Colin Ian King)
- Make array speeds in the longhaul cpufreq driver static (Colin Ian
King)
- Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
Shevchenko)
- Drop a reference to CVS from cpufreq documentation (Conghui Wang)
- Improve kernel messages printed by the PSCI cpuidle driver (Ulf
Hansson)
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson)
- Modify the tasks freezing code to avoid using pr_cont() and refine
an error message printed by it (Rafael Wysocki)
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo)
- Fix mistake in a kerneldoc comment in the hibernation code
(xiongxin)
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa)
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in
the generic power domains code (Abel Vesa)
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo)
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo)
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo)
- Fix compiler warnings with make W=1 in the idle_inject power
capping driver (Srinivas Pandruvada)
- Use kstrtobool() instead of strtobool() in the power capping sysfs
interface (Christophe JAILLET)
- Add SCMI Powercap based power capping driver (Cristian Marussi)
- Add Emerald Rapids support to the intel-uncore-freq driver (Artem
Bityutskiy)
- Repair slips in kernel-doc comments in the generic notifier code
(Lukas Bulwahn)
- Fix several DT issues in the OPP library reorganize code around
opp-microvolt-<named> DT property (Viresh Kumar)
- Allow any of opp-microvolt, opp-microamp, or opp-microwatt
properties to be present without the others present (James
Calligeros)
- Fix clock-latency-ns property in DT example (Serge Semin)
- Add a private governor_data for devfreq governors (Kant Fan)
- Reorganize devfreq code to use device_match_of_node() and
devm_platform_get_and_ioremap_resource() instead of open coding
them (ye xingchen, Minghao Chi)
- Make cpupower choose base_cpu to display default cpupower details
instead of picking CPU 0 (Saket Kumar Bhaskar)
- Add Georgian translation to cpupower documentation (Zurab
Kargareteli)
- Introduce powercap intel-rapl library, powercap-info command, and
RAPL monitor into cpupower (Thomas Renninger)"
* tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
PM: runtime: Adjust white space in the core code
cpufreq: Remove CVS version control contents from documentation
cpufreq: stats: Convert to use sysfs_emit_at() API
cpufreq: ACPI: Only set boost MSRs on supported CPUs
PM: sleep: Refine error message in try_to_freeze_tasks()
PM: sleep: Avoid using pr_cont() in the tasks freezing code
PM: runtime: Relocate rpm_callback() right after __rpm_callback()
PM: runtime: Do not call __rpm_callback() from rpm_idle()
PM / devfreq: event: use devm_platform_get_and_ioremap_resource()
PM / devfreq: event: Use device_match_of_node()
PM / devfreq: Use device_match_of_node()
powercap: idle_inject: Fix warnings with make W=1
PM: hibernate: Complain about memory map mismatches during resume
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QDU1000/QRU1000 cpufreq
cpufreq: tegra186: Use flexible array to simplify memory allocation
cpupower: rapl monitor - shows the used power consumption in uj for each rapl domain
cpupower: Introduce powercap intel-rapl library and powercap-info command
cpupower: Add Georgian translation
cpufreq: intel_pstate: Add Sapphire Rapids support in no-HWP mode
cpufreq: amd_freq_sensitivity: Add missing pci_dev_put()
...
- Core:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for PCI/MSI[-X]
and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows device
manufactures to provide implementation defined storage for MSI messages
contrary to the uniform and specification defined storage mechanisms for
PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations
of the MSI-X table, but also gives the device manufacturer the freedom to
store the message in arbitrary places, even in host memory which is shared
with the device.
There have been several attempts to glue this into the current MSI code,
but after lengthy discussions it turned out that there is a fundamental
design problem in the current PCI/MSI-X implementation. This needs some
historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management was
completely different from what we have today in the actively developed
architectures. Interrupt management was completely architecture specific
and while there were attempts to create common infrastructure the
commonalities were rudimentary and just providing shared data structures and
interfaces so that drivers could be written in an architecture agnostic
way.
The initial PCI/MSI[-X] support obviously plugged into this model which
resulted in some basic shared infrastructure in the PCI core code for
setting up MSI descriptors, which are a pure software construct for holding
data relevant for a particular MSI interrupt, but the actual association to
Linux interrupts was completely architecture specific. This model is still
supported today to keep museum architectures and notorious stranglers
alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel,
which was creating yet another architecture specific mechanism and resulted
in an unholy mess on top of the existing horrors of x86 interrupt handling.
The x86 interrupt management code was already an incomprehensible maze of
indirections between the CPU vector management, interrupt remapping and the
actual IO/APIC and PCI/MSI[-X] implementation.
At roughly the same time ARM struggled with the ever growing SoC specific
extensions which were glued on top of the architected GIC interrupt
controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86 vector
domain and interrupt remapping and also allowed ARM to handle the zoo of
SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that it is
not available the PCI/MSI[-X] domains have the vector domain as their
parent. This reduced the required interaction between the domains pretty
much to the initialization phase where it is obviously required to
establish the proper parent relation ship in the components of the
hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware
it's clear that the actual PCI/MSI[-X] interrupt controller is not a global
entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the easy
solution of providing "global" PCI/MSI domains which was possible because
the PCI/MSI[-X] handling is uniform across the devices. This also allowed
to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in
turn made it simple to keep the existing architecture specific management
alive.
A similar problem was created in the ARM world with support for IP block
specific message storage. Instead of going all the way to stack a IP block
specific domain on top of the generic MSI domain this ended in a construct
which provides a "global" platform MSI domain which allows overriding the
irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the MSI
infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into the
existing infrastructure. Some of this just works by chance on particular
platforms but will fail in hard to diagnose ways when the driver is used
on platforms where the underlying MSI interrupt management code does not
expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront to
avoid resource exhaustion. They are expanded at run-time when the guest
actually tries to use them. The way how this is implemented is that the
host disables MSI-X and then re-enables it with a larger number of
vectors again. That works by chance because most device drivers set up
all interrupts before the device actually will utilize them. But that's
not universally true because some drivers allocate a large enough number
of vectors but do not utilize them until it's actually required,
e.g. for acceleration support. But at that point other interrupts of the
device might be in active use and the MSI-X disable/enable dance can
just result in losing interrupts and therefore hard to diagnose subtle
problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS
is not longer providing a uniform storage and configuration model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for PCI/IMS.
PCI/IMS has been verified with the work in progress IDXD driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
- Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC
DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf
br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2
wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y
wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7
fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+
CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT
gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR
BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y
NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5
/FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM
03JfvdxnueM3gw==
=9erA
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt core and driver subsystem:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for
PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows
device manufactures to provide implementation defined storage for MSI
messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
message store which is uniform accross all devices). The PCI/MSI[-X]
uniformity allowed us to get away with "global" PCI/MSI domains.
IMS not only allows to overcome the size limitations of the MSI-X
table, but also gives the device manufacturer the freedom to store the
message in arbitrary places, even in host memory which is shared with
the device.
There have been several attempts to glue this into the current MSI
code, but after lengthy discussions it turned out that there is a
fundamental design problem in the current PCI/MSI-X implementation.
This needs some historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management
was completely different from what we have today in the actively
developed architectures. Interrupt management was completely
architecture specific and while there were attempts to create common
infrastructure the commonalities were rudimentary and just providing
shared data structures and interfaces so that drivers could be written
in an architecture agnostic way.
The initial PCI/MSI[-X] support obviously plugged into this model
which resulted in some basic shared infrastructure in the PCI core
code for setting up MSI descriptors, which are a pure software
construct for holding data relevant for a particular MSI interrupt,
but the actual association to Linux interrupts was completely
architecture specific. This model is still supported today to keep
museum architectures and notorious stragglers alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
kernel, which was creating yet another architecture specific mechanism
and resulted in an unholy mess on top of the existing horrors of x86
interrupt handling. The x86 interrupt management code was already an
incomprehensible maze of indirections between the CPU vector
management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
implementation.
At roughly the same time ARM struggled with the ever growing SoC
specific extensions which were glued on top of the architected GIC
interrupt controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86
vector domain and interrupt remapping and also allowed ARM to handle
the zoo of SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that
it is not available the PCI/MSI[-X] domains have the vector domain as
their parent. This reduced the required interaction between the
domains pretty much to the initialization phase where it is obviously
required to establish the proper parent relation ship in the
components of the hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
hardware it's clear that the actual PCI/MSI[-X] interrupt controller
is not a global entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the
easy solution of providing "global" PCI/MSI domains which was possible
because the PCI/MSI[-X] handling is uniform across the devices. This
also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
unchanged which in turn made it simple to keep the existing
architecture specific management alive.
A similar problem was created in the ARM world with support for IP
block specific message storage. Instead of going all the way to stack
a IP block specific domain on top of the generic MSI domain this ended
in a construct which provides a "global" platform MSI domain which
allows overriding the irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the
MSI infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into
the existing infrastructure. Some of this just works by chance on
particular platforms but will fail in hard to diagnose ways when the
driver is used on platforms where the underlying MSI interrupt
management code does not expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront
to avoid resource exhaustion. They are expanded at run-time when the
guest actually tries to use them. The way how this is implemented is
that the host disables MSI-X and then re-enables it with a larger
number of vectors again. That works by chance because most device
drivers set up all interrupts before the device actually will utilize
them. But that's not universally true because some drivers allocate a
large enough number of vectors but do not utilize them until it's
actually required, e.g. for acceleration support. But at that point
other interrupts of the device might be in active use and the MSI-X
disable/enable dance can just result in losing interrupts and
therefore hard to diagnose subtle problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
that IMS is not longer providing a uniform storage and configuration
model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per
device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for
PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"
* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
irqchip/ti-sci-inta: Fix kernel doc
irqchip/gic-v2m: Mark a few functions __init
irqchip/gic-v2m: Include arm-gic-common.h
irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
iommu/amd: Enable PCI/IMS
iommu/vt-d: Enable PCI/IMS
x86/apic/msi: Enable PCI/IMS
PCI/MSI: Provide pci_ims_alloc/free_irq()
PCI/MSI: Provide IMS (Interrupt Message Store) support
genirq/msi: Provide constants for PCI/IMS support
x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
PCI/MSI: Provide prepare_desc() MSI domain op
PCI/MSI: Split MSI-X descriptor setup
genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
genirq/msi: Provide msi_domain_alloc_irq_at()
genirq/msi: Provide msi_domain_ops:: Prepare_desc()
genirq/msi: Provide msi_desc:: Msi_data
genirq/msi: Provide struct msi_map
x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
...
There are few major updates in the SoC specific drivers, mainly the usual
reworks and support for variants of the existing SoC. While this remains
Arm centric for the most part, the branch now also contains updates to
risc-v and loongarch specific code in drivers/soc/.
Notable changes include:
- Support for the newly added Qualcomm Snapdragon variants
(MSM8956, MSM8976, SM6115, SM4250, SM8150, SA8155 and SM8550) in the
soc ID, rpmh, rpm, spm and powerdomain drivers.
- Documentation for the somewhat controversial qcom,board-id
properties that are required for booting a number of machines
- A new SoC identification driver for the loongson-2 (loongarch)
platform
- memory controller updates for stm32, tegra, and renesas.
- a new DT binding to better describe LPDDR2/3/4/5 chips in
the memory controller subsystem
- Updates for Tegra specific drivers across multiple subsystems,
improving support for newer SoCs and better identification
- Minor fixes for Broadcom, Freescale, Apple, Renesas, Sifive,
TI, Mediatek and Marvell SoC drivers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOSAZ8ACgkQmmx57+YA
GNmoDw/9Hdz2rx6TtdjA2eMKFt97bK0EgrQADT1d4lPQzXzZzFDC9ZxL0bwZRujZ
b8Q6WrMMgcRiWmzmRlxQWMWEdBU8Y0OzeYlo4lbyCSOV+UA2OA/eu6rm0chapBgM
1/lkquYLUUcB31wg+NmADoKy5Ejxj9SL1Va36Nvng4YpHDrYHKt4gPyCr/EV+KRO
Q8JpH7vEzQ0P5CGUzeri2UlYWDdF1GXmObqQGF8pq9s6Qz4ACe63r+eJFXAQFiXK
xewRK7PuvqmQWLVaEnN8dAcSna5P4aIGKOVjQyZjCCp6qTvfm4d2hxTl4dt9gVtt
vbQPiPQ5ORRzeMmW6wHxSIdy2QCa9CKQDXuMRoOWHfGMrAZQaxruISpcmHYv9Ug+
nSfedIEtxtmpGK2SZ1Mvndkezbb0o5QXZF4+kxqpiE/EaxVWmxiecmrUqyvJ5RVv
RuaZeMQpeOaWElnxb2P/T5uLuoHGhDdZ98HXICuCWPAitvA2rRK4Rv3dqTeclPLa
HR9gVYgZK3CSj+e9xbe5uczIc664bscRl9unghtB3UWkGTiLt2rroX4T2pTU/2xf
YvzDHC+f42NEkXUzcs4cZ87R8iY2hr0LmePY5/lqF9k6qx0Rc3syNc7q4N4EBxGC
2y5dDpKXfFL6fEV4YNeGpNcrwmCwnNppcePjmNvgrdtqmUUB/mY=
=heNV
-----END PGP SIGNATURE-----
Merge tag 'soc-drivers-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
"There are few major updates in the SoC specific drivers, mainly the
usual reworks and support for variants of the existing SoC. While this
remains Arm centric for the most part, the branch now also contains
updates to risc-v and loongarch specific code in drivers/soc/.
Notable changes include:
- Support for the newly added Qualcomm Snapdragon variants (MSM8956,
MSM8976, SM6115, SM4250, SM8150, SA8155 and SM8550) in the soc ID,
rpmh, rpm, spm and powerdomain drivers.
- Documentation for the somewhat controversial qcom,board-id
properties that are required for booting a number of machines
- A new SoC identification driver for the loongson-2 (loongarch)
platform
- memory controller updates for stm32, tegra, and renesas.
- a new DT binding to better describe LPDDR2/3/4/5 chips in the
memory controller subsystem
- Updates for Tegra specific drivers across multiple subsystems,
improving support for newer SoCs and better identification
- Minor fixes for Broadcom, Freescale, Apple, Renesas, Sifive, TI,
Mediatek and Marvell SoC drivers"
* tag 'soc-drivers-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (137 commits)
soc: qcom: socinfo: Add SM6115 / SM4250 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for SM6115 / SM4250 and variants
soc: qcom: socinfo: Add SM8150 and SA8155 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for SM8150 and SA8155
dt-bindings: soc: qcom: apr: document generic qcom,apr compatible
soc: qcom: Select REMAP_MMIO for ICC_BWMON driver
soc: qcom: Select REMAP_MMIO for LLCC driver
soc: qcom: rpmpd: Add SM4250 support
dt-bindings: power: rpmpd: Add SM4250 support
dt-bindings: soc: qcom: aoss: Add compatible for SM8550
soc: qcom: llcc: Add configuration data for SM8550
dt-bindings: arm: msm: Add LLCC compatible for SM8550
soc: qcom: llcc: Add v4.1 HW version support
soc: qcom: socinfo: Add SM8550 ID
soc: qcom: rpmh-rsc: Avoid unnecessary checks on irq-done response
soc: qcom: rpmh-rsc: Add support for RSC v3 register offsets
soc: qcom: rpmhpd: Add SM8550 power domains
dt-bindings: power: rpmpd: Add SM8550 to rpmpd binding
soc: qcom: socinfo: Add MSM8956/76 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for MSM8956 and MSM8976
...
Merge cpuidle changes, updates related to system sleep amd generic power
domains code fixes for 6.2-rc1:
- Improve kernel messages printed by the cpuidle PCI driver (Ulf
Hansson).
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson).
- Modify the tasks freezing code to avoid using pr_cont() and refine an
error message printed by it (Rafael Wysocki).
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo).
- Fix mistake in a kerneldoc comment in the hibernation code (xiongxin).
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa).
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in the
generic power domains code (Abel Vesa).
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo).
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo).
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo).
* pm-cpuidle:
cpuidle: dt: Clarify a comment and simplify code in dt_init_idle_driver()
cpuidle: dt: Return the correct numbers of parsed idle states
cpuidle: psci: Extend information in log about OSI/PC mode
* pm-sleep:
PM: sleep: Refine error message in try_to_freeze_tasks()
PM: sleep: Avoid using pr_cont() in the tasks freezing code
PM: hibernate: Complain about memory map mismatches during resume
PM: hibernate: Fix mistake in kerneldoc comment
* pm-domains:
PM: domains: Reverse the order of performance and enabling ops
PM: domains: Power off[on] domain in hibernate .freeze[thaw]_noirq hook
PM: domains: Consolidate genpd_restore_noirq() and genpd_resume_noirq()
PM: domains: Pass generic PM noirq hooks to genpd_finish_suspend()
PM: domains: Drop genpd status manipulation for hibernate restore
utsrelease.h is potentially generated on each build.
By removing this unused include we can get rid of some spurious
recompilations.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Provide a public callback handle_mask_sync() that drivers can use when
they have more complex IRQ masking logic. The default implementation is
regmap_irq_handle_mask_sync(), used if the chip doesn't provide its own
callback.
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/e083474b3d467a86e6cb53da8072de4515bd6276.1669100542.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Some inconsistent usage of white space in the PM-runtime core code
causes that code to be somewhat harder to read that it would have
been otherwise, so adjust the white space in there to be more
consistent with the rest of the code.
No expected functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use fwnode_handle_put() on the node pointer to release the refcount.
Change fwnode_handle_node() to fwnode_handle_put().
Fixes: 233872585d ("device property: Add fwnode_get_next_parent()")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Daniel Scally <djrscally@gmail.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20221207112219.2652411-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
to_fw_sysfs() was changed in commit 23680f0b7d ("driver core: make
struct class.dev_uevent() take a const *") to pass in a const pointer
but not pass it back out to handle some changes in the driver core.
That isn't the best idea as it could cause problems if used incorrectly,
so switch to use the container_of_const() macro instead which will
preserve the const status of the pointer and enforce it by the compiler.
Fixes: 23680f0b7d ("driver core: make struct class.dev_uevent() take a const *")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221205121206.166576-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Switch to the new domain id aware interfaces to phase out the previous
ones. No functional change.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124230314.513924920@linutronix.de
Because rpm_callback() is a wrapper around __rpm_callback(), and the
only caller of it after the change eliminating an invocation of it
from rpm_idle(), move the former next to the latter to make the code
a bit easier to follow.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Calling __rpm_callback() from rpm_idle() after adding device links
support to the former is a clear mistake.
Not only it causes rpm_idle() to carry out unnecessary actions, but it
is also against the assumption regarding the stability of PM-runtime
status across __rpm_callback() invocations, because rpm_suspend() and
rpm_resume() may run in parallel with __rpm_callback() when it is called
by rpm_idle() and the device's PM-runtime status can be updated by any
of them.
Fixes: 21d5c57b37 ("PM / runtime: Use device links")
Link: https://lore.kernel.org/linux-pm/36aed941-a73e-d937-2721-4f0decd61ce0@quicinc.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Merge series from Eddie James <eajames@linux.ibm.com>:
The SBEFIFO hardware can now be attached over a new I2C endpoint interface
called the I2C Responder (I2CR). In order to use the existing SBEFIFO
driver, add a regmap driver for the FSI bus and an endpoint driver for the
I2CR. Then, refactor the SBEFIFO and OCC drivers to clean up and use the
new regmap driver or the I2CR interface.
This branch just has the regmap change so it can be shared with the FSI
code.
The ->set_performance_state() needs to be called before ->power_on()
when a genpd is powered on, and after ->power_off() when a genpd is
powered off. Do this in order to let the provider know to which
performance state to power on the genpd, on the power on sequence, and
also to maintain the performance for that genpd until after powering off,
on power off sequence.
There is no scenario where a consumer would need its genpd enabled and
then its performance state increased. Instead, in every scenario, the
consumer needs the genpd to be enabled from the start at a specific
performance state.
And same logic applies to the powering down. No consumer would need its
genpd performance state dropped right before powering down.
Now, there are currently two vendors which use ->set_performance_state()
in their genpd providers. One of them is Tegra, but the only genpd provider
(PMC) that makes use of ->set_performance_state() doesn't implement the
->power_on() or ->power_off(), and so it will not be affected by the ops
reversal.
The other vendor that uses it is Qualcomm, in multiple genpd providers
actually (RPM, RPMh and CPR). But all Qualcomm genpd providers that make
use of ->set_performance_state() need the order between enabling ops and
the performance setting op to be reversed. And the reason for that is that
it currently translates into two different voltages in order to power on
a genpd to a specific performance state. Basically, ->power_on() switches
to the minimum (enabling) voltage for that genpd, and then
->set_performance_state() sets it to the voltage level required by the
consumer.
By reversing the call order, we rely on the provider to know what to do
on each call, but most popular usecase is to cache the performance state
and postpone the voltage setting until the ->power_on() gets called.
As for the reason of still needing the ->power_on() and ->power_off() for a
provider which could get away with just having ->set_performance_state()
implemented, there are consumers that do not (nor should) provide an
opp-table. For those consumers, ->set_performance_state() will not be
called, and so they will enable the genpd to its minimum performance state
by a ->power_on() call. Same logic goes for the disabling.
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The dev_uevent() in struct class should not be modifying the device that
is passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Raed Salem <raeds@nvidia.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Jakob Koschel <jakobkoschel@gmail.com>
Cc: Antoine Tenart <atenart@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Wang Yufen <wangyufen@huawei.com>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: linux-pm@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221123122523.1332370-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
For DT based systems, fw_token is set to a pointer to a DT node.
commit 3da72e1837 ("cacheinfo: Decrement refcount in
cache_setup_of_node()")
doesn't increment the refcount of fw_token anymore in
cache_setup_of_node(). fw_token is indeed used as a token and not
as a (struct device_node*), so no reference to fw_token should be
kept.
However, [1] is triggered when hotplugging a CPU multiple times
since cache_shared_cpu_map_remove() decrements the refcount to
fw_token at each CPU unplugging, eventually reaching 0.
Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().
[1]
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
Modules linked in:
CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G W 6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[...]
Call trace:
[...]
of_node_release (drivers/of/dynamic.c:335)
kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
of_node_put (drivers/of/dynamic.c:49)
free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
cpuhp_thread_fun (kernel/cpu.c:785)
smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
kthread (kernel/kthread.c:376)
ret_from_fork (arch/arm64/kernel/entry.S:861)
---[ end trace 0000000000000000 ]---
Fixes: 3da72e1837 ("cacheinfo: Decrement refcount in cache_setup_of_node()")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20221116094958.2141072-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Seems the blank line to separate entries in Kconfig was missing.
Add it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221122133600.49897-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the fwnode_property_match_string() the goto label out has
an additional task. Rename the label to be more precise on
what is going to happen if goto it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221122133600.49897-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The name() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up the single existing name() callback
to have the correct signature to preserve the build.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The filter() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up all existing filter() callbacks to
have the correct signature to preserve the build.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Christian König <christian.koenig@amd.com> for the changes to
Link: https://lore.kernel.org/r/20221121094649.1556002-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The call, kobject_get_ownership(), does not modify the kobject passed
into it, so make it const. This propagates down into the kobj_type
function callbacks so make the kobject passed into them also const,
ensuring that nothing in the kobject is being changed here.
This helps make it more obvious what calls and callbacks do, and do not,
modify structures passed to them.
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the dawn of MMIO gpio-regmap users, it is desirable to let
gpio-regmap ask the regmap if it might sleep during an access so
it can pass that information to gpiochip. Add a new regmap_might_sleep()
to query the regmap.
Signed-off-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/r/20221121150843.1562603-1-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
Adjust to reality and remove another layer of pointless Kconfig
indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve
all purposes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de
When a range of descriptors is freed then all of them are not associated to
a linux interrupt. Remove the filter and add a warning to the free function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122013.888850936@linutronix.de
Not only platform devices described by OF have named interrupts, but
devices described by ACPI also have named interrupts. The fwnode is an
abstraction to different standards, and using fwnode_irq_get_byname can
support more devices.
Signed-off-by: Soha Jin <soha@lohu.info>
Tested-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.
This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/02ba683a5c0716638ad8ca11e8b0fdca97c4f294.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refcounts to DT nodes are only incremented in the function
and never decremented. Decrease the refcounts when necessary.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20221026185954.991547-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver_allows_async_probing is only used in drivers/base/dd.c, so mark
it static and remove the declaration in drivers/base/base.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no in-kernel user of this function, so it is not needed anymore
and can be removed.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221109140711.105222-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no in-kernel user of this function, so it is not needed anymore
and can be removed.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221109140711.105222-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The arch timer cannot wake up the Qualcomm Technologies, Inc. (QTI) SoCs
from the deeper CPUidle states. To be able to wakeup from these deeper
states, another always-on timer needs to be programmed through the so
called CONTROL_TCS.
As the RSC is part of CPU subsystem and the corresponding APSS RSC device
is attached to the cluster PM domain (through genpd), it holds the
responsibility to program the always-on timer, before entering any of these
deeper CPUidle states.
However, programming the timer requires information about the next hrtimer
wakeup for the cluster PM domain, which is currently only known by genpd.
Therefore, let's share this data through a new genpd helper function,
dev_pm_genpd_get_next_hrtimer().
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
[Ulf: Reworked the code and updated the commit message]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # SM8450
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221018152837.619426-5-ulf.hansson@linaro.org
Commit faa87ce919 ("regmap-irq: Introduce config registers for irq
types") added the num_config_regs, then commit 9edd4f5aee ("regmap-irq:
Deprecate type registers and virtual registers") suggested to replace
num_type_reg with it. However, regmap_add_irq_chip_fwnode wasn't modified
to use the new property. Later on, commit 255a03bb1b ("ASoC: wcd9335:
Convert irq chip to config regs") removed the old num_type_reg property
from the WCD9335 driver's struct regmap_irq_chip, causing a null pointer
dereference in regmap_irq_set_type when it tried to index d->type_buf as
it was never allocated in regmap_add_irq_chip_fwnode:
[ 39.199374] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 39.200006] Call trace:
[ 39.200014] regmap_irq_set_type+0x84/0x1c0
[ 39.200026] __irq_set_trigger+0x60/0x1c0
[ 39.200040] __setup_irq+0x2f4/0x78c
[ 39.200051] request_threaded_irq+0xe8/0x1a0
Use num_config_regs in regmap_add_irq_chip_fwnode instead of num_type_reg,
and fall back to it if num_config_regs isn't defined to maintain backward
compatibility.
Fixes: faa87ce919 ("regmap-irq: Introduce config registers for irq types")
Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com>
Link: https://lore.kernel.org/r/20221107202114.823975-1-y.oudjana@protonmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The callbacks in struct class namespace() and get_ownership() do not
modify the struct device passed to them, so mark the pointer as constant
and fix up all callbacks in the kernel to have the correct function
signature.
This helps make it more obvious what calls and callbacks do, and do not,
modify structures passed to them.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221001165426.2690912-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Round up allocations with kmalloc_size_roundup() so that devres's use
of ksize() is always accurate and no special handling of the memory is
needed by KASAN, UBSAN_BOUNDS, nor FORTIFY_SOURCE.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221018090406.never.856-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If class_add_groups() returns error, the 'cp->subsys' need be
unregister, and the 'cp' need be freed.
We can not call kset_unregister() here, because the 'cls' will
be freed in callback function class_release() and it's also
freed in caller's error path, it will cause double free.
So fix this by calling kobject_del() and kfree_const(name) to
cleanup kobject. Besides, call kfree() to free the 'cp'.
Fault injection test can trigger this:
unreferenced object 0xffff888102fa8190 (size 8):
comm "modprobe", pid 502, jiffies 4294906074 (age 49.296s)
hex dump (first 8 bytes):
70 6b 74 63 64 76 64 00 pktcdvd.
backtrace:
[<00000000e7c7703d>] __kmalloc_track_caller+0x1ae/0x320
[<000000005e4d70bc>] kstrdup+0x3a/0x70
[<00000000c2e5e85a>] kstrdup_const+0x68/0x80
[<000000000049a8c7>] kvasprintf_const+0x10b/0x190
[<0000000029123163>] kobject_set_name_vargs+0x56/0x150
[<00000000747219c9>] kobject_set_name+0xab/0xe0
[<0000000005f1ea4e>] __class_register+0x15c/0x49a
unreferenced object 0xffff888037274000 (size 1024):
comm "modprobe", pid 502, jiffies 4294906074 (age 49.296s)
hex dump (first 32 bytes):
00 40 27 37 80 88 ff ff 00 40 27 37 80 88 ff ff .@'7.....@'7....
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
backtrace:
[<00000000151f9600>] kmem_cache_alloc_trace+0x17c/0x2f0
[<00000000ecf3dd95>] __class_register+0x86/0x49a
Fixes: ced6473e74 ("driver core: class: add class_groups support")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221026082803.3458760-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently PageHWPoison flag does not behave well when experiencing memory
hotremove/hotplug. Any data field in struct page is unreliable when the
associated memory is offlined, and the current mechanism can't tell
whether a memory block is onlined because a new memory devices is
installed or because previous failed offline operations are undone.
Especially if there's a hwpoisoned memory, it's unclear what the best
option is.
So introduce a new mechanism to make struct memory_block remember that a
memory block has hwpoisoned memory inside it. And make any online event
fail if the onlining memory block contains hwpoison. struct memory_block
is freed and reallocated over ACPI-based hotremove/hotplug, but not over
sysfs-based hotremove/hotplug. So the new counter can distinguish these
cases.
Link: https://lkml.kernel.org/r/20221024062012.1520887-5-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On platforms which use SHUTDOWN as hibernation mode, the genpd noirq
hooks will be called like below.
genpd_freeze_noirq() genpd_restore_noirq()
↓ ↑
Create snapshot image Restore target kernel
↓ ↑
genpd_thaw_noirq() genpd_freeze_noirq()
↓ ↑
Write snapshot image Read snapshot image
↓ ↑
power_down() Kernel boot
As of today suspend hooks genpd_suspend[resume]_noirq() manages domain
on/off state, but hibernate hooks genpd_freeze[thaw]_noirq() doesn't.
This results in a different behavior of domain power state between suspend
and hibernate freeze, i.e. domain is powered off for the former while on
for the later. It causes a problem on platforms like i.MX where the
domain needs to be powered on/off by calling clock and regulator interface.
When the platform restores from hibernation, the domain is off in hardware
and genpd_restore_noirq() tries to power it on, but will never succeed
because software state of domain (clock and regulator) is left on from the
last hibernate freeze, so kernel thinks that clock and regulator are
enabled while they are actually not turned on in hardware. The
consequence would be that devices in the power domain will access
registers without clock or power, and cause hardware lockup.
Power off[on] domain in hibernate .freeze[thaw]_noirq hook for reasons:
- Align the behavior between suspend and hibernate freeze.
- Have power state of domains stay in sync between hardware and software
for hibernate freeze, and thus fix the lockup issue seen on i.MX
platform.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Most of the logic between genpd_restore_noirq() and genpd_resume_noirq()
are identical. The suspended_count decrement for restore should be the
right thing to do anyway, considering there is an increment in
genpd_finish_suspend() for hibernation. So consolidate these two
functions into genpd_finish_resume().
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
While argument `poweroff` works fine for genpd_finish_suspend() to handle
distinction between suspend and poweroff, it won't scale if we want to
use it for freeze as well. Pass generic PM noirq hooks as arguments
instead, so that the function can possibly cover freeze case too.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The genpd status manipulation for hibernate restore has really never
worked as intended. For example, if the genpd->status was GENPD_STATE_ON,
the parent domain's `sd_count` must have been increased, so it needs to
be adjusted too. So drop this status manipulation.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A common exploit pattern for ROP attacks is to abuse prepare_kernel_cred()
in order to construct escalated privileges[1]. Instead of providing a
short-hand argument (NULL) to the "daemon" argument to indicate using
init_cred as the base cred, require that "daemon" is always set to
an actual task. Replace all existing callers that were passing NULL
with &init_task.
Future attacks will need to have sufficiently powerful read/write
primitives to have found an appropriately privileged task and written it
to the ROP stack as an argument to succeed, which is similarly difficult
to the prior effort needed to escalate privileges before struct cred
existed: locate the current cred and overwrite the uid member.
This has the added benefit of meaning that prepare_kernel_cred() can no
longer exceed the privileges of the init task, which may have changed from
the original init_cred (e.g. dropping capabilities from the bounding set).
[1] https://google.com/search?q=commit_creds(prepare_kernel_cred(0))
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steve French <sfrench@samba.org>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Link: https://lore.kernel.org/r/20221026232943.never.775-kees@kernel.org
- Fix the documentation of the *_match_string() family of functions to
properly cover the return value (Andy Shevchenko).
- Fix a possible integer overflow during multiplication in the ACPI
PCC code (Manank Patel).
- Make the ACPI device resources code skip IRQ override on Asus
Vivobook S5602ZA (Tamim Khan).
- Add LATT2021 to the list of device IDs that are ignored when
returned by _DEP, because there are no drivers for them in the
kernel and no plans to add such drivers (Hans de Goede).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmNb7XoSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx4AgP/1Q78/7GisCGmymtggxYqYwo7Zr0F6GO
yZ/OA4+TQOHeYLjjDH3njlmQnQLyLm8xEwPw+9ThK3p35eVMy01UoQdC5nf2mz9o
mzvR2Vh/7Ed0W58DP4KEyo5IAKGBYWD3SGSoKY0H8v0VBcQt2QLO0LFeZjulTFnb
TsIXTMG/e5ymwYN1tDil/Fiwpr2HUq1D+/jL7bxjw5Pb/IkYCuQXv/5DPyi3TH6e
d8d0CRoc8HKBOjyvAhuQ6bfVotYE/qSOz3gpVXcBwQGiTkPW6Ytk65sfXLEMO0Bz
LTKyQTkduJHPiEMNw32iAwTCrRIfsu5s+98Z/gYCHDY1oojQBMoFQhELzL2/HnLu
1Ab0v9sm8M6yPqVMC2F+wTQLAjkP4LZuTGt/xBkZ4VYlJnpr03ChyBlwut1AjHAs
Xsspal6FDupGI6VY935QBOMwVF75WEFAoR/CfKFIJhkEGxN7VrxlXlLo6ks+06Bi
85dLk0CiTxBcm3bfRaCgLt5AsQYYaDhRxEB70Hs0R4n9gkCTmuiMyZ+hdJYw5SCc
t0sPFF1Fd2pqaZtTMchV7nH9oaAeM/K+6TztXcR5b4iuhoTPEorVR+3mJwb80HRl
Y+Fhl/T7Q2kS+W5LxINsjkKyHBxkEdd0JadEfRlAESOfpwkIZIac0SHVPfv/KGt4
RrV/T6dbzkfi
=OG6y
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and device properties fixes from Rafael Wysocki:
"These fix device properties documentation and the ACPI PCC code, add a
new IRQ override quirk for resource handling and add one more item to
the list of device IDs to be ignored when returned by _DEP.
Specifics:
- Fix the documentation of the *_match_string() family of functions
to properly cover the return value (Andy Shevchenko)
- Fix a possible integer overflow during multiplication in the ACPI
PCC code (Manank Patel)
- Make the ACPI device resources code skip IRQ override on Asus
Vivobook S5602ZA (Tamim Khan)
- Add LATT2021 to the list of device IDs that are ignored when
returned by _DEP, because there are no drivers for them in the
kernel and no plans to add such drivers (Hans de Goede)"
* tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: scan: Add LATT2021 to acpi_ignore_dep_ids[]
ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA
ACPI: PCC: Fix unintentional integer overflow
device property: Fix documentation for *_match_string() APIs
Platforms can provide the information about the availability of each
idle states via status flag. Platforms may have to disable one or more
idle states for various reasons like broken firmware or other unmet
dependencies.
Fix handling of such unavailable/disabled idle states by ignoring them
while parsing the states.
Fixes: a3381e3a65 ("PM / domains: Fix up domain-idle-states OF parsing")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The returned value on success is an index of the matching string,
starting from 0. Reflect this in the documentation.
Fixes: 3f5c8d3187 ("device property: Add fwnode_property_match_string()")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Constify parameter in device_dma_supported() and device_get_dma_attr()
since they don't alter anything related to it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device parameter is not altered in the device child node APIs,
constify them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fwnode and device parameters are not altered in the fwnode
connection match APIs, constify them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's not fully correct to take a const parameter pointer to a struct
and return a non-const pointer to a member of that struct.
Instead, introduce a const version of the dev_fwnode() API which takes
and returns const pointers and use it where it's applicable.
With this, convert dev_fwnode() to be a macro wrapper on top of const
and non-const APIs that chooses one based on the type.
Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Fixes: aade55c860 ("device property: Add const qualifier to device_get_match_data() parameter")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Core code:
- Provide a generic wrapper which can be utilized in drivers to handle
the problem of force threaded demultiplex interrupts on RT enabled
kernels. This avoids conditionals and horrible quirks in drivers all
over the place.
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT safe.
- Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation.
- Make irqchip_init() available for pure ACPI based systems.
- Provide a functional DT binding for the Realtek RTL interrupt chip.
- The usual DT updates and small code improvements all over the place.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmNGxRYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWJyD/0emJAlIuD0DzkEkoAtnHSq7eyGFMpI
PFMyZ0IYXlVWuxEmQMyd7E9M+fmlRqnnhErg6x7jPW1bKzoyIn1A7eNE/cvhXPru
BiTy6g2o7pNegUh5bQrE8p0Yyq6/HsVO4YyE3RGxpUQVh/qwB+RKnzUY6RfDj87z
naQx10+15b+76SXvTQpIrvQTWhfTswk9un2MYDkjHctfVgjcnb/8dTPQuXsZrdTQ
VBWWwjLpCKcqqQS1e9MQqmQKpVqGs/DGW8XNTPk3jI4QF1fIHjhNdcoI51/lM4Ri
r912FPE8R48FS9g0dQgpMxGmHjikYpf3rXXosn8uyWkt5zNy6CXOEEg3DRIoAIdg
czKve+bgZZXUK/QcSSdPuPthBoLKQCG5MZsVFNF8IArmPCHaiYcOQBe7pel3U4cc
MpQe9yUXJI40XgwTAyAOlidjmD69384nEhzbI5d/AfJI5ssdXcBMrFN/xEeBDWdz
Dg2+Yle9HNglxBA6E3GX3yiaCQJxHFhKMnqd1zhxWjXFRzkfGF7bBpRj1j+vXnzN
ap/wMQuMlOWriWsH3UkZtFrC4PvgByGVfzlzYA076CjutyYfQolQ8k0bLHnp2VSu
VWUn4WATfaxJcqij7vyI9BYtFXdrB/yYhFasDBepQbDgiy8WEAmX+bObvXWs9XYa
UGVCNGsYx2TKMA==
=2ok5
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt updates from Thomas Gleixner:
"Core code:
- Provide a generic wrapper which can be utilized in drivers to
handle the problem of force threaded demultiplex interrupts on RT
enabled kernels. This avoids conditionals and horrible quirks in
drivers all over the place
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT
safe
Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation
- Make irqchip_init() available for pure ACPI based systems
- Provide a functional DT binding for the Realtek RTL interrupt chip
- The usual DT updates and small code improvements all over the
place"
* tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
irqchip: IMX_MU_MSI should depend on ARCH_MXC
irqchip/imx-mu-msi: Fix wrong register offset for 8ulp
irqchip/ls-extirq: Fix invalid wait context by avoiding to use regmap
dt-bindings: irqchip: Describe the IMX MU block as a MSI controller
irqchip: Add IMX MU MSI controller driver
dt-bindings: irqchip: renesas,irqc: Add r8a779g0 support
irqchip/gic-v3: Fix typo in comment
dt-bindings: interrupt-controller: ti,sci-intr: Fix missing reg property in the binding
dt-bindings: irqchip: ti,sci-inta: Fix warning for missing #interrupt-cells
irqchip: Allow extra fields to be passed to IRQCHIP_PLATFORM_DRIVER_END
platform-msi: Export symbol platform_msi_create_irq_domain()
irqchip/realtek-rtl: use parent interrupts
dt-bindings: interrupt-controller: realtek,rtl-intc: require parents
irqchip/realtek-rtl: use irq_domain_add_linear()
irqchip: Make irqchip_init() usable on pure ACPI systems
bcma: gpio: Use generic_handle_irq_safe()
gpio: mlxbf2: Use generic_handle_irq_safe()
platform/x86: intel_int0002_vgpio: Use generic_handle_irq_safe()
ssb: gpio: Use generic_handle_irq_safe()
pinctrl: amd: Use generic_handle_irq_safe()
...
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It it apparently slight more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
=xfWx
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
* Add support for two new platforms (Zhang Rui).
* Adjust energy unit for Sapphire Rapids (Zhang Rui).
* Do not dump TRL if turbo is not supported (Artem Bityutskiy).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmNEVSISHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxKqUP/2Etq/OiSpMx7TYfLw9wpR6o+QOMSPuz
fjEByubypGBA387Z7F3siRBhy0tRiiU0h3Ti/cPtxkNVx7lLc+yEJYYvbkIvm9Fg
lhev8ky9m/JB9vq+r4GWnUWW3DntM84a3f/iNoqEMDeUUy0YhsPLfdIcEKN5y/qU
Q1BQ+4XY4PSwY0G315JT+EDxkMK/YcoJyi+1dlSO5iilwz9DJBT2iD8M+18Lz1k4
GWUryTvcANSimx4WzLFvCXGHagA5Yv7czgIMI3eBhTaln7P2G/Jn0LCrbVcHJk0t
dBgoK/FN9akbXzg6697No32soDLxf5uLzsUsYS3Xn/tdfp1b5jTX2o548qKjNBl7
x0ot2hUp713fS/4RVGeRTQcwzy1dMyzp35pMh+B61qKyVZTPcOXC6OcXRR5FmXOP
iZo84snmPRhQPT387HpRwhpv6Pm/M9EIdCpXY1e21V8fpGdwxAbo+sLN4Kh+bkbS
McxC7q5gOKFij8ucyVGUFQPMkkOuRi0kQMNVH8raobVUvsI8bAGau61LTpudyG6p
6kmcaKu4oEpl2cYpgQwdF5fT8NAp3H8cGuzkJRJqHijs0E4Kuqd9bigoPigUQTT2
1D92S/kiUcxJo4pJZ0eDaxB05LDPMeGjnm1DUz5kh8PJE060tjUMlaTG9Gh4HjlH
uQjgblji2NVc
=IsN+
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the turbostat utility, extend the macros used for
defining device power management callbacks and add a diagnostic
message to the generic power domains code.
Specifics:
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan
Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
- Add support for two new platforms (Zhang Rui).
- Adjust energy unit for Sapphire Rapids (Zhang Rui).
- Do not dump TRL if turbo is not supported (Artem Bityutskiy)"
* tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power turbostat: version 2022.10.04
tools/power turbostat: Use standard Energy Unit for SPR Dram RAPL domain
tools/power turbostat: Do not dump TRL if turbo is not supported
tools/power turbostat: Add support for MeteorLake platforms
tools/power turbostat: Add support for RPL-S
PM: Improve EXPORT_*_DEV_PM_OPS macros
PM: domains: log failures to register always-on domains
* Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
* The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
* The CD-ROM filesystems have been enabled in the defconfig.
* Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
+/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
5JoE1jLMVhjcKQ==
=LJg6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
- The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
- The CD-ROM filesystems have been enabled in the defconfig.
- Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: enable THP_SWAP for RV64
RISC-V: Print SSTC in canonical order
riscv: compat: s/failed/unsupported if compat mode isn't supported
RISC-V: Increase range and default value of NR_CPUS
cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
perf: RISC-V: throttle perf events
perf: RISC-V: exclude invalid pmu counters from SBI calls
riscv: enable CD-ROM file systems in defconfig
riscv: topology: fix default topology reporting
arm64: topology: move store_cpu_topology() to shared code
am sending out early due to me travelling next week. There is a
lone mm patch for which Andrew gave an informal ack at
https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org.
I will send the bulk of ARM work, as well as other
architectures, at the end of next week.
ARM:
* Account stage2 page table allocations in memory stats.
x86:
* Account EPT/NPT arm64 page table allocations in memory stats.
* Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses.
* Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of
Hyper-V now support eVMCS fields associated with features that are
enumerated to the guest.
* Use KVM's sanitized VMCS config as the basis for the values of nested VMX
capabilities MSRs.
* A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed
a longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed
for good.
* A handful of fixes for memory leaks in error paths.
* Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow.
* Never write to memory from non-sleepable kvm_vcpu_check_block()
* Selftests refinements and cleanups.
* Misc typo cleanups.
Generic:
* remove KVM_REQ_UNHALT
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz
QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/
FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4
3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J
mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE
+cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig==
=/hqX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT arm64 page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
Here is the big set of driver core and debug printk changes for 6.1-rc1.
Included in here is:
- dynamic debug updates for the core and the drm subsystem. The
drm changes have all been acked by the relevant maintainers.
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they
were not being used and they really did not actually do
anything.)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BYUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylozwCdFRlcghaf7XBUyNgRZRwMC+oQI8EAn1G/nEDE
6aFd2er41uK0IGQnSmYO
=OK0k
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several user
visible changes too:
- Support for I/O ports in regmap-mmio.
- Support for accelerated noinc operations in regmap-mmio.
- Support for tracing the register values in bulk operations.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmM6sskACgkQJNaLcl1U
h9Bj1Qf+PLNz4gQq7ki06KI6+Liz6daN5j/bfUGSizx6rns5qOZxt1A/CwDFM5ZR
6aY+tGL4ksYfEUEZHsr7qaQKptWoLmwDVX0oYZqHdBf4Wf7I6iht8WCq68KwDCzz
zQoswzoLmVuRd6aplFifEF3SOqjBrTQO3gkXBteIeA6/i1pwO9wOJdM4ZU54FX+Q
zexv7H/E9uKVonrViBMLPczaPhge4+ILNEDekSUW4AZ0RmUZ6JjW3aOZbwR6ut1x
7bS3ric9xGhW4IQdOZISY+ARhPPworcgdK5GoqBfjWV2vYc0c1iCawvF73Wm/NJC
GMGc5FIBi3a82oCMmSR1dAcci9CRLw==
=TTeQ
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several
user visible changes too:
- Support for I/O ports in regmap-mmio
- Support for accelerated noinc operations in regmap-mmio
- Support for tracing the register values in bulk operations"
* tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: mmio: replace return 0 with break in switch statement
regmap: spi-avmm: Use swabXX_array() helpers
regmap: mmio: Use swabXX_array() helpers
swab: Add array operations
regmap: trace: Remove unneeded blank lines
regmap: trace: Remove explicit castings
regmap: trace: Remove useless check for NULL for bulk ops
regmap: mmio: Fix rebase error
regmap: check right noinc bounds in debug print
regmap: introduce value tracing for regmap bulk operations
regmap/hexagon: Properly fix the generic IO helpers
regmap: mmio: Support accelerared noinc operations
regmap: Support accelerated noinc operations
regmap: Make use of get_unaligned_be24(), put_unaligned_be24()
regmap: mmio: Fix MMIO accessors to avoid talking to IO port
regmap: mmio: Introduce IO accessors that can talk to IO port
regmap: mmio: Get rid of broken 64-bit IO
regmap: mmio: Remove mmio_relaxed member from context
Always-on PM domains must be on during initialisation or the domain is
currently silently rejected.
Print an error message in case an always-on domain is not on to make it
easier to debug drivers getting this wrong (e.g. by setting an always-on
genpd flag without making sure that the state matches).
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5iLgPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDvUgP/1XJGuYUsx3yvhe+JQNlO4R/T7DoH4rKhVNI
KU2UPubXzE7e9D5Z91Po9QBF+Y/kyBdIL9Gh9GcK79dQzAJdPy4WxYPX3dWXKuvi
swnmOSeoTJgEXHYan7yViRS+lVk6QOrXUhpL2fiRlG4QOoX0KpMljPhr5+P0LM7Y
QmerbXAuufa1sGz51Cdsptn16hSNke8MMJ54UB0y9gHL0Rp8VuIqQCIno9t93RKM
JXCKpTFUwMbtGX0twREyN6kPCVb9cbXL8NGj1jz+MoUcesCnsaObxmIST/cxTml8
79U+5PVu+HCBE/sVco0MVKBMHw4MAnHZyQl4+8snsyH7NVlOJFQ9VH2+1GykOjJ+
4TCHZVUbHAgpkp1cB85tlANGUSdpn9mAR+nPc70eNTOYoSXgNITstjea0oIMVusA
KavzKzUCGuHeyhVeEpCdreaxfcUhpiSfYXbLIfVzrzYTsbmxLXcqHxwETxAFduW5
kp9owyj1ZpEmqQHTqHM7YgALjOq/u08/IJ55eY50XPy4/eztaOS+quMYWHc6k3TR
M6lqtx+AGBCc62lUibLf6O+tsJjguM/98zLdVhiWThSNsMhjQ8yMYPwlr6VDVCvi
FnsvYLxHi6rhJlEK+1WOMZ3Omub8l9dkrL/jVDPPCkHwkGtHB0nmGQLjpMryN01v
uhtroZZ4
=JTl5
-----END PGP SIGNATURE-----
Merge tag 'irqchip-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates from Marc Zyngier:
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
Link: https://lore.kernel.org/r/20221002125554.3902840-1-maz@kernel.org
The memory-notify-based approach aims to handle meory-less nodes, however,
it just adds the complexity of code as pointed by David in thread [1].
The handling of memory-less nodes is introduced by commit 4faf8d950e
("hugetlb: handle memory hot-plug events"). >From its commit message, we
cannot find any necessity of handling this case. So, we can simply
register/unregister sysfs entries in register_node/unregister_node to
simlify the code.
BTW, hotplug callback added because in hugetlb_register_all_nodes() we
register sysfs nodes only for N_MEMORY nodes, seeing commit 9b5e5d0fdc,
which said it was a preparation for handling memory-less nodes via memory
hotplug. Since we want to remove memory hotplug, so make sure we only
register per-node sysfs for online (N_ONLINE) nodes in
hugetlb_register_all_nodes().
https://lore.kernel.org/linux-mm/60933ffc-b850-976c-78a0-0ee6e0ea9ef0@redhat.com/ [1]
Link: https://lkml.kernel.org/r/20220914072603.60293-3-songmuchun@bytedance.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "simplify handling of per-node sysfs creation and removal",
v4.
This patch (of 2):
The following commit offload per-node sysfs creation and removal to a
kworker and did not say why it is needed. And it also said "I don't know
that this is absolutely required". It seems like the author was not sure
as well. Since it only complicates the code, this patch will revert the
changes to simplify the code.
39da08cb07 ("hugetlb: offload per node attribute registrations")
We could use memory hotplug notifier to do per-node sysfs creation and
removal instead of inserting those operations to node registration and
unregistration. Then, it can reduce the code coupling between node.c and
hugetlb.c. Also, it can simplify the code.
Link: https://lkml.kernel.org/r/20220914072603.60293-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220914072603.60293-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies).
- Update the AMD P-state driver (Perry Yuan):
* Fix wrong lowest perf fetch.
* Map desired perf into pstate scope for powersave governor.
* Update pstate frequency transition delay time.
* Fix initial highest_perf value.
* Clean up.
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba).
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski).
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye xingchen,
and Yang Yingliang).
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen Yan,
and Viresh Kumar).
- Minor cleanups around functions called at module_init() (Xiu Jianfeng).
- Use module_init and add module_exit for bmips driver (Zhang Jianhua).
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno).
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OrYSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxeKAP/jFiZ1lhTGRngiVLMV6a6SSSy5xzzXZZ
b/V0oqsuUvWWo6CzVmfU4QfmKGr55+77NgI9Yh5qN6zJTEJmunuCYwVD80KdxPDJ
8SjMUNCACiVwfryLR1gFJlO+0BN4CWTxvto2gjGxzm0l1UQBACf71wm9MQCP8b7A
gcBNuOtM7o5NLywDB+/528SiF9AXfZKjkwXhJACimak5yQytaCJaqtOWtcG2KqYF
USunmqSB3IIVkAa5LJcwloc8wxHYo5mTPaWGGuSA65hfF42k3vJQ2/b8v8oTVza7
bKzhegErIYtL6B9FjB+P1FyknNOvT7BYr+4RSGLvaPySfjMn1bwz9fM1Epo59Guk
Azz3ExpaPixDh+x7b89W1Gb751FZU/zlWT+h1CNy5sOP/ChfxgCEBHw0mnWJ2Y0u
CPcI/Ch0FNQHG+PdbdGlyfvORHVh7te/t6dOhoEHXBue+1r3VkOo8tRGY9x+2IrX
/JB968u1r0oajF0btGwaDdbbWlyMRTzjrxVl3bwsuz/Kv/0JxsryND2JT0zkKAMZ
qYT29HQxhdE0Duw1chgAK6X+BsgP58Bu6LeM3mVcwnGPZE9QvcFa0GQh7z+H71AW
3yOGNmMVMqQSThBYFC6GDi7O2N1UEsLOMV9+ThTRh6D11nU4uiITM5QVIn8nWZGR
z3IZ52Jg0oeJ
=+3IL
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for some new hardware, extend the existing hardware
support, fix some issues and clean up code
Specifics:
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies)
- Update the AMD P-state driver (Perry Yuan):
- Fix wrong lowest perf fetch
- Map desired perf into pstate scope for powersave governor
- Update pstate frequency transition delay time
- Fix initial highest_perf value
- Clean up
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba)
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski)
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye
xingchen, and Yang Yingliang)
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen
Yan, and Viresh Kumar)
- Minor cleanups around functions called at module_init() (Xiu
Jianfeng)
- Use module_init and add module_exit for bmips driver (Zhang
Jianhua)
- Add AlderLake-N support to intel_idle (Zhang Rui)
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang)
- Remove redundant check from cpuidle_switch_governor() (Yu Liao)
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang)
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang)
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki)
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello)
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang)
- Update the intel_rapl power capping driver:
- Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
- Add support for RAPTORLAKE_S (Zhang Rui).
- Fix UBSAN shift-out-of-bounds issue (Chao Qin)
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno)
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET)"
* tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
cpufreq: qcom-cpufreq-hw: Add cpufreq qos for LMh
cpufreq: Add __init annotation to module init funcs
cpufreq: tegra194: change tegra239_cpufreq_soc to static
PM / devfreq: rockchip-dfi: Fix an error message
PM / devfreq: mtk-cci: Handle sram regulator probe deferral
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
intel_idle: Add AlderLake-N support
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
cpufreq: tegra194: Add support for Tegra239
cpufreq: qcom-cpufreq-hw: Fix uninitialized throttled_freq warning
cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
powercap: intel_rapl: Add support for RAPTORLAKE_S
cpufreq: amd-pstate: Fix initial highest_perf value
cpuidle: Remove redundant check in cpuidle_switch_governor()
PM: wakeup: Add extra debugging statement for multiple active IRQs
cpufreq: tegra194: Remove the unneeded result variable
PM: suspend: move from strlcpy() with unused retval to strscpy()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
...
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki).
- Rename ACPI device object reference counting functions (Rafael
Wysocki).
- Rearrange ACPI device object initialization code (Rafael Wysocki).
- Drop parent field from struct acpi_device (Rafael Wysocki).
- Extend the the int3472-tps68470 driver to support multiple consumers
of a single TPS68470 along with the requisite framework-level
support (Daniel Scally).
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus).
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw).
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus).
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello).
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry).
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen).
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv).
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko).
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello).
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin).
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede).
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner).
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
- Drop unneeded result variable from ec_write() (ye xingchen).
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun
Guo).
- Reorder symbols to get rid of a few forward declarations in the ACPI
fan driver (Uwe Kleine-König).
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander).
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam).
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki).
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang).
- Do not initialize ret in main() in the pfrut utility (Shi junming).
- Drop useless ACPI DSDT override documentation (Rafael Wysocki).
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare).
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
integer value (Andy Shevchenko).
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko).
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OhkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/TkQALQ4TN451dPSj9jcYSNY6qZ/9b4P9Iym
TmRf3wO3+IVZQ8JajeKKRuVKNsW3sC0RcFkJJVmgZkydJBr1Uui2L0ZLzi8axGNy
RlbZm5NyBeFnlP0fA8Gb2iRMXVAUcRIx+RvZCulxxFmgQ8UhoU4wlVZWlEcko4TQ
hGp++lJYcRHR1NbVLSXZhFvzopKLdhGL6vB1Awsjb/I7TVqn23+k4jVRV1DYkIQ7
qgFM+Z7osRVZiVQbaPoOgdykeSa43qXu7Vgs7F/QeJuIiUYx59xDh0/WCJBxnuDM
cHGiaNnvuJghKmCg43X8+joaHEH/jCFyvBVGfiSzRvjz03WOPRs1XztwdEiCi+py
RcZGzrPaXmkCjNeytPRooiifyqm95HT7aMBN/aTvKBXDaGRrfPheXF+i2idl24HM
NrHqMaa0+5qoDGHLUEaf5znlCHfS+3lwq6+lGVrq/UGf6B3cP+9HwOyevEW493JX
4nuv69Y517moR9W3mBU8sAn5mUjshcka7pghRj7QnuoqRqWLbU3lIz8oUDHr84cI
ixpIPvt2KlZ5UjnN9aqu/6k70JkJvy4SrKjnx4iqu03ePmMrRc0Hcpy7+VMlgumD
tgN9aW+YDgy0/Z5QmO1MOvFodVmA5sX6+gnX1neAjuDdIo3LkJptlkO1fCx2jfQu
cgPQk1CtPOos
=xyUK
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"ACPI and PNP updates for 6.1-rc1.
These rearrange the ACPI device object initialization code (to get rid
of a redundant parent pointer from struct acpi_device among other
things), unify the _UID handling, drop support for some _OSI strings
that should not be necessary any more, add new IDs to support more
hardware and some more quirks, fix a few issues and clean up code all
over.
Specifics:
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki)
- Rename ACPI device object reference counting functions (Rafael
Wysocki)
- Rearrange ACPI device object initialization code (Rafael Wysocki)
- Drop parent field from struct acpi_device (Rafael Wysocki)
- Extend the the int3472-tps68470 driver to support multiple
consumers of a single TPS68470 along with the requisite
framework-level support (Daniel Scally)
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus)
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw)
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus)
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello)
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry)
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen)
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv)
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko)
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello)
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin)
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede)
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner)
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton)
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan)
- Fix Tx acknowledge in the PCC address space handler (Huisong Li)
- Use wait_for_completion_timeout() for PCC mailbox operations
(Huisong Li)
- Release resources on PCC address space setup failure path (Rafael
Mendonca)
- Remove unneeded result variables from APEI code (ye xingchen)
- Print total number of records found during BERT log parsing (Dmitry
Monakhov)
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello)
- Drop unneeded result variable from ec_write() (ye xingchen)
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver
(Hanjun Guo)
- Reorder symbols to get rid of a few forward declarations in the
ACPI fan driver (Uwe Kleine-König)
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander)
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam)
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki)
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang)
- Do not initialize ret in main() in the pfrut utility (Shi junming)
- Drop useless ACPI DSDT override documentation (Rafael Wysocki)
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare)
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into
an integer value (Andy Shevchenko)
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko)
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui)"
* tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits)
ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device()
ACPI: LPSS: Replace loop with first entry retrieval
ACPI: x86: s2idle: Add another ID to s2idle_dmi_table
ACPI: x86: s2idle: Fix a NULL pointer dereference
MAINTAINERS: Drop records pointing to 01.org/linux-acpi
ACPI: Kconfig: Drop link to https://01.org/linux-acpi
ACPI: docs: Drop useless DSDT override documentation
ACPI: DPTF: Drop stale link from Kconfig help
ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13
ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7
ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14
ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE
ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID
ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt
ACPI: x86: s2idle: Move _HID handling for AMD systems into structures
platform/x86: int3472: Add board data for Surface Go2 IR camera
platform/x86: int3472: Support multiple gpio lookups in board data
platform/x86: int3472: Support multiple clock consumers
ACPI: bus: Add iterator for dependent devices
ACPI: scan: Add acpi_dev_get_next_consumer_dev()
...
Merge cpuidle changes, PM core changes and power capping changes for
6.1-rc1:
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
* pm-cpuidle:
intel_idle: Add AlderLake-N support
cpuidle: Remove redundant check in cpuidle_switch_governor()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
cpuidle: coupled: Drop duplicate word from a comment
* pm-core:
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
* pm-sleep:
PM: wakeup: Add extra debugging statement for multiple active IRQs
PM: suspend: move from strlcpy() with unused retval to strscpy()
* powercap:
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
powercap: intel_rapl: Add support for RAPTORLAKE_S
Merge new material related to CPPC, PCC, APEI and OSI strings handling
for 6.1-rc1:
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
* acpi-cppc:
ACPI: CPPC: Disable FIE if registers in PCC regions
ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()
* acpi-pcc:
ACPI: PCC: Fix Tx acknowledge in the PCC address space handler
ACPI: PCC: replace wait_for_completion()
ACPI: PCC: Release resources on address space setup failure path
* acpi-apei:
ACPI: APEI: Remove unneeded result variables
ACPI: APEI: Add BERT error log footer
* acpi-osi:
ACPI: OSI: Update Documentation on custom _OSI strings
ACPI: OSI: Remove Linux-HPI-Hybrid-Graphics _OSI string
ACPI: OSI: Remove Linux-Lenovo-NV-HDMI-Audio _OSI string
ACPI: OSI: Remove Linux-Dell-Video _OSI string
The prospective callers of rpm_resume() passing RPM_NOWAIT to it may
be confused when it returns 0 without actually resuming the device
which may happen if the device is suspending at the given time and it
will only resume when the suspend in progress has completed. To avoid
that confusion, return -EINPROGRESS from rpm_resume() in that case.
Since none of the current callers passing RPM_NOWAIT to rpm_resume()
check its return value, this change has no functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Add const qualifier to the device_get_match_data() parameter.
Some of the future users may utilize this function without
forcing the type.
All the same, dev_fwnode() may be used with a const qualifier.
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220922135410.49694-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use IS_ERR_OR_NULL() helper in device_create_groups_vargs() to simplify code
and improve readiblity. No functional change.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220914140753.3799982-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In following scenario(diagram), when one thread X running dev_coredumpm()
adds devcd device to the framework which sends uevent notification to
userspace and another thread Y reads this uevent and call to
devcd_data_write() which eventually try to delete the queued timer that
is not initialized/queued yet.
So, debug object reports some warning and in the meantime, timer is
initialized and queued from X path. and from Y path, it gets reinitialized
again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.
To fix this, introduce mutex and a boolean flag to serialize the behaviour.
cpu0(X) cpu1(Y)
dev_coredump() uevent sent to user space
device_add() ======================> user space process Y reads the
uevents writes to devcd fd
which results into writes to
devcd_data_write()
mod_delayed_work()
try_to_grab_pending()
del_timer()
debug_assert_init()
INIT_DELAYED_WORK()
schedule_delayed_work()
debug_object_fixup()
timer_fixup_assert_init()
timer_setup()
do_init_timer()
/*
Above call reinitializes
the timer to
timer->entry.pprev=NULL
and this will be checked
later in timer_pending() call.
*/
timer_pending()
!hlist_unhashed_lockless(&timer->entry)
!h->pprev
/*
del_timer() checks h->pprev and finds
it to be NULL due to which
try_to_grab_pending() stucks.
*/
Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This merges the driver core changes in 6.0-rc7 into driver-core-next as
they are needed here as well for testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Variable min_stride is assigned a value that is never read, fix this by
replacing the return 0 with a break statement. This also makes the case
statement consistent with the other cases in the switch statement.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220922080445.818020-1-colin.i.king@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 71066545b4.
It causes boot problems on some systems, so revert it for now until it
is worked out.
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Saravana Kannan <saravanak@google.com>
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Olof Johansson <olof@lixom.net>
Link: https://lore.kernel.org/r/CAOesGMjQHhTUMBGHQcME4JBkZCof2NEQ4gaM1GWFgH40+LN9AQ@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ
tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP
4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey
aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk
wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o
CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu
lPSI2Hw=
=7LTL
-----END PGP SIGNATURE-----
Merge 6.0-rc5 into driver-core-next
We need the driver core and debugfs changes in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Directly check state of struct memory_block, no need a single function.
Link: https://lkml.kernel.org/r/20220827112043.187028-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYxuERw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylPqwCgjU6xlN2y/80HH+66k+yyzlxocE8AoLPgnGrA
dJZIGWFXExzO26tvMT52
=zGHA
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems"
* tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
arch_topology: Make cluster topology span at least SMT CPUs
sched/debug: fix dentry leak in update_sched_domain_debugfs
debugfs: add debugfs_lookup_and_remove()
driver core: fix driver_set_override() issue with empty strings
Revert "arch_topology: Make cluster topology span at least SMT CPUs"
arch_topology: Make cluster topology span at least SMT CPUs
make_class_name has been removed since
commit 39aba963d9 ("driver core: remove CONFIG_SYSFS_DEPRECATED_V2
but keep it for block devices"), so remove it.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20220909063337.1146151-1-cuigaosheng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmMZ3OEACgkQJNaLcl1U
h9CKOwf+O49lVDWbWr1YSR1hLvUtqV+rZsk5+kHCurTbEsa6eOuajx9hqHtEpSg9
CDHODkBk1TZmyfhS6/qWU5YUVGFpskkxebsJkxvYOCqaTi5UCDCLMRTWTP4yW5Gq
YRVgjFRmPAcnfaJCFB9/1l6NOMez0EfbWwApPPh7PNYp/KiexfzTFYzWbwLLhvmX
piY2tkG1SXrUjszFOZdXeukPU+HdQy40u1V7Akcs9ihwRMiIjeo0iqlE/R8QO5i4
vgevTPKPhsQJWGnZsqJvu9gtG6+Eoac6Kp4nqk4W/OVXpQNBhDHTI94D7X78+5hS
xUwgHjYT4PjR9HmuYMksgjYov4DmFw==
=ZO8J
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck"
* tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: spi: Reserve space for register address/padding
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220905122615.12946-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is a few unneeded blank lines in some of event definitions,
remove them in order to make those definitions consistent with
the rest.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to have explicit castings to the same type the
variables are of. Remove the explicit castings.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220901132336.33234-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
If the buffer pointer is NULL we already are in troubles since
regmap bulk API expects caller to provide valid parameters,
it dereferences that without any checks before we call for
traces.
Moreover, the current code will print garbage in the case of
buffer is NULL and length is not 0.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Python likes to send an empty string for some sysfs files, including the
driver_override field. When commit 23d99baf9d ("PCI: Use
driver_set_override() instead of open-coding") moved the PCI core to use
the driver core function instead of hand-rolling their own handler, this
showed up as a regression from some userspace tools, like DPDK.
Fix this up by actually looking at the length of the string first
instead of trusting that userspace got it correct.
Fixes: 23d99baf9d ("PCI: Use driver_set_override() instead of open-coding")
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@kernel.org>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220901163734.3583106-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit cb1f65c1e1 ("PM: s2idle: ACPI: Fix wakeup interrupts
handling") was introduced the kernel can now handle multiple
simultaneous interrupts during wakeup. Ths uncovered some existing
subtle firmware bugs where multiple IRQs are unintentionally active.
To help with fixing those bugs add an extra message when PM debugging
is enabled that can show the individual IRQs triggered as if a variety
are fired they'll potentially be lost as /sys/power/pm_wakeup_irq only
contains the first one that triggered the wakeup after resume is
complete but all may be needed to demonstrate the whole picture.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215770
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[ rjw: Added empty line after if () ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 6b66ca0bac as it
breaks the build on some arches as reported by the kernel test robot.
Link: https://lore.kernel.org/r/202209030824.SouwDV5M-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 6b66ca0bac ("arch_topology: Make cluster topology span at least SMT CPUs")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic \
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220825092007.8129-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the gfp flag used for the memory allocation already has __GFP_ZERO,
then there is no need to explicitly clear the "struct devres_node". It is
already zeroed.
This saves a few cycles when using devm_zalloc() and co.
In the case of devres_alloc() (which calls __devres_alloc_node()), the
compiler could remove the test and the memset() because it should be able
to see that the __GFP_ZERO flag is set.
So this would make the code both faster and smaller.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/d255bd871484e63cdd628e819f929e2df59afb02.1658352383.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case of firmware-upload, an instance of struct fw_upload is
allocated in firmware_upload_register(). This data needs to be freed
in fw_dev_release(). Create a new fw_upload_free() function in
sysfs_upload.c to handle the firmware-upload specific memory frees
and incorporate the missing kfree call for the fw_upload structure.
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220831002518.465274-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the following code within firmware_upload_unregister(), the call to
device_unregister() could result in the dev_release function freeing the
fw_upload_priv structure before it is dereferenced for the call to
module_put(). This bug was found by the kernel test robot using
CONFIG_KASAN while running the firmware selftests.
device_unregister(&fw_sysfs->dev);
module_put(fw_upload_priv->module);
The problem is fixed by copying fw_upload_priv->module to a local variable
for use when calling device_unregister().
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220829174557.437047-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Architectures which do not have cacheinfo such as ARM 32-bit would spit
out the following during boot:
Early cacheinfo failed, ret = -2
Treat -ENOENT specifically to silence this error since it means that the
platform does not support reporting its cache information.
Fixes: 3fcbf1c77d ("arch_topology: Fix cache attributes detection in the CPU hotplug path")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220805230736.1562801-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A dangling pointless "ret 0" was left in and some unneeded
whitespace can go too.
Fixes: 81c0386c13 ("regmap: mmio: Support accelerared noinc operations")
Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831141303.501548-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers
don't need to check that separately. This will also cause the AMD
pstate driver to refuse to load right away when ACPI is disabled.
Also update the warning message in amd_pstate_init() to mention the
ACPI disabled case for completeness.
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give insights to users to how much memory
is used by the kernel and for what purposes.
Currently, memory used by KVM mmu is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat. This stat can be later extended to account
for other types of secondary pages tables (e.g. iommu page tables).
KVM has a decent number of large allocations that aren't for page
tables, but for most of them, the number/size of those allocations
scales linearly with either the number of vCPUs or the amount of memory
assigned to the VM. KVM's secondary page table allocations do not scale
linearly, especially when nested virtualization is in use.
From a KVM perspective, NR_SECONDARY_PAGETABLE will scale with KVM's
per-VM pages_{4k,2m,1g} stats unless the guest is doing something
bizarre (e.g. accessing only 4kb chunks of 2mb pages so that KVM is
forced to allocate a large number of page tables even though the guest
isn't accessing that much memory). However, someone would need to either
understand how KVM works to make that connection, or know (or be told) to
go look at KVM's stats if they're running VMs to better decipher the stats.
Furthermore, having NR_PAGETABLE side-by-side with NR_SECONDARY_PAGETABLE
is informative. For example, when backing a VM with THP vs. HugeTLB,
NR_SECONDARY_PAGETABLE is roughly the same, but NR_PAGETABLE is an order
of magnitude higher with THP. So having this stat will at the very least
prove to be useful for understanding tradeoffs between VM backing types,
and likely even steer folks towards potential optimizations.
The original discussion with more details about the rationale:
https://lore.kernel.org/all/87ilqoi77b.wl-maz@kernel.org
This stat will be used by subsequent patches to count KVM mmu
memory usage.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-2-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
We were using the wrong bound in the debug prints: this
needs to be the number of elements, not the number of bytes,
since we're indexing into an element-size typed array.
Fixes: c20cc099b3 ("regmap: Support accelerated noinc operations")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220823135700.265019-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently, only one-register io operations support tracepoints with
value logging. For the regmap bulk operations developer can view
hw_start/hw_done tracepoints with starting reg number and registers
count to be reading or writing. This patch injects tracepoints with
dumping registers values in the hex format to regmap bulk reading
and writing.
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Link: https://lore.kernel.org/r/20220816181451.5628-1-ddrokosov@sberdevices.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 9cbffc7a59.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
Fixes: 9cbffc7a59 ("driver core: Delete driver_deferred_probe_check_state()")
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the max_raw_read and max_raw_write limits in regmap_spi struct
do not take into account the additional size of the transmitted register
address and padding. This may result in exceeding the maximum permitted
SPI message size, which could cause undefined behaviour, e.g. data
corruption.
Fix regmap_get_spi_bus() to properly adjust the above mentioned limits
by reserving space for the register address/padding as set in the regmap
configuration.
Fixes: f231ff38b7 ("regmap: spi: Set regmap max raw r/w from max_transfer_size")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220818104851.429479-1-cristian.ciocaltea@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Use the newly added callback for accelerated noinc MMIO
to provide writesb, writesw, writesl, writesq, readsb, readsw,
readsl and readsq.
A special quirk is needed to deal with big endian regmaps: there
are no accelerated operations defined for big endian, so fall
back to calling the big endian operations itereatively for this
case.
The Hexagon architecture turns out to have an incomplete
<asm/io.h>: writesb() is not implemented. Fix this by doing
what other architectures do: include <asm-generic/io.h> into
the <asm/io.h> file.
Cc: Brian Cain <bcain@quicinc.com>
Cc: linux-hexagon@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-2-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Several architectures have accelerated operations for MMIO
operations writing to a single register, such as writesb, writesw,
writesl, writesq, readsb, readsw, readsl and readsq but regmap
currently cannot use them because we have no hooks for providing
an accelerated noinc back-end for MMIO.
Solve this by providing reg_[read/write]_noinc callbacks for
the bus abstraction, so that the regmap-mmio bus can use this.
Currently I do not see a need to support this for custom regmaps
so it is only added to the bus.
Callbacks are passed a void * with the array of values and a
count which is the number of items of the byte chunk size for
the specific register width.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
Currently regmap MMIO doesn't support IO ports, while being inconsistent
in used IO accessors. Fix the latter and extend framework with the
former.
Since we have a proper endianness converters for BE 24-bit data use
them. While at it, format the code using switch-cases as it's done
for the rest of the endianness handlers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220726151213.71712-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently regmap MMIO is inconsistent with IO accessors. I.e.
the Big Endian counterparts are using ioreadXXbe() / iowriteXXbe()
which are not clean implementations of readXXbe().
That said, reimplement current Big Endian MMIO accessors by replacing
ioread()/iowrite() with respective read()/write() and swab() calls.
Note, there are no current in-kernel users that may utilize the
functionality of the IO ports on Big Endian hardware. All drivers
that use regmap MMIO either Little Endian, or they don't map IO
ports in a way that ioreadXX()/iowriteXX() may be utilized.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Some users may use regmap MMIO for IO ports, and this can be done
by assigning ioreadXX()/iowriteXX() and their Big Endian counterparts
to the regmap context.
Add IO port support with a corresponding flag added.
While doing that, make sure that user won't select relaxed MMIO access
along with IO port because the latter have no relaxed variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The current implementation, besides having no active users, is broken
by design of regmap. For 64-bit IO we need to supply 64-bit value,
otherwise there is no way to handle upper 32 bits in 64-bit register.
Hence, remove the broken IO accessors for good and wait for real user
that can fix entire regmap API for that.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to keep mmio_relaxed member in the context, it's
onetime used during generation of the context. Remove it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.
This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.
With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Here is the set of driver core and kernfs changes for 6.0-rc1.
"biggest" thing in here is some scalability improvements for kernfs for
large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed
and discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYuqCnw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym/JgCcCnaycJY00ZPRQm3LQCyzfJ0HgqoAn2qxGV+K
NKycLeXZSnuvIA87dycE
=/4Jk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core / kernfs updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.0-rc1.
The "biggest" thing in here is some scalability improvements for
kernfs for large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed and
discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems"
* tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits)
docs: embargoed-hardware-issues: fix invalid AMD contact email
firmware_loader: Replace kmap() with kmap_local_page()
sysfs docs: ABI: Fix typo in comment
kobject: fix Kconfig.debug "its" grammar
kernfs: Fix typo 'the the' in comment
docs: driver-api: firmware: add driver firmware guidelines. (v3)
arch_topology: Fix cache attributes detection in the CPU hotplug path
ACPI: PPTT: Leave the table mapped for the runtime usage
cacheinfo: Use atomic allocation for percpu cache attributes
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
MAINTAINERS: Change mentions of mpm to olivia
docs: ABI: sysfs-devices-soc: Update Lee Jones' email address
docs: ABI: sysfs-class-pwm: Update Lee Jones' email address
Documentation/process: Add embargoed HW contact for LLVM
Revert "kernfs: Change kernfs_notify_list to llist."
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
...
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq policy
which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
8t6ObMxgMhWT
=XeLV
-----END PGP SIGNATURE-----
Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly minor improvements all over including new CPU IDs for
the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
updates, documentation cleanups and a new version of the pm-graph
suite of utilities.
Specifics:
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq
policy which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
function parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello)"
* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
PM: QoS: Add check to make sure CPU freq is non-negative
PM: hibernate: defer device probing when resuming from hibernation
intel_idle: make SPR C1 and C1E be independent
cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
cpufreq: loongson2: fix Kconfig "its" grammar
pm-graph v5.9
cpufreq: Warn users while freeing active policy
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
PM: domains: Ensure genpd_debugfs_dir exists before remove
PM: runtime: Extend support for wakeirq for force_suspend|resume
...
The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend.
- Support for device specific update bits operations with devices that
otherwise use bitstream interfaces.
- Support for bit operations on fields as well as whole registers.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmLnvh0ACgkQJNaLcl1U
h9DlbQf/RELySinbi6WTIthuigJTfQq7xFIrR3LsEIrLWTuKsb+mBtxPkv0sAUF4
lmTtxmixtCAp3z2xWLXTw99dxDcII49YPmR1TzKZ9vBsK0vkAof6t7BQyhFoICpy
cbGdw4Mqi3qOHHeDH3obNbYhz1IwWL47Q0eASkNyaHrrnxykIyjeJ0TTURoVWJpX
FEzGwvFtH4+5w3yc0aE+WvjzHXgPj/xAsyE835TF9jv8cW9a2/KAYPLo7gW+icaz
Qx4JrmwMBuxzJEuaRvx2dxtasDCZKPFyod1cKS2gTpF4OnNKHocDwC3cSoGDLztr
BRljUV3VfWzOB8DdeAaB5XQM8LJeiw==
=talc
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend
- Support for device specific update bits operations with devices
that otherwise use bitstream interfaces
- Support for bit operations on fields as well as whole registers"
* tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: permit to set reg_update_bits with bulk implementation
regmap: add WARN_ONCE when invalid mask is provided to regmap_field_init()
regmap-irq: Fix bug in regmap_irq_get_irq_reg_linear()
regmap: cache: Add extra parameter check in regcache_init
regmap-irq: Deprecate the not_fixed_stride flag
regmap-irq: Add get_irq_reg() callback
regmap-irq: Fix inverted handling of unmask registers
regmap-irq: Deprecate type registers and virtual registers
regmap-irq: Introduce config registers for irq types
regmap-irq: Refactor checks for status bulk read support
regmap-irq: Remove mask_writeonly and regmap_irq_update_bits()
regmap-irq: Remove inappropriate uses of regmap_irq_update_bits()
regmap-irq: Remove an unnecessary restriction on type_in_mask
regmap-irq: Cleanup sizeof(...) use in memory allocation
regmap-irq: Remove unused type_reg_stride field
regmap-irq: Convert bool bitfields to unsigned int
regmap: Don't warn about cache only mode for devices with no cache
regmap: provide regmap_field helpers for simple bit operations
regmap: cache: Fix syntax errors in comments
Merge core device power management changes for v5.20-rc1:
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
* pm-core:
PM: runtime: Extend support for wakeirq for force_suspend|resume
* pm-sleep:
PM: hibernate: defer device probing when resuming from hibernation
PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP
* powercap:
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
powercap: intel_rapl: Add support for RAPTORLAKE_P
* pm-domains:
PM: domains: Ensure genpd_debugfs_dir exists before remove
* pm-em:
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
The use of kmap() is being deprecated in favor of kmap_local_page().
Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) kmap() also requires global TLB invalidation when the kmap’s pool
wraps and it might block when the mapping space is fully utilized until a
slot becomes available.
kmap_local_page() is preferred over kmap() and kmap_atomic(). Where it
cannot mechanically replace the latters, code refactor should be considered
(special care must be taken if kernel virtual addresses are aliases in
different contexts).
With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
Call kmap_local_page() in firmware_loader wherever kmap() is currently
used. In firmware_rw() use the helpers copy_{from,to}_page() instead of
open coding the local mappings + memcpy().
Successfully tested with "firmware" selftests on a QEMU/KVM 32-bits VM
with 4GB RAM, booting a kernel with HIGHMEM64GB enabled.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Link: https://lore.kernel.org/r/20220714235030.12732-1-fmdefrancesco@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
init_cpu_topology() is called only once at the boot and all the cache
attributes are detected early for all the possible CPUs. However when
the CPUs are hotplugged out, the cacheinfo gets removed. While the
attributes are added back when the CPUs are hotplugged back in as part
of CPU hotplug state machine, it ends up called quite late after the
update_siblings_masks() are called in the secondary_start_kernel()
resulting in wrong llc_sibling_masks.
Move the call to detect_cache_attributes() inside update_siblings_masks()
to ensure the cacheinfo is updated before the LLC sibling masks are
updated. This will fix the incorrect LLC sibling masks generated when
the CPUs are hotplugged out and hotplugged back in again.
Reported-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-3-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On couple of architectures like RISC-V and ARM64, we need to detect
cache attribues quite early during the boot when the secondary CPUs
start. So we will call detect_cache_attributes in the atomic context
and since use of normal allocation can sleep, we will end up getting
"sleeping in the atomic context" bug splat.
In order avoid that, move the allocation to use atomic version in
preparation to move the actual detection of cache attributes in the
CPU hotplug path which is atomic.
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-1-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A regmap may still require to set a custom reg_update_bits instead of
relying to the regmap_bus_read/write general function.
Permit to set it in the map if provided by the regmap config.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220715201032.19507-1-ansuelsmth@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa84 ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.
For cpulist the maximum size is on the order of
NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2
which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,8, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.
The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.
Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.
As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):
before:
-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap
after:
-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap
CONFIG_NR_CPUS = 16384
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap
The actual number of cpus doesn't matter for the reported size since they
are based on NR_CPUS.
Fixes: 75bd50fa84 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d15 ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h)
Signed-off-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both genpd_debug_add() and genpd_debug_remove() may be called
indirectly by other drivers while genpd_debugfs_dir is not yet
set. For example, drivers can call pm_genpd_init() in probe or
pm_genpd_init() in probe fail/cleanup path:
pm_genpd_init()
--> genpd_debug_add()
pm_genpd_remove()
--> genpd_remove()
--> genpd_debug_remove()
At this time, genpd_debug_init() may not yet be called.
genpd_debug_add() checks that if genpd_debugfs_dir is NULL, it
will return directly. Make sure this is also checked
in pm_genpd_remove(), otherwise components under debugfs root
which has the same name as other components under pm_genpd may
be accidentally removed, since NULL represents debugfs root.
Fixes: 718072ceb2 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the now
pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLKqAgACgkQEsHwGGHe
VUoM5w/8CSvwPZ3otkhmu8MrJPtWc7eLDPjYN4qQP+19e+bt094MoozxeeWG2wmp
hkDJAYHT2Oik/qDuEdhFgNYwS7XGgbV3Py3B8syO4//5SD5dkOSG+QqFXvXMdFri
YsVqqNkjJOWk/YL9Ql5RS/xQewsrr0OqEyWWocuI6XAvfWV4kKvlRSd+6oPqtZEO
qYlAHTXElyIrA/gjmxChk1HTt5HZtK3uJLf4twNlUfzw7LYFf3+sw3bdNuiXlyMr
WcLXMwGpS0idURwP3mJa7JRuiVBzb4+kt8mWwWqA02FkKV45FRRRFhFUsy667r00
cdZBaWdy+b7dvXeliO3FN/x1bZwIEUxmaNy1iAClph4Ifh0ySPUkxAr8EIER7YBy
bstDJEaIqgYg8NIaD4oF1UrG0ZbL0ImuxVaFdhG1hopQsh4IwLSTLgmZYDhfn/0i
oSqU0Le+A7QW9s2A2j6qi7BoAbRW+gmBuCgg8f8ECYRkFX1ZF6mkUtnQxYrU7RTq
rJWGW9nhwM9nRxwgntZiTjUUJ2HtyXEgYyCNjLFCbEBfeG5QTg7XSGFhqDbgoymH
85vsmSXYxgTgQ/kTW7Fs26tOqnP2h1OtLJZDL8rg49KijLAnISClEgohYW01CWQf
ZKMHtz3DM0WBiLvSAmfGifScgSrLB5AjtvFHT0hF+5/okEkinVk=
=09fW
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
"Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the
now pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide"
* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
x86/speculation: Disable RRSBA behavior
x86/kexec: Disable RET on kexec
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/retbleed: Add fine grained Kconfig knobs
x86/cpu/amd: Enumerate BTC_NO
x86/common: Stamp out the stepping madness
KVM: VMX: Prevent RSB underflow before vmenter
x86/speculation: Fill RSB on vmexit for IBRS
KVM: VMX: Fix IBRS handling after vmexit
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Convert launched argument to flags
KVM: VMX: Flatten __vmx_vcpu_run()
objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
x86/speculation: Remove x86_spec_ctrl_mask
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
...
A driver that makes use of pm_runtime_force_suspend|resume() to support
system suspend/resume, currently needs to manage the wakeirq support
itself. To avoid the boilerplate code in the driver's system suspend/resume
callbacks in particular, let's extend pm_runtime_force_suspend|resume() to
deal with the wakeirq.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmLFak4ACgkQAEG6vDF+
4pjHNxAAyzXpazkWuTjTot9UcX2TE8kIlEB4wXoGr1WJfi2uefVuvo3owvSKLdmr
pBNUf0fSLKShljueXOYcmwxVvSoN3zWSrOh4huUBWv0VPBg5yyNplZChwbhxjmiL
N5FGtSDHoTgPjjMUXnlsa/Y7/RDUhxR+s4ZdGS/vnMHbGr8Gsm5bjc6BNCq9E6cz
xnGDzUOS3+Sin+751De09HIuH5FOoCEpWOj6FGVK3MtEsizHU4ANEKTgFdsE26mG
nmVY1CU3GJmHluvG1JgL4+HmZsW02h2yU0tRSLqcseJCUou8gJ5yr0wYF6wmsHGk
nzGDfV7GzLdQg5rVnWcgzrzibqbBKJvh795e2cW3tV60VjMxlh57L7OWnHAEzQh8
QCF8HeE2CGg6VlYC+oB5JZ4pLdPE69e0+fHzhFy7hqK/B9yOr0olIKXxcm4tR/TV
5Ri7D8bX4Oviq1pQcT+GE/8Of5vX5R9LTiH1V0ld38npVvA55KDnO6WKvcadEucO
tKnHZx2dZYR/sRJ8ABz4hb7+UlLiLpCPFudx+BcXLHn+nWSqXYlu+F/2D2nsGRP+
HpmJHISJ40gD669KvupLg0/TtPdQ1oRmuf9CXUiMkzQZcySIW4wmFvCvbUfMcApl
7OzQ+FtAPb7n821mSV7KJBYJ5xOxNVR7DfUovmXCa/91xfqmE1M=
=UCxa
-----END PGP SIGNATURE-----
Merge tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
cacheinfo and arch_topology updates for v5.20
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
* tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (21 commits)
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Drop LLC identifier stash from the CPU topology
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Use the last level cache information from the cacheinfo
arch_topology: Add support to parse and detect cache attributes
cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
cacheinfo: Use cache identifiers to check if the caches are shared if available
cacheinfo: Allow early detection and population of cache attributes
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
...
In regmap_field_init() when a invalid mask is provided it still
initializes with any warnings.
An example of this is when the LSB is greater than MSB a mask of zero
is produced.
WARN_ONCE() is not ideal for this but requires less changes to core regmap
code.
Cc: Mark Brown <broonie@kernel.org>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Matt Ranostay <mranostay@ti.com>
Link: https://lore.kernel.org/r/20220708013125.313892-1-mranostay@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Previously the CONFIG_PM_SLEEP and !CONFIG_PM_SLEEP device_init_wakeup()
implementations differed in confusing ways:
- The PM_SLEEP version checked for a NULL device pointer and returned
-EINVAL, while the !PM_SLEEP version did not and would simply
dereference a NULL pointer.
- When called with "false", the !PM_SLEEP version cleared "capable" and
"enable" in the opposite order of the PM_SLEEP version. That was
harmless because for !PM_SLEEP they're simple assignments, but it's
unnecessary confusion.
Use a simplified version of the PM_SLEEP implementation for both cases.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
irq_reg_stride in struct regmap_irq_chip is often 0, but that
actually means to use the default stride of 1. The effective
stride is stored in struct regmap_irq_chip_data->irq_reg_stride
and will get the corrected default value.
The default ->get_irq_reg() callback was using the stride from
the chip definition, which is wrong; fix it to use the effective
stride from the chip data instead.
Link: https://lore.kernel.org/lkml/acaaf77f-3282-8544-dd3c-7915fc1a6a4f@samsung.com/
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220704112847.23844-1-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
We don't support the topology for clusters of CPU clusters while the
DT and ACPI bindings theoritcally support the same. Just warn about the
same so that it is clear to the users of arch_topology that the nested
clusters are not yet supported.
Link: https://lore.kernel.org/r/20220704101605.1318280-21-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Finally let us add support for socket nodes in /cpu-map in the device
tree. Since this may not be present in all the old platforms and even
most of the existing platforms, we need to assume absence of the socket
node indicates that it is a single socket system and handle appropriately.
Also it is likely that most single socket systems skip to as the node
since it is optional.
Link: https://lore.kernel.org/r/20220704101605.1318280-20-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.
We don't support nesting of clusters yet as there are no real hardware
to support clusters of clusters.
Link: https://lore.kernel.org/r/20220704101605.1318280-19-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute in the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.
To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().
This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.
While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.
Link: https://lore.kernel.org/r/20220704101605.1318280-18-sudeep.holla@arm.com
Cc: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently as we parse the CPU topology from /cpu-map node from the
device tree, we assign generated cluster count as the physical package
identifier for each CPU which is wrong.
The device tree bindings for CPU topology supports sockets to infer
the socket or physical package identifier for a given CPU. Since it is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have single socket/physical package.
Fix the physical package identifier to 0 by removing the assignment of
cluster identifier to the same.
Link: https://lore.kernel.org/r/20220704101605.1318280-17-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
There is no point in looping through all the CPU's physical package
identifier to check if it is valid or not once a CPU which is outside
the topology(i.e. outlier CPU) is found.
Let us just break out of the loop early in such case.
Link: https://lore.kernel.org/r/20220704101605.1318280-16-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.
Link: https://lore.kernel.org/r/20220704101605.1318280-15-sudeep.holla@arm.com
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on the DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly that may result in getting the thread
siblings wrong as the core identifiers can be same for 2 different CPUs
belonging to 2 different cluster.
So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong are in the same cluster within
the socket. Let us skip updation of the thread sibling cpumaks if the
cluster identifier doesn't match.
This change won't affect even if the cluster identifiers are not set
currently but will avoid any breakage once we set the same correctly.
Link: https://lore.kernel.org/r/20220704101605.1318280-14-sudeep.holla@arm.com
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.
Remove the redundant LLC ID from the generic CPU arch_topology
information.
Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to use the similar
information from the cacheinfo.
This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.
Link: https://lore.kernel.org/r/20220704101605.1318280-11-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.
In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topoplogy parsing both on ACPI and DT platforms.
Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building llc_sibling mask remains unchanged.
Link: https://lore.kernel.org/r/20220704101605.1318280-10-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The checks to skip the CPU itself or no cacheinfo case are implemented
bit differently though the effect is exactly same. Just align the
implementation in both cache_shared_cpu_map_{setup,remove} just for
improved readability. No functional change.
Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cache identifiers is an optional property on most of the platforms.
The presence of one must be indicated by the CACHE_ID valid bit in the
attributes.
We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.
Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Some architecture/platforms may need to setup cache properties very
early in the boot along with other cpu topologies so that all these
information can be used to build sched_domains which is used by the
scheduler.
Allow detect_cache_attributes to be called quite early during the boot.
Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
It is useful to have helper to check if the given two CPUs share last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just for fw_token as the cache ID is
optional.
This helper can be used to build the llc_sibling during arch specific
topology parsing and feeding information to the sched_domains. This also
helps to get rid of llc_id in the CPU topology as it is sort of duplicate
information.
Also add helper to check if the llc information in cacheinfo is valid
or not.
Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are the shared based on fw_token pointer.
However it is defined conditionally only if CONFIG_OF is enabled which
is wrong.
Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where both OF and ACPI is not defined.
Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo for a given CPU at a given index is used at quite a few
places by fetching the base point for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.
Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.
Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The of_cpu_device_node_get takes care of fetching the CPU'd device node
either from cached cpu_dev->of_node if cpu_dev is initialised or uses
of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.
Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev and can be simplified
2. It enabled the use detect_cache_attributes and hence cache_setup_of_node
much earlier before the CPUs are registered as devices.
Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Because pm_runtime_get_suppliers() bumps up the rpm_active counter
of each device link to a supplier of the given device in addition
to bumping up the supplier's PM-runtime usage counter, a runtime
suspend of the consumer device may case the latter to go down to 0
when pm_runtime_put_suppliers() is running on a remote CPU. If that
happens after pm_runtime_put_suppliers() has released power.lock for
the consumer device, and a runtime resume of that device takes place
immediately after it, before pm_runtime_put() is called for the
supplier, that pm_runtime_put() call may cause the supplier to be
suspended even though the consumer is active.
To prevent that from happening, modify pm_runtime_get_suppliers() to
call pm_runtime_get_sync() for the given device's suppliers without
touching the rpm_active counters of the involved device links
Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
for the given device's suppliers without looking at the rpm_active
counters of the device links at hand. [This is analogous to what
happened before commit 4c06c4e6cf ("driver core: Fix possible
supplier PM-usage counter imbalance").]
Since pm_runtime_get_suppliers() sets supplier_preactivated for each
device link where the supplier's PM-runtime usage counter has been
incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
the suppliers whose device links have supplier_preactivated set, the
PM-runtime usage counter is balanced for each supplier and this is
independent of the runtime suspend and resume of the consumer device.
However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
during the consumer device probe, so pm_runtime_get_suppliers() bumps
up the supplier's PM-runtime usage counter, but it cannot be dropped by
pm_runtime_put_suppliers(), make device_link_release_fn() take care of
that.
Fixes: 4c06c4e6cf ("driver core: Fix possible supplier PM-usage counter imbalance")
Reported-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Instead of passing an extra bool argument to pm_runtime_release_supplier(),
make its callers take care of triggering a runtime-suspend of the
supplier device as needed.
No expected functional impact.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>:
This series is an attempt at cleaning up the regmap-irq API in order
to simplify things and consolidate existing features, while at the
same time generalizing it to support a wider range of hardware.
There is a new system for IRQ type configuration, some tweaks to
unmask registers so they're more intuitive and useful, and a new
callback for calculating register addresses. There's also a few
minor code cleanups in here.
In v2 I've taken the approach of adding new features and deprecating
existing ones rather than removing them aggressively. Warnings will
be issued for any drivers that use deprecated features, but they'll
otherwise continue to function normally.
One important caveat: not all of these changes are tested beyond
compile testing, since I don't have hardware to exercise all of
the features.
When num_reg_defaults > 0 but reg_defaults is NULL, there will be a
NULL pointer exception.
Current code has no such usage, but as additional hardening, also
check this to prevent any chance of crashing.
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220629130951.63040-1-schspa@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This flag is a bit of a hack and the same thing can be accomplished
using a custom ->get_irq_reg() callback. Add a warning to catch any
use of the flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-13-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Replace the internal sub_irq_reg() function with a public callback
that drivers can use when they have more complex register layouts.
The default implementation is regmap_irq_get_irq_reg_linear(), used
if the chip doesn't provide its own callback.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-12-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
To me "unmask" suggests that we write 1s to the register when
an interrupt is enabled. This also makes sense because it's the
opposite of what the "mask" register does (write 1s to disable
an interrupt).
But regmap-irq does the opposite: for a disabled interrupt, it
writes 1s to "unmask" and 0s to "mask". This is surprising and
deviates from the usual way mask registers are handled.
Additionally, mask_invert didn't interact with unmask registers
properly -- it caused them to be ignored entirely.
Fix this by making mask and unmask registers orthogonal, using
the following behavior:
* Mask registers are written with 1s for disabled interrupts.
* Unmask registers are written with 1s for enabled interrupts.
This behavior supports both normal or inverted mask registers
and separate set/clear registers via different combinations of
mask_base/unmask_base.
The old unmask register behavior is deprecated. Drivers need to
opt-in to the new behavior by setting mask_unmask_non_inverted.
Warnings are issued if the driver relies on deprecated behavior.
Chips that only set one of mask_base/unmask_base don't have to
use the mask_unmask_non_inverted flag because that use case was
previously not supported.
The mask_invert flag is also deprecated in favor of describing
inverted mask registers as unmask registers.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-11-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers can be used to replace both type and virtual
registers, so mark both features are deprecated and issue a
warning if they're used.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-10-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers provide a more uniform approach to handling irq type
registers. They are essentially an extension of the virtual registers
used by the qcom-pm8008 driver.
Config registers can be represented as a 2D array:
config_base[0] reg0,0 reg0,1 reg0,2 reg0,3
config_base[1] reg1,0 reg1,1 reg1,2 reg1,3
config_base[2] reg2,0 reg2,1 reg2,2 reg2,3
There are 'num_config_bases' base registers, each of which is used to
address 'num_config_regs' registers. The addresses are calculated in
the same way as for other bases. It is assumed that an irq's type is
controlled by one column of registers; that column is identified by
the irq's 'type_reg_offset'.
The set_type_config() callback is responsible for updating the config
register contents. It receives an array of buffers (each represents a
row of registers) and the index of the column to update, along with
the 'struct regmap_irq' description and requested irq type.
Buffered values are written to registers in regmap_irq_sync_unlock().
Note that the entire register contents are overwritten, which is a
minor change in behavior from type registers via 'type_base'.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-9-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There are several conditions that must be satisfied to support
bulk read of status registers. Move the check into a function
to avoid duplicating it in two places.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-8-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit a71411dbf6 ("regmap: irq: add chip option mask_writeonly")
introduced the mask_writeonly option, but it isn't used now and it
appears it's never been used by any in-tree drivers. The motivation
for the option is mentioned in the commit message,
Some irq controllers have writeonly/multipurpose register
layouts. In those cases we read invalid data back. [...]
The option causes mask register updates to use regmap_write_bits()
instead of regmap_update_bits().
However, regmap_write_bits() doesn't solve the reading invalid data
problem. It's still a read-modify-write op like regmap_update_bits().
The difference is that 'update bits' will only write the new value
if it is different from the current value, while 'write bits' will
write the new value unconditionally, even if it's the same as the
current value.
This seems like a bit of a specialized use case and probably isn't
that useful for regmap-irq, so let's just remove the option and go
back to using an 'update bits' op for the mask registers. We can
always add the option back if some driver ends up needing it in the
future.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-7-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
regmap_irq_update_bits() is misnamed and should only be used for
updating mask registers, since it checks the mask_writeonly flag.
However, it was also used for updating wake and type registers.
It's safe to replace these uses with regmap_update_bits() because
there are no users of the mask_writeonly flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-6-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Check types_supported instead of checking type_rising/falling_val
when using type_in_mask interrupts. This makes the intent clearer
and allows a type_in_mask irq to support level or edge triggers,
rather than only edge triggers.
Update the documentation and comments to reflect the new behavior.
This shouldn't affect existing drivers, because if they didn't
set types_supported properly the type buffer wouldn't be updated.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-5-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Instead of mentioning unsigned int directly, use a sizeof(...)
involving the buffer we're allocating to ensure the types don't
get out of sync.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-4-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
It appears that no chip ever required a nonzero type_reg_stride
and commit 1066cfbdfa ("regmap-irq: Extend sub-irq to support
non-fixed reg strides") broke support. Just remove the field.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode,
fw_devlink will do a best effort ordering for that device where it'll
only enforce the probe/suspend/resume ordering of that device with
suppliers that have drivers. The driver of that device can then decide
if it wants to defer probe or probe without the suppliers.
This will be useful for avoid probe delays of the console device that
were caused by commit 71066545b4 ("driver core: Set
fw_devlink.strict=1 by default").
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Sascha Hauer <sha@pengutronix.de>
Reported-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220623080344.783549-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
stack like commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
list below:
In __driver_attach function, The lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
device_lock(dev) // get lock dev
async_schedule_dev(__driver_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__driver_attach_async_helper will get lock dev as
will, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As above show, when it is allowed to do async probes, because of
out of memory or work limit, async work is not be allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__driver_attach_async_helper getting lock dev.
Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid:
0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374] <TASK>
[ 370.789841] __schedule+0x482/0x1050
[ 370.790613] schedule+0x92/0x1a0
[ 370.791290] schedule_preempt_disabled+0x2c/0x50
[ 370.792256] __mutex_lock.isra.0+0x757/0xec0
[ 370.793158] __mutex_lock_slowpath+0x1f/0x30
[ 370.794079] mutex_lock+0x50/0x60
[ 370.794795] __device_driver_lock+0x2f/0x70
[ 370.795677] ? driver_probe_device+0xd0/0xd0
[ 370.796576] __driver_attach_async_helper+0x1d/0xd0
[ 370.797318] ? driver_probe_device+0xd0/0xd0
[ 370.797957] async_schedule_node_domain+0xa5/0xc0
[ 370.798652] async_schedule_node+0x19/0x30
[ 370.799243] __driver_attach+0x246/0x290
[ 370.799828] ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548] bus_for_each_dev+0x9d/0x130
[ 370.801132] driver_attach+0x22/0x30
[ 370.801666] bus_add_driver+0x290/0x340
[ 370.802246] driver_register+0x88/0x140
[ 370.802817] ? virtio_scsi_init+0x116/0x116
[ 370.803425] scsi_register_driver+0x1a/0x30
[ 370.804057] init_sd+0x184/0x226
[ 370.804533] do_one_initcall+0x71/0x3a0
[ 370.805107] kernel_init_freeable+0x39a/0x43a
[ 370.805759] ? rest_init+0x150/0x150
[ 370.806283] kernel_init+0x26/0x230
[ 370.806799] ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the devtmpfs fails to mount, a dangling pointer still remains in
global. Specifically, the err variable is passed by a pointer to the
devtmpfsd. When the devtmpfsd exits, it sets the error and completes the
setup_done. In this situation, the thread pointer is not set to null.
After the devtmpfsd exited, the devtmpfs can wakes up the destroyed
devtmpfsd thread by wake_up_process if a device change event comes.
Signed-off-by: Yangxi Xiang <xyangxi5@gmail.com>
Link: https://lore.kernel.org/r/20220627120409.11174-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 77515ebaf0 as it
causes build problems in linux-next. It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.
Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For devices with no cache it can make sense to use cache only mode as a
mechanism for trapping writes to hardware which is inaccessible but since
no cache is equivalent to cache bypass we force such devices into bypass
mode. This means that our check that bypass and cache only mode aren't both
enabled simultanously is less sensible for devices without a cache so relax
it.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220622171723.1235749-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes for post-5.18 changes:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes pre-5.18 material:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYri4RgAKCRDdBJ7gKXxA
jmhsAQDCvGqtIUhgkTwid8KBRNbowsg0LXd6k+gUjcxBhH403wEA0r0cxxkDAmgr
QNXn/qZRzQP2ji+pdjH9NBOsd2g2XQA=
=UGJ7
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.
Fixes for this merge window:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes for previous releases:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi"
* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: add entry for Christian Marangi
mm/memory-failure: disable unpoison once hw error happens
hugetlbfs: zero partial pages during fallocate hole punch
mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
mm: re-allow pinning of zero pfns
mm/kfence: select random number before taking raw lock
MAINTAINERS: add maillist information for LoongArch
MAINTAINERS: update MM tree references
MAINTAINERS: update Abel Vesa's email
MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
mailmap: add alias for jarkko@profian.com
mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
mm: lru_cache_disable: use synchronize_rcu_expedited
mm/page_isolation.c: fix one kernel-doc comment
Two sets of fixes - one for things that were missed with the support for
custom bulk I/O operations introduced in the last merge window, and
another for some long standing issues with regmap-irq which affect a
fairly small subset of devices.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmK16ucACgkQJNaLcl1U
h9CXmAf/Rpq3pky1EEgc/BAx+Vi6K1PD/5VvUX1Wb0xHYK52yk+XBuyqkpzhTptm
BhY5TIZWriai6UdFnuZtxRYiDy65IBZo1SGtNMlwcnjBWzD1weIWukGmkPzRGr9y
Vdr6PK1SOsP+qVrxrnNsbdEIICev2KDzmVxXa81zsKeyj9Dx4W4ETbpbGDvwyaHk
y+cBucVrb3RSoruoJUL3iV7XeobroLNO1ZcD9Wim1Kx2QFtdgbOEF5oBmqWmfVlu
Au59ug97u3AGVJpeAFwvM6EFlG7ft2wn4gsP92QCJ58zUPLdxHaS/+mWAfUButCk
km/yP4Su9KyRFFecMLwIYa4+UvH9kA==
=vT5N
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two sets of fixes - one for things that were missed with the support
for custom bulk I/O operations introduced in the last merge window,
and another for some long standing issues with regmap-irq which affect
a fairly small subset of devices"
* tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Fix offset/index mismatch in read_sub_irq_data()
regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
regmap: Wire up regmap_config provided bulk write in missed functions
regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set
regmap: Re-introduce bulk read support check in regmap_bulk_read()
We need to divide the sub-irq status register offset by register
stride to get an index for the status buffer to avoid an out of
bounds write when the register stride is greater than 1.
Fixes: a2d21848d9 ("regmap: regmap-irq: Add main status register support")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When enabling a type_in_mask irq, the type_buf contents must be
AND'd with the mask of the IRQ we're enabling to avoid enabling
other IRQs by accident, which can happen if several type_in_mask
irqs share a mask register.
Fixes: bc998a7303 ("regmap: irq: handle HW using separate rising/falling edge interrupts")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-2-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:
dev_coredumpv(.., gfp_t gfp)
dev_coredumpm(.., gfp_t gfp)
dev_set_name
kobject_set_name_vargs
kvasprintf_const(GFP_KERNEL, ...); //may sleep
kstrdup(s, GFP_KERNEL); //may sleep
This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.
Fixes: 833c95456a ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are some functions that were missed by commit d77e745613 ("regmap:
Add bulk read/write callbacks into regmap_config") when support to define
bulk read/write callbacks in regmap_config was introduced.
The regmap_bulk_write() and regmap_noinc_write() functions weren't changed
to use the added map->write instead of the map->bus->write handler.
Also, the regmap_can_raw_write() was not modified to take map->write into
account. So will only return true if a bus with a .write callback is set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-4-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Before adding support to define bulk read/write callbacks in regmap_config
by the commit d77e745613 ("regmap: Add bulk read/write callbacks into
regmap_config"), the regmap_noinc_read() function returned an errno early
a map->bus->read callback wasn't set.
But that commit dropped the check and now a call to _regmap_raw_read() is
attempted even when bulk read operations are not supported. That function
checks for map->read anyways but there's no point to continue if the read
can't succeed.
Also is a fragile assumption to make so is better to make it fail earlier.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-3-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Support for drivers to define bulk read/write callbacks in regmap_config
was introduced by the commit d77e745613 ("regmap: Add bulk read/write
callbacks into regmap_config"), but this commit wrongly dropped a check
in regmap_bulk_read() to determine whether bulk reads can be done or not.
Before that commit, it was checked if map->bus was set. Now has to check
if a map->read callback has been set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-2-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull writeback and ext2 fixes from Jan Kara:
"A fix for writeback bug which prevented machines with kdevtmpfs from
booting and also one small ext2 bugfix in IO error handling"
* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
init: Initialize noop_backing_dev_info early
ext2: fix fs corruption when trying to remove a non-empty directory with IO error
noop_backing_dev_info is used by superblocks of various
pseudofilesystems such as kdevtmpfs. After commit 10e1407310
("writeback: Fix inode->i_io_list not be protected by inode->i_lock
error") this broke because __mark_inode_dirty() started to access more
fields from noop_backing_dev_info and this led to crashes inside
locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty().
Fix the problem by initializing noop_backing_dev_info before the
filesystems get mounted.
Fixes: 10e1407310 ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error")
Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKpskgACgkQJNaLcl1U
h9BF7Af7Bg1qhqvTJQKvfc1TU7pRnO2Mt90MJTTAFMgiMP/+FZhiaDrKm12SBTPh
w8Gomg8X+EYVpiqtywFoc0mZIUe03g9+8RpuKyK2EOReaguhU2qHQPDo2Dhx6x5p
VnNY9w/rhQsrX8153c6ICZ8YqrxNqCbgCh7eulMeXz74wbxwbrNdZbbCaM9+WCS/
fTWtNUwZ+3JtaSP55UCX/ITNrLct0gPdWqkRpLExv9HdOf1HaBRUTZH69ftQg82S
PHsxEBpPWKbViflSU7V5/UgeBCTS7FjRpwB6OEX69V+VGhRIey9IgZqbSsX9KoXX
RYPKj5DZD9JtPNA0W5Dzs0KyUIlDtA==
=vniQ
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKqENwACgkQJNaLcl1U
h9D1pQf/cbMWcZeRe6eRxvDueIhyMYXwhahxEHMCL6EOR5SLLARB40klJq5DDPlM
D7SZlPES1oXXQ8zm8rpBahfQf3AAeiUMZ4+bOMMU45xaNTKeOGPuihiP69PQxzp1
RryrB6/AN+qKepHaWsnhOyarVnJ2XUDJ58QBmYwee5UFdxiPPIzjMLC0FVEgLhIc
Ng75dFIur3SknYyoYyd9gsH/VydlaEiYQ9s3GRKtq+D0P/fbsylPp1w5Lup3tXWh
kGYyJroz9doevY5K+TeMsVqSJhES3VJkz9GVxkdwCB6qhrXzGkbdnU7ZF1MZGlnf
bOC6+lE+CxoDWx/apLmdJg6ZmnHxdQ==
=CWeE
-----END PGP SIGNATURE-----
Merge tag 'regmap-field-bit-helpers' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into regmap-5.20
regmap: Add regmap_field helpers for simple bit operations
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
We have set/clear/test operations for regmap, but not for regmap_field yet.
So let's introduce regmap_field helpers too.
In many instances regmap_field_update_bits() is used for simple bit setting
and clearing. In these cases the last argument is redundant and we can
hide it with a static inline function.
This adds three new helpers for simple bit operations: set_bits,
clear_bits and test_bits (the last one defined as a regular function).
Signed-off-by: Li Chen <lchen@ambarella.com>
Link: https://lore.kernel.org/r/180eef422c3.deae9cd960729.8518395646822099769@zohomail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stale Data.
They are a class of MMIO-related weaknesses which can expose stale data
by propagating it into core fill buffers. Data which can then be leaked
using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKXMkMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWGPD/idalLIhhV5F2+hZIKm0WSnsBxAOh9K
7y8xBxpQQ5FUfW3vm7Pg3ro6VJp7w2CzKoD4lGXzGHriusn3qst3vkza9Ay8xu8g
RDwKe6hI+p+Il9BV9op3f8FiRLP9bcPMMReW/mRyYsOnJe59hVNwRAL8OG40PY4k
hZgg4Psfvfx8bwiye5efjMSe4fXV7BUCkr601+8kVJoiaoszkux9mqP+cnnB5P3H
zW1d1jx7d6eV1Y063h7WgiNqQRYv0bROZP5BJkufIoOHUXDpd65IRF3bDnCIvSEz
KkMYJNXb3qh7EQeHS53NL+gz2EBQt+Tq1VH256qn6i3mcHs85HvC68gVrAkfVHJE
QLJE3MoXWOqw+mhwzCRrEXN9O1lT/PqDWw8I4M/5KtGG/KnJs+bygmfKBbKjIVg4
2yQWfMmOgQsw3GWCRjgEli7aYbDJQjany0K/qZTq54I41gu+TV8YMccaWcXgDKrm
cXFGUfOg4gBm4IRjJ/RJn+mUv6u+/3sLVqsaFTs9aiib1dpBSSUuMGBh548Ft7g2
5VbFVSDaLjB2BdlcG7enlsmtzw0ltNssmqg7jTK/L7XNVnvxwUoXw+zP7RmCLEYt
UV4FHXraMKNt2ZketlomC8ui2hg73ylUp4pPdMXCp7PIXp9sVamRTbpz12h689VJ
/s55bWxHkR6S
=LBxT
-----END PGP SIGNATURE-----
Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MMIO stale data fixes from Thomas Gleixner:
"Yet another hw vulnerability with a software mitigation: Processor
MMIO Stale Data.
They are a class of MMIO-related weaknesses which can expose stale
data by propagating it into core fill buffers. Data which can then be
leaked using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too"
* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/mmio: Print SMT warning
KVM: x86/speculation: Disable Fill buffer clear within guests
x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
x86/speculation/srbds: Update SRBDS mitigation selection
x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
x86/speculation: Add a common function for MD_CLEAR mitigation update
x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Documentation: Add documentation for Processor MMIO Stale Data
There are several places in the kernel where this kind of functionality is
being used. Provide a generic helper for such cases.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function is no longer used. So delete it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that deferred_probe_timeout is non-zero by default, fw_devlink will
never permanently block the probing of devices. It'll try its best to
probe the devices in the right order and then finally let devices probe
even if their suppliers don't have any drivers.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-8-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices might need to be probed and bound successfully before the
kernel boot sequence can finish and move on to init/userspace. For
example, a network interface might need to be bound to be able to mount
a NFS rootfs.
With fw_devlink=on by default, some of these devices might be blocked
from probing because they are waiting on a optional supplier that
doesn't have a driver. While fw_devlink will eventually identify such
devices and unblock the probing automatically, it might be too late by
the time it unblocks the probing of devices. For example, the IP4
autoconfig might timeout before fw_devlink unblocks probing of the
network interface.
This function is available to temporarily try and probe all devices that
have a driver even if some of their suppliers haven't been added or
don't have drivers.
The drivers can then decide which of the suppliers are optional vs
mandatory and probe the device if possible. By the time this function
returns, all such "best effort" probes are guaranteed to be completed.
If a device successfully probes in this mode, we delete all fw_devlink
discovered dependencies of that device where the supplier hasn't yet
probed successfully because they have to be optional dependencies.
This also means that some devices that aren't needed for init and could
have waited for their optional supplier to probe (when the supplier's
module is loaded later on) would end up probing prematurely with limited
functionality. So call this function only when boot would fail without
it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that fw_devlink=on by default and fw_devlink supports
"power-domains" property, the execution will never get to the point
where driver_deferred_probe_check_state() is called before the supplier
has probed successfully or before deferred probe timeout has expired.
So, delete the call and replace it with -ENODEV.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed
firmware files") added support for ZSTD compression, but in the process
also made the previously default XZ compression a config option.
That means that anybody who upgrades their kernel and does a
make oldconfig
to update their configuration, will end up without the XZ compression
that the configuration used to have.
Add the 'default y' to make sure this doesn't happen.
The whole compression question should probably be improved upon, since
it is now possible to "enable" compression in the kernel config but not
enable any actual compression algorithm, which makes it all very
useless. It makes no sense to ask Kconfig questions that enable
situations that are nonsensical like that.
This at least fixes the immediate problem of a kernel update resulting
in a nonbootable machine because of a missed option.
Fixes: 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed firmware files")
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we had to effectively reverted
commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") in an earlier patch, a non-zero
deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set
the default back to zero until we can fix that.
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
Fixes: 2b28a1a84a ("driver core: Extend deferred probe timeout on driver registration")
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.
Commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.
However, if wait_for_device_probe() is called from the kernel_init()
context:
- Before deferred_probe_initcall() [2], it causes the boot process to
hang due to a deadlock.
- After deferred_probe_initcall() [3], it blocks kernel_init() from
continuing till deferred_probe_timeout expires and beats the point of
deferred_probe_timeout that's trying to wait for userspace to load
modules.
Neither of this is good. So revert the changes to
wait_for_device_probe().
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
Fixes: 35a672363a ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of driver core changes for 5.19-rc1.
Note, I'm not really happy with this pull request as-is, see below for
details, but overall this is all good for everything but a small set of
systems, which we have a fix for already.
Lots of tiny driver core changes and cleanups happened this cycle,
but the two major things were:
- firmware_loader reorganization and additions including the
ability to have XZ compressed firmware images and the ability
for userspace to initiate the firmware load when it needs to,
instead of being always initiated by the kernel. FPGA devices
specifically want this ability to have their firmware changed
over the lifetime of the system boot, and this allows them to
work without having to come up with yet-another-custom-uapi
interface for loading firmware for them.
- physical location support added to sysfs so that devices that
know this information, can tell userspace where they are
located in a common way. Some ACPI devices already support
this today, and more bus types should support this in the
future.
Smaller changes included:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten any
linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request if you want to take them directly, _OR_ I can just revert
the probe timeout changes and they can wait for the next -rc1 merge
cycle. Given that the fixes are tested, and pretty simple, I'm leaning
toward that choice. Sorry this all came at the end of the merge window,
I should have resolved this all 2 weeks ago, that's my fault as it was
in the middle of some travel for me.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time outs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
GkjLXBGNWOPBa5cU52qf
=yEi/
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for the
USB core, but there are the usual "hot spots" of development activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes. It seems this driver will never
be finished given that the IP core is showing up in zillions
of new devices and each implementation decides to do something
different with it...
- uvc gadget driver updates as more devices start to use and
rely on this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnZGw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymQhwCeLVANsQjBcL4ys4skl+1In17y28gAn3rEZ7rQ
Yv4uP9zadUqg3Cx0vjgf
=3s5s
-----END PGP SIGNATURE-----
Merge tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for
the USB core, but there are the usual "hot spots" of development
activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes.
It seems this driver will never be finished given that the IP core
is showing up in zillions of new devices and each implementation
decides to do something different with it...
- uvc gadget driver updates as more devices start to use and rely on
this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems"
* tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
USB: new quirk for Dell Gen 2 devices
usb: dwc3: core: Add error log when core soft reset failed
usb: dwc3: gadget: Move null pinter check to proper place
usb: hub: Simplify error and success path in port_over_current_notify
usb: cdns3: allocate TX FIFO size according to composite EP number
usb: dwc3: Fix ep0 handling when getting reset while doing control transfer
usb: Probe EHCI, OHCI controllers asynchronously
usb: isp1760: Fix out-of-bounds array access
xhci: Don't defer primary roothub registration if there is only one roothub
USB: serial: option: add Quectel BG95 modem
USB: serial: pl2303: fix type detection for odd device
xhci: Allow host runtime PM as default for Intel Alder Lake N xHCI
xhci: Remove quirk for over 10 year old evaluation hardware
xhci: prevent U2 link power state if Intel tier policy prevented U1
xhci: use generic command timer for stop endpoint commands.
usb: host: xhci-plat: omit shared hcd if either root hub has no ports
usb: host: xhci-plat: prepare operation w/o shared hcd
usb: host: xhci-plat: create shared hcd after having added main hcd
xhci: prepare for operation w/o shared hcd
xhci: factor out parts of xhci_gen_setup()
...
Including:
- Intel VT-d driver updates
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU
legacy binding
- Minor cleanups
- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group
is either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Patches to make forcing of cache-coherent DMA more coherent
between IOMMU drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
=W0hj
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Intel VT-d driver updates:
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates:
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU legacy
binding
- Minor cleanups
- Fix a BUG_ON in the vfio_iommu_group_notifier:
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group is
either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Make forcing of cache-coherent DMA more coherent between IOMMU
drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu/amd: Increase timeout waiting for GA log enablement
iommu/s390: Tolerate repeat attach_dev calls
iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
iommu/vt-d: Remove domain_update_iommu_snooping()
iommu/vt-d: Check domain force_snooping against attached devices
iommu/vt-d: Block force-snoop domain attaching if no SC support
iommu/vt-d: Size Page Request Queue to avoid overflow condition
iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
iommu/vt-d: Change return type of dmar_insert_one_dev_info()
iommu/vt-d: Remove unneeded validity check on dev
iommu/dma: Explicitly sort PCI DMA windows
iommu/dma: Fix iova map result check bug
iommu/mediatek: Fix NULL pointer dereference when printing dev_name
iommu: iommu_group_claim_dma_owner() must always assign a domain
iommu/arm-smmu: Force identity domains for legacy binding
iommu/arm-smmu: Support Tegra234 SMMU
dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
dt-bindings: arm-smmu: Document nvidia,memory-controller property
iommu/arm-smmu-qcom: Add SC8280XP support
dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
...